Hierarchical Signature Clustering for Time Series Microarray Data

Part of the Advances in Experimental Medicine and Biology book series (AEMB, volume 696)


Existing clustering techniques produce clusters from time series microarray data, but the distance metrics they use lack interpretability for this type of data. While some previous methods focus on matching expression levels, the genes of interest here are those that behave in the same manner but at different levels. Such genes are not clustered together under a Euclidean metric and are indiscernible under a correlation metric, so we propose a more appropriate metric and a modified hierarchical clustering method to highlight them. Hashing and bucket sort allow for fast clustering, and the hierarchical dendrogram allows direct comparison, with distances that have an easily understood meaning. The method also extends readily to k-means clustering when the desired number of clusters is known.
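The abstract does not define the signature or the proposed metric, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes, purely for illustration, that a gene's signature is the sequence of up/down/flat steps between consecutive time points, and that genes are grouped by hashing that signature into buckets. The names `signature`, `bucket_by_signature`, `tol`, and the gene profiles are all hypothetical.

```python
from collections import defaultdict
import math


def signature(profile, tol=0.0):
    """Encode each step between consecutive time points as +1, 0, or -1.

    Assumption: this sign-of-change encoding is a stand-in for the
    signature, which the abstract does not specify.
    """
    sig = []
    for prev, curr in zip(profile, profile[1:]):
        delta = curr - prev
        if delta > tol:
            sig.append(1)
        elif delta < -tol:
            sig.append(-1)
        else:
            sig.append(0)
    return tuple(sig)


def bucket_by_signature(profiles):
    """Group gene names by signature using a hash table (one pass)."""
    buckets = defaultdict(list)
    for name, profile in profiles.items():
        buckets[signature(profile)].append(name)
    return dict(buckets)


def euclidean(a, b):
    """Plain Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Made-up profiles: geneA and geneB rise and fall together at very
# different levels; geneC moves the opposite way at a level near geneA.
profiles = {
    "geneA": [1.0, 2.0, 1.5, 3.0],
    "geneB": [10.0, 14.0, 11.0, 20.0],
    "geneC": [3.0, 2.0, 2.5, 1.0],
}

buckets = bucket_by_signature(profiles)
for sig, genes in buckets.items():
    print(sig, genes)

# Euclidean distance places geneA closer to geneC than to geneB,
# even though geneA and geneB share the same pattern of change.
print(euclidean(profiles["geneA"], profiles["geneB"]))
print(euclidean(profiles["geneA"], profiles["geneC"]))
```

In this toy case the hash buckets place geneA and geneB together, while the Euclidean distance from geneA to geneC is smaller than from geneA to geneB, which is the mismatch the abstract describes. A hierarchical or k-means step over the buckets would still be needed and is not shown here.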


Gene pattern discovery and identification; Microarrays



Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  1. Department of Computer Science, Texas Tech University, Lubbock, USA
