A New Profile Alignment Method for Clustering Gene Expression Data

  • Ataul Bari
  • Luis Rueda
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4013)


We focus on clustering gene expression temporal profiles, and propose a novel, simple algorithm that is powerful enough to find an efficient distribution of genes over clusters. We also introduce a variant of a clustering index that can effectively decide upon the optimal number of clusters for a given dataset. The clustering method is based on a profile-alignment approach, which minimizes the mean-square-error of the first order differentials, to hierarchically cluster microarray time-series data. The effectiveness of our algorithm has been tested on datasets drawn from standard experiments, showing that our approach can effectively cluster the datasets based on profile similarity.
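The idea of comparing profiles through their first-order differentials can be illustrated with a minimal sketch. The abstract does not give the exact alignment procedure or linkage rule, so everything below is an assumption made for illustration: the function names `first_differences`, `differential_mse`, and `cluster_profiles` are hypothetical, the distance is taken to be the mean squared error between first-order differences, and average linkage is used for the agglomerative merging.

```python
from itertools import combinations


def first_differences(profile):
    """First-order differences x[t+1] - x[t] of a temporal profile."""
    return [b - a for a, b in zip(profile, profile[1:])]


def differential_mse(x, y):
    """Mean squared error between the first-order differences of x and y.

    Adding a constant to either profile leaves this value unchanged,
    so vertically shifted copies of the same shape have distance 0.
    Profiles are assumed to have equal length of at least 2.
    """
    dx, dy = first_differences(x), first_differences(y)
    return sum((a - b) ** 2 for a, b in zip(dx, dy)) / len(dx)


def cluster_profiles(profiles, n_clusters):
    """Agglomerative clustering with average linkage (an assumed choice).

    Returns a sorted list of clusters, each a sorted list of profile indices.
    """
    clusters = [[i] for i in range(len(profiles))]

    def linkage(a, b):
        # Average pairwise distance between members of clusters a and b.
        total = sum(differential_mse(profiles[i], profiles[j])
                    for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Merge the pair of clusters with the smallest average distance.
        p, q = min(combinations(range(len(clusters)), 2),
                   key=lambda pq: linkage(clusters[pq[0]], clusters[pq[1]]))
        merged = sorted(clusters[p] + clusters[q])
        clusters = [c for k, c in enumerate(clusters) if k not in (p, q)]
        clusters.append(merged)
    return sorted(clusters)


# Toy data (made up): two rising and two falling profiles at different levels.
profiles = [[1, 2, 3, 4], [11, 12, 13, 14], [4, 3, 2, 1], [9, 8, 7, 6]]
print(cluster_profiles(profiles, 2))
```

On this toy input the two rising profiles end up in one cluster and the two falling profiles in the other, even though their absolute expression levels differ. This is the shape-based behaviour a differential-based distance is meant to capture. Choosing the number of clusters with a validity index is not covered by this sketch.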


Feature Vector; Correlation Method; Validity Index; Hierarchical Agglomerative Cluster; Correlation Distance
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Ataul Bari (1)
  • Luis Rueda (2)
  1. School of Computer Science, University of Windsor, Windsor, Canada
  2. Department of Computer Science, University of Concepción, Concepción, Chile
