Improved Robustness in Time Series Analysis of Gene Expression Data by Polynomial Model Based Clustering

  • Michael Hirsch
  • Allan Tucker
  • Stephen Swift
  • Nigel Martin
  • Christine Orengo
  • Paul Kellam
  • Xiaohui Liu
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4216)


Microarray experiments produce large data sets that often contain noise and a considerable amount of missing data. Typical clustering methods, such as hierarchical clustering or partitional algorithms, can be adversely affected by such data. This paper introduces a method that overcomes these problems by modelling the time series data with polynomials and using these models to cluster the data. Similarity measures for polynomials are given that comply with commonly used standard measures. The polynomial model based clustering is compared with standard clustering methods under different conditions and applied to a real gene expression data set. It shows significantly better results than the standard methods as noise and missing data increase.
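The general idea of modelling each expression profile with a polynomial and comparing the fitted models can be sketched as follows. This is only an illustration under stated assumptions, not the method of the paper: the polynomial degree, the helper names `fit_polynomial` and `polynomial_distance`, the root-mean-square distance and the toy profiles are all hypothetical choices made here. The similarity measures actually used are defined in the full text.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def fit_polynomial(times, values, degree=3):
    """Least-squares polynomial fit that skips missing (NaN) observations.

    Returns coefficients ordered from lowest to highest power.
    The default degree is an arbitrary choice for this sketch.
    """
    times = np.asarray(times, dtype=float)
    values = np.asarray(values, dtype=float)
    observed = ~np.isnan(values)
    if observed.sum() <= degree:
        raise ValueError("not enough observed points for this degree")
    return P.polyfit(times[observed], values[observed], degree)


def polynomial_distance(c1, c2, t_start, t_end):
    """Root-mean-square difference of two polynomials over [t_start, t_end].

    Assumed distance for illustration only; it is computed exactly by
    integrating the squared difference polynomial.
    """
    diff = P.polysub(c1, c2)
    antiderivative = P.polyint(P.polymul(diff, diff))
    integral = P.polyval(t_end, antiderivative) - P.polyval(t_start, antiderivative)
    # Guard against tiny negative values caused by rounding.
    return float(np.sqrt(max(integral, 0.0) / (t_end - t_start)))


if __name__ == "__main__":
    # Toy profiles (made up for this sketch): gene_b is gene_a with gaps.
    t = np.arange(10.0)
    gene_a = 0.5 + 0.8 * t - 0.1 * t**2
    gene_b = gene_a.copy()
    gene_b[[1, 4, 7]] = np.nan
    gene_c = 2.0 - 0.3 * t

    models = [fit_polynomial(t, g, degree=2) for g in (gene_a, gene_b, gene_c)]
    # Pairwise distance matrix that any standard clustering method could consume.
    dist = np.array([[polynomial_distance(a, b, t[0], t[-1]) for b in models]
                     for a in models])
    print(np.round(dist, 3))
```

Because the distance is computed on the fitted curves rather than on the raw measurements, profiles with different missing time points remain directly comparable, which is the property the abstract relies on.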


Gene Expression Data; Time Series Analysis; Polynomial Model; Improve Robustness; Direct Cluster
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Michael Hirsch (1)
  • Allan Tucker (1)
  • Stephen Swift (1)
  • Nigel Martin (2)
  • Christine Orengo (3)
  • Paul Kellam (4)
  • Xiaohui Liu (1)

  1. School of Information Systems, Computing and Mathematics, Brunel University, Uxbridge, UK
  2. School of Computer Science and Information Systems, Birkbeck, University of London, London, UK
  3. Department of Biochemistry and Molecular Biology, University College London, London, UK
  4. Department of Infection, University College London, London, UK
