Co-expression Gene Discovery from Microarray for Integrative Systems Biology

  • Yutao Ma
  • Yonghong Peng
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4093)


Advances in high-throughput technologies, such as microarrays and mass spectrometry, have provided an effective approach for the development of systems biology, which aims to understand the complex functions and properties of biological systems and processes. Revealing functionally correlated genes with co-expression patterns from microarray data allows us to infer transcriptional regulatory networks and to perform functional annotation of genes, and has become a vital step towards the implementation of integrative systems biology. Clustering is a particularly useful preliminary methodology for the discovery of co-expressed genes, and many conventional clustering algorithms from the literature are potentially applicable. However, because microarray data contain a large amount of noise and a variety of uncertainties, it is vitally important to develop techniques that are robust to noise and that can incorporate user-specified objectives and preferences. For this purpose, this paper presents a Genetic Algorithm (GA) based hybrid method for co-expression gene discovery, which aims to extract gene groups that have maximal dissimilarity between groups and maximal similarity within each group. The experimental results show that the proposed algorithm is able to extract more meaningful, sensible and significant co-expression gene groups than traditional clustering methods such as the K-means algorithm. Besides presenting the proposed hybrid GA-based clustering algorithm for co-expression gene discovery, this paper introduces a new framework of integrative systems biology employed in our current research.
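
The objective stated in the abstract (maximal dissimilarity between groups, maximal similarity within a group) can be illustrated with a minimal Python sketch. It is not the authors' algorithm, whose details are not given on this page. The centroid encoding of a chromosome, the fitness ratio, the single K-means refinement step standing in for the "hybrid" component, and all names and parameter values (ga_cluster, pop_size, mut_rate, mut_sigma and so on) are assumptions made for illustration only.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign(data, centroids):
    """Label each gene with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda c: dist(g, centroids[c]))
            for g in data]


def kmeans_step(data, centroids):
    """Local refinement: move each centroid to the mean of its members."""
    labels = assign(data, centroids)
    refined = []
    for c, old in enumerate(centroids):
        members = [g for g, lab in zip(data, labels) if lab == c]
        if members:
            refined.append([sum(col) / len(members) for col in zip(*members)])
        else:
            refined.append(list(old))  # empty group: keep the old centroid
    return refined


def fitness(data, centroids):
    """Assumed objective: smallest between-group centroid distance divided
    by the mean within-group distance (higher is better)."""
    labels = assign(data, centroids)
    if len(set(labels)) < len(centroids):
        return 0.0  # reject solutions that leave a group empty
    within = sum(dist(g, centroids[lab]) for g, lab in zip(data, labels))
    within /= len(data)
    between = min(dist(a, b) for i, a in enumerate(centroids)
                  for b in centroids[i + 1:])
    return between / (within + 1e-9)


def ga_cluster(data, k, pop_size=20, generations=40,
               mut_rate=0.1, mut_sigma=0.1, seed=0):
    """Evolve k centroids; returns (centroids, labels). Requires k >= 2."""
    rng = random.Random(seed)
    # Each chromosome is a list of k centroids seeded from random genes.
    pop = [[list(g) for g in rng.sample(data, k)] for _ in range(pop_size)]
    best = pop[0]
    for _ in range(generations):
        pop = [kmeans_step(data, ch) for ch in pop]  # hybrid local search
        scored = sorted(((fitness(data, ch), ch) for ch in pop),
                        key=lambda t: t[0], reverse=True)
        best = scored[0][1]
        children = [best]  # elitism: carry the best chromosome forward
        while len(children) < pop_size:
            # Binary tournament selection of two parents.
            p1 = max(rng.sample(scored, 2), key=lambda t: t[0])[1]
            p2 = max(rng.sample(scored, 2), key=lambda t: t[0])[1]
            # Uniform crossover at the level of whole centroids.
            child = [list(rng.choice((a, b))) for a, b in zip(p1, p2)]
            # Gaussian mutation of individual coordinates.
            for centroid in child:
                for j in range(len(centroid)):
                    if rng.random() < mut_rate:
                        centroid[j] += rng.gauss(0.0, mut_sigma)
            children.append(child)
        pop = children
    return best, assign(data, best)
```

Calling ga_cluster(profiles, k=2) on a list of equal-length expression profiles returns the evolved centroids and one group label per gene. A ratio-style fitness is used here because it rewards both tight groups and well-separated groups in a single score; a different validity index, or a user-specified preference term, could be substituted in fitness without changing the rest of the loop.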


Keywords

Hybrid Genetic Algorithm; Gene Coexpression Network; Gene Expression Data Analysis; Sufficient Similarity; Functional Related Gene





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Yutao Ma (1)
  • Yonghong Peng (1)
  1. Department of Computing, University of Bradford, West Yorkshire, UK
