Gene Ontology Assisted Exploratory Microarray Clustering and Its Application to Cancer

  • Geoff Macintyre
  • James Bailey
  • Daniel Gustafsson
  • Alex Boussioutas
  • Izhak Haviv
  • Adam Kowalczyk
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5265)


Gene expression profiling provides insight into the functions of genes at a molecular level. Clustering of gene expression profiles can help identify the underlying biological program that drives the co-expression of genes. Standard clustering methods, which group genes by similar expression values, fail to capture weak expression correlations, potentially causing genes in the same biological process to be grouped separately. We have developed a novel clustering algorithm which incorporates functional gene information from the Gene Ontology into the clustering process, resulting in more biologically meaningful clusters. We have validated our method using a multi-cancer microarray dataset. In addition, we show the potential of such methods for the exploration of cancer etiology.
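The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea it describes: letting functional annotation pull together genes whose expression correlation is weak. It is not the authors' method. The blended distance, the Jaccard stand-in for functional similarity, the single-linkage threshold rule, the `weight` and `threshold` parameters, and the toy genes and annotations are all assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson_distance(x, y):
    """Return 1 - Pearson correlation of two expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 1.0  # a flat profile carries no correlation information
    return 1.0 - cov / (norm_x * norm_y)


def go_distance(terms_a, terms_b):
    """Jaccard distance between two sets of GO term IDs (assumed stand-in)."""
    union = terms_a | terms_b
    if not union:
        return 1.0
    return 1.0 - len(terms_a & terms_b) / len(union)


def combined_distance(g1, g2, expr, go, weight):
    """Blend expression distance and annotation distance (hypothetical rule)."""
    return ((1.0 - weight) * pearson_distance(expr[g1], expr[g2])
            + weight * go_distance(go[g1], go[g2]))


def cluster(genes, expr, go, threshold, weight=0.5):
    """Single-linkage clustering: link gene pairs whose distance <= threshold."""
    parent = {g: g for g in genes}

    def find(g):
        # Union-find lookup with path halving.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for a, b in combinations(genes, 2):
        if combined_distance(a, b, expr, go, weight) <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for g in genes:
        groups.setdefault(find(g), []).append(g)
    return sorted(sorted(members) for members in groups.values())


# Toy data, invented for illustration only.
genes = ["geneA", "geneB", "geneC"]
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 1.0, 4.0, 3.0],  # only moderately correlated with geneA
    "geneC": [4.0, 3.0, 2.0, 1.0],  # anti-correlated with geneA
}
go = {
    "geneA": {"GO:0006915", "GO:0008283"},
    "geneB": {"GO:0006915", "GO:0008283"},  # same annotations as geneA
    "geneC": {"GO:0006412"},
}

expression_only = cluster(genes, expr, go, threshold=0.3, weight=0.0)
go_assisted = cluster(genes, expr, go, threshold=0.3, weight=0.5)
print(expression_only)
print(go_assisted)
```

In this toy setting, geneA and geneB stay apart when only expression is used and are grouped once the shared annotations contribute to the distance. A real analysis would likely use a semantic similarity measure defined over the ontology graph rather than plain set overlap.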


Microarray, Gene Ontology, Clustering, Cancer



Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Geoff Macintyre (1, 2)
  • James Bailey (1, 2)
  • Daniel Gustafsson (4)
  • Alex Boussioutas (3)
  • Izhak Haviv (5, 6)
  • Adam Kowalczyk (2)
  1. Department of Computer Science and Software Engineering, University of Melbourne, Victoria, Australia
  2. National ICT Australia, Victorian Research Lab, Australia
  3. Ian Potter Centre for Cancer Genomics and Predictive Medicine, Peter MacCallum Cancer Centre, East Melbourne, Australia
  4. Department of Computer Science and Computer Engineering, La Trobe University, Victoria, Australia
  5. The Alfred Medical Research and Education Precinct, Baker Medical Research Institute, Epigenetics Group, Melbourne, Australia
  6. Department of Biochemistry and Molecular Biology, University of Melbourne, Victoria, Australia
