Integrative Functional Analysis Improves Information Retrieval in Breast Cancer

  • Juan Cruz Rodriguez
  • Germán González
  • Cristobal Fresno
  • Elmer A. Fernández
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9423)


Gene expression analysis does not end with a list of differentially expressed (DE) genes; it requires a comprehensive functional analysis (FA) of the underlying molecular mechanisms. Gene Set Enrichment Analysis (GSEA) and Singular Enrichment Analysis (SEA) over Gene Ontology (GO) are the most widely used FA approaches. Several statistical methods have been developed and compared in terms of computational efficiency and/or appropriateness. However, none of them has been evaluated from a biological point of view or in terms of consistency of information retrieval. In this context, questions such as “are the methods comparable?”, “is one of them preferable to the others?” and “how sensitive are they to different parameterizations?” are crucial to address before choosing an FA tool, and they have not been fully addressed up to now.

In this work we evaluate and compare the effect of different methods and parameters from an information retrieval point of view in both GSEA and SEA under GO. Several experiments comparing breast cancer subtypes with known different outcomes (i.e., Basal-Like vs. Luminal A) were analyzed. We show that GSEA can lead to very different results depending on the statistic, model and parameters used. We also show that GSEA and SEA results overlap to a fair degree and, indeed, complement each other. Finally, an integrative framework is proposed to provide complementary and stable enrichment information according to the analyzed datasets.
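As a rough illustration of what comparing and integrating two enrichment results can mean in practice, the sketch below summarizes the agreement between two sets of enriched terms. It is not the integrative framework proposed in the paper: the function name `compare_enrichment`, the Jaccard index as overlap measure, and the placeholder term labels are all assumptions made for this example.

```python
def compare_enrichment(gsea_terms, sea_terms):
    """Summarize agreement between two collections of enriched term IDs.

    Illustrative only: a plain set comparison, not the paper's method.
    """
    gsea, sea = set(gsea_terms), set(sea_terms)
    shared = gsea & sea   # terms reported by both approaches
    union = gsea | sea    # simple "integrated" view: terms reported by either
    return {
        "shared": shared,
        "gsea_only": gsea - sea,
        "sea_only": sea - gsea,
        "union": union,
        # Jaccard index: shared terms as a fraction of all reported terms
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


# Placeholder term labels (hypothetical, not results from the paper).
gsea_hits = ["term_A", "term_B", "term_C"]
sea_hits = ["term_B", "term_C", "term_D", "term_E"]

summary = compare_enrichment(gsea_hits, sea_hits)
print(sorted(summary["shared"]), summary["jaccard"])
```

In this toy case the terms found by only one approach (`gsea_only`, `sea_only`) are what a combined view would add over either method alone.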





Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Juan Cruz Rodriguez (1), corresponding author
  • Germán González (1)
  • Cristobal Fresno (1)
  • Elmer A. Fernández (1)
  1. CONICET-Universidad Católica de Córdoba, Córdoba, Argentina
