Breast Cancer Research and Treatment, Volume 118, Issue 3, pp 433–441

Meta-analysis of gene expression profiles related to relapse-free survival in 1,079 breast cancer patients

  • Balazs Györffy
  • Reinhold Schäfer
Preclinical Study


The transcriptome of breast cancers has been extensively screened with microarrays, and large sets of genes associated with clinical features have been established. The aim of this study was to validate the original gene sets on a large cohort of raw breast cancer microarray data with known clinical follow-up. We recovered gene sets from 20 publications and matched them to Affymetrix HGU133A annotations. Raw Affymetrix HGU133A microarray data were extracted from the Gene Expression Omnibus (GEO) and MAS5 normalized. To classify patients using the selected gene sets, we applied prediction analysis of microarrays and constructed Kaplan–Meier plots. A new classification including all patients was generated using supervised principal components analysis. Seven studies including 1,470 patients were downloaded from GEO. Notably, we uncovered among them 641 microarrays representing 251 individual tumor specimens, which had been described repeatedly under independent GEO identifiers. We excluded all redundant data and used the remaining 1,079 samples. Eight of the 20 gene sets were able to predict response at a significance of P < 0.05. Discriminating good and poor prognosis groups on the basis of gene expression data alone resulted in high significance (P = 1.8E−12). A model including genes fitted by both gene expression and clinical covariates (lymph node status and grade) contains 44 genes and can predict response at P = 9.5E−7. The outcome provides a ranking of the gene lists by their applicability to an independent dataset. We established a consensus predictor combining the available clinical and gene expression data. The database comprising expression profiles of 1,079 breast cancers can be used to classify individual patients.
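The Kaplan–Meier plots mentioned above rest on the product-limit estimator of relapse-free survival. The following is a minimal sketch of that estimator in plain Python, not the authors' implementation; the function name `kaplan_meier` and the toy follow-up data are hypothetical and are not taken from the study.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with an observed relapse.

    times  : follow-up time per patient (for example, in months)
    events : 1 if relapse was observed at that time, 0 if censored
    """
    relapses = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)  # each patient leaves the risk set at their own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = relapses.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


# Hypothetical toy groups (assumption: not data from the paper).
good = kaplan_meier([5, 8, 12, 12, 15], [0, 1, 0, 0, 0])
poor = kaplan_meier([2, 3, 3, 6, 9], [1, 1, 1, 0, 1])
```

Plotting the two curves as step functions gives the familiar comparison of good and poor prognosis groups; a significance value such as those quoted in the abstract would additionally require a test like the log-rank test, which this sketch omits.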


Keywords: Microarray, Gene expression signature, Breast cancer prognosis, Bioinformatics



This study was supported by a Bolyai fellowship to BG.

Supplementary material

10549_2008_242_MOESM1_ESM.pdf (69 kb)
Supplemental Table 1 (PDF 69 kb)
10549_2008_242_MOESM2_ESM.txt (234.9 mb)
Supplemental Table 2 (TXT 240574 kb)
10549_2008_242_MOESM3_ESM.pdf (272 kb)
Supplemental Table 3 (PDF 273 kb)
10549_2008_242_MOESM4_ESM.pdf (138 kb)
Supplemental Table 4 (PDF 139 kb)
10549_2008_242_MOESM5_ESM.txt (22.3 mb)
Supplemental Table 5 (TXT 22795 kb)
10549_2008_242_MOESM6_ESM.txt (68 kb)
Supplemental Table 6 (TXT 69 kb)



Copyright information

© Springer Science+Business Media, LLC. 2008

Authors and Affiliations

  1. Research Group for Pediatrics and Nephrology, Hungarian Academy of Sciences and Semmelweis University, Budapest, Hungary
  2. Children’s Hospital Boston Informatics Program, Harvard-MIT Health Sciences and Technology, Boston, USA
  3. Laboratory of Molecular Tumor Pathology and Laboratory of Functional Genomics, Charité, Universitätsmedizin Berlin, Berlin, Germany
