Abstract
The transcriptomes of breast cancers have been extensively screened with microarrays, and large sets of genes associated with clinical features have been established. The aim of this study was to validate the original gene sets on a large cohort of raw breast cancer microarray data with known clinical follow-up. We retrieved gene sets from 20 publications and matched them to Affymetrix HGU133A annotations. Raw Affymetrix HGU133A microarray data were extracted from GEO and MAS5-normalized. To classify patients using the selected gene sets, we applied prediction analysis of microarrays and constructed Kaplan–Meier plots. A new classification including all patients was generated using supervised principal components analysis. Seven studies including 1,470 patients were downloaded from GEO. Notably, among them we identified 641 microarrays representing 251 individual tumor specimens, which had been described repeatedly under independent GEO identifiers. We excluded all redundant data and used the remaining 1,079 samples. Eight of the 20 gene sets were able to predict response at a significance of P < 0.05. Discriminating good and poor prognosis groups on the basis of gene expression data alone yielded high significance (P = 1.8E−12). A model fitted on both gene expression and clinical covariates (lymph node status and grade) contains 44 genes and predicts response at P = 9.5E−7. The outcome provides a ranking of the gene lists by their applicability to an independent dataset. We established a consensus predictor combining the available clinical and gene expression data. The database, comprising expression profiles of 1,079 breast cancers, can be used to classify individual patients.
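As an illustration of the Kaplan–Meier step mentioned above, the following is a minimal sketch of the product-limit estimator in plain Python. It is not the authors' code: the function name `kaplan_meier`, the variable names, and the toy follow-up data are assumptions made for this example, and the study itself does not state which software produced its plots.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with an observed event.

    times  -- follow-up time per patient (any consistent unit)
    events -- 1 if relapse was observed at that time, 0 if censored
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    relapses = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)  # everyone who exits the risk set at time t
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = relapses.get(t, 0)
        if d:
            # Product-limit update: multiply by the fraction surviving time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Patients censored at t still count as at risk at t, then leave.
        at_risk -= leaving[t]
    return curve


# Toy follow-up data (hypothetical, not taken from the study).
good = kaplan_meier([5, 8, 12, 12, 15], [0, 1, 0, 0, 0])
poor = kaplan_meier([2, 3, 3, 6, 9], [1, 1, 0, 1, 1])

for label, curve in (("good", good), ("poor", poor)):
    print(label, [(t, round(s, 3)) for t, s in curve])
```

In practice, one curve would be computed per predicted prognosis group and the groups compared with a log-rank test, which this sketch does not implement.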
Acknowledgment
This study was supported by a Bolyai fellowship to BG.
Cite this article
Györffy, B., Schäfer, R. Meta-analysis of gene expression profiles related to relapse-free survival in 1,079 breast cancer patients. Breast Cancer Res Treat 118, 433–441 (2009). https://doi.org/10.1007/s10549-008-0242-8
DOI: https://doi.org/10.1007/s10549-008-0242-8