Breast Cancer Research and Treatment, Volume 123, Issue 3, pp 725–731

An online survival analysis tool to rapidly assess the effect of 22,277 genes on breast cancer prognosis using microarray data of 1,809 patients

  • Balazs Györffy
  • Andras Lanczky
  • Aron C. Eklund
  • Carsten Denkert
  • Jan Budczies
  • Qiyuan Li
  • Zoltan Szallasi
Preclinical study


Validating prognostic or predictive candidate genes in appropriately powered breast cancer cohorts is of utmost interest. Our aim was to develop an online tool that draws survival plots and can be used to assess the relevance of the expression levels of various genes to clinical outcome in both untreated and treated breast cancer patients. A background database was established using gene expression data and survival information of 1,809 patients downloaded from GEO (Affymetrix HGU133A and HGU133+2 microarrays). The median relapse-free survival is 6.43 years, 968/1,231 patients are estrogen-receptor (ER) positive, and 190/1,369 are lymph-node positive. After quality control and normalization, only probes present on both Affymetrix platforms were retained (n = 22,277). To analyze the prognostic value of a particular gene, the cohorts are divided into two groups according to the median (or upper/lower quartile) expression of the gene. The two groups can be compared in terms of relapse-free survival, overall survival, and distant metastasis-free survival. A survival curve is displayed, and the hazard ratio with 95% confidence intervals and the logrank P value are calculated and displayed. Additionally, three subgroups of patients can be assessed: systemically untreated patients, endocrine-treated ER-positive patients, and patients with a distribution of clinical characteristics representative of those seen in general clinical practice in the US. Web address: We used this integrative data analysis tool to confirm the prognostic power of the proliferation-related genes TOP2A and TOP2B, MKI67, CCND2, CCND3, CCNDE2, as well as CDKN1A, and TK2. We also validated the capability of microarrays to determine estrogen receptor status in 1,231 patients. The tool is highly valuable for the preliminary assessment of biomarkers, especially for research groups with limited bioinformatic resources.
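The analysis described above (median split of a gene's expression, a survival curve per group, and a logrank comparison) can be illustrated with a minimal sketch. This is not the tool's actual code: the function names and the toy cohort are hypothetical, the survival curve uses the Kaplan-Meier estimator, and the hazard ratio is a simple Peto-style estimate derived from the logrank statistic rather than a fitted regression model with confidence intervals.

```python
import math
from statistics import median


def split_by_median(expression):
    """Flag each patient as high (True) or low (False) relative to the cohort median."""
    cutoff = median(expression)
    return [value > cutoff for value in expression]


def kaplan_meier(times, events):
    """Kaplan-Meier estimate as a list of (event time, survival probability) steps."""
    survival, curve = 1.0, []
    for t in sorted({u for u, e in zip(times, events) if e}):
        at_risk = sum(1 for u in times if u >= t)
        failures = sum(1 for u, e in zip(times, events) if u == t and e)
        survival *= 1.0 - failures / at_risk
        curve.append((t, survival))
    return curve


def logrank(times, events, high):
    """Two-group logrank test: returns (chi-square, P value, Peto-style hazard ratio)."""
    obs_minus_exp, variance = 0.0, 0.0
    for t in sorted({u for u, e in zip(times, events) if e}):
        n = sum(1 for u in times if u >= t)                       # at risk, both groups
        n1 = sum(1 for u, h in zip(times, high) if u >= t and h)  # at risk, high group
        d = sum(1 for u, e in zip(times, events) if u == t and e)
        d1 = sum(1 for u, e, h in zip(times, events, high) if u == t and e and h)
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    chi2 = obs_minus_exp ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # chi-square tail probability, 1 df
    hazard_ratio = math.exp(obs_minus_exp / variance)  # approximation, high vs low
    return chi2, p_value, hazard_ratio


# Hypothetical toy cohort: expression of one probe, follow-up in years,
# and event indicator (1 = relapse observed, 0 = censored).
expression = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
times = [9, 8, 7, 6, 1, 2, 3, 4]
events = [0, 1, 0, 1, 1, 1, 1, 1]

high = split_by_median(expression)
high_curve = kaplan_meier([t for t, h in zip(times, high) if h],
                          [e for e, h in zip(events, high) if h])
chi2, p_value, hazard_ratio = logrank(times, events, high)
print(f"chi2={chi2:.2f}  P={p_value:.4f}  HR(high vs low)={hazard_ratio:.1f}")
```

Replacing the median cutoff with an upper or lower quartile only changes `split_by_median`; the survival curve and logrank steps stay the same.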


Keywords: Survival analysis, Breast cancer, Prognosis



B.G. was supported by a Bolyai fellowship. Z.S. was supported by the Breast Cancer Research Foundation.


Copyright information

© Springer Science+Business Media, LLC. 2009

Authors and Affiliations

  • Balazs Györffy (1), corresponding author
  • Andras Lanczky (1, 2)
  • Aron C. Eklund (3)
  • Carsten Denkert (4)
  • Jan Budczies (4)
  • Qiyuan Li (3)
  • Zoltan Szallasi (3, 5)

  1. Joint Research Laboratory of the Hungarian Academy of Sciences and the Semmelweis University, Budapest, Hungary
  2. Pazmany Peter University, Budapest, Hungary
  3. Center for Biological Sequence Analysis, Technical University of Denmark, Lyngby, Denmark
  4. Charité Universitaetsmedizin, Berlin, Germany
  5. Children’s Hospital Informatics Program at the Harvard-MIT Division of Health Sciences and Technology (CHIP@HST), Harvard Medical School, Boston, USA
