Information Systems Frontiers, Volume 19, Issue 6, pp 1233–1241

Revealing determinant factors for early breast cancer recurrence by decision tree

  • Jimin Guo
  • Benjamin C. M. Fung
  • Farkhund Iqbal
  • Peter J. K. Kuppen
  • Rob A. E. M. Tollenaar
  • Wilma E. Mesker
  • Jean-Jacques Lebrun


Early breast cancer recurrence indicates a poor response to adjuvant therapy and threatens patients’ lives. Most existing prediction models for breast cancer recurrence are regression-based and difficult to interpret. We apply a decision tree algorithm to the clinical information of a cohort of non-metastatic invasive breast cancer patients to build a classifier that categorizes patients by whether they develop early recurrence and by similarities in their clinical and pathological diagnoses. The classifier predicts whether a patient develops early disease recurrence, with an estimated accuracy of about 70%. In an independent validation cohort of 65 patients, it predicts correctly for 55 patients (about 85%). The classifier also groups patients by intrinsic properties of their disease and, for each subgroup, lists the disease characteristics in hierarchical order according to their relevance to early relapse. Overall, it identifies pathological nodal stage, percentage of intra-tumor stroma, and components of the TGFβ-Smad signaling pathway as highly relevant factors for early breast cancer recurrence. Because most of the disease characteristics used by this classifier are results of standardized tests routinely collected during breast cancer diagnosis, the classifier can easily be adopted in various research and clinical settings.
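As a rough illustration of how a decision tree can rank disease characteristics by relevance, the sketch below scores features by information gain, one common splitting criterion. The abstract does not state which criterion the authors used. The feature names, the four toy records, and the helper functions are hypothetical and are not the study's data or code. Only the 55-of-65 validation figure comes from the abstract.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, feature):
    """Entropy reduction obtained by splitting the records on one feature."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def rank_features(rows, labels):
    """Order features from most to least informative for the outcome."""
    return sorted(
        rows[0].keys(),
        key=lambda f: information_gain(rows, labels, f),
        reverse=True,
    )


# Hypothetical toy records (NOT patient data from the study).
rows = [
    {"nodal_stage": "high", "stroma": "high"},
    {"nodal_stage": "high", "stroma": "low"},
    {"nodal_stage": "low", "stroma": "high"},
    {"nodal_stage": "low", "stroma": "low"},
]
labels = ["early", "early", "none", "none"]

# In this toy set, nodal_stage separates the outcomes perfectly; stroma does not.
ranking = rank_features(rows, labels)

# Validation accuracy reported in the abstract: 55 correct out of 65 patients.
validation_accuracy = 55 / 65
```

A tree builder would apply this ranking recursively, splitting on the top feature and then repeating within each subgroup, which is what produces a per-subgroup hierarchy of characteristics.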


Breast cancer, Recurrence, Decision tree, Classifier, Stroma, TGFβ



We would like to thank Drs. C. C. Engels, J. W. T. Dekker and E. M. de Kruijf for conducting immunohistochemistry staining, evaluating stroma percentage and recording original data, and Drs. A. Dibrov and Catalin Mihalcioiu for valuable discussions. J. Guo is supported by a Traineeship from the Breast Cancer Research Program of the Congressionally Directed Medical Research Programs (CDMRP). B. C. M. Fung is a Canada Research Chair in Data Mining for Cybersecurity. J.-J. Lebrun is a Sir William Dawson Research Chair of McGill University. This work was supported in part by grants from the Canadian Institutes of Health Research (CIHR) (fund codes 230670 and 233716 to J.-J. Lebrun), the Natural Sciences and Engineering Research Council of Canada (NSERC) Discovery Grants (fund code 356065-2013 to B. C. M. Fung), the Canada Research Chairs Program (fund code 950-230623 to B. C. M. Fung), and the Zayed University Research Incentive Fund and Research Cluster Award (fund codes R15048 and R16083 to F. Iqbal and B. C. M. Fung).


J. Guo, B. C. M. Fung and J.-J. Lebrun designed the study and analyzed and interpreted the results. F. Iqbal participated in interpreting the results. P. J. K. Kuppen, R. A. E. M. Tollenaar and W. E. Mesker collected patient samples and designed the tumor tissue microarrays.

Compliance with ethical standards

Conflict of interest

The authors declare that they have no conflict of interest.



Copyright information

© Springer Science+Business Media New York 2017

Authors and Affiliations

  • Jimin Guo (1, 2, 3)
  • Benjamin C. M. Fung (4)
  • Farkhund Iqbal (5)
  • Peter J. K. Kuppen (6)
  • Rob A. E. M. Tollenaar (6)
  • Wilma E. Mesker (6)
  • Jean-Jacques Lebrun (1)

  1. Division of Medical Oncology, McGill University Health Center, Montreal, Canada
  2. Department of Biomedical Informatics, Harvard Medical School, Boston, USA
  3. John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, USA
  4. School of Information Studies, McGill University, Montreal, Canada
  5. Zayed University, Abu Dhabi, United Arab Emirates
  6. Department of Surgery, Leiden University Medical Center, Leiden, The Netherlands
