Revealing determinant factors for early breast cancer recurrence by decision tree
Early breast cancer recurrence indicates a poor response to adjuvant therapy and threatens patients' lives. Most existing prediction models for breast cancer recurrence are regression-based and difficult to interpret. We apply a Decision Tree algorithm to the clinical information of a cohort of non-metastatic invasive breast cancer patients to establish a classifier that categorizes patients according to whether they develop early recurrence and according to similarities in their clinical and pathological diagnoses. The classifier predicts whether a patient developed early disease recurrence and is estimated to be about 70% accurate. In an independent validation cohort of 65 patients, the classifier predicts correctly for 55 patients. The classifier also groups patients based on intrinsic properties of their diseases and, for each subgroup, lists the disease characteristics in hierarchical order according to their relevance to early relapse. Overall, it identifies pathological nodal stage, percentage of intra-tumor stroma and components of the TGFβ-Smad signaling pathway as highly relevant factors for early breast cancer recurrence. Since most of the disease characteristics used by this classifier are results of standardized tests that are routinely collected during breast cancer diagnosis, the classifier can easily be adopted in various research and clinical settings.
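To illustrate how a decision tree can order disease characteristics by relevance, the following minimal sketch ranks two features by information gain, a common splitting criterion. The abstract does not state which criterion or software the authors used, so the criterion, the feature names (`nodal_stage`, `stroma`) and the six toy records are all hypothetical and are not the study's data. The last line only reproduces the arithmetic for the validation cohort reported above (55 of 65 patients).

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, feature):
    """Reduction in label entropy obtained by splitting the rows on one feature."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy records (assumption for illustration, NOT the study cohort).
rows = [
    {"nodal_stage": "N0", "stroma": "low"},
    {"nodal_stage": "N0", "stroma": "high"},
    {"nodal_stage": "N0", "stroma": "low"},
    {"nodal_stage": "N2", "stroma": "high"},
    {"nodal_stage": "N2", "stroma": "low"},
    {"nodal_stage": "N2", "stroma": "high"},
]
labels = ["no", "no", "no", "yes", "yes", "yes"]  # early recurrence yes/no

# Features with higher gain would sit nearer the root of the tree.
ranked = sorted(rows[0], key=lambda f: information_gain(rows, labels, f), reverse=True)

# Validation accuracy implied by the counts reported in the abstract.
validation_accuracy = 55 / 65
```

In this toy data the nodal stage separates the two outcomes perfectly, so it ranks first. A full tree learner would apply the same ranking recursively within each resulting subgroup, which is how a per-subgroup hierarchy of characteristics can arise.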
Keywords: Breast cancer, Recurrence, Decision tree, Classifier, Stroma, TGFβ
We would like to thank Drs. C. C. Engels, J. W. T. Dekker and E. M. de Kruijf for conducting immunohistochemistry staining, evaluating stroma percentage and recording original data; and Drs. A. Dibrov and Catalin Mihalcioiu for valuable discussions. J. Guo is supported by a Traineeship from the Breast Cancer Research Program of Congressionally Directed Medical Research Program (CDMRP). B. C. M. Fung is a Canada Research Chair in Data Mining for Cybersecurity. J.-J. Lebrun is a Sir William Dawson Research Chair of McGill University. This work was supported in part by grants from the Canadian Institutes for Health Research (CIHR) (fund codes 230670 and 233716 to J.-J. Lebrun), the Natural Sciences and Engineering Research Council of Canada (NSERC) Discovery Grants (fund code 356065-2013 to B. C. M. Fung), Canada Research Chairs Program (fund code 950-230623 to B. C. M. Fung), and Zayed University Research Incentive Fund and Research Cluster Award (fund codes R15048 and R16083 to F. Iqbal and B. C. M. Fung).
J. Guo, B. C. M. Fung and J.-J. Lebrun designed the study and analyzed and interpreted the results. F. Iqbal participated in interpreting the results. P. J. K. Kuppen, R. A. E. M. Tollenaar and W. E. Mesker collected patient samples and designed the tumor tissue microarrays.
Compliance with ethical standards
Conflict of interest
The authors declare that they have no conflict of interest.