Machine learning for risk stratification of thyroid cancer patients: a 15-year cohort study

European Archives of Oto-Rhino-Laryngology

Abstract

Purpose

The objective of this study was to train machine learning models for predicting the likelihood of recurrence in patients diagnosed with well-differentiated thyroid cancer. While thyroid cancer mortality remains low, the risk of recurrence is a significant concern. Identifying individual patient recurrence risk is crucial for guiding subsequent management and follow-ups.

Methods

In this prospective study, a cohort of 383 patients was observed for a minimum duration of 10 years within a 15-year timeframe. Thirteen clinicopathologic features were assessed to predict recurrence potential. Classical machine learning models (K-nearest neighbors, support vector machines (SVM), and tree-based models) and artificial neural networks (ANN) were trained on three distinct combinations of features: a data set with all features excluding the American Thyroid Association (ATA) risk score (12 features), another with ATA risk alone, and a third with all features combined (13 features). A total of 283 patients were allocated to training, and 100 patients were reserved for the validation stage.
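The data partition and the three feature configurations described above can be sketched as follows. This is a minimal illustration, not the authors' code: the abstract does not name the 13 features or say how the 283/100 split was drawn, so the feature names, the random shuffle, and the helper functions `split_patients` and `select_features` are all assumptions.

```python
import random

N_PATIENTS = 383   # cohort size reported in the abstract
N_TRAIN = 283      # patients allocated to training (the remaining 100 are for validation)

# Placeholder names: the abstract does not list the clinicopathologic features.
OTHER_FEATURES = [f"feature_{i}" for i in range(1, 13)]  # 12 non-ATA features
ATA_RISK = "ata_risk"

# The three feature combinations the models were trained on.
FEATURE_SETS = {
    "without_ata": OTHER_FEATURES,                # 12 features
    "ata_only": [ATA_RISK],                       # 1 feature
    "all_features": OTHER_FEATURES + [ATA_RISK],  # 13 features
}


def split_patients(patient_ids, n_train, seed=0):
    """Shuffle patient IDs and split them into training and validation lists.

    A seeded random shuffle is an assumption; the abstract only gives the sizes.
    """
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)
    return ids[:n_train], ids[n_train:]


def select_features(record, feature_names):
    """Return the values of the chosen features from one patient record (a dict)."""
    return [record[name] for name in feature_names]


train_ids, validation_ids = split_patients(range(N_PATIENTS), N_TRAIN)
```

Each model would then be fitted once per entry in `FEATURE_SETS`, using only the training patients, and scored on the held-out validation patients.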

Results

The patients' mean age was 40.87 ± 15.13 years, with a majority being female (81%). When using the full data set for training, the models showed the following sensitivity, specificity, and AUC, respectively: SVM (99.33%, 97.14%, 99.71), K-nearest neighbors (83%, 97.14%, 98.44), Decision Tree (87%, 100%, 99.35), Random Forest (99.66%, 94.28%, 99.38), ANN (96.6%, 95.71%, 99.64). Eliminating ATA risk data increased the models' specificity but decreased their sensitivity. Conversely, training exclusively on ATA risk data had the opposite effect.
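For readers unfamiliar with the three reported metrics, the following sketch shows how sensitivity, specificity, and AUC are conventionally computed for a binary recurrence label. It uses the standard definitions only; the function names and the toy labels and scores are illustrative assumptions, not data or code from the study.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, where 1 = recurrence.

    Assumes both classes are present, so neither denominator is zero.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # recurrences caught
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # recurrences missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-recurrences cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation.

    Equals the fraction of (positive, negative) pairs in which the positive
    case receives the higher score, counting ties as half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (made-up values): 3 recurrences, 4 non-recurrences.
y_true = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 threshold is an assumption

sens, spec = sensitivity_specificity(y_true, y_pred)
area = auc(y_true, scores)
```

Sensitivity and specificity depend on the chosen decision threshold, while AUC summarizes ranking quality across all thresholds, which is why a model can show a high AUC alongside a lower sensitivity at a fixed cut-off.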

Conclusions

Machine learning models, including both classical algorithms and neural networks, efficiently stratify the risk of recurrence in patients with well-differentiated thyroid cancer. This can aid in tailoring treatment intensity and determining appropriate follow-up intervals.

Data availability

The data sets used and/or analysed during the current study are available from the corresponding author on reasonable request.

Code availability

The underlying code for this study is not publicly available but may be made available to qualified researchers on reasonable request from the corresponding author.

Acknowledgements

We sincerely thank Professor Peter Szolovits for his valuable comments on the present article. The article was submitted with the ethics identifier IR.UMSHA.REC.1402.360 at Hamadan University of Medical Sciences, Hamadan, Iran.

Author information

Contributions

All authors have made significant contributions to this work. SB, MG, and AT collected the initial and follow-up information of the patients and participated in data cleaning and preprocessing. AT and MG contributed to the development and coding of the machine learning models. GB and GRL reviewed the data set and article for potential errors and biases regarding machine learning principles. In addition, all authors have been involved in the writing of the article.

Corresponding author

Correspondence to Aidin Tarokhian.

Ethics declarations

Conflict of interest

All authors declare no financial or non-financial competing interests. No funding was received for conducting this study.

Ethics declaration

The author Jerome R. Lechien is also guest editor of the special issue on ‘ChatGPT and Artificial Intelligence in Otolaryngology-Head and Neck Surgery’. He was not involved with the peer review process of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Borzooei, S., Briganti, G., Golparian, M. et al. Machine learning for risk stratification of thyroid cancer patients: a 15-year cohort study. Eur Arch Otorhinolaryngol 281, 2095–2104 (2024). https://doi.org/10.1007/s00405-023-08299-w

  • DOI: https://doi.org/10.1007/s00405-023-08299-w
