Abstract
Esophageal cancer (EC) is a significant health concern worldwide, and predicting its metastatic progression is essential for planning effective treatment. Histopathological examination is the gold standard for diagnosing esophageal cancer metastasis (ECM), but it is invasive. We introduce a noninvasive, data-driven approach that applies several machine learning (ML) algorithms to clinical data from The Cancer Genome Atlas (TCGA) to predict the risk of ECM. Among these algorithms, CatBoost performed best, achieving 73% accuracy and a 75% area under the curve (AUC) under 5-fold cross-validation, with a standard deviation of 4% across the five folds. We plotted feature importances and feature correlations to explain the models' decisions. Our findings highlight associations between the risk of metastasis in EC patients and height, weight, age, alcohol consumption, the number of packs smoked, and tumor location. This approach offers a promising way to improve EC metastasis prediction while minimizing invasive procedures.
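The evaluation protocol summarized above (per-fold accuracy and AUC under 5-fold cross-validation, reported as a mean with a standard deviation across folds) can be illustrated with a minimal, standard-library-only sketch. Everything in it is an assumption made for illustration: the data are synthetic rather than TCGA clinical records, a nearest-centroid scorer stands in for CatBoost, the folds are stratified by class, and the sample standard deviation is used, none of which is confirmed by the abstract.

```python
import random
import statistics


def auc_score(y_true, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    # Fraction of positive/negative pairs ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(y, k, seed=0):
    """Split row indices into k folds that preserve the class balance."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for label in sorted(set(y)):
        idx = [i for i, v in enumerate(y) if v == label]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds


def fit_centroid_model(X_train, y_train):
    """Hypothetical stand-in classifier (NOT CatBoost): nearest class centroid."""
    def centroid(label):
        rows = [x for x, y in zip(X_train, y_train) if y == label]
        return [sum(col) / len(rows) for col in zip(*rows)]

    c0, c1 = centroid(0), centroid(1)

    def dist2(x, c):
        return sum((a - b) ** 2 for a, b in zip(x, c))

    # Positive score means the row is closer to the class-1 centroid.
    return lambda x: dist2(x, c0) - dist2(x, c1)


def cross_validate(X, y, k=5, seed=0):
    """Return mean and sample standard deviation of accuracy and AUC over k folds."""
    accs, aucs = [], []
    for test_idx in stratified_folds(y, k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        score = fit_centroid_model([X[i] for i in train_idx],
                                   [y[i] for i in train_idx])
        y_test = [y[i] for i in test_idx]
        s_test = [score(X[i]) for i in test_idx]
        preds = [1 if s > 0 else 0 for s in s_test]
        accs.append(sum(p == t for p, t in zip(preds, y_test)) / len(y_test))
        aucs.append(auc_score(y_test, s_test))
    return {
        "n_folds": len(accs),
        "accuracy_mean": statistics.mean(accs),
        "accuracy_std": statistics.stdev(accs),  # sample std (an assumption)
        "auc_mean": statistics.mean(aucs),
        "auc_std": statistics.stdev(aucs),
    }


# Synthetic two-feature data: the first feature carries signal, the second is noise.
data_rng = random.Random(42)
y = [i % 2 for i in range(200)]
X = [[data_rng.gauss(1.5 * label, 1.0), data_rng.gauss(0.0, 1.0)] for label in y]

results = cross_validate(X, y, k=5)
print(f"accuracy: {results['accuracy_mean']:.2f} +/- {results['accuracy_std']:.2f}")
print(f"AUC:      {results['auc_mean']:.2f} +/- {results['auc_std']:.2f}")
```

Reporting the spread across folds alongside the mean, as the abstract does, indicates how much the estimate depends on the particular train/test split, which matters for small clinical cohorts.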
Funding
The author(s) declare that this research was supported under the Promotion of University Research and Scientific Excellence (PURSE) (SR/PURSE/2022/121) grant from the Department of Science and Technology, Govt. of India, New Delhi to the Islamic University of Science and Technology (IUST), Awantipora. The study was also supported under Employment and Skill Enhancement Enablement of High Computing and e-learning through IUST Cloud accorded by the Higher Education Department Government of Jammu & Kashmir vide Order No. 77-JK(HE) of 2021 for HEDSS2021100686.
Ethics declarations
Conflict of interest
The authors declare that they have no potential conflicts of interest.
Human and animal rights
This research did not involve human participants or animals.
Informed consent
Not applicable.
About this article
Cite this article
Aalam, S.W., Ahanger, A.B., Assad, A. et al. Noninvasive prediction of metastasis in esophageal cancer using ensemble-based feature selection. Int J Syst Assur Eng Manag (2024). https://doi.org/10.1007/s13198-024-02327-6
DOI: https://doi.org/10.1007/s13198-024-02327-6