Abstract
Pediatric acute lymphoblastic leukemia (ALL) was analyzed with machine learning (ML) techniques to determine the significance of clinical and phenotypic variables, as well as environmental conditions, that may help identify the underlying causes of childhood ALL. Fifty pediatric patients (n = 50) diagnosed with ALL were included according to the inclusion and exclusion criteria. Clinical variables comprised the blood biochemistry results (CBC, LFTs, RFTs) and the type of ALL, i.e., T-ALL or B-ALL. Phenotypic data included the age and sex of the child and consanguinity, while environmental factors included habitat, socioeconomic status, and access to filtered drinking water. Fifteen features/attributes were collected for each case. To identify the most useful discriminating attributes, four supervised ML algorithms were used: classification and regression trees (CART), random forest (RM), gradient boosted machine (GM), and the C5.0 decision tree algorithm. To estimate the accuracy of the derived CART model on future data, ten-fold cross-validation was performed on the present data set. ALL was common in male children below 5 years of age who belonged to middle-class families in rural areas. B-ALL was more frequent than T-ALL. Consanguinity was present in 54% of cases. Low platelet and hemoglobin levels and high white blood cell counts were reported in children with ALL. CART provided the best fit for the entire data set, yielding a 99.83% model fit accuracy and a misclassification rate of 0.17% on the entire sample space, while C5.0 reported 98.6%, random forest 94.44%, and gradient boosted machine 95.61%. The variable importance of the primary discriminating attributes was platelets 43%, hemoglobin 24%, white blood cells 4%, and sex of the child 4%. An overall accuracy of 87.4% was recorded for the classifier.
Platelet count abnormality can be considered a major factor in predicting pediatric ALL. Machine learning algorithms can be applied efficiently to provide prognostic details for better treatment outcomes.
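The abstract reports two different kinds of accuracy: a model fit accuracy measured on the entire data set and an accuracy estimated by ten-fold cross-validation. The sketch below illustrates how the two quantities are computed. It is not the authors' implementation. The data are synthetic (a single made-up "platelet count" feature with hypothetical distributions), a one-threshold decision stump stands in for a full CART tree, and all function names are assumptions introduced for illustration.

```python
import random


def make_synthetic(n=200, seed=0):
    """Return synthetic (platelet_count, label) pairs. NOT patient data."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = rng.random() < 0.5
        # Hypothetical distributions: label 1 tends to have lower counts.
        mean = 80.0 if label else 250.0
        rows.append((rng.gauss(mean, 60.0), int(label)))
    return rows


def fit_stump(rows):
    """Find the threshold t with the fewest training errors for the
    rule 'predict 1 if x < t' (a single-split, CART-like stump)."""
    xs = sorted({x for x, _ in rows})
    # Candidate thresholds are midpoints between neighbouring values.
    candidates = [(a + b) / 2 for a, b in zip(xs, xs[1:])] or xs
    best_t, best_err = None, None
    for t in candidates:
        err = sum((x < t) != bool(y) for x, y in rows)
        if best_err is None or err < best_err:
            best_t, best_err = t, err
    return best_t


def accuracy(t, rows):
    """Fraction of rows classified correctly by threshold t."""
    return sum((x < t) == bool(y) for x, y in rows) / len(rows)


def k_fold_accuracy(rows, k=10, seed=0):
    """Pooled accuracy over k held-out folds (k-fold cross-validation)."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    folds = [rows[i::k] for i in range(k)]
    correct = 0
    for i, test in enumerate(folds):
        # Train on every fold except the held-out one.
        train = [r for j, f in enumerate(folds) if j != i for r in f]
        t = fit_stump(train)
        correct += sum((x < t) == bool(y) for x, y in test)
    return correct / len(rows)


if __name__ == "__main__":
    data = make_synthetic()
    threshold = fit_stump(data)
    print("fit accuracy on all data:", accuracy(threshold, data))
    print("ten-fold CV accuracy:", k_fold_accuracy(data, k=10))
```

The fit accuracy scores the model on the same rows it was trained on, whereas the cross-validated figure scores each row with a model that never saw it, so the second is usually the more realistic estimate of performance on future data.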
Acknowledgements
We are thankful to the Department of Hematology and Oncology, Children Hospital and Institute of Child Health, Lahore for the provision of data.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
The study conformed to the institute’s ethical standards.
About this article
Cite this article
Mahmood, N., Shahid, S., Bakhshi, T. et al. Identification of significant risks in pediatric acute lymphoblastic leukemia (ALL) through machine learning (ML) approach. Med Biol Eng Comput 58, 2631–2640 (2020). https://doi.org/10.1007/s11517-020-02245-2
DOI: https://doi.org/10.1007/s11517-020-02245-2