Abstract
Leptospirosis is a zoonosis that has been linked to hydrometeorological variability. Hydrometeorological averages and extremes have previously been used as drivers in the statistical prediction of the disease, but their relative importance and predictive capacity remain poorly understood. This study explored the use of a random forest classifier to analyze the relative importance of hydrometeorological indices in developing a leptospirosis model and to evaluate model performance by the type of indices used, drawing on case data from three districts in Kelantan, Malaysia, that experience annual monsoonal rainfall and flooding. First, hydrometeorological data including rainfall, streamflow, water level, relative humidity, and temperature were transformed into 164 weekly average and extreme indices in accordance with the Expert Team on Climate Change Detection and Indices (ETCCDI). Then, weekly case occurrences were classified into the binary classes “high” and “low” based on an average threshold. Seventeen models based on “average,” “extreme,” and “mixed” indices were trained by optimizing the feature subsets based on the model-computed mean decrease Gini (MDG) scores. Variable importance was assessed through cross-correlation analysis and the MDG score. The average and extreme models showed similar prediction accuracy ranges (61.5–76.1% and 72.3–77.0%, respectively), while the mixed models showed an improvement (71.7–82.6%). An extreme model was the most sensitive, while an average model was the most specific. The time lags associated with the driving indices agreed with the seasonality of the monsoon. Rainfall (as an extreme index) was the most important variable in classifying leptospirosis occurrence, while streamflow was the least important despite showing higher correlations with leptospirosis.
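The labelling and evaluation steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' code. It assumes that a week is "high" when its case count is strictly above the mean weekly count (the abstract says only "an average threshold"), and the function names `label_weeks` and `evaluate`, the case counts, and the predictions are all hypothetical. The random forest itself is not shown.

```python
from statistics import mean


def label_weeks(weekly_cases):
    """Label a week "high" if its count is above the mean weekly count.

    Assumption: the threshold is the arithmetic mean and the comparison
    is strict; the abstract does not specify either detail.
    """
    threshold = mean(weekly_cases)
    return ["high" if count > threshold else "low" for count in weekly_cases]


def evaluate(actual, predicted, positive="high"):
    """Return accuracy, sensitivity and specificity for binary labels."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    tn = sum(a != positive and p != positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Sensitivity: share of truly "high" weeks that were predicted "high".
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Specificity: share of truly "low" weeks that were predicted "low".
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Made-up weekly case counts and made-up classifier output, for illustration only.
weekly_cases = [0, 1, 5, 2, 0, 7, 3, 0]
actual = label_weeks(weekly_cases)
predicted = ["low", "low", "high", "high", "low", "high", "low", "low"]
scores = evaluate(actual, predicted)
print(scores)
```

In this framing, a model described as "most sensitive" misses the fewest "high" weeks, while one described as "most specific" raises the fewest false alarms on "low" weeks.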
Acknowledgements
We acknowledge the Department of Health Kelantan for providing access to the case data and the Department of Irrigation and Drainage Malaysia for providing the hydrological data. The authors would like to thank the Director General of Health Malaysia for the permission to publish this paper.
Funding
This work was supported by grants from the Ministry of Higher Education Malaysia (NEWTON/1/2018/WAB05/UPM/1) and from the UK Natural Environment Research Council (NE/S003053/1) under the Understanding of the Impacts of Hydrometeorological Hazards in South East Asia program.
Author information
Contributions
Veianthan Jayaramu: conceptualization, methodology, software, formal analysis, investigation, data curation, writing—original draft, visualization, and project administration. Zed Zulkafli: conceptualization, methodology, validation, writing—review and editing, visualization, supervision, project administration, and funding acquisition. Simon De Stercke: writing—review and editing, supervision, and project administration. Wouter Buytaert: writing—review and editing, project administration, and funding acquisition. Fariq Rahmat: software and data curation. Ribhan Zafira Abdul Rahman: writing—review and editing. Asnor Juraiza Ishak: writing—review and editing. Wardah Tahir: writing—review and editing. Jamalludin Ab Rahman: writing—review and editing. Nik Mohd Hafiz Mohd Fuzi: resources and writing—review and editing.
Ethics declarations
Ethical approval
Ethical approval for this study was obtained from the Medical Research and Ethics Committee, Ministry of Health Malaysia (NMRR-19-4115-47702).
Conflict of interest
The authors declare no competing interests.
About this article
Cite this article
Jayaramu, V., Zulkafli, Z., De Stercke, S. et al. Leptospirosis modelling using hydrometeorological indices and random forest machine learning. Int J Biometeorol 67, 423–437 (2023). https://doi.org/10.1007/s00484-022-02422-y
DOI: https://doi.org/10.1007/s00484-022-02422-y