Abstract
This study aims to use big data (climate data, internet search query data and school calendar patterns (SCP)) to improve pertussis surveillance and prediction, and to develop an early warning model for pertussis epidemics. We collected weekly pertussis notifications, SCP, climate data and internet search query data (Baidu index (BI)) for Jinan, China, between 2013 and 2017. Time series decomposition and temporal risk assessment were used to examine the epidemic features of pertussis infections. A seasonal autoregressive integrated moving average (SARIMA) model and a regression tree model were developed to predict pertussis occurrence using the identified predictors. Our study demonstrates clear seasonal patterns in pertussis epidemics. Pertussis activity was most significantly associated with BI at a 2-week lag (r_BI = 0.73, p < 0.05), temperature at a 1-week lag (r_temp = 0.19, p < 0.05) and rainfall at a 2-week lag (r_rainfall = 0.27, p < 0.05). No obvious relationship between pertussis peaks and school attendance was found. Pertussis cases were more likely to be temporally concentrated throughout the epidemics during the study period. SARIMA models with 2-week-lagged BI and 1-week-lagged temperature had better predictive performance (β_search query = 0.06, p = 0.02; β_temp = 0.16, p = 0.03), with a large correlation coefficient (r = 0.67, p < 0.01) and a low root mean squared error (RMSE = 3.59). The regression tree model identified threshold values of potential predictors (search query, climate and SCP) for pertussis epidemics. Our results show that internet search queries, in conjunction with social and climatic data, can predict pertussis epidemics, which provides a foundation for using such data to develop early warning systems.
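The lagged correlations and the RMSE reported above can be illustrated with a minimal sketch. This is not the authors' code: the function names (`pearson`, `lagged_correlation`, `rmse`) and the toy series are assumptions made up for illustration, and the example uses only the Python standard library rather than a time series package.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


def lagged_correlation(predictor, cases, lag):
    """Correlate the predictor at week t - lag with case counts at week t."""
    if len(predictor) != len(cases):
        raise ValueError("series must have the same length")
    if lag < 0 or lag >= len(cases) - 1:
        raise ValueError("lag must leave at least 2 overlapping weeks")
    if lag == 0:
        return pearson(predictor, cases)
    # Drop the last `lag` predictor weeks and the first `lag` case weeks
    # so that predictor[t - lag] is paired with cases[t].
    return pearson(predictor[:-lag], cases[lag:])


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Toy weekly data (invented): cases follow the search index two weeks later.
search_index = [10, 20, 30, 20, 10, 20, 30, 20, 10, 20]
weekly_cases = [5, 5] + [s // 10 for s in search_index[:-2]]

for lag in range(4):
    r = lagged_correlation(search_index, weekly_cases, lag)
    print(f"lag {lag} weeks: r = {r:.2f}")
```

In this toy series the correlation is strongest at a 2-week lag by construction. With real surveillance data, the lag with the largest significant correlation would be the natural candidate for entry into a model such as SARIMA as an external regressor.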






Acknowledgements
We would like to express our gratitude to the Shandong Center for Disease Control and Prevention for providing pertussis surveillance data. Y.Z. was supported by the China Scholarship Council Postgraduate Scholarship and the Queensland University of Technology Higher Degree Research Tuition Fee Sponsorship. L.F. was supported by Shandong Medical and Health Science and Technology Development Programs (award no. 2015WS0271). W.H. was supported by an Australian Research Council (ARC) Future Fellowship (award no. FT140101216). K.M. was supported by an ARC Laureate Fellowship (award no. FL150100150) and the ARC Centre of Excellence in Mathematical and Statistical Frontiers (award no. CE140100049).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Electronic supplementary material
ESM 1 (DOCX 46 kb)
Cite this article
Zhang, Y., Bambrick, H., Mengersen, K. et al. Using big data to predict pertussis infections in Jinan city, China: a time series analysis. Int J Biometeorol 64, 95–104 (2020). https://doi.org/10.1007/s00484-019-01796-w
DOI: https://doi.org/10.1007/s00484-019-01796-w


