Using big data to predict pertussis infections in Jinan city, China: a time series analysis

  • Original Paper
International Journal of Biometeorology

Abstract

This study aims to use big data (climate data, internet search query data and school calendar patterns (SCP)) to improve pertussis surveillance and prediction and to develop an early warning model for pertussis epidemics. We collected weekly pertussis notifications, SCP, climate data and internet search query data (Baidu index (BI)) in Jinan, China, between 2013 and 2017. Time series decomposition and temporal risk assessment were used to examine the epidemic features of pertussis infections. A seasonal autoregressive integrated moving average (SARIMA) model and a regression tree model were developed to predict pertussis occurrence using the identified predictors. Our study demonstrates clear seasonal patterns in pertussis epidemics; pertussis activity was most significantly associated with BI at a 2-week lag (r_BI = 0.73, p < 0.05), temperature at a 1-week lag (r_temp = 0.19, p < 0.05) and rainfall at a 2-week lag (r_rainfall = 0.27, p < 0.05). No obvious relationship between pertussis peaks and school attendance was found. Pertussis cases were more likely to be temporally concentrated throughout the epidemics during the study period. SARIMA models with 2-week-lagged BI and 1-week-lagged temperature had better predictive performance (β_search query = 0.06, p = 0.02; β_temp = 0.16, p = 0.03), with a large correlation coefficient (r = 0.67, p < 0.01) and a low root mean squared error (RMSE = 3.59). The regression tree model identified threshold values of potential predictors (search query, climate and SCP) for pertussis epidemics. Our results show that internet search queries, in conjunction with social and climatic data, can predict pertussis epidemics, providing a foundation for using such data to develop early warning systems.
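The lagged associations and the RMSE reported in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the data are synthetic, every name (`search_index`, `cases`, `lagged_correlation`) is hypothetical, and Pearson correlation is assumed because the abstract does not say which coefficient was used. The idea is to pair the predictor at week t - lag with case counts at week t and compare the correlation across candidate lags.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def lagged_correlation(predictor, cases, lag):
    """Correlate the predictor at week t - lag with cases at week t."""
    if lag == 0:
        return pearson(predictor, cases)
    return pearson(predictor[:-lag], cases[lag:])


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Synthetic weekly series (assumption: five years of 52 weeks, annual cycle).
random.seed(0)
weeks = 260
search_index = [
    50 + 30 * math.sin(2 * math.pi * t / 52) + random.gauss(0, 10)
    for t in range(weeks)
]
# Synthetic cases are built to follow the search index two weeks earlier.
cases = [
    0.1 * search_index[max(t - 2, 0)] + random.gauss(0, 0.5)
    for t in range(weeks)
]

# Scan candidate lags of 0 to 4 weeks and keep the strongest correlation.
correlations = {
    lag: lagged_correlation(search_index, cases, lag) for lag in range(5)
}
best_lag = max(correlations, key=correlations.get)

for lag, r in correlations.items():
    print(f"lag {lag} weeks: r = {r:.2f}")
print("best lag:", best_lag)
```

Because the synthetic cases are generated from the search index two weeks earlier, the scan recovers that lag. On real surveillance data the lag would have to be estimated, and both series are typically detrended or pre-whitened first so that shared seasonality does not inflate the correlation.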

Acknowledgements

We would like to express our gratitude to the Shandong Center for Disease Control and Prevention for providing pertussis surveillance data. Y.Z. was supported by the China Scholarship Council Postgraduate Scholarship and the Queensland University of Technology Higher Degree Research Tuition Fee Sponsorship. L.F. was supported by the Shandong Medical and Health Science and Technology Development Programs (award no. 2015WS0271). W.H. was supported by an Australian Research Council Future Fellowship (award no. FT140101216). K.M. was supported by an ARC Laureate Fellowship (award no. FL150100150) and an ARC Centre of Excellence in Mathematical and Statistical Frontiers (award no. CE140100049).

Author information

Corresponding author

Correspondence to Wenbiao Hu.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Electronic supplementary material

ESM 1

(DOCX 46 kb)

About this article

Cite this article

Zhang, Y., Bambrick, H., Mengersen, K. et al. Using big data to predict pertussis infections in Jinan city, China: a time series analysis. Int J Biometeorol 64, 95–104 (2020). https://doi.org/10.1007/s00484-019-01796-w

  • DOI: https://doi.org/10.1007/s00484-019-01796-w