Abstract
Understanding dynamics in time and the predominant underlying factors that shape them is a central question in biological and medical sciences. Data are more ubiquitous and richer than ever before, and population biology in the big data era needs to integrate novel methods. Calibrated Individual Based Models (IBMs) are powerful tools for process-based predictive modelling. Intervention analysis examines the potential impact on a time series of an event, such as an extreme event or an experimentally designed intervention, for example vaccinating a population. A method for big data analytics (causal impact), which implements a Bayesian intervention approach to estimating the causal effect of a designed intervention on a time series, is used to quantify the deviance between data and IBM outputs. Having quantified the deviance between IBM outputs and data, IBM scenarios are used to predict the counterfactual, that is, how the IBM response metric would have evolved after the intervention if the intervention had never occurred. The method is illustrated by quantifying the deviance between the outputs of a calibrated IBM and epidemiological data on Bovine Tuberculosis, with a change in the cattle TB testing frequency as the intervention covariate. The advantage of treating IBM data validation and uncertainty assessment as time series is also discussed.
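The counterfactual logic described above can be illustrated with a minimal sketch. This is not the Bayesian structural time-series model of the causal impact method; it swaps in an ordinary least-squares regression fitted on the pre-intervention period only, and it yields no credible intervals. The function name `counterfactual_effect`, the toy series, and the intervention index `t0` are all hypothetical and chosen purely for illustration. The control series `x` is assumed to be unaffected by the intervention.

```python
import numpy as np


def counterfactual_effect(y, x, t0):
    """Simplified intervention analysis (OLS stand-in, not a Bayesian model).

    y  : response time series
    x  : control series assumed unaffected by the intervention
    t0 : index of the first post-intervention time step
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)

    # Fit y ~ a + b * x using pre-intervention observations only.
    X_pre = np.column_stack([np.ones(t0), x[:t0]])
    coef, *_ = np.linalg.lstsq(X_pre, y[:t0], rcond=None)

    # Project the fitted relationship into the post-intervention period:
    # this is the counterfactual (what y would have been with no intervention).
    X_post = np.column_stack([np.ones(len(y) - t0), x[t0:]])
    y_cf = X_post @ coef

    # Pointwise effect = observed minus counterfactual; sum gives cumulative effect.
    pointwise = y[t0:] - y_cf
    return y_cf, pointwise, pointwise.sum()


# Toy example (synthetic numbers, not data from the study):
# y follows 2 + 3 * x exactly, then shifts upward by 5 from step 6 onward.
x = np.arange(10)
y = 2.0 + 3.0 * x
y[6:] += 5.0

y_cf, pointwise, cumulative = counterfactual_effect(y, x, t0=6)
print("counterfactual:", y_cf)
print("pointwise effect:", pointwise)
print("cumulative effect:", cumulative)
```

In the toy series the pre-intervention relationship is exact, so the estimated pointwise effect recovers the imposed shift. With real data the fit would be noisy, which is where a Bayesian treatment with posterior intervals for the counterfactual becomes useful.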
Acknowledgements
Comments from two anonymous reviewers considerably improved an earlier manuscript draft.
Cite this article
Moustakas, A. Assessing the predictive causality of individual based models using Bayesian inference intervention analysis: an application in epidemiology. Stoch Environ Res Risk Assess 32, 2861–2869 (2018). https://doi.org/10.1007/s00477-018-1520-6
DOI: https://doi.org/10.1007/s00477-018-1520-6