Scalable Bayesian inference for self-excitatory stochastic processes applied to big American gunfire data

Statistics and Computing

Abstract

The Hawkes process and its extensions effectively model self-excitatory phenomena including earthquakes, viral pandemics, financial transactions, neural spike trains and the spread of memes through social networks. The usefulness of these stochastic process models within a host of economic sectors and scientific disciplines is undercut by the processes’ computational burden: the complexity of likelihood evaluations grows quadratically in the number of observations for both the temporal and spatiotemporal Hawkes processes. We show that, with care, one may parallelize these calculations using both central and graphics processing unit implementations to achieve over 100-fold speedups over single-core processing. Using a simple adaptive Metropolis–Hastings scheme, we apply our high-performance computing framework to a Bayesian analysis of big gunshot data generated in Washington, D.C., between 2006 and 2019, thereby extending a past analysis of the same data from under 10,000 to over 85,000 observations. To encourage widespread use, we provide hpHawkes, an open-source R package, and discuss high-level implementation and program design for leveraging aspects of computational hardware that become necessary in a big data setting.
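To make the quadratic cost concrete, the following is a minimal sketch of a naive log-likelihood evaluation for a purely temporal Hawkes process. It is not the hpHawkes implementation: the exponential triggering kernel, the function name `hawkes_loglik`, and the parameter names `mu`, `theta`, `omega` and `horizon` are all assumptions chosen for illustration. The sketch assumes the intensity λ(t) = μ + θ Σ_{t_j < t} ω exp(−ω (t − t_j)) and event times sorted in increasing order.

```python
import math


def hawkes_loglik(times, mu, theta, omega, horizon):
    """Naive log-likelihood of a temporal Hawkes process on [0, horizon].

    Illustrative assumption (not taken from the article):
        lambda(t) = mu + theta * sum_{t_j < t} omega * exp(-omega * (t - t_j))
    `times` must be sorted in increasing order.
    """
    loglik = 0.0
    for i, t_i in enumerate(times):
        # Inner sum over every earlier event: N outer iterations times up to
        # N inner terms gives the O(N^2) cost.
        excitation = sum(
            omega * math.exp(-omega * (t_i - t_j)) for t_j in times[:i]
        )
        loglik += math.log(mu + theta * excitation)

    # Compensator: the integral of the intensity over [0, horizon].
    compensator = mu * horizon + theta * sum(
        1.0 - math.exp(-omega * (horizon - t_j)) for t_j in times
    )
    return loglik - compensator


if __name__ == "__main__":
    events = [0.5, 1.0, 2.5]
    print(hawkes_loglik(events, mu=2.0, theta=0.5, omega=1.0, horizon=3.0))
```

Each outer-loop term depends only on the data and the current parameter values, not on the other terms, so the inner sums can in principle be computed independently and then added together. That independence is the kind of structure a multi-core or GPU implementation can exploit.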

Acknowledgements

The research leading to these results has received funding through National Institutes of Health Grant U19 AI135995 and National Science Foundation Grant DMS1264153. AJH is supported by NIH Grant K25AI153816. We gratefully acknowledge support from Nvidia Corporation with the donation of parallel computing resources used for this research.

Corresponding author

Correspondence to Andrew J. Holbrook.

Cite this article

Holbrook, A.J., Loeffler, C.E., Flaxman, S.R. et al. Scalable Bayesian inference for self-excitatory stochastic processes applied to big American gunfire data. Stat Comput 31, 4 (2021). https://doi.org/10.1007/s11222-020-09980-4

  • DOI: https://doi.org/10.1007/s11222-020-09980-4
