Abstract
Stochastic models of point patterns in space and time are widely used to issue forecasts or assess risk, and often they affect societally relevant decisions. We adapt the concept of consistent scoring functions and proper scoring rules, which are statistically principled tools for the comparative evaluation of predictive performance, to the point process setting, and place both new and existing methodology in this framework. With reference to earthquake likelihood model testing, we demonstrate that extant techniques apply in much broader contexts than previously thought. In particular, the Poisson log-likelihood can be used for theoretically principled comparative forecast evaluation in terms of cell expectations. We illustrate the approach in a simulation study and in a comparative evaluation of operational earthquake forecasts for Italy.
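As a minimal sketch of what evaluation "in terms of cell expectations" can look like, assume the observation window is split into cells, each forecast reports an expected event count per cell, and the Poisson log-likelihood is used as a negatively oriented score (smaller is better). The function name `poisson_score` and the toy numbers are illustrative assumptions, not taken from the article or its reproduction material.

```python
import math


def poisson_score(cell_forecasts, cell_counts):
    """Negatively oriented Poisson log-likelihood score, summed over cells.

    cell_forecasts: predicted expected counts x_i > 0, one per cell.
    cell_counts:    observed event counts n_i, one per cell.
    The log(n_i!) term is dropped: it does not depend on the forecast,
    so it cancels whenever two forecasts are compared on the same data.
    """
    if len(cell_forecasts) != len(cell_counts):
        raise ValueError("forecasts and counts must cover the same cells")
    return sum(x - n * math.log(x) for x, n in zip(cell_forecasts, cell_counts))


# Hypothetical toy data: four cells, two competing forecasts.
counts = [0, 2, 1, 0]
forecast_a = [0.1, 1.5, 0.9, 0.2]      # concentrates mass where events occurred
forecast_b = [0.75, 0.75, 0.75, 0.75]  # spatially uniform forecast

score_a = poisson_score(forecast_a, counts)
score_b = poisson_score(forecast_b, counts)

# The forecast with the smaller score is preferred on this sample.
preferred = "A" if score_a < score_b else "B"
print(f"score A = {score_a:.3f}, score B = {score_b:.3f}, preferred: {preferred}")
```

In this sketch the score only needs the cell expectations, not a full Poisson model, which is why it can be read as a scoring function for expected counts rather than as a likelihood test of the Poisson assumption.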
References
Baddeley, A., Turner, R. (2005). spatstat: An R package for analyzing spatial point patterns. Journal of Statistical Software, 12, 1–42.
Baddeley, A., Turner, R., Møller, J., Hazelton, M. (2005). Residual analysis for spatial point processes. Journal of the Royal Statistical Society Series B: Statistical Methodology, 67, 617–666.
Baddeley, A., Rubak, E., Turner, R. (2015). Spatial point patterns: Methodology and applications with R. London: Chapman and Hall/CRC Press.
Bray, A., Schoenberg, F. P. (2013). Assessment of point process models for earthquake forecasting. Statistical Science, 28, 510–520.
Bray, A., Wong, K., Barr, C. D., Schoenberg, F. P. (2014). Voronoi residual analysis of spatial point process models with applications to California earthquake forecasts. Annals of Applied Statistics, 8, 2247–2267.
Brehmer, J. R. (2021). A construction principle for proper scoring rules. Proceedings of the American Mathematical Society Series B, 8, 297–301.
Brehmer, J. R. (2023). Reproduction material for “Comparative evaluation of point process forecasts”. Available at https://github.com/jbrehmer42/pp_evaluation.
Chen, J., Hawkes, A. G., Scalas, E., Trinh, M. (2018). Performance of information criteria for selection of Hawkes process models of financial data. Quantitative Finance, 18, 225–235.
Chiu, S. N., Stoyan, D., Kendall, W. S., Mecke, J. (2013). Stochastic geometry and its applications (3rd ed.). Chichester: Wiley.
Clements, R. A., Schoenberg, F. P., Schorlemmer, D. (2011). Residual analysis methods for space-time point processes with applications to earthquake forecast models in California. Annals of Applied Statistics, 5, 2549–2571.
Daley, D. J., Vere-Jones, D. (2003). An introduction to the theory of point processes (2nd ed., Vol. I). New York: Springer.
Daley, D. J., Vere-Jones, D. (2004). Scoring probability forecasts for point processes: The entropy score and information gain. Journal of Applied Probability, 41A, 297–312.
Dawid, A. P., Musio, M. (2014). Theory and applications of proper scoring rules. Metron, 72, 169–183.
Dawid, A. P., Sebastiani, P. (1999). Coherent dispersion criteria for optimal experimental design. Annals of Statistics, 27, 65–81.
Diebold, F. X., Mariano, R. S. (1995). Comparing predictive accuracy. Journal of Business & Economic Statistics, 13, 253–263.
Ehm, W., Gneiting, T., Jordan, A., Krüger, F. (2016). Of quantiles and expectiles: Consistent scoring functions, Choquet representations and forecast rankings. Journal of the Royal Statistical Society Series B: Statistical Methodology, 78, 505–562.
Falcone, G., Console, R., Murru, M. (2010). Short-term and long-term earthquake occurrence models for Italy: ETES, ERS and LTST. Annals of Geophysics, 53, 41–50.
Field, E. H. (2007). Overview of the working group for the development of regional earthquake likelihood models (RELM). Seismological Research Letters, 78, 7–16.
Flaxman, S., Chirico, M., Pereira, P., Loeffler, C. (2019). Scalable high-resolution forecasting of sparse spatiotemporal events with kernel methods: A winning solution to the NIJ “Real-Time Crime Forecasting Challenge”. Annals of Applied Statistics, 13, 2564–2585.
Frongillo, R., Kash, I. A. (2015). Vector-valued property elicitation. Journal of Machine Learning Research: Workshop and Conference Proceedings, 40, 1–18.
Frongillo, R., Kash, I. A. (2021). Elicitation complexity of statistical properties. Biometrika, 108, 857–879.
Gerstenberger, M. C., Wiemer, S., Jones, L. M., Reasenberg, P. A. (2005). Real-time forecasts of tomorrow’s earthquakes in California. Nature, 435, 328–331.
Gneiting, T. (2011). Making and evaluating point forecasts. Journal of the American Statistical Association, 106, 746–762.
Gneiting, T., Raftery, A. E. (2007). Strictly proper scoring rules, prediction, and estimation. Journal of the American Statistical Association, 102, 359–378.
Gneiting, T., Ranjan, R. (2011). Comparing density forecasts using threshold- and quantile-weighted scoring rules. Journal of Business & Economic Statistics, 29, 411–422.
Harte, D. (2015). Log-likelihood of earthquake models: evaluation of models and forecasts. Geophysical Journal International, 201, 711–723.
Harte, D., Vere-Jones, D. (2005). The entropy score and its uses in earthquake forecasting. Pure and Applied Geophysics, 162, 1229–1253.
Heinrich-Mertsching, C., Thorarinsdottir, T. L., Guttorp, P., Schneider, M. (2021). Validation of point process predictions with proper scoring rules. Preprint. arXiv:2110.11803.
Hendrickson, A. D., Buehler, R. J. (1971). Proper scores for probability forecasters. Annals of Mathematical Statistics, 42, 1916–1921.
Hering, A. S., Genton, M. G. (2011). Comparing spatial predictions. Technometrics, 53, 414–425.
Herrmann, M., Marzocchi, W. (2023). Maximizing the forecasting skill of an ensemble model. Geophysical Journal International. https://doi.org/10.1093/gji/ggad020
Holzmann, H., Klar, B. (2017). Focusing on regions of interest in forecast evaluation. Annals of Applied Statistics, 11, 2404–2431.
Illian, J., Penttinen, A., Stoyan, H., Stoyan, D. (2008). Statistical analysis and modelling of spatial point patterns. Chichester: Wiley.
Jordan, T. H., Chen, Y. T., Gasparini, P., Madariaga, R., Main, I., Marzocchi, W., Papadopoulos, G., Sobolev, G., Yamaoka, K., Zschau, J. (2011). Operational earthquake forecasting: State of knowledge and guidelines for utilization. Annals of Geophysics, 54, 4.
Kagan, Y. Y., Jackson, D. D. (1995). New seismic gap hypothesis: Five years after. Journal of Geophysical Research: Solid Earth, 100, 3943–3959.
Kagan, Y. Y., Knopoff, L. (1987). Statistical short-term earthquake prediction. Science, 236, 1563–1567.
Lavancier, F., Møller, J., Rubak, E. (2015). Determinantal point process models and statistical inference. Journal of the Royal Statistical Society Series B: Statistical Methodology, 77, 853–877.
Lehr, R. (1992). Sixteen s-squared over d-squared: A relation for crude sample size estimates. Statistics in Medicine, 11, 1099–1102.
Lerch, S., Thorarinsdottir, T. L., Ravazzolo, F., Gneiting, T. (2017). Forecaster’s dilemma: Extreme events and forecast evaluation. Statistical Science, 32, 106–127.
Lombardi, A. M., Marzocchi, W. (2010). The ETAS model for daily forecasting of Italian seismicity in the CSEP experiment. Annals of Geophysics, 53, 155–164.
Marzocchi, W., Zechar, J. D., Jordan, T. H. (2012). Bayesian forecast evaluation and ensemble earthquake forecasting. Bulletin of the Seismological Society of America, 102, 2574–2584.
Marzocchi, W., Lombardi, A. M., Casarotti, E. (2014). The establishment of an operational earthquake forecasting system in Italy. Seismological Research Letters, 85, 961–969.
Marzocchi, W., Taroni, M., Falcone, G. (2017). Earthquake forecasting during the complex Amatrice-Norcia seismic sequence. Science Advances, 3, e1701239.
Meyer, S., Held, L. (2014). Power-law models for infectious disease spread. Annals of Applied Statistics, 8, 1612–1639.
Mohler, G. O., Short, M. B., Brantingham, P. J., Schoenberg, F. P., Tita, G. E. (2011). Self-exciting point process modeling of crime. Journal of the American Statistical Association, 106, 100–108.
Nandan, S., Ouillon, G., Sornette, D., Wiemer, S. (2019). Forecasting the full distribution of earthquake numbers is fair, robust, and better. Seismological Research Letters, 90, 1650–1659.
Nolde, N., Ziegel, J. F. (2017). Elicitability and backtesting: Perspectives for banking regulation. Annals of Applied Statistics, 11, 1833–1874.
Ogata, Y. (1988). Statistical models for earthquake occurrences and residual analysis for point processes. Journal of the American Statistical Association, 83, 9–27.
Ogata, Y. (1998). Space-time point-process models for earthquake occurrences. Annals of the Institute of Statistical Mathematics, 50, 379–402.
Ogata, Y., Katsura, K., Falcone, G., Nanjo, K., Zhuang, J. (2013). Comprehensive and topical evaluations of earthquake forecasts in terms of number, time, space, and magnitude. Bulletin of the Seismological Society of America, 103, 1692–1708.
R Core Team. (2021). R: A language and environment for statistical computing. Vienna: R Foundation for Statistical Computing.
Reinhart, A. (2018). A review of self-exciting spatio-temporal point processes and their applications. Statistical Science, 33, 299–318.
Rhoades, D., Schorlemmer, D., Gerstenberger, M., Christophersen, A., Zechar, J. D., Imoto, M. (2011). Efficient testing of earthquake forecasting models. Acta Geophysica, 59, 728–747.
Rockafellar, R. T. (1970). Convex analysis. Princeton: Princeton University Press.
Savage, L. J. (1971). Elicitation of personal probabilities and expectations. Journal of the American Statistical Association, 66, 783–801.
Schoenberg, F. P. (2003). Multidimensional residual analysis of point process models for earthquake occurrences. Journal of the American Statistical Association, 98, 789–795.
Schoenberg, F. P., Hoffmann, M., Harrigan, R. J. (2019). A recursive point process model for infectious diseases. Annals of the Institute of Statistical Mathematics, 71, 1271–1287.
Schorlemmer, D., Gerstenberger, M. C., Wiemer, S., Jackson, D. (2007). Earthquake likelihood model testing. Seismological Research Letters, 78, 17–29.
Schorlemmer, D., Werner, M. J., Marzocchi, W., Jordan, T. H., Ogata, Y., Jackson, D. D., Mak, S., Rhoades, D. A., Gerstenberger, M. C., Hirata, N., Liukis, M., Maechling, P. J., Strader, A., Taroni, M., Wiemer, S., Zechar, J. D., Zhuang, J. (2018). The collaboratory for the study of earthquake predictability: Achievements and priorities. Seismological Research Letters, 89, 1305–1313.
Serafini, F., Naylor, M., Lindgren, F., Werner, M. J., Main, I. (2022). Ranking earthquake forecasts using proper scoring rules: Binary events in a low probability environment. Geophysical Journal International, 230, 1419–1440.
Taroni, M., Marzocchi, W., Schorlemmer, D., Werner, M. J., Wiemer, S., Zechar, J. D., Heiniger, L., Euchner, F. (2018). Prospective CSEP evaluation of 1-day, 3-month, and 5-yr earthquake forecasts for Italy. Seismological Research Letters, 89, 1251–1261.
Thorarinsdottir, T. L. (2013). Calibration diagnostic for point process models via the probability integral transform. Stat, 2, 150–158.
van Belle, G. (2008). Statistical rules of thumb. Wiley series in probability and statistics (2nd ed.). Chichester: Wiley.
Woessner, J., Christophersen, A., Zechar, J. D., Monelli, D. (2010). Building self-consistent, short-term earthquake probability (STEP) models: Improved strategies and calibration procedures. Annals of Geophysics, 53, 141–154.
Zechar, J. D., Gerstenberger, M. C., Rhoades, D. A. (2010a). Likelihood-based tests for evaluating space-rate-magnitude earthquake forecasts. Bulletin of the Seismological Society of America, 100, 1184–1195.
Zechar, J. D., Schorlemmer, D., Liukis, M., Yu, J., Euchner, F., Maechling, P. J., Jordan, T. H. (2010b). The collaboratory for the study of earthquake predictability perspective on computational earthquake science. Concurrency and Computation: Practice and Experience, 22, 1836–1847.
Zhuang, J., Mateu, J. (2019). A semiparametric spatiotemporal Hawkes-type point process model with periodic background for crime data. Journal of the Royal Statistical Society Series A: Statistics in Society, 182, 919–942.
Zhuang, J., Ogata, Y., Vere-Jones, D. (2002). Stochastic declustering of space-time earthquake occurrences. Journal of the American Statistical Association, 97, 369–380.
Acknowledgements
Jonas Brehmer and Tilmann Gneiting are grateful for support by the Klaus Tschira Foundation. Jonas Brehmer gratefully acknowledges support by the German Research Foundation (DFG) through Research Training Group RTG 1953. Part of this research came to fruition during mutual visits of Kirstin Strokorb at the University of Mannheim and Jonas Brehmer and Martin Schlather at Cardiff University during a workshop funded by the London Mathematical Society. We thank our hosting institutions for their generous hospitality. The authors would also like to thank Claudio Heinrich-Mertsching, Christopher Dörr and Alexander Jordan for helpful discussions, and Kristof Kraus for code review. Likewise, we are grateful to the anonymous reviewers for their comments that helped improve the clarity of this paper.
Supplementary Information
The Supplementary Material contains additional technical details and further simulation experiments. R code for reproduction is publicly available (Brehmer 2023). Data are available from the authors upon request.
About this article
Cite this article
Brehmer, J.R., Gneiting, T., Herrmann, M. et al. Comparative evaluation of point process forecasts. Ann Inst Stat Math 76, 47–71 (2024). https://doi.org/10.1007/s10463-023-00875-5