Calibration tests for count data

  • Original Paper
  • Journal: TEST

Abstract

Calibration, the statistical consistency of forecast distributions and observations, is a central requirement for probabilistic predictions. Calibration of continuous forecasts has been widely discussed, and significance tests are commonly used to detect whether a prediction model is miscalibrated. However, calibration tests for discrete forecasts are rare, especially for distributions with unlimited support. In this paper, we propose two types of calibration tests for count data: tests based on conditional exceedance probabilities and tests based on proper scoring rules. For the latter, three scoring rules are considered: the ranked probability score, the logarithmic score and the Dawid-Sebastiani score. Simulation studies show that all tests control the type I error rate well and have sufficient power under miscalibration. As an illustration, we apply the methodology to weekly data on meningococcal disease incidence in Germany, 2001–2006. The results show that the tests are powerful in detecting miscalibrated forecasts.
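To make the three scoring rules named in the abstract concrete, the following is a minimal sketch of how each can be evaluated for a single count observation. It assumes a Poisson predictive distribution purely for illustration; the abstract does not state which forecast distributions the paper uses. The function names and the truncation tolerance `tol` are hypothetical choices made here, and the definitions are the standard negatively oriented ones (lower is better). The sketch computes scores only and does not implement the calibration tests themselves.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Poisson probability P(X = k), computed on the log scale for stability."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def log_score(x: int, lam: float) -> float:
    """Logarithmic score: -log p(x) under the Poisson(lam) forecast."""
    return -(x * math.log(lam) - lam - math.lgamma(x + 1))


def dawid_sebastiani_score(x: int, lam: float) -> float:
    """Dawid-Sebastiani score: ((x - mu) / sigma)^2 + 2 * log(sigma).

    For a Poisson forecast, mu = sigma^2 = lam, so 2 * log(sigma) = log(lam).
    """
    return (x - lam) ** 2 / lam + math.log(lam)


def ranked_probability_score(x: int, lam: float, tol: float = 1e-12) -> float:
    """Ranked probability score: sum over k >= 0 of (F(k) - 1{x <= k})^2.

    The infinite sum is truncated once k has passed x and the remaining
    upper-tail mass 1 - F(k) is below tol (an assumed tolerance).
    """
    cdf, total, k = 0.0, 0.0, 0
    while True:
        cdf += poisson_pmf(k, lam)
        indicator = 1.0 if x <= k else 0.0
        total += (cdf - indicator) ** 2
        if k >= x and 1.0 - cdf < tol:
            return total
        k += 1


if __name__ == "__main__":
    observed_counts = [0, 2, 1, 4, 3]  # made-up data, not from the paper
    forecast_mean = 2.0                # assumed Poisson forecast mean
    for name, score in [
        ("logS", log_score),
        ("DSS", dawid_sebastiani_score),
        ("RPS", ranked_probability_score),
    ]:
        mean_score = sum(score(x, forecast_mean) for x in observed_counts) / len(observed_counts)
        print(f"mean {name}: {mean_score:.4f}")
```

Averaging a score over many forecast-observation pairs, as in the usage loop, gives the kind of mean score on which score-based assessments are typically built; how the paper turns such scores into a test statistic is not described in the abstract.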

Acknowledgments

We thank two referees for helpful comments and suggestions. Financial support by the Swiss National Science Foundation (SNF) is gratefully acknowledged.

Author information

Corresponding author

Correspondence to Wei Wei.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 97 KB)

About this article

Cite this article

Wei, W., Held, L. Calibration tests for count data. TEST 23, 787–805 (2014). https://doi.org/10.1007/s11749-014-0380-8

  • DOI: https://doi.org/10.1007/s11749-014-0380-8
