Abstract
In this paper we present lessons learned in the Evaluating Predictive Uncertainty Challenge. We describe the methods we used in regression challenges, including our winning method for the Outaouais data set. We then turn our attention to the more general problem of scoring in probabilistic machine learning challenges. It is widely accepted that scoring rules should be proper in the sense that the true generative distribution has the best expected score; we note that while this is useful, it does not guarantee finding the best methods for practical machine learning tasks. We point out some problems in local scoring rules such as the negative logarithm of predictive density (NLPD), and illustrate with examples that many of these problems can be avoided by a distance-sensitive rule such as the continuous ranked probability score (CRPS).
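The contrast the abstract draws between a local rule (NLPD) and a distance-sensitive rule (CRPS) can be made concrete for a Gaussian predictive distribution, for which the CRPS has a well-known closed form (due to Gneiting and Raftery). The sketch below is illustrative only and is not the authors' code; the function names are ours.

```python
import math

def nlpd(y, mu, sigma):
    """Negative log predictive density of y under N(mu, sigma^2).
    A local score: it depends only on the density at the observed y."""
    z = (y - mu) / sigma
    return 0.5 * z * z + math.log(sigma) + 0.5 * math.log(2.0 * math.pi)

def crps_gaussian(y, mu, sigma):
    """Continuous ranked probability score for a N(mu, sigma^2) forecast.
    Distance-sensitive: it penalises probability mass far from y,
    using the closed form  sigma * [z(2*Phi(z)-1) + 2*phi(z) - 1/sqrt(pi)]."""
    z = (y - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))
```

Both scores are negatively oriented (lower is better), and both grow as the observation moves away from the forecast mean; the difference the paper discusses is how NLPD, unlike CRPS, can blow up on near-zero-density observations.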
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
Cite this paper
Kohonen, J., Suomela, J. (2006). Lessons Learned in the Challenge: Making Predictions and Scoring Them. In: Quiñonero-Candela, J., Dagan, I., Magnini, B., d’Alché-Buc, F. (eds) Machine Learning Challenges. Evaluating Predictive Uncertainty, Visual Object Classification, and Recognising Textual Entailment. MLCW 2005. Lecture Notes in Computer Science, vol. 3944. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11736790_7
DOI: https://doi.org/10.1007/11736790_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-33427-9
Online ISBN: 978-3-540-33428-6
eBook Packages: Computer Science (R0)