Abstract
Prediction of quantiles at extreme tails is of interest in numerous applications. Extreme value modelling provides various competing predictors for this point prediction problem. A common way to assess a set of competing predictors is to evaluate their predictive performance in a given situation. However, because of the extreme nature of this inference problem, the predicted quantiles may not be seen in the historical records, particularly when the sample size is small. This makes it difficult to validate a prediction against its realization. In this article, we propose two non-parametric scoring approaches to assess extreme quantile prediction mechanisms. The proposed assessment methods are based on predicting a sequence of equally extreme quantiles on different parts of the data. We then use the quantile scoring function to evaluate the competing predictors. The performance of the proposed scoring methods is compared with that of the conventional scoring method, and their superiority is demonstrated in a simulation study. The methods are then applied to cyber Netflow data from Los Alamos National Laboratory and to daily precipitation data at a station in California, available from the Global Historical Climatology Network.
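For readers unfamiliar with the quantile scoring function mentioned in the abstract, the following minimal sketch shows its standard form (often called the pinball loss), S(q, y) = (1{y ≤ q} − τ)(q − y), where q is the predicted τ-quantile and y is the realized observation. It illustrates the generic scoring function only and does not reproduce the two scoring approaches proposed in the article; the function names `quantile_score` and `mean_quantile_score` and the example numbers are assumptions made for illustration.

```python
def quantile_score(prediction, observation, tau):
    """Standard quantile (pinball) score for a predicted tau-quantile.

    Lower is better. Names and signature are illustrative assumptions.
    """
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    # Indicator that the observation did not exceed the predicted quantile.
    indicator = 1.0 if observation <= prediction else 0.0
    return (indicator - tau) * (prediction - observation)


def mean_quantile_score(predictions, observations, tau):
    """Average quantile score over paired predictions and observations."""
    if len(predictions) != len(observations):
        raise ValueError("predictions and observations must have equal length")
    if not predictions:
        raise ValueError("at least one prediction is required")
    total = sum(
        quantile_score(q, y, tau) for q, y in zip(predictions, observations)
    )
    return total / len(predictions)


# Hypothetical example: the same 0.99-quantile prediction scored against
# one observation above it and one below it.
score_exceeded = quantile_score(10.0, 12.0, 0.99)      # observation above q
score_not_exceeded = quantile_score(10.0, 8.0, 0.99)   # observation below q
average = mean_quantile_score([10.0, 10.0], [12.0, 8.0], 0.99)
```

At τ = 0.99 an exceedance of the predicted quantile is penalized 99 times more heavily per unit of error than a shortfall, yet with small samples such exceedances may rarely or never occur. This is the validation difficulty the abstract describes.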
Acknowledgements
The research of Kaushik Jana is supported by the Alan Turing Institute-Lloyd’s Register Foundation Programme on Data-Centric Engineering. The authors would also like to thank Dr Tobias Fissler of Vienna University of Economics and Business for many helpful discussions.
Ethics declarations
Conflict of interest
The authors declare that there is no conflict of interest.
About this article
Cite this article
Gandy, A., Jana, K. & Veraart, A.E.D. Scoring predictions at extreme quantiles. AStA Adv Stat Anal 106, 527–544 (2022). https://doi.org/10.1007/s10182-021-00421-9
DOI: https://doi.org/10.1007/s10182-021-00421-9