Abstract
The generalized Pareto distribution (GPD) is a family of continuous distributions used to model the tail of a distribution above a threshold u. Despite the advantages of the GPD representation, its shape and scale parameters do not correspond to the expected value, which complicates the interpretation of regression models specified using the GPD. This study proposes a linear regression model in which the response variable follows a GPD, using a new parametrization indexed by mean and precision parameters. The main advantage of the new parametrization is the straightforward interpretation of the regression coefficients in terms of the expectation of the positive real-valued response variable, as is usual in generalized linear models. Furthermore, we propose a model for extreme values in which the GPD parameters (mean and precision) are defined through a dynamic linear regression model. The novelty of the study lies in the time variation of the mean and precision parameters of the resulting distribution. Parameter estimation for these new models is performed under the Bayesian paradigm. Simulations are conducted to assess the performance of the proposed models. Finally, the models are applied to environmental (temperature) datasets, illustrating their capabilities in challenging cases in extreme value theory.
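To illustrate why the usual shape and scale parameters are awkward for regression on the mean, the sketch below uses the standard GPD with scale sigma and shape xi, whose expected exceedance is sigma / (1 - xi) when xi < 1. It is a minimal illustration only: the abstract does not state the exact mean and precision parametrization, so the code does not attempt to reproduce it, and the helper names `gpd_quantile`, `gpd_mean`, and `scale_from_mean` are hypothetical.

```python
import math
import random


def gpd_quantile(p, sigma, xi):
    """Quantile function of the standard GPD for exceedances (location 0)."""
    if xi == 0.0:
        # The limiting case xi -> 0 is the exponential distribution.
        return -sigma * math.log1p(-p)
    return sigma / xi * ((1.0 - p) ** (-xi) - 1.0)


def gpd_mean(sigma, xi):
    """Expected exceedance; it depends on BOTH scale and shape."""
    if xi >= 1.0:
        raise ValueError("the GPD mean is finite only for xi < 1")
    return sigma / (1.0 - xi)


def scale_from_mean(mu, xi):
    """Assumed illustration: recover the scale that yields mean mu at shape xi."""
    if xi >= 1.0:
        raise ValueError("the GPD mean is finite only for xi < 1")
    return mu * (1.0 - xi)


# Check the mean formula by inverse-CDF sampling with a fixed seed.
rng = random.Random(0)
sigma, xi = 2.0, 0.2
draws = [gpd_quantile(rng.random(), sigma, xi) for _ in range(200_000)]
sample_mean = sum(draws) / len(draws)

print(f"theoretical mean: {gpd_mean(sigma, xi):.3f}")
print(f"simulated mean:   {sample_mean:.3f}")
```

Because the mean mixes scale and shape, a regression coefficient attached to either parameter alone does not translate directly into an effect on the expected exceedance, which is the interpretive difficulty the abstract describes.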
Cite this article
Bourguignon, M., do Nascimento, F.F. Regression models for exceedance data: a new approach. Stat Methods Appl 30, 157–173 (2021). https://doi.org/10.1007/s10260-020-00518-6
DOI: https://doi.org/10.1007/s10260-020-00518-6