Abstract
Large observed datasets are often non-stationary and/or depend on covariates, especially in the case of extreme hydrometeorological variables. This makes estimation with classical hydrological frequency analysis difficult. A number of non-stationary models have been developed that use linear or quadratic polynomial functions, or B-spline functions, to estimate the relationship between parameters and covariates. In this article, we propose regularized generalized extreme value models with B-splines (GEV-B-splines models) in a Bayesian framework to estimate quantiles. Regularization is based on a penalty and aims to favour parsimonious models, especially when the parameter space is of large dimension. Penalties are introduced in a Bayesian framework and the corresponding priors are detailed. Five penalties are considered, and the corresponding priors developed, for comparison purposes: the least absolute shrinkage and selection operator (Lasso), Ridge, and three smoothly clipped absolute deviation (SCAD) methods (SCAD1, SCAD2 and SCAD3). Markov chain Monte Carlo (MCMC) algorithms have been developed for each model to estimate quantiles and their posterior distributions. These approaches are tested and illustrated using simulated data with different sample sizes. A first simulation was carried out on polynomial B-spline functions in order to choose the most efficient model in terms of the relative mean bias (RMB) and relative mean error (RME) criteria. A second simulation was performed with the SCAD1 penalty for sinusoidal dependence to illustrate the flexibility of the proposed approach. Results show clearly that the regularized approaches lead to a significant reduction of the bias and the mean square error, especially for small sample sizes (n < 100). A case study models annual peak flows at the Fort Kent catchment with total annual precipitation as the covariate. The conditional quantile curves are given for the regularized and the maximum likelihood methods.
Acknowledgments
The authors are grateful to the Associate Editor and two anonymous reviewers for their comments, and to the Natural Sciences and Engineering Research Council of Canada (NSERC) for financial support. We also thank the Environment Canada Data Access Integration (DAI) portal for providing the observed daily precipitation data.
Appendices
Appendix 1: Bayesian Lasso GEV-B-splines
The expression of the Lasso penalty is:
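In its standard form, and with notation assumed here rather than taken from the article ($\beta_j$ for the $p$ B-spline coefficients, $\lambda \ge 0$ for the tuning parameter), the Lasso penalty and the prior usually associated with it in a Bayesian setting can be sketched as:

```latex
P_{\text{Lasso}}(\beta) = \lambda \sum_{j=1}^{p} |\beta_j|,
\qquad
\pi(\beta \mid \lambda) \propto \exp\Big(-\lambda \sum_{j=1}^{p} |\beta_j|\Big)
```

The prior on the right is a product of independent Laplace (double-exponential) densities, so the posterior mode under a flat prior on the remaining parameters coincides with the penalized likelihood estimate. The exact parameterization used for the GEV-Lasso model may differ.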
1.1 Proposed model (GEV-Lasso)
Appendix 2: Ridge GEV-B-splines model
2.1 Penalty
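In its standard form, and again with assumed notation ($\beta_j$ for the $p$ B-spline coefficients, $\lambda \ge 0$ for the tuning parameter), the Ridge penalty and its usual Bayesian counterpart can be sketched as:

```latex
P_{\text{Ridge}}(\beta) = \lambda \sum_{j=1}^{p} \beta_j^{2},
\qquad
\pi(\beta \mid \lambda) \propto \exp\Big(-\lambda \sum_{j=1}^{p} \beta_j^{2}\Big)
```

Under this sketch each coefficient has an independent Gaussian prior with mean 0 and variance $1/(2\lambda)$, which shrinks coefficients towards zero without setting any of them exactly to zero. The exact parameterization used for the GEV-Ridge model may differ.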
2.2 Proposed model (GEV-Ridge)
Appendix 3: SCAD GEV-B-splines
3.1 SCAD1
3.1.1 Penalty
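As a point of reference, the sketch below evaluates the SCAD penalty in its widely used three-piece form: linear near zero (like the Lasso), quadratic in between, and constant beyond a·λ. The function names, the argument names and the default a = 3.7 are assumptions made for illustration. Whether SCAD1 matches this form exactly, and how SCAD2 and SCAD3 depart from it, is not established by this sketch.

```python
import math


def scad_penalty(theta: float, lam: float, a: float = 3.7) -> float:
    """Three-piece SCAD penalty for a single coefficient (illustrative sketch).

    theta : coefficient value (hypothetical name)
    lam   : tuning parameter, must be non-negative
    a     : shape parameter, must exceed 2 (3.7 is a conventional default)
    """
    if lam < 0:
        raise ValueError("lam must be non-negative")
    if a <= 2:
        raise ValueError("a must be greater than 2")
    t = abs(theta)
    if t <= lam:
        # Small coefficients: same linear penalty as the Lasso.
        return lam * t
    if t <= a * lam:
        # Intermediate coefficients: quadratic transition zone.
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    # Large coefficients: constant penalty, hence no extra shrinkage.
    return lam * lam * (a + 1) / 2


def lasso_penalty(theta: float, lam: float) -> float:
    """Lasso penalty for a single coefficient, for comparison."""
    return lam * abs(theta)


if __name__ == "__main__":
    lam = 1.0
    for theta in (0.0, 0.5, 1.0, 2.0, 3.7, 10.0):
        print(theta, lasso_penalty(theta, lam), scad_penalty(theta, lam))
```

The comparison illustrates the design motivation: the Lasso penalty keeps growing with the coefficient, whereas the SCAD penalty levels off, so large coefficients are penalized by a fixed amount rather than shrunk further.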
3.1.2 Proposed model (GEV-SCAD1)
3.2 SCAD2
3.2.1 Penalty
3.2.2 Proposed model (GEV-SCAD2)
3.3 SCAD3
3.3.1 Penalty
3.3.2 Proposed model (GEV-SCAD3)
Cite this article
Yousfi, N., Adlouni, S.E. Regularized Bayesian estimation for GEV-B-splines model. Stoch Environ Res Risk Assess 31, 535–550 (2017). https://doi.org/10.1007/s00477-016-1295-6
DOI: https://doi.org/10.1007/s00477-016-1295-6