Abstract
Droughts, like many climatic and environmental phenomena, are events with a random duration. In the monitoring and risk management of such phenomena, it is important to develop measures of the risk that an ongoing event ends. This work develops a risk measure, conditional on the current state of the event, that can be easily updated in real time. The measure is based on the hazard function of the duration of an event, which is modeled as a parametric function of covariates describing the current state of the process. Describing that state often requires (time-dependent) internal covariates, in which case maximum likelihood methods cannot be used to estimate the model. Therefore, an approach based on partial likelihood functions, which permit the inclusion of both external and internal covariates, is suggested. This approach is very general, but it has the drawback of requiring some programming to be implemented. However, it is proved that for durations with a geometric distribution, an equivalent and easily implemented approach based on generalized linear models can be used to estimate the hazard function. This methodology is applied to develop a risk measure in drought analysis. The approach is exemplified using the drought series from a Spanish location (Huesca) and internal covariates derived from the rainfall series. The whole modeling process is described in detail, including the covariate selection procedure and some new validation tools.
References
Brier GW (1950) Verification of forecasts expressed in terms of probability. Mon Weather Rev 78:1–3
Burke EJ, Brown SJ, Christidis N (2006) Modelling the recent evolution of global drought and projections for the 21st century with the Hadley Centre climate model. J Hydrometeorol 7:1113–1125
Cancelliere A, Di Mauro G, Bonaccorso B, Rossi G (2007) Drought forecasting using the Standardized Precipitation Index. Water Resour Manage 21(5):801–819
Cebrián AC, Abaurrea J (2006) Drought analysis based on a marked cluster Poisson model. J Hydrometeorol 7:713–723
Collett D (2003) Modelling binary data, 2nd edn. Chapman and Hall
Cox DR (1975) Partial likelihood. Biometrika 62:269–276
Efron B, Tibshirani R (1997) Improvements on cross-validation: the 0.632+ bootstrap method. J Am Stat Assoc 92:548–560
Gerds TA, Schumacher M (2007) Efron-type measures of prediction error for survival analysis. Biometrics 63:1283–1287
Gerds T, Cai T, Schumacher M (2008) The performance of risk prediction models. Biometr J 50:457–479
Gibbs WJ, Maher JV (1967) Rainfall deciles as drought indicators. Bureau of Meteorology Bulletin, p 48
Hwang Y, Carbone GJ (2009) Ensemble forecasts of drought indices using a conditional residual resampling technique. J Appl Meteorol Clim 48(7):1289–1301
Jolliffe IT, Stephenson DB (2008) Proper scores for probability forecasts can never be equitable. Mon Weather Rev 136:1505–1510
Kalbfleisch JD, Prentice RL (2002) The statistical analysis of failure time data, 2nd edn. Wiley
Karl T, Quinlan F, Ezell DS (1987) Drought termination and amelioration: its climatological probability. J Clim Appl Meteorol 26(9):1198–1209
Kendall DR, Dracup JA (1992) On the generation of drought events using an alternating renewal-reward model. Stoch Hydrol Hydraul 6(1):55–68
Kharin VV, Zwiers FW (2003) On the ROC score of probability forecasts. J Clim 16:4145–4150
Kim T, Valdes JB (2003) Nonlinear model for drought forecasting based on a conjunction of wavelet transforms and neural networks. J Hydrol Eng 8(6):319–328
Mason SJ (2004) On using "climatology" as a reference strategy in the Brier and ranked probability skill scores. Mon Weather Rev 132:1891–1895
McCullagh P, Nelder J (1989) Generalized linear models, 2nd edn. Chapman and Hall
Mishra AK, Desai VR (2005) Drought forecasting using stochastic models. Stoch Environ Res Risk Assess 19:326–339
Mishra AK, Desai VR (2010) A review of drought concepts. J Hydrol 391:202–216
Mishra AK, Desai VR, Singh VP (2006) Drought forecasting using a hybrid stochastic and neural network model. J Hydrol Eng 12(6):626–638
Mishra AK, Singh VP, Desai VR (2009) Drought characterization: a probabilistic approach. Stoch Environ Res Risk Assess 23:41–55
Moreira EE, Coelho CA, Paulo AA, Pereira LS, Mexia JT (2008) SPI-based drought category prediction using loglinear models. J Hydrol 354:116–130
Morid S, Smakhtin V, Bagherzadeh K (2007) Drought forecasting using artificial neural networks and time series of drought indices. Int J Climatol 27(15):2103–2111
Nadarajah S (2007) The bivariate gamma exponential distribution with application to drought data. J Appl Math Comput 24(1):221–230
Nicholls N, Coughlan MJ, Monnik K (2005) The challenge of climate prediction in mitigating drought impacts. In: Wilhite DA (ed) Drought and water crises: science, technology, and management issues. Taylor & Francis, pp 33–55
Paulo AA, Pereira LS (2007) Prediction of SPI drought class transitions using Markov chains. Water Resour Manage 21:1813–1827
Shiau JT (2006) Fitting drought duration and severity with two-dimensional copulas. Water Resour Manage 20:795–815
Steinemann AC (2006) Using climate forecasts for drought management. J Appl Meteorol Clim 45:1353–1361
Swets JA (1973) The relative operating characteristic in psychology. Science 182:990–1000
Yevjevich V (1967) An objective approach to definitions and investigations of continental hydrologic drought. Hydrology Paper 23. Colorado State University, Fort Collins
Yoo C, Kim D, Kim TW, Hwang KN (2008) Quantification of drought using a rectangular pulses Poisson process model. J Hydrol 355:34–48
Zelenhasic E, Salvai A (1987) A method of streamflow drought analysis. Water Resour Res 23(1):156–168
Acknowledgments
The authors thank AEMET (Spanish Meteorological Agency) for the precipitation data, and Ministerio de Educación y Ciencia (Spanish Department of Education and Science) for the financial support through the project CGL2006.02485/CLI.
Appendix
1.1 Partial likelihood functions
1.1.1 General expression of a partial likelihood function
Let H(t) represent the complete history of the process of a failure time (duration) variable T, so that it records all failure information as well as all covariate information up to time t. Let us assume that H(t) is a Markov process; that is, its future evolution depends on the past trajectory only through its current value. Let \(d_1, d_2, \ldots, d_n\) denote the failure times (durations) of the n individuals (events) in the sample. The values of the covariate vector for each individual i and each time t lower than \(d_i\) are known and denoted by \({\mathbf x}_i(t)\). It is also assumed that, given H(t), the failure mechanisms act independently over [t, t + dt). Under these assumptions, the likelihood, apart from differential elements, is
where \(h[t; \theta({\mathbf x}_j(t), \beta)]\) is the hazard of individual j at time t and V(t) is the set of individuals with failure times greater than t. This expression can be found in Kalbfleisch and Prentice (2002), but the product integral must still be evaluated, and its calculation depends on the type of distribution of the variable. When a discrete distribution is used to model a failure time T, we assume that the positive integers r are the only times at which an element can fail. In addition, the hazard function is taken to be zero after the last failure time \(d_n\). Then, apart from differential elements, it follows that
where the product index r ranges over the positive integers smaller than the duration of event i. Substituting the product integral in Eq. 5, the partial likelihood function for discrete duration variables is obtained,
The corresponding partial loglikelihood is,
The derivation for continuous distributions is slightly more involved. Considering a partition \(0=\tau_0 < \cdots < \tau_m <\infty\) such that \(\lim_{m \to \infty} \Delta \tau_i = 0,\) it follows that
and interchanging the sums and applying the integral definition,
Finally, substituting this product in Eq. 5, the following partial likelihood is obtained,
The corresponding partial loglikelihood is,
The PL functions have the same asymptotic properties as likelihood functions, and most inference methods remain valid for the PL functions resulting from models with internal covariates; see Kalbfleisch and Prentice (2002).
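As an illustration of the discrete case, the following is a minimal sketch, assuming the partial loglikelihood takes the form \(\sum_i \{\log h_i(d_i) + \sum_{r<d_i} \log[1-h_i(r)]\}\) that the description above suggests (failure at \(d_i\), survival at every positive integer \(r<d_i\)). The function name, the nested-list data layout and the `hazard` callable are assumptions made for this example only.

```python
import math
from typing import Callable, Sequence

# Hypothetical type: maps a covariate vector x_i(r) to a hazard value in (0, 1).
Hazard = Callable[[Sequence[float]], float]


def discrete_partial_loglik(
    covariates: Sequence[Sequence[Sequence[float]]],
    hazard: Hazard,
) -> float:
    """Partial loglikelihood for discrete durations (illustrative sketch).

    covariates[i][r - 1] holds the covariate vector x_i(r) of event i at
    time r = 1, ..., d_i, so the duration d_i is len(covariates[i]).
    """
    total = 0.0
    for path in covariates:
        d = len(path)
        # Survival terms: the event did not end at r = 1, ..., d_i - 1.
        for r in range(d - 1):
            total += math.log(1.0 - hazard(path[r]))
        # Failure term: the event ended at r = d_i.
        total += math.log(hazard(path[d - 1]))
    return total
```

With a constant hazard of 0.25 and two events lasting 1 and 3 time units, the function returns \(2\log 0.25 + 2\log 0.75\), whatever the covariate values are.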
1.1.2 Partial likelihood functions of the geometric and exponential distributions
Partial likelihood of a geometric distribution. The hazard function of a geometric distribution is equal to the parameter p of the distribution. Since p must be in the interval [0, 1], a logistic link is used. Hence, the hazard function of an element with covariate vector \({\mathbf x}(\cdot)\) is \(h[t; p({\mathbf x}(t),\beta)] = p({\mathbf x}(t),\beta) = \exp\{\nu({\mathbf x}(t),\beta)\}/[1+\exp\{\nu({\mathbf x}(t),\beta)\}],\)
where \(\nu({\mathbf x}(t),\beta)=\beta_0+\beta_1 x_1(t)+ \cdots +\beta_K x_K(t)\) is the linear predictor. Substituting this hazard in Eq. 6, the resulting partial likelihood function is,
Since
and
the partial loglikelihood function can be expressed as,
Partial likelihood of an exponential distribution. The hazard function of an exponential distribution is equal to the parameter λ of the distribution. Since this parameter must be positive, a log link is used. Hence, the hazard function of an element with covariate vector \({\mathbf x}(\cdot)\) is \(h[t; \lambda({\mathbf x}(t),\beta)] = \lambda({\mathbf x}(t),\beta) = \exp\{\nu({\mathbf x}(t),\beta)\}.\)
Due to measurement limitations, covariates are usually recorded at discrete times. If the value of the covariates is considered constant between t and t + 1, the hazard function is also constant during that interval. Consequently,
where r is an index ranging over the positive integers smaller than the duration \(d_i\). Substituting the expression of this integral in Eq. 9, the resulting partial loglikelihood function is,
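The exponential case can be sketched in code as follows. This is an illustrative sketch rather than the authors' implementation: it assumes the partial loglikelihood is the log hazard at the failure time minus the integrated hazard, with the integral replaced by the sum of \(\exp\{\nu({\mathbf x}_i(r),\beta)\}\) over the positive integers \(r<d_i\), as the index description above indicates. The function names and data layout are hypothetical; if covariates are indexed from time 0, the summation range should be shifted accordingly.

```python
import math
from typing import Sequence


def linear_predictor(x: Sequence[float], beta: Sequence[float]) -> float:
    """nu(x, beta) = beta_0 + beta_1 x_1 + ... + beta_K x_K."""
    return beta[0] + sum(b * v for b, v in zip(beta[1:], x))


def exponential_partial_loglik(
    covariates: Sequence[Sequence[Sequence[float]]],
    beta: Sequence[float],
) -> float:
    """Partial loglikelihood with a log link and piecewise-constant hazard.

    covariates[i][r - 1] holds x_i(r) for r = 1, ..., d_i (assumed layout).
    """
    total = 0.0
    for path in covariates:
        d = len(path)
        # Log hazard at the failure time d_i (log link: log h = nu).
        total += linear_predictor(path[d - 1], beta)
        # Minus the integrated hazard: constant pieces exp(nu) for r < d_i.
        total -= sum(
            math.exp(linear_predictor(path[r], beta)) for r in range(d - 1)
        )
    return total
```

For a single event of duration 3 with a constant hazard of 0.5 (that is, \(\beta_0=\log 0.5\) and all other coefficients zero), the sketch gives \(\log 0.5 - 2 \times 0.5\).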
1.2 Equivalence of the approaches to the modeling of the hazard function
The objective of this Appendix is to show the equivalence between the modeling process of the hazard function of durations when a geometric distribution is assumed (Sect. 1) and the modeling of the probability of ending using a GLM with a Bernoulli error (Sect. 2). Specifically, we will show that these approaches are equivalent in the sense that both lead to the same final model (the same covariates and the same fitted parameters).
For a fixed set of covariates, the parameters to be estimated in both approaches are the same, since the hazard function of the geometric duration of an event is equal to the probability of the event ending, and because a logistic link is used in both cases to relate the response and the linear predictor. The covariate selection is also based on the same likelihood ratio tests. Thus, to show the equivalence it is enough to prove that the GLM likelihood function is equal to the PL function derived for the geometric durations in the hazard function approach.
The PL function for geometric durations has already been derived, see Eq. 10, and the calculation of the likelihood of the considered GLM is rather simple. Let us assume a sample of n events with duration \(d_i\) and covariate vector \({\mathbf x}_i(r)\) observed for positive integers \(r \le d_i\). Given that the conditional variables \(D_i(t)| {\mathbf x}_i(t)\) follow a Bernoulli distribution with parameter \(p\left({{\mathbf x}_i(t),\beta}\right),\) the likelihood function of the sample is,
and the equivalence of the modeling processes follows straightforwardly.
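The equivalence can also be checked numerically. The sketch below is illustrative only: it assumes each event is expanded into one Bernoulli observation per time unit, with response 1 at \(r=d_i\) and 0 for \(r<d_i\), and compares the resulting Bernoulli loglikelihood with the geometric partial loglikelihood under a logistic link. The toy data, the coefficient values and all function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def logistic(nu: float) -> float:
    """Inverse logit link: p = exp(nu) / (1 + exp(nu))."""
    return 1.0 / (1.0 + math.exp(-nu))


def linear_predictor(x: Sequence[float], beta: Sequence[float]) -> float:
    return beta[0] + sum(b * v for b, v in zip(beta[1:], x))


def geometric_partial_loglik(covariates, beta) -> float:
    """Log hazard at d_i plus log survival terms for r < d_i."""
    total = 0.0
    for path in covariates:
        d = len(path)
        total += sum(
            math.log(1.0 - logistic(linear_predictor(x, beta)))
            for x in path[: d - 1]
        )
        total += math.log(logistic(linear_predictor(path[d - 1], beta)))
    return total


def expand_to_bernoulli(covariates) -> List[Tuple[int, Sequence[float]]]:
    """One row (D_i(r), x_i(r)) per event and time unit; D_i(r) = 1 iff r = d_i."""
    rows = []
    for path in covariates:
        d = len(path)
        for r, x in enumerate(path, start=1):
            rows.append((1 if r == d else 0, x))
    return rows


def bernoulli_loglik(rows, beta) -> float:
    """Loglikelihood of independent Bernoulli responses with a logit link."""
    total = 0.0
    for y, x in rows:
        p = logistic(linear_predictor(x, beta))
        total += math.log(p) if y == 1 else math.log(1.0 - p)
    return total


# Hypothetical toy data: two events lasting 3 and 1 time units, one covariate.
events = [[[0.2], [0.5], [1.1]], [[0.9]]]
beta = [-1.0, 0.8]
pl = geometric_partial_loglik(events, beta)
glm = bernoulli_loglik(expand_to_bernoulli(events), beta)
```

Here `pl` and `glm` agree up to floating-point rounding for any choice of `beta`, so both objective functions have the same maximizer. In practice the expanded table could be passed to any standard logistic regression routine.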
Cite this article
Cebrián, A.C., Abaurrea, J. Risk measures for events with a stochastic duration: an application to drought analysis. Stoch Environ Res Risk Assess 26, 971–981 (2012). https://doi.org/10.1007/s00477-011-0521-5
DOI: https://doi.org/10.1007/s00477-011-0521-5