Risk measures for events with a stochastic duration: an application to drought analysis

  • Original Paper
Stochastic Environmental Research and Risk Assessment

Abstract

Droughts, like many climatic and environmental phenomena, are events with a random duration. In the monitoring and risk management of this type of phenomenon, it is important to develop measures of the risk that an ongoing event ends. This work develops a risk measure, conditional on the current state of the event, that can be easily updated in real time. The measure is based on the hazard function of the duration of an event, which is modeled as a parametric function of covariates describing the current state of the process. The use of (time-dependent) internal covariates is often required to describe that state, and in that case maximum likelihood methods cannot be used to estimate the model. Therefore, an approach based on partial likelihood functions, which permit the inclusion of both external and internal covariates, is suggested. This approach is very general, but it has the drawback of requiring some programming to be implemented. However, it is proved that, for durations with a geometric distribution, an equivalent and easily implemented approach based on generalized linear models can be used to estimate the hazard function. This methodology is applied to develop a risk measure in drought analysis. The approach is exemplified using the drought series from a Spanish location (Huesca) and internal covariates derived from the rainfall series. The whole modeling process is thoroughly described, including the covariate selection procedure and some new validation tools.

References

  • Brier GW (1950) Verification of forecasts expressed in terms of probability. Mon Weather Rev 78:1–3

  • Burke EJ, Brown SJ, Christidis N (2006) Modelling the recent evolution of global drought and projections for the 21st century with the Hadley Centre climate model. J Hydrometeorol 7:1113–1125

  • Cancelliere A, Di Mauro G, Bonaccorso B, Rossi G (2007) Drought forecasting using the Standardized Precipitation Index. Water Resour Manage 21(5):801–819

  • Cebrián AC, Abaurrea J (2006) Drought analysis based on a marked cluster Poisson model. J Hydrometeorol 7:713–723

  • Collett D (2003) Modelling binary data, 2nd edn. Chapman and Hall

  • Cox DR (1975) Partial likelihood. Biometrika 62:269–276

  • Efron B, Tibshirani R (1997) Improvements on cross-validation: the 0.632+ bootstrap method. J Am Stat Assoc 92:548–560

  • Gerds TA, Schumacher M (2007) Efron-type measures of prediction error for survival analysis. Biometrics 63:1283–1287

  • Gerds T, Cai T, Schumacher M (2008) The performance of risk prediction models. Biometr J 50:457–479

  • Gibbs WJ, Maher JV (1967) Rainfall deciles as drought indicators. Bureau of Meteorology Bulletin, p 48

  • Hwang Y, Carbone GJ (2009) Ensemble forecasts of drought indices using a conditional residual resampling technique. J Appl Meteorol Clim 48(7):1289–1301

  • Jolliffe IT, Stephenson DB (2008) Proper scores for probability forecasts can never be equitable. Mon Weather Rev 136:1505–1510

  • Kalbfleisch JD, Prentice RL (2002) The statistical analysis of failure time data, 2nd edn. Wiley

  • Karl T, Quinlan F, Ezell DS (1987) Drought termination and amelioration: its climatological probability. J Clim Appl Meteorol 26(9):1198–1209

  • Kendall DR, Dracup JA (1992) On the generation of drought events using an alternating renewal-reward model. Stoch Hydrol Hydraul 6(1):55–68

  • Kharin VV, Zwiers FW (2003) On the ROC score of probability forecasts. J Clim 16:4145–4150

  • Kim T, Valdes JB (2003) Nonlinear model for drought forecasting based on a conjunction of wavelet transforms and neural networks. J Hydrol Eng 8(6):319–328

  • Mason SJ (2004) On using "climatology" as a reference strategy in the Brier and ranked probability skill scores. Mon Weather Rev 132:1891–1895

  • McCullagh P, Nelder J (1989) Generalized linear models, 2nd edn. Chapman and Hall

  • Mishra AK, Desai VR (2005) Drought forecasting using stochastic models. Stoch Environ Res Risk Assess 19:326–339

  • Mishra AK, Desai VR (2010) A review of drought concepts. J Hydrol 391:202–216

  • Mishra AK, Desai VR, Singh VP (2006) Drought forecasting using a hybrid stochastic and neural network model. J Hydrol Eng 12(6):626–638

  • Mishra AK, Singh VP, Desai VR (2009) Drought characterization: a probabilistic approach. Stoch Environ Res Risk Assess 23:41–55

  • Moreira EE, Coelho CA, Paulo AA, Pereira LS, Mexia JT (2008) SPI-based drought category prediction using loglinear models. J Hydrol 354:116–130

  • Morid S, Smakhtin V, Bagherzadeh K (2007) Drought forecasting using artificial neural networks and time series of drought indices. Int J Climatol 27(15):2103–2111

  • Nadarajah S (2007) The bivariate gamma exponential distribution with application to drought data. J Appl Math Comput 24(1):221–230

  • Nichols N, Coughlan MJ, Monnik K (2005) The challenge of climate prediction in mitigating drought impacts. In: Wilhite DA (eds) Drought and water crisis, science technology and management issues. Taylor & Francis, pp 33–55

  • Paulo AA, Pereira LS (2007) Prediction of SPI drought class transitions using Markov chains. Water Resour Manage 21:1813–1827

  • Shiau JT (2006) Fitting drought duration and severity with two-dimensional copulas. Water Resour Manage 20:795–815

  • Steinemann AC (2006) Using climate forecasts for drought management. J Appl Meteorol Clim 45:1353–1361

  • Swets JA (1973) The relative operating characteristic in psychology. Science 182:990–1000

  • Yevjevich V (1967) An objective approach to definitions and investigations of continental hydrologic drought. Hydrology Paper 23. Colorado State University, Fort Collins

  • Yoo C, Kim D, Kim TW, Hwang KN (2008) Quantification of drought using a rectangular pulses Poisson process model. J Hydrol 355:34–48

  • Zelenhasic E, Salvai A (1987) A method of streamflow drought analysis. Water Resour Res 23(1):156–168

Acknowledgments

The authors thank AEMET (Spanish Meteorological Agency) for the precipitation data, and Ministerio de Educación y Ciencia (Spanish Department of Education and Science) for the financial support through the project CGL2006.02485/CLI.

Author information

Corresponding author

Correspondence to Ana C. Cebrián.

Appendix

1.1 Partial likelihood functions

1.1.1 General expression of a partial likelihood function

Let H(t) represent the complete history of the process of a failure time (duration) variable T, so that it records all failure information as well as the information on all covariates up to time t. Let us assume that H(t) is a Markov process, that is, its future evolution depends on the whole past trajectory only through its current value. Let us denote by \(d_1, d_2, \ldots, d_n\) the failure times (durations) of the n individuals (events) in the sample. The values of the covariate vector for each individual i and for each time t not greater than \(d_i\) are known and denoted by \({\mathbf x}_i(t)\). It is also assumed that, given H(t), the failure mechanisms act independently over \([t, t + dt)\). Under these assumptions, the likelihood, apart from differential elements, is

$$ \ell_P(\beta) = \prod_{i=1}^n h[{d_i; \theta({\mathbf x}_i(d_i),\beta)}] {\mathcal P}_0^\infty \left\{\prod_{j\in V(t)}\left(1-h[{t; \theta({\mathbf x}_j(t),\beta)}]dt\right)\right\} $$
(5)

where \(h[t; \theta({\mathbf x}_j(t),\beta)]\) is the hazard of individual j at time t and V(t) is the set of individuals with failure times greater than t. This expression can be found in Kalbfleisch and Prentice (2002), but the product integral \({\mathcal P}_0^\infty\) must still be evaluated, and its calculation depends on the type of distribution of the variable. When a discrete distribution is used to model a failure time T, we assume that the positive integers r are the only times at which an element can fail. In addition, the hazard function is considered null after the last failure time \(d_n\). Then, apart from differential elements, it follows that

$$ \begin{aligned} &{\mathcal P}_0^\infty \left\{\prod_{j\in V(t)} \left(1-h[{t; \theta({\mathbf x}_j(t),\beta)}]dt\right)\right\} = \prod_{r=1}^{d_n}\prod_{j\in V(r)} \left(1-h[{r; \theta({\mathbf x}_j(r),\beta)}]\right)\\ &\quad =\prod_{r=1}^{d_n}\prod_{j,d_j > r} \left(1-h[{r; \theta({\mathbf x}_j(r),\beta)}]\right) = \prod_{i=1}^n\prod_{r < d_i}\left(1-h[{r; \theta({\mathbf x}_i(r),\beta)}]\right), \end{aligned} $$
(6)

where the product index r varies over the positive integers lower than the duration of the event i. Substituting the product integral in Eq. 5, the partial likelihood function for discrete duration variables is obtained,

$$ \ell_P(\beta) = \prod_{i=1}^n \left[ h[{d_i; \theta({\mathbf x}_i(d_i),\beta)}] \prod_{r < d_i}\left(1-h[{r; \theta({\mathbf x}_i(r),\beta)}]\right) \right]. $$
(7)

The corresponding partial loglikelihood is,

$$ \ell \ell_P(\beta) =\sum_{i=1}^n \left[ \log h[{d_i ;\theta({\mathbf x}_i(d_i),\beta)}]+ \sum_{r < d_i}\log\left(1-h[{r; \theta({\mathbf x}_i(r), \beta)}]\right) \right]. $$
(8)
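As an illustration of Eq. 8, the following minimal Python sketch evaluates the discrete partial loglikelihood once the hazards have been computed. The function name `discrete_partial_loglik`, the data layout (one list of hazards per event, covering the integer times 1 to \(d_i\)) and the numerical values are assumptions made for this example only; they are not taken from the paper.

```python
import math

def discrete_partial_loglik(hazards):
    """Partial loglikelihood of Eq. 8.

    hazards[i] is the list [h_i(1), ..., h_i(d_i)] with the hazard of event i
    at each integer time up to its duration, so d_i = len(hazards[i]).
    """
    total = 0.0
    for h in hazards:
        *survived, ended = h  # times r < d_i, then the failure time d_i
        total += math.log(ended)
        total += sum(math.log(1.0 - hr) for hr in survived)
    return total

# Hypothetical hazards for two events with durations 3 and 1.
example = [[0.2, 0.3, 0.5], [0.4]]
value = discrete_partial_loglik(example)
```

For these values the result equals the logarithm of the product in Eq. 7, that is, log(0.8 × 0.7 × 0.5 × 0.4).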

The derivation for the continuous distributions is a bit more complex. Considering a partition \(0=\tau_0 < \cdots < \tau_m <\infty,\) such that \(\lim_{m \to \infty} \Updelta \tau_i =0 ,\) it follows,

$$ \begin{aligned} {\mathcal P}_0^\infty \left\{\prod_{j\in V(t)}(1-h[{t; \theta({\mathbf x}_j(t),\beta)}] dt)\right\}&=\exp \left[ \lim_{m \to \infty} \sum_{i = 1}^{m} \ln \left( \prod_{j \in V(\tau_i)} \left[1 - h[\tau_i ;\theta ({\mathbf x}_j (\tau_i),\beta)]\Updelta \tau_i \right] \right) \right]\\ &= \exp \left[ \lim_{m \to \infty} \sum_{i = 1}^{m} \left( \sum_{j \in V(\tau_i)} \ln \left[1 - h[\tau_i ;\theta ({\mathbf x}_j (\tau_i),\beta)] \Updelta \tau_i \right] \right) \right] \\ & = \exp \left[ - \lim_{m \to \infty} \sum_{i = 1}^{m} \left( \sum_{j \in V(\tau_i)} h[\tau_i ;\theta ({\mathbf x}_j (\tau_i),\beta)]\Updelta \tau_i \right) \right], \end{aligned} $$
(9)

and interchanging the sums and applying the integral definition,

$$ = \exp\left[- \sum_{i=1}^n \int\limits_0^{d_i} h[{t; \theta({\mathbf x}_i(t),\beta)}] dt \right]= \prod_{i=1}^n \exp\left[- \int\limits_0^{d_i} h[{t; \theta({\mathbf x}_i(t),\beta)}] dt \right]. $$
(10)
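The limit behind Eqs. 9 and 10 can be checked numerically for a single individual. The sketch below is only an illustration under stated assumptions: the hazard `hazard(t) = 0.5 + 0.1 t`, the duration `d = 3` and the function name `product_integral` are hypothetical choices, picked because the integral of this hazard is available in closed form.

```python
import math

def product_integral(hazard, d, m):
    """Product of 1 - h(tau) * dtau over a regular partition of [0, d] into m cells."""
    dtau = d / m
    prod = 1.0
    for k in range(m):
        prod *= 1.0 - hazard(k * dtau) * dtau
    return prod

def hazard(t):
    # Hypothetical hazard, chosen only because its integral is known exactly.
    return 0.5 + 0.1 * t

d = 3.0
exact = math.exp(-(0.5 * d + 0.05 * d ** 2))  # exp(-integral of h over [0, d])
coarse = product_integral(hazard, d, 10)
fine = product_integral(hazard, d, 100_000)
```

As the partition is refined, the finite product approaches the exponential of minus the integrated hazard, which is the single-individual factor in Eq. 10.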

Finally, substituting this product in Eq. 5, the following partial likelihood function is obtained,

$$ \ell_P(\beta) = \prod_{i=1}^n h[{d_i; \theta({\mathbf x}_i(d_i), \beta)}] \ \exp\left(-\int\limits_0^{d_i} h[{t; \theta({\mathbf x}_i(t), \beta)}] dt\right). $$
(11)

The corresponding partial loglikelihood is,

$$ \ell \ell_P(\beta) = \sum_{i=1}^n\left( \log h[{d_i; \theta({\mathbf x}_i(d_i), \beta)}] - \int\limits_0^{d_i} h[{t; \theta({\mathbf x}_i(t), \beta)}] dt \right). $$
(12)

The PL functions have the same asymptotic properties as likelihood functions, and most inference methods are valid for the PL functions resulting from models with internal covariates; see Kalbfleisch and Prentice (2002).

1.1.2 Partial likelihood functions of the geometric and exponential distributions

Partial likelihood of a geometric distribution. The hazard function of a geometric distribution is equal to the parameter p of the distribution. Since p must be in the interval [0, 1], a logistic link is used. Hence, the hazard function of an element with covariate vector \({\mathbf x}(\cdot)\) is,

$$ h[t;p({\mathbf x}(t),\beta)]=p({\mathbf x}(t),\beta)={\frac{\exp[\nu({\mathbf x}(t),\beta)]} {1+\exp[\nu({\mathbf x}(t),\beta)]}}, $$
(13)

where \(\nu({\mathbf x}(t),\beta)=\beta_0+\beta_1 x_1(t)+ \cdots +\beta_K x_K(t)\) is the linear predictor. Substituting this hazard in Eq. 7, the resulting partial likelihood function is,

$$ \ell_P(\beta) = \prod_{i=1}^n \left[ p({\mathbf x}_i(d_i),\beta) \prod_{r < d_i}[1-p({\mathbf x}_i(r),\beta) ]\right]. $$
(14)

Since

$$ \log \left[ p({\mathbf x}_i(d_i),\beta) \right]=\nu({\mathbf x}_i(d_i),\beta)-\log\left( 1+\exp[\nu({\mathbf x}_i(d_i),\beta)] \right) $$
(15)

and

$$ \log[1-p({\mathbf x}_i(r),\beta)]=\log \left[ {\frac{1} {1+\exp[\nu({\mathbf x}_i(r),\beta)]}}\right]=-\log \left(1+\exp[\nu({\mathbf x}_i(r),\beta)]\right), $$
(16)

the partial loglikelihood function can be expressed as,

$$ \begin{aligned} \ell \ell_P(\beta) &= \sum_{i=1}^n \left[ \nu({\mathbf x}_i(d_i),\beta)-\log\left( 1+\exp[\nu({\mathbf x}_i(d_i),\beta)] \right)- \sum_{r < d_i} \log\left(1+\exp[\nu({\mathbf x}_i(r),\beta)]\right)\right] \\ &=\sum_{i=1}^n \left[\nu({\mathbf x}_i(d_i),\beta)-\sum_{r\le d_i} \log(1+\exp[\nu({\mathbf x}_i(r),\beta)])\right]. \end{aligned} $$
(17)
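The simplification leading to Eq. 17 can be verified numerically. The following Python sketch, with hypothetical data, coefficients and function names, evaluates the partial loglikelihood both in the compact form of Eq. 17 and directly as the logarithm of Eq. 14, so that the two values can be compared.

```python
import math

def linear_predictor(beta, x):
    """nu = beta_0 + beta_1 x_1 + ... + beta_K x_K for one covariate vector x."""
    return beta[0] + sum(b * xk for b, xk in zip(beta[1:], x))

def geometric_partial_loglik(beta, events):
    """Eq. 17. events[i] lists the covariate vectors x_i(1), ..., x_i(d_i)."""
    total = 0.0
    for covs in events:
        nus = [linear_predictor(beta, x) for x in covs]
        total += nus[-1] - sum(math.log1p(math.exp(nu)) for nu in nus)
    return total

def geometric_partial_loglik_direct(beta, events):
    """Log of Eq. 14: log p at d_i plus log(1 - p) at every r < d_i."""
    total = 0.0
    for covs in events:
        p = [1.0 / (1.0 + math.exp(-linear_predictor(beta, x))) for x in covs]
        total += math.log(p[-1]) + sum(math.log(1.0 - pr) for pr in p[:-1])
    return total

# Hypothetical data: two events (durations 3 and 2) and a single covariate.
events = [[(0.1,), (0.4,), (1.2,)], [(0.9,), (1.5,)]]
beta = [-1.0, 0.8]
compact = geometric_partial_loglik(beta, events)
direct = geometric_partial_loglik_direct(beta, events)
```

Both functions should return the same value up to rounding error; the compact form avoids computing the probabilities explicitly.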

Partial likelihood of an exponential distribution. The hazard function of an exponential distribution is equal to the parameter λ of the distribution. Since this parameter must be positive, a log link is used. Hence, the hazard function of an element with covariate vector \({\mathbf x}(\cdot)\) is,

$$ h[t;\lambda({\mathbf x}(t),\beta)]=\lambda({\mathbf x}(t),\beta)=\exp[\nu({\mathbf x}(t), \beta)]. $$
(18)

Due to measurement limitations, covariates are usually recorded in a discrete way. If the value of the covariates is considered constant between t and t + 1, the hazard function is also constant during that interval. Consequently,

$$ \int\limits_0^{d_i} h[{t; \lambda({\mathbf x}_i(t), \beta)}] dt = \sum_{r\le d_i} h[{r; \lambda({\mathbf x}_i(r),\beta)}], $$
(19)

where r is an index varying over the positive integers not greater than the duration \(d_i\). Substituting the expression of this integral in Eq. 12, the resulting partial loglikelihood function is,

$$ \ell \ell _P(\beta) = \sum_{i=1}^n \left(\nu({\mathbf x}_i(d_i),\beta)- \sum_{r \le d_i} \exp[\nu({\mathbf x}_i(r),\beta)]\right). $$
(20)
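A minimal Python sketch of Eq. 20 follows. The function name and the data are hypothetical. As a sanity check under the assumption of no covariates (intercept only), Eq. 20 reduces to \(n\beta_0 - \exp(\beta_0)\sum_i d_i\), which is maximized at \(\exp(\beta_0) = n/\sum_i d_i\); the example uses three events with durations 2, 3 and 5, so the maximizing rate is 0.3.

```python
import math

def exponential_partial_loglik(beta, events):
    """Eq. 20. events[i] lists the covariate vectors x_i(1), ..., x_i(d_i)."""
    total = 0.0
    for covs in events:
        nus = [beta[0] + sum(b * xk for b, xk in zip(beta[1:], x)) for x in covs]
        total += nus[-1] - sum(math.exp(nu) for nu in nus)
    return total

# Hypothetical intercept-only example: empty covariate vectors, durations 2, 3 and 5.
events = [[()] * 2, [()] * 3, [()] * 5]
at_mle = exponential_partial_loglik([math.log(0.3)], events)
below = exponential_partial_loglik([math.log(0.25)], events)
above = exponential_partial_loglik([math.log(0.35)], events)
```

In this intercept-only case, `at_mle` should exceed both `below` and `above`, in agreement with the closed-form maximizer.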

1.2 Equivalence of the approaches to the modeling of the hazard function

The objective of this Appendix is to show the equivalence between the modeling process of the hazard function of durations when a geometric distribution is assumed (Sect. 1) and the modeling of the probability of ending using a GLM with a Bernoulli error (Sect. 2). Specifically, we will show that these approaches are equivalent in the sense that both lead to the same final model (the same covariates and the same fitted parameters).

For a fixed set of covariates, the parameters to be estimated in both approaches are the same, since the hazard function of the geometric duration of an event is equal to the probability of the event ending, and because a logistic link is used in both cases to relate the response and the linear predictor. The covariate selection is also based on the same likelihood ratio tests. Thus, to show the equivalence it is enough to prove that the GLM likelihood function is equal to the PL function derived for the geometric durations in the hazard function approach.

The PL function for geometric durations has already been derived (see Eq. 14), and the calculation of the likelihood of the considered GLM is rather simple. Let us assume a sample of n events with durations \(d_i\) and covariate vectors \({\mathbf x}_i(r)\) observed for the positive integers \(r \le d_i\). Let \(D_i(t)\) be the binary variable equal to 1 if event i ends at time t and 0 otherwise. Given that the conditional variables \(D_i(t)| {\mathbf x}_i(t)\) follow a Bernoulli distribution with parameter \(p\left({{\mathbf x}_i(t),\beta}\right),\) the likelihood function of the sample is,

$$ \begin{aligned} \ell (\beta) &= \prod_{i=1}^n \left[ P[D_i(d_i)=1| {\mathbf x}_i(d_i)]\prod_{r<d_i} P[D_i(r)=0| {\mathbf x}_i(r) ] \right]\\ &= \prod_{i=1}^n \left[p\left({{\mathbf x}_i(d_i),\beta}\right) \prod_{r<d_i}[1-p\left({{\mathbf x}_i(r),\beta}\right)]\right], \end{aligned} $$
(21)

and the equivalence of the modeling processes follows straightforwardly.
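The equivalence can be illustrated with a short Python sketch. Each event is expanded into one binary observation per time unit (0 while the event continues, 1 at the time it ends), and the Bernoulli loglikelihood of these observations is compared with the logarithm of Eq. 14. The function names, the data and the coefficient values are hypothetical and serve only this illustration.

```python
import math

def expand_to_binary(events):
    """One row (x_i(r), D_i(r)) per event i and time r <= d_i; D is 1 only at r = d_i."""
    rows = []
    for covs in events:
        last = len(covs) - 1
        for r, x in enumerate(covs):
            rows.append((x, 1 if r == last else 0))
    return rows

def prob(beta, x):
    """Logistic link: p = exp(nu) / (1 + exp(nu))."""
    nu = beta[0] + sum(b * xk for b, xk in zip(beta[1:], x))
    return 1.0 / (1.0 + math.exp(-nu))

def bernoulli_loglik(beta, rows):
    """Loglikelihood of a logistic GLM with Bernoulli response on the expanded rows."""
    return sum(math.log(prob(beta, x) if y else 1.0 - prob(beta, x)) for x, y in rows)

def partial_loglik(beta, events):
    """Log of Eq. 14, computed event by event."""
    total = 0.0
    for covs in events:
        total += math.log(prob(beta, covs[-1]))
        total += sum(math.log(1.0 - prob(beta, x)) for x in covs[:-1])
    return total

# Hypothetical data: two events (durations 3 and 2) and a single covariate.
events = [[(0.1,), (0.4,), (1.2,)], [(0.9,), (1.5,)]]
rows = expand_to_binary(events)
beta = [-1.0, 0.8]
glm_value = bernoulli_loglik(beta, rows)
pl_value = partial_loglik(beta, events)
```

Since the two functions agree for every value of the coefficients, maximizing either one gives the same estimates; in practice the expanded rows could be passed to any standard logistic regression routine.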

About this article

Cite this article

Cebrián, A.C., Abaurrea, J. Risk measures for events with a stochastic duration: an application to drought analysis. Stoch Environ Res Risk Assess 26, 971–981 (2012). https://doi.org/10.1007/s00477-011-0521-5

  • DOI: https://doi.org/10.1007/s00477-011-0521-5
