Modelling the Clustering of Extreme Events for Short-Term Risk Assessment

Reliable estimates of the occurrence rates of extreme events are highly important for insurance companies, government agencies and the general public. The rarity of an extreme event is typically expressed through its return period, i.e. the expected waiting time between events of the observed size if the extreme events of the processes are independent and identically distributed. A major limitation with this measure is when an unexpectedly high number of events occur within the next few months immediately after a T year event, with T large. Such instances undermine the trust in the quality of risk estimates. The clustering of apparently independent extreme events can occur as a result of local non-stationarity of the process, which can be explained by covariates or random effects. We show how accounting for these covariates and random effects provides more accurate estimates of return levels and aids short-term risk assessment through the use of a complementary new risk measure. Supplementary materials accompanying this paper appear online.


INTRODUCTION
Floods and other extreme weather-related hazards are often described in terms of their return period; i.e. the expected waiting time between events if the processes being described are assumed to be stationary. In recent years, the annual exceedance probability (AEP) has also been used as an alternative metric to communicate the severity of an extreme event, because it emphasises that the risk is present in any year. When the process is changing over time, such as with a trend, e.g. due to climate change, the return period is not a particularly helpful measure of risk, but the AEP is still useful, and it simply changes from year-to-year. When the process has such non-stationarity a number of alternative ways of best presenting the associated risk over time have been proposed (Rootzén and Katz 2013;Cheng et al. 2014). Here, we examine deficiencies in the return period and AEP descriptions for stationary processes when separate independent flood events cluster in time, and we propose an additional risk measure to address the consequent difficulty in properly communicating this risk.
During the winter of 2015/2016, the north-west of the UK experienced a sequence of storm events known as Desmond, Eva and Frank. Sustained heavy rainfall caused approximately £1.6billion of economic damages, flooded 21,000 properties and severed major transport routes including the west coast train line being closed for a number of months (Environment Agency 2018). For some communities, such as in Carlisle, which had flooded badly in 1979, 1980, 1984 and 2005, it was the latest in a series of repeated flood events. Similar experiences have been felt elsewhere in the UK, e.g. for Harbertonford on the River Harbourne in Devon. This is because Harbertonford was flooded 21 times in the last 60 years and from 1998 to 2000 was flooded on six individual occasions (Bradley 2005;Cresswell 2012). Consequently, a number of Environment Agency studies have tried to determine how best to alleviate the risk of flooding.
This reoccurrence of extreme events illustrates long-term problems with the miscommunication of risk to the general public and decision-makers (Vogel et al. 2011;Cooley 2013;Olseon 2015), which stems from a historical confusion of long-and short-term risks through a single risk measure, namely the return period. The return level z T is associated with the return period T , through the distribution function G of the annual maximum by G(z T ) = 1−1/T and the AEP being T −1 . Strictly, T is the return period of the annual maxima, with T ALL = −1/ log (1 − T ) being the expected waiting time between exceedances of z T for all of the data, where T ALL ≈ T + 0.5 for T > 20. A major limitation with the return period as a risk measure is that there are regular occurrences when within the few months following a T year event (T > 50 year) another event of similar or greater severity occurs. This undermines the reputation of statisticians and flood risk managers.
The problem is that the current method of communicating the severity of an event focusses on the process being stationary, whereas this is not often the case (Black and Werritty 1997;Hannaford and Buys 2012;Gilleland et al. 2017). Clustering of apparently independent extreme events exists as a result of local non-stationarity of the process. Flood risk managers need methods to assess the short-term risk of the reoccurrence of flooding after large events have been observed, which the return period is unable to capture if G changes from yearto-year.
In some environmental applications, the standard convention is to model the extremes of the data as being identically distributed and to ignore the potential effects of covariates. There are two different univariate approaches: either a block maxima-based approach or the method considered here, which looks at exceedances above a predetermined threshold. We adopt the latter approach due to the efficiency gains in maximising the available information in the data as well as it providing an analysis at a temporal scale for which detecting covariate-response relationships is easier (Coles 2001). There are two alternative threshold approaches, the generalised Pareto distribution (Davison and Smith 1990) and the nonhomogeneous Poisson process (Smith 1989). We adopt the latter approach as it is most easy to parametrise non-stationarity. For example, a linear trend is accounted for in the location parameter of the Poisson process approach unlike for the generalised Pareto approach which . Exceedance and inter-arrival times for independent storm season events extracted from a south Devon river flow time series above thresholds u 1 , u 2 and u 3 corresponding to 0.7, 0.8 and 0.9 quantiles. a Clustering of extreme events with the crosses representing the set of event times. b Probability-probability plots of inter-arrival times for exceedances of u 1 , u 2 and u 3 , respectively, 95% tolerance bands on each plot are derived under the assumption that inter-arrivals are exponentially distributed. The grey line shows the line of equality.
instead needs both the rate of exceedance and the scale parameters to change in a nonlinear fashion (Coles 2001). The threshold-based approaches typically assume that observations are independent and identically distributed (iid). Consequently declustering of extreme values is often performed to obtain independent extreme events for modelling and inference. The most widely used method to decluster extreme observations is the runs method of Ferro and Segers (2003). Using a predetermined event window, w, periods with at least w consecutive observations below the threshold are deemed to define separate independent events. Only the cluster maximum is extracted from each event, to produce a set of independent extreme observations. However, for a declustered river flow time series for a south Devon catchment focussing only on the storm season data, Fig. 1a shows that, over a range of thresholds, clusters of extreme events still exist even after declustering has taken place. If the events were iid, the occurrence times would be well approximated by a Poisson process with exponentially distributed inter-arrival times. The P-P plots in Fig. 1b show that the inter-arrival times do not follow an exponential distribution, instead the data have a greater probability of short waiting times than is expected under this distribution, and this finding arises for three different thresholds choices.
We will show that this clustering of independent events can be described by local nonstationarity, a local change in the marginal distribution of the process. Ignoring this feature leads to biased return period estimates and an over-optimistic assessment of risk following an extreme event. In many cases, local non-stationarity might be linked to changes in climatic covariate values. Previous studies have shown that metrics such as the North Atlantic Oscillation (NAO) or the Southern Oscillation Index (SOI) influence the weather conditions in their respective spatial regions (Trigo et al. 2002;Hurrell et al. 2003). Recent research has also focused on the influence of atmospheric rivers (defined as a concentrated narrow channel of heavy vapour) and their influence on winter flooding in the UK (Lavers et al. 2013), which has the potential to identify different covariate effects by spatial location.
Detecting how a covariate affects a response such as river level can help to improve estimates of return levels and reduce the uncertainty in parameter estimates (Coles 2001). In some cases, there is knowledge of the type of covariates that affect the process of the interest, but the data on the covariate(s) are unavailable, or we may have the right covariates, but we are unsure of the appropriate functional form in which to include them in the model. Therefore, we adopt similar methods to Eastoe and Tawn (2010) and Eastoe (2019) to account for the unavailable covariates, through the inclusion of random effects in the parameters of the extreme value models. These approaches are particularly beneficial for accounting for presumed large-scale climatic covariates at small environmental scales, where detecting meaningful relationships is particularly difficult. Accounting for unavailable covariate information through random effect modelling can also improve marginal estimates and reduce the uncertainty in estimates of return values through using more physically realistic statistical models to capture the behaviour of river flow data. This paper will present a novel measure to help inform about short-term risk assessment from local non-stationarity; it will provide additional information to the return period which integrates out the local non-stationarity. The proposed measure characterises the heightened short-term risk of extreme events given the occurrence of a T year event at time t in a season. It is similar in interpretation to the definition of relative risk from epidemiology. The risk measure is T < 1) there is an increased (decreased) risk of observing another extreme event for the remainder of the season. Estimates of this new measure can be derived from our models of local non-stationarity of extreme events. The risk measure is illustrated with both simulated and observed data.
The structure of the paper is as follows: Sect. 2 defines the required underlying univariate extreme value theory used to define the risk measure. Section 3 presents the risk measure methodology, and this is illustrated through simulation studies in Sect. 4. The short-term risk measure is then illustrated in Sect. 5 in the context of a case study using peak river flow data from Devon. We also provide a practical interpretation of how this risk measure can be used for short-term risk assessment to convey more clearly the reoccurrence chance of extreme events and to help clarify the misinterpretation of an event's return period.

NON-HOMOGENEOUS POISSON PROCESS
Consider an iid sequence of random variables Z 1 , . . . , Z n , and let M n = max{Z 1 , . . . , Z n }. If the distribution of the linearly normalised variable (M n − b n )/a n , a n > 0, converges to a non-degenerate distribution as n → ∞, this non-degenerate distribution G must be the generalised extreme value distribution (GEV) (Coles 2001), i.e.
where G(z) takes the following form, The GEV distribution function (2.2) is defined through three parameters; location μ ∈ R, scale σ > 0 and shape ξ ∈ R. We may be interested in the maximum up to a scaled time t ∈ (0, 1], i.e. M 1: tn , where here and subsequently M i: j = max{Z i , . . . , Z j } for i < j, and x denotes the integer part of x, for example, the index tn is the number of these n random variables that have been observed up until scaled time t. The distribution of the maximum M 1: tn also converges to a non-degenerate distribution as n → ∞ with G given by expression (2.2). The point process representation is an extension of limit (2.1) from maxima to all large values and requires the same asymptotic assumptions. Consider a sequence of point processes P n , defined on R 2 , with that converge to a non-homogeneous Poisson process P. The scaling of P n is chosen so that the first component, time, is scaled to the interval (0, 1) (essentially a normalised temporal index for the random variable Z ). The second component of P n , the maximum of the normalised points, converges in distribution to a non-degenerate limit, given by Eq. (2.1). The point process P n → P as n → ∞, on the set [0, 1]×(u, ∞) for all u < z with u defined such that u = sup {z : G(z) = 0}, for G(z) in Eq. (2.2), where P is a non-homogeneous Poisson process with intensity of the form

The integrated intensity for the set
In a modelling context, the asymptotic limit for the point process representation is assumed to hold for large n, i.e. above a high threshold u with the normalising constants a n > 0 and b n absorbed into the location and scale parameters of the non-homogeneous Poisson process. In the stationary case, the likelihood (Coles 2001) for the points that are above the threshold u, denoted by z 1:n u = z 1 , . . . , z n u , is where n y is the number of block of data , e.g. the number of years or flooding seasons. With this choice of n y , the block maximum follows a GEV(μ, σ, ξ) distribution. Inference is performed by using maximum likelihood estimation.
For standard risk assessments, we need to estimate the T year return values denoted by z T , defined in Sect. 1. In the stationary case, we estimate z T bŷ where (μ,σ ,ξ) are estimates obtained by maximising likelihood (2.4).

OBSERVED COVARIATES
In many cases, the distribution of extreme values will be dependent on covariates. This results in the points of the non-homogeneous Poisson process now being independent but non-identically distributed. The covariate space instead of the time space is now considered, and as a result, points are no longer assumed to be observed uniformly across the space. We initially consider the case where the covariate changes at every observation of Z t . We focus on a scalar covariate S with corresponding density function h(s) for s ∈ (−∞, ∞). The effects of S can be accounted for in all three parameters of the Poisson process by modelling the parameters as {μ(s), σ (s), ξ(s)} when S = s. The corresponding intensity is The integrated intensity, stated in Eq. (2.6), is evaluated through the use of numerical integration as typically closed form expressions cannot be obtained. Further, a kernel density estimate using all marginal S data is obtained for h(s) to avoid the need for additional modelling assumptions.
In the presence of non-stationarity due to the covariate S, the likelihood in (2.4) becomes where L = L(μ, σ , ξ ; z 1:n u , s 1:n u ), with (μ, σ , ξ ) being the parameters of μ(s), σ (s), ξ(s) and s 1:n u = s 1 , . . . , s n u the covariates associated with extreme values z 1:n u . Here, we can drop the h(s) terms in the product as they do not involve μ, σ and ξ . The standard method to incorporate covariates is to use linear models, with appropriate link functions (Davison and Smith 1990); however, other more flexible methods such as generalised additive models or splines can also be used (Chavez-Demoulin and Davison 2005;Yee and Stephenson 2007). Regardless of the method chosen, the T year return level z T solves the equation where the effect of the covariate S is integrated out. A special case of a covariate relationship that we will consider is when both the covariate and its effect on the non-homogeneous Poisson process remain constant over a year (or a season). We write s i for the covariate in year i and z 1:n u = z i j , j = 1, . . . , n u (i) , where n u (i) is the number of exceedances in year i and i n u (i) = n u , the total number of exceedances. The likelihood L = L(μ, σ , ξ ; z 1:n u , s 1:n y ) is then

UNAVAILABLE COVARIATES
It will not always be possible to obtain an appropriate covariate S as required by the models in Sect. 2.2. In this situation, we want to account for covariates without explicitly stating their value. Unavailable covariates can be included in the model through the use of random effects (Laird and Ware 1982). Random effects have been adopted to capture unexplained behaviour in a number of previous applications of extreme value theory (Cooley et al. 2007;Cooley and Sain 2010). Similar methods to those discussed below were presented by Eastoe (2019) for the generalised Pareto distribution; however, this is the first time that these models have been presented for the non-homogeneous Poisson process. As with covariates, random effects can be incorporated into all three parameters of the non-homogeneous Poisson process and can be different for all parameters. We assume that these random effects remain constant within each block of time (e.g. years) i = 1, . . . , n y and are iid over blocks. The vector of all random effects is denoted by r = r μ , r σ , r ξ , with r μ = (r μ,1 , . . . , r μ,n y ) where r μ,i is the random effect for μ in year i, and similarly for r σ and r ξ . It is assumed that r i, j is independent of r k,l when (i, j) = (k, l) for all i, k ∈ {μ, σ, ξ } and j, l ∈ 1, . . . , n y . This results in non-homogeneous Poisson process (NHPP) parameters for a given block i As the parametrisation in Eq. (2.8) is linear in the random effect, we can assume that the random effects have zero mean and unit variance. For parsimony and computational convenience, we assume that they jointly follow the multivariate Normal distribution ⎡ ⎢ ⎣ where is the covariance matrix and ρ = (ρ μ,σ , ρ μ,ξ , ρ σ,ξ ) represents the correlations between the parameter-specific random effects. The resulting single-site likelihood is then (2.10) where φ is the function of the standard multivariate Normal distribution given in Eq. (2.9). Unlike in Sect. 2.2, obtaining maximum likelihood estimates for the parameters and random effects in (2.10) is difficult. Following Eastoe (2019), we therefore use a Bayesian framework and estimate posterior distributions for the parameters using Markov chain Monte Carlo (MCMC) methods with a Metropolis-Hastings random walk (Gilks et al. 1995). The adoption of Bayesian methods allows us to incorporate prior information, which can be ascertained from domain experts as described in Coles and Tawn (1996). In order to obtain convergence of the MCMC chains, the distributions from which parameters are proposed are tuned by using an adaptive algorithm until the optimal acceptance rate is reached (Roberts and Rosenthal 2009). Standard visual assessments, such as trace plots, are performed to assess the convergence of the chains.
For a historical block i for which we have an estimate of the random effects, we can determine a block i specific return level z T,i through solving Eq. (2.11) with the parameters and the random effect evaluated at the posterior estimates. If required, we can still obtain a single cross-year return level curve by integrating over the random effects and solving to obtain an estimate of z T .

RISK MEASURE FOR THRESHOLD EXCEEDANCE DATA
We propose a short-term risk measure that is updated for the rest of a block using information on the size of the largest event that has been observed up a time point t in the block. Consider a window of normalised time [0, 1] for which the covariate remains constant and there exists a threshold u above which the models of Sect. 2 hold.
For a given time series, suppose that a T -year event of size z T is observed at time t. As a result of this event z T , we want to determine for the remainder of the time period (t, 1], the probability of observing an event of size greater than z T * > u, corresponding to a T * year return period event. We denote this by R (t) T,T * . Most often we envisage that interest will be in T * = T , i.e. future events are as rare as the previously largest event denoted by R T,T * is given in Fig. 2a with Z t = z T , and with z T * > z T so the return period T * > T .
Given that M 1: tn = z T > u, we are interested in determining for the remaining observations Z tn +1 , . . . , Z n the probability that M tn +1:n = max Z tn +1 , . . . , Z n > z T * as n → ∞. In terms of the Poisson process, this is equivalent to the probability of at least one point in the set B t,z T * = [t, 1] × [z T * , ∞) given M 1: tn = z T , see Fig. 2a. This probability is compared to the marginal probability of M tn +1:n > z T * without conditioning on the maximum value M 1: tn being observed up to time t; this allows us to assess the effect of observing an event of size z T . We define the measure of short-term risk by T,T * > 1 (< 1) there is an increase (decrease) in risk of an event of size z T * occurring.

DERIVATION OF THE RISK MEASURE UNDER A COVARIATE EFFECT
Knowing that M 1: tn = z T provides information about the value of the covariate S and consequently about how the risk has changed for that specific block of time. The denominator in definition (3.1) can be calculated using the non-homogeneous Poisson process with covariate S common over the block since as in Sect. 2.2, where h is the density of the covariate. The conditional probability of M tn +1:n being above a level z given that the maximum of M 1: tn is z corresponds to the numerator of expression (3.1) and is calculated as follows, P(M tn +1:n > z|M 1: and g 1: tn (z) and g 1: tn |S (z|v) are the marginal and conditional GEV densities of M 1: tn . In order to calculate these densities, we use the marginal probability from which it follows that

DERIVATION OF THE RISK MEASURE FOR AN UNAVAILABLE COVARIATE
The risk measure of Eq. (3.1) can also be estimated in the presence of random effects. The marginal probability given in Eq. (3.1) becomes where r = (r μ , r σ , r ξ ) and φ is the multivariate Normal density defined in Eq. (2.9). The numerator of expression (3.1) is calculated as follows, where φ R|1: tn (r; |z) = g 1: tn |R (z|r)φ(r; ) g 1: tn (z) , and g 1: tn (z) and g 1: tn |R (z|r) are again marginal and conditional GEV densities of M 1: tn . Finally, using the same strategy as in Sect. 3.2, we have

NUMERICAL ILLUSTRATION OF THE RISK MEASURE
To illustrate the ideas behind our proposal for the risk measure, Fig. 3 shows a simplified version of the problem where the covariate S takes only three distinct values s 1 , s 2 and s 3 . Here, s 1 , s 2 and s 3 correspond to the 2.5, 50 and 97.5% quantiles of a N(0, 1) variable and the distribution of the random variable of interest Z |S has a negative shape parameter. By T,T * > 1. More generally S will have a continuum of possible values, and an observed maximum value of z T in a year so far will favour some range of values of S over others and hence give an updated short-term risk for the rest of the year.
Risk measure (3.1) is illustrated in the case of a linear trend in the location parameter of the NHPP(μ(s) = α + βs, σ (s) = σ 0 , ξ(s) = ξ 0 ) with 30 years' of simulated data. The following parametrisation is used: α = 0, β = 2.5, σ 0 = 1.5 and the positive and negative shape parameter cases ξ 0 = ±0.2, with S ∼ N(0, 1). The choice of values of the shape parameter is consistent with observed values in the environment. We condition on z T being equal to a 1 in 100 year event and the event occurring at t = 0.4, i.e. 40% of the way into the storm season. We are interested in the occurrence of a value above z T * after the 1 in T year event and we consider this for T * ∈ [1, 100]. Figure 4 shows that the true short-term risk measure R (0.4) 100,T * is above one for both values of ξ . When ξ > 0, the distribution is unbounded, so there is always a non-zero probability, however, small, of being above a level z T * whatever the value of the covariate. A more dramatic effect is seen for ξ < 0, with the short-term risk measure increasing with level in Fig. 4b with a much larger magnitude than when ξ > 0. For example, the chance of observing a 1 in 50 year event is now more like a 1 in 2 year event.
We next estimate the risk measure using the non-homogeneous Poisson process model fitted to the data and treating the covariate S as either available or unavailable. When the covariate is available, the methods in Sects. 2.2 and 3.2 are used. When the covariate S is unavailable, but there is evidence that the data are not identically distributed, and the methods in Sects. 2.3 and 3.3 provide us with random effects estimates of the unavailable covariate.
The two estimates of short-term risk in Fig. 4 are both slightly lower than the truth but still reflect an increase in risk and capture the pattern of how R (0.4) 100,T * varies with T * . When the covariates are unavailable, the estimates are lower than when they are available and have larger uncertainty intervals; this is expected as there are a larger number of parameters to estimate. The uncertainty intervals show that both estimation methods give intervals which contain the truth. The estimates in Fig. 4 show the benefits of using random effect models to account for an unavailable covariate as they give similar risk measure estimates to when the correct covariate is observed. Furthermore, if we had not included random effects the risk measure would have been estimated to be equal to one, a considerable underestimate of the associated short-term risk.
Comparisons of estimates of the return levels of the annual maxima for a random year within the observation period are given in Fig. 5 for the same values of the shape parameter as in Fig. 4. The estimates for when the covariates are observed agree very with the truth. When the covariates are unavailable, the estimates are still good, but naturally perform slightly less well than if the covariate was observed. In contrast, if the covariate is ignored and an iid GEV distribution is wrongly fitted, this estimate is considerably different, in each case giving far too heavy tails and overestimating return levels because the covariate variation across years is interpreted as variation to be modelled by the GEV. This comparison highlights the importance of accounting for covariates to explain the local non-stationarity of extreme events, as lower and more accurate return levels estimates are achieved. The behaviour of the return level estimates is consistent with the results presented in Carter and Challenor (1981), who showed theoretically that return values estimates from subsamples of the population are lower than the estimates from sampling of the population. This highlights that the inclusion of covariates improves the marginal estimation of return values.

STRATEGY AND RIVER FLOW DATA
The methods and risk measure developed in Sects. 2 and 3 are now applied to river flow data from the River Harbourne in a catchment in South Devon in the South West of England. As discussed in Sect. 1, England's Environment Agency has focused on trying to reduce the risk of flooding to the communities situated along this river. They built a flood defence at Palmer's Dam, upstream of Harbertonford, to create extra capacity and reduce the risk of flooding (Cresswell 2012). However, this was overtopped in 2012 by a flood estimated to be a 1 in 40 year event, which caused further flooding to properties in the local area (Devon County Council. 2013).
Daily mean river flow data from 1998-2017 for the Rolster Bridge gauging station on the River Harbourne are considered. The data were declustered above the 97th percentile with an event window of w = 7 days (Keef et al. 2009) to produce on average three independent events per year. The methods described in Sect. 2.2 were used to model the declustered data, and there was no statistically significant evidence to include random effects in the parameters of the Poisson process possibly due to a relatively short record length of 20 years. However, this changes if we examine evidence for random effects by also incorporating data from neighbouring sites with longer record lengths. This longer record length and an assumption of a common random effect across a region allow us to pool information to obtain more reliable estimates of the random effects. The regional random effect, representing a function of large-scale weather events, induces common shared behaviour between river flow gauges in a spatially homogeneous region.
The additional data set consists of peak flows from five nearby river flow gauges, for the water years 1958-2013 (year defined from October to September). Data were obtained from the National River Flow Archive (NRFA 2014), and an exploratory analysis of these data can be found on the National River Flow Archive website (NERC CEH 2018). The gauges are all situated within the same hydrometric area as the River Harbourne, and so share similar physical characteristics, see Fig. 6. However, they do not include any stations that are situated on the River Harbourne. All peak flow data sets were declustered to produce on average five independent events ever year.

REGIONAL RANDOM EFFECTS MODEL
The random effects model of Sect. 2.3 can be adopted to include a common regional random effect using the methods outlined in Eastoe (2019), who develops an equivalent model for the generalised Pareto distribution. The model given in Eq. (2.8) is extended to include both site-specific effects as well as a common regional random effect across sites. Consider a set of sites that are located in a spatially homogeneous region with NHPP parameters for site d and year i given by where for a given year i the random effects r i = (r μ,i , r σ,i , r ξ,i ) have the joint distribution defined in Eq. (2.9). The parameters μ d,0 , σ d,0 and ξ d,0 are site-specific intercept terms that account for differences between the gauging stations, e.g. the catchment size. The parameters μ 1 , σ 1 and ξ 1 , and the random effects, are common across sites. Under the assumption of spatial independence of observations at different sites in year i, given the random effects r i for that year, coupled with independence of r 1 , . . . , r n y , the likelihood function over sites is Model (5.1) is fitted under a Bayesian framework to the NRFA peak river flow data described in Sect. 5.1, using an adaptive MCMC algorithm which was run for 200,000 iterations with a burn-in of 50,000. Uninformative priors were used for all parameters of the Poisson process. It was found that there was no evidence to incorporate random effects into the shape parameter; therefore, random effects were incorporated only into the location and scale parameters. This is a typical finding with covariates rarely found to be statistically significant for the shape parameter in environmental applications (Eastoe and Tawn 2009;Eastoe 2019). As a result of this, the random effects model of Sect. 2.3 is simplified so that r i = (r μ,i , r σ,i ) follows a standard bivariate Normal with correlation parameter ρ and ξ d is the shape parameter for site d.
The estimated random effects in Fig. 7a, b show that it is unsuitable to assume that the location and scale parameter of the Poisson process remain constant over years. The location parameter shows much greater variability across years than the scale parameter. In particular, the largest values for the location random effect correspond to known flood events such as those in the water years 1990(Devon County Council. 2013. Interestingly, the random effect in the scale parameter shows a clear positive value in 1979, corresponding to known floods in the region (Devon County Council. 2013), whereas there is no significant random effect in the location parameter for this year. This particular year has the largest flow across all river flow gauges and a particularly large value of NAO, which corresponds to wet winters. It saw several more typical size extreme events as well as the large flooding event, so the large value of the random effect in the scale parameter captures both of these features. A change in the random effect of the location parameter would be insufficient as it could not cover the associated change in the variance of the threshold excesses in this particular year. The correlation of the random effects was estimated to be 0.62 (0.13, 0.88). Within-series dependence of both sequences of random effects was explored to see whether we should extend the model, e.g. using an autoregressive process to model the random T,T * for gauges 46006 (a, c) and 46008 (b, d) if an extreme event occurred in December (t = 0.20) or June (t = 0.80) (top and bottom row, respectively). The black, red and blue lines correspond to already observing a 1 in T =10, 100 and 1000 year event with the corresponding shaded areas providing 95% credible intervals. effects. However, there was no evidence for serial correlation and so we retain the simpler model assuming random effects are independent over years.

SHORT-TERM RISK MEASURE ESTIMATES
We now estimate the short-term risk measure derived in Sect. 3 for two of the five peak flow river gauges discussed in Sect. 5.2. Figure 8 presents how R (t) T,T * changes depending on t, T and T * for each gauge. Two different time points t, December and June, are considered with T = 10, 100, 1000. The short-term risk measure estimates are larger when conditioning on more severe events. For each gauge, there is a significant increase in risk for the 100 and 1000 year events as shown in Fig. 8. We can see the trade-off between the size (return period T ) and timing, t, of the event in influencing the chance of observing extreme event of return Figure 9. Comparison of the regional random effects estimate for the a location and b scale parameters with the number of events for the Rolster Bridge gauging station from the River Harbourne.
period T * , with R (t) T,T * appearing to decrease as a function of t. As expected, the confidence interval is widest when T = 1000 years. By conditioning on an extreme event occurring later in the water year, in this case June, the short-term risk measure decreases due to the shorter amount of time remaining in the season in which to observe another extreme event. Although the regional random effects are common across the gauges, the risk estimates are different due to the site-specific parameters of the non-homogeneous Poisson process: so the risk measure captures the different effects of extreme events at each site.

RELATING THE RESULTS TO THE RIVER HARBOURNE
To assess the value of the annual regional random effect estimates (r μ,i , r σ,i ), we compare them with the associated annual number of threshold exceedances for the River Harbourne in Fig. 9. For each random effect, there does seem to be an association, which is stronger for the location parameter.
In order to assess the validity of the random effects shown in Fig. 7, the random effect estimates are taken as fixed and the model (2.8) is fitted to the data from the gauge on the River Harbourne, denoted by site H . The Poisson process parameter estimates of μ 1,H and σ 1,H are significantly different from zero, suggesting we have good covariates in the form of (r μ,i , r σ,i ). However, the estimate of the local location parameter μ 1,H for the spatial random effect was statistically significantly less than μ 1 (the regional location parameter estimate for the spatial random effect) so not all aspects of the regional model translate to the River Harbourne. Figure 10 shows both return level and risk measure plots for the Harbourne derived from using the regional random efforts and treating them as fixed. Figure 10a shows the return level curve for each annual random effect, illustrating how much the annual maximum distribution varies over time, as well as the time-averaged estimate which gives us a best estimate for the future return levels when excluding long-term climate changes. For comparison, we also show the estimated return level curve based on a GEV fit to the annual maxima from the Harbourne, similarly to Fig. 5. Again we find that, relative to fitting with random effects, this single-site GEV fit overestimates the risk of large events, i.e. events with return periods of T,T * for the River Harbourne if an extreme event occurred in December (t = 0.20). The black, red and blue lines correspond to already observing a 1 in T =10, 100 and 1000 year event with the corresponding shaded areas providing 95% credible intervals. more than 100 years. The risk measure estimates, shown in Fig. 10b, show that the short-term risk of flooding for the Harbourne increases as the size of the conditioning event increases. For example, for a risk assessment in December (20% of the way into the year) the increased risk of a 100-year event (T * = 100) in the remainder of the year doubles from 3 to 6 if the previously observed event had a return period of T = 10 or 1000 years, i.e. these updated odds of these events for the rest of the year are 1:33 to 1:17.

DISCUSSION
There has been much media coverage of the reoccurrence of extreme events with a seemingly long return period at a single location. What is often ignored is the fact the definition of the return periods of such events is only valid when the process is stationary. Clustering of apparently independent extreme events in a short period of time provides clear evidence that such an assumption is no longer valid. As a result, using return periods to communicate the severity of an event is no longer suitable for short-term risk assessment.
We have developed a short-term risk measure to convey the change in the risk of observing extreme events later in a storm season after observing an initial extreme event. This measure captures the local non-stationarity in time through the incorporation of covariate information, either from observations or through the use of random effects. The covariate information allows us to improve the accuracy of the statistical models and also to produce more reliable and informative return value estimates. This novel methodology helps to provide an advancement towards improving the long-term modelling of extreme flooding in the presence of temporal variability, an issue raised by the National Flood Resilience Review (Her Majesty's Government 2016; Tawn et al. 2018;Towe et al. 2018).
The methodology was applied to river flow data from South Devon. The models fitted in Sect. 5 showed that there is clear evidence of a similar inter-year non-stationarity over sites in this area. The random effect was compared against the climate index NAO, but no significant relationship was found. These random effects estimates were then used to estimate the short-term risk measure. For a hydrometric area, we have shown that there is a clear change in risk of observing an extreme event once one has already been observed. Naturally, the magnitude of this change in risk is a function of the size of the original event and the time at which it was observed.
The short-term risk measure developed in Sect. 3.1 is one particular method of assessing risk, but alternative risk measures could also be considered. We may be interested in the probability of there being an exceedance of z T for the remainder of the time period given that there have been n exceedances of z T so far in the year. Figure 2b illustrates this event with n = 3. The risk measure could then be constructed as the ratio of this probability with the marginal probability P(M tn :n > z T ).