Parametrisation of change-permitting extreme value models and its impact on the description of change

The potential for changes in environmental extremes is routinely investigated by fitting change-permitting extreme value models to long-term observations, allowing one or more distribution parameters to change as a function of time or some other covariate. In most extreme value analyses, the main quantity of interest is typically the upper quantiles of the distribution, which are often needed for practical applications such as engineering design. This study focuses on the changes in quantile estimates under different change-permitting models. First, metrics which measure the impact of changes in parameters on changes in quantiles are introduced. The mathematical structure of these change metrics is investigated for several change-permitting models based on the Generalised Extreme Value (GEV) distribution. It is shown that for the most commonly used models, the predicted changes in the quantiles are a non-intuitive function of the distribution parameters, leading to results which are difficult to interpret. Next, it is posited that commonly used change-permitting GEV models do not preserve a constant coefficient of variation, a property that is typically assumed to hold for environmental extremes. To address these shortcomings a new (parsimonious) model is proposed: the model assumes a constant coefficient of variation, allowing the location and scale parameters to change simultaneously. The proposed model results in changes in the quantile function that are easier to interpret. Finally, the consequences of the different modelling choices on quantile estimates are exemplified using a dataset of extreme peak river flow measurements in Massachusetts, USA. It is argued that the decision on which model structure to adopt to describe change in extremes should also take into consideration any requirements on the behaviour of the quantiles of interest.


Introduction
There is widespread interest in quantifying the impacts of climate and other anthropogenic changes on the likelihood of very extreme natural hazards (IPCC 2012). When assessing the risk of a natural hazard, for example to design critical infrastructures, an estimate is needed of the magnitude of extremes expected to occur with a certain rarity, typically derived as quantiles of a statistical distribution. This estimation is carried out under the assumption that the past recordings of the variable under study would still be representative of the stochastic behaviour of the variable at present and in the future. This can be formalised as an assumption that the process under study is stationary (Coles 2001;François et al. 2019). With the increasing evidence of the impacts of climate and other anthropogenic changes, the validity of this assumption is increasingly being challenged. As a result, a large of number of studies have investigated the evidence for change in risk levels of several natural hazards in observational data (e.g. Coates et al. 2014;Guerreiro et al. 2018;François et al. 2019, and references therein), typically using statistical approaches based on Extreme Value Analysis (Cooley 2009;Katz et al. 2002). Often, a default distribution is assumed and one or more distribution parameters are modelled as a function of covariates, e.g. time. The hypothesis that the behaviour of extremes might have been changing is formulated as a functional relationship between one or more of the distribution parameters and the covariate(s). This type of change-permitting analysis is also referred to as non-stationary analysis, while the models with constant values for all parameters throughout time are often referred to as stationary analysis. Table 1 shows an extremely non-exhaustive list of published articles based on such change-permitting analysis: the list is provided to showcase the variety of modelling choices and applications of change-permitting models. From the Table it can be seen that different combinations of distributions, model structures and covariates have been employed in the literature. All the differences in the model components eventually have an impact in the final understanding about changes in extremes. Specifically, most of the studies listed in Table 1 focus on the changes to the higher quantiles of the distribution.
In this study we discuss ways to describe the changes in frequencies and magnitude focussing on change in the quantile function, showing that different model structures necessarily result in different functional forms of changes in the quantile function. Further, we show that commonly adopted model structures can result in non-intuitive behaviour of the quantile function. We argue that a reflection on the type of change in higher quantiles which can be derived for each model structure should be performed at the initial stages of any analysis to ensure that the analysis results can provide the most appropriate communication of how extremes might be changing.
In particular, we formalise the investigation of how the different model structures impact the description of change on effective quantile estimates across different exceedance probabilities using two sets of criteria. The first set of criteria measures the impact of the changes which depend on the covariates used in the change-permitting models. The second set measures the effect of switching from a nochange model to a change-permitting model. Some of these measures have already been partially proposed in the literature: Vogel et al. (2011) introduced the magnification factor as a way to describe relative changes in quantiles in time. Further, the R package extRemes Gilleland and Katz (2016) provides functions to compute the difference in effective quantile estimates under different values of explanatory variables. Moreover, several studies in the literature, including those listed in Table 1, carry out some investigation of the implied impact on the magnitude of estimated quantile of using change-permitting models against results obtained when assuming constant parameters.
The analytical study of the change criteria shows that a minority of model structures results in simple definitions of change in the quantile function. One of these is a model structure which maintains a constant coefficient of Table 1 Examples of studies which use change-permitting models to assess changes in annual maxima records

Reference
Variable (Y) Distribution Model structure Covariate (X) Vogel et al. (2011) River flow Log-Normal LN(l; r) location: l ¼ l 0 þ l 1 x scale: r ¼ r 0 (constant) Time Villarini et al. (2009) River flow Gumbel(l; r) (among others) Location: l ¼ gðxÞ (non-parametric) scale: r ¼ gðxÞ (non-parametric) Time, population, rainfall Cheng and AghaKouchak (2014) Rainfall GEV(l; r; n) Location: l ¼ l 0 þ l 1 x scale: r ¼ r 0 (constant) shape: n ¼ n 0 (constant) Time Luo and Zhu (2018) Wave height GEV(l; r; n) River flow Gen. Logistic GLO(n; a; j) Rainfall and temperature (observed and modelled) GEV(l; r; n) Location: l ¼ l 0 þ l 1 x scale: r ¼ expfr 0 þ r 1 xg n ¼ n 0 þ n 1 x global temperature GEV Generalised Extreme Value distribution variation, thus enforcing a change in the scale of the distribution when the location is allowed to change. Such models are not commonly used in the investigation of changes in extremes, even though a constant coefficient of variation is a common characteristic of estimates based on environmental extremes records (Overeem et al. 2009;Menabde et al. 1999;Serago and Vogel 2018;Blanchet et al. 2009). The manuscript is organised as follow: a brief introduction of extreme values analysis and the models used to assess changes in extremes is given in Sect. 2. The quantile-based measures of change are presented in Sect. 3 and are applied to a case study of peak flow records in Massachusetts, USA, in Sect. 4. Section 5 closes the paper offering a perspective on the importance of model specification when investigating changes in extremes. The maximum likelihood estimation framework is adopted throughout the manuscript, although some of the concepts discussed would easily apply to models estimated within a Bayesian framework.
2 Statistical models for changing extremes 2.1 Extreme value models and design events; the fixed-parameters case Assessing the risk of a natural hazard involves an assessment about the frequency at which events of given magnitudes can be expected to be exceeded. Typically, this is done by estimating the design event, which is expected to be exceeded with a annual exceedance probability (AEP) of p in any given year. If the distribution of event magnitudes is constant in time and events from year to year can be deemed independent from one another, the AEP has a direct relationship with the return period T, the average period of time over which one event at-least as big as the design event would be recorded, where T ¼ 1=p (see Volpi 2019). Since the interest of the estimation focusses on the most extreme magnitudes which might be recorded, rather than the typical magnitudes, Extreme Value Analysis is used to model the distribution of the extreme values. In particular, extremes can be defined as the maximum value in a sample recorded over fixed periods of time (for example a year). It can be shown that the Generalised Extreme Value (GEV) distribution is the limiting distribution for maxima of stationary series (Coles 2001), although other distributions are sometimes used when estimating design events for engineering purposes (Castellarin et al. 2012).
In general, it is assumed that the variable of interest, denoted by Y, follows a certain distribution parametrised by h ¼ ðh 1 ; . . .; h d Þ with probability density function f ðy; hÞ. Typically a two (d ¼ 2) or three-parameter (d ¼ 3) distribution is used (Castellarin et al. 2012) and, in a first instance, all parameters are assumed to be constant. In the remainder we will focus on the GEV distribution due to its central role in the analysis of extremes. The GEV is a three parameter distribution characterised by a location parameter l (with l 2 ðÀ1; 1Þ), a scale parameter r (with r 2 ð0; 1Þ) and a shape parameter n (with n 2 ðÀ1; 1Þ). When n ¼ 0 the distribution reduces to a two parameter distribution: the Gumbel distribution. The distribution's domain depends on the sign of the shape parameter taking the following values: À1\y l À r=n for n\0, l À r=n y\1 for n [ 0 and À1\y\1 for n ¼ 0. For a given sample of extremes y ¼ ðy i ; . . .; y n Þ it is possible and typically straightforward to find an estimate of h denoted aŝ h. Estimates for the design event Q T (where T indicates the return period) can be derived from the quantile function corresponding to the fitted distribution qð1 À p;ĥÞ (where p ¼ 1=T) so that PðY [ Q T Þ ¼ p. For the GEV distribution the quantile function takes the form: where p indicates the exceedance probability. The quantile function for the GEV shown in Eq. (1) can be rewritten as: qð1 À p; l; r; nÞ ¼ l þ r n y Àn p À 1 h i for n 6 ¼ 0 l þ rðÀ logðy p ÞÞ for n ¼ 0 taking y p ¼ ðÀ logð1 À pÞÞ. Notice that for n ¼ 0 the quantile function of the Gumbel distribution is a linear transformation of À logðy p Þ with l being the intercept and r being the slope of the line.

Incorporating change in extreme value models; the change-permitting case
The hypothesis that the behaviour of extremes might have been changing is typically formulated by assuming a functional relationship between one or more of the distribution parameters, and one or more covariates. The validity of the hypothesis is assessed by comparing the goodnessof-fit of the models with varying parameters against models with fixed parameters. This approach is established for extreme value analysis (Coles 2001) and has some similarity to Generalized Linear Models and their extensions such as Vector Generalised Additive Models (Yee 2015) or Generalised Additive Models for Location Scale and Shape (GAMLSS, Rigby and Stasinopoulos 2005;Wood 2017), which are also referred to as distributional regression models (Umlauf and Kneib 2018). In this work, to keep the presentation and computations as simple as possible, only linear models (with appropriate link functions) with one covariate are explored. The concepts can be extended to the case in which more than one covariate would enter the model and in which the relationship between the covariate and the distribution parameter (or their link transformation) would be better described by some non-linear function. Indeed, different covariates might enter the model in either a linear or non-linear fashion affecting one or more of the distribution parameters: see Yee and Stephenson (2007) for a thorough discussion on this within the extreme values framework.
As an example of the change-permitting approach the annual maxima series of the Aberjona River at Winchester, Massachusetts (USGS Gage 01102500) is used. Serago and Vogel (2018) reported that the watershed upstream of the station has undergone significant urban development which would typically increase flood magnitudes. The n ¼ 78 annual maximum peak flow records are assumed to be independent and drawn from a GEV distribution, possibly with one or more parameters changing throughout the recording period. The peak flow records are shown in Fig. 1, and it seems clear that through time the observations have been increasing in magnitude and possibly also become more variable.
Four models with different change-permitting structures are fitted to the observations to investigate possible changes in the peak flow distribution. At first a model M 0 with all parameters kept constant is considered. It is assumed that each element in the sample is iid and GEV-distributed: M 0 : Y i $ GEVðl; r; nÞ for i ¼ 1; . . .; n: Next, two models in which the location of the distribution is allowed to change as a function of time with different link functions are used. As in Serago and Vogel (2018) time is used as a covariate in the model, as a surrogate for the increased urbanisation levels. Assuming that the variable describing river peak flow at time i follows a GEV distribution with a different location for each year, Y i $ GEVðl i ; r; nÞ, the following two models are fitted: a model with a linear link between the location and time (M L ) and a model using an exponential function to link the location parameter and time (M E ): In the last fitted model (M CV E ) the location is modelled as a function of time via an exponential link function and the scale is allowed to change proportionally to the location: where s is the ratio between the scale and the location of the distribution. The model structure is discussed further in Sect. 2.3. The C subscript is used to emphasise the parameters of change-permitting models. In Fig. 1 the estimated 30 year event (i.e the design event with AEP = 1/30) under the four different models fitted to the data are shown. These are obtained by plugging the estimated parameter values in the quantile function in Eq. (1) and correspond to the effective quantiles discussed in Katz et al. (2002). Notice that for all four models shown in the Figure, different values of the scale and shape parameters will be estimated because different structures are used for the location function.
The maximum likelihood estimates of the model parameters in all four models discussed above are given in Table 2, together with their standard errors. In all models the coefficients describing the change in time for the location parameter are estimated to be positive and significantly different from 0 at the 5% significance level. The estimated value for the location parameter in 1940 and 2019 are also displayed in Table 2: the final estimation of the location at the beginning and at the end of the record is fairly similar for the three models in which the location is allowed to change: the choice of the link function has a minor effect on the estimated value of the location function within the recording period.
In Table 2 the estimated parameter values for an additional model (M S ) in which the scale is allowed to change with time using an exponential link function are also provided. The model structure is the following: This model should be used rarely and is presented here for comparative purposes: the scale parameter describes the variation of the distribution around its centre and typically one would first need to correctly model changes in the location parameter as a function of some predictors to correctly characterise changes in the scale. For this station there is little evidence that scale alone is changing when keeping the location fixed, although the variability of the record shown in Fig. 1 would appear to be higher in the later years. Indeed for the data in Fig. 1 there appear to be an increase in both the overall magnitude and the variability of flow extremes: this feature can be well described by the model M CV E , in which the location is allowed to change as a function of time and the scale changes proportionally to the location.
To assess the goodness of fit of the change-allowing models to the data, beside assessing whether the parameter describing the change is significantly different from zero, it is often appropriate to check that the distribution estimated under each model gives a good fit to the observed data. As discussed, among others, in Coles (2001) this can be done using graphical tools such as probability-probability plots (pp-plot) or quantile-quantile plots (qq-plot) which can provide a visual check for the validity of the distributional assumption for the response variable. Software such as ismev and extRemes makes the testing of model significance and the assessment of the goodness of fit fairly straightforward. For the Aberjona river case the goodness of fit plots (not shown) indicate that the GEV assumption seem to hold. Nevertheless, when assessing the significance of a change-permitting model or selecting the most suitable model by means of information criteria (e.g. AIC or BIC), it is also important to assess and take into account the goodness of fit of different models to the data.

A model preserving a constant coefficient of variation
The model in Eq.
(3) has rarely been employed in extreme value analysis: it is mentioned in passing by Smith (1999) that such a model gave a better fit to the rainfall data under study, but it has not been employed in the investigation of changes in environmental extremes, to the best of the authors' knowledge. The model is constructed so that when the location changes so does the scale, while the ratio between the scale and the location (s) remains constant. Although s is not exactly the coefficient of variation for a GEV distribution it is tightly related to its value. Most of the studies investigating changes in extremes in the literature (see for example Table 1), use GEV models with a varying location or scale (or location and scale which are allowed to vary independently from each other). These models have a varying coefficient of variation: a change in, say, the location parameter would involve also a change in the coefficient of variation. The model suggested in Eq. (3) keeps a constant coefficient of variation providing a parsimonious representation of changes in both location and scale. As mentioned in Serago and Vogel (2018), the estimates of the coefficient of variation are often found to be approximately constant across series measuring the same variable, while location and scale are well-known to vary greatly with some external variable. For example, in river flow measurements both the location and the scale typically vary as a function of watershed size and mean rainfall, while the estimates of the coefficient of variation are relatively constant across different watersheds. Similarly, rainfall accumulation across longer durations will tend to be larger and also exhibit more variability. These well-known properties of extremes are the basis for the widely adopted index-flood method and regional flood frequency analysis (Hosking and Wallis 1997) and for the standard methods used to derive Intensity Duration Frequency (IDF) curves (Menabde et al. 1999). These properties are sometimes referred to as the scaling properties of environmental extremes (Gupta and Waymire 1990). Beside the case of rainfall and river flow extremes, simple and multi-scale properties have been found to hold for, for example, snow accumulation, which scales with altitude (Blanchet et al. 2009), and wind speed, which scales with duration (Diebold and Heller 2019). The model in Eq. (3) is inspired by these so-called scaling properties of extremes: by keeping a constant coefficient of variation changes in the location result in proportional changes of the scale. As discussed in the next Section this also provide a straightforward description of changes in the distribution quantiles.

Measuring the impact of change
Once a relevant change in the distribution parameters is identified, this implies that the overall distribution of the variable of interest is deemed to have changed. The left panels in Fig. 2 gives an illustration of how potential changes in the parameters correspond to changes in the distribution by displaying the probability density functions Dðp; x Ã Þ ¼qðp; h C ðx Ã ÞÞ À qðp; hÞ ð 6Þ The metrics of change introduced above can be used to assess the impact of the changing behaviour of extremes as described by different model structures on the estimated quantiles of the distribution, i.e. the design events. Note that the quantity in Eq. (5)

Measuring change within the changepermitting models
The types of changes in the parameter values in Fig. 2 reflect some of the changes observed in parameter values obtained from change-permitting models often used in environmental studies. The model defined by a linear change in the location parameter of a GEV distribution with a constant scale, in which it assumed that The values of D d ðp; DxÞ and M d ðp; DxÞ for a number of models commonly used when assessing the presence of change in environmental extremes are provided in the ''Appendix''. There are only a few change-permitting GEV models for which one of the metric of change does not depend on the exceedance probability. One of them is the case in which the location is allowed to change, either linearly, as in the case discussed above, or as an exponential function of the covariate: in this case the value of D d ðp; DxÞ is constant across all exceedance probabilities. When using the model proposed in Eq. (3), it is the ratio of quantiles that is a constant which only depends on Dx and l C1 : Vogel et al. (2011) used time as a covariate and defined M d as a ''decadal'' magnification factor based on Dx ¼ 10 years for the case of the two-parameter log-normal distribution, which is characterised by a constant coefficient of variation. This expression of change is independent of return period and has a simple interpretation. The model introduced here in Eq.
(3) allows this concept to be applied with the more widely used GEV distribution and to provide a convenient way to communicate the outcome of changepermitting analysis to end-users.

Measuring change against the fixedparameters model
For any model used to assess the existence of change in extremes, a natural question is also how estimates from a change-permitting model compare to the corresponding estimates obtained using fixed parameters, which are likely to be the basis of the current design event estimates. If the distribution is changing, then the probability of exceeding a pre-specified value would be different at the present time than it was at any time in the record and, more importantly, the estimation based on the assumption of a unique distribution for all data, would be biased. The two metrics proposed in Eqs. (6) and (7), Dðp; x Ã Þ and Mðp; x Ã Þ, provide a natural way to assess these impacts, but for almost no model can these two quantities reduce to simple metrics such as those available for D d ðp; DxÞ or M d ðp; DxÞ. Nevertheless, it is possible for a fixed value of the covariate x Ã , to investigate how Mðp; x Ã Þ or Dðp; x Ã Þ change as a function of p. Often, up to a certain value ofp, the quantiles of the change-permitting model would be larger (respectively smaller) than the fixed-parameter model, while afterp the quantiles of the change-permitting model would be smaller (respectively larger). In practical applications, this can be a source of doubt for decision making, since events up to probabilityp might be, say, underestimated when using fixed-parameters models, while events with probability of exceedance greater thanp would appear to be overestimated. This can happen, for example, when a linear trend is assumed for the location parameter and found to be, say, positive and some variability in the data can be explained by such trend, so that the estimate for the scale parameter diminishes in value: this is the case for the Aberjona river as shown in Table 2. When this happens one of the possible consequences is a shifting and ''flattening'' the return curve so that there is an increase in the quantiles which are exceeded relatively frequently, while the very rare events are found to be generally smaller. Similarly, when including a linear trend in the location parameter, the estimate for the shape parameter might vary so that the return curve under the change-permitting model might increase at a faster or slower rate. Figure 3 shows a comparison of the scale and shape parameter for a fixed-parameters model and for a changepermitting model with a linear trend in the location as a function of time applied to 40 annual maximum series of instantaneous peak river flow in Massachusetts each containing more than 65 years of data (see Sect. 4 for a complete description of the dataset). For 31 out of 40 stations, the scale parameter is smaller for the changepermitting model, reflecting that allowing the location parameter to change explains a portion of the variability in the records. The shape parameter estimates also change (middle figure), with both increases and decreases observed.The impacts on the quantile estimation when migrating from a fixed-parameters to a change-permitting model can be assessed using Mðp; x Ã Þ, i.e. the ratio between quantiles computed under the change-permitting and the fixed parameters models, for different value of p. In the right panel of Fig. 3 the scatterplot of Mðp; x Ã Þ for the median (i.e. p ¼ 0:5) and for the 100-year event (i.e. p ¼ 0:01) is shown. For all stations in the datasets the change-permitting models are evaluated at the end of the record of the stations (x Ã ¼ maxðxÞ). The sign of the change for the frequent (p ¼ 0:5) and rare (p ¼ 0:01) events is different at times (these are the red stars in the bottom right quadrant and top left quadrant). These instances represent cases where the introduction of a change-permitting model will result in changes in design flood estimates at high return periods which are different in direction than those for low return periods; a counter intuitive outcome for most practical applications. Moreover, the magnitude of the change can be different for the two AEP values, due to the difference in the scale and the shape parameter between the two models. For this dataset changes in the high quantile (AEP of 0.01) tend to be smaller than changes in the median, reflecting that the scale parameter estimates tend to be smaller for change-permitting model than for the fixed-parameters model.

Enforcing change structures
Although having different direction of changes for events of different rarity can be practically challenging, this property might be something that a modeller wishes to exploit when analysing extremes. For example, there might be some physical reasoning by which the magnitude of frequent events under some pre-specified value of the covariates is expected to be reduced while the rare events would become larger. For example, Sharma et al. (2018) discuss how different types of change of the distribution of high flows might be expected for different catchment types, while Guerreiro et al. (2018) show that changes in the magnitude of extreme rainfall are different for the relatively common and the most extreme events.
When fixing the shape parameter to be the same in the fixed-parameters and the change-permitting models, the point at which the two return curves cross can be found analytically for some of the commonly used models, namely the model with a linear change in the location and a model with an exponential change in the scale (see Under the fixed parameters model the assumed distribution is Y $ GEVðl; r; nÞ. Under the linear change in the location model, the assumed distribution is Y i $ GEVðl C0 þ l C1 x i ; r C ; nÞ. Noticeably, the n parameter is kept to be the same under both models. The Dðp; x Ã Þ function for this change-permitting model is found to be: The two quantile functions have the same value (i.e. they cross) at the point in which Dðp; x Ã Þ ¼ 0, which occurs at: provided that nðl À l C0 À l C1 x Ã Þ=ðr C À rÞ [ À 1. Manipulating the formula for Dðp; x Ã Þ further, one can define the value that r C should take to ensure that the two quantile curves cross at a fixed y p value for a given value of ðl C0 ; l C1 Þ as: This could be enforced in the model estimation to ensure that the change in the sign of Dðp; x Ã Þ happens exactly at a desired AEPp. Further, one can study the sign of Dðp; x Ã Þ as a function of p. It can be found that quantiles from the change-permitting model exceed the quantiles from the fixed-parameters models (Dðp; x Ã Þ [ 0) as long as: From this result it is possible to further investigate the properties a change-permitting model should have to ensure that effective return periods of a certain rarity are larger (Dðp; x Ã Þ [ 0) or smaller (Dðp; x Ã Þ\0) than the return periods under the fixed-parameters model. First, notice that y Àn p À 1 [ 0 for n\0 and p [ 1 À e À1 y Àn p À 1 [ 0 for n [ 0 and p\1 À e À1 ( where p ¼ 1 À e À1 corresponds approximately to an AEP of 0.63 and a return period of T = 1.58. Assuming that one would want the two curves to cross at a value of p [ 1 À e À1 , we see that for Dðp; x Ã Þ to be positive (i.e. to ensure that the dis-equality in Eq. (8) holds) one needs to find that the location parameter evaluated at x Ã (i.e. the value l C0 þ l C1 x Ã ) is smaller that the location parameter in the fixed parameters model (l). This would mean, for example, that if time is the covariate in the model and x Ã is taken to be the end of the record, to ensure that the design events under the change-permitting model are larger than the ones obtained under the fixed parameter model for event with AEP lower than a certain valuep, one would need to find a negative trend in the location and at the same time an increase in the scale parameter compared to the fixed-parameters model. Combining these findings it would be possible to define change-permitting models such that the effective return curve derived at a certain value of the covariate crosses the fixed-parameters return curve exactly at a pre-specified AEPp and that has higher (or lower) estimated return levels than the fixed-parameters model for events with AEP smaller thanp. This could be done by including constrains or using convenient re-parametrisation within the likelihood functions used to estimate both the fixed and the change-permitting models. The constrains needed to ensure a positive Dðp; x Ã Þ under different change-permitting models are more cumbersome than the ones presented above for the case of the linear change in location. Nevertheless, it is generally true that if one wishes to ensure that the estimated magnitudes of rare events (i.e. events with a small AEP) evaluated at a given value x Ã in the change-permitting model are larger than the ones under the fixed-parameters model, the scale parameter under the change permitting needs to be larger, while the location parameter needs to be smaller (provided the shape is kept to be the same in both models).

Case study: the Massachusetts peak flow data
In this Section records of annual maxima of instantaneous peak river flow recorded from rivers in the state of Massachusetts, USA, are used to explore some of the practical consequences for the estimation of quantiles when imposing different model structures to allow for change in the distribution of peak flows. The data consists of the annual maximum series of instantaneous peak flow for 40 stations in the state with at least 65 years of valid records. The longest record is 115 year long and all records end after 2015. Some of these stations record flow at locations for which the upstream watershed has undergone changes in urbanisation similar to those experienced by the Aberjona watershed. In this study we do not wish to assess whether peak river flow has changed across the state, nor to identify the drivers of such change: we instead focus on the consequences of different parametrisation of change on the estimated design events. We also do not attempt to make an assessment of which of the model structures employed to characterise possible changes in the records is the most suitable one for the different river flow records in Massachusetts: more complete checks on the goodness of fit of each model fit for every record would be needed. Overall, finding the most suitable statistical model to characterise possible changes in the distribution of river flow peaks would require a more thorough statistical investigation of the data and a deeper understanding of the possible causes driving the observed changes (e.g. changes in the land use in the watershed). A number of different fixed-parameters and changepermitting models are fitted via standard maximum likelihood estimation to the data series of each station in the dataset. For all models time is used as a covariate and the following transformed logit function is used for the shape parameter: n ¼ logitðfÞ À 0:5 to ensure that n 2 ½À0:5; 0:5. This constraint (adapted from the gevlss function in the mgcv R package, Wood 2017) ensures that the mean and variance of the GEV are finite and the maximum likelihood estimates are consistent (see Smith 1985). At first, the standard model with fixed parameters, M 0 , is estimated for each record, together with the different change-permitting models (M L , M E , M S and M CV E ) which were applied to the Aberjona river data as seen in Table 2.

From fixed-to change-permitting models
At first we compare the support in favour of a change in the behaviour of extremes across the different models. For each of the model different assumptions are made regarding which property of the distribution is changing and how this change is related to the covariate. Therefore, rather than comparing the raw estimated values for the parameters describing the estimated changes across different stations and different models, the parameter estimates are standardised by their estimated standard deviation and compared in Fig. 4. In the left panel the trend parameters for the model M L and M E are compared. For the model with a linear link function, M L , the parameter of interest is l C1 , while for M E , in which an exponential link function is used, the parameter of interest corresponds to g C1 . For these two models, the detected changes in the location parameter are similar in sign and strength: most changes are positive, indicating an increase in the location parameter over time for the majority of stations. As in the Aberjona case the estimated location function within the recording period is similar when using a linear or an exponential link function.
In contrast, the sign and strength of the trend when modelling location and scale are very different. The scatterplot in central panel of Fig. 4 indicates that for sites with strong changes in the location parameter there is no corresponding strong evidence of change in the scale parameter when this is modelled as a changing function of time while keeping the location fixed (M S ). However, when the scale is allowed to change proportionally to the location (M E CV ), the strength and direction of the trend is similar to those found when allowing only the location to change (right panel). At each site, using the model in Eq. (3) would results in identifying a changing behaviour of extremes similar in sign and strength to the one identified when only allowing the location parameter to change.
Importantly, the consequence of the identified change on the quantiles would be different for different models due to differences in model structures. These differences are shown for the Aberjona river in Fig. 5. The top panel and the bottom left panel in Fig. 5 show in detail the comparison of the return curves for the fixed-parameter model (M 0 ) and for different change-permitting models evaluated in year 2019 (the final year in the record). These curves are derived from the estimated parameters shown in Table 2: changes in the parameters in the change-permitting models Fig. 4 Scatterplot of scaled trend coefficients under different modelling assumptions imply different relative changes across the AEP values. For models allowing changes in the location parameter it was found that the parameter value in year 2019 is generally larger than the one of the fixed parameter model and this implies an overall increase in the return curve. In contrast, the minimal difference between the parameter estimates for the fixed-parameter model M 0 and the model in which the scale is allowed to change in time M S is reflected in almost identical return curves (top-right panel) and consequently an almost flat M(p, 2019) line (lower right panel).The relative changes of the return curves for the different change-permitting models in all stations are shown in Fig. 6. For all stations in all models Mðp; x Ã Þ is evaluated in the last year available in the record, thus Fig. 6 shows the ratio of the effective return curve of the different change-permitting model structures evaluated at the end of the record period against the estimated return curve for the fixed parameter model. The left panel shows the relative change in quantiles across AEP induced by assuming an exponential trend in the location parameter. In this case the relative changes implied in the quantile functions tend to be minimal for the events with AEP of approximately 0.1 and the relative changes in the frequent events (Gumbel transform smaller than .4) tend to be larger than the relative changes for rare events (Gumbel transform larger than 4.6) . The contrary is true when the scale alone is allowed to change (central panel): overall there are much larger relative changes for the rarer events. When the scale is allowed to change proportionally to the location (right panel) the relative changes for the different AEP are fairly comparable for most stations. As in Fig. 3, for a handful of stations the direction of change for the median and the 100-year event are different when only the location is allowed to change; this is also true when the scale only is allowed to change. When using the model in Eq. (3) the changes have instead always the same direction in the range of AEP considered in Fig. 6, with most of them being fairly constant. In conclusion, the assumption made in terms of what parts of the distribution are allowed to change has a strong impact on the implied changes in higher quantiles which are typically of interest for engineering design. (3) is used the relative change in the two quantiles is the same and is equal to expfl C1 Dtg.  structure allow changes in the distribution as an exponential change in the location parameter. In this case, the changes in the frequent events are much larger than those in the rarer events: an increase in the location parameter would entail much larger estimates in the later years for the median, but only mildly larger estimates in the later years for the 100-year event. The behaviour is reversed when the scale only is allowed to change: frequent events are estimated to have relatively similar magnitude in 2015 and 1965, while the estimated magnitude of rare events is very different with effective design events for low AEP being either much larger or much smaller in 2015 than in 1965. The consequence of using the parametrisation of change presented in Eq.

Changes over the record-period
(3) is evident in the right hand panel: changes are constant across AEP giving an easy way to assess how the quantiles of the distribution might have changed over the 50 year period. No single model of change can be appropriate to study changes in all extremes series, and an evaluation of the goodness of fit for the fitted models can be useful to assess whether any model can provide a better representation than others for the series under study. This might include both comparative measures such as AIC or BIC and graphical tools such as qq-plots which can indicate whether the distributional assumption made for the data is suitable. Furthermore, models fitted to the data should be reasonable and coherent with the physics principles underlying the process under study. Nevertheless, some decisions are typically made at an initial modelling stage of the parametric forms of change to consider as possible models. This decision could also be informed by considerations of whether any of the assumed forms can provide more usable metrics to describe changes in design events (when these are of interest) or any other quantity of interest derived from the fitted model.

Assessing uncertainty of quantile estimates
The assessment of the uncertainty in the quantities of interest is a key element when performing any statistical analysis. Different approaches have been employed in the literature to assess the variability of the estimated quantiles, the most common ones being the profile likelihood, the delta method, and the non-parametric and parametric bootstrap (Cooley 2013;Kyselỳ 2008). In particular, Kyselỳ (2008) shows that the latter method appears to provide good uncertainty estimates under several cases. The parametric bootstrap is therefore employed here to compare the variability of the estimated quantiles under different model structure for all stations in the study. A brief outline of the parametric bootstrap procedure is provided below.
For any model fitted to a river flow series of size n, an estimate for the vector of d parametersĥ is obtained. Using the estimated parameters, B bootstrap samples of size n are randomly generated from the appropriate distribution parametrised byĥ. For each bootstrap sample the vectorĥ Ã is estimated, together with the quantiles of interest possibly evaluated at a specific covariate valuex, q Ã ¼ qð1 À p;ĥ Ã ðxÞÞ. This results in a bootstrap sample of quantiles ðq Ã 1 ; . . .; q Ã B Þ. Percentile based ð1 À aÞ Ã 100% confidence intervals for the quantile of interest are derived as the a=2 and 1 À a=2 percentiles of the bootstrap sample.
The scatterplots in Fig. 9 compare the width of the 95% confidence intervals for the design event of AEP equal to 0.5 derived at all stations for the different change-permitting model structures against the width of the interval for the fixed-parameters model. For all station the quantiles for the change-permitting models are evaluated at the last available year, i.e the end of the record. Noticeably, for all models in which the location is allowed to change (M L and M E ) confidence intervals are found to be wider. In contrast, the width of the confidence intervals for the frequent events under a model with a varying scale M S (bottom right panel) appears to be comparable to the one derived when assuming fixed parameters in time (M 0 ). This is not the case though when the intervals under comparison refer to rarer events. Figure 10 is structured as Fig. 9, but showing the width of confidence intervals for rare design events (AEP = 0.01). The scatterplot in bottom right panel in this case shows that the intervals for the model in which scale is allowed to change (M S ) are generally wider than those obtained when no change is allowed (M 0 ). This is not the case for models in which the mean only is allowed to change (M L and M E ): in the top two panels the interval widths for the M L and M E model are similar to those for the no-change (M 0 ) model. The width of the intervals obtained using the model which allows the scale to change proportionally to the location (M E CV ) is also typically wider: the

Discussion
In this paper the effect of different model structures used to describe changes in extremes is discussed by focusing on the impacts on quantiles as estimated under different changepermitting parametrisations. Although throughout the paper only univariate parametric models have been discussed, the results can be easily extended to the case of multivariate and non-parametric models. Further, although all analysis and calculations have been carried out assuming a GEV distribution, the findings could be useful for other distributions whose quantile function has the same structure as the one in Eq.
(2), such as the Generalised Pareto distribution which is typically employed when analysing peaks-over-threshold records and has also been used to detect changes in environmental extremes (Silva et al. 2016(Silva et al. , 2017. In that case, changes in the location parameter would correspond to changes in the threshold above which value are considered to be extremes. This has already been already proposed in the literature (see for example Roth et al. 2012;Eastoe and Tawn 2009), although with no specific consideration of the impacts of the modelling assumption on the quantile estimation.
The results from this study show that extreme value models routinely used in the applied literature can lead to very different assessments of change, but the consequences of the different modelling choices on the estimated design events are rarely discussed and explored. Moreover, many of the models typically employed entail a varying coefficient of variation, leading to changes in quantile estimates which are difficult to study under different values of the covariates and cumbersome to reconcile with the estimates derived under the fixed-parameters models. A wider discussion on what types of changes can be expected under different conditions (see for example Sharma et al. 2018), might help in clarifying what model structures are more apt to capture the expected changes. Possibly, the modelling options which are currently explored are not the ones which best describe the changes expected from a warming climate or an increase in population and urban areas. Using simple metrics which describe changes such as those proposed in Eqs. (4) to (7) can be useful to explore the consequences of different models structures on the estimation of quantities which are routinely extracted from any estimation procedure. By comparing the expected changes to the types of changes which can be described under different models, it would be possible to ensure that models of change employed the most suitable parametrisation to describe what is perceived to be the most likely direction and shape of change. As shown, sentences like ''We expect all quantiles to have changed in time by a constant factor'' or ''We would expect events of AEP smaller than 0.02 to be larger in the future and events with AEP larger than 0.02 to be smaller'' can be translated into models with specific parametrisation and possibly constrained values for certain parameters. It is important that the description of change of the quantities of interest derived from change-permitting models can be interpretable and therefore considered credible by the end users. We believe the model based on s presented in Eq. (3), in which the scale is allowed to change proportionally to the location, is a first step towards such models, since it allows for the ratio between quantiles to be the same for all exceedance probabilities, and therefore provides a direct communication of how changes in the distributions affect the final main output of interest.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Funding Open access funding provided by Università Ca' Foscari Venezia within the CRUI-CARE Agreement.
Data Availability The Massachusetts River peak flow data was downloaded from the United States Geological Survey by means of the dataRetrieval R package (De Cicco et al. 2018).
Software Availability Code which implements the model presented in Eq. (3) based on the gev.fit function of the ismev R package by Gilleland et al. (2018) is made available at https://gist.github.com/ ilapros/8193d65a4c9426b27d7ad66a87aa16a5 and as supplementary material to this paper.
The evolution of M d ðp; DxÞ changes as a function of ðl C0 ; l C1 ; r C ; n C Þ and of p and Dx: the ratio between quantiles is different across different return periods.
• In the case in which Y i $ GEVðl C ; expfc C0 þ c C1 x i g; n C Þ D d ðp; DxÞ ¼ expfc C0 þ c C1 xg n C ðy Àn C p À 1Þ ðexpfc C1 Dxg À 1Þ : The evolution of D d ðp; DxÞ changes as a function of ðc C0 ; c C1 ; n C Þ and of p and Dx: the difference between quantiles varies across return periods.
n C l C þ expfc C0 þ c C1 xgðy Àn C p À 1Þ : The evolution of M d ðp; DxÞ changes as a function of ðl C ; c C0 ; c C1 ; n C Þ and of p and Dx: the ratio between quantiles is different across different return periods. • In the case in which Y i $ GEVðl C0 þ l C1 x; expfc C0 þ c C1 x i g; n C Þ: ðexpfc C1 Dxg À 1Þ : The evolution of D d ðp; DxÞ changes as a function of ðl C1 ; c C0 ; c C ; n C Þ and of p and Dx: the difference between quantiles varies across different return periods. M d ðp;DxÞ ¼ 1 þ n C l C1 Dx þ expfc C0 þ c C1 xgðy Àn C p À 1Þðexpfc C1 Dxg À1Þ n C ðl C0 þ l C1 xÞ þ expfc C0 þ c C1 xgðy Àn C p À 1Þ : The evolution of M d ðp; DxÞ changes as a function of ðl C0 ; l C1 ; c C0 ; c C1 ; n C Þ and of p and Dx: the ratio between quantiles is different across different return periods. • In the case in which Y i $ GEVðexpfg C0 þ g C1 x i g; r C ; n C Þ: The evolution of D d ðp; DxÞ changes as a function of ðg C0 ; g C1 Þ: the difference between quantiles is constant across different return periods.
M d ðp; DxÞ ¼ 1 þ n C expfg C0 þ g C1 xgðexpfg C1 Dxg À 1Þ n C expfg C0 þ g C1 xg þ r C ðy Àn C p À 1Þ : The evolution of M d ðp; DxÞ changes as a function of ðg C0 ; g C1 ; r C ; n C Þ and of p and Dx: the ratio between quantiles is different across different return periods.
• In the case in which Y i $ GEVðexpfg C0 þ g C1 x i g; s expfg C0 þ g C1 xg; n C Þ: D d ðp; DxÞ ¼ expfg C0 þ g C1 xgðexpfg C1 Dxg À 1Þ 1 þ s n C ðy Àn C p À 1Þ : The evolution of D d ðp; DxÞ changes as a function of ðg C0 ; g C1 ; s; n C Þ and of p and Dx: the difference between quantiles varies across different return periods.
The evolution of M d ðp; DxÞ changes as a function of g C1 and of Dx: the ratio between quantiles is constant across different return periods. • In the case in which Y i $ GEVðl C0 þ l C1 x i ; sðl C0 þ l C1 x i Þ; n C Þ: : The evolution of D d ðp; DxÞ changes as a function of ðl C1 ; s; n C Þ and of p and Dx: the difference between quantiles varies across different return periods.
The evolution of M d ðp; DxÞ changes as a function of ðl C0 ; l C1 Þ and of Dx: the ratio between quantiles is constant across different return periods.
Calculations for the crossing point between q(1p; hÞ and q(1p; h c Þ In the fixed-parameters model it is assumed that Y i $ GEVðl; r; nÞ. The value y p at which the two quantile functions for the fixed-parameters and change-permitting model cross is derived for a generic change-permitting model below, assuming that the shape parameter under the change-permitting model has the same value n as in the fixed-parameters model. In general, under the change-permitting model it is assumed that Y i $ GEVðl C ðx i Þ; r C ðx i Þ; nÞ and the two quantiles function cross at the value p such that Dðp; x Ã Þ ¼ 0, which is found to be: : This entails that in the case in which Y i $ GEVðl C ; expfr C0 þ r C1 x i g; nÞ we find: y p ¼ 1 þ n l À l C expfr C0 þ r C1 x Ã g À r À1=n : R code to implement the model in Eq. (3) The function below is built upon the ismev::gev.fit function in R. The output is an object of class gev.fit.