# Multivariate density model comparison for multi-site flood-risk rainfall in the French Mediterranean area


## Abstract

The French Mediterranean area is subject to intense rainfall events which might cause flash floods, the main natural hazard in the area. Flood-risk rainfall is defined as rainfall with a high spatial average and encompasses rainfall which might lead to flash floods. We aim to compare eight multivariate density models for multi-site flood-risk rainfall. In particular, an accurate characterization of the spatial variability of flood-risk rainfall is crucial to help understand flash flood processes. Daily data from eight rain gauge stations at the Gardon at Anduze, a small Mediterranean catchment, are used in this work. Each multivariate density model is made of a combination of a marginal model and a dependence structure. Two marginal models are considered: the Gamma distribution (parametric) and the Log-Normal mixture (non-parametric). Four dependence structures are included in the comparison: Gaussian, Student t, Skew Normal and Skew t in increasing order of complexity. They possess a representative set of theoretical properties (symmetry/asymmetry and asymptotic dependence/independence). The multivariate models are compared in terms of three types of criteria: (1) separate evaluation of the goodness-of-fit of the margins and of the dependence structures, (2) model selection with a leave-one-out evaluation of the Anderson-Darling and Cramer-Von Mises statistics and (3) comparison in terms of two hydrologically interpretable quantities (return periods of the spatial average and conditional probabilities of exceedances). The key outcome of the comparison is that the Skew Normal with the Log-Normal mixture margins significantly outperforms the other models. The asymmetry introduced by the Skew Normal adds value with respect to the Gaussian. Therefore, the Gaussian dependence structure, although widely used in the literature, is not recommended for the data in this study. In contrast, the asymptotically dependent models did not provide a significant improvement over the asymptotically independent ones.

## Keywords

- Intense rainfall events
- Strong spatial variability
- Small Mediterranean catchments
- Elliptical and skew multivariate distributions
- Asymptotic dependence/independence

## 1 Introduction

The French Mediterranean area is subject to intense rainfall events occurring mainly in the fall. They can be triggered by a combination of three factors: the moisture generated by the Mediterranean Sea, upper-level cold troughs coming from the North and the complex orography in the region (the Alps, the Pyrenees and the Massif Central Mountains in the South of France) (Delrieu et al. 2005). Such heavy rainfall might cause flash floods that can be defined as a sudden rise of the water level (in a few hours or less) together with a significant peak discharge (Braud et al. 2014). Flash floods can potentially cause fatalities and important material damage and are known as the main natural hazard in the Mediterranean area (Borga et al. 2011). We refer to rainfall which might lead to flash floods as *flood-risk rainfall*.

A key feature of flood-risk rainfall is its strong spatial variability at high temporal and spatial resolutions. Indeed, Gaume et al. (2009) stressed that although flash floods are generally associated with localized intense rainfall that lasts a few hours, they can also be generated by long-lasting rainfall with moderate intensities that affects the whole catchment. In the French Mediterranean region, streamflow simulation accuracy and dynamics can be significantly enhanced when exploiting information from rainfall at higher spatial resolution (Lobligeois et al. 2014; Patil et al. 2014; Braud et al. 2014). Therefore, analyses to characterize the spatial variability of flood-risk rainfall will contribute to the understanding of flash flood processes.

We define flood-risk rainfall as rainfall with a high spatial average. More precisely, the spatial average is *high* when it is above a threshold that should be set according to the catchment at hand. This definition encompasses both intense localized events and moderate widespread events, in accordance with expert knowledge. It is straightforward to cast flood-risk rainfall modeling with such a definition into a multivariate or spatial process extreme-value theory (EVT) framework (Coles 2001; Beirlant et al. 2006). In the peaks-over-threshold approach of EVT, models are developed for multivariate or spatial extremes defined as events which are large according to a given norm. With the \({\mathcal {L}}_1\)-norm, this corresponds exactly to the definition of flood-risk rainfall (see Sabourin and Naveau 2014 who proposed a non-parametric multivariate model in this framework). However, the application of these models to flood-risk rainfall raises a number of technical questions (for example, an extreme in the hydrological sense might not be an extreme in the statistical sense).
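The definition above amounts to a simple filter on the daily station records. The following is a minimal sketch, assuming a hypothetical data layout (one row per day, one value per station, in mm) and a hypothetical function name; the threshold of 50 mm mirrors the conditioning used later for the simulations but is otherwise illustrative.

```python
from statistics import mean

def select_flood_risk_days(daily_rain, threshold):
    """Keep the days whose spatial average (mean over stations) exceeds the threshold."""
    return [row for row in daily_rain if mean(row) > threshold]

# Toy data: 4 days x 3 stations (mm), purely illustrative values.
daily_rain = [
    [0.0, 2.0, 1.0],     # nearly dry day
    [150.0, 5.0, 10.0],  # intense localized event, spatial average 55
    [60.0, 55.0, 65.0],  # moderate widespread event, spatial average 60
    [40.0, 30.0, 20.0],  # spatial average 30, below the threshold
]
flood_risk = select_flood_risk_days(daily_rain, threshold=50.0)
```

Both the localized and the widespread toy events are retained, which is the behavior the definition is meant to capture.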

An alternative approach to analyze and characterize flood-risk rainfall is by means of stochastic rainfall generators (or more generally weather generators, see Ailliot et al. 2015). They can simulate long series of observations from which observations corresponding to flood-risk rainfall (high spatial average) can be extracted and studied. Stochastic generators are complex statistical models which must handle rainfall intermittency (the determination of rainy and dry areas) and rainfall inhomogeneity (the presence of different types of rainfall such as convective and stratiform and of seasonal or diurnal cycles). Intermittency can be addressed either by including an atom at zero in the transformation of the marginal distribution (Bouvier et al. 2003; Vischel et al. 2009; Baxevani and Lennartsson 2015) or by applying an indicator function (Barancourt et al. 1992; Wilks 1998; Hughes et al. 1999; Kleiber et al. 2012; Leblois and Creutin 2013). Inhomogeneity can be incorporated by means of rainfall or weather patterns (Bellone et al. 2000; Thompson et al. 2007; Garavaglia et al. 2010) or by introducing covariates in the distribution parameters (Chandler and Wheater 2002; Kleiber et al. 2012; Baxevani and Lennartsson 2015).

The spatial dependence structure of rainfall is an essential building block of multi-site stochastic generators. Many rely on the meta-Gaussian distributions, i.e. the Gaussian dependence structure combined with a transformation of the marginal distributions (Lebel and Laborde 1988; Wilks 1998; Guillot and Lebel 1999; Bouvier et al. 2003; Vischel et al. 2009; Kleiber et al. 2012; Leblois and Creutin 2013; Serinaldi and Kilsby 2014; Baxevani and Lennartsson 2015). Other multivariate distributions have been employed, with possibly a transformation of the marginals, to infer the spatial structure in stochastic generators but none, as far as we are aware, in a truly multi-site framework (see Flecher et al. 2010 for a single-site multi-variable weather generator based on the multivariate Skew Normal distribution and Vrac et al. 2007 for a two-site rainfall generator merging a bivariate Gamma with a bivariate model from EVT). The dependence structure can also be modeled with copulas (Genest and Favre 2007). Besides the Gaussian copula which is equivalent to the meta-Gaussian distribution, several copula families exist such as the Student t, the Archimedean or Extreme Value families but not many are available in dimension greater than two. For instance, Schoelzel and Friederichs (2008) performed modeling of rainfall at two sites with a bivariate Gumbel copula, Bárdossy and Pegram (2009) proposed an asymmetric copula to model rainfall at 32 sites and Serinaldi (2009) proposed a copula-based mixed model for bivariate rainfall.

Extreme rainfall in the French Mediterranean area has been widely studied. To our knowledge, most of the time, a univariate viewpoint is adopted with the block maxima approach of EVT where extreme events are taken as maxima over a period of time such as the year or the month, see Gardes and Girard (2010), Ceresetti et al. (2012) and Carreau et al. (2013) for instance. In contrast, spatial dependence is taken into account in Lebel and Laborde (1988) who proposed a geostatistical approach to model monthly areal rainfall maxima and in Neppel et al. (2011) who developed a multivariate regional test in which the spatial dependence structure is modeled with the Student t copula.

Comparison studies of spatial or multivariate models for extremes were conducted in other application domains or in other study areas. The meta-Gaussian distribution is often chosen because it is easy to implement even in high dimension. However, the Gaussian has very specific dependence properties which should be validated. In finance, impacts on risk measures of the choice of the Gaussian dependence structure compared to other choices were studied in Embrechts et al. (2002) and Poon et al. (2004). In particular, Embrechts et al. (2002) emphasized the potential under-estimation of the probability of joint extreme events when employing the Gaussian copula. In hydrology, Berg and Aas (2009) modeled daily rainfall at four sites with five types of combined Archimedean copulas and compared the goodness-of-fit with the Cramer-Von-Mises statistics. In Dupuis and Tawn (2001), the effects of mis-specification of the dependence structure on bivariate extreme-value problems were studied on synthetic data while Dupuis (2007) showed the effect of model mis-specification on bivariate hydrometric data sets. More recently, Blanchet and Davison (2011) and Thibaud et al. (2013) (see also references therein) performed model selection of spatial processes for extremes of snow and rainfall, respectively, in Switzerland. These studies show that the choice of the spatial dependence structure for extremes must be made with great care and that the meta-Gaussian distribution can fit very poorly.

In this work, we aim to analyze and compare multivariate density models for multi-site flood-risk rainfall. In particular, we seek to evaluate whether the spatial dependence properties of the models can reproduce the spatial variability of flood-risk rainfall. The study area is a small representative French Mediterranean catchment, the Gardon at Anduze, that is vulnerable to devastating floods (Delrieu et al. 2005). Flood-risk rainfall can be thought of as a type of rainfall (or rainfall pattern) and the strategy that we adopt in this work can thus be seen as focussing on a single rainfall type within a stochastic generator. The flood-risk rainfall type is the most important feature multi-site stochastic generators should be able to reproduce when applied in small Mediterranean catchments. In addition, the adopted strategy is likely to reduce the need to deal with rainfall intermittency and inhomogeneity, as is the case for the Gardon at Anduze catchment. This work is intended as a preliminary study before developing a spatial stochastic rainfall generator adapted for flood-risk rainfall in the Mediterranean area.

The paper is structured as follows. Daily flood-risk rainfall data at eight rain gauge stations in the Gardon at Anduze catchment together with pairwise exploratory analyses are presented in Sect. 2. Although the response time of the catchment is on the order of hours, we make do with analyses at the daily time-step because a longer and more complete database is available. We assume that the dependence structure of flood-risk rainfall at the daily time-step provides relevant information on the flash flood processes, even when they occur at the sub-daily scale. Section 3 is dedicated to the description of the eight multivariate density models included in the comparison. Each model consists of marginal distributions, which describe the univariate behavior of daily flood-risk rainfall at each site, combined with a spatial dependence structure, which captures the site-to-site variability on a given day. A parametric marginal model, the Gamma distribution, and a non-parametric marginal model, the mixture of Log-Normal distributions, are described in Sect. 3.1. Four dependence structures, the Gaussian, the Student t, the Skew Normal and the Skew t, in increasing order of complexity, are presented in Sect. 3.2. They possess a representative set of theoretical properties (symmetry/asymmetry and asymptotic dependence/independence for the extremes) and they can be fitted in dimension 8 with available R libraries (Kojadinovic and Yan 2010; Azzalini 2015). Section 4 contains the comparative results in terms of three types of criteria. We first examine separately the goodness-of-fit of the marginal models and of the dependence structures in Sect. 4.1. Second, the best model is selected by performing leave-one-out validation (also called jackknife) with two goodness-of-fit statistics, Cramer-Von Mises and Anderson-Darling, in Sect. 4.2. Third, we look at hydrologically interpretable quantities (return periods of observed spatial average and conditional probability of exceedances, see Thibaud et al. 2013 for similar criteria) which involve the whole multivariate models in Sect. 4.3. We discuss the results and conclude in Sect. 5.

## 2 Data and exploratory analyses

The catchment of the Gardon at Anduze is a small catchment of about 545 km\(^2\) located in the Cevennes mountain range, in the South of France, see Fig. 1a. It is subject to the Mediterranean climate that, in combination with its sharp orography, can trigger heavy precipitation events especially in the fall season (Ducrocq et al. 2008). Daily rainfall observations are collected over a 43-year period, from 01/01/1958 to 12/31/2000, at eight stations scattered around the catchment, as shown in Fig. 1b and Table 1. Horizontal distance for pairs of stations varies between 5 and 40 km. Elevation ranges from 135 m, in the valleys, to 930 m near the crest of the mountain range.

Although preliminary analyses detect some order 1 auto-correlation when flood-risk rainfall happens on two consecutive days, i.e. \(Cor({\varvec{X}}_t, {\varvec{X}}_{t+1})\), we model \({\varvec{X}}_t\) as independent. As a result, the confidence intervals presented in the analyses might be narrower than if temporal dependence was taken into account, since the effective number of observations might be somewhat reduced. Consecutive flood-risk rainfall days occur 43 times and the largest magnitude of the auto-correlation is about 0.3 so we expect that the independence assumption does not have very significant impacts.

Table 1. X and Y Lambert II extended coordinates of the eight daily rain-gauge stations and elevation (Z). See the map in Fig. 1b

Station name | X (km) | Y (km) | Z (m) |
---|---|---|---|
BARRE-DES-CEVENNES | 705 | 1917 | 930 |
CASSAGNAS | 713 | 1920 | 800 |
LE COLLET-DE-DEZE | 727 | 1918 | 348 |
ALES | 739 | 1905 | 135 |
GENERARGUES | 732 | 1898 | 138 |
LASALLE | 722 | 1895 | 278 |
SAINT-ANDRE-DE-VALBORGNE | 708 | 1907 | 450 |
SAINT-CHRISTOL-LES-ALES | 740 | 1901 | 138 |

### 2.1 Pairwise dependence

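Let \(X_i\) and \(X_j\) denote flood-risk rainfall at station *i* and station *j* respectively. Then the \(\tau\) coefficient for these two stations is given by:

\[\tau (X_i, X_j) = P\left[ (X_i - X_i')(X_j - X_j') > 0\right] - P\left[ (X_i - X_i')(X_j - X_j') < 0\right],\]

where \((X_i', X_j')\) is an independent copy of \((X_i, X_j)\).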

Kendall’s \(\tau\) coefficients, significantly different from zero at the 5 % level, are presented in the upper triangular part of Fig. 2. Pairwise scatterplots are shown in the lower triangular part of Fig. 2 together with a smooth regression line obtained from local regression (Cleveland 1981) to help detect dependencies. Figure 2 reads as follows. Each station name and histogram appears on the diagonal. Row *i* and column *i* concerns the \(i{\text {th}}\) station. At the intersection of column *i* and row *j*, with \(j > i\), there is the scatterplot of the pair of stations *i* (*x*-axis) and *j* (*y*-axis). Conversely, at the intersection of column *j* and row *i*, the corresponding Kendall’s \(\tau\) coefficient is written when significant at the 5 % level. The axes shown can be associated to the lower triangular scatterplots or to the histograms on the diagonal.
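The empirical counterpart of Kendall’s \(\tau\) counts concordant and discordant pairs of observations. The sketch below is a plain implementation of this textbook estimator (the tau-a variant, which ignores ties); it is not the routine used to produce Fig. 2, it does not include the significance test, and the function name is hypothetical.

```python
def kendall_tau(x, y):
    """Sample Kendall's tau: (concordant - discordant) / total number of pairs."""
    n = len(x)
    concordant = discordant = 0
    for a in range(n):
        for b in range(a + 1, n):
            s = (x[a] - x[b]) * (y[a] - y[b])
            if s > 0:
                concordant += 1   # both stations move in the same direction
            elif s < 0:
                discordant += 1   # stations move in opposite directions
    return (concordant - discordant) / (n * (n - 1) / 2)

# Toy usage with two short, illustrative series.
tau = kendall_tau([1, 2, 3, 4], [1, 3, 2, 4])
```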

*u*, taken as high as possible.

A useful tool to assess whether a pair of variables \(X_i\) and \(X_j\) is asymptotically dependent or independent is the so-called \(\chi\)-plot (Coles et al. 1999). In such a plot, \(\chi (X_i, X_j)\) is estimated and plotted against increasing thresholds *u* expressed as quantiles of level \(q \in [0,1]\). Confidence intervals can be constructed with the delta method but are not very reliable near \(q=0\) or \(q=1\).
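As an illustration of how one point of such a plot can be computed, the sketch below uses the estimator \(\chi (q) = 2 - \log P(U < q, V < q) / \log q\) of Coles et al. (1999), where *U* and *V* are the margins transformed to uniform with empirical ranks. Function names are hypothetical, ties are broken arbitrarily, and confidence intervals are not computed.

```python
import math

def to_uniform(values):
    """Empirical probability-integral transform: rank / (n + 1)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    u = [0.0] * n
    for rank, i in enumerate(order, start=1):
        u[i] = rank / (n + 1)
    return u

def chi_hat(x, y, q):
    """Estimate chi at quantile level q (0 < q < 1) for paired samples x and y."""
    u, v = to_uniform(x), to_uniform(y)
    joint_below = sum(1 for a, b in zip(u, v) if a < q and b < q) / len(u)
    if joint_below == 0.0:
        raise ValueError("no pair lies below the quantile level q")
    # The estimator can be negative, as noted for the distant pair of stations.
    return 2.0 - math.log(joint_below) / math.log(q)

# Toy usage: a perfectly dependent pair gives a value close to 1.
series = list(range(1, 100))
chi_dependent = chi_hat(series, series, 0.5)
```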

In the flood-risk rainfall data, both asymptotically dependent and independent pairs of stations appear to be present. In Fig. 4, the \(\chi\)-plots for two representative pairs of stations are shown together with a 95 % confidence interval (R package evd, Stephenson 2002). The \(\chi\)-plot of the pair of nearby stations, Barre-des-Cevennes and Cassagnas, in Fig. 4a, is rather stable around the value 0.6, regardless of the threshold, and thus indicates asymptotic dependence. In contrast, for the pair of distant stations, Barre-des-Cevennes and Generargues, in Fig. 4b, the \(\chi\) estimates increase from negative values (which are caused by the estimator employed, see Coles et al. 1999) to values near zero, indicative of asymptotic independence.

In Fig. 5, the \(\chi\) estimates with the threshold *u* set to the 95 % quantile (R package extRemes from Gilleland and Katz 2011) are plotted with respect to the \(\tau\) estimates. When the \(\tau\) estimates are positive, the scatter plot is quite well aligned with the \(y=x\) line.

## 3 Multivariate density models

### 3.1 Marginal distributions

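The first marginal model is the Gamma distribution, whose density is given by:

\[f(x; k, \eta ) = \frac{x^{k-1} e^{-x/\eta }}{\Gamma (k)\, \eta ^{k}}, \quad x > 0,\]

where *k* and \(\eta\) are the shape and scale parameters respectively with \(k, \eta > 0\) and \(\Gamma (k)\) is the Gamma function evaluated at *k*. Estimates of *k* and \(\eta\) are obtained by the maximum likelihood estimation method (R package MASS from Venables and Ripley 2002).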

We consider as a second marginal model a mixture of Log-Normal distributions although some authors recommend modeling rainfall with a hybrid distribution. Such a hybrid distribution combines a parametric (Carreau and Bengio 2009; Li et al. 2012) or non-parametric model (Lennartsson et al. 2008) for the bulk of the distribution with the Generalized Pareto distribution (GPD) in the upper tail. Univariate extreme value theory (EVT) provides an asymptotic justification for the GPD to be an appropriate model for the distribution of values exceeding a suitably chosen high threshold (Pickands 1975). An advantage of such a hybrid distribution is its ability to adapt to any type of upper tail behavior, be it finite, exponential (light-tail) or power-law (heavy-tail).

The motivation for the choice of the Log-Normal mixture instead of the hybrid distribution is twofold. First, preliminary analyses based on fitting the GPD revealed that the marginal distributions of the flood-risk rainfall data appear to be light-tailed. The Gamma is light-tailed but, because it has only two parameters, might lack the flexibility to model both the bulk of the rainfall distribution and its upper tail. Second, the Log-Normal mixture is straightforward to fit, has been shown to be a good model for rainfall in Southern France (Carreau and Vrac 2011) and can take into account the presence of more than one sub-population of rainfall, such as convective and stratiform, if needed.
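For concreteness, a Log-Normal mixture can be evaluated as a weighted sum of Log-Normal components. The sketch below assumes the usual finite-mixture form with weights summing to one; the function names, the parameterization (mean and standard deviation of the logarithm) and the numerical values are illustrative and are not taken from the fitted models.

```python
import math

def lognormal_pdf(x, mu, sigma):
    """Density of a Log-Normal with log-mean mu and log-standard-deviation sigma."""
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))

def lognormal_cdf(x, mu, sigma):
    """Distribution function of the same Log-Normal."""
    return 0.5 * (1.0 + math.erf((math.log(x) - mu) / (sigma * math.sqrt(2.0))))

def mixture_pdf(x, weights, mus, sigmas):
    """Density of the mixture: weighted sum of the component densities."""
    return sum(w * lognormal_pdf(x, m, s) for w, m, s in zip(weights, mus, sigmas))

def mixture_cdf(x, weights, mus, sigmas):
    """Distribution function of the mixture, used for the margin transformation."""
    return sum(w * lognormal_cdf(x, m, s) for w, m, s in zip(weights, mus, sigmas))

# Illustrative 2-component mixture (values are made up, x in mm).
weights, mus, sigmas = [0.7, 0.3], [4.0, 4.8], [0.3, 0.4]
p_below_100 = mixture_cdf(100.0, weights, mus, sigmas)
```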

### 3.2 Spatial dependence structures

#### 3.2.1 Gaussian and student t copulas

The first two dependence structures included in the comparison are the Gaussian and Student t copulas which belong to the elliptical family (R package copula from Kojadinovic and Yan 2010). As a widespread model among practitioners, the Gaussian copula, that represents the class of meta-Gaussian models, is taken as the benchmark model. The Student t copula has an additional parameter, the degree of freedom \(\nu\), which provides greater modeling flexibility in terms of tail dependence and encompasses the Gaussian copula as a limiting case, when \(\nu \rightarrow \infty\).

In the Gaussian case, \({\varvec{\theta }}\) contains the correlation matrix *P* which must be symmetric and positive definite. In the Student t case, \({\varvec{\theta }}\) contains, in addition to *P*, the degree of freedom parameter \(\nu > 0\).

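For both copulas, Kendall’s \(\tau\) is related to the correlation matrix through \(\tau (X_i, X_j) = \frac{2}{\pi } \arcsin (\rho _{ij})\), where \(\rho _{ij}\) is the element of *P* associated to the pair \((X_i,X_j)\) (Demarta and McNeil 2005). In regard to the variety of strengths of empirical Kendall’s \(\tau\) coefficients in the flood-risk rainfall data (see Fig. 2), we chose not to impose any specific parametric form on the correlation matrix *P* so that it has as much flexibility as needed.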
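The correspondence between an entry of *P* and Kendall’s \(\tau\) can be checked numerically. The sketch below assumes the standard relation for elliptical copulas, \(\tau = (2/\pi ) \arcsin (\rho )\); the function names are hypothetical.

```python
import math

def rho_to_tau(rho):
    """Kendall's tau implied by a correlation rho under an elliptical copula."""
    return 2.0 / math.pi * math.asin(rho)

def tau_to_rho(tau):
    """Inverse map: correlation that reproduces a given Kendall's tau."""
    return math.sin(math.pi * tau / 2.0)

# Illustrative use: correlation entry needed to match an empirical tau of 0.6.
rho_needed = tau_to_rho(0.6)
```

Inverting empirical \(\tau\) values pair by pair in this way is one possible source of starting values for an unconstrained *P*, although the resulting matrix is not guaranteed to be positive definite.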

#### 3.2.2 Skew normal and Skew t

The last two dependence structures included in the comparison are the multivariate Skew Normal and Skew t distributions (R package sn, Azzalini 2015). They can be thought of as asymmetric extensions of their generating distribution (Gaussian for the Skew Normal and Student t for the Skew t). The Skew distributions have an additional vector of *d* parameters, \({\varvec{\alpha }}\), which act as skewness parameters. The Gaussian and Student t distributions appear as special cases when \({\varvec{\alpha }} = {\varvec{0}}\).

The density of the Skew Normal (resp. Skew t) is given in Eq. (13) (resp. Eq. (14)) where \({\varvec{x}} \in {\mathbb {R}}^d\) (Azzalini and Capitanio 2003). For both the Skew Normal and the Skew t, \({\varvec{\alpha }} \in {\mathbb {R}}^d\) projects \({\varvec{x}}\) onto a line and *P* is a \(d \times d\) correlation (or dispersion) matrix that is symmetric and positive definite. The Skew t has, in addition, the degree of freedom parameter, \(\nu > 0\), inherited from the Student t.
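To make the role of \({\varvec{\alpha }}\) concrete, the sketch below evaluates a bivariate Skew Normal density in the standardized form \(2\, \phi _2({\varvec{x}}; P)\, \Phi ({\varvec{\alpha }}^\top {\varvec{x}})\) of Azzalini and Capitanio (2003), with zero location and unit scale. It is assumed, but not verified here, that this matches Eq. (13) up to notation; the function names are hypothetical.

```python
import math
from statistics import NormalDist

def bivariate_normal_pdf(x, y, rho):
    """Density of a standard bivariate Gaussian with correlation rho."""
    det = 1.0 - rho ** 2
    quad = (x * x - 2.0 * rho * x * y + y * y) / det
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))

def bivariate_skew_normal_pdf(x, y, rho, alpha):
    """Standardized Skew Normal density: 2 * phi_2(x; P) * Phi(alpha . x)."""
    projection = alpha[0] * x + alpha[1] * y  # alpha projects (x, y) onto a line
    return 2.0 * bivariate_normal_pdf(x, y, rho) * NormalDist().cdf(projection)

# With alpha = (0, 0) the Gaussian density is recovered.
gaussian_value = bivariate_normal_pdf(1.0, 0.5, 0.6)
skew_value_no_skew = bivariate_skew_normal_pdf(1.0, 0.5, 0.6, (0.0, 0.0))
```

A nonzero \({\varvec{\alpha }}\) shifts mass towards the half-plane where \({\varvec{\alpha }}^\top {\varvec{x}} > 0\), which is the source of the asymmetry.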

*P* and the skewness parameters \({\varvec{\alpha }}\). In practice, the sn package (Azzalini 2015) did not allow fixing the location parameters to zero and the scale parameters to 1 in the estimation, as would be required by the margin transformation. In order to stay as close as possible to these parameter values, they were used as starting parameter values for the optimization. The optimized parameter values did not wander too far from the starting values. The Skew t, the most complex model, was difficult to fit. Sensible starting values, taken from the fitted Skew Normal, were provided to the optimizer to help the estimation of the parameters.

*x*-axis variable most often takes higher values than the *y*-axis variable. In the rainfall application, this translates into one station generally hitting higher quantile values of its marginal distribution with respect to another station. In the bottom row, \(\alpha = (0.5,0.5)\) yields an asymmetry with respect to the line \(y = 1 - x\). This results in lower dependence at the smaller values than at the larger values. This can be related to the fact that low rainfall intensities tend to be scattered and intermittent and thus often display poor spatial dependence whereas high rainfall intensities tend to be more dependent (Bárdossy and Pegram 2009).

#### 3.2.3 Multivariate mixture

In order to account for the possible presence of more than one sub-population of rainfall, we tested whether a multivariate mixture with more than one component was required. The margins of the flood-risk rainfall data were transformed to standard Gaussian with the empirical marginal distribution functions, see Eq. (15), and a multivariate Gaussian mixture was fitted to the transformed data. Then, the BIC was used to select the appropriate number of components (Fraley and Raftery 1999).

According to the BIC, a single Gaussian component is needed to model the dependence structure. We expect that a single component would also be selected when considering a mixture with the other models (Student t, Skew Normal and Skew t). Indeed, these models include the Gaussian as a special case and have a larger number of parameters. For the BIC to select more than one component, the increase in goodness-of-fit versus the increase in complexity (number of parameters) would have to be very significant. The test provides sufficient grounds to keep a single dependence structure model and not to consider further multivariate mixture modeling.
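The trade-off described above can be quantified with a short calculation. The sketch below assumes unconstrained Gaussian components (full mean vector and covariance matrix) and the convention \(\text {BIC} = -2 \log L + p \log n\), where smaller is better; some software, including model-based clustering packages, reports the opposite sign. Function names are hypothetical.

```python
import math

def gaussian_mixture_n_params(d, n_components):
    """Free parameters of a d-dimensional Gaussian mixture with full covariances."""
    per_component = d + d * (d + 1) // 2  # mean vector + symmetric covariance matrix
    return n_components * per_component + (n_components - 1)  # plus free mixing weights

def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion, smaller-is-better convention."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)

# Dimension 8 and 265 flood-risk days, as in the data set described in the text.
extra_params = gaussian_mixture_n_params(8, 2) - gaussian_mixture_n_params(8, 1)
# Gain in log-likelihood a second component must exceed to lower the BIC.
required_gain = 0.5 * extra_params * math.log(265)
```

Under these assumptions a second component adds 45 parameters, so it is only selected if it raises the log-likelihood by more than `required_gain`.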

## 4 Comparative results

### 4.1 Statistical inference

First, we seek to evaluate independently how good the marginal and dependence structure models are at fitting the flood-risk rainfall data.

#### 4.1.1 Margin fit

The fit of the two marginal distributions considered is evaluated by means of quantile-quantile plots (qq-plots) as shown in Fig. 8 for the Gamma distribution and in Fig. 9 for the 2-component Log-Normal mixture. In all qq-plots, the empirical quantiles are represented on the *x*-axis and the theoretical quantiles from the marginal distributions on the *y*-axis. Confidence intervals at 95 % for the theoretical quantiles are computed with 1000 parametric bootstrap replications. To ease comparison across qq-plots, the first diagonal is drawn on the interval [0, 300]. For a given station, the marginal model is considered to fit well if the first diagonal is within the confidence interval most of the time.

#### 4.1.2 Dependence structure fit

There is no straightforward way to visually assess the fit of a dependence structure, especially in high dimension. We make do with comparisons in terms of pairwise dependence. First, we evaluate whether the models are able to reproduce the empirical Kendall’s \(\tau\) for all pairs of stations. We dropped the evaluation in terms of the extremal \(\chi\) coefficients as we have seen that the \(\chi\) estimators are positively related to the \(\tau\) coefficient estimators, see Fig. 5. Second, we look at bivariate densities for two representative pairs of stations.

In order to gain more insight into the models, we also look at the fitted bivariate copula densities for the same two pairs of stations as chosen for illustration for the \(\chi\)-plots in Fig. 4: a nearby pair, Barre-des-Cevennes and Cassagnas, in Fig. 11 and a distant pair, Barre-des-Cevennes and Generargues, in Fig. 12.

The fitted bivariate copula densities are estimated with bivariate hexagonal histograms (R package fMultivar provided by Rmetrics https://www.rmetrics.org/) on random samples of size \(10^6\) from the copulas associated to each fitted model. We made this choice because the bivariate margins of the Skew distributions are not easy to deduce (Azzalini 2013). The darker the histogram bin, the higher the estimated density. The same grey scale is used for each pair of stations. The dots represent the observations.

Figures 11 and 12 are organized as follows. The asymptotically independent dependence structures are in the left column (Gaussian and Skew Normal) and the asymptotically dependent ones are in the right column (Student t and Skew t). The symmetric dependence structures are in the top row (Gaussian and Student t) while the asymmetric ones (Skew Normal and Skew t) are in the bottom row.

### 4.2 Leave-one-out model selection

Second, model selection is achieved by performing an automatic quantitative evaluation of the fit of the multivariate density models based on leave-one-out validation (sometimes also called jackknife). With such a validation scheme, each observation is left aside in turn and the models are fitted on the \(n-1\) observations. Performance measures are then computed on the observation that was left aside. Since the performance is evaluated out-of-sample, the comparison is fair between models even when they have different numbers of parameters (see Chapter 2.7 in Ripley 1996).
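The validation scheme itself is independent of the model being assessed. The sketch below shows the generic loop, assuming hypothetical `fit` and `score` callables; the toy usage fits a sample mean and scores it with a squared error, which stands in for the multivariate density models and goodness-of-fit statistics used in the study.

```python
from statistics import mean

def leave_one_out(data, fit, score):
    """Fit on all observations but one, score on the held-out one, for each observation."""
    scores = []
    for i in range(len(data)):
        train = data[:i] + data[i + 1:]       # the n - 1 remaining observations
        model = fit(train)
        scores.append(score(model, data[i]))  # out-of-sample performance
    return scores

# Toy usage: the "model" is the sample mean, the score is the squared error.
scores = leave_one_out([1.0, 2.0, 3.0, 6.0], fit=mean, score=lambda m, x: (m - x) ** 2)
```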

We used the Cramer-Von Mises and Anderson-Darling goodness-of-fit statistics as performance measures. These goodness-of-fit statistics can be seen as distances between the empirical distribution function and the theoretical distribution function \(F_{\varvec{\phi }}\) of a given multivariate density model, with \({\varvec{\phi }}\) including margin and dependence parameters. The Cramer-Von Mises statistic is simply defined as the squared distance between the two distribution functions while in the Anderson-Darling statistic, weights are introduced to emphasize an accurate representation of extreme values (Genest et al. 2013).
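To illustrate how the two statistics differ, the sketch below implements their textbook univariate forms for a fully specified distribution function. This is a simplification: the statistics used in the comparison are computed for the multivariate models, and their exact form is not reproduced here. Function names are hypothetical.

```python
import math

def cramer_von_mises(sample, cdf):
    """Univariate Cramer-Von Mises statistic against a specified cdf."""
    u = sorted(cdf(x) for x in sample)
    n = len(u)
    return 1.0 / (12 * n) + sum((u[i] - (2 * i + 1) / (2 * n)) ** 2 for i in range(n))

def anderson_darling(sample, cdf):
    """Univariate Anderson-Darling statistic; the log terms give more weight to the tails."""
    u = sorted(cdf(x) for x in sample)
    n = len(u)
    s = sum((2 * i + 1) * (math.log(u[i]) + math.log(1.0 - u[n - 1 - i])) for i in range(n))
    return -n - s / n

# Toy usage: evenly spread points on (0, 1) against a correct and an incorrect cdf.
points = [(2 * i + 1) / 20 for i in range(10)]
cvm_good = cramer_von_mises(points, lambda x: x)
cvm_bad = cramer_von_mises(points, lambda x: x ** 2)
ad_good = anderson_darling(points, lambda x: x)
ad_bad = anderson_darling(points, lambda x: x ** 2)
```

In both cases the mis-specified distribution function yields the larger statistic, consistent with smaller values indicating a better fit.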

The *x*-axis has a logarithmic scale to enhance differences between models. Smaller values of the statistics mean better performance.

The multivariate model with the Skew Normal dependence structure and 2-component Log-Normal margins (SN-LNorMix) outperforms the other seven models in terms of both goodness-of-fit statistics. In all cases but one, the performance of the multivariate models, in terms of both goodness-of-fit statistics, is improved when 2-component Log-Normal mixture margins are used instead of Gamma margins. The exception concerns the models with Skew t dependence structure that have similar performance with both types of margins. When Gamma margins are employed, all four dependence structures yield multivariate models with comparable performance. Only the Skew Normal displays a significantly better fit and only in terms of the Anderson-Darling statistic. The asymptotically dependent models (TC, ST) do not perform better than their asymptotically independent counterparts (GC, SN).

### 4.3 Hydrological criteria

Last, we propose to obtain complementary insight into the multivariate models by means of two hydrologically meaningful quantities: the return periods of the spatial average of flood-risk rainfall (Sect. 4.3.1) and the conditional probability that at one station, rainfall exceeds a high level given that a high level is exceeded at another station (Sect. 4.3.2).

#### 4.3.1 Spatial average return periods

The distribution of the spatial average \(\overline{X} = (X_1 + \cdots + X_8)/8\), and consequently of the return periods of the spatial average, involves both the margins and the dependence structure of the multivariate density models.

The theoretical return periods \(T_k\), that is as predicted by the fitted models, of the observed spatial averages are estimated by bootstrap resampling as computing exact return periods from the multivariate models would be very involved. An 8-dimensional sample of size \(10^6\) was drawn from each of the eight multivariate density models ensuring that the spatial average is always greater than 50. The theoretical return periods are then estimated with Eq. (20) in which \(P(\overline{X} > \bar{x}_{(k)} | \overline{X} > 50)\) is now approximated by the proportion of exceedances of \(\bar{x}_{(k)}\) in the bootstrap sample of \(10^6\) simulated spatial averages.
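The simulation-based estimate can be sketched as follows. The sketch assumes the usual peaks-over-threshold form \(T = 1 / (\lambda \, p)\), where \(p\) is the conditional exceedance probability and \(\lambda\) is the mean number of flood-risk days per year (265 days over 43 years in this data set); whether this coincides exactly with Eq. (20) is an assumption. The function name and toy values are hypothetical.

```python
def return_period(level, simulated_averages, events_per_year):
    """Return period (years) of a spatial average level from simulated spatial averages."""
    exceedances = sum(1 for s in simulated_averages if s > level)
    p = exceedances / len(simulated_averages)  # estimate of P(avg > level | avg > threshold)
    if p == 0.0:
        return float("inf")                    # level never exceeded in the simulation
    return 1.0 / (events_per_year * p)

# Toy usage: 100 "simulated" spatial averages, all above 50 mm.
simulated = [50.0 + k for k in range(1, 101)]
T_140 = return_period(140.0, simulated, events_per_year=265 / 43)
```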

Confidence intervals at 95 % are obtained for the empirical and theoretical return periods as follows. For the empirical estimates \(\hat{T}_k\), since the size of the observed sample is small, bootstrap resampling is employed. To this end, 10,000 random samples of size 265 were drawn with replacement from the set of 265 8-dimensional rainfall observations so as to preserve spatial dependence. For the theoretical estimates \(T_k\), as the sample size is large, 95 % confidence intervals can be computed from standard errors. This is done in two steps. First, standard errors for the sample proportion of exceedances of \(\bar{x}_{(k)}\) are estimated as the standard deviation of the sample proportion divided by the square root of the sample size (\(\sqrt{10^6} = 1000\) in this case). Second, the confidence intervals deduced for the sample proportion of exceedances are translated into confidence intervals for the return periods via Eq. (20).
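The two steps for the theoretical estimates can be sketched as below. The normal approximation with the 1.96 quantile and the mapping \(T = 1/(\lambda \, p)\) from proportions to return periods are assumptions standing in for Eq. (20); function names are hypothetical.

```python
import math

def proportion_ci(p, n, z=1.96):
    """Normal-approximation confidence interval for a sample proportion p from n draws."""
    se = math.sqrt(p * (1.0 - p) / n)  # standard deviation divided by sqrt(n)
    return max(0.0, p - z * se), min(1.0, p + z * se)

def return_period_ci(p, n, events_per_year, z=1.96):
    """Translate the interval on p into an interval on T = 1 / (events_per_year * p)."""
    p_low, p_high = proportion_ci(p, n, z)
    # T decreases as p increases, so the bounds swap.
    t_low = 1.0 / (events_per_year * p_high)
    t_high = float("inf") if p_low == 0.0 else 1.0 / (events_per_year * p_low)
    return t_low, t_high

# Illustrative use with a simulated sample of size 10^6.
p_low, p_high = proportion_ci(0.01, 10 ** 6)
t_low, t_high = return_period_ci(0.01, 10 ** 6, events_per_year=265 / 43)
```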

The Skew Normal dependence structure stands out as it is the only one which, with both types of margins, is able to reproduce accurately the return periods of the smallest return levels (from 50 to 90 mm approximately). These are under-estimated by the other three dependence structures, which means that the models see these levels as more frequent than they should.

For the largest spatial averages (beyond 125 mm), the confidence intervals of the empirical estimates are very wide and contain, most of the time, the estimates of all eight models. These spatial averages are rare events (return periods greater than 30 years) with respect to the length of the data set (43 years). For the four dependence structures, the model with 2-component Log-Normal mixture margins yields lower return periods while the model with Gamma margins provides higher return periods and thus assigns smaller probabilities to the largest observed spatial averages.

#### 4.3.2 Conditional probability of exceedance

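For a pair of stations *i* and *j*, this can be expressed as:

\[ P\left( X_j > R_j(T) \mid X_i > R_i(T) \right) \qquad (21) \]

where \(R_i(T)\) and \(R_j(T)\) are the *T*-year return levels at stations *i* and *j* that satisfy \(P(X_i > R_i(T)) = P(X_j > R_j(T)) = {1}/{T}\).

At station *j*, the upper tail is approximated by the GPD, for all *x* above the threshold \(u_j\), as follows:

The *T*-year return level at station *j* can be derived from Eq. (22) as: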
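For reference, one common parameterization of the GPD tail approximation and of the resulting return level is sketched below. The symbols \(\zeta_j = P(X_j > u_j)\), \(\sigma_j > 0\) (scale) and \(\xi_j \ne 0\) (shape) are assumed names and may differ from the notation of Eqs. (22) and (23). The return level is obtained by setting the tail probability equal to \(1/T\).

```latex
% Assumed standard GPD tail approximation, for x > u_j:
P(X_j > x) \approx \zeta_j \left( 1 + \xi_j \, \frac{x - u_j}{\sigma_j} \right)^{-1/\xi_j}

% Solving P(X_j > R_j(T)) = 1/T for R_j(T):
R_j(T) = u_j + \frac{\sigma_j}{\xi_j} \left[ (T \zeta_j)^{\xi_j} - 1 \right]
```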

Station name | *T* = 2 years | *T* = 5 years | *T* = 10 years | *T* = 20 years |
---|---|---|---|---|
BARRE-DES-CEVENNES | 124 | 157 | 184 | 214 |
CASSAGNAS | 160 | 197 | 226 | 255 |
GENERARGUES | 110 | 137 | 159 | 182 |

The empirical and theoretical (as predicted by each fitted model) conditional probabilities of Eq. (21) are estimated as the sample proportion of the conditional exceedances. In other words, among the observations in the sample for which \(R_i(T)\) is exceeded at station *i*, we computed the proportion for which \(R_j(T)\) is also exceeded at station *j*. For the theoretical conditional probabilities, a sample of size \(10^6\) was simulated by each fitted model such that the spatial average is greater than 50 mm. As already mentioned, we resorted to simulation to estimate the theoretical conditional probabilities because the lower dimensional margins of the Skew distributions are not easy to deduce (Azzalini 2013).
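The sample-proportion estimator described above can be written in a few lines. The following is a minimal sketch; the function name, the toy two-station sample and the thresholds of 124 and 110 mm are chosen for illustration only.

```python
def conditional_exceedance(sample, i, j, level_i, level_j):
    """Estimate P(X_j > level_j | X_i > level_i) as a sample proportion.

    Returns None when station i never exceeds its level, since the
    conditional probability cannot be estimated in that case.
    """
    conditioning = [row for row in sample if row[i] > level_i]
    if not conditioning:
        return None
    joint = sum(1 for row in conditioning if row[j] > level_j)
    return joint / len(conditioning)


# Toy 2-station sample (mm); illustrative values, not data from the study.
sample = [(130, 120), (90, 115), (140, 100), (60, 70), (125, 111)]
prob = conditional_exceedance(sample, i=0, j=1, level_i=124, level_j=110)
```

Here three rows exceed 124 mm at the conditioning station and two of them also exceed 110 mm at the other station, so `prob` equals 2/3.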

The 95 % confidence intervals are estimated in a similar way as those for the return periods of the spatial average in Sect. 4.3.1. For the empirical estimates, as the sample size is small, 95 % confidence intervals are obtained by bootstrap resampling (10,000 random samples of size 265 were drawn with replacement from the set of 265 8-dimensional observed rainfall). For the theoretical estimates, as the sample size is large, the confidence intervals were computed with the standard errors (the standard deviation of the empirical proportion divided by the square-root of the sample size which is the number of exceedances of \(X_i\)).
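A bootstrap interval for the empirical conditional probability can be sketched as follows. This is a hedged illustration: the text does not say which bootstrap interval is used, so the percentile method is assumed here, and resamples in which the conditioning station has no exceedance are simply skipped (also an assumption). Whole rows are resampled so that the dependence between stations is preserved.

```python
import random


def conditional_exceedance(sample, i, j, level_i, level_j):
    """Sample proportion estimate of P(X_j > level_j | X_i > level_i), or None."""
    conditioning = [row for row in sample if row[i] > level_i]
    if not conditioning:
        return None
    return sum(1 for row in conditioning if row[j] > level_j) / len(conditioning)


def bootstrap_ci(sample, statistic, n_boot=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap interval (assumed method); rows are resampled whole."""
    rng = random.Random(seed)
    n = len(sample)
    estimates = []
    for _ in range(n_boot):
        resample = [sample[rng.randrange(n)] for _ in range(n)]
        value = statistic(resample)
        if value is not None:  # skip resamples without conditioning exceedances
            estimates.append(value)
    if not estimates:
        return None
    estimates.sort()
    last = len(estimates) - 1
    lower = estimates[int((alpha / 2) * last)]
    upper = estimates[int(round((1 - alpha / 2) * last))]
    return lower, upper


# Toy 2-station sample (mm); illustrative values, not data from the study.
sample = [(130, 120), (90, 115), (140, 100), (60, 70), (125, 111)]
ci = bootstrap_ci(
    sample,
    lambda s: conditional_exceedance(s, 0, 1, 124, 110),
    n_boot=2000,
)
```

With only a handful of exceedances at the conditioning station, such an interval is typically very wide, which is the situation faced by the empirical estimates.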

Given the small sample size, the empirical estimates of the conditional probabilities are unreliable. For the nearby pair, the empirical estimates provide no information for \(T \ge 10\) since the 95 % confidence intervals reach the bounds [0,1], see Fig. 15a, b where the *y*-axis is truncated. Conversely, for the distant pair, the confidence intervals collapse to 0, also for \(T \ge 10\), see Fig. 15c, d. Indeed, the numbers of exceedances of the conditioning station Barre-des-Cevennes are very low: 21, 8, 5, and 4 exceedances for T = 2, 5, 10 and 20 respectively.

As expected, since this is a monotone transformation, the choice of margins (left column versus right column in Fig. 15) does not affect the ordering of the curves or their global features (rising, declining or stabilizing). However, for the nearby pair, the estimated conditional probabilities are clearly higher with the 2-component Log-Normal mixture margins, see Fig. 15b.

Unsurprisingly, the asymptotically dependent models, Student t and Skew t, generally yield the highest conditional probability estimates. This is especially true for the longer return periods and for the nearby pair, see Fig. 15a, b. In contrast, the Skew Normal model, which is asymptotically independent, provides estimates that are nearly comparable to the asymptotically dependent model estimates for the distant pair, see Fig. 15c, d.

Among the two asymptotically independent models, the Skew Normal model gives higher conditional probability estimates than the Gaussian model. For the nearby pair, the empirical Kendall’s \(\tau\) estimate is 0.55, and thanks to this strong correlation the Gaussian model is able to predict quite high conditional probabilities, as the asymptotic independence property only comes into play for much longer return periods. Conversely, for the distant pair, which has an empirical Kendall’s \(\tau\) estimated at \(-\)0.067, the Gaussian is almost independent and predicts a conditional probability decreasing very quickly to zero.
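The behaviour of the Gaussian dependence structure for the two pairs can be illustrated with a small Monte Carlo experiment. This is a simplified sketch, not the procedure used in the study: it considers a bivariate Gaussian copula alone, ignores the conditioning on the spatial average, and converts Kendall's \(\tau\) to a correlation with the relation \(\rho = \sin(\pi \tau / 2)\), which holds for elliptical copulas. The function names and the probability levels 0.9 and 0.99 are illustrative.

```python
import math
import random
from statistics import NormalDist


def rho_from_tau(tau):
    """Correlation of a Gaussian copula with Kendall's tau equal to `tau`."""
    return math.sin(math.pi * tau / 2.0)


def gaussian_conditional_exceedance(rho, prob_level, n_sim=100_000, seed=0):
    """Monte Carlo estimate of P(Z2 > q | Z1 > q), q being the normal quantile of `prob_level`."""
    rng = random.Random(seed)
    q = NormalDist().inv_cdf(prob_level)
    scale = math.sqrt(1.0 - rho * rho)
    n_cond = n_joint = 0
    for _ in range(n_sim):
        z1 = rng.gauss(0.0, 1.0)
        if z1 > q:
            n_cond += 1
            # Conditional construction of a standard bivariate normal pair.
            z2 = rho * z1 + scale * rng.gauss(0.0, 1.0)
            if z2 > q:
                n_joint += 1
    return n_joint / n_cond if n_cond else float("nan")


rho_near = rho_from_tau(0.55)    # about 0.76 (nearby pair)
rho_far = rho_from_tau(-0.067)   # about -0.105 (distant pair)
near = [gaussian_conditional_exceedance(rho_near, p) for p in (0.9, 0.99)]
far = [gaussian_conditional_exceedance(rho_far, p) for p in (0.9, 0.99)]
```

In this simplified setting, the conditional probability for the nearby pair decreases as the probability level increases, while for the distant pair it is already small at the 0.9 level.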

## 5 Discussion and conclusion

We conducted a comparative study of eight multivariate density models (marginal model combined with a dependence structure) for flood-risk rainfall, i.e. rainfall likely to cause flash floods in small Mediterranean catchments. The characterization of flood-risk rainfall, and in particular of its spatial variability, is crucial to improve flash-flood understanding. The study area is the Gardon at Anduze, a representative small Mediterranean catchment of about 545 km\(^2\). Flood-risk rainfall is defined as rainfall with a high spatial average. For the Gardon at Anduze catchment, the spatial average is considered high when it is greater than 50 mm. We used data from eight rain gauge stations at the daily time-step. The pairwise exploratory analysis revealed that the bivariate dependence varies widely from strong for nearby pairs of stations (5 km apart) to weak or zero for distant pairs (40 km apart). This confirms the strong spatial variability of flood-risk rainfall over the catchment.

Two marginal models were considered: the Gamma distribution and a mixture of Log-Normal distributions. The Gamma is a parametric model with 2 parameters that was often used to model the univariate distribution of rainfall. The Log-Normal mixture is a non-parametric model whose complexity, i.e. the number of parameters, can increase with the size of the data set. For all eight stations, two mixture components were selected based on the BIC. As a result, the mixture has 5 parameters, for this data set. Both marginal models are light-tailed, i.e. exponential decay of the upper tail. However, the 2-component Log-Normal mixture has considerably more flexibility due to its larger number of parameters. Such a mixture can adapt, in principle, to more complex distributions caused by the presence of several sub-populations of rainfall.
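The parameter count of the mixture and the BIC used to select the number of components can be made concrete with a short sketch. The function names and the numerical values in the usage line are illustrative assumptions. The BIC is written in the Schwarz sign convention in which lower values are preferred; some software reports the opposite sign.

```python
import math


def lognormal_mixture_pdf(x, weights, mus, sigmas):
    """Density at x > 0 of a mixture of Log-Normal components."""
    total = 0.0
    for w, mu, sigma in zip(weights, mus, sigmas):
        z = (math.log(x) - mu) / sigma
        total += w * math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))
    return total


def n_mixture_params(k):
    """k log-means + k log-standard deviations + (k - 1) free weights."""
    return 3 * k - 1


def bic(log_likelihood, n_params, n_obs):
    """Schwarz criterion; lower is better in this sign convention."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


# Illustrative 2-component mixture evaluated at 60 mm (made-up parameter values).
density = lognormal_mixture_pdf(60.0, [0.7, 0.3], [3.5, 4.5], [0.5, 0.4])
```

For two components, `n_mixture_params(2)` returns 5, in agreement with the count given above.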

Four dependence structures with different theoretical properties were included in the comparison: the Gaussian, the Student t, the Skew Normal and the Skew t. The Gaussian is symmetric and asymptotically independent (except when \(|\rho |=1\)). Its parameters are the free parameters of the \(8 \times 8\) correlation matrix (constrained to be symmetric and positive definite), where 8 is the number of rain gauge stations. The Student t is symmetric and asymptotically dependent (except when \(\rho =-1\)). The asymptotic dependence, loosely speaking, characterizes the fact that extremes, i.e. asymptotically high values, tend to occur simultaneously at different sites. The Student t has one additional parameter \(\nu\), compared to the Gaussian, called the degrees of freedom. This parameter controls the behavior of the tails: the smaller it gets, the greater the asymptotic dependence becomes. The Skew Normal introduces asymmetry in the Gaussian. It has, in addition to the correlation matrix, a vector of skewness parameters \({\varvec{\alpha }} \in {\mathbb {R}}^8\) which defines the orientation of the asymmetry, see Fig. 7. The Skew Normal, like its generating distribution, is asymptotically independent but has more flexibility thanks to its eight extra parameters. The Skew t, an asymmetric version of the Student t, combines the properties of the Student t and Skew Normal: it is asymptotically dependent and more flexible than its generating distribution.
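Two standard formulas help to make these properties concrete. They are given here in a common textbook notation, which is an assumption and may differ from the notation used elsewhere in the paper: \(\phi_8(\cdot ; \bar{\Omega })\) is the 8-dimensional Gaussian density with correlation matrix \(\bar{\Omega }\), \(\Phi\) the standard normal distribution function and \(t_{\nu +1}\) the univariate Student t distribution function with \(\nu + 1\) degrees of freedom.

```latex
% Skew Normal density in standardized form (Azzalini's parameterization);
% setting \alpha = 0 recovers the Gaussian density.
f(\mathbf{z}) = 2 \, \phi_8(\mathbf{z}; \bar{\Omega}) \, \Phi(\boldsymbol{\alpha}^{\top} \mathbf{z}),
\qquad \mathbf{z} \in \mathbb{R}^8

% Upper tail dependence coefficient of a bivariate Student t pair with
% correlation \rho and \nu degrees of freedom; it increases as \nu decreases
% and vanishes when \rho = -1.
\lambda = 2 \, t_{\nu + 1}\!\left( - \sqrt{\frac{(\nu + 1)(1 - \rho)}{1 + \rho}} \right)
```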

The models were included in the comparison either because they were widely used in the literature for similar applications or because they are variants of these models with different theoretical properties (non-parametric, asymmetric, asymptotically dependent). All models are relatively easy to implement thanks to R libraries mentioned throughout the text. The Gaussian with Gamma margins is the most parsimonious model while the Skew t with 2-component Log-Normal margins is the most complex (12 additional parameters). Moreover, we gained reasonable confidence that no multivariate mixture modeling was needed by testing for the number of components in a multivariate Gaussian mixture.

Three types of criteria were taken into account in the comparison of the multivariate density models. First in terms of *statistical inference*, we sought to evaluate if the marginal and dependence structure models independently provided a reasonable fit. As can be seen from the quantile-quantile plots in Fig. 9, the 2-component Log-Normal mixture, thanks to its greater flexibility, is able to fit all eight stations. Greater flexibility comes with greater variance as indicated by the large confidence intervals for the upper tail of the distribution. In contrast, the Gamma lacks some flexibility as it under-estimates the upper tail of the distribution for four stations, see Fig. 8. Although the four dependence structures all reproduce well the empirical Kendall’s \(\tau\) (see Fig. 10), they might have important differences in the fitted bivariate densities. For instance, the asymmetry of the Skew Normal appears very clearly for the pair of stations Barre-des-Cevennes/Generargues, see Fig. 12. Moreover, the effect of the asymptotic dependence can be seen for the Student t and the Skew t that have greater density in the left-top and bottom-right corners.

Second, *model selection* was achieved based on the evaluation of the Cramer-Von Mises and the Anderson-Darling statistics with a leave-one-out scheme. With such a scheme, an over-parametrized model is penalized as it will tend to fit the calibration sets \({\mathcal {F}}_{k:n-1}\) too well, see Eq. (17), and perform poorly on the left-out observations. Therefore, the leave-one-out evaluation allows a trade-off between goodness-of-fit and complexity. With regard to this quantitative evaluation, the Skew Normal with 2-component Log-Normal mixture margins significantly outperforms the other seven models, see Fig. 13.
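The idea of a leave-one-out evaluation of these statistics can be illustrated in a univariate setting. The sketch below is an analogue only and is not the multivariate scheme of Eq. (17): each observation is left out in turn, a model is fitted to the remaining ones, and the fitted distribution function is evaluated at the left-out value. The resulting values are then compared with the uniform distribution through the usual Cramer-Von Mises and Anderson-Darling formulas. The exponential toy model and all names are assumptions made for illustration.

```python
import math
import random


def cramer_von_mises(u):
    """Cramer-Von Mises statistic of values u against the uniform distribution on (0, 1)."""
    u = sorted(u)
    n = len(u)
    return 1.0 / (12 * n) + sum((u[i] - (2 * i + 1) / (2 * n)) ** 2 for i in range(n))


def anderson_darling(u):
    """Anderson-Darling statistic against the uniform; gives more weight to the tails."""
    u = sorted(u)
    n = len(u)
    s = sum((2 * i + 1) * (math.log(u[i]) + math.log(1.0 - u[n - 1 - i])) for i in range(n))
    return -n - s / n


def leave_one_out_pit(data, fit, cdf):
    """Fit on all observations but one, then evaluate the fitted CDF at the left-out one."""
    pit = []
    for k in range(len(data)):
        params = fit(data[:k] + data[k + 1:])
        pit.append(cdf(data[k], params))
    return pit


# Toy model (assumption for illustration): exponential distribution fitted by its mean.
def fit_exponential(train):
    return sum(train) / len(train)


def exponential_cdf(x, mean):
    return 1.0 - math.exp(-x / mean)


rng = random.Random(0)
data = [rng.expovariate(1.0 / 20.0) for _ in range(60)]
pit = leave_one_out_pit(data, fit_exponential, exponential_cdf)
cvm, ad = cramer_von_mises(pit), anderson_darling(pit)
```

A model that overfits its calibration sets tends to give left-out values that depart from uniformity, which inflates both statistics.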

Third, to obtain complementary insight into the models, they were compared in terms of two *hydrologically interpretable quantities*: the return periods of the observed spatial averages and the conditional probability of exceedances of at-site return levels for two representative pairs of stations. In both cases, it is not possible to select a model based on comparisons with the empirical estimates because of the high uncertainty of these rare events. However, inter-model comparisons emphasize some differences between the dependence structures. In particular, the Skew Normal is the only dependence structure providing consistent return periods for the smaller spatial averages, see Fig. 14. In addition, despite being also asymptotically independent, the Skew Normal provides higher conditional probabilities and therefore reveals stronger dependence than the Gaussian, see Fig. 15. For the distant pair of stations, the Skew Normal is almost comparable to the asymptotically dependent models. The Gaussian yields the lowest conditional probabilities and thus is the model with the weakest spatial dependence. In contrast, the Skew t can display very strong spatial dependence, especially for the nearby pair of stations, see Fig. 15.

In conclusion, for the Gardon at Anduze catchment, the Skew Normal with 2-component Log-Normal mixture margins achieved the best fit. The increase in complexity of the mixture model for the margins with respect to the Gamma is compensated by a significant increase in goodness-of-fit. Similarly, the asymmetry introduced by the Skew Normal is an added value with respect to the Gaussian. In contrast, the asymptotically dependent models did not improve the fit over the asymptotically independent ones. As mentioned in Serinaldi et al. (2014), asymptotic dependence is very difficult to detect when the time series is short, as is the case in the present work. The Gaussian, which is the benchmark model in this comparison, is not recommended for the data at hand. Even when considering the more complex 2-component Log-Normal mixture model for the margins, its performance remains significantly lower than that of the Skew Normal. Moreover, preliminary testing led us to conclude that considering a multivariate mixture of Gaussians, instead of a single Gaussian, would not improve the fit.

1. The strategy that we adopted to focus on flood-risk rainfall, the type of rainfall associated with flash floods, allows us to tackle the most important feature multi-site stochastic generators should be able to reproduce when applied to small Mediterranean catchments. This strategy circumvents the need to build a complex stochastic model that must account for rainfall intermittency and inhomogeneity. Homogeneity is dealt with through a statistical approach, namely the selection of the number of components in mixture models based on the BIC, rather than by fixing the number of components based on the seasons or the months.
2. We compared multivariate density models of increasing complexity with different combinations of theoretical properties thanks to the decomposition into marginal and dependence structure models. We were able to determine which properties are most relevant for the data at hand. Multivariate EVT models were not included in the comparison because high dimensional models that could be easily implemented are too simplistic (e.g. Gumbel).
3. We proposed three types of criteria that serve different purposes: (i) *statistical inference* is meant to assess basic model goodness-of-fit, (ii) *model selection* serves to identify the best model and (iii) *hydrologically interpretable quantities* help to gain a deeper understanding of the models that could be relevant for hydrological applications.

*regionalization* for the margin parameters, i.e. spatial interpolation, in order to define the margins of a continuous process at every point in space. Finally, it would be interesting to evaluate, in our application, some recent flexible models from multivariate EVT such as those proposed in Salvadori and De Michele (2010) or Bacro et al. (2015) for spatial processes.

## Notes

### Acknowledgments

This work has been partly supported by the StaRMIP project and the FloodScale project, funded by the French National Research Agency (ANR). The FloodScale project contributes to the HyMeX program and benefits from funding by the MISTRALS/HyMeX program (http://www.mistrals-home.org). The rainfall data are provided by the OHM-CV, an observation service certified in 2006 and funded by the Institut National des Sciences de l’Univers/Surface et Interfaces Continentales. We are thankful to all the R-package developers that we mentioned throughout the paper. We thank F. Serinaldi and two anonymous reviewers for their valuable comments, which greatly helped improve the quality of the paper.

## References

- Ailliot P, Allard D, Monbet V, Naveau P (2015) Stochastic weather generators: an overview of weather type models. J de la Société Française de Stat 156(1):101–113
- Allard D, Bourotte M (2014) Disaggregating daily precipitations into hourly values with a transformed censored latent Gaussian process. Stoch Environ Res Risk Assess 29(2):453–462
- Azzalini A (2013) The skew-normal and related families, vol 3. Cambridge University Press, Cambridge
- Azzalini A (2015) The R package sn: The skew-normal and skew-t distributions (version 1.2-3). Università di Padova, Italia. http://azzalini.stat.unipd.it/SN
- Azzalini A, Capitanio A (2003) Distributions generated by perturbation of symmetry with emphasis on a multivariate skew t-distribution. J R Stat Soc 65(2):367–389
- Bacro JN, Gaetan C, Toulemonde G (2015) A flexible dependence model for spatial extremes. (in revision)
- Barancourt C, Creutin JD, Rivoirard J (1992) A method for delineating and estimating rainfall fields. Water Resour Res 28(4):1133–1144
- Bárdossy A, Pegram GGS (2009) Copula based multisite model for daily precipitation simulation. Hydrol Earth Syst Sci 13(12):2299–2314
- Baxevani A, Lennartsson J (2015) A spatiotemporal precipitation generator based on a censored latent Gaussian field. Water Resour Res
- Beirlant J, Goegebeur Y, Segers J, Teugels J (2006) Statistics of extremes: theory and applications. Wiley, New York
- Bellone E, Hughes JP, Guttorp P (2000) A hidden Markov model for downscaling synoptic atmospheric patterns to precipitation amounts. Clim Res 15:1–12
- Berg D, Aas K (2009) Models for construction of multivariate dependence—a comparison study. Eur J Financ 15(7):639–659
- Blanchet J, Davison AC (2011) Spatial modeling of extreme snow depth. Ann Appl Stat 5:1699–1725
- Borga M, Anagnostou EN, Blöschl G, Creutin JD (2011) Flash flood forecasting, warning and risk management: the HYDRATE project. Environ Sci Policy 14 (7):834–844. doi: 10.1016/j.envsci.2011.05.017. ISSN 1462-9011. http://www.sciencedirect.com/science/article/pii/S1462901111000943. Adapting to Climate Change: Reducing Water-related Risks in Europe
- Bortot P (2010) Tail dependence in bivariate skew-normal and skew-t distributions. Available online: www2.stat.unibo.it/bortot/ricerca/paper-sn-2.pdf
- Bouvier C, Cisneros L, Dominguez R, Laborde J-P, Lebel T (2003) Generating rainfall fields using principal components (PC) decomposition of the covariance matrix: a case study in Mexico city. J Hydrol 278(1):107–120
- Bouvier C, Ayral PA, Brunet P, Crespy A, Marchandise A, Martin C (2007) Recent advances in rainfall-runoff modelling: extrapolation to extreme floods in southern France. In: First international workshop on hydrological extremes. Observing and modelling exceptional floods and rainfalls, pp 229–238, Cosenza. FRIEND-AMHY
- Braud I, Ayral PA, Bouvier C, Branger F, Delrieu G, Le JC, Nord G, Vandervaere JP, Anquetin S, Adamovic M, Andrieu J, Batiot C, Boudevillain B, Brunet P, Carreau J, Confoland A, Didon-Lescot JF, Domergue JM, Douvinet J, Dramais G, Freydier R, Gérard S, Huza J, Leblois E, Le OB, Le RB, Marchand P, Martin P, Nottale L, Patris N, Renard B, Seidel JL, Taupin JD, Vannier O, Vincendon B, Wijbrans A (2014) Multi-scale hydrometeorological observation and modelling for flash flood understanding. Hydrol Earth Syst Sci 18 (9): 3733–3761. doi: 10.5194/hess-18-3733-2014. http://www.hydrol-earth-syst-sci.net/18/3733/2014/
- Carreau J, Bengio Y (2009) A hybrid Pareto model for asymmetric fat-tailed data: the univariate case. Extremes 12(1):53–76
- Carreau J, Vrac M (2011) Stochastic downscaling of precipitation with neural network conditional mixture models. Water Resour Res 47(10)
- Carreau J, Neppel L, Arnaud P, Cantet P (2013) Extreme rainfall analysis at ungauged sites in the South of France: comparison of three approaches. J de la Société Française de Stat 154(2):119–138
- Ceresetti D, Ursu E, Carreau J, Anquetin S, Creutin J-D, Gardes L, Girard S, Molinie G (2012) Evaluation of classical spatial-analysis schemes of extreme rainfall. Nat Hazards Earth Syst Sci 12:3229–3240
- Chandler RE, Wheater HS (2002) Analysis of rainfall variability using generalized linear models: a case study from the west of Ireland. Water Resour Res 38(10):10
- Cleveland WS (1981) LOWESS: a program for smoothing scatterplots by robust locally weighted regression. Am Stat 35:54
- Coles S (2001) An introduction to statistical modeling of extreme values. Springer series in statistics. Springer, New York
- Coles S, Heffernan J, Tawn J (1999) Dependence measures for extreme value analyses. Extremes 2(4):339–365
- Delrieu G, Nicol J, Yates E, Kirstetter P-E, Creutin J-D, Anquetin S, Obled C, Saulnier G-M, Ducrocq V, Gaume E, Payrastre O, Andrieu H, Ayral P-A, Bouvier C, Neppel L, Livet M, Lang M, du Châtelet JP, Walpersdorf A, Wobrock W (2005) The catastrophic flash-flood event of 8–9 september 2002 in the Gard region, France: a first case study for the Cévennes-Vivarais Mediterranean Hydrometeorological Observatory. J Hydrometeorol 6(1):34–52
- Demarta S, McNeil AJ (2005) The t copula and related copulas. Int Stat Rev 73(1):111–129
- Ducrocq V, Nuissier O, Ricard D, Lebeaupin C, Thouvenin T (2008) A numerical study of three catastrophic precipitating events over southern France. II: mesoscale triggering and stationarity factors. Q J R Meteorol Soc 134(630):131–145
- Dupuis DJ, Tawn JA (2001) Effects of mis-specification in bivariate extreme value problems. Extremes 4(4):315–330
- Dupuis DJ (2007) Using copulas in hydrology: benefits, cautions, and issues. J Hydrol Eng 12(4):381–393
- Embrechts P, McNeil A, Straumann D (2002) Correlation and dependence in risk management: properties and pitfalls. Risk Manag, pp 176–223
- Flecher C, Naveau P, Allard D, Brisson N (2010) A stochastic daily weather generator for skewed data. Water Resour Res 46(7)
- Fraley C, Raftery AE (1999) MCLUST: software for model-based cluster analysis. J Classif 16:297–306
- Garavaglia F, Gailhard J, Paquet E, Lang M, Garçon R, Bernardara P (2010) Introducing a rainfall compound distribution model based on weather patterns sub-sampling. Hydrol Earth Syst Sci Discuss 14:951
- Gardes L, Girard S (2010) Conditional extremes from heavy-tailed distributions: an application to the estimation of extreme rainfall return levels. Extremes 13(2):177–204
- Gaume E, Bain V, Bernardara P, Newinger O, Barbuc M, Bateman A, Blaškovičová L, Blöschl G, Borga M, Dumitrescu A et al (2009) A compilation of data on European flash floods. J Hydrol 367(1):70–78
- Genest C, Favre A-C (2007) Everything you always wanted to know about copula modeling but were afraid to ask. J Hydrol Eng 12(4):347–368
- Genest C, Huang W, Dufour J-M (2013) A regularized goodness-of-fit test for copulas. J de la Société Française de Statistique & revue de statistique appliquée 154(1):64–77
- Gilleland E, Katz RW (2011) New software to analyze how extremes change over time. EOS, Trans Am Geophys Union 92(2):13–14
- Gräler B (2014) Modelling skewed spatial random fields through the spatial vine copula. Spat Stat 10:87–102
- Guillot G, Lebel T (1999) Approximation of Sahelian rainfall fields with meta-gaussian random functions. Stoch Environ Res Risk Assess 13(1–2):113–130
- Hughes JP, Guttorp P, Charles SP (1999) A non-homogeneous hidden markov model for precipitation occurrence. Appl Stat 48:15–30
- Joe H (1997) Multivariate models and multivariate dependence concepts, vol 73. CRC Press, Boca Raton
- Kleiber W, Katz RW, Rajagopalan B (2012) Daily spatiotemporal precipitation simulation using latent and transformed Gaussian processes. Water Resour Res 48(1):1
- Kojadinovic I, Yan J (2010) Modeling multivariate distributions with continuous margins using the copula R package. J Stat Softw 34 (9): 1–20. http://www.jstatsoft.org/v34/i09/
- Kollo T, Selart A, Visk H (2013) From multivariate skewed distributions to copulas. In: Combinatorial matrix theory and generalized inverses of matrices. Springer, New York, pp 63–72
- Lebel T, Laborde JP (1988) A geostatistical approach for areal rainfall statistics assessment. Stoch Hydrol Hydraul 2(4):245–261
- Leblois E, Creutin JD (2013) Space-time simulation of intermittent rainfall with prescribed advection field: adaptation of the turning band method. Water Resour Res, pp n/a–n/a. ISSN 1944-7973. doi: 10.1002/wrcr.20190
- Lennartsson J, Baxevani A, Chen D (2008) Modelling precipitation in Sweden using multiple step markov chains and a composite model. J Hydrol 363(1):42–59
- Li C, Singh VP, Mishra AK (2012) Simulation of the entire range of daily precipitation using a hybrid probability distribution. Water Resour Res 48(3)
- Lobligeois F, Andréassian V, Perrin C, Tabary P, Loumagne C (2014) When does higher spatial resolution rainfall information improve streamflow simulation? An evaluation using 3620 flood events. Hydrol Earth Syst Sci 18 (2): 575–594. doi: 10.5194/hess-18-575-2014. https://hal.archives-ouvertes.fr/hal-00952657
- Neppel L, Pujol N, Sabatier R (2011) A multivariate regional test for detection of trends in extreme rainfall: the case of extreme daily rainfall in the French Mediterranean area. Adv Geosci 26(26):145–148
- Patil SD, Wigington Jr. PJ, Leibowitz SG, Sproles EA, Comeleo RL (2014) How does spatial variability of climate affect catchment streamflow predictions? J Hydrol 517(0): 135–145. ISSN 0022-1694. doi: 10.1016/j.jhydrol.2014.05.017. http://www.sciencedirect.com/science/article/pii/S0022169414003710
- Pickands J (1975) Statistical inference using extreme order statistics. Ann Stat 3:119–131
- Poon S-H, Rockinger M, Tawn J (2004) Extreme value dependence in financial markets: diagnostics, models, and financial implications. Rev Financ Stud 17(2):581–610
- Ripley BD (1996) Pattern recognition and neural networks. Cambridge University Press, Cambridge
- Sabourin A, Naveau P (2014) Bayesian Dirichlet mixture model for multivariate extremes: a re-parametrization. Comput Stat Data Anal 71:542–567
- Salvadori G, De Michele C (2010) Multivariate multiparameter extreme value models and return periods: a copula approach. Water Resour Res 46(10)
- Schoelzel C, Friederichs P (2008) Multivariate non-normally distributed random variables in climate research-introduction to the copula approach. Nonlinear Process Geophys 15(5):761–772
- Schwarz G (1978) Estimating the dimension of a model. Ann Stat 6:461–464
- Serinaldi F (2009) Copula-based mixed models for bivariate rainfall data: an empirical study in regression perspective. Stoch Environ Res Risk Assess 23(5):677–693
- Serinaldi F, Kilsby CG (2014) Simulating daily rainfall fields over large areas for collective risk estimation. J Hydrol 512:285–302
- Serinaldi F, Bárdossy A, Kilsby CG (2014) Upper tail dependence in rainfall extremes: would we know it if we saw it? Stoch Environ Res Risk Assess, pp 1–23
- Sklar M (1959) Fonctions de répartition à n dimensions et leurs marges. Université Paris 8
- Stephenson AG (2002) Evd: extreme value distributions. R News 2 (2): 0, June 2002. http://CRAN.R-project.org/doc/Rnews/
- Thibaud E, Mutzner R, Davison AC (2013) Threshold modeling of extreme spatial rainfall. Water Resour Res 49(8):4633–4644
- Thompson CS, Thomson PJ, Zheng X (2007) Fitting a multisite daily rainfall model to New Zealand data. J Hydrol 340:25–39
- Venables WN, Ripley BD (2002) Modern applied statistics with S, 4th edn. Springer, New York. http://www.stats.ox.ac.uk/pub/MASS4. ISBN 0-387-95457-0
- Vischel T, Lebel T, Massuel S, Cappelaere B (2009) Conditional simulation schemes of rain fields and their application to rainfall-runoff modeling studies in the Sahel. J Hydrol 375:273–286
- Vrac M, Naveau P, Drobinski P (2007) Modeling pairwise dependencies in precipitation intensities. Nonlinear Process Geophys 14(6):789–797
- Wilks DS (1998) Multisite generalization of a daily stochastic precipitation generation model. J Hydrol 210(1):178–191
- Zareifard H, Khaledi MJ (2013) Non-Gaussian modeling of spatial data using scale mixing of a unified skew Gaussian process. J Multivar Anal 114:16–28

## Copyright information

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.