## Abstract

In drought frequency analysis, as the number of drought variables increases, the joint behavior between these variables needs to be studied. Therefore, this study aims to develop a flexible four-variate joint distribution function of the regional stochastic nature of drought. Using run theory, drought duration, severity, peak, and inter-arrival time were abstracted from the Standardized Precipitation Evapotranspiration Index (SPEI) aggregated at six months, observed in mainland China between 1961 and 2013. As these drought variables showed significant dependence properties and followed different marginal distributions, we employed and compared six four-variate symmetric and asymmetric Archimedean copulas (i.e., Frank, Clayton, Gumbel–Hougaard). The best-fitting model for each region was carefully selected using RMSE, AIC, and BIAS goodness-of-fit tests. Results revealed that the empirical and theoretical probabilities of the symmetric Clayton in regions NE (Northeast), CS (Central and Southern China), EMC (Entire China), and symmetric Frank in regions NC (North China), SC (South China), IM (Inner Mongolia), NW (Northwest), TP (Tibet Plateau) agreed well. Symmetric Frank copula was considered the best-fit for station-based drought analysis in EMC. Based on these copulas, the drought probabilities and return periods for the occurrence of drought events over the next 5, 10, 20, 50, and 100 years in each region were hereby comprehensively explained, and the results shown here could be helpful in the appraisal of the adequacies of water supply systems under drought conditions in all regions. This study showed that a four-variate copula approach is a vital tool for probabilistic interpretation of hydrological and meteorological data in the different climatic region of mainland China.

### Similar content being viewed by others

Avoid common mistakes on your manuscript.

## 1 Introduction

Drought-affected areas have remarkably increased over the past 50 years in China as a result of variation in precipitation (*P*) and temperature (Yang et al. 2013; Leng et al. 2015; Chang et al. 2016). In fact, a noticeable extreme winter-spring droughts events occurred in southwest China from August 2009 to May 2010 (Lu et al. 2011; Lu et al*.* 2012; Zhang et al. 2012; Yang et al. 2013), while in 2011 the middle and lower reaches of the Yellow River were impacted by the spring-summer drought (Lu et al. 2013; Zhao et al. 2013; Chang et al. 2016). Droughts have caused greatest damages in China during 1949–1995, and many damages have resulted to economic losses more than the US $12 billion (Xu et al., 2015a; Qin et al. 2015; Chang et al. 2016). Consequently, droughts are of prominent concern in the outlining and control of water resources. Studying meteorological droughts is necessary by itself, and also because they act as antecedents to the longer-lasting and more significant agricultural and hydrological droughts (Haslinger et al., 2014, Wilhite et al., 2014).

The popular method of assessing drought risk requires the calculation of the probability that a specific value of drought variable will be exceeded, which is equal to the evaluation of the recurrence interval (Prohaska et al. 2008; Chen et al. 2012). This procedure is normally focused on the univariate frequency analysis. In this analysis, the authors have concentrated on some characteristic of drought and assume that the other variables are constant. A direct attempt to simplify the multi-dimensional character of droughts has been investigated by several authors (Shiau, 2006; Ganguli and Reddy, 2014; Huang et al., 2014a). Although the simplistic analysis produces practical results, it does not strictly represent the actual mathematical interactions between the various dimensions of the phenomenon and obviously cannot give accurate results. Since drought variables are usually reliant on one other, a proper frequency analysis of droughts should consider such dependencies within a suitable multivariate framework.

It should be mentioned that the attention of researchers in multivariate modeling has increased considerably in the last decades, due to the application of copulas. The comprehensive theoretical structure proposed by Sklar (1959) can be gathered from Nelsen (2006), Joe (1997) and Salvadori et al. (2007). There have been diverse cases of copula applications in the setting of drought management (Serinaldi et al., 2009; Shiau and Modarres, 2009; Kao and Govindaraju, 2010; Wong et al., 2010; Song and Singh, 2010; Mishra and Singh, 2010; Mirabbasi et al., 2012). This includes return-period estimation (Salvadori and De Michele, 2010), multivariate simulation (AghaKouchak et al. 2010), propose a new drought indicator (Kao and Govindaraju, 2010), and many other theoretical studies of multivariate severe issues (Salvadori and De Michele, 2015; Zhang et al. 2015a; She et al. 2016). There are several types of copula functions, which have been described (e.g. Nelsen, 1999). The regularly applied copula for hydrological analysis belongs to four classes which includes: Archimedean class (AMH, Frank, Clayton, Gumbel, and Joe), elliptical class (student t and normal), extreme value class (Gumbel, Galambos, Tawn, Husler-Reiss, and t-EV), and miscellaneous class (Farlie–Gumbel–Morgenstern and Plackett) (Shiau and Modarres, 2009; Mirabbasi et al. 2012; Lee et al. 2013; Xu et al., 2015b). The Archimedean class of copulas are characterized by their simple structure and strong representativeness (Huang et al., 2014a; Tsakiris et al., 2016).

There are many applications of two-variate copula-based drought studies in many countries. For instance in China, Shiau (2006) built a joint distribution between drought severity and duration in Southern Taiwan while Shiau and Modarres (2009) employed Clayton copula to model drought severity and duration frequency curves for two climatic regions in Iran. Tsakiris et al. (2016) analyzed drought severity and areal extent utilizing Gumbel-Hougaard copula in Greece. The Frank and Gumbel–Hougaard have been considered the best copulas for illustrating drought occurrence frequency in Canada and Iran, respectively (Lee et al. 2013). Meta-Gaussian copula was applied for the joint modeling of drought variables within Texas (Song and Singh, 2010) while drought duration and severity of Sharafkhaneh in the northwest of Iran was modeled using Galambos copula (Mirakbari et al., 2010). In each of the cases, some bivariate probabilistic properties of droughts were investigated.

Three-variate Archimedean copulas have been employed for the joint distribution of rainfall, drought and flood events (Grimaldi and Serinaldi, 2006; Zhang and Singh 2007). Using asymmetric Archimedean copulas, Serinaldi and Grimaldi (2007) described an inference procedure to carry out a three-variate frequency analysis. Ma et al. (2013) created a joint distribution of drought duration, severity, and peak using elliptical, asymmetric and symmetric Archimedean copulas. Because of the flexibility of meta-elliptical copula (Fan et al. 2016), many meta-elliptical, Clayton, Frank, Gumbel-Hougaard, and Ali-MikhailHaq copulas have been employed to create a joint distribution of drought severity, duration, and interval time, and the best copula was chosen (Song and Singh, 2010).Three-variate Plackett copulas have been adopted for the investigations of extreme rainfall cases (Kao and Govindaraju, 2008) and for modeling a joint distribution of drought duration, severity and inter-arrival time (Song and Singh, 2010). Student’s *t* copula has been utilized to identify three-variate drought events under the control of La-Nina, EL-Nino, and natural climate state in New South Wales, Australia (Wong et al. 2010). Other studies on drought frequency analysis using copula includes Huang et al. (2014a, b), Zhao et al. (2015), Zhang et al. (2015b), Zhang et al. (2017) among others.

Crucial hurdles remain not only in drought characterization but also in the interpretation of drought variables to information relevant for regional monitoring, early warning, and water resources planning. Assuming that the territorial unit affected by drought is the entire region, there are two variables which together characterize each drought event; the severity and the duration. If the peak and interarrival time of drought are considered as an additional variable, a four-dimensional copulas approach may be used in which duration, severity, peak and interarrival time of droughts can be jointly analyzed. Recently, the applications of four-variate copula have been reported. De Michele et al. (2007) proposed a procedure towards building four-variate distributions, stating two copulas for a separate two-variate candidate case and utilizing the approach to give a four-variate nature of sea state.

Employing a four-variate student copula, Serinaldi et al. (2009) built the corresponding joint distributions of drought length, mean, minimum SPI values, and drought mean areal extent to investigate their joint probabilistic characteristics. So far, most of the research focused on parametric two-variate and symmetric three-variate copulas (Ganguli and Reddy, 2014), but four-variate copulas are somewhat limited because their constructions are very complicated. Although there have been some attempts to construct multivariate extensions of two-variate Archimedean copula (Embrechts et al., 2003; Whelan, 2004; Savu and Trede, 2010), however such studies seldom compares the performances of four-variate symmetric and asymmetric copulas in different climatic regions of China, considering probability and return periods of drought events. Further, knowing that droughts are multiplex phenomena, two- and three-variate analysis cannot give an exhaustive appraisal of droughts because the investigation could lead to a deficient drought probabilistic estimation (De Michele et al. 2005; Grimaldi and Serinaldi, 2006; Chebana and Ouarada, 2011; Ganguli and Reddy, 2014; Xu et al., 2015a; Tsakiris et al. 2016).

Given the challenges as mentioned above in drought characterization, a focused research is required to comprehensively analyze and delineate drought trends and spatiotemporal patterns more flexibly and intuitively. It is, therefore, the objective of this paper to produce a methodology using the four-variate copulas approach for analyzing the multivariate drought frequency of droughts. In particular, we built a joint distribution utilizing drought duration (D*d*), severity (D*s*), peak (D*p*) and inter-arrival time (D*i*), based on four-variate symmetric (Clayton, Frank, and Gumbel) and asymmetric (Clayton, Frank, and Gumbel) Archimedean Copulas. We showed how it could be used (i) to observe the temporal evolution of a drought event, and (ii) to perform a real-time appraisal of drought in seven climatic regions of mainland China over 1961–2013. The risks of drought events have been described based on joint probability and return periods, which has become a conventional method for a risk-based plan of water resources structures.

The paper is prepared as follows. Section 2 describes details about the study area and data, Standardized Precipitation Evapotranspiration Index (SPEI) calculation and estimation of *Dd*, *Ds*, *Dp,* and D*i*. Section 3 introduces the four-variate symmetric and asymmetric Archimedean copula. Section 4 compares the performance of the copulas. Section 5 selects the best-fitted copula to build a joint distribution of *Dd*, *Ds*, *Dp,* and D*i* in seven climatic regions of China. Section 6 contains some concluding remarks.

## 2 Study Region and Data

Geographically, China lies between 15^{0}N - 60^{0}N and 75°E - 135°E. By its geographical extent, China is endowed with diverse landforms that include hills, mountains, high plateaus and deserts in the western reaches, while in the central and east areas, the land slopes into broad plains and deltas. From the higher elevations in the west, thousands of rivers drain the country eastwards; the most significant are the Yangtze (i.e., the longest river in China and third longest in the world), Heilong (Amur), Mekong, Pearl, and Yellow river. The climate of China fluctuates significantly from one region to another because of its tremendous space, complex topography as well as variation in monthly *P* and temperature (Zhai *et al.* 2010; Wu *et al.* 2011). The northern China and western are influenced by dry climate while humid and semi-humid predominates over eastern China (Wu *et al.* 2011).

Based on topography and climate, China is divided into seven climatic regions (Zhao, 1983; Ayantobo *et al.* 2017). Northeast humid/semi-humid warm region (NE, 72 stations), North China humid/semi-humid temperate zone (NC, 104 stations), Central and Southern China humid subtropical zone (CS, 165 stations), South China humid tropical zone (SC, 57 stations), Inner Mongolia steppe zone (IM, 44 stations), Northwest desert areas (NW, 61 stations), and Qinghai-Tibet Plateau (TP, 49 stations) (Fig. 1). Regions NE, NC, CS, SC are associated with Eastern Monsoon. NW is an arid region, and the replenishment of water in this region is mostly from melting glacial and perennial frozen soil, not from *P*.

The dataset employed in this research to appraise drought events are daily and monthly weather data from 552 national basic meteorological stations in mainland China from 1961 to 2013. These 53-year records were obtained from the China Meteorological Data Sharing Network. From Fig. 1, the meteorological stations are not uniformly spread as more are situated within the east than western regions particularly the Qinghai-Tibet Plateau. The non-parametric tests were used to cross-check the reliability and quality of the climatic data. According to Helsel and Hirsch (1992), Kendall autocorrelation test, Mann-Kendall trend test, and Mann-Whitney homogeneity tests for mean and variance were employed to test randomness, homogeneity, and absence of trends. The results showed that randomness and stationarity of the weather data were fixed between the critical points (5% statistical significance level).

## 3 Methods

### 3.1 Drought Indices and Drought Identification

Several indices have been suggested to identify and observe droughts, and there is a detailed article on modeling drought events utilizing these indices (Mishra and Singh, 2010; Zargar et al. 2011). Indices based exclusively on precipitation, do not consider the complexity of land surface processes and cannot directly account for the impacts of evapotranspiration on soil moisture. This may be a particularly serious disadvantage under warming conditions or other changes in regional hydrology. Based on these considerations, since SPEI have been widely applied in various copula-based drought-related studies in China (Liu et al. 2015; Xu et al., 2015b; Wang et al. 2015; Zhang et al. 2015b; Hao et al. 2017), it is therefore employed for this study. The SPEI is a multi-scalar drought index, recommended as an enhancement of the SPI (Vicente-Serrano et al. 2010). Firstly, in SPEI estimation, the simple water balance is calculated, which is described as a deviation in *P* and potential evapotranspiration (PET) for i^{th} month as follows:

PET is estimated according to the FAO-56 Penman-Monteith equation (Allen et al. 1998). The World Meteorological Organization commonly approves this approach, and the calculation is based on the minimum and maximum air temperature, wind speed, and sunshine duration. The D_{i} values are synthesized in each time scale as:

where k is the time scales of synthesis, and n is the month used for calculation.

Next, D_{i} is fitted with the three-parameter log-logistic distribution. Finally, the SPEI is obtained as standardized values, and details of the estimation can be found in Vicente-Serrano et al. (2010). The mean value of the SPEI is 0, and the standard deviation is 1. In this study, SPEI at the 6-month timescale, denoted by SPEI-6 is estimated for each station and seven regions. SPEI-6 is most useful for describing the shallow soil moisture available to crops (Reddy and Singh, 2014; Abdi et al. 2016).

Using the theory of runs (Fig. 2), drought identification is achieved by considering drought years as a continuous period where SPEI-6 is below 0 (Shiau, 2006). Next, significant drought variables which include D*d*, D*s*, D*p,* and D*i*, are extracted. The definitions of D*d*, D*s*, and D*p*, can be found in Shiau (2006), Mishra and Singh (2010) and Ayantobo et al. (2017) while D*i* is described as the interval between the commencement of a drought to the start of the subsequent drought (Song and Singh, 2010). Water manager needs to be aware of the risk of having drought scenarios of different D*d*, D*s*, D*p*, and D*i* during drought scenarios. Therefore, a four-variate distribution must be constructed using copula functions.

### 3.2 Marginal Distribution of Drought Variable

The univariate distribution forms the framework of four-variate analysis using copulas. The marginal distributions for D*d*, D*s*, and D*p* for different climatic regions across mainland China have been documented (Ayantobo et al. 2017). In the case of D*i*, ten marginal distributions which include exponential, two-parameter exponential, Weibull, three-parameter Weibull, generalized extreme value, inversion Gaussian, three-parameter inversion Gaussian, generalized Pareto, gamma, and three-parameter gamma are compared to choose the best distribution to fit D*i*.

The Kolmogorov–Smirnov (KS) and Anderson-darling (AD) tests are employed as a goodness-of-fit test, and the principle that the threshold value should be as small as possible to preserve the largest sample is employed (Shiau and Modarres, 2009; Song and Singh, 2010; Abdi et al. 2016). The fitted distributions of D*i* are compared against the empirical non-exceedance probabilities estimated utilizing Gringorten’s plotting position function as follows (Gringorten, 1963; Cunnane, 1978; Song and Singh, 2010):

Here, *n* is sample size, N_{m} is the amount of *x*_{i} regarded as *x*_{j}≤*x*_{i}, *i* = 1, …, *n*, 1≤*j*≤*i*. Utilizing the observed data (*x*), the marginal parameters are calculated employing the maximum likelihood estimation.

### 3.3 Empirical Four-Variate Distribution of Drought Variables

The empirical four-variate joint probability distribution of *Dd*, *Ds*, *Dp*, and *Di* can be modified from Song and Singh (2010):

Here, n is size of the sample. *N*_{mnlp} is amount of (*x*_{i}, *y*_{i}, *z*_{i}, *k*_{i}) regarded as *x*_{j} ≤ *x*_{i}, *y*_{j} ≤ *y*_{i}, *z*_{j} ≤ *z*_{i}, and *k*_{j} ≤ *k*_{i}, *i* = 1, …, *n*, 1≤*j*≤*i*.

### 3.4 Joint Cumulative Probability Distribution of Drought Variables

Assuming four correlated random variables of *Dd*, *Ds*, *Dp*, and *Di* are represented using X_{1}, X_{2}, X_{3}, and X_{4}; and \( {u}_1={F}_{x_1}\left({x}_1\right) \), \( {u}_2={F}_{x_2}\left({x}_2\right) \), \( {u}_3={F}_{x_3}\left({x}_3\right) \), and \( {u}_4={F}_{x_4}\left({x}_4\right) \) denote their cumulative distribution function (CDF), respectively. Therefore, their joint CDF could be represented as (Song and Singh, 2010):

### 3.5 Modeling Four-Variate Drought Variables Using Copulas

For an explicit hydrological application of copulas, readers are referred to Joe (1997), Nelsen (2006), Genest and Favre (2007) and Salvadori et al. (2007). By way of definition, copulas are functions that combines marginal CDF’s to form multivariate CDF. This function allows one to characterize dependence properties more flexibly by separating effects of dependence from effects of margins. Assuming *F*_{1, 2, …. , n}(*x*_{1}, *x*_{2}, …., *x*_{n}) is many variables CDF of *n* associated arbitrary variables of X_{1}, X_{2}, . . . ., X_{n} with the corresponding candidate CDF *F*_{1}(*x*_{1}), *F*_{2}(*x*_{2}), …. , *F*_{n}(*x*_{n}), next the n-dimensional CDF including univariate distributions of *F*_{1}(*x*_{1}), *F*_{2}(*x*_{2}), …. , *F*_{n}(*x*_{n}) could be written as shown below (Sklar 1959; Nelsen, 2006; Ganguli and Reddy 2014):

Here, *C* is a *d*-dimensional copula in the form: [0, 1] d → [0, 1], with association parameter ** θ**;

*F*

_{k}(

*x*

_{k}) =

*u*

_{k}for

*k*= 1, . . .,

*n*;

*U*

_{k}~

*U*(0, 1).

Archimedean copulas have been utilized because the functions can be conveniently built, and can capture a wide range of dependence structures with many satisfactory properties. According to Joe (1997), Nelsen (1999), Embrechts et al. (2003), Whelan (2004), and Grimaldi and Serinaldi (2006) the generalization of *n*-dimensional Archimedean copula could be represented as (Chen et al. 2012):

Here, φ is the copula generator. As observed, \( {\varphi}_1,\dots, {\varphi}_{n^{-1}}=\varphi, \) Eq. (7) becomes Archimedean *n*-copula and can be rewritten as (Chen et al. 2012):

Many different parametric families of copulas are commonly used in dependence analysis. The two most frequently used are elliptical copulas and the Archimedean copulas. In this study, we consider three families of four-variate symmetric and asymmetric Archimedean copulas, paying particular attention to Clayton, Frank, and Gumbel (Table 1), because they are simple and easy to generate (Nelsen, 2006; Chen et al. 2012). These copulas are compared to ascertain the excellent copula for modeling drought variables across various climatic regions of mainland China.

### 3.6 Copula Parameters Estimation

The parameters of Clayton, Frank and Gumbel copulas are calculated by employing the curve fitting method (CFM). The CFM is the minimum standards of squared residuals and determined as follows (Ganguli and Reddy, 2014):

Where *n,* x_{c}(*i*)*,* x_{0}(*i*)are sample size, i^{th} calculated value, i^{th} observed value respectively.

### 3.7 Selection of Appropriate Copula Function

The goodness-of-fit assessment indices (i.e., Root mean square error (RMSE), Akaike information criterion (AIC) and BIAS) are employed to select the best-fitted copula function (Akaike, 1974; Zhang and Singh, 2006, 2007; Genest and Favre, 2007).

Where E [** ∘**] = Mean square error (MSE) is the expectation operator;

*n,*x

_{c}(

*i*)

*,*x

_{0}(

*i*)are sample size, i

^{th}calculated value, i

^{th}observed value respectively and k is the number of parameters utilized in getting the computed value. The copula with least values of AIC and RMSE is selected as the best-fitted fuction.

Also, using the observed data (*u*), the empirical copula,*C*_{n}(*u*), and fitted parametric copula, *C*_{p}(*u*), are compared to select the best-fitted function. Constructing*C*_{n}(*u*) requires: (1) ranking *Dd*, *Ds*, D*p* and *Di* variables considering their pseudo-observations (*U*_{i,1}, *U*_{i,2}, *U*_{i,3}, *U*_{i,4}), (2) the empirical CDFs are constructed based on the ranks and (3) the empirical four-variate copula are calculated using the empirical CDFs (Nelsen, 2006; Genest and Favre, 2007):

Where I (A) is the pointer variable of function A which gives a number of 1 if A is valid and 0 if A is invalid. Ranks of i^{th} *Dd*, *Ds*, D*p* and *Di* value are given as *R*_{i, 1}, *R*_{i, 2}, *R*_{i, 3} and *R*_{i, 4} respectively (Mirabbasi et al., 2012). The Quantile-Quantile (Q-Q) plot is then used to compare the closeness between *C*_{n}(*u*) and *C*_{p}(*u*). graphically.

### 3.8 Four-Variate Joint Drought Frequency Analysis

According to Salvadori and De Michele (2004), in a multivariate situation such that X_{1}, X_{2},. . .,X_{d} exceeds their corresponding thresholds (*X*_{1} *> x*_{1}, *. . .*, *X*_{d} *> x*_{d}), the joint probability and return period of all drought events can be computed. In this study, the four-variate joint probabilities, *P*_{DSPI}, of (*Dd* ≥ *d*, *Ds* ≥ *s*, *Dp* ≥ *p*, *Di* ≥ *i*) can be calculated utilizing the copula-based procedure modified from Shiau (2006), Ganguli and Reddy (2014):

Analogous to return periods, the four-variate joint return periods, *T*_{DSPI}, of (*Dd* ≥ *d*, *Ds* ≥ *s*, *Dp* ≥ *p*, *Di* ≥ *i*) can be estimated as follows:

In the equations above, *ζ* = *N/q*, *N* = period of SPEI time-series (years), *q* = amount of drought events in *N* years. *C*_{DSPI}(*d*, *s*, *p*, *i*)= four-variate joint distributions, *P*_{DSPI} and *T*_{DSPI}, expresses the joint probability and return periods of occurrence of *Dd* or *Ds* or *Dp* or *Di* exceeding a particular value of *d*, *s*, *p* and *i*, respectively.

## 4 Results and Discussions

### 4.1 Drought Properties and Joint Dependence

The statistical description of D*i* extracted from SPEI-6 time series are shown in Table 2. The average regional values of D*i* across NE, NC, CS, SC, IM, NW, TP, and EMC during 1961–2013 were 8.9, 11.7, 12.4, 10.1, 10.1, 14.7, 9.9, and 12.1 months, respectively. Further, the qualitative joint dependence of *Di* against D*i*, *Ds*, and *Dp* was examined by employing a graphical tool in the form of a scatter plot matrix (Salvadori et al. 2007; Genest and Favre, 2007). According to Serinaldi et al. (2009), the scatter plot helps to synthesize information on dependence structure and marginals. The joint dependence between D*i*, D*s* and D*p* have been studied (Ayantobo et al. 2017) and would not repeat again here. The joint dependence of *Di* against D*i*, *Ds*, and *Dp* in regions NE, NC, CS, SC, IM, NW, TP, and EMC are shown in Fig. 3. Figure 3 gave a visual judgment on the pair-wise dependence between drought variables (Genest and Favre, 2007). Although some regions showed the collection of points in the lower left corner, an uphill pattern relationship was observed from left to right. For each pair of variables, a growing positive linear relationship existed.

The Kendall’s k, Spearman’s p, and Pearson’s classical correlation coefficients r are applied to test the strength of the relationship of drought variables, and their respective values in regions NE, NC, CS, SC, IM, NW, TP and EMC are shown (Table 3). Positive and significant associations between drought variables were observed. The correlations between *Dd* and *Ds* as well as *Ds* and *Dp* were high while the correlations between *Dd* and *Dp* were small. The dependence analysis demonstrated it in Fig. 3 and Table 3 that the mutual dependence of drought variables are tremendous and hence suitable for building a regional joint distribution using copula functions.

### 4.2 Marginal Distribution for Inter-Arrival Time

Different marginal distributions have been recommended as a uniform procedure to fit *Dd*, *Ds*, and *Dp* (Shiau and Modarres, 2009; Song and Singh, 2010; Wong et al. 2010; Lee et al. 2013). The marginal distributions and associated parameters for *Dd*, *Ds*, and *Dp* in different regions of mainland China have been documented (Ayantobo et al. 2017). These distributions showed that the empirical and theoretical distributions had an excellent agreement. In the case of D*i*, many marginal distributions were also compared and the goodness of fit evaluated. According to Table 4, the KS and AD values were compared for each distribution, and their respective ranks in each region are shown. It was demonstrated that the KS was better than the AD test. Therefore, based on the KS test, the generalized Pareto was the best for regions NE, NC, CS, SC, NW, TP and EMC while gamma was the best for region IM because they had the smallest values. As given in Table 5, the parameters of the distributions were calculated using the MLE and after that used to fit the drought data.

### 4.3 Copula-Based Four-Variate Joint Distributions

Table 6 presents the copula parameters, AIC, RMSE and BIAS values in regions NE, NC, CS, SC, IM, NW, TP and EMC used to compare the performances of the symmetric and asymmetric copulas. It could be observed that the copula parameters are positive; connoting that four-variate dependence existed between variables of *Dd*, *Ds*, *Dp*, and D*i*. Generally, it was shown that symmetric copulas fit better than the asymmetric copulas because they had the lowest RMSE and AIC values in most regions. In the symmetric class, the RMSE and AIC values of Clayton and Frank were more or less the same but gave different fittings in the different region. Clayton copula was best for regions NE, CS and EMC while Frank copula was the best for NC, SC, IM, NW, and TP because they had the smallest values of RMSE and AIC.

The observed and theoretical probabilities of the symmetric and asymmetric copula functions are compared in Fig. 4. The figure showed that the theoretical probabilities for the Clayton in regions NE, CS, EMC, and Frank copula in NC, SC, IM, NW, TP fit the observed values very well.

Further, the theoretical joint probabilities of the occurrence orders of the observed values were evaluated. As shown in Fig. 5, the theoretical joint probabilities were sketched against empirical probabilities, which showed that there were significant differences between empirical and theoretical joint probabilities. The plots of all the copulas deviate from the 45^{0} diagonal line uniquely. The maximum deviations from the diagonal line were observed for Gumbel plots, showing that the functions might not be suitable for building joint dependence. The symmetric Clayton in regions NE, CS and EMC, and symmetric Frank in regions NC, SC, IM, NW, and TP exhibited good agreement between the theoretical and empirical probabilities, and seemed very satisfying in modeling drought variables. Therefore, these copulas were thereafter selected for regional drought frequency analysis.

In the same way, the selected copulas were compared in each station, and the RMSE and AIC goodness of fit test results are presented in Fig. 6. It could be observed that the average AIC (Fig. 6a) and RMSE values (Fig. 6b) of all the 552 stations for symmetric Frank copula were smaller than that of Clayton and Gumbel copulas. Therefore, the four-variate symmetric Frank copula was selected for the station-based drought frequency analysis across EMC.

### 4.4 Four-Variate Joint Drought Frequency Analysis

#### 4.4.1 Regional Four-Variate Probabilities of Drought Events

The average historical drought events for regions NE, NC, CS, SC, IM, NW, TP, and EMCas well as their corresponding 5-, 10-, 20-, 50- and 100-years univariate return periods defined by separate *d*, *s*, *p* and *i* were estimated and presented in Fig. 7. Next, the *P*_{DSPI}in each of these regions were obtained using Eq. (14). Fig. 7 shows the *P*_{DSPI}, representing the historical, 5-, 10-, 20-, 50- and 100-year return periods to provide adequate information about the potential drought risk associated with D*d*, D*s*, D*p* and D*i* across different regions. Overall, as the return period increased, the trend of the joint probability decreased. Hence, the simultaneous existence of higher degree drought events in different regions was less frequent as the year increases. For example, in sub-region I, considering 20- years univariate return period, the values of *d*, *s*, *p*, and *i*, were 13.56 months, 17.46, 2.44 and 20.73 months, respectively. The *P*_{DSPI}(*Dd* ≥ 13.56, *Ds* ≥ 17.46, *Dp* ≥ 2.44, *Di* ≥ 20.73)was 0.124.

These results could provide useful hints in appraising the drought risk in different regions. For instance, regions CS and NE had the highest probability of about 0.57, indicating that drought management and planning are needed within these regions. It was shown that regions SC and IM had a low probability of 0.48, which meant a low combination risk. The joint probability not only confirmed the occurrence of regional drought events but also gave a quantitative approach to analyze the probability of drought under varying *d*, *s*, *p* and *i* situations.

#### 4.4.2 Regional Four-Variate Return Period of Drought Events

The *ζ* values were 0.78, 1.02, 1.04, 0.84, 0.84, 1.23, 0.83 and 1.00 months for regions NE, NC, CS, SC, IM, NW, TP and EMC, respectively. Using the average historical and univariate return periods of *d*, *s*, *p* and *i* estimated for 5-, 10-, 20-, 50- and 100-year, the *T*_{DSPI} were obtained for each region. Figure 7 shows the *T*_{DSPI} for the historical, 5-, 10-, 20-, 50- and 100-year return periods. The graph showed that the *T*_{DSPI} trends increased with the year.

For example, in region NE, considering 20- years univariate return period, the values of *d*, *s*, *p*, and *i*, were 13.56 months, 17.46, 2.44 and 20.73 months, respectively. The *T*_{DSPI}, (*Dd* ≥ 13.56, *Ds* ≥ 17.46, *Dp* ≥ 2.44, *Di* ≥ 20.73)was 6.27 years. The*T*_{DSPI}were lowest in regions NE and CS. This implied that drought frequency would be higher in these regions compared to the other regions. The return period of drought variables obtained from univariate frequency analysis is higher than those by joint distribution for the AND case (*T*_{DSPI}). This showed that the univariate analysis does not furnish satisfactory knowledge about the drought risks associated with the four-variates. If engineers design hydraulic structures based on the results from univariate frequency analysis, the drought variables may be overestimated and this will lead to an increased cost of the structure. These results could be valuable in risk evaluation of water resources operations under severe and extreme drought situations.

The *P*_{DSPI} and *T*_{DSPI} showed that the result computed agreed with the actual data and confirmed that the selected copula functions for each region fitted the data well. Figure 7 further explained that for a particular recurrence interval, the *P*_{DSPI} showed a decreased trend while the *T*_{DSPI} showed an increased trend. For example, given the occurrence of *d*, *s*, *p*, and *i* in region NE, the estimated *P*_{DSPI} of 5- and 100-year plan drought are 0.332 and 0.029, respectively. Considering these illustrations, the calculated results seemed feasible. Therefore, from Fig. 7, the *P*_{DSPI} and *T*_{DSPI} of any other drought events in regions NE, NC, CS, SC, IM, NW, TP, and EMC can be obtained directly through interpolation.

#### 4.4.3 Spatial Distribution of Four-Variate Probability and Return Periods

The average *d*, *s*, *p*, and *i* for EMC were 5.81 months, 4.96, 1.15 and 11.1 months respectively. Similar to the regional drought analysis, the station-based *P*_{DSPI} and *T*_{DSPI}of drought events exceeding these specific values, (*i*. *e*., *Dd* ≥ 5.81, *Ds* ≥ 4.96, *Dp* ≥ 1.15, *Di* ≥ 11.1)were estimated for each station using the symmetric Frank copula. The spatial distribution of the*P*_{DSPI} and *T*_{DSPI} are mapped in Fig. 8.

The *P*_{DSPI}varied from 0.34 to 0.75. In this case, most of northern China experienced higher probability spreading from the northwest to northeast China. Notably, it seems that the *P*_{DSPI} were very high in NW and some stations in NC and TP. Based on the spatial pattern of drought events, the drought risk resulted from these regions are consistent. On the other hand, the results for southern China including SC and CS were slightly different as relatively lower *P*_{DSPI} was observed. Accordingly, counter measures of drought hazards need to be set for proactive actions especially in regions that had greater drought risks.

The *T*_{DSPI} ranged from 1.02 to 6.79 years. Long *T*_{DSPI}were found around NC while short *T*_{DSPI} dominated parts of CS, NW, and TP. Medium*T*_{DSPI}were found in eastern China, most especially NE, NC, CS. The spatial pattern of *P*_{DSPI} and *T*_{DSPI} suggested tremendous variations within different regions. In most cases, regions with high *P*_{DSPI}are often associated with short *T*_{DSPI} and vice versa. Severe droughts are most liable to eventuate in most of the northwestern and southwestern regions because of short *T*_{DSPI}. Also, because these areas are commercially advanced with huge populations, severe drought events could place grave danger on water resources in these regions. This requisite information is needed by water managers and other government agencies for adequate planning as well as management of water resources under severe drought situations. In this study, four-variate distribution was determined and successfully used.

## 5 Summary and Conclusions

Most meteorological phenomena are multiplicative, and the advantages of using a multivariate assessment strategy are evident in this study. In particular, drought events are characterized by *Dd*, *Ds*, *Dp,* and D*i* and the joint drought risk, which reveals their probability at the same time, is vital for management and planning of water resources under drought situations. Using the symmetric and asymmetric Archimedean copula functions, annual droughts were studied as four-dimensional phenomena incorporating these variables to build the joint four-variate distributions. Next, the *P*_{DSPI} for different *T*_{DSPI} in each region and EMC were then estimated.

The foremost outcomes of this research were summed as follows: (1) For D*i*, the generalized Pareto was the best distribution for regions NE, NC, CS, SC, NW, TP, and EMC while gamma was the best distribution for region IM. (2) The RMSE and AIC values were utilized to choose the suitable copula. We employed Clayton copula in regions NE, CS, and EMC, and Frank copula in regions NC, SC, IM, NW, and TP. Symmetric Frank copula was used for the station-based drought analysis. The selected copula functions provided the best fit for the dependence structures of *Dd*, *Ds*, *Dp,* and D*i*, and consequently, it was adopted for the joint risk drought analysis. (3) The risk of drought event is determined based on *P*_{DSPI} and *T*_{DSPI}, which furnished vital information for drought analysis. For instance, by using only the univariate information provided by either *Dd*, *Ds*, *Dp,* and D*i*, may yield under or overestimates of the actual drought state, and, in turn, of the corresponding risk. (4) Analyzing the probabilities and return periods of four-variate drought events, the trends or changes in drought events were then assessed over time. The concept of return period used in this study is a familiar concept in the community of water resources professionals. However, for a deep understanding of drought event, the ultimate evaluation was based on the hazard which the event possesses, and the vulnerability of this region to the drought event. Therefore, based on the results of our analysis, it is reasonable to increase drought control concerns especially in regions with very high *P*_{DSPI} and low *T*_{DSPI}. As a recommendation, a drought return period of below ten years seems to indicate that the drought is entering an alert or a dangerous state.

This study added to a better valuable knowledge in the sphere of disaster management, especially concerning the appraisal of drought issues and the performance of full drought-risk investigations. Our insights on the real-time evaluation of meteorological droughts of different regions may provide valuable information about the possible evolution of drought episode and also help the water manager to plan effective mitigation strategies. As a conclusion, in this paper, a multivariate frequency analysis of meteorological droughts in seven climatic regions is addressed using Copulas. Such an approach is flexible, comprehensive and offered other several advantages over previous definitions of multivariate frequency analysis (Salvadori et al., 2015). The copula-derived drought analysis considers complete interdependencies between the drought variables over a geographical region. Finally, in this paper, no consideration has been included for the climate change influence on the above analysis although climate change will affect the estimation of the return period of droughts. For this aspect, rigorous and systematic attempts should be made for estimating the anticipated changes in the meteorological parameters which directly or indirectly affect the *Dd*, *Ds*, *Dp,* and D*i* and ultimately the frequency of drought events.

## References

Abdi A, Hassanzadeh Y, Talatahari S, Fakheri-Fard A, Mirabbasi R 2016 Regional bivariate modeling of droughts using L-components and copulas. Stoch. Environ. Res. Risk. Assess

AghaKouchak A, Bardossy A, Habib E (2010) Conditional simulation of remotely sensed rainfall fields using a non-Gaussian transformed copula. Adv Water Resour 33(6):624–634

Akaike H 1974 A new look at the statistical model identification. IEEE Trans Autom Control AC 19(6):716–723

Allen RG, Pereira LS, Raes D, Smith M 1998 Crop evapotranspiration: guidelines for computing crop requirements, irrigation, and drainage paper 56. FAO, Roma

Ayantobo OO, Li Y, Song S, Yao N (2017) Spatial comparability of drought characteristics and related return periods in mainland China over 1961-2013. J Hydrol 550:549–567

Begueria S, Vicente-Serrano SM, Reig F, Latorre B (2013) Standardized precipitation evapotranspiration index (SPEI) revisited: parameter fitting, evapotranspiration models, tools, datasets and drought monitoring. Int J Climatol 34:3001–3023

Chang J, Li Y, Wang Y, Yuan M (2016) Copula-based drought risk assessment combined with an integrated index in the Wei River Basin, China. J Hydrol 540:824–834

Chebana F, Ouarada T (2011) Multivariate quantiles in hydrological frequency analysis. Environmetrics 22(1):63–78

Chen L, Singh VP, Shenglian G, Hao Z, Li T 2012 Flood coincidence Risk Analysis using Multivariate Copula Functions. Journal of Hydrologic Engineering 17, 6

Cunnane C (1978) Unbiased plotting positions—a review. J Hydrol 37(3–4):205–222

De Michele C, Salvadori G, Canossi M, Petaccia A, Rosso R (2005) Bivariate statistical approach to check adequacy of dam spillway. J Hydrol Eng 10(1):50–57

De Michele C, Salvadori G, Passni G, Vezzoli R (2007) A multivariate model of sea storms using copulas. Coast Eng 54(10):734–751

Embrechts P, Lindskog F, McNeil A (2003) Modeling dependence with copulas and applications to risk management. In: Handbook of Heavy Tailed Distributions in Finance. Elsevier Sci, New York, pp 329–384

Fan, L., Qian, Z., 2016. Probabilistic modelling of flood events using the entropy copula in China. Advances in Water Resources. 97, 233-240

Ganguli P, Reddy MJ (2014) Evaluation of trends and multivariate frequency analysis of droughts in three meteorological subdivisions of western India. Int J Climatol 34(3):911–928

Genest C, Favre AC (2007) Everything you always wanted to know about copula modeling but were afraid to ask. J Hydrol Eng 12(4):347–368

Grimaldi S, Serinaldi F (2006) Asymmetric copula in multivariate flood frequency analysis. Adv Water Resour 29(8):1115–1167

Gringorten II (1963) A plotting rule for extreme probability paper. J Geophys Res 68(3):813–814

Haslinger K, Koffler D, Schöner W, Laaha G (2014) Exploring the link between meteorological drought and streamflow: effects of climate-catchment interaction. Water Resour Res 50:2468–2487

Hao C, Zhang J, Yao F (2017) Multivariate drought frequency estimation using copula method in Southwest China. Theor Appl Climatol. 127:977

Helsel DR, Hirsch RM (1992) Statistical Methods in Water Resources. Elsevier, Amsterdam

Huang SZ, Chang JX, Huang Q, Chen YY et al (2014b) Spatio-temporal changes and frequency analysis of drought in the Wei River Basin, China. Water Resour Manag 28(10):3095–3110

Huang SZ, Hou BB, Chang JX, Huang Q, Chen YT (2014a) Copulas-based probabilistic characterization of the combination of dry and wet conditions in the Guanzhong Plain, China. J Hydrol 519:3204–3213

Joe H (1997) Multivariate models and dependence concepts. Chapman and Hall, New York

Kao SC, Govindaraju RS (2010) A copula-based joint deficit index for droughts. J Hydrol 380:121–134

Kao SC, Govindaraju RS (2008) Trivariate statistical analysis of extreme rainfall events via the Plackett family of copulas. Water Resour Res 44(2):W02415

Lee T, Modarres R, Quarda TBMJ (2013) Data-based analysis of bivariate copula tail dependence for drought duration and severity. Hydrol Process 27(10):1454–1463

Leng GY, Tang QH, Rayburgc S (2015) Climate change impacts on meteorological, agricultural and hydrological droughts in China. Glob Planet Chang 126:23–34

Liu XF, Wang SX, Zhou Y, Wang FT, Li WJ, Liu WL 2015 Regionalization and Spatiotemporal Variation of Drought in China Based on Standardized Precipitation Evapotranspiration Index (1961–2013). Advances in Meteorology, 2015, 18 pgs

Lu E, Cai WY, Jiang ZH, Zhang Q, Zhang CJ, Higgins RW, Halpert MS (2013) The day-to-day monitoring of the 2011 severe drought in China. Clim Dyn 43(1–2):1–9

Lu E, Luo YL, Zhang RH, Wu QX, Liu LP 2011 Regional atmospheric anomalies responsible for the 2009–2010 severe drought in China. J. Geophys.- Atmos. 116

Lu J, Ju J, Ren J, Gan W (2012) The influence of the Madden-Julian Oscillation activity anomalies on Yunnan’s extreme drought of 2009–2010. Sci China Earth Sci 55:98–112

Ma M, Song S, Ren L, Jiang S, Song J (2013) Multivariate drought characteristics using trivariate Gaussian and Student t copulas. Hydrol Process 27(8):1175–1190

Mirabbasi R, Fakheri-Fard A, Dinpashoh Y (2012) Bivariate drought frequency analysis using the copula method. TheorAppl Climatol 108:191–206

Mirakbari M, Ganji A, Fallah SR (2010) Regional bivariate frequency analysis of meteorological droughts. J Hydrol Eng 15(12):985–1000

Mishra AK, Singh VP (2010) A review of drought concepts. J Hydrol 391(1–2):202–216

Nelsen RB (2006) An introduction to copulas. Springer, New York

Nelsen, R. B., (1999). An introduction to copulas: Springer Verlag, New York, 215p

Prohaska S, llic A, Majkic B 2008 Multiple-coincidence of flood waves on the main river and its tributaries. IOP Conf. Series: Earth and Environmental Science, IOP Publishing, Bristol, UK

Qin Y, Yang DW, Lei HM, Xu K, Xu XY (2015) Comparative analysis of drought based on precipitation and soil moisture indices in Haihe basin of North China during the period of 1960–2010. J Hydrol 526:55–67

Reddy MJ, Singh VP (2014) Multivariate modeling of droughts using copulas and meta-heuristic method. Stoch Env Res Risk A 28(3):475–489

Salvadori G, De Michele C (2004) Frequency analysis via copulas: theoretical aspects and applications to hydrological events. Water Resour Res 40:W12511

Salvadori G, De Michele C (2010) Multivariate multiparameter extreme value models and return periods: a copula approach. Water Resour Res 46:W10501

Salvadori G, De Michele C (2015) Multivariate real-time assessment of droughts via copula-based multi-site Hazard Trajectories and Fans. J Hydrol 526(SI):101–115

Salvadori G, De Michele C, Kottegoda NT, Rosso R (2007) Extremes in nature. An approach using copulas. Springer, Dordrecht

Savu C, Trede M (2010) Hierarchies of Archimedean copulas. Quant Finan 10(3):295–304

Serinaldi F, Bonaccoroso B, Cancelliere A, Grimaldi S (2009) Probabilistic characterization of drought properties through copulas. Phys Chem Earth Parts A/B/C 34(10–12):596–605

Serinaldi F, Grimaldi S (2007) Fully nested 3-copula: Procedure and application on hydrological data. J Hydrol Eng 12(4):420–430

She DX, Mishra AK, Xia J, Zhang LP, Zhang X (2016) Wet and dry spell analysis using copulas. Int J Climatol 36(1):476–491

Shiau JT (2006) Fitting drought duration and severity with two-dimensional copulas. Water Resour Manag 20:795–815

Shiau JT, Modarres R (2009) The copula-based drought severity-duration frequency analysis in Iran. J Appl Meteorol 16(4):481–489

Sklar A (1959) Fonctions de répartition à n dimensions et leursmarges. Publ Inst Statist Univ Paris 8:229–231

Song S, Singh VP (2010) Meta-elliptical copulas for drought frequency analysis of periodic hydrologic data. Stoch Env Res Risk A 24:425–444

Tsakiris G, Kordalis N, Tigkas D, Tsakiris V, Vangelis H (2016) Analysing drought severity and areal extent by 2D Archimedean copulas. Water Resour Manag 30:1–13

Vicente-Serrano SM, Begueria S, Lopez-Moreno JI (2010) A multiscalar drought index sensitive to global warming: the standardized precipitation evapotranspiration index. J Clim 23(7):1696–1718

Wang H, Chen Y, Pan Y (2015) Characteristics of drought in the arid region of northwestern China. Clim Res 62:99–113

Whelan N (2004) Sampling from Archimedean copulas. Quant Finan 4(3):339–352

Wilhite DA, Sivakumar MV, Pulwarty R (2014) Managing drought risk in a changing climate: the role of national drought policy. Weather Clim Extrem 3:4–13

Wong G, Lambert MF, Leonard M, Metcalfe AV (2010) Drought analysis using trivariate copulas conditional on climatic states. J Hydrol Eng 15(2):129–141

Wu ZY, Lu GH, Wen L, Lin CA (2011) Reconstructing and analyzing China’s fifty-nine year (1951–2009) drought history using hydrological model simulation. Hydrol Earth Syst Sci Discuss 8(1):1861–1893

Xu K, Yang D, Yang H, Li Z, Qin Y, Shen Y (2015b) Spatio-temporal variation of drought in China during 1961–2012: A climatic perspective. J Hydrol 526:253–264

Xu K, Yang DW, Xu XY, Lei HM (2015a) Copula-based drought frequency analysis considering the spatiotemporal variability in Southwest China. J Hydrol 527:630–640

Yang P, Xiao Z, Yang J, Liu H (2013) Characteristics of clustering extreme drought events in China during 1961–2010. Acta Meteor Sin 27(2):186–198

Zargar A, Sadiq R, Naser B, Khan FI (2011) A review of drought indices. Environ Rev 19:333–349

Zhai J, Su B, Gao C, Jiang T (2010) Spatial variation and trends in PDSI and SPI indices and their relation to streamflow in 10 large regions of China. J Clim 23:649–663

Zhang D, Chen P, Zhang Q, Li X (2017) Copula-based probability of concurrent hydrological drought in the Poyang lake-catchment-river system (China) from 1960 to 2013. J Hydrol 553:773–784

Zhang L, Singh VP (2006) Bivariate flood frequency analysis using the copula method. J Hydrol Eng 11(2):150–164

Zhang L, Singh VP (2007) Gumbel-Hougaard copula for trivariate rainfall frequency analysis. J Hydrol Eng 12(4):409–419

Zhang L, Xiao J, Li J, Wang K, Lei L, Guo H (2012) The 2010 spring drought reduced primary productivity in southwestern China. Environ Res Lett 7(4):045706

Zhang Q, Qi T, Singh VP et al (2015b) Regional Frequency Analysis of Droughts in China: A Multivariate Perspective. J Water Resour Manag 29(6):1767–1787

Zhang Q, Xiao MZ, Singh VP (2015a) Uncertainty evaluation of copula analysis of hydrological droughts in the East River basin, China. Glob Planet Chang 129:1–9

Zhao CH, Deng XZ, Yuan YM, Yan HW, Liang HM (2013) Prediction of drought risk based on the WRF model in Yunnan province of China. Adv Meteorol 2013

Zhao J, Huang Q, Chang JX, Liu DF, Huang SZ, Shi XY (2015) Analysis of temporal and spatial trends of hydro-climatic variables in the Wei River Basin. Environ Res 139:55–64

Zhao S 1983 A new scheme for comprehensive physical regionalization in China. Acta GeographicaSinica, 1–10. (In Chinese with English Abstract)

## Acknowledgments

This work was supported by the National Key Research and Development Program of China (grant number 2017YFC0403303), China Natural Science Foundation (No. U1203182), and the China 111 project (B12007).

## Author information

### Authors and Affiliations

### Corresponding author

## Ethics declarations

### Conflict of Interest

The authors declare that they have no conflict of interest.

## Rights and permissions

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

## About this article

### Cite this article

Ayantobo, O.O., Li, Y. & Song, S. Multivariate Drought Frequency Analysis using Four-Variate Symmetric and Asymmetric Archimedean Copula Functions.
*Water Resour Manage* **33**, 103–127 (2019). https://doi.org/10.1007/s11269-018-2090-6

Received:

Accepted:

Published:

Issue Date:

DOI: https://doi.org/10.1007/s11269-018-2090-6