1 Introduction

According to the Fourth Assessment Report (AR4) of the Intergovernmental Panel on Climate Change (IPCC 2007), spatial patterns of projected 2-m temperature show large increases over land, particularly over high-latitude regions of the Northern Hemisphere. This increase in temperature is expected to accelerate spring snowmelt and shorten the overall snowfall season, leading to earlier and larger spring runoff (Barnett et al. 2005). As for precipitation, the global average is projected to increase with the increasing water holding capacity of the atmosphere in a warmer climate (Meehl et al. 2007). For North America, Christensen et al. (2007) reported projected decreases in future snow season length and snow depth, but increases in future precipitation in winter and spring for southern Canada. Increases in temperature and precipitation can significantly affect flood dynamics in Canadian river basins where high flows are primarily generated by spring snowmelt (Mareuil et al. 2007). Changes to flood characteristics can impact various sectors including water resources management, agriculture, ecosystems, health and society at large; it is therefore important to assess characteristics of extreme events such as floods in the context of a changing climate to enable appropriate adaptation strategies.

Coupled Global Climate Models (CGCMs) are the most comprehensive tools used to generate information about present and future climate for various greenhouse gases and aerosols (GHGA) concentration scenarios. However, because of their high complexity, CGCM simulations are very demanding in computational resources and are performed at relatively coarse horizontal resolution. Therefore, regional and site-specific climate change scenarios are generally produced by means of downscaling methods, i.e. statistical downscaling (e.g. Wilby et al. 2002) and/or dynamical downscaling (e.g. Laprise 2008; Rummukainen 2010). Compared to statistical downscaling approaches, dynamical downscaling using regional climate models (RCMs) provides physically based finer-scale regional climate information when driven by outputs from CGCMs. Because RCMs show reasonable skill in simulating regional-scale climate and hydrology, many studies (e.g. Jha et al. 2004; Wood et al. 2004; Sushama et al. 2006; Kay et al. 2006a, b; Graham et al. 2007a, b; Dadson et al. 2011; Poitras et al. 2011) have used RCM outputs directly to evaluate climate change impacts on regional/basin-scale hydrologic variables including mean, seasonal and extreme flows in their target regions.

Projected changes to flood characteristics have so far generally been studied within a univariate flood frequency analysis framework (e.g. Menzel and Bürger 2002; Prudhomme et al. 2002; Booij 2005; Huziy et al. 2012). The results of such analyses can provide only a limited assessment of the probability of flood occurrence, as floods are generally multivariate events characterized by their peak, volume and duration (e.g. Shiau 2003; De Michele et al. 2005; Zhang and Singh 2006; Chebana and Ouarda 2009). Therefore, it is essential to understand changes to flood characteristics from a multivariate viewpoint.

Some techniques have been developed to model multivariate flood characteristics as a generalization of the univariate distribution. For example, bivariate normal (Goel et al. 1998), bivariate lognormal (Yue 2000), bivariate exponential (Favre et al. 2004), bivariate gamma (Yue 2001), and bivariate extreme value (Adamson et al. 1999) distributions have been used to model flood characteristics. Compared to these, the copula-based multivariate framework is a more versatile approach for modeling joint distribution functions from univariate marginal distributions, as it allows modeling the dependence structure among random variables independently of the marginal distributions (Favre et al. 2004). Because of this flexibility, copulas are becoming increasingly popular to investigate multivariate distributions of various hydrometeorological variables (e.g. Favre et al. 2004; De Michele et al. 2005; Grimaldi and Serinaldi 2006; Zhang and Singh 2006; Renard and Lang 2007; Karmakar and Simonovic 2009; Aissia et al. 2011; Lee and Salas 2011). A comprehensive list of references on copulas applied in the field of hydrology is available on the website of the International Commission of Statistical Hydrology of the International Association of Hydrological Sciences (ICSH-IAHS; www.stahy.org).

Northeastern Canada, the region considered in this study, is economically important because of its large number of hydroelectric power generating stations. Based on projections from 21 global climate models that participated in the AR4 (IPCC 2007), Christensen et al. (2007) predicted that the annual mean temperature will increase by 3.6 °C (with a range of 2.3–5.6 °C) and precipitation by 7 % (with a range of −3 to 15 %), over eastern North America including middle and southern parts of Québec, for the 2080–2099 period with respect to the 1980–1999 period; these results consider the IPCC’s (2001) Special Report on Emissions Scenarios (SRES) A1B scenario. The largest increase in mean temperature (3.8 °C) and precipitation (11 %) is expected in winter, while the smallest increase in mean temperature (3.3 °C) and precipitation (1 %) is expected in summer. The projected increase in winter temperature and precipitation can impact spring flood characteristics. Because of the importance of streamflow in northeastern Canadian basins, some investigators have previously studied projected changes to streamflow characteristics in a few individual river basins (e.g. Dibike and Coulibaly 2007; Quilbé et al. 2008; Minville et al. 2008), while the study by Huziy et al. (2012) concentrated on the entire northeast Canadian region, which is the same region as considered in this study. It should be noted that Huziy et al. (2012) studied projected changes to various streamflow characteristics, including flood peaks, in a univariate setting.

The main objective of this study is to evaluate climate change impacts on three spring flood characteristics, i.e. flood peak, volume and duration, within a multivariate framework, for 21 northeast Canadian basins covering Québec and some parts of the adjoining Ontario and Newfoundland provinces of Canada. A five member ensemble of the fourth generation Canadian RCM (CRCM) current (1970–1999) and future (2041–2070) climate simulations, driven by five different members of a Canadian GCM initial condition ensemble, is used to assess projected changes, while an ERA40 (the European Centre for Medium-Range Weather Forecasts 40-year reanalysis; Uppala et al. 2005) driven CRCM simulation is compared to observations for validating model simulated flood characteristics. Conventional univariate frequency analysis is applied to individual flood characteristics and copula based bivariate frequency analysis to three pairs of flood characteristics (i.e. peak–volume, peak–duration and volume–duration). Marginal and bivariate distributions of flood characteristics and marginal and joint return period-magnitude relationships are evaluated for selected return periods in various forms.

The paper is organized as follows. Section 2 describes the CRCM and its simulations used in the study. A detailed methodology for determining joint distribution functions for different combinations of flood characteristics using the copula approach is described in Sect. 3. Section 4 presents results of the selection of best fitting marginal distributions and copula function, performance and boundary forcing errors and projected changes to joint occurrence probabilities of flood characteristics. Main conclusions of the study are given in Sect. 5.

2 Model, data and study area

2.1 CRCM

The streamflows and therefore the flood characteristics analyzed in this study are derived from the transient climate change simulations performed with the CRCM, which is a fully elastic non-hydrostatic limited-area nested regional model (de Elía and Côté 2010). It uses a semi-implicit and semi-Lagrangian numerical scheme to solve the basic non-hydrostatic Euler equations (Caya and Laprise 1999). The CRCM’s lateral boundary conditions are provided through one-way nesting method over a regional domain inspired by Davies (1976) and redefined by Yakimiw and Robert (1990). Therefore, the CRCM receives atmospheric nesting information from its driving data, but does not influence the driving data in return. The CRCM is driven by time dependent vertical profiles from the driving data’s wind, air temperature, humidity and pressure imposed at the lateral boundaries exactly, as interpolated onto the CRCM’s atmospheric levels. The simulated horizontal winds are relaxed toward values of the driving data over the sponge zone. In addition, spectral large-scale nudging is imposed to force coherence of the CRCM large-scale winds with the driving data (Biner et al. 2000).

The CRCM generally uses most of the sub-grid scale physical parameterization packages of the Canadian GCM (CGCM3.1; Flato and Boer 2001), except for moist convection. Cloud cover is parameterized in terms of local relative humidity assuming maximum (random) overlap, depending on the presence (or absence) of clouds in adjacent layers as in CGCM3.1, and precipitation is parameterized in terms of a simple supersaturation-based condensation scheme as in CGCM3.1 (Laprise et al. 2003). Mesoscale convection follows the parameterization scheme of Kain and Fritsch (1990) and Bechtold et al. (2001). Though the study focuses on northeast Canadian basins, all CRCM simulations were computed on a 200 × 192 point grid (see inset of Fig. 1a), covering the whole of North America and adjoining oceans, with a horizontal grid-point spacing of 45 km and 29 levels in the vertical, ranging from the surface to the model top near 29 km.

Fig. 1 Study area with its 21 basins. The inset shows the CRCM simulation domain. The locations of the 16 CEHQ gauging stations are also shown. Additional information about the CEHQ stations is provided in Table 1. The basin names corresponding to the three-letter abbreviations are ARN Arnaud, BOM Bersimis-Outardes-Manic-5, FEU Rivière aux Feuilles, MEL Rivière aux Mélèzes, ROM Romaine, STM Saint-Maurice, BAL Baleine, CAN Caniapiscau, GEO Georges, LGR La Grande Rivière, MOI Moisie, PYR Pyrite, RUP Rupert, BEL Bell, CHU Churchill Falls, GRB Grande rivière de la Baleine, MAN Manicouagan, NAT Natashquan, RDO Rivière des Outaouais, SAG Saguenay, WAS Waswanipi

2.2 Streamflow simulations

Streamflows are derived from CRCM-simulated runoff using a modified version of WATROUTE (Poitras et al. 2011), a cell-to-cell routing scheme based on the modified routing algorithm of the distributed hydrological model WATFLOOD (Kouwen et al. 1993). A detailed description of the streamflow calculation from CRCM-simulated runoff can be found in Poitras et al. (2011). An ensemble of five 30-year simulations is analyzed for each of the current (1970–1999) and future (2041–2070) climates; these five pairs of current and future climate CRCM simulations were driven by different members of a CGCM3.1 initial condition ensemble. Future simulations are affected by changes in GHGA, following the IPCC’s (2001) SRES A2 scenario. In addition to the above simulations, a 30-year (1970–1999) CRCM simulation driven by ERA40 is used for validating the regional model. As already mentioned, though the CRCM simulations were performed over a domain covering the whole of North America, the present work focuses mainly on Québec and adjoining parts of the Newfoundland and Ontario provinces of Canada. Hereafter, the CRCM simulation driven by ERA40 will be denoted by CRCM–ERA40 and those simulations driven by different CGCM3.1 members for current and future climate by CRCM–CGCMc and CRCM–CGCMf, respectively.

2.3 Observed data and study area

CRCM–ERA40 simulated flood characteristics (i.e. flood peak, volume and duration) are compared to those derived from observed data for the 1970–1999 period to assess model performance. Observed daily streamflow data for 16 gauging stations in the Québec region (Fig. 1) were obtained from the CEHQ (Centre d’expertise hydrique du Québec; http://www.cehq.gouv.qc.ca/) dataset. Additional information about the gauging stations (i.e. location, representative drainage area, and annual mean flow) is provided in Table 1. These gauging stations represent various hydrological conditions ranging from small mountainous basin outlets to large basin main stream outlets, with drainage areas ranging from 1,110 to 40,900 km² and annual mean flows from 6.8 to 846.1 m³/s. The representative CRCM grid points corresponding to the 16 gauging stations, shown in Table 1, are selected on the basis of their proximity to the gauging stations and consistency with the digital flow directions.

Table 1 Information about selected CEHQ stations and representative CRCM grid points used in the comparison of flood characteristics derived from observed data, CRCM–ERA40 and CRCM–CGCMc for the period from 1970 to 1999

3 Methodology

The procedure to assess projected changes to spring flood characteristics involves:

  1. Identification of flood characteristics (peak, volume and duration) from daily streamflow hydrographs for current and future climate (see Sect. 3.1).
  2. Determination of appropriate marginal distributions for flood characteristics derived from CRCM–ERA40 and CRCM–CGCMc simulations for current and future climate (see Sect. 3.2).
  3. Determination of appropriate copula families for three pairs of flood characteristics (peak–volume, peak–duration and volume–duration) to develop joint distribution functions (see Sect. 3.3).
  4. Comparison of observed and CRCM–ERA40 simulation based basic flood statistics (i.e. mean and standard deviation) and results of marginal and bivariate frequency analyses to evaluate CRCM performance and comparison of CRCM–ERA40 and CRCM–CGCMc simulated flood characteristics for current climate to assess boundary forcing errors, i.e. the impact of errors in the boundary forcing data (CGCM in this study) (see Sect. 4.1).
  5. Comparison of basic flood statistics and results of marginal and bivariate frequency analyses for CRCM–CGCMc and CRCM–CGCMf simulations to evaluate climate change impacts on flood characteristics (see Sect. 4.2).

3.1 Identification of flood characteristics

A flood hydrograph is generally characterized by its peak, volume and duration as illustrated in Fig. 2. Base flow and fixed threshold approaches are usually recommended to determine flood characteristics. The base flow approach identifies flood duration by manually determining the time points corresponding to the rise in discharge from base flow (start date) and the return to base flow (end date), as shown by Dbase in Fig. 2 (Yue 2000; Karmakar and Simonovic 2008). The fixed threshold approach, on the other hand, identifies flood duration (Dthre) by fixing a threshold discharge and considers the part of the hydrograph above it as a flood event (Grimaldi and Serinaldi 2006; Karmakar and Simonovic 2008). In addition to the above two approaches, one can also use a base flow separation function such as the recursive digital filter used in the work of Serinaldi and Grimaldi (2011). Flood volume is determined by removing the base flow from the total volume of streamflow corresponding to the flood duration. For the present work, the fixed threshold approach is more appropriate than the base flow approach because the current and future flood characteristics need to be identified with respect to the same reference flow in order to facilitate assessment of climate change impacts on flood characteristics. One drawback of the fixed threshold approach is that the correlations among flood characteristics are sensitive to the choice of threshold. Figure 3 shows variations of average values of Kendall’s coefficient of correlation (KCC) for 16 gauging stations for peak–volume, peak–duration and volume–duration pairs as a function of threshold discharge. Thresholds ranging from 0.8 to 2.0 μ are considered, where μ represents mean annual streamflow. The values of KCC for the peak–volume and peak–duration cases are somewhat sensitive, particularly for thresholds below 1.3 μ, while those for the volume–duration case are relatively insensitive to threshold discharge. Figure 3 also shows that the correlation of peak and duration is lower than that of the other two pairs for all values of threshold discharge. Consistent with the findings of Grimaldi and Serinaldi (2006) and Karmakar and Simonovic (2009), the pairs of flood characteristics in the present study area are also positively correlated, which supports the necessity of multivariate flood frequency analyses. Concerning the selection of a threshold discharge, base flow is more likely to be included in the identified flood event if the threshold is too small. On the other hand, a threshold that is too large would exclude large amounts of flood flow volume. It is also important to select a threshold that can be used satisfactorily for a wide range of hydrological conditions across the study area, including the 16 observation stations and 547 grid points of the CRCM. In view of these points, the selected threshold of 1.3 μ provides a reasonable compromise and is generally found to be suitable for all observation stations and CRCM grid points.
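The fixed threshold identification described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used in the study: the function name `flood_characteristics`, the choice of the event with the largest peak as the seasonal event, and the definition of volume as the excess above the threshold are all hypothetical choices made here for the example.

```python
def flood_characteristics(daily_flow, mean_annual_flow, factor=1.3):
    """Return (peak, volume, duration) of the above-threshold event with the largest peak.

    daily_flow: daily discharges (m3/s) for one spring season.
    Assumption: volume is the excess above the threshold, converted to m3.
    Returns None when the threshold is never exceeded.
    """
    threshold = factor * mean_annual_flow
    best = None
    peak, volume, duration = 0.0, 0.0, 0
    # A sentinel below any threshold closes an event that runs to the end of the record.
    for q in list(daily_flow) + [float("-inf")]:
        if q > threshold:
            peak = max(peak, q)
            volume += (q - threshold) * 86400.0  # seconds per day
            duration += 1
        elif duration > 0:
            if best is None or peak > best[0]:
                best = (peak, volume, duration)
            peak, volume, duration = 0.0, 0.0, 0
    return best


# Hypothetical daily hydrograph (m3/s) with a mean annual flow of 100 m3/s.
flows = [40, 60, 150, 300, 420, 380, 200, 120, 90, 140, 160, 100, 50]
print(flood_characteristics(flows, mean_annual_flow=100.0))
```

Applying the same `factor` to current and future series keeps both periods tied to one reference flow, which is the reason given above for preferring the fixed threshold approach.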

Fig. 2 A schematic diagram showing flood characteristics (peak, volume and duration) based on fixed threshold and base flow approaches; hydrograph corresponds to CEHQ station 40830 for the year 1996. Dthre and Dbase are the flood durations corresponding to fixed threshold and base flow approaches, respectively

Fig. 3 Correlation functions of peak–volume, peak–duration and volume–duration pairs of flood characteristics for different thresholds. KCC values are averaged over 16 gauging stations (Table 1). Threshold along the x-axis is ‘x’ times μ (the mean annual streamflow) where x varies from 0.8 to 2.0. The x-value of 1.3 is used for defining the thresholds in the analysis
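For reference, the KCC used in the threshold sensitivity analysis can be computed directly from paired samples. The sketch below implements the tau-a form, which ignores ties; the variant is not stated in the text, so this is an assumption, and the sample values are hypothetical.

```python
def kendall_tau(x, y):
    """Kendall's tau-a: (concordant pairs - discordant pairs) / total pairs."""
    n = len(x)
    score = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            s = (x[i] - x[j]) * (y[i] - y[j])
            score += (s > 0) - (s < 0)  # +1 concordant, -1 discordant, 0 tied
    return 2.0 * score / (n * (n - 1))


# Hypothetical seasonal flood peaks (m3/s) and durations (days).
peaks = [420, 510, 380, 600, 450]
durations = [12, 15, 14, 20, 11]
print(kendall_tau(peaks, durations))
```

Repeating this calculation for flood characteristics extracted at each threshold between 0.8 μ and 2.0 μ, and averaging over stations, gives curves of the kind shown in Fig. 3.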

3.2 Marginal distributions

Marginal and joint frequency analyses are developed on the basis of seasonal maximum values of flood characteristics. Two-parameter exponential, lognormal, gamma, Gumbel and Weibull distributions and three-parameter Generalized Extreme Value (GEV) and log-Pearson type 3 (LP3) distributions are considered in order to identify the best fitting marginal probability distribution for each of the three flood characteristics. The maximum likelihood method is employed for parameter estimation of the two-parameter distributions. Following suggestions from Hosking (1985) for samples of small to medium size (the sample size is 30 in the present study), the probability weighted moments method is employed to estimate parameters of the GEV distribution. The method of moments is employed for the LP3 distribution following recommendations of Bulletin 17B (IACWD 1982). The Root Mean Square Error (RMSE), Akaike Information Criterion (AIC; Akaike 1974) and Bayesian Information Criterion (BIC; Schwarz 1978) are used to select the most appropriate marginal distribution for flood peak, volume and duration. The RMSE is expressed as

$$ {\text{RMSE}} = \left\{ {\frac{1}{N}\sum\limits_{i = 1}^{N} {[p_{f} (i) - p_{e} (i)]^{2} } } \right\}^{1/2} $$
(1)

where N is the number of observations, and \(p_{e}(i)\) and \(p_{f}(i)\) represent the nonexceedance probabilities of the ith observation calculated from the empirical distribution and the fitted distribution, respectively. To calculate the empirical nonexceedance (cumulative) probability, the Gringorten (1963) plotting position formula is used. The AIC and BIC measure both the lack-of-fit and the complexity of the model, the latter through a penalty term for the number of parameters in the model (e.g. Zhang and Singh 2006); these criteria can be expressed as

$$ {\text{AIC}} = N\log ({\text{MSE}}) + 2k $$
(2)
$$ {\text{BIC}} = N\log ({\text{MSE}}) + k\log (N) $$
(3)

where k represents the number of parameters and MSE represents the mean square error (i.e. squared value of RMSE). The model with the minimum value of RMSE, AIC, BIC or a combination of these measures is selected as the potential optimal model.
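Equations (1)–(3) can be evaluated as in the following sketch. The function names `gringorten` and `fit_scores` are hypothetical, the Gringorten plotting position is taken in its usual form (i − 0.44)/(N + 0.12), and "log" is assumed to be the natural logarithm; the Gumbel parameters in the example are placeholders, not fitted values from the study.

```python
import math


def gringorten(n):
    """Gringorten plotting positions for ranks 1..n of an ascending sample."""
    return [(i - 0.44) / (n + 0.12) for i in range(1, n + 1)]


def fit_scores(sample, cdf, k):
    """Return (RMSE, AIC, BIC) for a fitted CDF with k parameters, per Eqs. (1)-(3)."""
    xs = sorted(sample)
    n = len(xs)
    p_emp = gringorten(n)
    mse = sum((cdf(x) - pe) ** 2 for x, pe in zip(xs, p_emp)) / n
    aic = n * math.log(mse) + 2 * k          # natural log assumed
    bic = n * math.log(mse) + k * math.log(n)
    return math.sqrt(mse), aic, bic


# Hypothetical flood peaks (m3/s) scored against a Gumbel CDF with placeholder parameters.
peaks = [812, 640, 955, 720, 1100, 690, 870, 760, 1010, 905, 680, 1250]
gumbel_cdf = lambda x: math.exp(-math.exp(-(x - 780.0) / 150.0))
print(fit_scores(peaks, gumbel_cdf, k=2))
```

Because both criteria share the term N log(MSE), they differ only in the penalty: BIC penalizes each parameter more heavily than AIC whenever log(N) > 2, which is the case for the 30-year samples considered here.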

3.3 Joint distributions

A copula is a distribution function that models the dependence structure between random variables by connecting a multivariate probability distribution to its one-dimensional marginal probability distributions (Nelsen 1999). Let X and Y be two random variables with marginal cumulative distribution functions (CDFs) \(F_{X}\) and \(F_{Y}\). The joint cumulative distribution function of (X, Y), \(F_{XY}(x, y)\), can be expressed as

$$ F_{XY} (x,y) = C[F_{X} (x),F_{Y} (y)] $$
(4)

where C is a bivariate copula of (X, Y). If \(F_{X}(x) = u\) and \(F_{Y}(y) = v\), expression (4) can be written as follows:

$$ C(u,v) = F_{XY} \left[ {F_{X}^{ - 1} (u),F_{Y}^{ - 1} (v)} \right],\quad (u,v) \in [0,1]^{2} $$
(5)

where \( F_{X}^{ - 1} \) and \( F_{Y}^{ - 1} \) are generalized inverses of \(F_{X}\) and \(F_{Y}\), respectively.

Different families of copulas (Archimedean, elliptical and extreme value) have been suggested and described by Nelsen (1999). The Archimedean copula family is often used for multivariate hydrological analysis due to the following advantages (Zhang and Singh 2006): (1) it can easily be constructed, (2) a huge variety of copula models belong to this class, which have attractive stochastic properties that often lead to statistically tractable relationships for continuous data (McNeil and Neslehová 2009), and (3) it can be applied for both positively and negatively correlated variables. This family has also been used earlier for single-site bivariate flood frequency analysis in some Canadian studies (e.g. Favre et al. 2004; Aissia et al. 2011; Karmakar and Simonovic 2009). According to Nelsen (2006), a bivariate Archimedean copula can generally be expressed as

$$ C_{\theta } (u,v) = \phi^{ - 1} [\phi (u) + \phi (v)] $$
(6)

where the subscript θ of the copula C is the parameter embedded in the generating function ϕ. For the Archimedean copula, θ can be determined from the relationship between the KCC τ and the generating function ϕ(t), which is defined by \( \tau = 1 + 4\int_{0}^{1} {\frac{\phi (t)}{{\phi^{\prime } (t)}}dt} \) (Karmakar and Simonovic 2009), where t = u or v. The KCC τ is a well-known nonparametric measure of dependence between any two random variables (X and Y). In addition to the KCC and generating function based approach, it is also possible to identify a suitable parametric copula family using the relationship between the KCC and the upper tail dependence coefficient, i.e. the probability of observing a high value for one variable given that the other variable assumes a high value (e.g. Poulin et al. 2007; Serinaldi et al. 2009).
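As a worked illustration of the τ–θ relationship, consider the Clayton generator in its commonly used form \( \phi (t) = (t^{ - \theta } - 1)/\theta \) with θ > 0 (this form is assumed here for the example). The ratio of the generator to its derivative is

$$ \frac{\phi (t)}{\phi^{\prime } (t)} = \frac{(t^{ - \theta } - 1)/\theta }{ - t^{ - \theta - 1} } = \frac{t^{\theta + 1} - t}{\theta } $$

so that

$$ \tau = 1 + \frac{4}{\theta }\int_{0}^{1} {(t^{\theta + 1} - t)\,dt} = 1 + \frac{4}{\theta }\left( {\frac{1}{\theta + 2} - \frac{1}{2}} \right) = \frac{\theta }{\theta + 2} $$

Inverting gives \( \theta = 2\tau /(1 - \tau ) \); for instance, a sample value of τ = 0.5 would correspond to θ = 2.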

Three Archimedean copulas (i.e. Frank, Gumbel and Clayton copulas) are considered in this study. Selected mathematical properties (i.e. copula equation, generating function ϕ(t) and relationship between θ and τ) of the three families of Archimedean copula are listed in Table 2. RMSE, AIC, and BIC values are calculated from the empirical joint cumulative distribution function and the copula-based fitted bivariate distribution for the peak–volume, peak–duration and volume–duration bivariate cases. The copula function that yields the smallest values of RMSE, AIC and BIC among the three candidates is chosen as the most appropriate. Many studies (e.g. Karmakar and Simonovic 2009; Chowdhary and Singh 2010; Zhang et al. 2012) have used these three traditional accuracy measures to select an appropriate bivariate copula function. However, more sophisticated goodness-of-fit procedures are also available (see Kojadinovic and Yan 2010).

Table 2 Properties of the three selected Archimedean copulas
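The three copula families can be written compactly as below. This is a sketch using the standard textbook forms of the Clayton, Gumbel and Frank copulas, which are assumed to match the entries of Table 2; the function names are hypothetical. Only the Clayton and Gumbel families have a closed-form inverse of the τ–θ relation; for the Frank family θ has to be found numerically, which is omitted here.

```python
import math


def clayton(u, v, theta):
    """Clayton copula C(u, v), theta > 0."""
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)


def gumbel(u, v, theta):
    """Gumbel copula C(u, v), theta >= 1."""
    s = (-math.log(u)) ** theta + (-math.log(v)) ** theta
    return math.exp(-s ** (1.0 / theta))


def frank(u, v, theta):
    """Frank copula C(u, v), theta != 0."""
    num = math.expm1(-theta * u) * math.expm1(-theta * v)
    return -math.log1p(num / math.expm1(-theta)) / theta


def theta_from_tau(family, tau):
    """Invert the closed-form tau-theta relations (Clayton and Gumbel only)."""
    if family == "clayton":
        return 2.0 * tau / (1.0 - tau)
    if family == "gumbel":
        return 1.0 / (1.0 - tau)
    raise ValueError("no closed-form inverse for this family")


# Hypothetical example: tau = 0.5 between peak and volume.
theta = theta_from_tau("clayton", 0.5)
print(theta, clayton(0.9, 0.9, theta))
```

A quick sanity check for any of these functions is the boundary property C(u, 1) = u, which every copula must satisfy.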

Two types of joint occurrence probabilities are investigated: (1) P1, the probability of X exceeding a threshold x or Y exceeding another threshold y, i.e. P(X > x or Y > y), and (2) P2, the probability of both X and Y exceeding their respective thresholds at the same time, i.e. P(X > x and Y > y). Here x and y denote the values of X and Y corresponding to a selected return period, respectively. Following Yue and Rasmussen (2002) and Liu et al. (2011), these probabilities are formulated as:

$$ P(X > x\,{\text{or}}\,Y > y) = 1 - F_{XY} (x,y) = 1 - C[F_{X} (x),F_{Y} (y)]. $$
(7)
$$ P(X > x\;{\text{and}}\;Y > y) = 1 - F_{X} (x) - F_{Y} (y) + F_{XY} (x,y) = 1 - F_{X} (x) - F_{Y} (y) + C[F_{X} (x),F_{Y} (y)] $$
(8)

For both x and y, flood peak, volume and duration corresponding to 5-, 20-, and 50-year return periods for the current climate are used. The joint occurrence probabilities of peak–volume, peak–duration and volume–duration for the current climate and their projected changes for the future climate are estimated based on these fixed thresholds. The joint occurrence probabilities P(X > x or Y > y) and P(X > x and Y > y) for an r-year return period are denoted by \(P1_{r}\) and \(P2_{r}\), respectively. The difference between these two probabilities can be illustrated as follows. Consider two mutually independent random variables X and Y, and let their joint probability \(F_{XY}(x, y)\) equal g and their joint occurrence probabilities \(P1_{50}\) and \(P2_{50}\) equal \(h_{1}\) and \(h_{2}\), respectively. If the two random variables are instead positively correlated, the joint probability \(F_{XY}(x, y)\) would be larger than g, \(P1_{50}\) would be smaller than \(h_{1}\) and \(P2_{50}\) would be larger than \(h_{2}\).
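The effect of dependence on the two joint occurrence probabilities can be checked numerically with Eqs. (7) and (8). In the sketch below, the function names and the Clayton parameter θ = 2 are hypothetical, and the marginal nonexceedance probability of a T-year threshold is taken as 1 − 1/T, as is usual for annual or seasonal maxima.

```python
def clayton(u, v, theta):
    """Clayton copula C(u, v), theta > 0."""
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)


def joint_occurrence(u, v, c_uv):
    """Return (P1, P2) = (P(X > x or Y > y), P(X > x and Y > y)), Eqs. (7)-(8)."""
    p_or = 1.0 - c_uv
    p_and = 1.0 - u - v + c_uv
    return p_or, p_and


T = 50                 # return period in years
u = v = 1.0 - 1.0 / T  # marginal nonexceedance probability of the T-year threshold

independent = joint_occurrence(u, v, u * v)                     # C(u, v) = u v
dependent = joint_occurrence(u, v, clayton(u, v, theta=2.0))    # hypothetical theta

print("independent:", independent)
print("dependent:  ", dependent)
```

With positive dependence the copula value exceeds the product uv, so P1 falls below and P2 rises above their independent-case values, in line with the description above.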

3.4 Merged series analysis

Both marginal and joint frequency analyses are performed independently for each pair of the five CRCM–CGCM current and future period simulations to estimate percentage changes to return levels of selected return periods. These changes are then averaged over the five pairs of simulations to obtain ensemble-averaged projected change. As demonstrated in Huziy et al. (2012), the Kruskal–Wallis test (Kruskal and Wallis 1952), a multiple comparison test, suggests that the five streamflow series corresponding to five members of the CRCM ensemble for the current climate may belong to the same distribution for the majority of the CRCM grid cells over the study domain and the same is also noted for the future climate. Therefore, the five simulated streamflow series for the current climate are merged to create a longer sample for each grid-cell and the same procedure is followed for the future climate. Projected changes are then assessed from the merged longer samples for the current and future periods. The advantage of the latter merged series approach over the former ensemble averaged approach is the reduced uncertainty associated with longer return period return levels due to larger sample size.
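The merging step can be illustrated with a small sketch of the Kruskal–Wallis H statistic. The implementation below assigns average ranks to ties but applies no tie correction, the 5 % critical value of 9.488 is the chi-square quantile for four degrees of freedom (five groups), and the member series are hypothetical numbers rather than CRCM output.

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic; ties get average ranks, no tie correction."""
    pooled = sorted((x, g) for g, grp in enumerate(groups) for x in grp)
    n = len(pooled)
    rank_sums = [0.0] * len(groups)
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # average of 1-based ranks i+1 .. j+1
        for _, g in pooled[i:j + 1]:
            rank_sums[g] += avg_rank
        i = j + 1
    total = sum(r * r / len(grp) for r, grp in zip(rank_sums, groups))
    return 12.0 / (n * (n + 1)) * total - 3.0 * (n + 1)


# Hypothetical flood peaks (m3/s) from five ensemble members at one grid cell.
members = [
    [310, 420, 385, 290, 450],
    [335, 400, 370, 305, 440],
    [320, 415, 390, 280, 460],
    [300, 430, 375, 295, 445],
    [325, 410, 380, 285, 455],
]
h = kruskal_wallis_h(members)
if h < 9.488:  # cannot reject a common distribution at the 5 % level
    merged = [x for grp in members for x in grp]
    print(h, len(merged))
```

In the study each member contributes 30 values, so a merged series would hold 150 values per grid cell, which is what reduces the sampling uncertainty of the longer return levels.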

4 Results

4.1 Selection of marginal distributions and copula function

In Table 3, average RMSE, AIC and BIC for 16 gauging stations and corresponding CRCM grid cells are shown for seven marginal distributions fitted to three flood characteristics derived from observed records, CRCM–ERA40 and CRCM–CGCMc. In the case of CRCM–CGCMc, both overall average and range of performance measures are provided. Overall, the values of the three performance measures (RMSE, AIC and BIC) indicate that the GEV distribution yields the best performance for all three flood characteristics for the case of observed records, CRCM–ERA40 and CRCM–CGCMc. The GEV distribution is also associated with the smallest values of range for the three performance measures. Based on these results, the GEV distribution is selected as the marginal distribution for all three flood characteristics. Since only one flood event per year is included in the analysis, the choice of the GEV distribution can also be justified on theoretical grounds due to the fact that the distribution of annual or seasonal maxima converges to the GEV distribution. It should be noted that the same distribution is used for the future climate but by re-estimating its parameters for future simulations. Thus, the family of distributions stays the same both for current and future climates.
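For illustration, a GEV fit by probability weighted moments and the resulting return levels might be computed as follows. This sketch uses unbiased sample PWMs and the widely used approximation for the shape parameter from the L-moment literature, with the Hosking sign convention (k > 0 gives a bounded upper tail); whether this matches the exact estimator variant used in the study is an assumption. Function names and sample values are hypothetical, and the Gumbel limit k = 0 is not handled.

```python
import math


def fit_gev_pwm(sample):
    """Estimate GEV (xi, alpha, k) from unbiased sample PWMs via L-moments."""
    x = sorted(sample)
    n = len(x)
    b0 = sum(x) / n
    b1 = sum((j / (n - 1)) * x[j] for j in range(n)) / n
    b2 = sum((j * (j - 1) / ((n - 1) * (n - 2))) * x[j] for j in range(n)) / n
    l1, l2, l3 = b0, 2 * b1 - b0, 6 * b2 - 6 * b1 + b0
    c = 2.0 / (3.0 + l3 / l2) - math.log(2) / math.log(3)
    k = 7.8590 * c + 2.9554 * c * c          # approximate shape estimate
    g = math.gamma(1.0 + k)
    alpha = l2 * k / ((1.0 - 2.0 ** -k) * g)  # scale
    xi = l1 - alpha * (1.0 - g) / k           # location
    return xi, alpha, k


def gev_return_level(xi, alpha, k, T):
    """GEV quantile with nonexceedance probability 1 - 1/T."""
    return xi + alpha * (1.0 - (-math.log(1.0 - 1.0 / T)) ** k) / k


# Hypothetical seasonal maximum flood peaks (m3/s).
peaks = [812, 640, 955, 720, 1100, 690, 870, 760, 1010, 905, 680, 1250]
params = fit_gev_pwm(peaks)
for T in (5, 20, 50):
    print(T, gev_return_level(*params, T))
```

Re-running `fit_gev_pwm` on the future-climate series while keeping the same distribution family corresponds to the re-estimation of parameters described above.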

Table 3 Average values of RMSE, AIC and BIC for 16 gauging stations and representative CRCM grid points for seven marginal distributions fitted to flood peak, volume and duration derived from observed records, CRCM–ERA40 and CRCM–CGCMc

In Table 4, average values of RMSE, AIC and BIC for 16 gauging stations and representative CRCM grid cells are shown for three copula functions fitted to three pairs of flood characteristics (peak–volume, peak–duration and volume–duration) derived from observed records, CRCM–ERA40 and CRCM–CGCMc. The three goodness-of-fit measures indicate that the Clayton family provides the best performance for all three pairs of flood characteristics derived from observed records and CRCM–ERA40 and for the volume–duration pair derived from CRCM–CGCMc. For peak–volume and peak–duration pairs derived from CRCM–CGCMc, the Clayton and Frank families exhibit comparable performance. Thus, based on these results, the Clayton family is selected as the copula function for the three pairs of flood characteristics for both the current and future climates.

Table 4 Average values of RMSE, AIC and BIC for 16 gauging stations and representative CRCM grid points for three copula functions fitted to flood peak–volume, peak–duration and volume–duration pairs of characteristics derived from observed records, CRCM–ERA40 and CRCM–CGCMc. The numbers in bold represent best performance according to the selected goodness-of-fit measures. For CRCM–CGCMc, average and range of the three goodness-of-fit measures based on five different CRCM simulations are provided

4.2 Performance and boundary forcing errors

Basic statistics (mean and interannual standard deviation (SD)) of seasonal maximum series of flood characteristics and estimated values of 5-, 20- and 50-year return levels, observed and modelled, are compared in Fig. 4. For flood peak and volume, the above statistics for CRCM–ERA40 compare favorably with those derived from observed records. R-squared values for peak and volume for CRCM–ERA40 are larger than 0.8. Flood duration appears to be a challenging variable for the CRCM to simulate, since R-squared values for CRCM–ERA40 vary from 0.3 to 0.6. The performance errors associated with flood duration are thus larger than those associated with flood peak and volume. The boundary forcing errors, i.e. the errors associated with the CGCM boundary forcing data, reflected in the differences between CRCM–ERA40 and CRCM–CGCMc, are also larger for flood duration than for flood peak and volume. For flood peak, the CRCM–CGCMc ensemble statistics are generally closer to the observed ones than those of the CRCM–ERA40 simulation, while this tendency is not so obvious for flood volume and duration.

Fig. 4 Comparison of simulated (CRCM–ERA40 and CRCM–CGCMc) and observed mean, interannual standard deviation (SD) and 5-, 20- and 50-year return levels (RLs) (based on the GEV distribution) of seasonal maximum values of flood peak, volume and duration at 16 gauging stations shown in Fig. 1

In Fig. 5, average values of KCC and joint occurrence probabilities (P1 and P2) corresponding to 5-, 20- and 50-year return periods for 16 gauging stations and representative CRCM grid cells for three pairs of flood characteristics derived from observed records, CRCM–ERA40 and CRCM–CGCMc are presented. Both CRCM–ERA40 and CRCM–CGCMc tend to underestimate KCC for all three pairs of flood characteristics; however, the values for CRCM–CGCMc are typically closer to the observed ones. Theoretically, the joint cumulative probability \(C[F_{X}(x), F_{Y}(y)]\) in Eqs. (7) and (8) increases as KCC increases, so the joint occurrence probability P1 decreases and the joint occurrence probability P2 increases as KCC increases. Consequently, CRCM–ERA40 and CRCM–CGCMc tend to overestimate P1 and underestimate P2. The percentage difference between observed and simulated values is less than 4 % (26 %) for P1 (P2) for the three pairs of flood characteristics. Though differences exist between CRCM–ERA40 and CRCM–CGCMc based statistics (Fig. 5), overall the boundary forcing errors are modest compared to performance errors.

Fig. 5 Average values of (a) KCC and joint occurrence probabilities, (b) P1 and (c) P2, corresponding to 5-, 20- and 50-year return period thresholds (based on the Clayton family) for 16 gauging stations and representative CRCM grid points for three pairs of flood characteristics derived from observed records, CRCM–ERA40 and CRCM–CGCMc. Each panel uses a different scale for the y-axis

The boundary forcing errors for the entire domain are investigated by comparing the selected statistics simulated by CRCM–ERA40 and CRCM–CGCMc, which are presented in the first and second columns of Figs. 6, 7, 8 and 9. In Fig. 6, basic statistics of the three flood characteristics for CRCM–ERA40 and CRCM–CGCMc are provided. The spatial patterns of the mean and SD are very similar for CRCM–ERA40 and CRCM–CGCMc. As expected, spatial patterns of CRCM simulated mean flood peak and volume show large values for grid points situated along main streams or at basin outlets and small values for upstream grid points. Similarly coherent spatial patterns are not visible in the flood duration statistics simulated by CRCM; nevertheless, the regional model generally tends to yield longer durations for grid points on main streams and at outlets than for those in upstream areas.

Fig. 6 Mean and standard deviation (SD) of seasonal maximum values of (a) flood peak, (b) volume and (c) duration for CRCM–ERA40 (column 1), CRCM–CGCMc (column 2) and CRCM–CGCMf (column 3). Projected changes to respective statistics in future climate with respect to current climate are shown in column 4. The results shown in columns 2–4 correspond to ensemble averaged values

Fig. 7 Five-, 20- and 50-year return levels (RLs) of (a) flood peak, (b) volume and (c) duration for CRCM–ERA40 (column 1), CRCM–CGCMc (column 2) and CRCM–CGCMf (column 3). Projected changes to RLs in future climate with respect to current climate are shown in column 4. Results shown in columns 2–4 correspond to ensemble averaged values

Fig. 8 Joint occurrence probability P1 for (a) peak–volume, (b) peak–duration and (c) volume–duration corresponding to current marginal return values of 5-, 20- and 50-year return periods for CRCM–ERA40 (column 1), CRCM–CGCMc (column 2) and CRCM–CGCMf (column 3). Percentage difference between CRCM–CGCMf and CRCM–CGCMc is shown in column 4. Results shown in columns 2–4 correspond to ensemble averaged values

Fig. 9 Joint occurrence probability P2 for (a) peak–volume, (b) peak–duration and (c) volume–duration corresponding to current marginal return values of 5-, 20- and 50-year return periods for CRCM–ERA40 (column 1), CRCM–CGCMc (column 2) and CRCM–CGCMf (column 3). Percentage difference between CRCM–CGCMf and CRCM–CGCMc is shown in column 4. Results shown in columns 2–4 correspond to ensemble averaged values

Figure 7 shows 5-, 20- and 50-year return levels of flood peak, volume and duration for CRCM–ERA40 (first column) and CRCM–CGCMc (second column). Estimated return levels for CRCM–ERA40 and CRCM–CGCMc show good agreement for all three flood characteristics. Spatial patterns of estimated return levels of short return periods (i.e. 5- or 20-year) of flood peak, volume, and duration are basically similar to those of mean values presented in Fig. 6.

Joint occurrence probabilities P1 of the three pairs of flood characteristics corresponding to current marginal return values of 5-, 20- and 50-year return periods for CRCM–ERA40 and CRCM–CGCMc are given in the first and second columns of Fig. 8, respectively. Simulated values of P1 for CRCM–ERA40 and CRCM–CGCMc show good agreement for the three pairs and three selected return periods, although the agreement of P1 values is not as good as that of the marginal values. P1 values for CRCM–ERA40 and CRCM–CGCMc do not show any clear spatial patterns. Figure 9 presents joint occurrence probability P2 for the three pairs of flood characteristics corresponding to current marginal return values of 5-, 20- and 50-year return periods for CRCM–ERA40 (first column) and CRCM–CGCMc (second column). Again, the joint occurrence probability P2 for CRCM–ERA40 and CRCM–CGCMc shows good agreement for the three pairs of characteristics and selected return periods. As expected, the P2 values are smaller than the P1 values for the three pairs of flood characteristics.

4.3 Projections of future flood characteristics

4.3.1 Ensemble-averaged approach

4.3.1.1 Basic statistics—mean and interannual standard deviation

Estimated values of basic statistics (mean and interannual SD) of seasonal maximum values of flood characteristics and results of marginal and joint frequency analyses for CRCM–CGCMf are compared to those of CRCM–CGCMc for selected return periods in order to evaluate changes to flood characteristics. In Fig. 6, basic statistics of the three flood characteristics for CRCM–CGCMc (second column) and CRCM–CGCMf (third column) and percentage changes in these statistics (fourth column) are presented. Comparing CRCM–CGCMf and CRCM–CGCMc over the entire domain, increases of 7.4, 12.8 and 10.8 % in the mean and 8.7, 14.1 and 39.6 % in the SD are found for flood peak, volume and duration, respectively, i.e. on average the three flood characteristics will have larger values and will be more variable in the future. Most of the northern and southern basins (except the southern part of RDO) show an increase in flood peak and volume in future climate, while the central-eastern basins show some decreases. Some grid points in the central-eastern basins (LGR and RUP) also show smaller increases or decreases in mean flood peak and volume. The duration, however, shows a general increase in future climate, though associated with higher interannual variability, particularly for the central-eastern basins. Summary statistics of the regional-level projected changes discussed above for the northern, central and southern watersheds with respect to the domain averaged values are provided in Table 5.

Table 5 Ratios of regionally averaged increases for Northern, Central, and Southern regions to entire domain averaged increases in the mean and standard deviation (SD) of seasonal maximum values and selected marginal return levels (RL) of 5- and 50-year return period and joint occurrence probabilities (P1 and P2) of flood characteristics
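The arithmetic behind the percentage changes and the ratios summarized in Table 5 can be sketched as follows. All numbers below are hypothetical, and the averaging convention (mean of grid-point percentage changes within a region, divided by the mean over all grid points) is an assumption made for illustration rather than a description of the exact procedure used in the study.

```python
def percent_change(current, future):
    """Projected change of a statistic relative to current climate, in percent."""
    return 100.0 * (future - current) / current

def mean(values):
    return sum(values) / len(values)

# Hypothetical ensemble averaged mean flood peaks (m^3/s) at two grid points per region
current = {"north": [300.0, 420.0], "central": [510.0, 640.0], "south": [200.0, 260.0]}
future = {"north": [345.0, 470.4], "central": [520.2, 646.4], "south": [214.0, 283.4]}

# Grid-point percentage changes, grouped by region
changes = {r: [percent_change(c, f) for c, f in zip(current[r], future[r])]
           for r in current}

regional = {r: mean(v) for r, v in changes.items()}      # regional average change
domain = mean([x for v in changes.values() for x in v])  # domain average change
ratios = {r: regional[r] / domain for r in regional}     # Table 5 style ratios

for r in ratios:
    print(f"{r:>7}: change = {regional[r]:5.1f} %, ratio to domain = {ratios[r]:.2f}")
```

A ratio above 1 indicates a region whose projected increase exceeds the domain average, and a ratio below 1 indicates a smaller increase.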

The projected changes in mean flood characteristics are linked with changes to spring temperatures and/or snow water equivalent (SWE). For the northernmost basins considered in this study, the projected increases in mean flood characteristics are a result of both increased SWE and accelerated snowmelt caused by warmer spring temperatures. For the rest of the domain, the increases noted in flood characteristics, particularly peak and volume, are due to increased spring temperatures, as SWE decreases in future climate for these regions despite an increase in precipitation. This decrease in SWE for the southern and central basins occurs because precipitation falls as rain even through most of December and snow buildup is delayed and reduced, leading to a decrease in the snow-to-rain ratio (Sushama et al. 2006; Huziy et al. 2012). As for the eastern basins, the impact of reduced SWE is larger than the impact of the increased spring temperatures, leading to a decrease in flood volume and peak in future climate.

4.3.1.2 Marginal return levels

Figure 7 presents 5-, 20- and 50-year return levels of flood peak, volume and duration for CRCM–CGCMc (second column) and CRCM–CGCMf (third column). Percentage changes in the return levels are also presented in this figure (fourth column). Again, some grid points for the central basins show smaller increases or decreases, but only in the 5-, 20- and 50-year return levels of flood peak and the 5-year return level of flood volume. In the central area, return levels of flood volume for the longer return period (i.e. 50-year) show large increases, which are consistent with the large increases in SD. Note that the estimates of the marginal return values corresponding to the 50-year return period are more uncertain than those of the 5- and 20-year return periods. Larger increases are noted for return values of longer return periods than for those corresponding to short return periods.

Ratios of regionally averaged increases to entire domain averaged increases in the marginal return values of 5- and 50-year return period of flood characteristics are provided in Table 5. For the northern basins, a larger increase in flood peak, a smaller increase in flood duration and a roughly similar increase in flood volume are projected in the 50-year return levels, compared to the domain averaged changes. For the central basins, a smaller increase in flood peak, a slightly larger increase in flood volume and a relatively larger increase in flood duration are projected in the 50-year return levels, compared to the domain averaged changes. For the southern basins, smaller increases in all three flood characteristics are projected in the 50-year return levels, compared to the domain averaged changes.

4.3.1.3 Joint occurrence probabilities, P1 and P2

Joint occurrence probabilities P1 and P2 of the three pairs of flood characteristics for both current and future periods are estimated using fixed thresholds, i.e. return values of flood peak, volume and duration corresponding to 5-, 20- and 50-year return periods estimated from the marginal distributions for the current climate. As defined earlier, P1 represents the joint occurrence probability that at least one of the two flood characteristics exceeds its respective threshold, and P2 represents the joint occurrence probability that both flood characteristics exceed their respective thresholds at the same time. It is useful to explain the P1 and P2 estimation procedures before presenting the results. Therefore, an example of the relationship between the marginal and joint distributions of flood peak and volume, and of the calculation of joint occurrence probabilities corresponding to the 50-year return period (i.e. P1_50 and P2_50) for the current and future periods, is provided in Fig. 10 for a CRCM grid cell. In the figure, the 50-year return values of flood peak and volume are 572.8 m³/s and 622.8 MCM (million cubic meters), and the cumulative probabilities of these thresholds in the future marginal distributions are 0.9313 and 0.8349, respectively. Although the KCC for the future period (0.4805) is smaller than that for the current period (0.6460), the future joint cumulative probability (0.7938) is much smaller than the current joint cumulative probability (0.9617) because the marginal cumulative probabilities in the future climate are smaller than those in the current climate for the two flood characteristics. The estimated values of P1_50 and P2_50 in the future climate for this sample CRCM grid cell are five and 14 times larger, respectively, than those in the current climate.

Fig. 10 An example of marginal and joint distributions of flood peak and volume for current and future periods for a representative CRCM grid point. Estimation procedures of current and future joint cumulative probabilities based on the current 50-year return period threshold are also shown
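The worked example in Fig. 10 can be approximated with a short sketch. It assumes that Eqs. (7) and (8), which are not repeated in this section, take the standard forms P1 = 1 − C(u, v) and P2 = 1 − u − v + C(u, v), and that the Clayton copula parameter is obtained from KCC through the usual relation θ = 2τ/(1 − τ). The function names are illustrative; the input values are those quoted in the text for the sample grid cell.

```python
def clayton_theta(tau):
    """Clayton copula parameter from Kendall's tau (valid for 0 < tau < 1)."""
    return 2.0 * tau / (1.0 - tau)

def clayton_cdf(u, v, theta):
    """Joint cumulative probability C(u, v) of the Clayton copula."""
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)

def joint_probs(u, v, tau):
    """Return (C, P1, P2) for marginal non-exceedance probabilities u and v."""
    c = clayton_cdf(u, v, clayton_theta(tau))
    p1 = 1.0 - c            # at least one characteristic exceeds its threshold
    p2 = 1.0 - u - v + c    # both characteristics exceed their thresholds
    return c, p1, p2

# Current climate: both thresholds are 50-year return values, so u = v = 1 - 1/50
u50 = 1.0 - 1.0 / 50.0
c_cur, p1_cur, p2_cur = joint_probs(u50, u50, 0.6460)

# Future climate: same thresholds evaluated in the future marginal distributions
c_fut, p1_fut, p2_fut = joint_probs(0.9313, 0.8349, 0.4805)

print(f"current: C = {c_cur:.4f}, P1 = {p1_cur:.4f}, P2 = {p2_cur:.4f}")
print(f"future:  C = {c_fut:.4f}, P1 = {p1_fut:.4f}, P2 = {p2_fut:.4f}")
print(f"ratios:  P1 x{p1_fut / p1_cur:.1f}, P2 x{p2_fut / p2_cur:.1f}")
```

Under these assumptions the joint cumulative probabilities come out close to the quoted 0.9617 and 0.7938; small differences are expected because the inputs are rounded. Calling `joint_probs` with fixed marginals and a larger KCC yields a smaller P1 and a larger P2, which matches the dependence on KCC described for Fig. 5.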

Figure 8 presents joint occurrence probabilities P1 of the three pairs of flood characteristics corresponding to current marginal return values of 5-, 20- and 50-year return periods for CRCM–CGCMc (second column), CRCM–CGCMf (third column) and future changes (fourth column) to P1 values with respect to current climate. CRCM–CGCMf yields 38.5 % to more than 200 % larger values compared to CRCM–CGCMc for the entire domain. It is also evident from Fig. 8 that the percentage increase in the probability grows as the return period increases. Regionally, for the peak–volume pair, P1 shows larger increases for the northern basins and smaller increases for the central basins than the domain averaged value. For the peak–duration and volume–duration pairs, some grid points along the main streams or at outlets in the northern and central basins (i.e. ARN, GRB, PYR, LGR, CAN and CHU) and southern basins (RDO, STM, BEL, WAS and SAG) show larger increases than the other regions. Clear decreases of P1 for the 5-year return period (P1_5) are observed in the central-eastern part of the region for all three pairs of characteristics. The above results are summarized in Table 5 to ease comparison between changes to P1 values for northern, central and southern basins with respect to the domain averaged changes. The regional increases in the joint probability are consistent with those in the marginal flood characteristics. As shown in Fig. 7, future flood peak is associated with a large increase for the northern basins and future flood volume with a large increase for the northern and southern basins; however, both future flood peak and volume yield smaller increases for the central basins than for other regions.

Figure 9 presents joint occurrence probability P2 for the three pairs of flood characteristics corresponding to current marginal return values of 5-, 20- and 50-year return periods for CRCM–CGCMc (second column), CRCM–CGCMf (third column) and future changes in P2 with respect to current climate (fourth column). In general, increases in P2 values are larger than those of P1 for all three flood characteristic pairs and return periods. The percentage increase in the probability grows considerably with the return period. Although the spatial distribution of the future increase in P2 is less clear than that of P1, it is basically similar in character. For the peak–volume pair, future P2 shows larger increases for the northern basins and smaller increases for the central basins. For the peak–duration and volume–duration pairs, some grid points along the main streams or at outlets in the northern and central basins (i.e. ARN, GRB, PYR, LGR, CAN and CHU) and southern basins (RDO, STM, BEL, WAS and SAG) show larger future increases in P2 than the other regions. A summary of the above results is presented in Table 5 to ease comparison between changes to P2 values for northern, central and southern basins with respect to the domain averaged values.

4.3.2 Merged series analysis approach

Figure 11 presents marginal return values and joint occurrence probabilities P1 and P2 for the 50-year return period for CRCM–CGCMc and CRCM–CGCMf, estimated using merged series. Percentage changes in marginal return levels and P1 and P2 values are also shown. The 50-year return level is the most extreme event considered in this study and it is also associated with higher uncertainty in the estimated flood characteristics and joint occurrence probabilities. The merged series analysis has a statistical advantage over the ensemble averaged approach based on the individual series analyses, as the former uses a larger sample size than the latter, which helps reduce the range of uncertainty associated with return levels of longer return periods. A comparison of the 50-year return levels shown in Figs. 7 and 11 suggests that the two analyses produce nearly the same spatial patterns and changes for the three flood characteristics. Similarly, a comparison of the joint occurrence probability P1 shown in Figs. 8 and 11 suggests broadly similar spatial patterns for the three flood characteristic pairs. However, the increases obtained with the merged series analysis approach are much smaller than those obtained with the ensemble averaged approach for the three pairs of flood characteristics. A comparison of the joint occurrence probabilities shows that the ensemble averaged approach, on average, projects 1.2–2.6 times larger future increases for the three pairs of flood characteristics than the merged series analysis approach. Percentage increases are similar for short (i.e. 5- or 20-year) return periods but differ considerably for longer (i.e. 50-year) return periods (detailed results are not shown). Ratios of regionally averaged changes (mainly increases) to entire domain averaged changes in the mean and SD of seasonal maximum values, in selected marginal return values of 5- and 50-year return period, and in joint occurrence probabilities P1 and P2 of flood characteristics are provided in Table 5. The results of this table further support the correspondence between the results of the ensemble averaged and merged series approaches presented above.
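The difference between the two approaches can be illustrated with a small sketch on synthetic data. Five hypothetical 30-value series stand in for the five ensemble members, and a Gumbel distribution fitted by the method of moments is swapped in for the GEV fit used in the study to keep the example short. None of the numbers correspond to CRCM output.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649015329

def gumbel_return_level(sample, T):
    """Fit a Gumbel distribution by the method of moments; return the T-year level."""
    scale = statistics.stdev(sample) * math.sqrt(6.0) / math.pi
    loc = statistics.mean(sample) - EULER_GAMMA * scale
    return loc - scale * math.log(-math.log(1.0 - 1.0 / T))

def synthetic_member(rng, n_years, loc=500.0, scale=100.0):
    """Hypothetical seasonal maxima drawn from a Gumbel distribution."""
    return [loc - scale * math.log(-math.log(rng.uniform(1e-12, 1.0 - 1e-12)))
            for _ in range(n_years)]

rng = random.Random(0)
members = [synthetic_member(rng, 30) for _ in range(5)]  # five 30-value series

# Ensemble averaged approach: analyse each member separately, then average
member_levels = [gumbel_return_level(m, 50) for m in members]
ensemble_avg_level = statistics.mean(member_levels)

# Merged series approach: pool the five members into one 150-value sample
merged = [x for m in members for x in m]
merged_level = gumbel_return_level(merged, 50)

print(f"member 50-year levels: {[round(x, 1) for x in member_levels]}")
print(f"ensemble average: {ensemble_avg_level:.1f}, merged series: {merged_level:.1f}")
```

The spread among the individual member estimates gives a feel for the sampling uncertainty of a 50-year level estimated from only 30 values, which the pooled 150-value sample is intended to reduce.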

Fig. 11 Fifty-year return levels of (a) peak, volume and duration and their joint occurrence probabilities (b) P1 and (c) P2 computed using merged longer samples for CRCM–CGCMc (column 1) and CRCM–CGCMf (column 2). Percentage difference between CRCM–CGCMf and CRCM–CGCMc is shown in column 3

5 Summary and conclusions

In the present work, climate change impacts on three spring (March–June) flood characteristics, i.e. peak, volume and duration, for 21 northeastern Canadian basins are evaluated using univariate and copula based bivariate frequency analyses using CRCM current (1979–1999) and future (2041–2070) climate simulations. The mutually correlated nature of the three pairs of flood characteristics (i.e. peak–volume, peak–duration and volume–duration) visible clearly in the observed records and the CRCM simulated flood events for the study area supports the necessity for bivariate flood frequency analysis. Prior to assessing projected changes to flood characteristics, basic statistics (i.e. mean and interannual standard deviation) of the seasonal maximum values of flood characteristics and results of univariate and bivariate frequency analyses for CRCM–ERA40 are compared to those observed at 16 gauging stations in order to evaluate the performance of CRCM. A similar comparison of CRCM–CGCMc results with CRCM–ERA40 is also performed to assess the lateral boundary forcing errors. The main results are summarized below:

  • Comparison of CRCM–ERA40 simulated flood characteristics with those observed at 16 gauging stations suggests that the model captures these characteristics reasonably well. The R-squared values between CRCM–ERA40 and observed basic statistics and marginal return values are generally larger than 0.8 for flood peak and volume and lie between 0.3 and 0.6 for flood duration. The percentage differences between observed and CRCM–ERA40 values are modest (less than 4 %) for the joint occurrence probability P1, and are less than 26 % for the joint occurrence probability P2.

  • Comparison of the basic statistics and frequency analyses of flood characteristics for CRCM–CGCMc with those of CRCM–ERA40 helped assess boundary forcing errors. In general, the boundary forcing errors are found to be smaller than the differences between CRCM–ERA40 and observations, which are generally referred to as performance errors.

  • Though there are important regional differences, the average projected increase to flood peak, volume and duration for the 21 basins is 7.4, 12.8 and 10.8 %, respectively, while the interannual standard deviation of these flood characteristics is projected to increase by 8.7, 14.1 and 39.6 %, respectively. On average, the projected changes to marginal return levels of flood peak, volume and duration suggest increases in future climate. Huziy et al. (2012) and Clavet-Gaumont et al. (2012), who studied projected changes to flood peaks for the same 21 watersheds, also reported similar future increases.

  • The projected increases in the flood peak, volume and duration for the majority of the basins are caused by increased winter and spring precipitation and warmer spring temperatures, leading to increased spring snowmelt in future climate. Such increases were also reported recently by Huziy et al. (2012).

  • Projected changes to the joint occurrence probabilities P1 and P2 of the peak–volume, peak–duration and volume–duration pairs of flood characteristics were studied for the very first time for the 21 watersheds considered in this study. Results suggest future increases for both P1 and P2, with larger increases for longer return periods than shorter return periods.

  • Comparison of projected changes obtained using the ensemble-average approach and the merged series approach, in which values of flood characteristics from five different simulations are merged to create longer samples for all grid points (i.e. 150 values for each grid point), shows that the merged series approach projects relatively smaller increases to future joint occurrence probabilities P1 and P2 than the ensemble-average approach. The merged series approach helps reduce uncertainty associated with return values of longer return periods. Thus, taking into account uncertainties associated with short samples, projected joint occurrence probabilities for the merged series case appear to be more reliable than those of the ensemble-average approach.

The results of this study are useful for a number of sectors, including water resources and flood risk management, the hydropower industry and environmental management. Information on projected changes to the joint occurrence probability of flood characteristics, particularly P2 related to flood peak and volume, is necessary for the management of hydroelectric projects and infrastructure facilities. While planning of adaptation strategies and risk management based on P2 would minimize risks, those based on P1 would involve some additional risk.

The projected changes in flood characteristics presented in this study are based on a five-member ensemble of the CRCM driven by five different members of the CGCM. This limited ensemble describes only the uncertainty of the CRCM–CGCM projections based on the A2 scenario. A multi-model ensemble approach would be necessary to quantify the influence of other sources of uncertainty (e.g. those due to model formulation, future emission scenarios, choice of CGCM lateral boundary conditions) on the impacts of future climate change over the study area. For instance, Sushama et al. (2006) investigated climate change impacts on the climatological mean and extremes for major climatic regions in North America based on two different versions of the CRCM. They reported that high-flow characteristics, particularly the seasonal distribution of high-flow events and selected return levels, can be more sensitive to model formulation.