In the past several years, extreme weather and climate events such as extreme daily temperatures and rainfall amounts, regional heatwaves, and hurricanes have received increased attention due to their significant impact on natural environments and human activities, and the fact that they are projected to occur with increased frequency (IPCC SREX 2012). In order to facilitate the analysis of such extreme events, projects such as the STAtistical and Regional dynamical Downscaling of EXtremes for European regions (STARDEX; Haylock and Goodess 2004) and the Expert Team on Climate Change Detection and Indices (ETCCDI; e.g. Klein Tank et al. 2009; Zhang et al. 2011; Sillmann et al. 2013a) have defined a set of climate indices that provide a good mixture of daily statistics to assess changes in temperature and precipitation regimes in terms of duration, intensity and occurrence. For the daily-precipitation indices, three categories can be defined:

  1. Mean and strong intensity indices that focus on fixed or relative thresholds, which describe features in the mean and highest precipitation amount per day, such as:

    • The “simple daily intensity” index, SDII, defined as the mean precipitation amount on wet days (days with precipitation ≥1 mm/day).

    • The “precipitation due to heavy, very heavy and extreme precipitation days” indices, R90pTOT, R95pTOT and R99pTOT, which are defined as the percentage of total precipitation due to days with precipitation exceeding the 90th, 95th and 99th percentiles of the climatological distribution for wet-day amounts.

  2. Occurrence indices that refer to counts of days when the amount of daily precipitation exceeds a specified fixed threshold, such as:

    • The “number of wet days” index, R1mm, which is defined as the number of days with precipitation ≥1 mm/day.

    • The “Xmm precipitation days” indices, RXmm, which are defined as the number of days with precipitation ≥X mm/day, where X is usually a local value that delineates heavy (or very heavy) daily precipitation (for example, in extratropical regions, the R10mm is used as heavy intensity index and R20mm is used as very heavy intensity index).

  3. Duration indices that assess periods of persistence or maximum duration in wet/dry day sequences, such as:

    • The “consecutive wet days” index, CWD, defined as the maximum number of consecutive days with precipitation ≥1 mm/day,

    • The “consecutive dry days” index, CDD, defined as the maximum number of consecutive days with <1 mm/day.
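
To make the intensity definitions concrete, the following Python sketch computes SDII and a percentile-based RnnpTOT index from a one-dimensional series of daily totals. It is an illustration under stated assumptions only: the function names `sdii` and `rpp_tot`, the use of NumPy's default percentile interpolation, and the choice of total wet-day precipitation as the denominator are illustrative assumptions rather than part of the cited index definitions.

```python
import numpy as np

WET_DAY_MM = 1.0  # wet-day threshold (mm/day), as in the definitions above


def sdii(daily_precip):
    """Simple daily intensity: mean precipitation over wet days (>= 1 mm/day)."""
    p = np.asarray(daily_precip, dtype=float)
    wet = p[p >= WET_DAY_MM]
    return float(wet.mean()) if wet.size else float("nan")


def rpp_tot(daily_precip, reference_precip, percentile=95.0):
    """Percentage of wet-day precipitation falling on days above a percentile.

    Assumption: the percentile is estimated from the wet days of
    `reference_precip`, a stand-in for the climatological base period.
    """
    p = np.asarray(daily_precip, dtype=float)
    ref = np.asarray(reference_precip, dtype=float)
    ref_wet = ref[ref >= WET_DAY_MM]
    total = p[p >= WET_DAY_MM].sum()
    if ref_wet.size == 0 or total == 0.0:
        return float("nan")
    threshold = np.percentile(ref_wet, percentile)
    return float(100.0 * p[p > threshold].sum() / total)


# Hypothetical daily totals (mm/day), for illustration only
sample = [0.0, 0.5, 2.0, 4.0, 6.0, 0.0, 20.0]
print(sdii(sample))                   # 8.0
print(rpp_tot(sample, sample, 50.0))  # 81.25
```

In the toy series the wet days are 2, 4, 6 and 20 mm, so SDII is 8 mm/day; with the median of the wet days (5 mm) as threshold, the two largest days contribute 26 of 32 mm.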

These daily-precipitation indices have been used widely to analyse extremes in observations (e.g., Frich et al. 2002; Alexander et al. 2006; Costa and Soares 2009; Donat et al. 2013), in historical simulations and climate-change projections (e.g., Alexander and Arblaster 2009; DeAngelis et al. 2013; Sillmann et al. 2013a, b) and in Regional Climate Model (RCM) simulations (e.g., Gao et al. 2002; Im et al. 2011; Sylla et al. 2012; Roy et al. 2012). The occurrence and duration indices are typically used in impact studies where fixed thresholds are related to local extremes, while percentile indices are generally employed in climate-change detection studies because they allow changes in the same parts of the precipitation distribution to be compared over large regions (Costa and Soares 2009). However, relatively little work on daily precipitation extremes in RCMs over the Sahel can be found in the literature, because of the limited availability of daily-precipitation observations with suitable spatial resolution over this region. For example, Sylla et al. (2012) analysed daily precipitation indices over the whole African continent, and found that the simulated statistics from the RegCM3 RCM are mostly within the relatively wide range (of uncertainties) of the three observed datasets used as reference.

The Sahel is a semiarid region of Africa extending between 12°N and 20°N, from Senegal in the west toward Sudan in the east. The region is particularly vulnerable to changes in precipitation frequency and intensity because its economy and food supply depend strongly on local non-irrigated agriculture, which in turn strongly depends on the rainy season. This area is characterised by large spatial–temporal rainfall variability (Sylla et al. 2010). The beginning and the end of the rainy season in West Africa are associated with the seasonal migration of the Inter-Tropical Convergence Zone (ITCZ). During May and June, the ITCZ remains quasi-stationary around 5°N (corresponding to the first of two rainy seasons during each year for the Guinean Coast region), followed by an abrupt northward shift at the end of June and a stabilisation at around 10°N during July and the whole month of August. During September, the ITCZ retreats gradually toward the south, bringing the second rainy season over the Guinean region and the end of the Sahel rainy season (Sultan et al. 2003, 2005).

A substantial number of studies can be found in the literature focusing on the mechanisms that can explain the geographical distribution of the African seasonal-mean rainfall and its inter-annual variability. For example, the African Multidisciplinary Monsoon Analysis project (AMMA; Redelsperger et al. 2006) was designed to address the evaluation of climate and weather forecast models with respect to the West Africa Monsoon (WAM) system and the understanding of its physical mechanisms. Other projects have also focussed on the WAM area, such as the AMMA Model Intercomparison Project (AMMA-MIP; Hourdin et al. 2010), the AMMA land-surface Model Intercomparison Project (ALMIP; Boone et al. 2009) and the AMMA “Couplage de l’Atmosphère Tropicale et du Cycle Hydrologique” (AMMA-CATCH) project (Lebel et al. 2009). Other projects on African climate evaluation are the WAM Modelling and Evaluation (WAMME) project (Xue et al. 2010; Druyan et al. 2010), the Ensembles-based Predictions of Climate Change and their Impacts (ENSEMBLES) African project (Paeth et al. 2011) and the COordinated Regional climate Downscaling EXperiment over AFRICA (AFRICA-CORDEX; Giorgi et al. 2009). Despite the large number of studies, there is still a limited understanding of the interactions between local and remote processes, and of the interactions across weather and climate time scales, which makes the WAM simulation a challenging task for both Global Climate Models (GCMs) and RCMs. The results show a wide spread in the models' skill over the WAM region (see Chapter 9 of IPCC 2013). Some models fail to adequately simulate important dynamical features such as the upper-troposphere Tropical Easterly Jet (TEJ) and the mid-troposphere African Easterly Jet (AEJ), resulting in an improper latitudinal location of the rainfall band and leading to important precipitation biases in the Sahel region (e.g., Druyan et al. 2010; Nikulin et al. 2012; Hernández-Díaz et al. 2012; Roehrig et al. 2013; Rowell 2013). Sylla et al. (2010) showed that the intra-seasonal fluctuations in the position of the monsoon rain belt are strongly affected by the AEJ core and its north–south displacement during the monsoon season, as a more equatorial position of the AEJ favours dry conditions over the Sahel.

The start of the rainy season in the Sahel determines the optimal planting time there, as the germination and growth stages of plants require water availability (Sultan et al. 2005; Marteau et al. 2011; Vellinga et al. 2013). Very heavy rainfall events can cause floods, while the occurrence or persistence of dry spells during the rainy season can cause severe droughts, with a huge impact on crops and therefore on the food supply. Sultan et al. (2005), citing Ingram et al. (2002), mention that the most important rainfall parameters for agriculture in West Africa are the timing of the onset and of the end of the rainy season, as well as the rainfall distribution and water deficits within the rainy season. It is therefore important to evaluate the performance of climate models in representing not only key aspects of the mean seasonal precipitation, as is done in the majority of papers found in the literature, but also the higher-order statistics of daily precipitation, such as the frequency distribution and the occurrence, duration and intensity of daily-precipitation amounts.

The primary objective of this study is to evaluate the daily-precipitation statistics and the monsoon onset/retreat as simulated by two Canadian RCMs, the CanRCM4 from the Canadian Centre for Climate Modelling and Analysis (CCCma) and the CRCM5 from University of Québec at Montréal (UQAM). The simulations evaluated here came from the evaluation experiment of the AFRICA-CORDEX project, which was designed to evaluate RCMs’ runs over the past two decades (1989–2008) over a predefined domain covering the whole African continent, using common horizontal grid spacing, experimental protocol and output format. For this period, the two RCMs are driven by the European Centre for Medium-range Weather Forecasting (ECMWF) ERA-Interim reanalyses (Uppala et al. 2008; Dee et al. 2011).

Our evaluation work precedes the application of the simulations in health and agriculture impact studies within the FACE project (“Faire-face Aux Changements Ensemble”) of the International Research Initiative on Adaptation to Climate Change (IRIACC), in which both Environment Canada (EC) and UQAM collaborate with various institutions in Niger (ACMAD, AGRHYMET and CERMES). One of the FACE project’s goals is to support the adaptation of vulnerable populations in Niger to climate variability and change. Therefore, although the RCM simulations are carried out over the whole African domain, the main focus here is on the WAM area, especially the western Sahel (15°W–15°E and 12°N–20°N; see Fig. 1).

Fig. 1

AFRICA-CORDEX domain, with the West-Africa (blue color) and West-Sahel (red color) sub-regions used in the analysis. The red dashed lines within the West-Sahel area mark the 14° and 17° latitudes used to separate the Sahel into southern, central and northern regions

First, we address the RCM evaluation for daily precipitation indices, also highlighting the challenge of evaluation in data-sparse regions where the only practical products are gridded observations from satellite estimates. We will also analyze the spatial–temporal daily-precipitation frequency distribution. Second, we evaluate the RCMs' performance in simulating the monsoon onset/retreat over West Africa.

The paper is organized as follows. Section 2 describes the two RCMs, the reanalyses and the sets of gridded observations used for model evaluation, along with the study area and the methods. Section 3 presents the results for the seasonal-mean rainfall climatology (Sect. 3.1), the daily-precipitation indices and frequency distributions over the Sahel (Sect. 3.2), and the monsoon onset/retreat (Sect. 3.3). A summary and conclusions are provided in Sect. 4.

Data, study area and methods

Under the AFRICA-CORDEX evaluation project (Giorgi et al. 2009; Nikulin et al. 2012), modeling groups are invited to provide simulations at 0.44° and 0.22° horizontal meshes, integrated over the entire African continent, and covering the past two decades (1989–2008) driven by the ERA-Interim reanalysis products. Three RCM simulations are used here, i.e. two runs at 0.44° and one run at 0.22°, using the two recently developed Canadian RCMs presented in the following sub-section.

Model characteristics

The CanRCM4 model, developed by the Canadian Centre for Climate Modelling and Analysis division of the Climate Research Branch of Environment Canada (CCCma/EC), and the CRCM5 (Hernández-Díaz et al. 2012), developed by the Centre pour l’Étude et la Simulation du Climat à l’Échelle Régionale (ESCER Centre) at UQAM, are the two Canadian RCMs used in this study. These two models share the same dynamics from the Environment Canada GEM environmental multiscale forecast model (Côté et al. 1998a, b), but use different packages of physical parameterizations. Their main characteristics and physical parameterizations are presented in Table 1 and are briefly described below.

Table 1 CCCma CanRCM4 and UQAM CRCM5 configuration for the AFRICA-CORDEX simulations analysed in this paper

CRCM5 shares several physics modules with the limited-area numerical weather prediction model GEM (Yang et al. 2012; Mailhot et al. 2010). However, it uses the interactive FLake lake module (Martynov et al. 2010, 2012) and the land surface scheme CLASS 3.5 (Verseghy 2000, 2008). The CRCM5 simulation analysed here was the object of a first analysis over Africa presented in Hernández-Díaz et al. (2012), which focussed on the mean seasonal climatology and the diurnal and annual cycles. For this simulation, the CLASS scheme used three layers with depths of 0.10, 0.25 and 3.75 m, and the standard CLASS distributions of sand and clay fields as well as the bare soil albedo values were replaced by data from the ECOCLIMAP database (Masson et al. 2003). The parameterisations of subgrid-scale processes include the Kain and Fritsch (1990) deep-convection scheme, the Kuo-transient scheme for shallow convection (Kuo 1965; Bélair et al. 2005), the Correlated-K Distribution method for solar and terrestrial radiation (Li and Barker 2005), and a subgrid-scale mountain gravity-wave drag (McFarlane 1987). The model also uses a low-level orographic blocking (Zadra et al. 2003) and a turbulent kinetic energy closure in the planetary boundary layer and vertical diffusion (Benoit et al. 1989; Delage and Girard 1992; Delage 1997). Aerosols are not yet implemented in this version of the model, and the model uses the Sundqvist scheme for large-scale condensation (Sundqvist et al. 1989). CRCM5 is a one-way nested model with a ten grid-point nesting zone, inspired by the work of Davies (1976) and adapted by Robert and Yakimiw (1986) and Yakimiw and Robert (1990). An option also exists for weakly nudging the large scales within the interior of the domain, but this option was not activated for the simulation analysed here. The simulation was initialised in January 1984 with climatological soil moisture and temperature. The atmospheric lateral boundary conditions came from the ECMWF gridded reanalyses ERA-40 for the spin-up period (1984–1988) and ERA-Interim for the main CORDEX study period (1989–2008), while the sea surface temperatures (SSTs) are provided by AMIP2 (Fiorino 1997) for the spin-up period, and by ERA-Interim for the 1989–2008 period. More details concerning the current version of the model can be found in Hernández-Díaz et al. (2012) and Laprise et al. (2013).

The CanRCM4 constitutes a limited-area version of the fourth-generation Canadian Atmospheric global climate Model, CanAM4 (von Salzen et al. 2013). CanRCM4 uses the same parameterizations as CanAM4, including the CLASS 2.7 land-surface scheme with the soil profile divided into three horizontal layers, with thicknesses of 0.10, 0.25 and 3.75 m. The deep-convection scheme is based on the cumulus parameterization of Zhang and McFarlane (1995), while the effects of shallow convection are parameterized following von Salzen and McFarlane (2002) and von Salzen et al. (2005). The absorption by gases in the atmosphere is parameterized using the Correlated-k Distribution method (Li and Barker 2005), while the Monte Carlo Independent Column Approximation (McICA; Barker et al. 2008; Pincus et al. 2003) is used for radiative transfer. The model also uses an anisotropic orographic low-level drag (Scinocca and McFarlane 2000), and the surface exchanges of heat, moisture, and momentum (the boundary layer scheme) follow the treatment of Abdella and McFarlane (1997). The land-use data are from the Global Land Cover 2000 database (European Commission, Joint Research Centre 2003; Bartholomé and Belward 2005). The model uses a prognostic bulk aerosol scheme (von Salzen et al. 2013) and microphysics processes for different types of inorganic and organic aerosol as described in von Salzen and McFarlane (2002). The aerosols at the model lateral boundaries and the other lateral boundary conditions are provided by an adaptive run of the global model (CanAM4) that assimilated reanalysis data through a simple relaxational procedure, which forces the horizontal wind, temperature and specific humidity from CanAM4 to follow those from the reanalysis and also constrains all the other CanAM4 variables to be consistent with the reanalysis variables. The two CanRCM4 simulations analysed in this paper are nested using all physical prognostic variables from the parent global model CanAM4, which was previously nudged toward the ERA-Interim reanalysis. The SSTs came from the ERA-Interim reanalysis and the simulations were started in January 1988 with climatological soil moisture and temperature. Like the CRCM5, the CanRCM4 is a one-way nested model using a Davies (1976) nesting zone, with an optional interior nudging for the large scales. For the CanRCM4 simulations presented in this paper, the interior nudging was turned on, with a relaxation time of 24 h applied to spatial scales larger than 1,000 km.

Satellites and reanalysis datasets

As mentioned previously, the main goal of this paper concerns the evaluation of daily precipitation from the three RCM simulations described above. This requires datasets with spatial resolutions similar to the RCM simulations, and covering a long period of time. Usually, these sets of observations are gridded products that are based on in situ gauge-station observations. A number of such datasets are available today for the African continent, but these data suffer from uncertainty due to limitations in the density of available stations and in the techniques of data blending (e.g. Sylla et al. 2012). One example of such a set of gridded gauge-based daily precipitation observations is the CPC Unified Gauge-based Analysis of Global Daily Precipitation (Chen et al. 2008), which covers the entire globe at 0.5° spatial resolution, from 1979 to present. Unfortunately, the rain-gauge stations are sparse throughout the African continent, especially over tropical Africa; for example, the mean station-to-station distance over tropical Africa is around 500 km, while over the USA it is close to 30 km. Consequently, Chen et al. (2008) do not recommend this dataset for tropical-Africa analysis. As an alternative, we will use gridded daily precipitation from three satellite estimates that also integrate observations from rain-gauge stations, namely:

  • The dataset from Global Precipitation Climatology Project (GPCP-1DD v.1.2; Huffman et al. 2001) is made available from late 1996 to present at 1° horizontal resolution and is using several input data. In the band 40°N–40°S, the daily precipitation is built from the Threshold Matched Precipitation Index, a microwave-calibrated infrared technique. This technique uses 3-hourly infrared brightness temperature histograms (from GOES, Meteosat and GMS geostationary satellites) with locally determined brightness temperature thresholds and precipitation rates based on Special Sensor Microwave Imager (SSMI) and constrained by the GPCP V2.1 monthly satellite-gauge product that incorporates gauge observations from Global Precipitation Climatology Center (GPCC). The gaps are filled with NOAA AVHRR polar-orbiting IR satellite using the GOES Precipitation Index.

  • The TRMM 3B42 version 6 (Tropical Rainfall Measuring Mission; Huffman et al. 2007) dataset provides 3-hourly precipitation data for the entire tropics (50°S–50°N) since 1998 at 0.25° horizontal resolution; the rainfall product is based on a combination of estimates from the passive microwave radiometer and precipitation radar of the TRMM satellite and infrared observations from geostationary satellites (GOES, Meteosat and GMS), which are re-scaled to monthly gauge precipitation observations such as the GPCC and Climate Anomaly Monitoring System (CAMS) products.

  • The recent Africa Rainfall Climatology version 2 (ARC2; Novella and Thiaw 2012) daily precipitation estimates from the Climate Prediction Center (CPC) provide daily precipitation data over Africa from 1983 to present at 0.1° horizontal resolution. The satellite estimates are based upon cloud top temperature (which is used for the GOES Precipitation Index) measured from the 3-hourly geostationary infrared (IR) data centered over Africa, primarily from the European Organisation for the Exploitation of Meteorological Satellites (EUMETSAT). These satellite estimates are then merged with quality-controlled Global Telecommunication System (GTS) gauge observations, which report 24-h rainfall accumulations over Africa, in order to produce the final estimates.

Precipitation in these datasets is an indirect product, obtained by combining several measurements from multiple satellites using various complex algorithms. Consequently, the uncertainty in these observational products is greater than in in situ gauge measurements. However, as mentioned previously, they represent the only practical solution for regions without a dense gauge-measurement network. Although the satellite-estimated precipitation is re-scaled to match rain-gauge measurements at land points where such measurements are available, differences remain among the various final products at the other points. Consequently, in this study, the RCM evaluation is done with respect to all three satellite-estimated products (GPCP, TRMM and ARC2) instead of only one. Using all of them in the evaluation process also provides a measure of the uncertainty in the derived observations. However, it must be pointed out that the three datasets are not totally independent. They use some common satellite measurements, and are also re-scaled to match rain-gauge measurements, which are not totally independent either because the different in situ networks share a large part of the same measurements. Hence, the comparison of the GPCP, TRMM and ARC2 data gives a minimum level for the observation uncertainty. The RCM-simulated data are considered good if they fall within the range of observed values.
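
The acceptance criterion stated above (a simulated value is considered good when it lies within the range spanned by the observational products) can be sketched as follows. This is a minimal illustration, assuming all fields have already been brought to a common grid; the function name `within_observed_range` and the sample numbers are hypothetical and are not taken from the datasets used in this study.

```python
import numpy as np


def within_observed_range(model_field, observed_fields):
    """Boolean mask: True where the model lies inside the observed envelope.

    `observed_fields` is a sequence of arrays on the same grid as
    `model_field` (for instance one array per satellite product).
    """
    obs = np.stack([np.asarray(f, dtype=float) for f in observed_fields])
    lower = obs.min(axis=0)  # point-wise minimum across products
    upper = obs.max(axis=0)  # point-wise maximum across products
    model = np.asarray(model_field, dtype=float)
    return (model >= lower) & (model <= upper)


# Hypothetical index values at two grid points, for illustration only
model = [30.0, 55.0]
products = [[28.0, 40.0], [35.0, 45.0], [25.0, 50.0]]
print(within_observed_range(model, products).tolist())  # [True, False]
```

Because the three products are not fully independent, the envelope obtained this way should be read as a lower bound on the observational uncertainty.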

Along with these satellite-gauge-combined datasets and the three RCM simulations, this study will consider also daily precipitation from four reanalysis products:

  • The ECMWF ERA-Interim reanalysis (Uppala et al. 2008; Dee et al. 2011), which provides precipitation rate at a spatial resolution of 0.75° × 0.75° covering the period 1979 to present.

  • The NCEP/DOE Reanalysis II (Kanamitsu et al. 2002) that provides precipitation rate on a T62 Gaussian grid with 192 × 94 points covering the period 1979 to present (henceforth referred to simply as NCEP).

  • The Twentieth Century Reanalysis (hereafter 20CR; Compo et al. 2011), a global reanalysis dataset with precipitation rate on a global T62 Gaussian grid, spanning the period 1870–2010.

  • The NASA Modern Era Reanalysis for Research and Applications (MERRA; Rienecker et al. 2011), with precipitation that spans 1979 through the present day at a spatial resolution of 2/3° longitude by 1/2° latitude. Of the four reanalyses, MERRA has the closest horizontal resolution to the AFRICA-CORDEX simulations.

Reanalysis products provide long-term atmospheric fields, created via a data assimilation scheme and background forecast model integration over periods of 6–12 h covering the last decades, with the model formulation and assimilation system kept unchanged (Bosilovich et al. 2008). Hence, they depend on the assimilation method and model physics, which leads to uncertainty in the output, especially for the precipitation field, which is not a directly assimilated observation. This field therefore depends strongly on the physical parameterizations of the forecast model. For example, MERRA assimilates directly from observations the surface pressure, temperature, specific humidity, cloud condensate and zonal and meridional wind, which are used in the forecast model to simulate surface convective and large-scale rain rates. On the other hand, for precipitation in the recent period (July 1987 to present), the MERRA assimilation system uses rain-rate estimates from the Special Sensor Microwave Imager and the TRMM Microwave Imager, which are not direct in situ measurements. ERA-Interim uses rain-affected radiances from satellites in the assimilation process rather than derived rain rates, while NCEP/DOE uses a correction method for the soil moisture in the topsoil layer based on the differences between the previous pentad-average precipitation generated by the NCEP model and the CMAP pentad precipitation. 20CR assimilates only synoptic surface pressure and uses monthly SST and sea-ice distributions as oceanic boundary conditions. Because several studies consider reanalyses as an alternative for evaluation studies over regions where observational data are sparse, and when knowledge of the state of the atmosphere on a similar uniform grid is required (e.g. Zhang et al. 2011), a comparison with satellite-based precipitation will bring valuable information.

Study area and space–time evaluation

The daily-precipitation statistics from the 0.22° and 0.44° CanRCM4 simulations and the 0.44° CRCM5 simulation, and also from ERA-Interim, NCEP, 20CR and MERRA, will be compared with those of GPCP, TRMM and ARC2 over West Africa (20°W–25°E; 0°N–25°N), with an emphasis on the West Sahel region (15°W–15°E; 12°N–20°N). Figure 1 presents the AFRICA-CORDEX domain of integration and the two areas used in our analysis (West Africa in blue, West Sahel in red).

The analysis focuses only on the common period across various products described in the previous section, which corresponds to 1998–2008 (11-year statistics). Daily precipitation indices will be computed over the Sahel wet season, corresponding mainly to June, July, August and September (JJAS) months. Because the various datasets correspond to a broad range of horizontal resolution, the indices (see their definition in the following section) are computed initially on their original grid, and then interpolated on the AFRICA-CORDEX grid at 0.44° of resolution.

Precipitation indices and monsoon onset-retreat definition

We selected four indices from the aforementioned occurrence and duration categories that are suitable for the purposes of this research because they are related to the rainfall distribution within the rainy season: R1mm, R10mm, R30mm and CWD.

A day is considered a wet day when daily precipitation is ≥1 mm/day. Therefore, the R1mm index is the seasonal wet-day occurrence index, computed as the number of wet days in the JJAS season. The CWD index is the maximum length of wet spells, given by the greatest number of consecutive wet days during the JJAS season. Threshold daily-precipitation values of 10 and 30 mm/day are used to delimit moderate-intensity and heavy-intensity precipitation days, i.e. R10mm and R30mm correspond to the number of days with daily precipitation ≥10 and ≥30 mm/day, respectively, during the JJAS season.
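
The four selected indices reduce to threshold counts and a run-length search over the JJAS daily series at a grid point. The Python sketch below illustrates this under simple assumptions: the input is a one-dimensional array of daily totals for one season, and the helper names `count_days_at_or_above` and `longest_wet_spell` are hypothetical rather than part of any index software used in this study.

```python
import numpy as np


def count_days_at_or_above(daily_precip, threshold_mm):
    """Number of days with precipitation >= threshold (R1mm, R10mm, R30mm)."""
    p = np.asarray(daily_precip, dtype=float)
    return int(np.count_nonzero(p >= threshold_mm))


def longest_wet_spell(daily_precip, threshold_mm=1.0):
    """CWD: greatest number of consecutive days with precipitation >= threshold."""
    longest = current = 0
    for amount in np.asarray(daily_precip, dtype=float):
        current = current + 1 if amount >= threshold_mm else 0
        longest = max(longest, current)
    return longest


# Hypothetical short "season" of daily totals (mm/day), for illustration only
season = [0.0, 2.0, 12.0, 35.0, 0.4, 1.0, 1.0, 11.0, 3.0, 0.0]
print(count_days_at_or_above(season, 1.0))   # R1mm  -> 7
print(count_days_at_or_above(season, 10.0))  # R10mm -> 3
print(count_days_at_or_above(season, 30.0))  # R30mm -> 1
print(longest_wet_spell(season))             # CWD   -> 4
```

Applied to gridded data, the same functions would be evaluated for each grid point and each JJAS season before averaging over the 11 years.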

In order to determine the timing of the monsoon onset over the WAM area, different indicators have been developed in the literature (e.g., among others, Sijikumar et al. 2006; Fontaine and Louvet 2006; Vellinga et al. 2013). The vast majority of these indicators are based on large-scale precursors for which the band of maximum precipitation is shifted from its spring quasi-stationary position at 5°N to a northward position around 10°N. Therefore, the overall change in the monsoon system can be derived using different variables such as daily rainfall, outgoing terrestrial radiation or moist-static energy. Other indicators use local indices for agricultural applications, with information related to the effective start of the seasonal rainfall in a specific area (e.g. Vellinga et al. 2013). In this paper, two indicators will be used: a large-scale index and a local index.

The large-scale precipitation onset index is usually computed from pentads (5-day means) of precipitation as follows. First, for each year, the area-averaged precipitation amounts are computed for each pentad in a Northern box and a Southern box that encompass the Sahel and the Guinea maxima, respectively; each series is normalised by the highest pentad value of the year over its box (e.g. Sijikumar et al. 2006; Vellinga et al. 2013). Next, the difference between the Northern and the Southern boxes is computed. The result is a large-scale onset function that has values between −1 and 1. The onset date for each year is given by the pentad when the onset function turns positive and remains positive for at least three pentads.

In this paper, the Southern box is defined between 0°N–8°N, while the Northern box is defined between 8°N–16°N. Both boxes extend in longitude from 15°W to 15°E similar to the West Sahel domain presented in Fig. 1. This is different from previous studies that were using smaller boxes (from 10°W to 10°E). Larger boxes are used here in order to facilitate the comparison with the local onset index that will be computed over the West Sahel domain. An 11-year average (1998–2008) is also applied on the large-scale onset pentads obtained for each year.
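
The procedure can be sketched in Python as follows, assuming the area-averaged pentad series for the two boxes are already available for one year. The function names `onset_function` and `onset_pentad` are hypothetical, and the sketch assumes that the run of at least three positive pentads includes the onset pentad itself, which the description above does not state explicitly.

```python
import numpy as np


def onset_function(north_pentads, south_pentads):
    """Normalised Northern-box minus Southern-box pentad precipitation.

    Each series is divided by its own annual maximum, so for non-negative
    precipitation the result lies between -1 and 1.
    """
    north = np.asarray(north_pentads, dtype=float)
    south = np.asarray(south_pentads, dtype=float)
    return north / north.max() - south / south.max()


def onset_pentad(onset_values, min_run=3):
    """0-based index of the first pentad starting a run of at least
    `min_run` consecutive positive values, or None if there is none."""
    positive = np.asarray(onset_values, dtype=float) > 0
    for start in range(len(positive) - min_run + 1):
        if positive[start:start + min_run].all():
            return start
    return None


# Hypothetical area-averaged pentad amounts (mm/day), for illustration only
north = [0.0, 1.0, 2.0, 3.0, 6.0, 8.0, 10.0, 9.0]
south = [2.0, 6.0, 10.0, 8.0, 5.0, 6.0, 4.0, 3.0]
print(onset_pentad(onset_function(north, south)))  # 4
```

An isolated positive pentad followed by a return to negative values is skipped by construction, which mirrors the persistence requirement in the definition.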

The local onset index is inspired by, and slightly modified from, the work of Liebmann and Marengo (2001), who focused on the rainy season in the Brazilian Amazon Basin. This index takes into account not only the occurrence of wet days over time, but also the combined effect of the duration and intensity of rainy sequences in order to define the onset. At each grid point, the daily rainfall accumulated anomalies (A) are computed for each year as the sum of daily-precipitation anomalies from 1st January up to the specific day:

$$A(\mathrm{day}) = \sum_{n = 1\,\mathrm{January}}^{\mathrm{day}} \left( R(n) - \mathbf{R} \right),$$

where R(n) is the daily precipitation on day n, with day running from 1st January to 31 December, and the reference value R is given by:

  (a) The local climatological mean (in our case the 11-year mean over the period 1998–2008) at points where the climatological mean is ≥1 mm/day (therefore a wet day).

  (b) The 1 mm/day value (the threshold that separates dry days from wet days), at points where the local climatological mean is smaller than 1 mm/day.

Therefore, the anomalies are computed at each grid point with respect to a local threshold, given by the 11-year mean (from 1998 to 2008) where this mean is ≥1 mm/day, and by the value of 1 mm/day at points where the 11-year mean is smaller than 1 mm/day.

Figure 2 presents an example of the local onset function and daily precipitation amounts in the GPCP dataset, for 1998, at the grid point closest to Niamey (2.2°E; 13.2°N). At that point, the local climatological mean (in our case the 11-year mean) exceeds 1 mm/day, hence the onset function increases/decreases when the local daily precipitation is greater/less than the local climatological mean. The figure shows that the first one hundred days of the year are dry days, and that the first events with precipitation exceeding the local climatological mean (the first two small increasing portions in Fig. 2a) are two isolated events followed by dry periods. These isolated events must not be considered as indicators of the onset. The day after the annual minimum of the function is the first heavy-rain event that is followed by regular days with precipitation. The rainy season begins with this sequence of rainy events, as it corresponds to persistence in rainfall occurrence, duration and intensity. The maximum of the function (in Fig. 2a) indicates the last important rainy event, which is followed by the rapid establishment of the dry spell. In order to avoid false onsets, we consider the day after the annual minimum of the function as the day when the rainy season becomes established in the region (the local onset day). The day of the annual maximum marks the day when the rainy season ends (the local retreat index). Therefore, the onset in the GPCP dataset at Niamey in 1998 corresponds to day 147 of the year (i.e. May 27th), while the monsoon retreat is given by day 273 of the year (i.e. September 30th). The local onset/retreat will be computed for each grid point, and the climatological value will be given by the 11-year average of the onset/retreat indices obtained for each year.

Fig. 2

The 1998 time variation (in calendar days) of a the local onset function and b daily precipitation, corresponding to the GPCP dataset, at the grid point closest to Niamey

For regions where the local climatological mean is smaller than 1 mm/day, the daily rainfall accumulated anomalies are computed with respect to the 1 mm/day threshold, and hence the onset function generally decreases, since most of the year consists of dry days. In these regions, the annual minimum corresponds to the last day of the year, while the annual maximum corresponds to the first day of the year. Points where the annual minimum occurs after the 1st of October are marked in our analysis as points without a real monsoon onset/retreat.
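The local onset/retreat rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the onset function is taken to be the cumulative sum of daily anomalies relative to the local threshold, the year is a 365-day year (so that 1 October is day 274), and the function name `local_onset_retreat` and its arguments are hypothetical, not part of the original analysis code.

```python
import numpy as np

def local_onset_retreat(daily_precip, clim_mean, floor=1.0, cutoff_doy=274):
    """Return (onset_day, retreat_day) as 1-based days of year, or None.

    daily_precip : 1-D array of daily precipitation (mm/day), index 0 = 1 January
    clim_mean    : local climatological mean precipitation (mm/day)
    floor        : minimum threshold (1 mm/day in the text)
    cutoff_doy   : day of year of 1 October in a 365-day year (assumption)
    """
    p = np.asarray(daily_precip, dtype=float)
    threshold = max(clim_mean, floor)           # local mean, never below 1 mm/day
    onset_function = np.cumsum(p - threshold)   # accumulated daily anomalies
    min_doy = int(np.argmin(onset_function)) + 1  # day of the annual minimum
    max_doy = int(np.argmax(onset_function)) + 1  # day of the annual maximum
    if min_doy > cutoff_doy:
        return None                             # no real monsoon onset/retreat
    return min_doy + 1, max_doy                 # onset = day after the minimum

# Synthetic series (not GPCP data): dry until day 146, wet on days 147-273.
precip = np.zeros(365)
precip[146:273] = 5.0
print(local_onset_retreat(precip, clim_mean=2.0))
```

With a uniformly dry series and a climatological mean below 1 mm/day, the accumulated anomalies decrease monotonically, the minimum falls on the last day of the year, and the sketch returns `None`, which matches the treatment of points without a real onset.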


Seasonal mean climatology

We first analyse the wet-season (JJAS) average precipitation over the West Africa region. Figure 3 presents the spatial distribution of the 11-year-average JJAS-mean precipitation for ARC2, TRMM, GPCP, 0.22° and 0.44° CanRCM4, 0.44° CRCM5, MERRA, 20CR, NCEP and ERA-Interim. The West-Sahel region is highlighted by a red rectangle. In general, the observation products (Fig. 3a–c) have similar spatial precipitation patterns, with maximum values over the mountain regions of Cameroon and Guinea and a Northeast–Southwest gradient over Sahel. However, although their large-scale patterns are similar, they present significant differences at the local scale. The two RCMs and the reanalyses also capture the large-scale spatial patterns of mean precipitation, but while CRCM5, MERRA and ERA-Interim present a narrower band of precipitation, CanRCM4 and 20CR extend the band of precipitation over Sahel. Both 0.44° RCM simulations and the reanalyses have more intense and extended maxima of precipitation over the ocean along the West African coast, while the 0.22° CanRCM4 simulation is drier along the coastal regions of the Gulf of Guinea.

Fig. 3

Averaged (1998–2008) seasonal-mean (JJAS) precipitation (in mm/day) over West Africa for: a ARC2, b TRMM, c GPCP, d 0.22° CanRCM4, e 0.44° CanRCM4, f 0.44° CRCM5, g MERRA, h 20CR, i NCEP and j ERA-Interim. All the fields were regridded on the 0.44° AFRICA-CORDEX grid. The West-Sahel region is highlighted by a red rectangle

The range of spatial correlations and root-mean-squared errors (RMSE) between the precipitation fields over the Sahel region (15°W–15°E; 12°N–20°N) is illustrated in Fig. 4: Fig. 4a shows the Pearson correlation coefficient, while Fig. 4b shows the RMSE between each set of RCM and reanalysis data and the three sets of observations (ARC2, TRMM and GPCP) used as different reference fields (X-axis). The Pearson correlation coefficient reveals the degree of large-scale similarity between the simulated and observed patterns of the fields, whereas the RMSE criterion highlights the small-scale differences among the various products. All three sets of observational data represent the Northeast–Southwest gradient of precipitation quite similarly, with correlation coefficients mostly >0.96. Similar values are obtained for the 20CR (between 0.92 and 0.96) and NCEP (between 0.9 and 0.95) reanalyses, while MERRA, ERA-Interim, CRCM5 and 0.44° CanRCM4 present correlation coefficients that range between 0.79 and 0.96. In the case of 0.44° CanRCM4, a smaller gradient results in a lower correlation, as the simulated band of precipitation extends too far north compared with observed data (see Fig. 3), while for MERRA, ERA-Interim and CRCM5, a greater gradient caused by the dry bias in the North of Sahel has in general a lower impact on the correlation. The 0.22° CanRCM4 has somewhat smaller correlation coefficients (between 0.72 and 0.83) than the other simulations and reanalyses. This is caused by its values being higher than in observations over a large part of the southern Sahel, which changes the pattern of precipitation compared to TRMM and GPCP from a Northeast–Southwest gradient to a North–South gradient.
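The complementary roles of the two criteria can be illustrated with a minimal Python sketch. It assumes that both fields are already on the same grid and that grid points are weighted equally (no area weighting), which is a simplification; the helper name `pattern_scores` is hypothetical.

```python
import numpy as np

def pattern_scores(field, reference):
    """Spatial Pearson correlation and RMSE between two fields on the same grid."""
    f = np.asarray(field, dtype=float).ravel()
    r = np.asarray(reference, dtype=float).ravel()
    valid = np.isfinite(f) & np.isfinite(r)   # ignore missing grid points
    f, r = f[valid], r[valid]
    corr = np.corrcoef(f, r)[0, 1]            # similarity of the spatial pattern
    rmse = np.sqrt(np.mean((f - r) ** 2))     # magnitude of point-wise differences
    return corr, rmse

# Toy reference field and the same pattern with a uniform 1.5 mm/day wet bias.
reference = np.arange(12.0).reshape(3, 4)
corr, rmse = pattern_scores(reference + 1.5, reference)
print(corr, rmse)
```

In this toy case the pattern is unchanged, so the correlation stays at 1 while the RMSE equals the imposed bias; this is why the two scores are read together.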

Fig. 4

a Spatial Pearson correlation coefficient and b root-mean-squared differences in mm/day between the climatological seasonal (JJAS) mean of precipitation over West Sahel (red box in Fig. 3) from each RCM/reanalysis and the three sets of observations (ARC2, TRMM and GPCP)

Concerning the RMSE criterion (Fig. 4b), we can clearly distinguish the three sets of observations from the reanalyses and models. The largest differences between observations are found with respect to the GPCP fields (0.9 and 0.7 mm/day for ARC2 and TRMM, respectively), defining a relatively small interval of uncertainty for the seasonal mean. However, as mentioned in Sect. 2.2, the three sets of observations are not totally independent: they are rescaled to match almost the same gauge measurements. Consequently, this interval of uncertainty does not take into account uncertainty in gauge measurement, but rather the combined effect of resolution and of the integration of various satellite products between datasets. The reanalyses and the models have RMSE varying between 1.3 and 2.0 mm/day, with MERRA presenting in general the best scores. The 0.22° CanRCM4 has poorer scores than the 0.44° version over the West Sahel region, the increased resolution slightly exacerbating the wet bias over Sahel.

In summary, the three observational datasets available over the West Africa region show similar large-scale patterns for the mean seasonal precipitation, with quite high correlation values for all pairs of data products. However, some differences are noted at the local scale, with RMSE over Sahel region being up to 0.9 mm/day. Even if this value seems small, it could affect the identification of wet and dry days towards the northern limit of the WAM area (see the following section). At the large scale, the two RCMs perform quite well in reproducing the main features of the WAM area. Nevertheless, at the local scale, both CanRCM4 simulations have a large band of precipitation that extends too far into Sahel region (i.e. too wet over this area), while CRCM5 simulates too narrow a band of precipitation, which translates into a dry bias over Sahel. These results are in line with previous works using other RCMs (e.g. Sylla et al. 2012; Diallo et al. 2012; Nikulin et al. 2012).

Daily-precipitation indices and frequency distribution

In this section, we analyze the skill of the RCMs and reanalyses in reproducing the daily-precipitation indices over West Africa with respect to the three gridded satellite-gauge observations used as reference: GPCP, TRMM and ARC2. As these products are provided on grids with different horizontal resolutions (i.e. 1° × 1° for GPCP, 0.25° × 0.25° for TRMM and 0.1° × 0.1° for ARC2), as are the reanalyses, all fields have been interpolated onto a common grid (i.e. the CanRCM4/CRCM5 rotated-pole grid). All indices are computed over the JJAS season and correspond to average values over the 1998–2008 period.

The wet days occurrence index (R1mm)

Figure 5 compares the 11-year average values of the wet-day occurrence index (i.e. R1mm, expressed in number of days) in the JJAS season between the three sets of gridded observations, the three RCM simulations and the four reanalysis products over West Africa. The three sets of gridded observations have similar large-scale patterns, the largest number of wet days being observed in regions with maximum seasonal precipitation (see Fig. 3). However, the extent and amplitude of these maxima are quite different in the three sets of gridded observations. While GPCP displays large areas with more than 90 wet days within the season, TRMM and ARC2 have only small regions with such high values over the western Guinean plateau and the eastern Cameroon Mountain. Nevertheless, over Sahel, the three sets show more similar fields characterised by a gradual decrease in the number of wet days from south-west to north-east. Figure 6a shows the RMSE (in number of days) between each field displayed in Fig. 5 and the three sets of gridded observations over West Sahel (the red rectangle in Fig. 5). The RMSE between observations over Sahel region is up to 10 days, the largest differences being with respect to the TRMM data, which present a smaller number of wet days in the South Sahel than the other two sets of gridded observations.
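As a reminder of how such occurrence indices are obtained, the following Python sketch counts the days at or above a fixed threshold along the time axis. The function name `rxmm` and the array layout (time first, grid points second) are assumptions made for this illustration only.

```python
import numpy as np

def rxmm(daily_precip, threshold_mm):
    """Number of days with precipitation >= threshold_mm along axis 0 (time)."""
    p = np.asarray(daily_precip, dtype=float)
    return np.sum(p >= threshold_mm, axis=0)

# Three days (rows) at two grid points (columns), values in mm/day.
sample = np.array([[0.0, 1.0],
                   [12.0, 0.5],
                   [30.0, 10.0]])
print(rxmm(sample, 1.0))    # wet days (R1mm)
print(rxmm(sample, 10.0))   # R10mm
print(rxmm(sample, 30.0))   # R30mm
```

Applying the function to the 122 days of each JJAS season and averaging the yearly counts over 1998–2008 would give climatological values of the kind mapped in the figures.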

Fig. 5

Comparison of seasonal (JJAS) mean (computed over the 1998–2008 period) values of wet days occurrence (R1mm index in days) over West Africa between: a ARC2, b TRMM, c GPCP, d 0.22° CanRCM4, e 0.44° CanRCM4, f 0.44° CRCM5, g MERRA, h 20CR, i NCEP and j ERA-Interim. All the fields were regridded on the 0.44° AFRICA-CORDEX grid. The West-Sahel region is highlighted by a red rectangle

Fig. 6

Comparison of seasonal (JJAS) mean (computed over the 1998–2008 period) values of R1mm over West Sahel (red box area in Fig. 5) from both RCMs and reanalysis products: a the root-mean-squared differences (in number of days) with respect to the three sets of gridded observations (ARC2, TRMM and GPCP) and b Taylor diagram with respect to TRMM

The spread between the results of RCMs and reanalyses is much larger than between observations (Fig. 5). The overall picture shows that the CanRCM4 simulations and the MERRA, ERA-Interim and 20CR reanalyses exhibit large and quite homogeneous regions with a high number of wet days not present in the observed datasets, especially south of 15°N. On the other hand, CRCM5 and NCEP are closer to the GPCP field, presenting similar patterns with similar amplitudes. If only the Sahel region is considered, the CRCM5 field is close to the observational uncertainty, presenting RMSE between 8 (with respect to GPCP) and 11 days (with respect to TRMM), while the CanRCM4 simulations display a quite large number of wet days, with RMSE between 52 and 66 days within the season (Fig. 6a). Quite large RMSE values are also present in the MERRA and 20CR fields. The differences between CRCM5 and GPCP are due to the narrower band of seasonal wet days in the model compared to observations, with a dry bias in the North Sahel for this RCM. By contrast, the CanRCM4 simulations have a larger band of seasonal wet days that extends over Sahel, inducing an overestimation of mean seasonal precipitation in this region (Figs. 3, 6a). Overall, the two CanRCM4 simulations at different resolutions are very similar, but some finer details in the 0.22° simulation over the Gulf of Guinea region seem to be closer to observations than in the coarser 0.44° simulation, mainly due to the better representation of coastlines (or air-sea contrast) and topography in the higher-resolution simulation.

A summary of the differences or similarities between observed and simulated fields over Sahel region is presented in Fig. 6b, which displays the Taylor (2001) diagram with respect to the TRMM field. The spread present in the set of observations makes it difficult to choose one reference field for the Taylor diagram. In order not to underestimate the uncertainties between observations, we choose TRMM as reference for the R1mm index evaluation, because the largest mean differences between the sets of observations over Sahel are with respect to this dataset. We have also plotted the other two sets of observations on the diagram in order to give some measure of the uncertainty in observations. The Taylor diagram shows three statistical criteria with respect to one reference field: the black radial lines represent the spatial correlation coefficients between the simulated/reanalysis/observed and the reference field; the black dotted circles show the spatial standard deviation ratio between the simulated/reanalysis/observed and the reference field; and the blue dashed circles correspond to the RMSE once any overall bias has been removed, i.e. after subtracting the spatial mean and dividing by the spatial standard deviation of the reference field. The diagram confirms that the differences between the sets of observations over Sahel are primarily due to differences in the spatial variability, both other sets of observations having a larger spatial standard deviation than the TRMM field. Concerning the RCMs and the reanalyses, the best scores are obtained for NCEP, which has a high correlation coefficient (0.95), a spatial standard deviation similar to that of TRMM, and one of the lowest RMSE. CRCM5 and ERA-Interim also present similar patterns, as reflected by a high correlation coefficient (0.96). However, their spatial standard deviation over Sahel is overestimated with respect to observations because both are affected by a dry bias or lack of wet days in the North Sahel. The overestimation of wet days over Sahel in the two CanRCM4 simulations induces a smaller correlation coefficient (around 0.72), and also a larger spatial standard deviation than in observed data. MERRA and 20CR, with an excessive gradient in wet days over Sahel, have the largest spatial standard deviation, i.e. the strongest overestimation of spatial variability among the RCM and reanalysis datasets with respect to observed data.
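The three quantities read from a Taylor diagram can be computed as in the short Python sketch below. It is an illustration only: grid points are weighted equally, population standard deviations are used, and the function name `taylor_statistics` is hypothetical.

```python
import numpy as np

def taylor_statistics(field, reference):
    """Correlation, standard deviation ratio and normalised centred RMSE."""
    f = np.asarray(field, dtype=float).ravel()
    r = np.asarray(reference, dtype=float).ravel()
    corr = np.corrcoef(f, r)[0, 1]
    std_ratio = f.std() / r.std()                    # spatial variability ratio
    centred_diff = (f - f.mean()) - (r - r.mean())   # overall bias removed
    crmse = np.sqrt(np.mean(centred_diff ** 2)) / r.std()
    return corr, std_ratio, crmse

reference = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
field = np.array([2.0, 1.0, 4.0, 3.0, 7.0])
corr, ratio, crmse = taylor_statistics(field, reference)
# Geometric relation underlying the diagram (law of cosines, Taylor 2001).
print(np.isclose(crmse ** 2, 1.0 + ratio ** 2 - 2.0 * ratio * corr))
```

The final check shows why the three criteria fit on a single two-dimensional diagram: once the standard deviation ratio and the correlation are fixed, the normalised centred RMSE follows from them.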

The moderate intensity index of rainfall (R10mm)

The R10mm index analyzed in this section gives the number of days in the JJAS season with precipitation ≥10 mm/day, as this threshold corresponds to moderate intensity values of rainfall over Sahel regions (see Figs. 8, 20 in Gachon et al. 2007). Its 11-year average (1998–2008) is displayed in Fig. 7 over the WAM area, while the corresponding RMSE over Sahel region is presented in Fig. 8a. The three sets of gridded observations present similar features as for the R1mm index: the two primary maxima are related to regions with high mean precipitation values and wet-day occurrences, GPCP showing a much larger extent of these maxima than the TRMM and ARC2 data. Over Sahel, the number of days decreases from south-west to north-east, where all three sets have very small R10mm values (≤2). The largest differences between observations over Sahel are obtained with respect to GPCP data (RMSE over Sahel of up to 4 and 5 days during the JJAS season), which exhibit higher values in the Southwest Sahel than ARC2 and TRMM. Because the largest mean differences between the sets of observations over Sahel are with respect to GPCP, we choose GPCP as reference for the R10mm Taylor diagram over Sahel, which is displayed in Fig. 8b. The Taylor diagram shows an even larger difference in the spatial variance of observations over Sahel in R10mm than in R1mm. However, the correlation coefficients between the three sets of observations are high (larger than 0.95), showing a good overall agreement among them.

Fig. 7

Same as Fig. 5 but for the R10mm index (in number of days)

Fig. 8

Same as Fig. 6 but for the R10mm index

The CRCM5 R10mm field also presents characteristics similar to those of its R1mm field, with maxima extending as in GPCP, but with a slightly larger amplitude and a dry bias in the North Sahel due to the narrower band of moderate precipitation compared to observations (Fig. 7). This is reflected by a spatial standard deviation ratio close to that of GPCP and a good correlation coefficient (above 0.85; Fig. 8b). Both CanRCM4 simulations present R10mm indices smaller than in observations over land and underestimate spatial variability, especially in the 0.22° simulation (Figs. 7, 8b). Given that the CanRCM4 simulations overestimate wet-day occurrence over all of West Africa and Sahel, this suggests that the excessive occurrence of wet days in CanRCM4 is linked to light precipitation events, smaller than 10 mm/day. However, both CanRCM4 simulations present correlation coefficients with GPCP over Sahel >0.87, slightly better than for CRCM5 (see Fig. 8b).

The 20CR also presents better results for R10mm than for R1mm, with a spatial correlation coefficient with GPCP of 0.97 (Fig. 8b) and RMSE over Sahel between 4 and 8 days (Fig. 8a). However, this reanalysis has higher values over South Sahel and smaller values over North Sahel, which are reflected in a higher spatial variance in the Taylor diagram. MERRA and ERA-Interim present the largest RMSE with respect to the three sets of observations due to their very small R10mm values over almost all of Sahel (see Fig. 7).

The high intensity index of rainfall (R30mm)

For Sahel regions, the R30mm index corresponds to heavy precipitation values but not necessarily to very high values as defined by ETCCDI (i.e. near, or slightly below or above, the 90th percentile of rain days; see Figs. 8, 22 in Gachon et al. 2007). Figure 9 compares the climatological mean values of this index among the observed, reanalysis and RCM datasets. Figure 10a presents the RMSE for each reanalysis and RCM dataset versus the three sets of gridded observations (ARC2, TRMM and GPCP), while Fig. 10b shows the Taylor diagram with respect to TRMM, which also has the largest R30mm spatial gradient across the Sahel region among the three observed datasets.

Fig. 9

Same as Fig. 5 but for the R30mm index (in number of days)

Fig. 10

Same as Fig. 6 but for the R30mm index

As for the R1mm and R10mm indices, some differences are observed between the three sets of observations (Fig. 9). The largest differences are between the TRMM and ARC2 datasets, the former presenting large values reaching up to 8 days in the South Sahel, while ARC2 has values not higher than 4 days over this region. Figure 9 also shows that CRCM5 and NCEP present similar R30mm indices, with larger amplitude over ocean, large values over Guinea coast and very small values over Sahel, while MERRA and ERA-Interim have overall very small R30mm values. The CanRCM4 simulations and the 20CR reanalysis present different patterns, with very small values on the Guinea coast and large values over ocean and South Sahel. Over South Sahel, the 0.22° CanRCM4 simulation is quite similar to the ARC2 values, while 20CR is closer to GPCP values. As a consequence, as shown in Fig. 10a, b, their metrics over Sahel are within the interval of uncertainty in observations. The very small values of R30mm over Sahel in CRCM5, MERRA and ERA-Interim induce smaller spatial correlation values than for the other datasets (Fig. 10b). In this respect, the CanRCM4 simulations provide higher correlation values and lower RMSE, quite comparable to the ARC2 values (i.e. with respect to TRMM).

The daily precipitation frequency distribution (south, central and north Sahel)

To complement the analysis of wet-day occurrence and of moderate and heavy daily rainfall, the precipitation frequency distribution within the wet season (JJAS) is shown in this section. In Fig. 11, the contribution to the rainfall amount per daily intensity over the June–September months during the 11-year period is shown in ten intervals (or bins) of intensity: 0–0.1; 1–2; 2–4; 4–8; 8–16; 16–32; 32–64; 64–128; 128–256; 256–512 mm/day. Here we consider each day during the JJAS seasons (in the 1998–2008 time window) and each grid point in three Sahel sub-regions (the Southern Sahel between 12°N and 14°N, the Central Sahel between 14°N and 17°N, and the Northern Sahel between 17°N and 20°N). The figure shows the precipitation distribution per season and per region (over Sudanese, Sahelian and sub-Saharan areas). Consequently, the sum of all bins gives the total spatial–temporal average precipitation. Because the Y-axis presents the total precipitation in a precipitation intensity bin (instead of the more classical representation of the percentage of precipitation per bin and its relative contribution to the total precipitation amount per season; e.g. Diaconescu and Laprise 2013; Martynov et al. 2013; Šeparović et al. 2013), the comparison of diagrams also indicates the bias magnitude or similarity for each precipitation intensity. Note that these distributions should actually be shown as histograms; however, in order to increase the clarity of the comparison between observed and reanalysis/RCM products, we chose to present them as curves. Different panels show these two groups of datasets (the reanalyses in the top panels and the RCMs in the bottom panels), with the three sets of observations (ARC2, TRMM and GPCP) shown in all panels and the grey area representing the range between the minimum and maximum values of those three. 
Figure 11 shows that the largest proportion of the precipitation over the Sahel region comes from events with intensity between 8 and 32 mm/day in ARC2 and in GPCP, and from higher intensities, between 16 and 64 mm/day, in TRMM. In general, the ARC2 and GPCP distributions are quite similar, especially for light and moderate precipitation (smaller than 16 mm/day) over southern and central Sahel and for high precipitation events (greater than 32 mm/day) over central and northern Sahel. In the intensity category between 16 and 32 mm/day, the GPCP distribution has a larger amplitude than the ARC2 distribution over southern Sahel, due to the higher R10mm index in GPCP than in ARC2 over this sub-region (see Fig. 7). The spread among observations increases for heavy precipitation events, especially over northern Sahel, with the TRMM distribution presenting a systematically higher occurrence of strong rainfall above 32 mm/day (as also suggested in Fig. 9). For the reanalysis datasets, the 20CR and NCEP distributions over Southern and Central Sahel run almost parallel, presenting maxima in the 8–32 mm/day bins, as in the ARC2 and GPCP datasets. However, their amplitudes differ from observations: NCEP underestimates the total amount of precipitation in the 8–32 mm/day bins, while 20CR overestimates the total amount of precipitation for light and moderate events. 20CR is shifted toward weaker precipitation over northern Sahel, while NCEP presents a very small occurrence of light and moderate rainfall (without any heavy precipitation event) because of the dry bias observed over this sub-region (see Figs. 5, 7, 9). Although MERRA and ERA-Interim have finer horizontal resolutions than the 20CR and NCEP reanalyses, they present distributions over Sahel shifted toward light-precipitation events, with maxima in the 4–8 mm/day bin over southern Sahel, and in the 2–8 mm/day bins over central Sahel. 
This result is in agreement with the results obtained for R1mm, R10mm and R30mm, which showed that MERRA and ERA-Interim in general overestimate the number of wet days (R1mm) over southern and central Sahel, but underestimate the number of moderate (R10mm) and heavy (R30mm) rainfalls over all Sahel. The ERA-Interim dry bias over the northern Sahel is reflected also in very small amounts of total precipitation for all the bins over this sub-region. This is not surprising, given the fact that precipitation in reanalyses is mainly model generated and therefore, highly related to forecast-model physical parameterizations.
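The amount-per-bin representation used in Fig. 11 can be sketched with a weighted histogram in Python. The bin edges below are an assumption made for this illustration: doubling intervals up to 512 mm/day, with everything below 1 mm/day collected in the lowest bin, which simplifies the first interval quoted in the text. Values above 512 mm/day would fall outside the bins. The names `EDGES` and `amount_per_bin` are hypothetical.

```python
import numpy as np

# Assumed bin edges in mm/day (simplified lowest bin, see the note above).
EDGES = np.array([0, 1, 2, 4, 8, 16, 32, 64, 128, 256, 512], dtype=float)

def amount_per_bin(daily_precip, edges=EDGES):
    """Mean precipitation (mm/day) contributed by each intensity bin.

    daily_precip holds every day and every grid point of a sub-region.
    """
    p = np.asarray(daily_precip, dtype=float).ravel()
    # Weighting each sample by its own amount sums precipitation per bin.
    totals, _ = np.histogram(p, bins=edges, weights=p)
    return totals / p.size   # normalise by all samples, dry ones included

sample = np.array([0.0, 0.5, 3.0, 3.0, 10.0, 40.0])
contribution = amount_per_bin(sample)
print(contribution)
print(np.isclose(contribution.sum(), sample.mean()))
```

Because each bin is normalised by the total number of samples rather than by the number of samples in the bin, the bins add up to the spatial–temporal mean precipitation, so a difference between two curves in a given bin reads directly as a bias in mm/day.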

Fig. 11

Spatial-temporal daily-precipitation frequency distribution per region (Southern, Central and Northern Sahel, see Fig. 1) and per eleven JJAS seasons (1998–2008) from (first row) the MERRA, 20CR, NCEP and ERA-Interim reanalyses and from (second row) the 0.22° CanRCM4, 0.44° CanRCM4 and CRCM5 simulations. The distributions corresponding to the three sets of observations (ARC2, TRMM and GPCP) are presented on all panels and the total area covered by them is shaded in grey. The Y-axis represents the spatial–temporal mean amount of daily rainfall per bin or per range of intensity (in mm/day), i.e. the mean of all rainfall fallen over each respective area over the 4 months (June–September) and the 11-year period. The sum of all bins gives the total spatial–temporal average precipitation

The CanRCM4 and CRCM5 simulations show quite different precipitation distributions. CRCM5 presents a distribution similar to ARC2 over southern Sahel, with only a smaller amplitude for moderate to high rainfall (i.e. 16–32 mm/day). In general, the occurrence of all high-intensity bins over central Sahel is underestimated in this model, and is quite negligible over northern Sahel, due to its narrower band of precipitation that generally does not go north of 15°N (see Figs. 5, 7, 9). The two CanRCM4 simulations have quite similar values for all categories of precipitation intensity, with two maxima instead of the single maximum suggested by the observed and other reanalysis/RCM products. For all three sub-regions, the simulated total amount of precipitation mostly comes from events with light precipitation (2–8 mm/day) or from extreme values (precipitation higher than 64 mm/day). In other words, CanRCM4 strongly overestimates the precipitation events with small amounts, underestimates events of moderate to high intensity (16–32 mm/day) over all sub-regions, and slightly overestimates very high values over southern and central Sahel. This is consistent with the results shown in Figs. 5, 7, with a higher number of wet days (R1mm) and a lower number of days with precipitation ≥10 mm/day than in observations over the major part of Sahel. A few differences between the frequency distributions of the 0.22° and 0.44° simulations are also revealed. The higher resolution increases the amount of precipitation coming from extreme precipitation over southern and central Sahel, and reduces that coming from smaller rainfall events (2–8 mm/day) over northern Sahel. However, the occurrence of moderate precipitation events (i.e. 16–32 mm/day) stays quite similar between the two CanRCM4 simulations, while the total amount from intensities of 8–16 mm/day is even smaller in the higher-resolution run. Therefore, the increased resolution in CanRCM4 does not lead to a major improvement in the simulated frequency distribution over Sahel.

The maximum length of wet spells (CWD)

To complement the evaluation of precipitation occurrence indices (i.e. R1mm, R10mm and R30mm), we analyze in this section the persistence or duration of wet spells, i.e. the CWD index, which represents the maximum number of consecutive wet days during the JJAS season.
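For a single grid point and a single season, the CWD index amounts to finding the longest run of wet days, as in the Python sketch below. The 1 mm/day wet-day threshold follows the R1mm definition; the function name `cwd` is hypothetical, and spells are assumed here to be counted only within the series passed in (i.e. truncated at the season boundaries).

```python
import numpy as np

def cwd(daily_precip, wet_threshold=1.0):
    """Longest run of consecutive wet days in a 1-D daily series."""
    longest = current = 0
    for wet in np.asarray(daily_precip, dtype=float) >= wet_threshold:
        current = current + 1 if wet else 0   # extend or reset the current spell
        longest = max(longest, current)
    return longest

# Two wet spells, of 2 and 3 days, separated by a day below 1 mm/day.
print(cwd([0.0, 2.0, 3.0, 0.5, 1.0, 1.0, 4.0, 0.0]))
```

The example makes the sensitivity of the index visible: replacing the single 0.5 mm/day value by a light rainfall of 1 mm/day would merge the two spells into one of 6 days, which is how an excess of light-precipitation days can inflate CWD.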

Figure 12 shows the climatological mean values of CWD (over the 1998–2008 period), while Fig. 13 presents the RMSE for this index (in number of days) over West Sahel between each reanalysis/RCM dataset and the three sets of gridded observations (ARC2, TRMM and GPCP). As for the R1mm, R10mm and R30mm indices, there are important differences between the three sets of observations. These differences generally concern the amplitude and extent of the maxima, which are more pronounced in GPCP than in TRMM and ARC2. Nevertheless, the three fields present more similar features over the West Sahel region (the red rectangle). In general, the CWD index is relatively weak over this region, with values between 4 and 8 days over most of South Sahel and values of no more than 4 days in Central and North Sahel. Similar values are found in the CRCM5 and NCEP datasets, which achieve low RMSE, with values within the range of the observational data (Fig. 13). A different picture emerges from the 20CR dataset, which has CWD values between 50 and 80 days at the southern border of the Sahel domain, and up to 20 days in Central Sahel. MERRA, ERA-Interim and the two CanRCM4 simulations overestimate the persistence of wet days over the major part of southern Sahel, and over the 12°N–15°N humid band. Note that this overestimation in the 20CR, MERRA, ERA-Interim, and 0.22° and 0.44° CanRCM4 datasets coincides with an excessive number of wet days (R1mm, see Fig. 5) and that, for the MERRA, ERA-Interim, and 0.22° and 0.44° CanRCM4 datasets, most of the total precipitation over Sahel comes from days with light and moderate precipitation (see Figs. 5, 7, 11). Therefore, the excess in the number of wet days with smaller intensity than in observations explains the larger number of consecutive rainy days. This does not result in any significant bias in the mean seasonal precipitation, i.e. the total amount is not significantly affected (see Fig. 3).

Fig. 12

Same as Fig. 5 but for the CWD index (in number of days)

Fig. 13

Same as Fig. 6a but for the CWD index (in number of days)

In summary, CRCM5 presents a fairly good daily-precipitation distribution, with a relatively well simulated number of wet days, and a CWD close to observations for a large part of the West Sahel region. However, this RCM underestimates the precipitation frequency, especially over central and northern Sahel, where a dry bias and a generally narrower seasonal band of precipitation compared to observations are revealed. The two CanRCM4 simulations are generally characterised by an excessive number of wet days and an excessive CWD over West Sahel. This is due to a general propensity to simulate a large number of days with light and moderate precipitation, most of the total precipitation over West Sahel resulting from such events and the rest from a relatively small number of days with extreme precipitation (larger than 60 mm/day); all these events are overestimated with respect to observed data.

West African monsoon onset and retreat indices

In this section we examine how observations, RCM simulations and reanalyses represent the timing of the mean monsoon onset/retreat over Sahel using two precipitation indicators: a large-scale index indicated on the Hovmöller diagrams and a local index as defined in Sect. 2.4.

Hovmöller diagrams and the large-scale onset index

Figure 14 shows the mean (1998–2008) intra-annual variability of daily precipitation (in mm/day) using Hovmöller diagrams (latitude vs. time, from 0°N to 24°N and for each day). The diagrams are computed from daily precipitation fields zonally averaged between 15°E and 15°W (the West Sahel region) and smoothed using a five-day running mean. Figure 14 also indicates, using vertical lines, the mean large-scale onset date (computed as the date when the mean precipitation in the 15°E–15°W; 8°N–16°N box becomes greater than the mean precipitation in the 15°E–15°W; 0°N–8°N box; see Sect. 2.4) for the RCMs and reanalyses (a vertical red line in the corresponding panel) and for the ARC2/TRMM/GPCP sets of observations (indicated by cyan/blue/green vertical lines and presented in all the panels in order to facilitate the comparison with the onset dates of the model simulations and reanalyses). The period corresponding to the mean large-scale onset pentad for each dataset is also given in the title of the corresponding Hovmöller diagram.
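The box-comparison idea behind the large-scale onset date can be sketched as follows in Python. The full definition is given in Sect. 2.4 and is not reproduced here, so this sketch makes its own assumptions: both box-mean series are smoothed with a centred five-day running mean, and the onset is taken as the first day on which the northern box exceeds the southern box, without any persistence criterion. The function names are hypothetical.

```python
import numpy as np

def running_mean(series, window=5):
    """Centred running mean; np.convolve zero-pads the series at both ends."""
    kernel = np.ones(window) / window
    return np.convolve(np.asarray(series, dtype=float), kernel, mode="same")

def large_scale_onset(north_box, south_box, window=5):
    """First 1-based day of year when the smoothed northern-box mean
    exceeds the smoothed southern-box mean, or None if it never does."""
    north = running_mean(north_box, window)
    south = running_mean(south_box, window)
    crossing = np.flatnonzero(north > south)
    return int(crossing[0]) + 1 if crossing.size else None

# Synthetic box means (mm/day): the northern box jumps from 2 to 8 on day 170.
days = np.arange(1, 366)
south = np.full(365, 6.0)
north = np.where(days >= 170, 8.0, 2.0)
print(large_scale_onset(north, south))
```

In this synthetic case the smoothing delays the detected crossing by one day relative to the jump, which illustrates that the diagnosed date depends on the smoothing window as well as on the data.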

Fig. 14

Hovmöller diagrams for mean daily precipitation (in mm/day), zonally averaged between 15°E and 15°W across the year (over the 1998–2008 period) using a 5-day running mean, for a ARC2, b TRMM, c GPCP, d 0.22° CanRCM4, e 0.44° CanRCM4, f CRCM5, g MERRA, h 20CR, i NCEP and j ERA-Interim datasets. The mean large-scale onset date (computed as the date when the mean precipitation in the 15°E–15°W; 8°N–16°N box turns greater than the mean precipitation in the 15°E–15°W; 0°N–8°N box) for RCMs and reanalyses is indicated by a vertical red line in the corresponding panel, while the ARC2/TRMM/GPCP mean large-scale onset is indicated by cyan/blue/green vertical lines presented in all the panels. The south border of the West Sahel region (i.e. 12°N) is highlighted by a horizontal black line

The Hovmöller diagrams show that the three sets of observations have similar features, with a first rainy season (March to June) over the Gulf of Guinea coastline (near 4°N), a sudden shift in the maximum of precipitation from 5°N to around 12°N in June/July, and a relatively smooth southward retreat of the band of precipitation from 12°N during September, corresponding to the second maximum of the rainy season near the Gulf of Guinea. However, some differences exist in the amplitudes of the maxima and/or in their positions during the year. In general, the ARC2 data present a smaller amplitude over 10°N than the TRMM and GPCP datasets, while the GPCP data display the largest maximum at 10°N and the smallest maximum over the Gulf of Guinea during the first part of the rainy season.

The two CanRCM4 simulations also display maxima of precipitation with an abrupt shift between the first and second maximum during the year. However, the band of precipitation from March to July extends too far north compared to the three sets of observations. While all three sets of observations display precipitation larger than 2 mm/day over the South of the Sahel (near 12°N) with a northward progression by the end of May, the CanRCM4 simulations show larger precipitation values over this area starting too early (i.e. by the end of March), with a more rapid northward migration than in the observed datasets. In spite of these differences, the simulated maxima of precipitation are in general well located in space and in time with respect to the three sets of observations. The CRCM5 simulation is characterised by a narrower band of precipitation without a clear abrupt shift in precipitation. The band of precipitation larger than 2 mm/day passes through the 12°N latitude by the end of April, but does not progress as far northward as in the observations.

NCEP reanalysis data also present a narrower band of precipitation, without an evident abrupt shift between the two maxima, which are positioned too far south with respect to observations. A more gradual or smoother progression from South to North also characterises MERRA and ERA-Interim precipitation compared to observations. The 20CR presents a first rainy season over the Guinea Gulf coastlines, a relatively rapid shift in the maximum of precipitation toward the North, and a smooth retreat toward the South. However, the intensities and the positions of the two maxima are different in this reanalysis compared to the three sets of observations, with a weaker and shorter first maximum (in amplitude and in time) over the 4–8°N band, and an excessive second maximum over the 12°N area.

In general, the average onset pentad, indicated by the vertical lines, captures well the mean period of the northern shift observed in the latitudinal distribution of precipitation for each dataset. The mean value over the eleven-year period corresponds to the onset pentad 34 (15–19 June) in the ARC2 case, 35 (20–24 June) in the GPCP case and 37 (30 June–4 July) in the TRMM case. Therefore, the total period covered by the onset date in observations is from 15 June to 4 July. This means that a large uncertainty (i.e. around 3 weeks) is associated with the large-scale onset date in observations. In Table 2, we present the large-scale onset for each year, the 1998–2008 mean pentad, and their inter-annual standard deviation (in number of pentads from 1st January). This Table shows that the spread of onset date between the sets of observations is not the same for each year. In some years (1999 and 2008) the three sets of observations present similar onsets, while for other years (1998, 2000 and 2001) the differences between individual observations might be equal to or even larger than five pentads (25 days).
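The pentad-to-date correspondence quoted above can be checked with a short calculation. The sketch below assumes that pentads are counted in consecutive five-day blocks from 1 January and uses a non-leap year (1999, chosen arbitrarily); the function name is illustrative only.

```python
from datetime import date, timedelta

def pentad_to_dates(pentad, year=1999):
    """Return (first_day, last_day) of a 1-based pentad counted from 1 January."""
    first = date(year, 1, 1) + timedelta(days=5 * (pentad - 1))
    return first, first + timedelta(days=4)

# Mean large-scale onset pentads quoted in the text
observed = {"ARC2": 34, "GPCP": 35, "TRMM": 37}
for name, pentad in observed.items():
    first, last = pentad_to_dates(pentad)
    print(name, pentad, first.strftime("%d %b"), "to", last.strftime("%d %b"))

# Total period covered by the three observational onsets (inclusive day count)
earliest = pentad_to_dates(min(observed.values()))[0]
latest = pentad_to_dates(max(observed.values()))[1]
spread_days = (latest - earliest).days + 1
```

Under these assumptions the observational spread is 20 days (15 June to 4 July), consistent with the uncertainty of around 3 weeks mentioned above.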

Table 2 Large-scale onset pentad for each year, their corresponding 1998–2008 mean and their inter-annual standard deviation (STD), in number of pentads from 1st January

Both 0.44° simulations have mean large-scale onset pentads situated within the observed interval, although specific years can be out of this interval (see red values in Table 2). Furthermore, all three reanalyses present large-scale onset dates later than in observations, while the 0.22° CanRCM4 simulation presents earlier onset dates.

The local onset/retreat analysis

The climatological (1998–2008) local onsets/retreats from all observed, reanalysis and RCM datasets are compared here over the Sahel, and presented in Figs. 15 and 16 (in Julian days) over the 12°N–20°N latitude and 15°W–15°E longitude bands. The regions in white correspond to dry regions without a real wet season or distinctive onset date. In the three sets of observations, the local onset (Fig. 15) appears to migrate from south to north, between 130 and 150 Julian days (10–30 May) over the 12–14°N band and between 150 and 170 Julian days (30 May–19 June) over the 14°N–16°N areas. This corresponds to the Hovmöller daily precipitation values >2 mm/day crossing the 12°N boundary in the second part of May in ARC2, and by the end of May in the TRMM and GPCP datasets, and reaching latitudes above 14°N during the third week of June in all observations (see Fig. 14). Over the Northeastern Sahel, a very small number of wet days (smaller than 10 days over the JJAS season, see Fig. 5) induces no real onset/retreat, as revealed by the three observed datasets. Overall, the three sets of observations present similar spatial distributions of local onset indices across the whole region, with quite similar mean values over the Sahel (172, 173 and 178 Julian days for ARC2, GPCP and TRMM, corresponding to 21 June, 22 June and 27 June, respectively). As for the large-scale onset, the TRMM local onset is later than in the ARC2 and GPCP datasets. The comparison between the observed large-scale mean onset and the mean local onset also shows that only the GPCP mean local onset (22 June) falls within its large-scale pentad (20–24 June). For ARC2, the mean local onset (21 June) falls slightly after the mean large-scale onset (15–19 June), while for TRMM, the mean local onset (27 June) falls before its mean large-scale onset (30 June–4 July). 
It must be mentioned that the Sahel mean for the local onset is computed only over the points where the local onset is defined, and does not consider the white points in the Northern Sahel shown in Fig. 15. On the other hand, the large-scale onset is computed considering all points in the two regions involved in the computation. The local onset in observations is characterised by a smaller interval of uncertainty than the large-scale onset, which is based on pentads. For the local onset, the largest differences are observed in the South Sahel, where the ARC2 dataset shows in general an earlier mean onset than TRMM and GPCP, in particular over the north of Burkina Faso and the north of Nigeria. Both CanRCM4 simulations display an earlier mean onset (i.e. end of March and beginning of April) than in observations, for a large part of the domain. As mentioned previously, the CanRCM4 simulations show the presence of daily precipitation values between 2 and 4 mm/day over the Sahel by the end of March, while the maximum is still over the Guinea Gulf Coast during that time (Fig. 14). However, the CanRCM4 local onset is closer to observations over West Senegal and East Mali (domain SW: 15°W–9.7°W; 12°N–14°N). The CRCM5 simulation displays a similar overall picture, with an earlier (20–40 days) onset over North Burkina Faso and North Nigeria and local onsets similar to observations over West Senegal and East Mali. However, because CRCM5 has a narrower seasonal band of precipitation than observed, which does not migrate toward the North, the northern region presents no real onset/retreat dates. As for the large-scale onset, the 20CR, NCEP and ERA-Interim reanalyses present in general later dates for the local onset over most of the domain, the northern dry region being larger in the NCEP and ERA-Interim cases and very small in the 20CR case. 
On the other hand, MERRA presents an overall similarity in onset appearance and structure to the ARC2 onsets over southern and central Sahel, and a northern dry area slightly larger than in GPCP.
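The comparison between the Sahel-mean local onset (given in Julian days) and the large-scale onset (given in pentads) can be reproduced with a simple conversion. The sketch below assumes a non-leap calendar and five-day pentads counted from 1 January; the helper names are illustrative and the input values are those quoted in the text.

```python
from datetime import date, timedelta

def day_of_year_to_date(doy, year=1999):
    """Calendar date of a 1-based day of year (non-leap year assumed)."""
    return date(year, 1, 1) + timedelta(days=doy - 1)

def day_of_year_to_pentad(doy):
    """1-based pentad containing a 1-based day of year."""
    return (doy - 1) // 5 + 1

local_onset_doy = {"ARC2": 172, "GPCP": 173, "TRMM": 178}   # Sahel-mean local onsets
large_scale_pentad = {"ARC2": 34, "GPCP": 35, "TRMM": 37}   # mean large-scale onsets

def local_vs_large_scale(name):
    """Position of the mean local onset relative to the large-scale onset pentad."""
    local = day_of_year_to_pentad(local_onset_doy[name])
    large = large_scale_pentad[name]
    if local == large:
        return "within"
    return "after" if local > large else "before"

for name in local_onset_doy:
    print(name, day_of_year_to_date(local_onset_doy[name]).strftime("%d %b"),
          local_vs_large_scale(name))
```

The mean local onset falls after the large-scale pentad for ARC2, within it for GPCP and before it for TRMM, as stated above.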

Fig. 15

Climatological (1998–2008) local onset index over Sahel, in Julian day, for a ARC2, b TRMM, c GPCP, d 0.22° CanRCM4, e 0.44° CanRCM4, f CRCM5, g MERRA, h 20CR, i NCEP and j ERA-Interim datasets. The regions in white correspond to dry regions without a real wet season or distinctive onset date

Fig. 16

Same as Fig. 15 but for the climatological local retreat index

For the climatological (1998–2008) local monsoon retreat (shown in Fig. 16), the three sets of observations display very similar values over most of the Sahel domain. The end of the rainy season progresses from the Northeast in the first part of September toward the Southwest by the end of October. Once again, the period covered by the local retreat corresponds well with the rainfall retreat shown in the Hovmöller diagrams (Fig. 14). The Sahel mean for the local retreat in TRMM corresponds to Julian day 263 (20 September), while ARC2 and GPCP present the same mean value, Julian day 266 (23 September). The reanalyses are in general characterised by later retreat indices, especially over the southern Sahel. The three RCM simulations have retreat indices that are closer to observations than their onset indices, for most of the domain. The averaged local retreat in CRCM5 corresponds to Julian day 263 (20 September), which is close to the TRMM value. For the CanRCM4 simulations, the greatest differences appear over West Senegal and East Mali, where the models have later retreat indices than the observations.

In conclusion, even if RCMs are closer to observations when the large-scale aspect of the monsoon climatological onset is considered, they show important differences for the local onset, an aspect of the WAM that remains a challenge for the RCMs. The local monsoon retreat in models is generally better represented than the onset. The reanalyses also present deficiencies in reproducing the onset index, with a general tendency for later local and large-scale onsets. Their poor performance for these indices is explained by the fact that precipitation in reanalyses is model generated; it therefore relies on the parameterisations of the reanalysis forecast models.

Summary and concluding remarks

The West African rainfall regime is a considerable challenge for climate models because of the complexity of the dynamical and physical processes that characterise the WAM area. This study focused on the evaluation of daily-precipitation statistics and monsoon onset/retreat as simulated by two Canadian RCMs, the CCCma CanRCM4 and the UQAM CRCM5, that took part in the AFRICA-CORDEX project. Three simulations were evaluated: two CanRCM4 simulations with grid meshes of 0.22° and 0.44° that are forced by a GCM constrained to ERA-Interim including its SSTs, and one CRCM5 simulation at 0.44° forced by ERA-Interim atmospheric and SST values. This study was carried out prior to the application of the RCM simulations in health and agriculture impact studies over the Sahel, within the FACE (Faire-face Aux Changements Ensemble) project. Consequently, the analysis focused on West Africa, especially the western Sahel. Three gridded daily-precipitation datasets, based on ground and satellite-derived observations, were used in the evaluation in order to give a measure of the uncertainty in the observational data: ARC2, TRMM and GPCP. We also used daily precipitation datasets from four reanalysis products: ERA-Interim, NCEP/R2, 20CR and MERRA.

The first part of the paper focused on the evaluation of the 11-year (1998–2008) seasonal mean precipitation, daily-precipitation indices (namely, R1mm, R10mm, R30mm and CWD indices), and also the daily-precipitation frequency distribution over three Sahel sub-regions. The second part of the paper analysed the Western Africa monsoon, using large-scale and local onset/retreat indices.
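For reference, the occurrence and duration indices named above can be computed from a daily precipitation series as in the following sketch. RXmm follows the definition given in the introduction (number of days with precipitation ≥X mm/day); CWD is taken here as the longest run of consecutive days with precipitation ≥1 mm/day, which is the usual ETCCDI definition and is assumed rather than restated in this section. The function names and the sample series are hypothetical.

```python
import numpy as np

def rxmm(pr, threshold):
    """Number of days with daily precipitation >= threshold (mm/day)."""
    return int(np.count_nonzero(np.asarray(pr, dtype=float) >= threshold))

def cwd(pr, wet_threshold=1.0):
    """Longest run of consecutive wet days (precipitation >= wet_threshold)."""
    longest = current = 0
    for wet in np.asarray(pr, dtype=float) >= wet_threshold:
        current = current + 1 if wet else 0
        longest = max(longest, current)
    return longest

# Hypothetical nine-day series in mm/day, for illustration only
daily_pr = [0.0, 2.0, 12.0, 35.0, 0.5, 1.0, 1.0, 0.0, 40.0]
indices = {
    "R1mm": rxmm(daily_pr, 1.0),
    "R10mm": rxmm(daily_pr, 10.0),
    "R30mm": rxmm(daily_pr, 30.0),
    "CWD": cwd(daily_pr),
}
```

A series with many wet days but few heavy days yields a high R1mm and CWD together with a low R10mm and R30mm, which is the signature discussed below for several models and reanalyses.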

The three sets of observations, ARC2, TRMM and GPCP, are the result of different methods used in the satellite estimations and in obtaining the gridded products. However, they also integrate many of the same satellite and rain-gauge measurements, making them not completely independent. For seasonal means, their comparative analysis showed that the sets of observations are similar, the uncertainty associated with them being overall small. On the other hand, important differences are obtained between their occurrence (R1mm, R10mm, R30mm) and duration (CWD) indices over Guinea and Cameroon, where GPCP has maxima covering much larger areas than TRMM and ARC2. However, the three observational datasets present more similar features over Sahel region, where, as for seasonal precipitation mean, the fields are characterised by a gradual decrease from southwest to northeast. They also present similar local onset/retreat monsoon indices, but in general different large-scale onset indices. Other important differences between observations, over Sahel region, are noted in their daily-precipitation frequency distributions: the TRMM distribution is shifted toward higher intensity compared to ARC2 and GPCP, presenting a smaller amount of precipitation from days with moderate precipitation and higher amounts from very heavy precipitation days. Because of these differences, the reanalyses and model evaluation cannot be done with respect to one specific set of observations, but with respect to an interval of values defining the observation uncertainty and obtained by comparing the three observational datasets.

For seasonal mean precipitation, the three RCM simulations and the four sets of reanalyses have similar skill in reproducing the main features of the WAM area. The main differences from observations consist in:

  • A dry bias over North Sahel in CRCM5, NCEP, MERRA and ERA-Interim datasets, associated with narrower bands of precipitation, and

  • A wet bias in the two CanRCM4 simulations and 20CR, associated with a Northern position for the band of seasonal precipitation compared to the three sets of observations.

For the daily-precipitation indices, the CRCM5 and NCEP datasets stand out from the other reanalyses and simulations because they have large-scale spatial patterns over Guinea and Cameroon similar to GPCP dataset. However, both of them are affected by a dry bias over the North Sahel that is extending further South in the R10mm and R30mm indices; for R1mm and CWD indices, CRCM5 and NCEP have quite similar RMSE values over Sahel within the range of values between the three sets of observations.

On the other hand, the two CanRCM4 simulations and the other three reanalyses (MERRA, ERA-Interim and 20CR) strongly overestimate the number of wet days (R1mm) and the CWD index over almost all of West Africa, and underestimate the R10mm index and, even more, the R30mm index over the whole Sahel; MERRA and ERA-Interim present the largest RMSE over the Sahel with respect to observations for the R10mm and R30mm indices. Their high R1mm and small R10mm and R30mm suggest that they produce too many days with light precipitation and not enough days with moderate and high precipitation. It must be mentioned that the precipitation field in reanalyses is model generated, and therefore dependent on forecast model parameterizations.

The analysis of daily-precipitation frequency has also highlighted that:

  • CRCM5 and NCEP have daily-precipitation frequency distribution centred on the 8–32 mm/day bins, similar to ARC2, but with smaller amplitudes, especially over central and Northern Sahel, caused by their dry bias associated with the narrower band of precipitation.

  • The 20CR has a spatial–temporal frequency distribution close to GPCP over Southern and Central Sahel, but with larger amplitude for the bins with intensity smaller than 32 mm/day, caused by its high number of light-precipitation days.

  • The MERRA and ERA-Interim frequency distribution is centered on light-precipitation events over all Sahel, highly underestimating precipitation with intensity greater than 16 mm/day and overestimating the light precipitation intensity.

  • Both CanRCM4 simulations overestimate the light precipitation intensity, with most of the daily precipitation coming from events with precipitation smaller than 16 mm/day, but also overestimate the total amount of precipitation from very high intensity events over Southern and Central Sahel.
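A binned intensity distribution of the kind discussed in the points above can be sketched as follows. The doubling bin edges (1, 2, 4, ..., 64 mm/day, with an open-ended last bin) are an assumption chosen to match the thresholds quoted in the text; the exact bins and normalisation used in the study are not given in this section, and the function name is illustrative.

```python
import numpy as np

# Assumed bin edges in mm/day; the last bin collects everything >= 64 mm/day
EDGES = np.array([1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0])

def intensity_distribution(pr, edges=EDGES):
    """Pool all grid points and days, then return per-bin
    (fraction of days, precipitation amount per day in mm/day)."""
    pr = np.asarray(pr, dtype=float).ravel()
    wet = pr[pr >= edges[0]]                 # dry days contribute to no bin
    idx = np.digitize(wet, edges) - 1        # 0 .. len(edges) - 1
    days = np.bincount(idx, minlength=edges.size)
    amount = np.bincount(idx, weights=wet, minlength=edges.size)
    return days / pr.size, amount / pr.size

# Hypothetical sample of pooled daily values (mm/day)
sample = [0.0, 0.5, 1.0, 3.0, 3.0, 10.0, 20.0, 70.0]
day_fraction, amount_per_day = intensity_distribution(sample)
```

Comparing `amount_per_day` between a model and each observational dataset shows, bin by bin, whether the simulated total comes from too many light events, too few heavy events, or both.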

The analysis of the large-scale monsoon onset showed that for both 0.44° RCM simulations the large-scale onsets are situated in the interval of uncertainty associated with observations; the reanalyses have later large-scale onset dates for almost all years.

We also developed a local onset/retreat index that focuses on the effective arrival of the seasonal rainfall at each grid point of the Sahel, which can be of great importance for local agriculture. We obtained very similar spatial distributions for the climatological local onset and retreat indices in the three sets of observations, which also correspond well with the latitudinal seasonal migration of precipitation observed in the Hovmöller diagrams. The Sahel-mean onset date varies from 21 to 27 June, while the retreat date varies from 20 to 23 September. For the models and reanalyses we found that:

  • The two CanRCM4 simulations are characterised by earlier local onset indices caused by light precipitation events that set in over the Sahel while the maximum of precipitation is still over the coast of the Guinea Gulf. However, they have overall good local retreat indices.

  • The CRCM5 simulation also has good local retreat indices and an earlier onset over the south Sahel; however, the region with no real onset extends further South compared to observations.

  • Among the reanalyses, MERRA has the closest onset to observations but later retreats, while the other reanalyses are characterised by later onset and retreat indices.

In conclusion, the two RCMs differ in simulating the daily-precipitation indices. In general, all daily-precipitation derived fields from CRCM5 are affected by a dry bias over Northern Sahel. On the other hand, CanRCM4 has a general tendency to produce a large number of days with light precipitation (smaller than 8 mm/day) along with a small number of days with extreme precipitation (larger than 64 mm/day) over Southern Sahel. This tendency is observed in both CanRCM4 simulations, the higher resolution in general exacerbating the bias in the amount of extreme precipitation and its occurrence. The models use the same dynamical core, the main differences lying in their physical schemes. This indicates that a large part of the differences in their simulated precipitation indices is due to model physics. It is important to note that these two general biases also affect the reanalyses, despite the fact that the reanalyses assimilate meteorological fields such as surface pressure (20CR, ERA-Interim, NCEP and MERRA), temperature and wind (ERA-Interim, NCEP and MERRA): the dry bias affects NCEP, MERRA and ERA-Interim, while MERRA, ERA-Interim and, to a lesser extent, 20CR produce too many light-precipitation events. This confirms that precipitation over the Sahel is strongly related to model physics. This is in line with the results obtained by Sylla et al. (2011), who analysed how deep convection affects the representation of the West African monsoon location and strength by comparing two RegCM3 RCM simulations with the deep convection scheme activated and turned off, respectively.

In our case, we further analyzed the CanRCM4 tendency to produce a high amount of light precipitation by comparing, in Fig. 17, the frequency distribution of convective-only precipitation with that of total precipitation over the West Sahel, from the 0.44° CanRCM4 simulation. As in Fig. 11, the distributions consider all the days in the eleven (1998–2008) JJAS seasons, but here they cover the whole West Sahel. The total precipitation is presented in magenta, while the convective precipitation is shown in indigo. Figure 17 shows that most of the precipitation is in the form of convective precipitation with intensities between 2 and 8 mm/day. The secondary maximum comes from extreme precipitation (larger than 64 mm/day) produced by the resolved-precipitation scheme. The analyses of the R1mm and CWD indices have shown that precipitation in CanRCM4 has a general tendency to be too temporally uniform, resulting in a significant lack of intermittency, while R30mm showed that the number of days with high intensity was very small. Hence, the bias is related to excessive light precipitation events coming from convective precipitation, while the rest of the precipitation is provided by a relatively small number of days with very high intensity produced by the large-scale precipitation scheme. This behavior of too weak and temporally uniform convective precipitation is shared with the parent global model CanAM4. Because convective precipitation is dominant over most of West Africa, the increase in the model horizontal resolution (from 0.44° to 0.22°) did not significantly modify the properties of total precipitation simulated over the Sahel, as shown in Fig. 11. The increase in horizontal resolution brought an increase in high-intensity resolved precipitation, presumably because more scales of motion were resolved. 
However, the 0.22° simulation continued to produce a large number of days with light and moderate intensities of convective precipitation. This overestimation in the number of small-intensity events affects the mean precipitation values and their accuracy and/or physical representativeness, and causes local onsets earlier than in observations. The simulated precipitation regime is thus affected by a bias not only in the intensity but also in the frequency of occurrence for each intensity category. One should be cautious about any bias correction applied to the mean precipitation, and about its usefulness for climate change studies, because the bias varies across the simulated quantiles of daily precipitation, depending on the occurrence of light, moderate and heavy precipitation intensities. As demonstrated in the present study, a detailed analysis of the physical nature of precipitation events in terms of intensity, duration, occurrence and frequency is needed to clearly evaluate or interpret the physical meaning of the mean simulated precipitation values. It also raises the question of whether the present-day bias will persist in future climate change simulations, and whether this bias is stationary, i.e. whether it will affect the climate change signal over this western Africa area.
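The caution about correcting only the mean can be illustrated with a deliberately simple, hypothetical example: two daily series with identical means but very different distributions. This is not the evaluation procedure used in the study; the function name and the five-day series are invented for illustration.

```python
import numpy as np

def quantile_bias(sim, obs, probs=(0.5, 0.9, 0.99)):
    """Simulated minus observed quantiles of daily precipitation (mm/day)."""
    sim_q = np.quantile(np.asarray(sim, dtype=float), probs)
    obs_q = np.quantile(np.asarray(obs, dtype=float), probs)
    return sim_q - obs_q

# Hypothetical series: intermittent heavy rain vs. uniform light rain
obs = np.array([0.0, 0.0, 0.0, 0.0, 20.0])
sim = np.array([4.0, 4.0, 4.0, 4.0, 4.0])

mean_bias = sim.mean() - obs.mean()
bias_by_quantile = quantile_bias(sim, obs, probs=(0.5, 1.0))
```

Here the mean bias is zero, yet the median is overestimated by 4 mm/day and the maximum is underestimated by 16 mm/day, so a correction based on the mean alone would leave both the excess of light rain and the deficit of heavy rain untouched.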

Fig. 17

Spatial-temporal daily-precipitation frequency distribution over West Sahel region (15°W–15°E; 12°N–20°N) and 11-year JJAS season for the total precipitation (i.e. PR, magenta line) and the convective precipitation (i.e. PRC, indigo line) in 0.44° CanRCM4 simulation

It is important to note that the CanRCM4 physics are identical to those used in the parent GCM (CanAM4) and that other models are also characterised by positive biases in the occurrence frequency of light-intensity precipitation (e.g. Gutowski et al. 2003; Iorio et al. 2004; DeAngelis et al. 2013), as are some reanalyses (e.g. MERRA, ERA-Interim and 20CR). Hence, more exhaustive studies are needed to improve the physical behaviour of the convective parameterization schemes so that improved performance is realized in the tropics. Ultimately, the goal is to provide credible estimates of the change in these important precipitation indices over the African Sahel that occurs as a consequence of climate change. A critical but still unresolved issue is the extent to which present-day biases in any of the indices impact the predicted response of those indices to climate-change forcings. This is a notoriously difficult issue to address and is the subject of future investigation. On the other hand, the different kinds of bias obtained for the two Canadian RCMs reinforce the idea that more than one model must be considered in impact studies, in order to also give a measure of their uncertainty. Consequently, further work will involve defining the response of these important indices over the Sahel to future climate change from CORDEX future scenario experiments, from both CanRCM4 and CRCM5, and from the ensemble of CORDEX RCMs. This work will be reported in a future study.