1 Introduction

Burkina Faso, like the rest of West Africa, has been subject to a continuous rainfall deficit since the beginning of the 1970s (Landsberg 1975; Dai et al. 2004). An analysis of the rainfall data from 1896 to 2006 in West Africa shows that the mean annual rainfall amount during the last four decades (1970–2009) remained lower than the mean annual rainfall recorded during the period 1900–1970 (Mahé and Paturel 2009). This continuous rainfall deficit is detrimental to the socio-economic situation because the population’s main activities, agriculture and livestock, depend strongly on the amount of rain that falls during the rainy season. Every rainfall deficit is synonymous with a drop in crop yields and a food shortage. Indeed, two extreme events demonstrate the fragility of the Sahelian natural resources (surface water, groundwater, and ecosystems): the droughts of 1972–1973 and 1984–1985 (Landsberg 1975; Herceg et al. 2007), which caused loss of human life and decimated livestock herds in the semiarid zone. One of the characteristics of this rainfall deficit is a decrease in the number of rainy days (Le Barbé and Lebel 1997). These authors showed that the rainfall deficit is due much more to a decrease in rainfall frequency than to a decrease in the rainfall amount per event. Another study (Sivakumar 1988), carried out over the Niamey area (Niger) on predicting the potential of the rainy season, demonstrated the importance of the onset date and the season length. It showed that the season length depends strongly on the date of its onset: the earlier (later) the season begins, the longer (shorter) it will be. Also, the longer the season, the more likely it is to have a large number of rainfall events and a larger total rainfall. The results of these two studies show that the rainfall deficit can be related to a shortening of the rainy season and/or a low number of rainfall events.
The results quoted in this paragraph show that the rainfall amount recorded during a season, and its potential to satisfy the needs of society, depend on several factors that need to be quantified and should be predicted by models.

Climate models are essential tools for understanding climatic processes and their evolution at a global scale (Hulme et al. 2001; Rockel et al. 2008; Vanvyve et al. 2008; Rodríguez-Fonseca et al. 2011; Ruti et al. 2011). One of the first applications of global models was carried out over Africa with the experiments of Charney et al. (1977). It has since been established, however, that global models lack the spatial resolution to properly resolve the mesoscale processes (such as the life cycle of convective systems) that are essential for controlling the variability in this region (Sylla et al. 2009). Regional climate models (RCMs) have been developed in many institutes in order to overcome these problems. Their higher resolution allows them to represent in more detail local features relevant to climate, such as orography, vegetation distribution or land use.

Several RCMs have already been tested over West Africa (Vanvyve et al. 2008; Sylla et al. 2010; Paeth et al. 2011) for different purposes. Most of these studies focused on their skill in representing the annual or seasonal cycle of critical meteorological variables (rainfall, temperature, humidity, cloudiness, etc.). Sylla et al. (2009) assessed the ability of the ICTP (International Center for Theoretical Physics) regional climate model RegCM3 to reproduce the seasonal temperature and precipitation cycle over West Africa during the period 1981–2000 with two sets of boundary conditions: reanalysis data and ECHAM5 output. They found that, on average, the first run underestimated the rainfall amount during the rainy season while the second run overestimated it, even though both runs produced an annual cycle of rainfall close to the observed one. Paeth et al. (2011), in a review of recent dynamical downscaling exercises over West Africa, found that RCMs are subject to systematic rainfall biases over the region. Nevertheless, it is of great interest for users of RCM output to have a detailed evaluation of the main characteristics of the rainy seasons in these models.

The main focus of this study is to evaluate the performance of an ensemble of regional climate models through the dominant rainy season characteristics derived from daily rainfall data recorded and simulated over a typical Sahelian area. The observed rainfall data come from a network of ten well-distributed synoptic stations over Burkina Faso for the period 1990–2004. The simulated data are produced by five regional climate models (CCLM, HadRM3P, RACMO, RCA and REMO). The climate models were run, in the context of the collaboration between the ENSEMBLES and AMMA European projects, under the SRES A1B scenario over the period 1960–2050 with GCM boundary conditions. In a second set of simulations the RCMs were driven by the ERA-Interim reanalysis (Dee and Uppala 2008) over the period 1989–2005. We first present in more detail the data used and the methods applied to describe the sub-seasonal characteristics of the rainy season. We then evaluate the ability of the RCMs to reproduce the observed properties of rainfall in this Sahelian region.

2 Data

Observed rainfall data were obtained through the effort made by AMMA (African Monsoon Multidisciplinary Analysis) to ensure data exchange between operational services and the research community. The present study focuses on daily rainfall data recorded by the national meteorological service of Burkina Faso at a network of ten synoptic stations (Bobo Dioulasso, Bogandé, Boromo, Dédougou, Dori, Fada N’Gourma, Gaoua, Ouagadougou, Ouahigouya, and Po) for the period 1961–2004. These stations are homogeneously distributed over the country (Fig. 1). The data sets are complete for nine stations; the only gap is the 1978 season at Bogandé.

Fig. 1

Synoptic stations with the co-located RCM grid boxes. The map represents the three climatic zones and the ten stations with the surrounding RCM mesh. The climate zones are derived from the annual rainfall average over 1961–1990 from CRU rainfall data

Burkina Faso is a landlocked country which covers an area of about 274,200 km². The country is subdivided into three main climate zones (Fig. 1): north-Sudanese in the south (annual rainfall between 900 and 1,200 mm), sub-Sahelian in the middle (annual rainfall between 600 and 900 mm) and Sahelian in the north (annual rainfall between 400 and 600 mm).

The second type of data comes from the simulations of five regional climate models performed in the framework of the EU FP6 ENSEMBLES project (http://ensemblesrt3.dmi.dk/). The RCMs used here are listed in Table 1. The boundary conditions are taken from the ERA-Interim re-analysis (1989–2005) and from two global climate models (GCMs; 1960–2050). The GCMs were run under the SRES A1B scenario (Nakicenovic et al. 2000), which assumes a balanced increase in greenhouse gas (GHG) concentrations. We consider here only the current-climate part of these runs. The two GCMs used as boundary conditions are HadCM3Q0, a version of the Hadley Centre’s third-generation coupled ocean–atmosphere general circulation model (Wilson et al. 2010), and ECHAM5 (Roeckner et al. 2003), the fifth-generation atmospheric general circulation model of the MPI (Max Planck Institute, Germany). ECHAM5-r1 and ECHAM5-r3 differ only in their initial conditions, which are based on stabilization runs.

Table 1 List of the five RCMs forced by ERA-interim and a GCM

The resolution of the RCMs is about 50 × 50 km² and the same grid is used by all five models. The model domain is much larger than our study area and extends from 35°W to 30°E and from 20°S to 35°N. As the periods covered by the different data sets are not the same, we focus our study on the overlap period 1990–2004.

3 Methods

This study provides a detailed description of the rainy seasons in Burkina Faso from the data sets discussed above, in order to better identify their characteristics and to determine the ability of the models to reproduce them correctly. The main rainy season characteristics analyzed are related to season duration (onset and end of season), rainfall intensity (daily rainfall average, annual number of rainfall events, extreme daily rainfall intensities), and dry spells (frequency and duration).

The first step of the analysis is to identify the rainy seasons at a given station based on its daily rainfall data. In West Africa the rainy season is identified with different methods depending on the objectives of the study and the location. Two classes of methods are usually used in the Sahelian region: the agronomic method and the hydrological method (Sivakumar 1988; Balme et al. 2005). The agronomic method defines the start of the rainy season as the first date after 1 April with a 3-day cumulative rainfall amount higher than 20 mm that is not followed by a dry spell of more than 7 days. In this method the end of the rainy season is marked by the last rainfall higher than 5 mm/day after 1 September that is not followed by any rainfall higher than 5 mm/day during the following twenty days. For the hydrological method, the rainy season begins with the first rainfall higher than 5 mm/day (the runoff triggering threshold) and ends with the last rainfall higher than 5 mm/day. The limitation of these two methods is that they are empirical and rely on assumptions about the behavior of land surface conditions or crops.
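The two empirical definitions can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation used in the cited studies: the function names, the 0-based day index (day 90 is 1 April in a non-leap year), the 0.1 mm/day dry-day limit and the 30-day look-ahead window for the dry-spell check are all hypothetical choices, and only the onset is coded for the agronomic method.

```python
def hydrological_season(rain, threshold=5.0):
    """Return (onset, end): first and last day index with rain > threshold."""
    wet = [i for i, r in enumerate(rain) if r > threshold]
    return (wet[0], wet[-1]) if wet else None


def longest_dry_spell(rain, wet_day=0.1):
    """Length of the longest run of days with rain below wet_day."""
    longest = current = 0
    for r in rain:
        current = current + 1 if r < wet_day else 0
        longest = max(longest, current)
    return longest


def agronomic_onset(rain, first_day=90, total=20.0, window=3,
                    max_dry=7, lookahead=30):
    """First day >= first_day opening a `window`-day total above `total` mm
    that is not followed by a dry spell longer than `max_dry` days.
    The `lookahead` search window is an assumption, not given in the text."""
    for d in range(first_day, len(rain) - window + 1):
        if sum(rain[d:d + window]) > total:
            after = rain[d + window:d + window + lookahead]
            if longest_dry_spell(after) <= max_dry:
                return d
    return None


# Synthetic year: one isolated early shower, one false start, then a season.
rain = [0.0] * 365
rain[70] = 6.0                 # isolated event in mid-March
rain[100] = 25.0               # heavy rain followed by a long dry spell
for day in range(120, 251, 2):
    rain[day] = 12.0           # established season, rain every other day

season = hydrological_season(rain)
onset = agronomic_onset(rain)
```

On this synthetic series the hydrological onset is triggered by the isolated shower on day 70, whereas the agronomic onset waits for the established season on day 120, which mirrors the sensitivity of the hydrological criterion to isolated events.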

As this study deals with simulated rainfall data, which could have systematic biases (Lebel et al. 2000; Frei et al. 2003; Déqué 2007; Jacob et al. 2007), a new method which does not include any assumption on locally valid rainfall thresholds or on any specific application is needed. The proposed approach will be called the statistical method and it is valid for a rainfall regime of one rainy season within the year. The criteria used for this method are only based on the statistical properties of the daily rainfall time series for a given station or RCM grid point. The criteria are formulated as follows:

  • The season onset is determined after 5% of the total annual rainfall amount is reached and the end of the season is determined after 95% of the annual total rainfall amount has fallen;

  • The date of the season onset corresponds to the date of the rainfall higher than the average of annual first rainfall events over the entire period. In addition, to be considered, the rainfall event must not be followed by a dry spell longer than the median of the mean dry spell durations at the station or grid point;

  • The end of season is marked by a rainfall event occurring after or completing the 95% of the annual rainfall amount and followed by a dry spell longer than the median dry spell duration at the station or grid point.
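The first criterion can be illustrated with a short Python sketch that locates the days on which the cumulative rainfall reaches 5% and 95% of the annual total. The function name and the 0-based day index are assumptions made for this example. The refinements of the second and third criteria (the comparison with the average first rainfall event and the dry-spell conditions) are deliberately left out, so this is not the full statistical method.

```python
from itertools import accumulate


def cumulative_fraction_days(rain, low=0.05, high=0.95):
    """Return the day indices at which the cumulative rainfall first
    reaches `low` and `high` fractions of the annual total."""
    total = sum(rain)
    if total <= 0:
        return None
    cumulative = list(accumulate(rain))
    day_low = next(i for i, c in enumerate(cumulative) if c >= low * total)
    day_high = next(i for i, c in enumerate(cumulative) if c >= high * total)
    return day_low, day_high


# Small synthetic series: a weak early shower, a 20-day wet period,
# and a weak late shower.
rain = [0] * 5 + [2] + [0] * 4 + [10] * 20 + [0] * 5 + [2]
bounds = cumulative_fraction_days(rain)
```

Here the weak showers at the start and at the end of the series each carry less than 5% of the total, so the two bounds bracket only the main wet period (days 10 to 29).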

Secondly, a rainy day is defined by a threshold of 0.1 mm/day, which is the minimum intensity of the observations. From this low threshold, six rainfall classes are defined for the analysis of daily rainfall amounts: very low (0.1–5 mm/day), low (5–10 mm/day), moderate (10–20 mm/day), strong (20–50 mm/day), very strong (50–100 mm/day) and extreme (>100 mm/day).
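As an illustration, the six classes can be encoded as follows. The treatment of the class boundaries (lower bound inclusive, upper bound exclusive) is an assumption, since the text does not specify it, and the names `classify` and `class_shares` are hypothetical.

```python
import math

# Class bounds in mm/day. Lower bound inclusive, upper bound exclusive:
# this boundary convention is an assumption, not stated in the text.
RAIN_CLASSES = [
    ("very low", 0.1, 5.0),
    ("low", 5.0, 10.0),
    ("moderate", 10.0, 20.0),
    ("strong", 20.0, 50.0),
    ("very strong", 50.0, 100.0),
    ("extreme", 100.0, math.inf),
]


def classify(amount):
    """Return the class name of a daily amount, or None for a dry day."""
    for name, lower, upper in RAIN_CLASSES:
        if lower <= amount < upper:
            return name
    return None


def class_shares(rain):
    """Per class: (share of rainy days, share of total rainfall)."""
    wet = [r for r in rain if classify(r) is not None]
    if not wet:
        return {}
    n_wet, total = len(wet), sum(wet)
    shares = {}
    for name, _, _ in RAIN_CLASSES:
        members = [r for r in wet if classify(r) == name]
        shares[name] = (len(members) / n_wet, sum(members) / total)
    return shares


shares = class_shares([2, 2, 30, 0, 16])
```

In this toy series the two very low events make up half of the rainy days but only 8% of the total, while the single strong event contributes 60% of the total.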

The ability of RCMs to reproduce the observed characteristics of rainfall time series is assessed with correlation analysis of the inter-annual variability and statistical tests for average and variance.

The assessment of the difference between the observations and the simulations is based on the difference in their averages and in the inter-annual variance of the time series. Nonparametric procedures, which do not require any condition on the data distribution, are used to assess the significance (Wasserman 2006):

  • The nonparametric Wilcoxon rank sum test (Ansari and Bradley 1960) is used to assess the bias between two series. For two given samples, the differences between the paired data are calculated and ranked in ascending order of their absolute values. With \( W^{+} \) the sum of the ranks of the positive differences and \( W^{-} \) the sum of the ranks of the negative differences, \( W^{+} + W^{-} = N(N + 1)/2 \), where N is the number of non-zero differences. If N > 25, as in this case with 30 values, the distribution of \( W^{+} \) or \( W^{-} \) can be approximated by the normal distribution N(μ; σ) with

    $$ \mu = \frac{N(N + 1)}{4}\quad{\text{and}}\quad\sigma = \sqrt {\frac{N(N + 1)(2N + 1)}{24}} . $$

    The test variable is \( u = \frac{w - \mu }{\sigma } \) with \( w = \min (W^{+}, W^{-}) \). At the significance level α = 5%, the critical value \( u_{\alpha} = 1.96 \) is taken from the normal distribution N(0, 1). The null hypothesis (no significant difference between the two time series) is therefore rejected if \( \left| u \right| \) is greater than \( u_{\alpha} \). As suggested by Willmott and Matsuura (2005), the mean absolute error is used to quantify the magnitude of the gap between the observed and simulated data;

  • The nonparametric median-centering Fligner-Killeen test for homogeneity of variances (Fligner and Killeen 1976; Conover et al. 1981) is used at the significance level α = 5% to assess the differences between the variances of the two data sets. For the correlation, the Pearson test is used to assess the significance of the correlation between the data (Millot 2009).
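The ranking procedure and its normal approximation can be sketched as follows. The procedure as described (ranking the absolute paired differences and summing the ranks by sign) corresponds to what is commonly called the Wilcoxon signed-rank test. The function names are hypothetical, tied absolute differences receive average ranks, and no tie correction is applied to σ, which is a simplification of this sketch.

```python
import math


def signed_rank_u(x, y):
    """Return (W+, W-, u) for paired samples x and y, using the normal
    approximation mu = N(N+1)/4, sigma = sqrt(N(N+1)(2N+1)/24)."""
    diffs = [a - b for a, b in zip(x, y) if a != b]  # drop zero differences
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # 1-based average rank for ties
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    mu = n * (n + 1) / 4
    sigma = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    u = (min(w_plus, w_minus) - mu) / sigma
    return w_plus, w_minus, u


def mean_absolute_error(sim, obs):
    """Mean absolute difference between simulated and observed values."""
    return sum(abs(s - o) for s, o in zip(sim, obs)) / len(obs)


# 30 paired values with a systematic positive offset: the null is rejected.
obs = [0.0] * 30
biased = [float(i) for i in range(1, 31)]
w_plus, w_minus, u_biased = signed_rank_u(biased, obs)

# 30 paired values whose differences alternate in sign: no significant bias.
mixed = [float(i) if i % 2 else -float(i) for i in range(1, 31)]
_, _, u_mixed = signed_rank_u(mixed, obs)
```

With N = 30 the two rank sums always add up to 465, and the null hypothesis is rejected at the 5% level only when |u| exceeds 1.96.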

Furthermore, the Taylor diagram (Taylor 2001) which displays on one plot the correlation coefficient and the relative standard deviation (ratio between the simulated and observed standard deviations) is used to assess the inter-annual variability of the simulations.
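The two coordinates of a point in the Taylor diagram can be computed with a few lines of Python. This sketch uses population standard deviations and a hypothetical function name; it also returns the normalized centered RMS difference, which follows from the other two quantities through the relation given by Taylor (2001), \( E'^2 = \sigma_s^2 + \sigma_o^2 - 2\sigma_s \sigma_o R \).

```python
import math


def taylor_stats(sim, obs):
    """Return (correlation, relative standard deviation, normalized
    centered RMS difference) of a simulated series against observations."""
    n = len(obs)
    mean_s, mean_o = sum(sim) / n, sum(obs) / n
    std_s = math.sqrt(sum((s - mean_s) ** 2 for s in sim) / n)
    std_o = math.sqrt(sum((o - mean_o) ** 2 for o in obs) / n)
    cov = sum((s - mean_s) * (o - mean_o) for s, o in zip(sim, obs)) / n
    r = cov / (std_s * std_o)
    ratio = std_s / std_o
    # Clamp at zero to guard against tiny negative rounding errors.
    rms = math.sqrt(max(0.0, 1 + ratio ** 2 - 2 * ratio * r))
    return r, ratio, rms


# A simulation with the right phase but twice the observed amplitude.
r, ratio, rms = taylor_stats([2, 4, 6, 8, 10], [1, 2, 3, 4, 5])
```

A perfect simulation would sit at the reference point (correlation 1, relative standard deviation 1); the example above has a correlation of 1 but a relative standard deviation of 2.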

The procedures listed above are applied at each station, but the analysis does not emphasize the inter-station disparities and most reported results are averages over the 10 stations or the corresponding 10 grid boxes. As similar errors were found over the 10 stations, the averages reported are representative of the whole country. In addition, a comparison between the CRU (New et al. 2000) and IRD (Paturel et al. 2010) spatial rainfall data over Burkina Faso and the ten synoptic stations was conducted and did not show any meaningful differences. The three annual rainfall averages (CRU, IRD, and stations) are very similar. We conclude that the ten synoptic stations capture the rainfall characteristics over the whole country well. All evaluations are performed on the ERA-Interim driven as well as the GCM driven simulations.

4 Results

4.1 Rainy season characteristics in Burkina Faso

The dates of the season onset and the end of the season are discussed first as they are key parameters for defining the rainy season period in the region. For this analysis these dates are expressed in days since 1 January of the given year.

The statistical method for characterizing rainy season periods has been verified through a comparison with the agronomic and hydrological methods (results not shown here). It was found that the hydrological method produces earlier start dates and is very sensitive to isolated intense rainfall events. In contrast, the agronomic method is very demanding on the rainfall amounts as it ignores rainfall sequences between 15 and 20 mm/day separated by 4–7 dry days. To illustrate this difference we take the case of the 1978 rainy season at Ouahigouya, as it displays the largest difference between the three methods. While the hydrological method gives a season onset on 14 March, the agronomic method gives a starting date of 18 June and the statistical method places the season onset on 24 April. The large difference between the first two criteria is observed frequently at the ten stations. It was also noted that the inter-annual variance of the season onset is higher with the hydrological and agronomic methods than with the statistical method, indicating better stability for the methodology proposed here.
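To make the size of these gaps concrete, the three 1978 onset dates at Ouahigouya can be converted to day-of-year values. Counting 1 January as day 1 is an assumed convention (the text only says that dates are measured in days since 1 January), and the helper name is hypothetical.

```python
from datetime import date


def day_of_year(d):
    """Day number within the year, with 1 January counted as day 1
    (an assumed convention)."""
    return d.timetuple().tm_yday


# Onset dates quoted in the text for Ouahigouya, 1978.
onsets_1978 = {
    "hydrological": date(1978, 3, 14),
    "agronomic": date(1978, 6, 18),
    "statistical": date(1978, 4, 24),
}
onset_days = {method: day_of_year(d) for method, d in onsets_1978.items()}
spread = max(onset_days.values()) - min(onset_days.values())
```

The hydrological and agronomic onsets are 96 days apart in this case, with the statistical onset falling 41 days after the hydrological one.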

4.2 Rainy season period characteristics

The rainy season in Burkina Faso is governed by the West African monsoon flux with a northward intrusion in March and a southward retreat in September (Sultan and Janicot 2000; Ramel et al. 2006).

In the same way as the monsoon flux, the rainy season onset in Burkina Faso migrates northward and takes more than 40 days to cross the country, from the beginning of April on average at Gaoua (the southernmost station) to the first ten days of June on average at Dori (the northernmost station) (Figs. 2, 3). In contrast, the same figures show that the southward migration takes around 20 days to cover the same north–south distance, from mid-September on average at Dori to the beginning of October on average at Gaoua. Thus, the installation of the rainy season is about two times slower than its withdrawal. This result is in agreement with the movement of the ITF (Inter-Tropical Front) over West Africa: Lélé and Lamb (2010) found that the ITF is almost twice as fast in its southward retreat as in its northward advance. All modeled season onset and end dates fall around the observed period at each station. The models thus reproduce the general migration of the season onset and end, but most of them have an early onset and a delayed end (Figs. 2, 3) when compared to observations. HadRM3P has the earliest season onset and the latest end of season, in contrast to CCLM, which has late season onsets and early ends of season. Altogether, the models generally produce rainy seasons that are too long. Figures 2 and 3 show that the five models keep the same deficiencies in season onset and end of season with the two driving data sets.

Fig. 2

Season onset and end of season at Gaoua from 1990 to 2004. The whisker boxes represent the season onset and end of season dates from the observations and the models. The season onset boxes are at the bottom and the end of the season boxes are at the top. The boxes represent the full time series with the minimum (the bottom dash), the first quartile (25%), the median, the third quartile (75%) and the maximum (the top dash). The vertical lines separate the different sets of data, first column for the observations, the second column for the GCM driving data and the third column for the ERA driving data. Gaoua is the southwest station of the synoptic network stations

Fig. 3

Season onset and end of season at Dori from 1990 to 2004. The boxes present the same statistics as in Fig. 2, for Dori. Dori is the northwest synoptic network station

The Wilcoxon test, applied at the 5% level at each station (results not shown), shows that HadRM3P, RACMO, RCA and CCLM differ significantly from the observed season onset dates for both sets of driving data. We observe a negative bias (early dates) for the first two models and a positive bias (late dates) for CCLM. REMO does not present any significant difference with observations for the GCM driven run (Figs. 2, 3). The same test applied to the end of season reveals a significant delay for HadRM3P and RACMO for both sets of driving data. The other models do not present any significant differences with observations at more than seven stations.

The second aspect to be analyzed for these two parameters (season onset and end of season) is their inter-annual variance. It was found from observations that the season onset has a high inter-annual variance (standard deviation of 16 days) in comparison to the end of season (standard deviation of 10 days). For the simulations these values are on average 21 and 11 days for the season onset and the end of the season, respectively (Figs. 2, 3). From the same figures, we can observe that the difference between models is larger for the season onset than for its end.

As the models behave similarly at the majority of stations for the two driving data sets, and in order to facilitate the discussion, we consider in the rest of this section the average over all stations when comparing the characteristics of the simulated rainy season with observations. Figure 4, which represents the season duration, shows that three models (HadRM3P, RACMO, and RCA) produce long rainy seasons, in contrast to CCLM, which produces a short rainy season. Using the re-analysis to drive these RCMs tends to prolong the rainy season and thus aggravates the deficiency for most models.

Fig. 4

Season durations in Burkina Faso from the five models and observations from 1990 to 2004. The boxes represent the season duration averaged over the ten stations for each model and each driving data set

The correlation of the inter-annual variability of these three parameters (season onset, end of season, season duration) shows a significant anti-correlation (coefficient less than −0.7) between the season duration and the season onset at the ten stations. The correlation between the season duration and the end of season, which has a weak inter-annual variance, is not significant. These relations between the three parameters were also found in the simulated data. The season duration is more closely related to the season onset than to the end of season. Thus the differences in season duration (Fig. 4) between the models and between the driving data sets can essentially be attributed to deficiencies in the simulated season onset.

Despite these differences, the inter-annual variability is assessed to verify whether the more realistic large-scale forcing of ERA-Interim (Sylla et al. 2010) can lead the models to reproduce the observed inter-annual variability of the season onset and the end of season.

In Fig. 5, the Taylor diagram of the season onset presents the correlation and standard deviation between the 15-year time series (1990–2004) of observed and modeled season onset dates at the 10 stations. The diagram shows low correlations (lower than 0.5) between the simulated and the observed onset dates. The models miss the inter-annual variability of the season onset with both lateral boundary conditions. The relative standard deviation is close to 1 (between 0.5 and 1.25), indicating a good amplitude of the inter-annual variability. This is confirmed by the Fligner test for variance homogeneity, which shows no case of significant difference at the 5% level for the two sets of simulations (ERA driving data and GCM driving data), even if the models tend to underestimate the variance (80% of the points are between the curves 0.5 and 1). For the second parameter (the end of the season), the two sets of simulations (Fig. 6) do not present any significant correlation with observations either (coefficients less than 0.4). The two clouds of points for the GCM and ERA driven simulations are both distant from the reference point, but the relative standard deviation remains close to 1. The Fligner test for variance shows no significant differences at the 5% level. For the end of season date, the barycenters of the two sets of simulations are well separated and the ERA driven simulations clearly show a more positive correlation. Forcing the RCMs with the re-analysis seems to increase the correlation of the inter-annual variability of the end of season dates with observations, but it is not sufficient to produce in these runs a realistic year-to-year variation of season length and intensity. The simulated season duration for the two driving data sets, which results from the two parameters discussed above, does not present any significant correlation with the observed season duration either.

Fig. 5

Taylor diagram of the rainy season onset at the ten stations for the five models. Each point of the diagram represents a grid box co-located with a station. The coordinates are the correlation coefficient (between the RCM data and the observations) and the relative standard deviation of the RCM data (ratio between the simulated data standard deviation and the observed data standard deviation). The arcs represent the relative standard deviation and the lines the correlation coefficients. The two points represent the barycenters, blue point for ERA driving data and red point for GCM driving data

Fig. 6

Taylor diagram of the end of the rainy season at the ten stations for the five models. Same as Fig. 5

In this part, we have shown that the regional climate models do not produce a satisfactory inter-annual variability of the season onset and end dates. The result is not improved when the models are driven by the ERA re-analysis, except perhaps for the season’s end dates. One may wonder whether this result is linked to the high spatial variability of rainfall in the region and to the fact that only 10 stations are used in this assessment. Studies over the square-degree area of Niamey (Niger) using a high-density rain gauge network have shown that spatial gradients of annual rainfall of up to 275 mm over 10 km can be found (Lebel et al. 1997). In order to verify the influence of the network used on the results, we performed the same analysis on annual mean rainfall averaged over Burkina Faso using the IRD (Paturel et al. 2010) data sets, which include more than 100 stations over the country. In this case as well, the correlation between the observed and modeled inter-annual variability is low (below 0.4 for all models). This means that the results found with ten stations are robust.

In the following sections we will investigate other important features related to the rainfall amount.

4.3 Rainfall intensity and number of rain days

Several studies (Barron et al. 2003; Graef and Haigis 2001; Vischel and Lebel 2007) have demonstrated that the annual agricultural production or the annual quantity of water in the streams of a basin depends more on the frequency of rainfall events and their average intensity than on the annual/seasonal mean rainfall. Regular (one event per week) moderate rainfall events (10–20 mm/day) are more beneficial than irregular (spaced by more than 3 weeks) strong rainfall events (>50 mm/day). Thus the efficiency of the rainy season, in terms of agricultural productivity for instance, depends more on the intensity and distribution in time of rainfall events than on the total amount of water provided to the surface.

Figure 7, which displays the annual rainfall amount averaged over all stations (the vertical line represents the average of the observations), shows a systematic overestimation of the annual rainfall amount (right shift) for HadRM3P and REMO for both forcing data sets. The other models are closer to observations and their biases are more dependent on the driving data. RCA driven by a GCM and CCLM driven by ERA tend to underestimate the annual rainfall amounts at most stations. For three models, CCLM, HadRM3P and RCA, the bias is larger in the ERA driven runs than when the large-scale forcing is taken from the GCM. In addition, the Wilcoxon test performed at the 5% level shows that only the HadRM3P annual rainfall amount, for both driving data sets, differs significantly from the observations at all ten stations. CCLM driven by GCM and RACMO driven by ERA have no significant difference at more than seven stations. The other simulations in general differ significantly from observations at most stations. For the annual rainfall amount, the impact of a change in the large-scale forcing is not as systematic as was found for the parameters analyzed previously.

Fig. 7

Annual rainfall amount averages distribution in Burkina Faso from 1990 to 2004. The points represent the annual rainfall amounts average over the ten stations sorted and plotted for each model. The vertical dash represents the average of the time series data and the vertical line is the average of the observation data

An analysis of the annual rainfall amount using the Taylor diagram shows that the RCMs miss the inter-annual variability of annual rainfall for both driving data sets. For the 15-year time series, the correlation coefficients of the inter-annual variability between the observations and the simulations are lower than 0.6 over all simulations and stations. But the models present good variance homogeneity with the observations at the ten stations.

The comparison of the annual number of rainfall events (Fig. 8) shows a systematic and significant (Wilcoxon test at the 5% level) overestimation for the five models at the ten stations. Figure 8 shows that HadRM3P and RACMO produce more than twice the observed number of rainfall days. The ERA-driven runs present for all analyzed RCMs higher rainfall frequencies than the GCM-driven runs.

Fig. 8

Mean annual number of rainy days (0.1 mm/day). The whisker boxes represent the statistics of the average number over all stations of the seasonal rainy days from the observations (OBS) and the five RCMs

Here also, the RCMs miss the inter-annual variability of annual number of rain days with correlation coefficients less than 0.6 over all stations.

The distribution of these two characteristics across the different rainfall classes (defined in Sect. 3) allows a better description of the quality of the RCMs throughout the rainfall intensity spectrum.

The observed distribution of the annual number of rainfall days and of the annual rainfall amount across the six rainfall classes from 1990 to 2004 is presented in Fig. 9 as averages over the ten stations. The inter-station variation of the distribution is less than 4 percentage points for all classes, as indicated by the error bars in Fig. 9. The largest contribution to the annual rainfall amount comes from the strong rainfall class, with more than 48%, but this class represents only 20% of the annual number of rainfall events. In contrast, the very low class, which represents around 40% of the annual rainfall events, contributes less than 7% to the annual totals. We can point out here that the magnitude of the “very low” class is not related to the rainfall threshold of 0.1 mm/day: a sensitivity assessment with 0.5 and 1 mm/day produced very similar results. The third class (moderate rainfall events) contributes at the same level to the annual number and to the annual amount. The extreme class represents less than 1% of the two sums.

Fig. 9

Proportion of each rainfall class in total rainfall and total number of rainy days. The inter-stations standard deviation is the spatial standard deviation within the ten stations

As shown previously (Amani et al. 1996; Stroosnijder 1996), the total rainfall distribution into different classes is different from the one for the annual number of rainfall events and demonstrates the importance of individual strong rainfall events.

First, the simulated cumulative fraction of total annual rainfall contributed by each class of event intensity is analyzed. Figure 10 shows that for all models, except CCLM driven by the GCM, the cumulative rainfall weight distribution is higher than observed for thresholds below 20 mm/day. For CCLM, RACMO and REMO, the ERA driven runs produce many more low-intensity events than the GCM driven runs. RACMO driven by ERA has 90% of its total rainfall falling in events of less than 20 mm/day, whereas in the observational data only 40% of total rainfall is generated in this class. On the other hand, 30% of the total rainfall in the CCLM model driven by GCM comes from events producing 20 mm/day or less.
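The cumulative weight discussed here can be sketched as the fraction of the total rainfall that falls in events below a given intensity. The function name, the strict "less than" comparison and the set of thresholds (taken from the class limits of Sect. 3) are assumptions of this example rather than the exact procedure behind Fig. 10.

```python
def cumulative_rain_weight(rain, thresholds=(5, 10, 20, 50, 100),
                           wet_day=0.1):
    """For each threshold, return the fraction of the total rainfall
    contributed by rainy days with an amount strictly below it."""
    wet = [r for r in rain if r >= wet_day]
    total = sum(wet)
    if total == 0:
        return {}
    return {t: sum(r for r in wet if r < t) / total for t in thresholds}


# Toy series with a total of 100 mm, so that fractions read as percentages.
weights = cumulative_rain_weight([1, 4, 15, 30, 50, 0])
```

In this toy series only 20% of the total comes from events below 20 mm/day, so the curve rises late, as it does for the observations; a model dominated by drizzle would instead reach a high cumulative weight at low thresholds.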

Fig. 10

Average weight of the total rainfall events at different intensities over the annual rainfall amount in Burkina Faso (continuous line = GCM driven simulations and dashed line = ERA driven simulations). The curves represent the cumulative weight of the total rainfall over the rainfall event intensities. These distributions are averaged over the ten stations (the inter-station standard deviation is less than 5 percentage points)

For the strong rainfall class (>20 mm/day), the cumulative weights for three RCMs (CCLM, RCA and REMO) are lower than the observed cumulative weights. This is because these models produce high extreme rainfalls which have a considerable weight in the annual totals. For these three RCMs, events of intensity lower than 50 mm/day contribute less than 75% of the annual amount. So, the rainfall events higher than 50 mm/day, which represent less than 2% of the models’ annual number of rainfall events (2.5% for the observations), contribute more than 25% (13% for the observations) of the annual rainfall.

In most cases ERA driven simulations produce more weak events than the GCM driven runs, as illustrated by the average shift of 5% in Fig. 10. The exceptions are HadRM3P, where the application of the re-analysis at the lateral boundaries does not change the distribution of the intensity of rainfall events, and RCA, where events tend to weaken.

For the second distribution we examine the cumulative number of rain events in the season at different intensities (Fig. 11). For instance, this figure shows that 40% of days in the season have recorded rainfall events with an intensity of less than 50 mm/day. In contrast, in the models HadRM3P and RACMO more than 80% of the days in the season have rain of less than 20 mm/day. Indeed, the five models overestimate the annual number of rainfall events (rainfall higher than 0.1 mm/day). HadRM3P and RACMO produced more than twice the observed annual number of events, even though their simulated seasons are longer than observed.

Fig. 11

Average proportion of the number of rainfall events over season duration in Burkina Faso (continuous line = GCM driven runs and dashed line = ERA driven runs). The curves represent the cumulative weight of the number of daily rainfall events at different intensities over the season duration. These proportions are averaged over the ten stations (the inter-station standard deviation is less than 5 percentage points)

Rainfall events lower than 20 mm/day represent more than 90% of the number of days in the season for the RCMs against 75% for the observations. Very low rainfall events (<5 mm/day) dominate in the RCMs, with a weight ranging from 50% for HadRM3P to 70% for CCLM against 7% for the observations. For the five models, rainfall events lower than 50 mm/day represent more than 95% of the days in the season. Hence, the models produce too many rainfall events of low intensity. The situation is aggravated when the models are forced by the re-analysis, as more rainfall events are produced; the only exception is HadRM3P.

With regard to season duration, rainfall events higher than 50 mm/day have a similar frequency in the models and the observations, but their weight in the annual totals presents significant differences. It can be noted in Fig. 12c that the observed average annual maximum rainfall intensity over the ten stations is lower than that for CCLM and REMO for the two driving data sets. RACMO driven by ECHAM5 also overestimates the maximum daily rainfall over all the stations, in contrast to RACMO driven by ERA, which underestimates it. Only the HadRM3P model produces maximum daily rainfall close to observations. For daily average rainfall intensity (Fig. 12a), the five models (for both driving data sets) are lower than the observations, pointing again to the dominance of weak events in the models. The 95th percentiles of rainfall intensity (Fig. 12b) are also underestimated by the models, indicating that a small number of unrealistically extreme rainfall events explains the result found for the annual maximum rainfall events.
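The three intensity indices compared in Fig. 12 can be sketched as follows. This is only an illustration under stated assumptions: the indices are computed over rainfall events (days above 0.1 mm) of one series, whereas the text does not specify whether the percentile is taken over rainy days or over all days. The function name and the sample values are hypothetical.

```python
import numpy as np

def intensity_indices(daily_rain, wet_day_min=0.1):
    """Mean, 95th percentile and maximum of daily rainfall over rainfall events.

    Restricting the statistics to days above wet_day_min is an assumption
    made for this sketch.
    """
    rain = np.asarray(daily_rain, dtype=float)
    wet = rain[rain > wet_day_min]
    if wet.size == 0:
        return {"mean": 0.0, "p95": 0.0, "max": 0.0}
    return {
        "mean": float(wet.mean()),
        "p95": float(np.percentile(wet, 95)),  # linear interpolation by default
        "max": float(wet.max()),
    }

# Hypothetical daily series (mm/day), for illustration only
indices = intensity_indices([0.0, 2.0, 8.0, 0.0, 30.0, 60.0])
print(indices)
```

A series with many weak events and a few very large ones can show a low mean and a low 95th percentile together with a high maximum, which is the combination described for several models above.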

Fig. 12

Daily rainfall intensities in Burkina from 1990 to 2004. Each point represents the annual average of the daily rainfall over the ten stations

Altogether, the three rainfall intensity features (the annual average, the distribution at different intensities and the extreme events) derived from the RCM data show significant differences with the observations. Here also, ERA driven runs present the highest deviation from the observations. We will now assess how the rainy days are distributed within the seasons.

4.4 Frequency and duration of dry spells

The rainy season contains short periods of consecutive dry days called dry spells. Their frequency and duration in the sahelian area depend on the large scale synoptic variability of the monsoon (Janicot et al. 2011). To define these dry spells, rainfall thresholds need to be given in order to avoid interrupting the sequence with events that produce too little rainfall to be significant for agriculture or water resources (Barron et al. 2003; Modarres 2010). Sivakumar (1992) showed from a study of dry spells with five rainfall thresholds (1, 5, 10, 20, 25 mm/day) that the dry spell length and frequency at a given station depend on the rainfall threshold: the number of dry spells of less than 5 days decreases with increasing rainfall threshold while the number of dry spells of more than 15 days increases. The author concluded that drought risks in West Africa are strongly related to mean annual rainfall amount and dry spell frequency. For increasing annual rainfall, frequencies of dry spells of less than 5 days increased and frequencies of dry spells of more than 15 days decreased. The increase of the short dry spells and the decrease of the long dry spells come from an increase of the rainfall frequency, as rainfall events are then separated by only a few dry days. Lebel et al. (1997) noted in the observations from a dense rain gauge network in Niger that while the 1991 and 1992 annual rainfall amounts were similar, the timing of rainfall was very different between the two years. During 1992 the rainy season produced more dry spells (>5 days), leading to reduced millet crop yields in some areas and very poor development of the grass layer.

From the observed daily rainfall timing, the average length of dry spells at each station is about 3 days with a rainfall threshold of 0.1 mm/day (minimum rainfall) and 5 days with a rainfall threshold of 5 mm/day (imbibition rainfall and mean daily potential evapotranspiration in Burkina Faso). The duration of 5 days is considered as the limit of the first dry spell class. Following the previous study (Sivakumar 1992), the dry spell lengths are subdivided into three classes: short (<5 days), average (5–10 days) and long (>10 days).
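The dry spell diagnostics used in this section can be sketched as follows, assuming a daily series already restricted to the rainy season and treating any day below the chosen rainfall threshold as dry. The function names and the sample season are hypothetical; the class limits follow the three classes given above, with 5 and 10 days both placed in the average class.

```python
def dry_spell_lengths(daily_rain, threshold):
    """Lengths of runs of consecutive days with rainfall below the threshold."""
    lengths = []
    run = 0
    for amount in daily_rain:
        if amount < threshold:
            run += 1
        else:
            if run > 0:
                lengths.append(run)
            run = 0
    if run > 0:
        # A spell still open at the end of the series is also counted.
        lengths.append(run)
    return lengths

def classify_dry_spells(lengths):
    """Count spells in the short (<5), average (5-10) and long (>10 days) classes."""
    counts = {"short": 0, "average": 0, "long": 0}
    for n in lengths:
        if n < 5:
            counts["short"] += 1
        elif n <= 10:
            counts["average"] += 1
        else:
            counts["long"] += 1
    return counts

# Hypothetical rainy season (mm/day), for illustration only
season = [6.0, 0.0, 0.0, 12.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 7.0, 3.0, 9.0]
spells = dry_spell_lengths(season, threshold=5.0)
classes = classify_dry_spells(spells)
longest = max(spells) if spells else 0
dry_fraction = sum(spells) / len(season)
print(spells, classes, longest, dry_fraction)
```

With the 5 mm/day threshold the 3 mm day counts as dry, so it forms a one-day spell. Lowering the threshold to 0.1 mm/day would turn it into a rainy day, which illustrates why the threshold choice matters for these statistics.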

Based on the above discussion of the systematic biases in simulated rainfall intensities, notably the high frequency of very low rainfall amounts, the selection of rainfall thresholds for defining dry spells in the RCM simulations requires some attention. In order to find a minimal rainfall intensity which makes the diagnostic less dependent on model biases, a relative rainfall threshold is defined. This value is taken at the rainfall intensity where the cumulative weight of the annual rainfall amount reaches 5% (Fig. 10). This approach can be justified by the fact that 95% of the sahelian annual rainfall is provided by Mesoscale Convective Systems (MCS), which generally produce larger rainfall intensities (Laurent et al. 1998). The threshold values can be read in Fig. 10: 4 mm for the observations, 2.5 mm for CCLM-GCM, 1 mm for RACMO-GCM, 0.5 mm for RACMO-ERA and 1.5 mm for the other models.
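One possible way to obtain such a relative threshold from a daily series, rather than reading it off a figure, is sketched below. It assumes that events are days with more than 0.1 mm and returns the smallest event intensity at which the cumulative share of total rainfall reaches 5%. The function name and the sample values are hypothetical, and no interpolation between event intensities is attempted.

```python
import numpy as np

def relative_threshold(daily_rain, weight=0.05, wet_day_min=0.1):
    """Smallest event intensity at which the cumulative rainfall weight reaches `weight`."""
    rain = np.asarray(daily_rain, dtype=float)
    events = np.sort(rain[rain > wet_day_min])
    if events.size == 0:
        raise ValueError("no rainfall events in the series")
    cumulative = np.cumsum(events)
    cum_weight = cumulative / cumulative[-1]  # last value is exactly 1.0
    # First index whose cumulative weight is at least `weight`
    idx = int(np.searchsorted(cum_weight, weight))
    return float(events[idx])

# Hypothetical daily series (mm/day), for illustration only
threshold = relative_threshold([0.0, 1.0, 2.0, 3.0, 4.0, 10.0, 20.0, 60.0])
print(threshold)
```

Because the threshold is tied to each data set's own intensity distribution, a model with many very weak events gets a lower threshold than the observations, in line with the values listed above.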

Hence, for a detailed description of the timing of dry spells, the following analysis focuses on three characteristics: the number of consecutive dry days, the number of dry spells in different classes, and the season’s longest dry spell.

Figure 13 shows that the dry days account for 55% of the season duration in the observed time series. But the models have too few dry days in the season, each one with its respective threshold, and only CCLM reaches values close to 50%. The ERA driven simulations, despite their longer rainy seasons, have fewer dry days than the runs driven by GCM data, with the exception again of HadRM3P.

Fig. 13

Average fraction of dry days in the rainy season in Burkina Faso from 1990 to 2004. Number of dry days (at the corresponding rainfall threshold of the data) as a fraction of the rainy season duration. The fraction represents the frequency of dry days within the rainy season. The whiskers provide the inter-stations standard deviation

Another consequence of the too frequent rainfall produced by the RCMs is the shortening of the average dry spell length. As was found for the fraction of dry days, the average duration of the longest seasonal dry spell of CCLM driven by ECHAM5 is close to observations (Fig. 14). The other models present significantly shorter maximum dry spells. RACMO driven by ERA data has the shortest maximum dry spell length, which is consistent with its low number of dry days in the season.

Fig. 14

Season longest dry spell length in Burkina Faso from 1990 to 2004. Each point represents the average over the ten stations of the seasonal longest dry spell of the dataset. The whiskers provide the inter-stations standard deviation

The dry spells are distributed into the three classes according to their duration (Fig. 15). This shows that short dry spells are the most frequent (more than 70%) in both the observations and the simulations, but the models tend to overestimate this feature. The second (5–10 days) and third (more than 10 days) classes of dry spells are less frequent during the rainy season, and the models represent this rapid decrease of occurrence. Altogether, the CCLM model driven by GCM data best reproduces the dry spell characteristics, probably because its cumulative rainfall distribution is quite realistic for low intensity events (<30 mm/day, see Fig. 10).

Fig. 15

Dry spell classes weight in the total dry spells in Burkina Faso from 1990 to 2004. The bars represent the dry spell classes weight (number of dry spells of the class) over the total number of dry spells. The whiskers provide the inter-stations standard deviation

5 Summary and discussion

This analysis has investigated three main rainy season components: season duration, rainfall intensity and frequency, and dry spell length, each of which is described by several parameters or characteristics.

Table 2 sums up these parameters from the observations and the five models in the three climate zones (sahelian, sub-Sahelian, and sub-Sahelian) of Burkina Faso. The table shows that the models reproduce the North–South gradients of the different parameters between the three climatic zones but underestimate the speed of the northward propagation of the rainy season and overestimate the contrast in terms of number of rainy days. The North–South difference in the number of rain events is 20 in the observations while it is 34 or 36 days for the RCMs, around higher average values.

Table 2 Average and inter-model standard deviation of some of the rainy season characteristics in the three climatic zones

We have found that the main common deficiency in the five models for both driving data sets is the large number of low intensity rain events (lower than 5 mm/day), which is twice as high as the observed number. The high frequency of low rainfall values in the models entails fewer dry days, even with the relative rainfall thresholds set at 5% of the cumulative distribution of rainfall intensity. The models generate fewer dry days and shorter dry spells than observed.

In these diagnostics as well as those presented above (disparities between models), it is clear that systematic biases of the regional models dominate (Paeth et al. 2011). In other words, the deficiencies found are characteristic of the models even if they can be aggravated by the data used to force the model at the boundaries of the domain. Nevertheless, it is remarkable that these deficiencies are affected by the driving data and that the RCMs behave better when the large scale fields of GCMs are used. One can speculate that the difference in the number of perturbations fed into the domain by the two sets of large scale fields plays a role here. It can also be hypothesized that the different balance of thermodynamic conditions in the two data sets may have an impact on the development of the perturbations which generate rainfall during the monsoon season.

The humidity fed into the domain by the large scale forcing certainly plays an important role in the deficiencies of the simulated rainfall distributions, but the relation is far from trivial. The ERA-Interim forcing provides realistic precipitable water content, as could be verified with independent data (Bock et al. 2011). On the other hand, ECHAM has a too moist atmosphere (John and Soden 2007) and feeds about 10% too much water into the domain (predominantly from the south during the rainy seasons), as measured at the borders of the domain. Still, the distribution of rainfall intensities is worse when the three models otherwise forced by ECHAM (CCLM, RACMO and REMO) are driven by ERA. It has to be noted here that RACMO uses in this version (Meijgaard et al. 2008) the same physics package as the ECMWF model with which the ERA-Interim re-analysis was performed (Cycle31r2). Clearly the link in the models between the background moisture and the rain generating processes needs to be better understood.

The diagnostic of the simulated inter-annual variability of the rainy season’s characteristics was disappointing. Even when the models were forced by the more realistic large scale forcing provided by the re-analysis, the year-to-year fluctuations were not well reproduced. This seems to indicate that the internal dynamics generated by the models within their domains have more weight on the rainfall generating processes than the tele-connections which are well documented for this region (Janicot et al. 2011; Rodríguez-Fonseca et al. 2011). Sylla et al. (2010) found in their analysis of the RegCM3 simulations a better representation of the inter-annual variability of rainfall in West Africa. But it has to be pointed out that their analysis covered a larger area of West Africa and only seasonal rainfall averages were used for the inter-annual variability validation. Thus our result could be due to our choice of diagnostic variables and models.

6 Conclusion

This assessment of the skill of regional climate models over a sahelian area of West Africa revealed the importance of looking at the details of the rainy season and how it is represented by models. An analysis based only on the annual or monthly rainfall amounts would hide large parts of the models’ capabilities or weaknesses. It is particularly important for this region to look at the frequency of rain events and the distribution of their intensities. The five RCMs presented, which used different large scale forcing data sets, displayed an overestimation of the frequency of very low rainfall events (between 0.1 and 5 mm/day) and an underestimation of the mean daily rainfall amounts. Despite the long duration of the rainy season in the RCMs, the high rain event frequency leads to shorter dry spells than those observed. Dry spell length is an important parameter for applications and quite telling for the quality of the representation of the physical processes which govern rainfall generation (Lafore et al. 2011).

The influence of the driving data on the climatology of RCMs is well known (Frei et al. 2003; Jacob et al. 2007), but it was unexpected that using atmospheric re-analysis (ERA-Interim in our case) would lead to worse results than driving the models with GCM outputs. This raises the question of the role of the lateral boundary conditions in RCM set-ups over tropical continental areas where land surface processes play an important role (Taylor et al. 2011).

RCMs are an important tool for studying the impacts of climate change or fluctuations because of their high resolution. In West Africa their outputs are particularly relevant for water resources, food production and public health studies. It is therefore disappointing that the RCMs show such large biases for parameters of the rainy season that are essential to these applications. Processes such as infiltration or desiccation of crops cannot be realistically represented if rainfall events are too weak or are not separated by long enough dry spells. It is thus essential to bias-correct the simulated precipitation in order to reduce the impact of these biases on the application models. This also calls for a major effort to improve the representation in RCMs of the atmospheric processes governing rainfall generation in the tropics.