1 Introduction

In recent decades, Africa has been affected by significant precipitation variability, including severe droughts in the Horn of Africa and West Africa during the 1970s and 1980s, a multi-year drought in South Africa’s winter rainfall region, and severe floods in countries with arid climates such as Algeria, Tunisia, Egypt and Somalia (Niang et al. 2014; Burls et al. 2019). Future increases in radiative forcing may not only further increase the variability of precipitation, but also lead to permanent shifts in the regional climate characteristics. Such climatic changes are potentially detrimental to already vulnerable natural and human systems across Africa. It is thus imperative to understand how global climate change may impact regions across Africa, particularly those with inherently low adaptive capacity, for informed planning and decision making to cope with the environmental challenges in the coming decades.

A multitude of different numerical model types and experiments have been developed and used to generate climate information for Africa over the last two decades. The production of climate change projections has been primarily carried out by means of Global Climate Models (GCMs), especially within the framework of the Coupled Model Intercomparison Project Phase 5 (CMIP5; Taylor et al. 2012) and Phase 6 (CMIP6; Eyring et al. 2016). Amongst the many works based on CMIP5 results (see Niang et al. 2014), Diffenbaugh and Giorgi (2012) identified areas of the Sahel, tropical West Africa and southern Africa as hotspots of regional climate change, while several studies including e.g., Monerie et al. (2017), Ongoma et al. (2018), Diedhiou et al. (2018) and Giannini et al. (2018) analyzed multiple aspects of future precipitation characteristics over subregions across Africa.

More recently, CMIP6 simulations have been investigated in terms of mean precipitation change at continental scale (Almazroui et al. 2020) and for southern Africa (Sian et al. 2021), and in terms of extreme events over East Africa (Ayugi et al. 2021a). In addition, Moon and Ha (2020) and Ukkola et al. (2020) investigated future characteristics of (global) monsoon precipitation and meteorological droughts, respectively.

However, due to their coarse horizontal spatial resolution (usually on the order of a hundred km or more), GCMs are often inadequate in the representation of fine-scale regional processes, particularly those influenced by complex topography, land use heterogeneity, coastlines and mesoscale convection.

In the framework of the COordinated Regional-climate Downscaling EXperiment (CORDEX, Giorgi and Gutowski 2015), Regional Climate Models (RCMs) have been used to dynamically downscale the results of CMIP5 GCMs to generate high-resolution (0.44°, around 50 km) historical and future climate projections over Africa. CORDEX future climate projections for Africa have been analyzed in several studies (Laprise et al. 2013; Haensler et al. 2013; Saeed et al. 2013; Mariotti et al. 2014; Dosio and Panitz 2016; Pinto et al. 2016; Diallo et al. 2016; Akinsanola and Zhou 2019; Endris et al. 2019; Dosio et al. 2019; Bichet et al. 2020; Tamoffo et al. 2021; Mengistu et al. 2021; among others). Although CORDEX produced a relatively large ensemble (> 20) of multi GCM-RCM simulations, most of these studies are based on the results of a single RCM downscaling an ensemble of GCMs, or on a small ensemble of RCMs downscaling a small ensemble of GCMs. In addition, although a few of these studies compared the downscaled simulations with those of the driving GCMs, none compared the results with the full range of the CMIP5 ensemble, or with newer products such as CMIP6.

Recently, a new initiative within CORDEX, the Coordinated Output for Regional Evaluations (CORDEX-CORE, hereafter simply CORE), was launched with the aim of producing climate projections in a more homogeneous framework, where all participating RCMs were required to downscale the same set of driving GCMs (in contrast to CORDEX, where the choice of the GCMs was left to the individual RCM modelling groups). In addition, to make the CORE results more suitable for application in impact studies, the horizontal resolution was set twice as high as that of CORDEX (0.22°, around 25 km). Unfortunately, the exceptionally large computational and data archival resources required to carry out simulations following the CORE protocol resulted in only a very small number of RCM modelling groups (three) participating in the initiative. In addition, studies analyzing CORE projections to date only made use of one or two of the participating RCMs: Gnitou et al. (2021) investigated the performance of two CORE RCMs in simulating seasonal precipitation over Africa, Teichmann et al. (2020) analyzed the mean climate change signal at global scale from CORE and the CMIP5 GCMs, and Coppola et al. (2021) provided an analysis of projected impact-relevant indices (including extreme precipitation and the number of dry days) based on CORE, CORDEX, CMIP5 and CMIP6 runs, although the CMIP6 ensemble was limited (12 GCMs) and results focused only on annual time scales. Finally, Ashfaq et al. (2020) provided an assessment of future characteristics of monsoon systems, including the West and East African monsoon, but results were based on a single CORE RCM only.

Other experiments available to study climate change projections over Africa include the High-Resolution Model Intercomparison Project (HighResMIP, Haarsma et al. 2016), which consists of a small subset of CMIP6 GCMs run globally at high resolution (between 0.28° and 1°). However, future projections from HighResMIP are available only until 2050 and, to the best of our knowledge, results for Africa are described only in Ajibola et al. (2020) who evaluated the performance of HighResMIP models in representing historical climate over West Africa.

Finally, promising results are becoming available from convection-permitting models (CPMs), i.e. RCMs run at very high resolution (around 5 km or less) so that convection, which is one of the main drivers of precipitation in Africa, can be explicitly simulated, with no or limited need for parameterisation. These simulations showed significant improvements in small-scale precipitation characteristics (such as intensity and diurnal cycle) with notable effects on projected changes in extremes. However, results are available only for a very limited period of time (10 years or less) (Kouadio et al. 2020; Kendon et al. 2019; Senior et al. 2021).

The construction of regional climate information needs to be based on multiple lines of evidence, including, but not limited to, the analysis of the results of different classes of climate models. In fact, it is important to assess whether and to what extent the somewhat limited size of the RCM ensembles (CORDEX and, especially, CORE) is able to capture the entire range of future changes projected by the full CMIP ensemble. In addition, although RCMs were shown to better reproduce some aspects of precipitation climatology compared to their driving GCMs (e.g., Gibba et al. 2019), this ‘added value’ is not always present (especially for mean quantities, e.g., Dosio et al. 2015) and it is often not straightforward to establish whether, for instance, improvements in the mean precipitation field are a result of a more realistic simulation of the physical processes (Dosio et al. 2019; Tamoffo et al. 2020).

In light of this, and building on and expanding the results of previous works (Dosio et al. 2019; Almazroui et al. 2020; Coppola et al. 2021), here we provide what is, to the best of our knowledge, the most comprehensive assessment to date of future daily characteristics of precipitation over Africa, by comparing the results from large ensembles of global (CMIP5, CMIP6) and regional (CORDEX, CORE) models. Projected changes at the end of the twenty-first century are analyzed for different emission scenarios, hence providing insight into the impact of different mitigation policies. The main aim of this work is to highlight the similarities and disagreements between the results of different model ensembles. In fact, where results are consistent across ensembles, a high degree of confidence can be assumed in the projected change (e.g., its intensity or spatial extent). On the other hand, when model results diverge, these discrepancies need to be taken into consideration by both the scientific community (to understand why differences occur) and the user community (including policy makers), which may need to be prepared to cope with contrasting but equally plausible futures (e.g. Dosio et al. 2020).

The paper is structured as follows: Sect. 2 describes the data and method used in the analysis. Section 3 discusses the results for both the recent past and future climate. Summary and concluding remarks are presented in Sect. 4.

2 Data and methods

Daily precipitation data for the period 1981–2100 were obtained for a large number of model simulations, listed in the Supplementary Information (SI) Tables 1–4. Data were collected from the Earth System Grid Federation (ESGF) servers and include all the freely available simulations at the time of writing. Simulations include a historical period, forced by observed natural and anthropogenic atmospheric composition, covering the period until 2005 (2014 for CMIP6) and future projections (until 2100) forced by three Representative Concentration Pathways, namely RCP2.6, RCP4.5 and RCP8.5 (note however that CORE results are not available for RCP4.5). CMIP6 simulations are based on a new and extended scenario approach, the SSP-RCP framework, which relates socioeconomic scenarios (SSPs) to the RCPs. Some SSP-RCP scenarios, namely SSP1-RCP2.6, SSP2-RCP4.5, SSP4-RCP6.0 and SSP5-RCP8.5, were designed specifically to provide continuity with the CMIP5 RCPs, resulting in the same radiative forcing by the end of the twenty-first century (see e.g., Tebaldi et al. 2021). Here, for consistency with the available CORDEX and CORE simulations, only the CMIP6 simulations driven by SSP1-RCP2.6, SSP2-RCP4.5 and SSP5-RCP8.5 were used.

The size of each ensemble varies greatly, ranging from 9 simulations for CORE to 33 for CMIP5. Also note that the number of runs in each ensemble differs depending on the RCP scenario, with RCP2.6 having the smallest number of simulations and RCP8.5 the largest (SI Tables 1–4).

The reference period for model evaluation and for the calculation of the climate change signal is defined as 1981–2010. This period provides a compromise between the historical periods of the CMIP5 (CORDEX, CORE, ending in 2005) and CMIP6 (ending in 2014), and coincides with the World Meteorological Organization “climatological standard normal period”.

Before computing any statistics, CMIP5 and CMIP6 outputs were interpolated to a 1.32° grid (i.e., three times coarser than the CORDEX one) over the African domain defined by the CORDEX protocol.

Amongst the many indices available for analyzing daily precipitation characteristics, besides seasonal mean (SM) precipitation, we focus mainly on the number of rainy days (RR1, where a rainy day is defined as a day when precipitation is > 1 mm), the simple precipitation intensity index (SDII, i.e., the mean precipitation rate on rainy days), and the maximum daily precipitation rate (RX1day). These indices provide useful metrics to characterize present and future frequency and intensity (both mean and maximum) of precipitation spells. In addition, the number of consecutive dry days (i.e. days with precipitation < 1 mm, CDD) is also considered as a meteorological indicator of dry spells.
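As an illustration of the index definitions above, the following minimal sketch computes SM, RR1, SDII, RX1day and CDD from a single series of daily precipitation. It is not the code used in this study: the function name `precipitation_indices` and the sample values are hypothetical, and CDD is assumed here to be the length of the longest run of consecutive dry days within the series.

```python
import numpy as np

def precipitation_indices(pr, wet_threshold=1.0):
    """Compute indices from a 1-D array of daily precipitation (mm/day).

    Hypothetical helper for illustration only.
    """
    pr = np.asarray(pr, dtype=float)
    wet = pr > wet_threshold   # rainy day: precipitation > 1 mm
    dry = pr < wet_threshold   # dry day: precipitation < 1 mm
    # Note: with these strict inequalities a day with exactly 1 mm is neither.

    # Assumption: CDD is the longest run of consecutive dry days.
    longest = current = 0
    for is_dry in dry:
        current = current + 1 if is_dry else 0
        longest = max(longest, current)

    return {
        "SM": float(pr.mean()),                                # mean over all days
        "RR1": int(wet.sum()),                                 # number of rainy days
        "SDII": float(pr[wet].mean()) if wet.any() else 0.0,   # mean on rainy days
        "RX1day": float(pr.max()),                             # maximum daily value
        "CDD": longest,
    }

# Made-up sample of ten daily values (mm/day)
sample = [0.0, 0.2, 5.0, 12.0, 0.0, 0.0, 0.5, 3.0, 0.0, 20.0]
indices = precipitation_indices(sample)
```

In practice the same calculation would be applied per grid point and per season before averaging over the reference or future period.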

2.1 Definition of subregions

A new set of reference regions has been designed for the upcoming Intergovernmental Panel on Climate Change (IPCC) 6th Assessment Report (AR6) (Iturbide et al. 2020) and used in studies with both global (e.g. Almazroui et al. 2020) and regional (e.g. Coppola et al. 2021) models. However, here we use a somewhat different set of regions (Fig. 2); in fact, the IPCC macro-regions may be too large for analyzing precipitation characteristics in regions where precipitation exhibits substantial spatial and temporal heterogeneity (compare e.g. the different precipitation annual cycle between north and south central Africa, or between the coast of the Gulf of Guinea and the western Sahel) as well as in regions with specific geographical characteristics (e.g. Ethiopian highlands or the coast of the Gulf of Guinea).

2.2 Definition of robust climate change signal

When analyzing the results of climate model ensembles, it is important to take into account the consensus amongst models. Usually, criteria for defining a “robust” climate change include the analysis of its sign and/or its significance, i.e. how the change compares with natural variability. There are many different methods to define the robustness of the climate change signal (see discussion in Dosio et al. 2019). Here we use a method developed from that used in the IPCC AR5, and similar to that for the upcoming IPCC AR6, which is based on the models’ agreement in terms of both significance and sign of the projected change.

First, for each land grid point and for each model run, we test the significance of the change at the end of the century (2070–2099) by comparing it to a threshold linked to the interannual variability. The threshold is defined as γ = √(2/30) × 1.645 × σ. In IPCC AR5 σ was calculated as the standard deviation of non-overlapping 20-year means of the long (200-year) pre-industrial control simulations. As this is not possible for regional model simulations (starting in 1950), σ is instead calculated as the interannual standard deviation of the linearly detrended annual time series in the reference period (1981–2010). The factor √(2/30) accounts for the fact that we consider the variability of a difference in means over a 30-year period, whereas the 1.645 factor accounts for a 90% confidence in the signal surpassing the internal variability (in IPCC AR5 this value was set to 2).
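The threshold calculation can be sketched as follows for a single grid point and a single model run. This is a minimal illustration rather than the code used in this study: the function names are hypothetical, the linear detrending uses a least-squares fit, and the sample standard deviation (`ddof=1`) is an assumption, since the text does not state which estimator was used.

```python
import numpy as np

def significance_threshold(annual_values, n_years=30, z=1.645):
    """Return gamma = sqrt(2/n_years) * z * sigma for one grid point.

    sigma is the standard deviation of the linearly detrended annual series
    over the reference period (ddof=1 is an assumption).
    """
    y = np.asarray(annual_values, dtype=float)
    t = np.arange(y.size)
    slope, intercept = np.polyfit(t, y, 1)       # least-squares linear trend
    residuals = y - (slope * t + intercept)      # detrended series
    sigma = residuals.std(ddof=1)
    return np.sqrt(2.0 / n_years) * z * sigma

def is_significant(change, annual_values):
    """True if the absolute change exceeds the variability threshold."""
    return abs(change) > significance_threshold(annual_values)

# Synthetic 30-year series: a linear trend plus an alternating +/-1 anomaly
years = np.arange(30)
series = 0.1 * years + (-1.0) ** years
gamma = significance_threshold(series)
```

For this synthetic series σ is close to 1, so γ is roughly 0.43 in the units of the index; a projected change smaller in magnitude than that would be treated as non-significant for this run.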

Second, we classify the change as follows:

  • The change is considered robust if more than 66% of the model simulations show a significant change and, at the same time, more than 66% of them agree on its sign.

  • The change is considered uncertain if more than 66% of the model simulations show a significant change but less than 66% of them agree on its sign.

In addition to these two criteria (and in contrast to the IPCC methodology), we also distinguish the case where more than 80% of model simulations exhibit a non-significant change (independently of the agreement on the sign): this indicates areas where any apparent change simulated by most of the models is small compared to the variability, i.e. non-significant. Finally, we denote with ‘little agreement’ areas where none of the above conditions are satisfied (i.e. regions where between 20 and 66% of models show significant change, independently of its sign) and no clear conclusions can be drawn.
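The four categories described above can be sketched for one grid point as follows. This is an illustrative reading of the criteria, not the code used in this study: the function name is hypothetical, and the sign agreement is assumed to be computed over all simulations in the ensemble (the text could also be read as agreement among the significant simulations only).

```python
import numpy as np

def classify_change(changes, thresholds):
    """Classify the ensemble change at one grid point.

    changes:    projected change for each simulation
    thresholds: significance threshold (gamma) for each simulation
    """
    changes = np.asarray(changes, dtype=float)
    thresholds = np.asarray(thresholds, dtype=float)

    significant = np.abs(changes) > thresholds
    frac_significant = significant.mean()

    # Assumption: sign agreement is measured over all simulations.
    sign_agreement = max((changes > 0).mean(), (changes < 0).mean())

    if frac_significant > 0.66:
        # Exactly 66% agreement is not covered by the text; treated as uncertain here.
        return "robust" if sign_agreement > 0.66 else "uncertain"
    if (1.0 - frac_significant) > 0.80:
        return "non-significant"
    return "little agreement"

# Made-up examples
robust = classify_change([1.0] * 9, [0.5] * 9)
uncertain = classify_change([1.0] * 5 + [-1.0] * 4, [0.5] * 9)
non_significant = classify_change([1.0] + [0.1] * 9, [0.5] * 10)
little = classify_change([1.0] * 5 + [0.1] * 5, [0.5] * 10)
```

With a nine-member ensemble such as CORE, the 66% criterion requires at least six simulations to show a significant change, which illustrates why the classification of small ensembles can hinge on one or two runs.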

Although in many studies higher thresholds are used (e.g. 80% for the agreement in sign in the IPCC methodology), here we decided to lower this threshold in order to make it applicable to small ensembles (such as CORE) for which, otherwise, robustness may be too strongly dependent on the results of a single model.

It should be noted that all definitions of robustness are subjective, and none of these methods attempt to link the projected change (hence its robustness) to its dynamic and thermodynamic drivers. It must also be noted that a robust change does not necessarily mean that the inter-model variability (interquartile range) is small, whereas this is true by definition for the regions where the signal is non-significant.

3 Results

3.1 Evaluation over the reference period (1981–2010)

Figure 1 shows the observed (GPCC_FDD, 1983–2010, with a spatial resolution of 1°, Ziese et al. 2018) and modelled spatial distribution of June–July–August (JJA) daily precipitation characteristics. Similar maps for December–January–February (DJF) are shown in SI Fig. 1. Here we focus on ensemble mean results only, whereas the uncertainty (spread) of the different ensembles is analyzed later.

Fig. 1

Mean daily precipitation characteristics for June–July–August (JJA) from observation (GPCC_FDD, 1983–2010) and multi model mean (1981–2010) from CMIP5, CMIP6, CORDEX and CORE. Results show seasonal mean precipitation (SM, first row), simple precipitation intensity index (SDII, second row), number of rainy days (RR1, third row) and maximum daily precipitation (RX1day, fourth row). The list of models used is reported in the SI Tables 1–4

The spatial distributions of seasonal mean (hereafter SM) precipitation produced by all model ensembles are qualitatively similar to each other and to the observations, although, locally, differences are visible, with the regional models capturing the details associated with a more realistic representation of the topography (e.g. Ethiopian highlands).

Differences between global and regional models are more marked when other daily precipitation characteristics are examined, such as the mean precipitation on rainy days (SDII), the number of rainy days (RR1) and the maximum daily precipitation rate (RX1day). In fact, we note that over the monsoon precipitation band, CORDEX and CORE results are closer to the observed values than those of the CMIP5 and CMIP6 models, which in general overestimate the number of rainy days but underestimate SDII and RX1day. Other studies have highlighted the difficulty of CMIP6 GCMs in accurately simulating West African monsoon precipitation intensities (Klutse et al. 2021) and their tendency to underestimate SDII over East Africa (Ayugi et al. 2021b). Underestimation of the extreme precipitation rates by the global models (especially CMIP5) is also reported by Coppola et al. (2021).

On the other hand, several studies showed the ability of the RCMs to better simulate, compared to the driving GCMs, higher order statistics and extreme events (e.g., Giorgi et al. 2014; Gibba et al. 2019), although first order statistics (such as seasonal mean precipitation) are not always improved by the downscaling (e.g., Dosio et al. 2015, 2019; Akinsanola and Zhou 2019) as the geographical distribution of seasonal precipitation simulated by the RCMs is strongly affected by the boundary conditions.

Several inconsistencies are found in the observed precipitation between different observational products, especially over regions where station networks are sparse (Sylla et al. 2013; Maidment et al. 2015; Diaconescu et al. 2015; Herold et al. 2017; Hua et al. 2019; Masunaga et al. 2019; Bador et al. 2020), even for mean quantities (Fig. 2). When model simulations are compared to a large ensemble of observational products including gauge-based, satellite-based and reanalysis products (which have been evaluated in Dosio et al. in review), all ensembles generally reproduce the annual cycle of monthly averaged daily precipitation over many African subregions. Model ensemble means generally lie within the uncertainty range of observed datasets, although large differences exist among individual ensemble members. However, over the Sahel region (West Sahel, SAH_W and East Sahel, SAH_E) global models tend to underestimate the precipitation peak during June–September. Over western southern Africa (SAF_W) all models overestimate December–March precipitation. None of the model ensemble means is able to reproduce the double precipitation peak over the coast of the Gulf of Guinea (GN_C) and all ensemble means underestimate (overestimate) the April (October) precipitation peak over the Horn of Africa (HRN). Similar findings are reported by Almazroui et al. (2020) and Sian et al. (2021) for CMIP6, Yang et al. (2015), Agyekum et al. (2018), Zebaze et al. (2019), and Ongoma et al. (2019) for CMIP5 and e.g., Endris et al. (2016), Favre et al. (2016) and Tamoffo et al. (2019) for CORDEX.

Fig. 2

Annual cycle of monthly averaged daily precipitation (mm/day) over the reference period (1981–2010) over the African subregions shown as blue boxes in the map. The thick black line represents the GPCC_FDD observed values, whereas the thin black lines show the range of a large ensemble of observational products, including reanalyses, satellite-based and gauge-based products discussed in Dosio et al. (in review). Colored lines and shaded areas show the ensemble mean and range of the different model ensembles, respectively

Individual model results can vary greatly (e.g., James and Washington (2013) and Washington et al. (2013) for CMIP5; Klutse et al. (2021) for CMIP6; Teichmann et al. (2020) and Gnitou et al. (2021) for CORE; Dosio et al. (2019) for CORDEX). Generally, the spread amongst regional model results is smaller than that of the global models (apart, notably, from RX1day), although this may crucially depend on the different size of the model ensembles (in particular for the CORE experiment, which has used only three CMIP5 GCMs as boundary conditions).

The comparison of modelled and observed daily precipitation indices over several subregions is shown in Fig. 3. All model ensembles perform generally satisfactorily for most indices over all regions, although global models tend to underestimate SDII and RX1day (except for southern Africa) and overestimate the number of rainy days. Kim et al. (2020) noted that CMIP6 models simulate more intense precipitation than CMIP5 models over most regions of the world; this is in agreement with our findings, particularly over central and southern Africa. Akinsanola et al. (2021) noted that CMIP6 models overestimate mean precipitation but underestimate SDII, CDD and extreme precipitation rates over the Horn of Africa in March–April–May (MAM) and September–October–November (SON). Similar results for East Africa are shown by Ayugi et al. (2021b) for both CMIP5 and CMIP6. Klutse et al. (2021) showed that most CMIP6 models overestimate the frequency of wet days over the coast of the Gulf of Guinea during the West African monsoon season.

Fig. 3

Comparison of observed and modelled daily precipitation indices over African subregions over the reference period (1981–2010). In each panel, the box-and-whiskers plots show the median, interquartile and full ranges of the ensembles of observational products (black) and models (colors). The vertical black line separates the indices ranging from 0 to 20 (SM and SDII) from those ranging from 0 to 100 (RR1, RX1day, CDD). Units depend on the index: SM, SDII and RX1day are measured in mm/day, whereas RR1 and CDD in days. Note that the season when indices are analyzed depends on the subregion, and corresponds to the season of maximum rainfall (see Fig. 2)

On the other hand, regional models tend to overestimate maximum daily precipitation rates, especially over the western Sahel, East Africa and southern Africa. In addition, the RCM interquartile range is particularly large over the Sahel and GN_C, especially for CORDEX, although the CMIP6 full range can be similar to or even larger than that of the RCMs over e.g. central Africa. As mentioned previously, the ability of RCMs to add value to the driving GCM in simulating precipitation characteristics (especially higher order statistics) has been investigated in several studies (e.g. Dosio et al. 2015; Pinto et al. 2016; Nikiema et al. 2017; Fotso-Nguemo et al. 2017; Gibba et al. 2019; Tamoffo et al. 2020; Gnitou et al. 2021). It must be noted that added value, i.e. non-negligible fine-scale information that is absent in the lower resolution simulations, stems from physical mechanisms resolved at higher resolution, for either present-day or future climate (Dosio et al. 2019).

In summary, the analysis of the results presented in Figs. 1, 2 and 3 provides evidence on the ability of global and regional models to simulate different precipitation characteristics over the reference period, and hence on their fitness for purpose for generating reliable future projections. Generally, both global and regional models reproduce, in the mean, the observed indices within the observational uncertainties (with few exceptions), although large differences exist amongst individual simulations, especially for precipitation extremes. There is no clear evidence of an improvement in CMIP6 performance compared to CMIP5, with both ensembles generally overestimating the number of rainy days and, consequently, underestimating the daily precipitation intensity. Although the performance of regional models depends on the driving GCMs, especially for mean quantities, and their performance for other characteristics depends on the region and index, RCMs show a tendency to better simulate e.g. SDII but to overestimate RX1day. Finally, Dosio et al. (2019) showed that, although large biases exist in e.g. the position, extension and intensity of the precipitation band simulated by the CORDEX RCMs, a wet (dry) bias in the present climate does not necessarily imply a tendency towards wetter (drier) future precipitation characteristics, making any attempt to select a ‘best-performing’ model (or class of models), or even to link future projections to simulation skill over the present climate, very challenging (see also e.g., Almazroui et al. 2021).

3.2 Projection of future precipitation characteristics

Figure 4 shows the DJF projected change in precipitation indices at the end of the century under the SSP5-RCP8.5 emission scenario. Results for the other seasons are shown in the SI Figs. 3–5. Here we focus on the main similarities and disagreements across the different ensembles, thus highlighting regions where a consistent message can be drawn and those where results are contrasting. Seasonal mean precipitation in DJF is projected to exhibit a robust increase according to all model ensembles over the Horn of Africa, and parts of Angola, Kenya and Tanzania, whereas a robust decrease is projected over part of the Atlas region and the northern coasts of Morocco and Algeria. On the other hand, large discrepancies exist between global and regional models on the projected change in mean precipitation over central Africa, where both CMIP5 and CMIP6 models project a robust increase in precipitation. It should be noted that for the GCMs the increase is robust even for SSP1-RCP2.6 (see SI Fig. 6). In contrast, CORDEX models show mainly little change or an uncertain signal, while CORE simulations show a robust decrease. Drying is projected by all ensembles over most of western South Africa, Namibia and Botswana, but the change is robust only in the CORE simulations. Also in MAM (SI Fig. 3) the robust increase in precipitation over central Africa simulated by the global models is not present in the regional models.

Fig. 4

Multi model mean changes in mean daily precipitation characteristics for December–January–February (DJF) for the period 2071–2100 w.r.t. 1981–2010, under SSP5-RCP8.5. Areas with no hatching (i.e., in full color) are those where the signal is robust. The regions where the change is non-significant or uncertain are highlighted by hatching, together with the areas where there is little agreement on the sign and/or magnitude of the change. The number of models used for each ensemble is shown next to the ensemble name

Similarly, in JJA (SI Fig. 4) consistent robust drying is projected over part of the western Sahel (e.g. Senegal) and part of the Guinean region (e.g. Guinea Highland). However, while global models show wetter conditions over the eastern Sahel and the Ethiopian Highlands, CORDEX and CORE results show uncertain change and robust drying, respectively (see also Dosio et al. 2019). Consistent wetting is projected over the Horn of Africa and drying over most of Southern Africa and Madagascar for SON, with a robust increase in the length of dry spells (SI Fig. 5). Similar results have been found for CMIP6 (Almazroui et al. 2020; Ukkola et al. 2020), CORDEX (e.g. Bichet et al. 2020; Gibba et al. 2019; Dosio et al. 2019) and CORE (Teichmann et al. 2020). Other studies find discrepancies between RCMs and the driving GCMs (e.g., Saeed et al. 2013; Teichmann et al. 2013; Diallo et al. 2016; Dosio and Panitz 2016; Pinto et al. 2018).

Although results for seasonal mean precipitation may be contrasting for specific regions and seasons across the different model ensembles, the change in other indices shows better agreement. For instance, for DJF all ensembles project an increase in SDII over central Africa, and an increase (decrease) in RR1 over Tanzania and the Atlas region (Mozambique). Similarly, for JJA a consistent robust decrease in RR1 is visible over the Guinean coast and western Sahel, northern coast of Algeria and Morocco and parts of South Africa (see e.g., Pinto et al. 2016 for CORDEX). Also, Ukkola et al. (2020) noted an increase in meteorological drought duration over southern Africa, Guinea and the northern African coasts in the CMIP5 and CMIP6 ensembles. Ayugi et al. (2021a) found an increase in CDD, along with an intensification of extreme precipitation over East Africa, while Moon and Ha (2020) noted a thermodynamically driven increase in precipitation rates over the monsoon regions.

It should be noted that the classification of results according to the robustness of the signal is crucially dependent on many factors, including the threshold used, the ensemble size and the modelled internal variability over the reference period, which can vary greatly amongst ensembles (see SI Fig. 2). In fact, it is evident for instance that the spatial distribution of the changes in RX1day and CDD for DJF (and, to a lesser extent, RR1 and other indices in other seasons) look remarkably similar across the ensembles, although the magnitude of the change can vary substantially.

The results of the different ensembles can also be compared by calculating the fraction of land for which the projected change in an index is robust, uncertain or non-significant, as shown in Figs. 5, 6 and 7 and SI Figs. 7, 8. This is useful to investigate the similarities and discrepancies of the information derived from different ensembles. For instance, over the Atlas region in DJF, all ensembles suggest that under SSP5-RCP8.5 the majority of the land is projected to face a robust reduction in mean precipitation (Fig. 5) and number of rainy days (Fig. 7), accompanied by longer dry spells (SI Fig. 8). Reduction in maximum precipitation intensity (SI Fig. 7) is less consistent, whereas SDII is projected to not change significantly over most of the region even under the high emission scenario (Fig. 6). Crucially, all indices show a non-significant change over most of the region under SSP1-RCP2.6, which emphasizes the benefit of implementing effective mitigation policies. Results are also consistent across all the ensembles and indices in SON for HRN, which is projected to face more frequent and intense rains, and in DJF for EAF, where, despite an increase in mean precipitation, the number of rainy days is projected to decrease over a substantial fraction of land, with a consequent increase in the length of dry spells.
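The land-fraction calculation can be sketched as below for a regular latitude–longitude grid. This is only an illustration under stated assumptions: the function name and inputs are hypothetical, and grid cells are weighted by the cosine of latitude to approximate their area, a choice that is not specified in the text.

```python
import numpy as np

def land_fraction_by_category(categories, lat, land_mask):
    """Percentage of land area falling in each category.

    categories: 2-D array (lat, lon) of category labels
    lat:        1-D array of latitudes in degrees
    land_mask:  2-D boolean array, True over land
    """
    categories = np.asarray(categories)
    land_mask = np.asarray(land_mask, dtype=bool)

    # Assumption: cell area is proportional to cos(latitude).
    row_weights = np.cos(np.deg2rad(np.asarray(lat, dtype=float)))
    weights = row_weights[:, None] * np.ones(categories.shape)
    weights = np.where(land_mask, weights, 0.0)   # ignore ocean cells
    total = weights.sum()

    return {
        str(label): 100.0 * weights[categories == label].sum() / total
        for label in np.unique(categories[land_mask])
    }

# Made-up 2 x 2 subregion, all land
cats = np.array([["robust", "robust"],
                 ["uncertain", "non-significant"]])
fractions = land_fraction_by_category(cats, lat=[0.0, 60.0],
                                      land_mask=np.ones((2, 2), dtype=bool))
```

In this made-up example the two cells at the equator carry twice the weight of those at 60°, so the robust category covers about two thirds of the land area rather than half.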

Fig. 5

Fraction (%) of land area where the change in seasonal mean (SM) precipitation is robust (colored bars), non-significant (\\\) or uncertain (XXX), according to different emission scenarios and model ensembles. The white area of the column not covered by any colors or symbols represents little agreement in the change. In each column, the categories are stacked on top of one another, so that the total of the 4 categories is 100%. For each emission scenario, results are shown for each ensemble starting from CMIP5 (leftmost column), followed by CMIP6, CORDEX and CORE (rightmost column). Blue/green colors indicate a positive change (i.e. an increase in mean precipitation), brown/orange colors a negative change. For instance, for SAH_W, under RCP8.5, CMIP5 results show robust change over approximately 55% of the land area (15% robust positive change, 40% negative), little agreement over approximately 20% of the land area, and uncertain change over the remaining 25% of the land area. Note that CORE results are not available for the SSP2-RCP4.5 scenario

Fig. 6

As Fig. 5 but for the simple precipitation intensity index SDII

For southern Africa, despite CORE projecting a decrease in mean precipitation over a vast fraction of land, in contrast to the other ensembles, all ensembles agree on an intensification of rainfall for 15–30% of land in DJF, with CORE projecting a robust reduction in rainy days and an increase in CDD over more than 80% of land, under SSP5-RCP8.5.

Results for West Africa, central Africa and the Ethiopian Highlands are less consistent. In general all ensembles agree on a robust increase in SDII over parts of these regions, in particular central Africa (both CAF_N in SON and CAF_S in DJF), where SDII (and Rx1day) is projected to increase over a large fraction of land (more than 90% for the global models, Fig. 6 and SI Fig. 7). Conversely, a reduction of the number of rainy days is projected over the majority of western Sahel (between 50 and 95% of land) by all ensembles (Fig. 7). However, other indices over these regions show contrasting results between global and regional models. In particular, while global models project a robust increase in SM over most of SAH_E and part of ETH in JJA, with up to 95% of land for CMIP6, regional models project a robust decrease. For the eastern Sahel in JJA, this projected change is due to an increase (in global models) and reduction (in RCMs) in the number of rainy days, as precipitation mean (SDII) and maximum intensity are projected to mostly increase. Over the western Sahel in JJA, mean precipitation is projected to mainly decrease, apart from CMIP6 that shows an increase over nearly 40% of the land area. However, all ensembles show a robust reduction in the number of rainy days over an area ranging from nearly 50% (CMIP6) to nearly 100% (CORE) with consequent increase of the length of dry spells.

Fig. 7

As Fig. 5 but for the number of rainy days (RR1)

Over parts of West and Central Africa, some ensembles show an uncertain change. As mentioned, this means that models do not agree on the sign of change, but this change is nevertheless significant for more than 2/3 of the models. Hence, for instance, over the west Sahel in JJA, all ensembles show that a large fraction of land (up to nearly 90% for CMIP6) will face a significant change in mean precipitation, although over some of this land the direction of change is uncertain.
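The classification underlying Figs. 5, 6 and 7 can be sketched as follows. This is only an illustrative sketch, not the code used for the analysis: the 2/3 significance fraction is taken from the text above, whereas the sign-agreement threshold (`sign_frac=0.8`), the rule used here for the non-significant class, the integer class codes and the cosine-latitude area weighting are all assumptions (the actual criteria are given in the methods section). The function and argument names are hypothetical.

```python
import numpy as np

def classify_change(delta, significant, sig_frac=2/3, sign_frac=0.8):
    """Classify each grid point from per-model projected changes.

    delta:       array (n_models, n_lat, n_lon) of projected changes
    significant: boolean array of the same shape, True where a model's
                 change passes its own significance test
    Returns an integer array (n_lat, n_lon) with assumed codes:
      2 robust increase, -2 robust decrease, 1 uncertain,
      0 non-significant, 9 little agreement.
    """
    n_models = delta.shape[0]
    frac_sig = significant.sum(axis=0) / n_models
    frac_pos = (delta > 0).sum(axis=0) / n_models
    frac_neg = (delta < 0).sum(axis=0) / n_models

    many_sig = frac_sig > sig_frac                 # "more than 2/3 of the models"
    out = np.full(delta.shape[1:], 9, dtype=int)   # default: little agreement
    out[(1.0 - frac_sig) > sig_frac] = 0           # assumption: mostly non-significant
    out[many_sig] = 1                              # significant, but sign not agreed
    out[many_sig & (frac_pos >= sign_frac)] = 2    # assumption: sign agreement >= 80%
    out[many_sig & (frac_neg >= sign_frac)] = -2
    return out

def land_fractions(classes, lat, land_mask):
    """Area-weighted percentage of land falling in each class.

    lat is in degrees; cos(lat) weights assume a regular lat-lon grid.
    """
    weights = np.cos(np.deg2rad(lat))[:, None] * land_mask
    total = weights.sum()
    return {int(c): 100.0 * weights[classes == c].sum() / total
            for c in np.unique(classes[land_mask])}
```

With these definitions, a grid point where most models show a significant change but disagree on its direction falls in the uncertain class rather than the non-significant one, which is the distinction drawn in the paragraph above.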

Likewise, for some indices and regions the change is robust over a substantial fraction of land also for moderate or even low emission scenarios (especially for CMIP6). This may have important consequences for the planning of adaptation measures independently of the effectiveness of the mitigation policies.

The change in precipitation indices averaged over the subregions is shown in Figs. 8, 9 and 10 and SI Figs. 9, 10. The results for seasonal mean precipitation (Fig. 8) show that the intermodel spread is always very large, with many cases where models’ results show opposite signs in the direction of the projected change. A notable exception is the Horn of Africa in SON, where at least 75% of models in all ensembles show a positive change, which is also robust over at least 30% of land. Consistency across model ensembles in other regions is scarce; for instance, over SH_E in JJA, CMIP5, CORDEX and CORE project a decrease (in terms of the ensemble means) which is robust over more than 30% of land, but CMIP6 projects an increase. Over GN_C and CAF_N in SON, and EAF in DJF, CMIP5, CMIP6 and CORDEX project an increase in precipitation, but CORE a decrease. Over SAH_E and ETH in JJA and CAF_S in DJF, global models project an increase in precipitation, but the RCMs a decrease.
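The ensemble statistics discussed here (median, interquartile range, full range and the share of models projecting an increase) can be computed along the following lines. This is a minimal sketch, assuming that each model contributes a single regionally averaged change; the function name and the dictionary keys are hypothetical, and NumPy's default linear interpolation is used for the percentiles, which may differ from the convention used for the figures.

```python
import numpy as np

def summarize_ensemble(changes):
    """Summary statistics of regionally averaged changes, one value per model."""
    changes = np.asarray(changes, dtype=float)
    q25, median, q75 = np.percentile(changes, [25, 50, 75])
    return {
        "median": float(median),
        "iqr": (float(q25), float(q75)),
        "full_range": (float(changes.min()), float(changes.max())),
        # share of models projecting an increase (e.g. "at least 75% of models")
        "frac_positive": float((changes > 0).mean()),
    }
```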

Fig. 8

Changes in seasonal mean daily precipitation (SM) for each subregion for the different model ensembles and emission scenarios. In each panel, the box and whiskers plots show the median, interquartile range and full range of the various ensembles. Blue (red) colors indicate that the mean positive (negative) change is robust over more than 30% of land points. White indicates that the change is non-significant over more than 30% of land points. For the CMIP5 ensemble, the blue lines indicate the median and full range of the subset of models downscaled by the CORDEX models. For CORDEX, the green lines indicate the median and full range of the RCMs also used in the CORDEX-CORE ensemble.

Results for other indices are much more consistent: for instance, SDII (Fig. 9) shows a general increase in all regions (except for SAH_W from CORE) and scenarios (although the increase is robust over more than 30% of land mostly under SSP5-RCP8.5). RR1 shows a consistent increase over HRN in SON and a decrease over SAH_W in JJA and CAF_S, EAF, SAF_E and SAF_W in DJF. Rx1day (SI Fig. 9) also shows remarkable agreement amongst all ensembles, with a general tendency toward an increase over all regions, although model spread can be particularly large.

Fig. 9

As Fig. 8 but for SDII (mm/day)

Fig. 10

As Fig. 8 but for the number of rainy days

Figures 8, 9 and 10 can also be helpful to investigate the impact of the choice of the GCMs used to drive CORDEX and CORE simulations, and to answer the question of whether the RCMs results are an adequate sample of the full CMIP5 uncertainty range.

First we note that the subset of CMIP5 models used in the CORDEX runs reflects the entire CMIP5 range for SM in GN_C, CAF_N, CAF_S and ETH. However, over other regions (and for other indices) the represented range is much smaller than that spanned by the full CMIP5 ensemble, especially over the HRN, SAH_W and southern Africa. As a consequence, the range of CORDEX projections is usually smaller than that of the full CMIP5 ensemble (with the notable exception of GN_C). However, for other indices the situation is different. For instance, for SDII the CORDEX range is usually comparable to or even larger than that of the full CMIP5 ensemble, whereas for RR1 the range is usually smaller and more similar to that of the subset of CMIP5 models used for downscaling. These results suggest that some precipitation characteristics such as the number of rainy days are critically dependent on the driving GCMs (especially in regions that are most affected by the position of the monsoon band inherited through the boundary conditions, or teleconnection patterns, see e.g. Endris et al. 2013), whereas the precipitation intensity is more dependent on the RCM parameterizations (such as the convection scheme). This is also generally in line with the findings of Bichet et al. (2020), who state that most of the uncertainty in CORDEX results over the Horn of Africa, the coasts of North Africa, and southern Africa derives from that of the driving GCMs, whereas over the tropics and parts of eastern Africa, most of the uncertainty results from a large dispersion across RCMs. However, Pinto et al. (2018) found that part of the disagreement in precipitation projections between GCMs and RCMs over southern Africa is due to inconsistencies in the physical parameterizations of precipitation processes rather than inconsistencies in regional‐scale circulation patterns.
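One simple way to quantify how well a subset of driving GCMs samples the full ensemble is the ratio of the two ranges. The sketch below is only illustrative: the function name is hypothetical and the numbers are made up for the example, not taken from the simulations analyzed here.

```python
def range_coverage(full_ensemble, subset):
    """Fraction of the full-ensemble range spanned by a subset of its members."""
    full_span = max(full_ensemble) - min(full_ensemble)
    if full_span == 0:
        return float("nan")
    return (max(subset) - min(subset)) / full_span

# Illustrative, made-up changes (mm/day) for six hypothetical GCMs
full = [-0.8, -0.3, 0.0, 0.2, 0.5, 1.2]
downscaled = [-0.3, 0.2, 0.5]  # hypothetical subset used to drive the RCMs
coverage = range_coverage(full, downscaled)
```

A value close to 1 indicates that the downscaled subset spans most of the full range, whereas a small value indicates that the wettest and/or driest GCMs were not downscaled.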

It must be noted that only one RCM (SMHI-RCA4) downscaled all ten of the GCMs used in CORDEX-Africa, whereas many other RCMs downscaled fewer GCMs, and in some cases only one, which may impact the range of CORDEX future projections. In addition, Dosio et al. (2019) noted that where CORDEX results are uncertain, especially over Central Africa and parts of West Africa, subsampling the model ensemble (e.g. according to the RCM or the driving GCM) does not necessarily reduce the uncertainty or yield a more robust result.

For CORE, the situation is more complicated because not only is the subsample of downscaled CMIP5 models very limited (3 GCMs compared to 10 for CORDEX), but the number of RCMs used is also small (3 RCMs compared to 7 in CORDEX). Despite this, for some indices and regions, the range of CORE results is comparable to (and sometimes larger than) that of CORDEX, especially over GN_C and SAH_W.

This aspect is further investigated by analyzing the CORE performance in simulating present and future characteristics of the West African monsoon (Figs. 11 and 12). First we note that both CORE and CORDEX ensemble means provide a satisfactory representation of the present climate precipitation over the region, with superior results for SDII and RR1 compared to the global models (SI Fig. 11, see also Gnitou et al. 2021). However, the projected precipitation characteristics are strikingly different between CORE and all the other ensembles (Fig. 11). The global models and CORDEX show a decrease in mean precipitation between May and June over the coast of the Gulf of Guinea and an increase between July and November, especially over the Sahel (although the spatial pattern and intensity differ between CORDEX and the GCMs). The CORE simulations, on the other hand, show a marked drying over the entire monsoon precipitation band throughout the year, accompanied by a decrease in SDII between May and July that is absent in the other ensembles. The reduction of the number of rainy days, although projected by all ensembles, is also much stronger in the CORE simulations. To investigate this further, the individual CORE results are shown in Fig. 12 (for SM only). It is clear that the ensemble mean results are strongly influenced by the CLMcom-KIT-CLM5.0 simulations, all of which show marked drying throughout the year. However, the March–July drying of all nine CORE runs is stronger than the average of all the other ensembles (compare Figs. 11 and 12). Crucially, the CORE results presented by Teichmann et al. (2020) and Coppola et al. (2021) do not include the CLMcom-KIT-CLM5.0 simulations.

Ashfaq et al. (2020) analyzed the ICTP-RegCM CORE runs and found a late arrival of the monsoon onset in response to warming, with the strongest delay in the start of the rainy season over the Sahel. They relate this delay to a suppression of the mostly convective pre-monsoon precipitation, linked to an increased boundary layer height over land and limited moisture supply as winds predominantly blow from the dry land regions. Dosio et al. (2020) investigated the different future precipitation over West Africa in CORDEX runs by separating the ‘dry’ from the ‘wet’ runs. They found that dry and wet models show similar patterns of the dynamic and thermodynamic terms of the moisture budget, although magnitudes are larger in the dry models. The largest discrepancies are found in the strength of the land–atmosphere coupling, with dry models showing a marked decrease in soil moisture and evapotranspiration. Diallo et al. (2016) also highlighted the importance of the balance between evaporation and precipitation in projections over West Africa by the RCMs and their driving GCMs. By analyzing RCM runs at different resolutions over Africa, Wu et al. (2020) found that the ability of RCMs to simulate precipitation (compared to their driving reanalysis) is in many cases simply related to model formulation (especially the convection scheme) rather than resolution, which, however, controls the amplitude of the bias.

Fig. 11

Time-latitude plots of change in precipitation characteristics (SM, SDII and RR1) over West Africa. For each month and latitude, model results are zonally averaged between 18° W and 10° E. The blue dashed lines indicate the West Sahel (SAH_W, region between the two dashed lines) and Guinea Coast (GN_C, region below the dashed line at 10° N) regions (see e.g. map in Fig. 2). The black line defines regions where present precipitation exceeds 4 mm/day (see SI Fig. 11)
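The zonal averaging behind such time-latitude diagrams can be sketched as below. This is an illustrative sketch only: it assumes monthly fields on a regular latitude-longitude grid with longitudes expressed in degrees east between -180 and 180, and it skips missing values (NaN), for example where points have been masked out. The function name and arguments are hypothetical.

```python
import numpy as np

def time_latitude(change, lon, lon_min=-18.0, lon_max=10.0):
    """Zonal mean of a monthly field between two longitudes.

    change: array (n_month, n_lat, n_lon) of the change in an index
    lon:    1-D array of longitudes in degrees east (-180..180, assumed)
    Returns an array (n_month, n_lat) suitable for a time-latitude plot.
    """
    lon = np.asarray(lon, dtype=float)
    in_band = (lon >= lon_min) & (lon <= lon_max)   # 18° W to 10° E by default
    return np.nanmean(change[:, :, in_band], axis=2)
```

Because the grid points along a latitude circle are equally spaced in this assumed grid, an unweighted mean over longitude is sufficient.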

Fig. 12

Time-latitude plots of change in mean precipitation over West Africa according to the individual CORE model simulations

4 Summary and concluding remarks

To the best of our knowledge, this paper provides the first assessment of future daily characteristics of precipitation over Africa explicitly comparing the results of large ensembles of global (CMIP5, CMIP6) and regional (CORDEX, CORE) climate models. We highlight in particular the similarities and inconsistencies between the results of different ensembles. Where results are consistent across ensembles, a high degree of confidence in the projected change can be assumed. Where results differ, our work can be useful to identify the regions where further research is needed to thoroughly investigate these discrepancies.

In general, results for seasonal mean precipitation are not always consistent amongst climate model ensembles: in particular, global models tend to project a wetter future compared to RCMs, especially over the Eastern Sahel in JJA, as well as Central and East Africa in DJF. In addition, the intermodel spread is also very large over many regions, with models within the same ensemble showing opposite signs in the direction of the projected change.

Results for other indices are more consistent. For instance, under RCP8.5, all ensembles project a robust increase in SDII over central Africa in DJF and MAM, even where for instance CORE projects a decrease in SM. In general, all ensembles consistently project a mean positive change in SDII during the wet season over all regions (except SAH_W for CORE); this change is also consistently robust over a large fraction of land especially over central Africa, East Africa and the coast of the Gulf of Guinea.

A decrease in RR1 is projected on average by all ensembles over West Sahel in JJA, and the Atlas region, CAF_S, East Africa and southern Africa in DJF. In particular, over the West Sahel, the Atlas and East Africa the change is negative for at least 75% of models in all ensembles, and it is robust over the majority of land. Conversely, over the Horn of Africa the majority of models in all ensembles project an increase in RR1 in SON. This change is indeed robust over more than 40% of land in all ensembles except CORE.

The maximum daily intensity is also projected to increase on average (and by the majority of the models) by all ensembles over most of the regions during the wet season (apart from SAH_W for CORE), with the exception of the Atlas region in DJF, where a decrease is projected instead. Under RCP8.5 the change is robust over more than 30% of land in all ensembles over central and East Africa and the Ethiopian highlands (and to a lesser extent the Horn of Africa), especially for the GCMs.

Finally, under RCP8.5 an increase in CDD is consistently projected by all ensembles and by the majority of models (if not all) over southern Africa in DJF, the Ethiopian highlands in JJA and, particularly, the Atlas region in DJF, which, amongst all regions, shows the largest increase in both the magnitude (around 10 days/season) and geographical extension of the change (more than 60% of land).

Some inconsistencies between the different model ensembles exist for specific regions and indices. For instance, large discrepancies exist between global and regional models in terms of the projected change in mean precipitation over central Africa in MAM, and the eastern Sahel and the Ethiopian Highlands in JJA, where CMIP5 and CMIP6 models project a robust increase in precipitation, CORDEX results show mainly little change or an uncertain signal, and CORE simulations show a robust decrease. In addition, over the Eastern Sahel in JJA (and to a lesser extent CAF_N and GN_C in SON), most GCMs show an increase in RR1 (which is robust over more than 30% of land) but RCMs project a robust decrease; over this region, GCMs and RCMs also project opposite signs in the change of the length of dry spells.

In order to provide a more synthetic summary of our findings, SI Tables 5–8 show the mean change (2071–2100 w.r.t. 1981–2010 under SSP5-RCP8.5) in precipitation indices for all subregions, model ensembles and seasons. We note that all ensembles agree on the sign of change in at least 7 out of 11 subregions in all seasons for SDII, RR1 and CDD. The results for Rx1day show agreement amongst the ensembles in at least 9 subregions in all seasons apart from MAM. SM shows the smallest consensus amongst the ensembles, especially in MAM and JJA. More specifically, in DJF all ensembles agree on the sign of change in at least 3 indices over all subregions, with HRN, CAF_N, ATL, SAH_E and SAH_W showing agreement for all of the indices. In MAM, the model ensembles do not agree on the sign of the change for any of the indices over SAH_E and ETP_H, but they agree on the sign of at least 4 indices over SAF_W, EAF, CAF_S, HRN and ATL, and for all indices over SAF_E. In JJA over SAH_W the ensembles agree on the sign of change only for RR1. However, over ATL, EAF and SAF_E they agree on the sign of change of all indices. Finally, in SON, the ensembles agree on the sign of change only for SDII over SAH_E but for at least 4 indices over ATL, CAF_S, EAF, SAF_E and SAF_W, and for all indices over ETP_H and HRN.
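Counting the subregions where all ensembles agree on the sign of change, as done above, amounts to a simple check on the ensemble-mean values. The sketch below is only illustrative: the function name is hypothetical, it expects one ensemble-mean change per ensemble for each subregion, and a change of exactly zero is treated here as a lack of agreement (an assumption).

```python
import numpy as np

def regions_with_sign_agreement(mean_change):
    """Return the subregions where every ensemble-mean change has the same sign.

    mean_change: dict mapping a subregion name to a sequence of ensemble-mean
                 changes (e.g. one value each for CMIP5, CMIP6, CORDEX, CORE).
    """
    agree = []
    for region, values in mean_change.items():
        signs = np.sign(np.asarray(values, dtype=float))
        if np.all(signs > 0) or np.all(signs < 0):
            agree.append(region)
    return agree
```

Applied to each index and season in turn, the length of the returned list gives the number of subregions with agreement, i.e. the kind of count reported in this paragraph.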

It is not straightforward to understand why differences between the results of different ensembles exist. It must be noted, first, that the CORDEX and CORE RCMs have been driven by a small subset of the full CMIP5 ensemble. In regions and for seasons where precipitation (and its future change) is driven mainly by large scale dynamics, the difference between the results of the entire CMIP5 ensemble and those of CORDEX likely depends on the fact that some GCMs have not been downscaled. However, where local and regional mechanisms (including e.g. convection, topography, African easterly waves) are the main drivers for precipitation, the results can indeed be linked to the different simulation by GCMs and RCMs of the relevant physical and small scale processes. In addition, CORDEX results are influenced by the heterogeneity of the RCM-GCM matrix (with only one RCM downscaling 10 GCMs). On the other hand, the CORE ensemble includes only 3 RCMs.

The analysis of CORE results is particularly interesting for the evolution of the future characteristics of the West African Monsoon. Although CORDEX and CORE results are very similar over the reference period (and in general, closer to the observations than the GCMs), large differences exist when future projections are compared, with CORE showing a pronounced drying over the entire monsoon precipitation band throughout the year, accompanied by a decrease in SDII between May and July that is absent in the other ensembles. The reduction of the number of rainy days, although projected by all ensembles, is again much stronger in the CORE simulations. Although CORE ensemble mean results are clearly influenced by the CLMcom-KIT-CLM5.0 simulations (which are particularly ‘dry’), the March–July drying of all nine CORE runs is stronger than the average of all the other ensembles.

In addition to these general conclusions, there are some caveats to our study that need to be mentioned. In particular:

  1. The definition of robustness is sensitive to the thresholds used (see methods) and the model ensemble size: in particular, smaller ensembles make the robustness criteria much more sensitive to outliers (i.e., dependent on the results of a single model). Although this may affect some of our results (e.g. the fraction of land where results are robust), the general conclusions of our analysis, based on the similarities and differences between ensemble results, hold. In addition, our results for CORDEX agree qualitatively with those of Dosio et al. (2019), which are based on a different threshold and statistical test for the definition of a robust climate change signal.

  2. Our study is based on the ‘one simulation one vote’ approach, i.e., we do not weight results based on model independence (e.g., closely related versions of the same GCM). In particular, the matrix of the CORDEX GCM-RCM simulations is very heterogeneous and partly unbalanced, with only one RCM downscaling a relatively large number (10) of GCMs. However, Dosio et al. (2019) showed that results are often robust (or non-significant) regardless of the choice of the specific RCMs or GCMs, and where the results are uncertain, a simple subsampling based on averaging according to the RCM and/or the GCM is not able to significantly reduce the uncertainty or change the value of the mean change.

  3. In this paper we focus on daily characteristics of precipitation. However, useful regional climate information can also be constructed by analyzing model results for present and future climate at different, multiple time scales, including e.g. El Niño Southern Oscillation (ENSO) timescales (see e.g. Alves et al. 2021).

  4. When comparing CORDEX and CORE simulations, it must be noted that, besides the horizontal resolution, other factors such as model version, configuration (such as the vertical resolution) and parameterizations may have changed between the two experiments (see e.g. Sørland et al. 2021 for COSMO-CLM). Similarly to what was found by Panitz et al. (2014), Sørland et al. (2021) claim that boundary conditions may affect the results more critically than model version, configuration and horizontal resolution.

  5. CORE results may be influenced by the relatively small ensemble of RCMs and GCMs used. It would be of great value if other modelling groups participated in the CORE initiative and provided projections for analysis.
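The sensitivity of a fraction-based criterion to ensemble size, mentioned in the first caveat, can be illustrated with simple counting. The sketch below assumes a criterion of the form "more than 2/3 of the models" (the fraction quoted earlier for a significant change); the function names are hypothetical, and the ensemble size of 30 is chosen purely for illustration, whereas nine is the number of CORE runs.

```python
def min_models_for_threshold(n_models, num=2, den=3):
    """Smallest number of models strictly greater than num/den of the ensemble."""
    return (num * n_models) // den + 1

def tolerated_dissenters(n_models, num=2, den=3):
    """Number of models that can disagree before the criterion fails."""
    return n_models - min_models_for_threshold(n_models, num, den)

# Nine runs (the size of the CORE ensemble) versus a hypothetical 30-member ensemble
small = tolerated_dissenters(9)
large = tolerated_dissenters(30)
```

In a nine-member ensemble each simulation carries about 11% of the vote, so one or two outlying simulations are enough to change the classification of a grid point, whereas a larger ensemble tolerates several.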

It is interesting to note that the results of a convection-permitting model (4.5 km resolution) by Kendon et al. (2019) show a future increase in dry spell length during the wet season over western and central Africa. Notably, this increase was weaker or not apparent in simulations with a 25 km resolution RCM. This may indicate that the enhanced drying is indeed related to a specific physical mechanism (or a combination of several specific mechanisms) that is better resolved by the higher resolution models. It is fundamental to understand and analyze, in all the different ensembles, the importance of the mechanisms controlling precipitation over the region (see e.g. Sultan et al. 2003); these can include the suppression of the pre-monsoon convective precipitation, the displacement of the rain band related to sea-surface temperature anomalies, and the interaction between large scale motion (e.g. moisture transport) and regional forcings and feedbacks, such as the role of soil moisture and land–atmosphere coupling. This analysis can be achieved by a thorough process-oriented, regime-oriented, and use-inspired evaluation (Pendergrass et al. 2020) and by sensitivity experiments aimed at quantifying the impact of model grid resolution, physical parameterization, domain size etc.

Finally, it must be noted that even in cases where models’ results diverge, this should not be interpreted as a lack of useful information. Rather, in these regions, the user community (including e.g. policy makers) must cope with the idea that future projections of mean rainfall changes are uncertain and that, for instance, both drier and wetter conditions are equally plausible (e.g., Dosio et al. 2020); however, the changes in other precipitation characteristics (such as the intensification of mean and extreme precipitation) are robust.