1 Introduction

Tropical rainforests influence the transfers of energy, moisture and trace gases with the overlying atmosphere, the hydrologic cycle and carbon storage via various biogeophysical, biogeochemical and biological processes, and thus have the potential to modulate regional and global climate and carbon sequestration (e.g., Bonan 2008; Humphrey et al. 2018). Central Equatorial Africa (CEA), where the Congo Basin is located, contains the world’s second largest contiguous rainforest (CSC 2013). The three core regions of tropical rainforests (i.e., Congo, Amazon Basin and Southeast Asia) dominate the global rainfall amount in transition seasons (Webster 1983; Xie et al. 2003). In contrast to the numerous studies on the Amazon Basin, there are relatively few research projects on the Congo Basin (e.g., Alsdorf et al. 2016). We know very little about the Congolese rainforest climate and its variability, primarily because of the scarcity of in situ observations and meteorological stations for historical, political and economic reasons (Todd and Washington 2004; Washington et al. 2013; Zhou et al. 2014; Raghavendra et al. 2018).

As rainfall is the major climatic control on rainforest dynamics and an important component of the hydrological cycle, several studies probed the CEA rainfall patterns and variability using available precipitation data sets and identified large uncertainties among these data sets (Malhi and Wright 2004; Samba et al. 2008; Diem et al. 2014; Zhou et al. 2014). Maidment et al. (2015) combined both gauge-only and satellite-based rainfall data sets to analyze rainfall change in Africa and found that considerable differences in the trend and magnitude exist among available gridded rainfall data sets over Central Africa, where annual rainfall trends range between − 10 and + 39 mm/year per decade. Analyzing CMIP5 climate model simulations over Central Africa revealed little agreement among various models as to the distribution, magnitude and quantity of rainfall across the Congo Basin (e.g., Washington et al. 2013; Aloysius et al. 2016). Furthermore, multiple coupled models disagreed on the locations of maximum rainfall over the Congo Basin and multi-model ensemble means were also unable to reproduce the observed rainfall state (Creese and Washington 2016). These observational and modeling studies all point to large uncertainties and difficulties in understanding and modeling the rainfall climatology, patterns and variability over CEA.

Reanalysis, on the other hand, is a systematic approach to produce gridded dynamically consistent data sets for climate monitoring and research and may be a desirable choice for studying the rainfall variability and associated changes in atmospheric circulation (e.g., Bosilovich et al. 2008; Lorenz and Kunstmann 2012). It is particularly useful over data-scarce regions such as the Congo Basin (Parker 2016). However, the Congo rainfall climatology from several reanalyses also yielded significant disagreements (Washington et al. 2013). Differences in forecast models, data assimilation schemes and available observations (e.g., radiosonde, aircraft and satellite data) assimilated all likely contribute to the differences among the reanalyses (e.g., Lin et al. 2014). Since the suitability of the products varies by region and analysis goal (e.g., Betts et al. 2006; Bromwich et al. 2011; Peña-Arancibia et al. 2013), it is difficult to make a generalization about which reanalysis is the best at depicting the Congo rainfall climate and variability based on reanalysis evaluations over other regions.

Previous studies (e.g., Washington et al. 2013; Maidment et al. 2015) have indicated that there is little agreement on the distribution and quantity of climatological rainfall among the gridded observational and reanalysis data sets over equatorial Africa. However, resolving this disagreement is difficult without validating them against a reliable gauge-based data set. Since the number of rain-gauge observations in the Congo Basin has declined sharply, especially during the past three decades, and the amount of observations incorporated into the observational and reanalysis data sets differs substantially (e.g., Washington et al. 2013; Asefi-Najafabady and Saatchi 2013; Zhou et al. 2014; Nicholson et al. 2018a, b, c), it is necessary to identify the “best” available data set over CEA from the currently widely used reanalysis products. In addition, the quality of two newly released reanalysis products (i.e., MERRA2 and JRA-55) is still unclear over CEA. Therefore, this study aims to make a comprehensive assessment of the available reanalysis data sets based on radiosonde observations and a newly developed, gridded gauge-based rainfall data set (Nicholson et al. 2018a, b), and to identify the best reanalysis data sets available for understanding the rainfall climate, patterns, and variability over the understudied region of CEA. The identified data sets can also serve as a basis for model validation. More details on the study region, data and methods are described in Sect. 2. Section 3 presents the major features of rainfall over CEA, including the climatology, interannual variability, linear trends and spatiotemporal structures. Section 4 examines atmospheric circulation characteristics. The study concludes with a brief summary and discussion in Sect. 5.

2 Study region, data and methods

2.1 Study region

Central Equatorial Africa (CEA) covers the Congolese rainforest and its surrounding areas and features a bowl-like topography (Fig. 1a). Its main land cover includes evergreen broadleaf forest and woody savannas (Fig. 1b). The forest area is approximately 1.8 million km² (CSC 2013). Two domains are considered in this study. The first is a relatively broad region that includes most countries of equatorial Africa (6°E–32°E, 14°S–12°N), henceforth termed “CEA”. The second lies within the central Congo Basin, where the gauge network (i.e., in the Democratic Republic of the Congo) has been extremely sparse since the early 1990s (Alsdorf et al. 2016). This second domain, the outlined rectangular box in Fig. 1b (15°E–30°E, 5°S–5°N), is termed “Congo Basin” hereafter and is defined to cover primarily the African tropical rainforest. It excludes western equatorial Africa to avoid coastal rainfall effects and human-induced deforestation influences (Zhou et al. 2014).

Fig. 1

a The map of topography (units: m) from SRTM30_PLUS (Becker et al. 2009) in Central Equatorial Africa. Black circles indicate the observation sites. b The land cover map from MODIS (MCD12C1; Friedl et al. 2010) for 17 classes: 0 water, 1 evergreen needleleaf forest, 2 evergreen broadleaf forest, 3 deciduous needleleaf forest, 4 deciduous broadleaf forest, 5 mixed forests, 6 closed shrubland, 7 open shrubland, 8 woody savannas, 9 savannas, 10 grassland, 11 permanent wetland, 12 cropland, 13 urban and built up, 14 cropland natural vegetation mosaic, 15 snow and ice, 16 barren or sparsely vegetated. The rectangular box (5°S–5°N, 15°E–30°E) indicates the domain in the Congo Basin, and the broader region (6°E–32°E, 14°S–12°N) represents Central Equatorial Africa (CEA)

2.2 Observational and reanalysis data sets

The precipitation data set used here, which covers stations in and/or bordering on the CEA, was created by the third author, S. E. Nicholson. That data set, referred to as NIC131, includes 1826 stations and extends to 2014 (Nicholson et al. 2018a, b). NIC131 represents an update and expansion of the data described in various publications (Nicholson 1986; Nicholson et al. 2012a, b, 2018a). It consists of monthly rainfall totals at individual stations and includes recent updates provided both by individuals and by national meteorological services. Note that the Global Precipitation Climatology Centre monthly precipitation data set (GPCC, Schneider et al. 2014) is often used as the benchmark in validation for African regions, but NIC131 and GPCC are not completely independent products. Most of the NIC131 data before the 1990s were incorporated into GPCC, while NIC131 includes recent updates from the Democratic Republic of the Congo, Central African Republic and numerous other countries in equatorial Africa. These updates were obtained directly from the African meteorological services and have not yet been incorporated into GPCC (Fig. S1). As the reanalysis data are grid box averages of rainfall estimates, NIC131 was also gridded with a spatial resolution of 2.5° by combining the gauge data with a statistical reconstruction approach (Nicholson et al. 2018b). The total number of available data is greater than that in GPCC over the CEA, and the long-term drought in the Congo Basin observed via multiple remote-sensing data sets (Zhou et al. 2014) is consistent with the results in NIC131 (Nicholson et al. 2018b). Therefore, the NIC131-gridded product, which has been extensively quality controlled (Nicholson et al. 2018a, b), can serve as independent in situ observations to validate other existing gridded data sets.

We include seven currently available reanalysis products in this study. ERA-Interim is the latest global atmospheric reanalysis produced by the European Centre for Medium-Range Weather Forecasts (ECMWF) and uses a 4-dimensional variational (4D-Var) data assimilation scheme (Dee et al. 2011). The Twentieth Century Reanalysis (20CR) Version 2c (V2c) is a global reanalysis product spanning 1851–2014 and covers the longest time period among the seven reanalyses. It uses an Ensemble Kalman Filter approach assimilating only surface observations of synoptic pressure (Compo et al. 2011). The Modern-Era Retrospective analysis for Research and Applications, Version 2 (MERRA2) provides data beginning in 1980 and has the highest spatial resolution, which could yield more detailed climate features at regional scales. It is more advanced than the original MERRA data set because of improvements in the assimilation system that enable assimilation of modern hyperspectral radiance and microwave observations, along with GPS-Radio Occultation data sets (Gelaro et al. 2017). Compared with its predecessor, JRA-25, the Japanese 55-year Reanalysis (JRA-55) employs higher spatial resolution, a new radiation scheme, and 4D-Var data assimilation with Variational Bias Correction (VarBC) for satellite radiances (Kobayashi et al. 2015). The National Centers for Environmental Prediction (NCEP)-National Center for Atmospheric Research (NCAR) Reanalysis 1 (NCEP-1) is the most widely used reanalysis and covers a relatively long time period from 1948 to the present (Kalnay et al. 1996), whereas NCEP-Department of Energy (DOE) Reanalysis 2 (NCEP-2) is an improved version of NCEP-1 that corrects assorted data assimilation errors and updates the parameterizations of physical processes (Kanamitsu et al. 2002). The Climate Forecast System Reanalysis (CFSR) is based on a coupled atmosphere–ocean–land surface–sea ice system (Saha et al. 2010). The CFSR version 2 (CFSRv2) is the extension of the CFSR from 2011 onwards (Saha et al. 2014). Details about each reanalysis data set are listed in Table 1.

Table 1 Reanalysis data sets used in the study

To examine the reanalyses’ ability to reproduce the vertical structure of the atmosphere, we further compare the reanalyses with radiosonde and pilot balloon observations from the Integrated Global Radiosonde Archive (IGRA) Version 2 (https://www1.ncdc.noaa.gov/pub/data/igra/). IGRA consists of quality-controlled radiosonde observations of temperature, humidity, and wind at stations across all continents. Because IGRA records mix pibal and radiosonde data and are extremely discontinuous, especially over Central Africa, we selected only the usable stations in equatorial Africa. The observation sites are indicated by the black circles in Fig. 1a and detailed information is listed in Table 2. Monthly mean variables including geopotential height (HGT) and wind fields in the lower and middle troposphere (i.e., at 850 hPa, 700 hPa and 500 hPa) are used.

Table 2 Geographical information and temporal coverage of available observation sites used in the study

2.3 Methodology

Three methods are used to measure the quality of reanalysis data with reference to the observational data. First, the seven reanalysis data sets are validated against the gridded NIC131 by calculating the mean bias, variance, linear trend, root mean square error (RMSE) and statistical correlation over two defined regions (i.e., CEA and Congo Basin). Second, we apply the centered pattern correlation coefficient, which measures the similarity of two patterns after removal of the global mean, to examine the spatial distributions between the reanalyses and NIC131. Finally, to measure the coherent pattern of the change in the rainy season rainfall, we conduct empirical orthogonal function (EOF) analysis to capture the maximum amount of variance for the rainfall anomalies for each data set, and use the leading EOF mode to represent the major spatial variability. We also use a Taylor diagram to examine the regional rainfall variations from the seven reanalyses and NIC131.
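As an illustration of the first two measures, the sketch below computes an area-weighted mean bias, RMSE and centered pattern correlation between a reanalysis field and a reference field on a common grid. It is only a minimal sketch under stated assumptions: the cos(latitude) weighting, the removal of the weighted domain mean before correlating, and all function names are choices made for this example and are not taken from the study.

```python
import numpy as np

def area_weights(lats_deg, n_lon):
    """cos(latitude) weights on a (lat, lon) grid (assumed weighting)."""
    w = np.cos(np.deg2rad(np.asarray(lats_deg, dtype=float)))
    return np.repeat(w[:, None], n_lon, axis=1)

def mean_bias(model, obs, weights):
    """Weighted mean of (model minus observation)."""
    return np.sum(weights * (model - obs)) / np.sum(weights)

def rmse(model, obs, weights):
    """Weighted root mean square error."""
    return np.sqrt(np.sum(weights * (model - obs) ** 2) / np.sum(weights))

def centered_pattern_correlation(model, obs, weights):
    """Pattern correlation after removing each field's weighted domain mean."""
    wsum = np.sum(weights)
    m_anom = model - np.sum(weights * model) / wsum
    o_anom = obs - np.sum(weights * obs) / wsum
    cov = np.sum(weights * m_anom * o_anom)
    norm = np.sqrt(np.sum(weights * m_anom ** 2) * np.sum(weights * o_anom ** 2))
    return cov / norm

# Synthetic example: a "reanalysis" that is uniformly 0.5 mm/day too wet.
lats = np.array([-5.0, 0.0, 5.0])
obs = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
model = obs + 0.5
w = area_weights(lats, obs.shape[1])
print(mean_bias(model, obs, w), rmse(model, obs, w),
      centered_pattern_correlation(model, obs, w))
```

A uniform offset changes the bias and RMSE but leaves the centered pattern correlation at one, which is why the study reports magnitude and pattern statistics separately.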

The vertical structures of the atmospheric conditions from the reanalysis data sets over equatorial Africa are validated against the IGRA observations. We assess the reanalyses’ capacity to describe the HGT and wind fields in the lower and middle troposphere. We averaged the 0000 and 1200 UTC monthly data to minimize the impact of diurnal differences between daytime and nighttime observations. Because the IGRA data sets have many missing records in the upper troposphere, we quantify the biases and RMSE of HGT and wind fields only at 850 hPa, 700 hPa and 500 hPa.

Given the differences in spatial resolution among the various data sets used, we regrid all the reanalyses onto a common 2.5° × 2.5° grid. We use the observational and reanalysis data sets to obtain a broad and consistent understanding of the rainfall climatology and variability over CEA. Here we focus on the period from 1980 to 2014, which all seven reanalyses cover. Besides rainfall, water vapor and vertical and horizontal winds from the reanalyses are also used to examine other factors that influence rainfall processes.
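The text does not state which regridding method was used. As one simple possibility, the sketch below coarsens a finer regular grid to a coarser one by averaging non-overlapping blocks (for example, 5 × 5 cells of a hypothetical 0.5° field to obtain 2.5° cells). The function name, the assumption that the fine grid nests exactly in the coarse grid, and the neglect of latitude weighting within a block are all simplifications made for this example.

```python
import numpy as np

def block_average(field, factor):
    """Coarsen a (lat, lon) field by averaging factor x factor blocks."""
    nlat, nlon = field.shape
    if nlat % factor or nlon % factor:
        raise ValueError("grid dimensions must be divisible by factor")
    blocks = field.reshape(nlat // factor, factor, nlon // factor, factor)
    return blocks.mean(axis=(1, 3))

# Hypothetical 0.5-degree field (10 x 10 cells) coarsened to 2.5 degrees.
fine = np.arange(100, dtype=float).reshape(10, 10)
coarse = block_average(fine, 5)
print(coarse.shape)  # (2, 2)
```

Block averaging conserves the area mean on an equal-area basis only approximately; near the equator, where this study is focused, the cos(latitude) variation within a 2.5° cell is small.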

3 Rainfall climatology and variability

3.1 Mean climatology

We first examine the spatial patterns of annual mean climatological rainfall based on each of the seven reanalysis products and the gauge-based NIC131-gridded data from 1980 to 2014. The climatological rainfall from the NIC131-gridded estimates shows two maximum centers, one a strong maximum along the Guinea coast exceeding 5 mm/day and the other an equatorial maximum in the central Congo Basin (Fig. 2a). The NIC131 data set exhibits a gradient, with annual rainfall increasing from the southern and northern sides toward the core region of the Congo rainforest. All seven reanalyses show large-scale rainfall patterns (i.e., a north–central–south gradient) similar to those in NIC131, but the equatorial rainfall maxima in NCEP-2 and 20CR are shifted slightly westward (not shown). We also use a Taylor diagram to quantify the performance of each reanalysis against the observations (Fig. 3). Despite their differences in magnitude, all the reanalysis data sets have spatial patterns similar to the observational reference, with significant spatial correlations (r > 0.60) over CEA.

Fig. 2

Spatial distribution of climatological annual rainfall (mm/day) over equatorial Africa: a observations from the NIC131-gridded data (1980–2014), and b–h biases of the annual mean rainfall estimates (mm/day) from the seven reanalyses relative to NIC131 (reanalysis minus observation)

Fig. 3

Taylor diagram of annual-mean (black dots), March–April–May (MAM, red dots) and September–October–November (SON, blue dots) rainfall variations between individual reanalyses and observations over Central Equatorial Africa. The angular coordinate indicates the weighted centered pattern correlation coefficient between the reanalysis and NIC131, and the radial coordinate shows the ratio (reanalysis/observation) of the normalized root-mean-square (RMS) differences. Further, negative correlations and standard deviations > 2.0 are shown as text at the bottom of the diagram. Additional bias statistics (reanalysis minus observation) are also added to the diagram

Spatial distribution of the mean bias (Fig. 2b–h) reveals that ERA-Interim largely overestimates the rainfall magnitude in most portions of Central Africa, whereas JRA-55 shows strong positive rainfall biases (~ 15%) around the southern Sahel and western equatorial Africa (Fig. 3). NCEP-1, NCEP-2 and CFSR have higher rainfall amounts around the East African Rift Valley, where the topography is complex. Among the seven reanalysis products, MERRA2 performs best in the annual rainfall estimates over CEA, with the smallest mean bias of 0.01 mm/day, whereas ERA-Interim has the most rainfall along the tropical zones, with the largest bias of 0.87 mm/day over CEA. For the Congo Basin, MERRA2 and 20CR show comparatively small biases against the observations, with mean values of 0.15 and 0.26 mm/day, respectively.

3.2 Seasonal cycle

Figure 4 illustrates the climatological seasonal cycle of rainfall over the Congo Basin based on each reanalysis data set and the reference observations. All the reanalyses exhibit two peaks, during March–May (MAM) and September–November (SON), in line with NIC131, and capture the bimodal distribution well, with a lesser (greater) peak in MAM (SON), but they largely overestimate the magnitude (Fig. 4). The results are qualitatively similar when the averages are taken over CEA. The Congo rainfall overestimation is also found in fully coupled model simulations (Washington et al. 2013). Among the seven reanalyses, the seasonal cycle in MERRA2 is almost identical to that of the NIC131 data. ERA-Interim and NCEP-1 are the wettest across all months and have more rainfall over equatorial Africa. NCEP-2, CFSR and JRA-55 substantially overestimate rainfall in both rainy seasons, but underestimate rainfall in June–August. The most inconsistent product appears to be 20CR, which underestimates rainfall in MAM but overestimates it in SON (Fig. 3). For instance, the annual range of rainfall (maximum minus minimum values of the seasonal cycle) in NIC131 is 3.47 mm/day, whereas it is 8.39 mm/day and 7.01 mm/day for 20CR and CFSR, respectively, both of which largely overestimate the range.

Fig. 4

The seasonal cycle of the mean rainfall (mm/day) over the Congo Basin (15°E–30°E, 5°S–5°N) from the seven reanalyses and NIC131-gridded data

The seasonal evolution of mean rainfall in Central Africa (15°E–30°E) is very similar among the reanalyses, although the rainfall amounts are somewhat different (not shown). The northward rainfall propagation occurs during boreal spring and summer, while the southward turnaround takes place in August. ERA-Interim and CFSR have excessive rainfall over western Central Africa, particularly during boreal spring, whereas five of the reanalyses (ERA-Interim, NCEP-1, NCEP-2, 20CR and CFSR) have the largest rainfall amounts during the southward propagation. As the rainy seasons contribute approximately 60% of the annual rainfall, we mainly focus on the two peak seasons in the following sections.

Overall, the phase of the seasonal cycle in rainfall over the Congo Basin is well captured by the reanalyses, but the spread of rainfall magnitude among the seven reanalyses is evident during both rainy and dry seasons. The rain belt marches north during boreal spring and summer and retreats in late summer, with its maximum centered at 7°N around August. The reanalyses are also able to capture this seasonal evolution. Despite their differences in magnitude, all the reanalyses demonstrate similar spatial patterns across Central Africa. Among the seven reanalysis products, MERRA2 is the closest to NIC131, whereas ERA-Interim and CFSR are the wettest across all months and have more rainfall over equatorial Africa.

3.3 Interannual variability

Figure 5a shows the interannual variations of areal mean rainfall over the Congo Basin for the period 1980–2014, indicating that the interannual variability is relatively small in NIC131. To investigate the possibility that NIC131 underestimates the true precipitation variability because of the limited number of observations since the 1980s, we further examined the rainfall variability using gauge station data for 1940–1988, when the station network was dense and changed little over time, and using only selected stations that have the longest records. The results (not shown) also suggest that the observational data have relatively weak interannual variability in equatorial African regions. Relative to NIC131, all the reanalyses show larger interannual variability. A closer inspection shows that MERRA2 has a significant positive correlation with the reference data, with r = 0.41 (p < 0.05, n = 35). The rainfall in JRA-55 and NCEP-2 drops markedly after the late 1990s, whereas CFSR has a sudden shift around the 2000s. These spurious trends might be due to changes in observing systems (e.g., Bosilovich et al. 2011; Trenberth et al. 2011). Figure 5b displays the deseasonalized areal mean monthly rainfall differences between NIC131 and the reanalyses. 20CR and MERRA2 show relatively small deviations, whereas CFSR, NCEP-2 and JRA-55 are characterized by significant time-varying jumps. Again, such jumps could be related to changes in observing systems, such as the inclusion of new satellite-based atmospheric sounders and/or radiosonde observations (Trenberth et al. 2011; Zhang et al. 2012).
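The deseasonalizing and 12-month running average applied to the monthly series in Fig. 5b can be sketched as follows. This is a minimal illustration, assuming the monthly series starts in January and spans whole years; the function names and the choice to keep only full 12-month windows are assumptions made for this example rather than details given in the text.

```python
import numpy as np

def deseasonalize(monthly):
    """Subtract each calendar month's long-term mean (series starts in January)."""
    x = np.asarray(monthly, dtype=float)
    if x.size % 12:
        raise ValueError("series must contain whole years")
    by_year = x.reshape(-1, 12)
    return (by_year - by_year.mean(axis=0)).ravel()

def running_mean(x, window=12):
    """Running mean over full windows only; output has len(x) - window + 1 values."""
    kernel = np.ones(window) / window
    return np.convolve(np.asarray(x, dtype=float), kernel, mode="valid")

# Synthetic example: three identical years leave no anomaly after deseasonalizing.
seasonal = np.tile(np.arange(12.0), 3)
anomalies = deseasonalize(seasonal)
smoothed = running_mean(anomalies)
print(anomalies.max(), smoothed.shape)
```

Applying the same two steps to a reanalysis series and to NIC131 and then differencing them gives a curve of the kind shown in Fig. 5b, in which a step change stands out as a sustained offset.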

Fig. 5

a Interannual variations of regional annual mean rainfall anomalies over the Congo Basin from the seven reanalyses and NIC131-gridded data during the period 1980–2014. b Deseasonalized monthly mean (12-month running average) rainfall differences between the NIC131-gridded data and the other seven reanalyses averaged over the Congo Basin

Figure 6a shows the standard deviation of MAM rainfall, which is used to measure the interannual variability. For NIC131, larger interannual variability is dominant over western coastal regions and East Africa, whereas the Congo Basin has smaller variations. MERRA2 and ERA-Interim reasonably reproduce these spatial variations, with significant spatial correlations of 0.60 and 0.34 over CEA, respectively. JRA-55 features the largest interannual variability along the west coast and in the northern Congo Basin, which may be linked to the northward marine ITCZ convergence (e.g., Nicholson 2009). The two NCEP reanalyses show much larger interannual variability over the Great Rift Valley and thus correlate poorly in space with the observations. The standard deviation for SON rainfall (Fig. 6b) exhibits a spatial distribution similar to that for MAM rainfall. MERRA2, JRA-55 and 20CR capture the spatial patterns of interannual variability over CEA (p < 0.05, n = 121), although they largely overestimate its magnitude. In CFSR, the largest variability is found in the central Congo Basin. In addition, NCEP-2 has its large variability centered over the Great Rift Valley.

Fig. 6

Spatial patterns of standard deviation for a March–April–May and b September–October–November rainfall (mm/day) from the seven reanalyses and NIC131-gridded data

Among the seven reanalyses, MERRA2 is the best at reproducing the interannual variability in both boreal spring and autumn over CEA, although it overestimates the magnitude of interannual variations, a problem common to the other reanalysis estimates. The Taylor diagram also shows that the reanalysis products have larger interannual variability than the reference data (Fig. 3).

3.4 Linear trend

Zhou et al. (2014) used multiple remote-sensing data to identify a long-term drought and its ecological impacts on the rainforest in the Congo Basin. They revealed a persistent decline in rainfall during the recent two decades, which occurred primarily during April–May–June (AMJ). Here we focus on the linear trend of rainfall in AMJ only as the drought is less significant in other seasons over CEA.

The NIC131 data shows stronger interannual variations before the year 1990 over the study region, with the mean negative anomalies falling in the last two decades (Fig. 7a). The regional-mean rainfall declined significantly by − 0.19 mm/day between the 1980s and 2000s (the decadal difference: 2000–2009 minus 1980–1989 averages).

Fig. 7

a Regional mean anomalies from NIC131-gridded estimate April–May–June (AMJ) rainfall for 1980–2014 over the Congo Basin (15°E–30°E, 5°S–5°N). b Spatial patterns of AMJ rainfall linear trend for 1980–2014 (mm/day per decade) from the seven reanalyses and NIC131-gridded data

Figure 7b shows the linear trend of AMJ rainfall for the period 1980–2014. In the reference data, a significant decreasing trend is located in the eastern Congo Basin, while a wetting tendency is found in the Sahel, indicating a meridional dipole distribution of rainfall anomalies, with less rainfall over equatorial regions and more rainfall in the Sahel. Except for 20CR and CFSR, the reanalyses reproduce the rainfall decline in the Congo Basin. MERRA2 overestimates the rainfall decline across the whole tropics, while JRA-55 exhibits a dipolar distribution, with more rainfall to the south of the equator and less rainfall to the north. ERA-Interim captures the Congo drought but shows an opposite trend in the Sahel and East Africa. Among these seven reanalyses, NCEP-2 is the best at reproducing the rainfall trends during the AMJ season over tropical Africa.
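The two trend measures used in this section, a linear trend expressed per decade and a decadal difference (2000–2009 minus 1980–1989 averages), can be sketched as below. The text does not specify how the linear trend was fitted; ordinary least squares via `np.polyfit` is assumed here, and the function names and synthetic series are purely illustrative.

```python
import numpy as np

def trend_per_decade(years, values):
    """Ordinary least-squares slope, scaled from per year to per decade."""
    slope, _intercept = np.polyfit(years, values, 1)
    return 10.0 * slope

def decadal_difference(years, values, late=(2000, 2009), early=(1980, 1989)):
    """Mean over the late period minus mean over the early period (inclusive)."""
    years = np.asarray(years)
    values = np.asarray(values, dtype=float)
    late_mean = values[(years >= late[0]) & (years <= late[1])].mean()
    early_mean = values[(years >= early[0]) & (years <= early[1])].mean()
    return late_mean - early_mean

# Synthetic AMJ series declining by 0.01 mm/day each year (not real data).
years = np.arange(1980, 2015)
rain = 5.0 - 0.01 * (years - 1980)
print(trend_per_decade(years, rain), decadal_difference(years, rain))
```

For a perfectly linear series the two measures agree up to scaling (two decades apart, so the decadal difference is twice the per-decade trend); for real rainfall series they can differ because the decadal difference ignores the years outside the two averaging windows.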

3.5 Spatiotemporal characteristics

To examine the spatiotemporal rainfall variability, the leading EOF modes obtained for the period 1980–2014 in the rainy seasons are considered to explain the major distribution over equatorial Africa. The first EOF pattern (EOF1) in the NIC131 gridded data (Fig. 8a) is well represented, indicating negative rainfall anomalies over the whole equatorial regions, with the maximum over East Africa. For the reanalyses, MERRA2, NCEP-1, CFSR and 20CR could reproduce the wide-range pattern, although the maximum centers are somewhat different. For example, NCEP-1 has its anomaly center even farther in the Congo Basin, while MERRA2 features the maximum over the western Congo Basin (Fig. 8c, d). ERA-Interim, NCEP-2 and JRA-55 show a meridional dipole pattern, with positive anomalies over southern equatorial Africa. In addition, we found ERA-Interim has positive rainfall anomalies in East Africa which resemble the trend results (Fig. 8b), suggesting that ERA-Interim could not capture the rainfall patterns over East Africa. Moreover, the decline in MAM rainfall in eastern Africa is well known from other studies (e.g., Williams and Funk 2011; Funk et al. 2018).
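For readers unfamiliar with EOF analysis, the sketch below extracts the leading mode from a (time, lat, lon) anomaly array using a singular value decomposition. It is a minimal sketch under stated assumptions: the sqrt(cos(latitude)) area weighting, the sign and scaling conventions, and the function name are choices made for this example and may differ from those used to produce Figs. 8 and 9.

```python
import numpy as np

def leading_eof(anomalies, lats_deg):
    """Leading EOF of anomalies with shape (time, lat, lon); time mean already removed.

    Returns the spatial pattern (lat, lon), its time coefficients (time,),
    and the fraction of variance explained by the mode.
    """
    nt, nlat, nlon = anomalies.shape
    # sqrt(cos(lat)) weighting so that variance is area weighted (assumed convention).
    w = np.sqrt(np.cos(np.deg2rad(np.asarray(lats_deg, dtype=float))))[:, None]
    X = (anomalies * w).reshape(nt, nlat * nlon)
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    explained = s[0] ** 2 / np.sum(s ** 2)
    eof1 = Vt[0].reshape(nlat, nlon) / w  # undo the weighting for display
    pc1 = U[:, 0] * s[0]
    return eof1, pc1, explained

# Synthetic example: a single fixed pattern modulated in time.
lats = np.array([-5.0, 0.0, 5.0])
pattern = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
t = np.array([-1.0, 0.0, 1.0, 0.0])
anoms = t[:, None, None] * pattern
eof1, pc1, frac = leading_eof(anoms, lats)
print(frac)
```

Because the sign of an EOF is arbitrary, only the product of the pattern and its time coefficients is physically meaningful, which matters when comparing the sign of anomalies across data sets.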

Fig. 8

Spatial patterns of the first EOF modes of March–April–May rainfall from the seven reanalyses and NIC131-gridded data

For SON, the leading EOF mode features stronger negative rainfall anomalies over the southern Congo Basin and East Africa, but weaker rainfall variability in the central Congo (Fig. 9a). This is consistent with the smaller interannual variations in the central Africa rainforest (Fig. 6). Most of the reanalyses cannot reproduce the observational patterns in NIC131. ERA-Interim, NCEP-1, JRA-55 have significant differences over the southern Sahel, whereas NCEP-2, MERRA2, 20CR and CFSR exhibit the strong Congo rainfall anomalies over the Congo Basin (Fig. 9b–h). Overall, the explained variances of the EOF modes from the reanalyses are generally higher than that in NIC131, and the reanalyses generally overestimate the rainfall variability in the Congo Basin and poorly capture the coherent changes over East Africa and southern Congo. Compared to the leading EOF mode in MAM, the spatiotemporal rainfall characteristics in SON differ considerably among the reanalyses. If we conduct the EOF analysis after removing the linear trends from the rainfall data first, the leading EOF temporal and spatial patterns differ insignificantly in the NIC131 data, whereas the seven reanalyses have large interannual to decadal variability in the time coefficients. This indicates that the reanalyses overestimate the rainfall variability in the recent three decades.

Fig. 9

Same as Fig. 8, but for September–October–November

4 Atmospheric circulation characteristics

4.1 Evaluation of reanalyses using IGRA observations

As rainfall patterns are closely linked to the large-scale atmospheric circulation, we next assess the reanalysis atmospheric circulation fields (e.g., HGT and winds) against the upper air observations. As the IGRA data have many missing records for the period 1980–2014 and include few stations over the Congo Basin, we examine a region over western equatorial Africa with adequate upper air coverage. We compare the values from the IGRA data and the corresponding reanalysis data covering this period (Table 2). Figure 10 shows the mean biases and RMSE for the HGT, zonal (U) and meridional (V) winds at 850 hPa from the reanalyses against the IGRA data from eleven upper air sites. All the reanalyses underestimate HGT fields, and the two NCEP reanalyses have the smallest HGT biases (less than 5.8 gpm). For both U and V fields, MERRA2 has the smallest bias for the IGRA sites (less than 0.58 m/s). We also use RMSE to measure the reanalyses’ fidelity in capturing temporal variability. RMSE shows distinct variations, with the best results in the two NCEP reanalyses for HGT fields (approximately 8.3 gpm, on average) and in MERRA2 for the wind fields (about 1.47 m/s for U and 1.36 m/s for V, on average) (Fig. 10). The results at 700 hPa are similar to those at 850 hPa (Fig. S3). For the middle troposphere (500 hPa, Fig. S3), there are large differences in the mean biases and RMSE among the seven reanalyses. Nevertheless, MERRA2 is the best at reproducing the wind features. Although there are only eleven observation sites to cover equatorial Africa, the results suggest that NCEP-2 and MERRA2 are better able to capture the HGT and wind fields in the lower troposphere, respectively, than the other reanalyses.

Fig. 10

The mean biases and RMSE for geopotential height (HGT) and wind fields (U and V) at 850 hPa from the reanalyses against the IGRA data. The abscissa denotes the station ID shown in Table 2

4.2 Transport of moisture

The aforementioned sections have identified distinct differences both in magnitude and spatiotemporal features in the CEA rainfall climatology, seasonality, variability and trend among the seven reanalyses, which may help to reveal some useful information associated with such differences. As the moisture flux (qflux) convergence is one of the major parameters used in evaluating the rainfall characteristics over CEA (Washington et al. 2013; Creese and Washington 2016), we next examine the ocean–land transport of moisture.

At 700 hPa, the moisture flux in MAM is largely influenced by the northern branch of the African Easterly Jet (Fig. 11). Strong moisture flux divergence dominates in the Sahel, West and East Africa, whereas moisture flux convergence is located in Central Africa. CFSR shows the largest divergence over these regions, while JRA-55 has a weaker divergence. ERA-Interim features strong convergence across the whole Congo Basin, consistent with its being the wettest in the rainfall climatology (Fig. 2b). We also examine the moisture flux at 850 hPa (Fig. S4). Central Africa exhibits a mixture of moisture flux convergence and divergence, and the water flux into Central Africa is mostly derived from the Indian Ocean. ERA-Interim features the strongest moisture flux convergence in the core region of CEA among the seven reanalyses. 20CR has the weakest water transport from the Indian Ocean, implying the least rainfall amount over Central Africa in MAM. These lower- and middle-tropospheric moisture flux features shed light on the differences in the mean rainfall climatology over CEA.
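To make the quantity concrete, the sketch below computes the horizontal moisture flux divergence on a single pressure level from specific humidity q (kg/kg) and winds u, v (m/s) on a regular latitude–longitude grid, using the spherical form ∇·(qV) = [∂(qu)/∂λ + ∂(qv cos φ)/∂φ] / (a cos φ). The finite-difference scheme (`np.gradient`), the Earth radius value and the function name are assumptions made for this example; the text does not describe how the divergence in Figs. 11 and 12 was computed. Negative values indicate convergence.

```python
import numpy as np

EARTH_RADIUS_M = 6.371e6  # mean Earth radius (assumed value)

def moisture_flux_divergence(q, u, v, lats_deg, lons_deg):
    """Divergence of (q*u, q*v) on a (lat, lon) grid, in kg/kg/s.

    q is specific humidity (kg/kg); u and v are zonal and meridional winds (m/s).
    The grid must not include the poles, where cos(lat) vanishes.
    """
    lat = np.deg2rad(np.asarray(lats_deg, dtype=float))
    lon = np.deg2rad(np.asarray(lons_deg, dtype=float))
    coslat = np.cos(lat)[:, None]
    qu = q * u
    qv = q * v
    dqu_dlon = np.gradient(qu, lon, axis=1)
    dqvcos_dlat = np.gradient(qv * coslat, lat, axis=0)
    return (dqu_dlon + dqvcos_dlat) / (EARTH_RADIUS_M * coslat)

# Synthetic check: uniform humidity and uniform zonal flow give zero divergence.
lats = np.array([-5.0, -2.5, 0.0, 2.5, 5.0])
lons = np.array([15.0, 17.5, 20.0, 22.5])
q = np.full((5, 4), 0.01)
u = np.full((5, 4), -5.0)
v = np.zeros((5, 4))
print(np.abs(moisture_flux_divergence(q, u, v, lats, lons)).max())
```

With typical values of q near 0.01 kg/kg and wind gradients of a few m/s over a few hundred kilometres, the result is of order 10⁻⁸ kg/kg/s, matching the contour units used in the figure captions.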

Fig. 11

Climatological mean March–April–May moisture flux (kg/kg m/s) and divergence (contours: 10⁻⁸ kg/kg/s) at 700 hPa from the seven reanalyses
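For readers who wish to reproduce this kind of diagnostic, the following is a minimal sketch of how the moisture flux and its horizontal divergence at a single pressure level could be computed on a regular latitude–longitude grid. It assumes specific humidity q (kg/kg) and winds u, v (m/s) as 2-D arrays; the function name, the finite-difference scheme and the synthetic test field are illustrative assumptions rather than the method used for the figures.

```python
import numpy as np

EARTH_RADIUS = 6.371e6  # mean Earth radius in metres

def moisture_flux_divergence(q, u, v, lat_deg, lon_deg):
    """Return (qu, qv, div) on a regular lat-lon grid.

    q is specific humidity (kg/kg); u and v are winds (m/s); all are
    shaped (lat, lon). The divergence (kg/kg/s) uses spherical geometry:
        div = [d(qu)/dlon + d(qv cos(lat))/dlat] / (a cos(lat))
    Negative values indicate moisture flux convergence.
    """
    lat = np.deg2rad(np.asarray(lat_deg, dtype=float))
    lon = np.deg2rad(np.asarray(lon_deg, dtype=float))
    qu = q * u
    qv = q * v
    coslat = np.cos(lat)[:, None]
    dqu_dlon = np.gradient(qu, lon, axis=1)
    dqvcos_dlat = np.gradient(qv * coslat, lat, axis=0)
    div = (dqu_dlon + dqvcos_dlat) / (EARTH_RADIUS * coslat)
    return qu, qv, div

# Synthetic check on a tropical grid (all values are made up): uniform
# humidity and a zonal wind that grows linearly with longitude, built so
# that the analytic divergence is q * 1e-6 = 1e-8 kg/kg/s everywhere.
lat = np.arange(-10.0, 10.1, 2.5)
lon = np.arange(0.0, 40.1, 2.5)
phi = np.deg2rad(lat)[:, None]
lam = np.deg2rad(lon)[None, :]
q = np.full((lat.size, lon.size), 0.01)
u = 1e-6 * EARTH_RADIUS * np.cos(phi) * lam
v = np.zeros_like(u)

qu, qv, div = moisture_flux_divergence(q, u, v, lat, lon)
print(div.min(), div.max())
```

The formula divides by cos(lat), so this sketch is only suitable away from the poles, which is not a restriction for the equatorial domain considered here.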

In SON, the moisture flux divergence over Central Africa at 850 hPa differs distinctly in both sign and magnitude among the seven reanalyses (Fig. 12). The horizontal water vapor supply is not derived from the Indian Ocean, but rather originates locally in the Congo Basin, where the center of the convective zone is located. By comparison, ERA-Interim, 20CR, NCEP-2 and CFSR have larger moisture flux convergence and thus produce more rainfall in the Congo Basin. Note that the jets play an important role in influencing the CEA rainfall processes because the rain belt is roughly bounded by the two branches of the African Easterly Jets, and the African easterly waves propagating along the jets can trigger mesoscale convective activity through dynamical instability to form and maintain the moisture convergence (e.g., Nicholson and Grist 2003; Creese and Washington 2018). During boreal autumn, the northern and southern branches of the African Easterly Jets are both present, but their magnitude shows marked differences among the seven reanalyses (Fig. 13). Moisture flux convergence over CEA is pronounced in ERA-Interim, NCEP-2 and CFSR at 700 hPa (Fig. S5).

Fig. 12

Climatological mean September–October–November moisture flux (kg/kg m/s) and divergence (contours: 10⁻⁸ kg/kg/s) at 850 hPa from the seven reanalyses

Fig. 13

Mean winds (m/s) in September–October–November at 600 hPa from the seven reanalyses

4.3 Vertical wind structures

In addition to the horizontal moisture convergence, we also examine the vertical structure of wind (omega, ω) for the seven reanalyses. Figure 14 shows the MAM climatology of vertical cross sections of the zonal wind component and ω averaged from 5°S to 5°N. All the reanalyses feature a Walker-type circulation with rising motion over CEA and subsidence over the western boundary of the Indian Ocean around 45°E. While the basic structure of the wind is captured by all the reanalyses, there are still some differences. For example, the upward vertical motion in ERA-Interim, MERRA2, JRA-55 and 20CR is stronger than that in the two NCEP reanalyses. The vertical circulation in JRA-55 and MERRA2 extends to the south of 10°N. In particular, there are multiple rising branches in MERRA2, JRA-55 and 20CR. The maximum ω is around 300 hPa in most of the reanalyses, but is located around 500 hPa in NCEP-2. The position and strength of the rising branches differ significantly, indicating differences in depicting the tropical rain belt. Figure 15 illustrates the ω field in SON. All the reanalyses show the upward branches located within 10°E–30°E over the Congo Basin, while the downward branches lie on both sides, over the Gulf of Guinea and the western boundary of the Indian Ocean. In principle, anomalously strong rising branches of the circulation could strengthen the Congo Basin moisture transport. However, the positions and magnitude of the ω maximum show distinct differences. For instance, ERA-Interim, MERRA2 and 20CR have stronger rising motion, while the ω maximum in NCEP-2 is at a lower level, around 600 hPa.

Fig. 14

Climatology of March–April–May meridional mean (5°S–5°N) of zonal and vertical velocity (vectors, u: m/s; ω: −10² Pa/s) and vertical velocity (shading, ω: −10² Pa/s) from the seven reanalyses

Fig. 15

Same as Fig. 14 but for September–October–November
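The longitude–pressure cross sections discussed in this subsection are meridional means over an equatorial band. The sketch below shows one way such a mean could be formed from a three-dimensional ω field. The function name, the cosine-latitude weighting and the sample values are assumptions for illustration; any sign flip or scaling applied for plotting is deliberately left out.

```python
import numpy as np

def equatorial_cross_section(field, lat_deg, lat_min=-5.0, lat_max=5.0):
    """Average a (level, lat, lon) field over a latitude band.

    Latitudes are weighted by cos(lat) to account for grid-cell area.
    Returns an array shaped (level, lon).
    """
    lat_deg = np.asarray(lat_deg, dtype=float)
    band = (lat_deg >= lat_min) & (lat_deg <= lat_max)
    if not band.any():
        raise ValueError("no grid latitudes fall inside the requested band")
    weights = np.cos(np.deg2rad(lat_deg[band]))
    return np.average(field[:, band, :], axis=1, weights=weights)

# Hypothetical omega field (Pa/s) with 2 levels, 5 latitudes, 3 longitudes.
lat = np.array([-10.0, -5.0, 0.0, 5.0, 10.0])
omega = np.zeros((2, lat.size, 3))
omega[0] = -0.04                # uniform ascent (negative omega) at level 0
omega[1] = lat[:, None] / 100.0 # antisymmetric about the equator at level 1

section = equatorial_cross_section(omega, lat)
print(section)
```

In this example the uniform ascent at the first level is preserved by the average, while the antisymmetric pattern at the second level cancels to zero within 5°S–5°N.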

5 Conclusions and discussion

Understanding the rainfall climatology and variability over Central Equatorial Africa (CEA), where the world’s second largest contiguous rainforest is located, has been largely hampered by the lack of adequate in situ observations and meteorological stations since the 1980s (Nicholson et al. 2018a). Large differences and uncertainties over CEA among several observational and reanalysis data sets and various climate model simulations present another big challenge. Resolving these differences is difficult without validating them against a reliable gauge-based data set. Therefore, this study comprehensively assesses the available reanalysis data sets based on quality-controlled radiosonde observations and a newly developed gauge-based rainfall data set (referred to as NIC131) to identify the “best” data sets for understanding the rainfall climate, patterns, and variability over the understudied region of CEA. The identified data sets can also serve as a basis for model validation.

Our results indicate that the seven reanalysis data sets (i.e., 20CR, CFSR, ERA-Interim, JRA-55, MERRA2, NCEP-1 and NCEP-2) all capture well the major features of the rainfall seasonal cycle and the seasonal evolution in mean rainfall in the reference observational data (NIC131) over equatorial Africa, but exhibit a large spread in rainfall magnitude and spatiotemporal characteristics. MERRA2 is closest to NIC131 for the mean climatology and interannual variability, whereas ERA-Interim is too wet across all months and has more rainfall over equatorial Africa, which may largely result from the assimilation of humidity information and the presence of a substantial warm bias in the model associated with underestimated aerosol optical depth in Central Africa (Uppala et al. 2005; Dee et al. 2011). CFSR, JRA-55 and NCEP-2 are characterized by significant jumps in the late 1990s and 2000s, mainly due to changes in observing systems, such as the introduction of new data into the assimilation systems (Trenberth et al. 2011; Zhang et al. 2012). Bosilovich et al. (2011) also found that an erroneous radiosonde station produced the Central African rainfall problem in MERRA. For the spatiotemporal characteristics, MERRA2 and the two NCEP data sets are relatively better at reproducing the observed long-term drought in AMJ over the Congo Basin. This multi-decadal drying trend is consistent with the findings of Zhou et al. (2014), who used optical, microwave and gravity remotely sensed data.

To attribute the distinct differences in the CEA rainfall among the seven reanalyses, we also examine the large-scale atmospheric circulation characteristics (e.g., transport of moisture and vertical motion). Although the upper air observations are obtained from only eleven sites in equatorial Africa, MERRA2 shows the smallest biases and RMSE in describing the wind fields against the observational data in the lower and middle troposphere. The two NCEP reanalyses capture the HGT fields better than the other reanalyses. However, NCEP-1 has the largest magnitude in the African Easterly Jets and relatively weak vertical motion. By examining the water vapor transport, we find that the differences in the lower- and middle-tropospheric circulation can reasonably explain those in the rainfall climatology among the reanalyses.

Considering the large differences in horizontal and vertical wind fields among the seven reanalyses, we need to use the best reanalysis wind and moisture fields to explain the observed rainfall and associated circulation changes. For instance, the land-based convergence zone in 20CR is more closely related to the Indian Ocean ITCZ, whereas in ERA-Interim it is mainly affected by the Atlantic ITCZ (Zhang et al. 2013). SST boundary conditions might also influence the rainfall variability in Central Africa, as warmer tropical SST may lead to northward wind anomalies (Zhang et al. 2013), and the enhanced and westward-extended tropical Walker circulation over the Indo-Pacific could weaken the moisture transport and reduce the rainfall in Central Africa (Hua et al. 2016, 2018).

This study aims to evaluate the available reanalysis data sets, based on upper air observations and a newly developed gauge-based rainfall data set, for studies of the rainfall climatology and variability over CEA. However, it is hard to attribute the discrepancies among the seven reanalyses. Each reanalysis has its own limitations, likely related to assimilated observations, physical parameterizations and SST boundary conditions (Lin et al. 2014). Rainfall in particular depends heavily on the representation of the hydrological cycle in the models, and is further contaminated by model biases. Because the moisture budget in the reanalyses is not closed, much work has been done to examine the reanalysis hydrological processes. For example, large systematic differences in precipitable water and evaporation minus precipitation (E–P) are found between ERA-Interim and MERRA, especially over Africa (Trenberth et al. 2011). A possible explanation is that the lifetime of moisture in the reanalyses is too short and the recycling is too large (Trenberth et al. 2003). Trenberth (1999) found that the recycling activity over the Congo Basin is higher than that in the Amazon Basin, suggesting the importance of land surface processes in influencing the CEA rainfall. Moisture recycling likely plays an important role over CEA (Pokam et al. 2012), where evapotranspiration from the dense Congolese forests could contribute to the regional rainfall recycling. As the transport of moisture is a key component of rainfall, further work is needed to better evaluate the hydrological components in the reanalyses and to examine the E–P from the moisture budget over CEA (e.g., Reichle et al. 2017a, b).