1 Introduction

Under global climate change, accurate rainfall forecasts for agricultural activities are needed more than ever, especially in Africa, where the economy depends heavily on rain-fed agriculture. Eastern Africa, including Ethiopia, is exceedingly vulnerable to climate change and needs accurate rainfall measurements to monitor agricultural resources. Unfortunately, ground-based observations in Africa are insufficient because of their spatial and temporal discontinuities (Schreck and Semazzi 2004). Ethiopia has frequently been exposed to drought-driven famine following reduced rainfall, largely because skilled forecasts based on ground observations were lacking. For the past few decades, scientists have been producing indirect rainfall estimates from satellite imagery and Numerical Weather Prediction model outputs at different spatial and temporal resolutions (e.g., Becker et al. 2013; Bergès et al. 2010; Harris et al. 2014; Herman et al. 1997; Huffman et al. 2010; Menne et al. 2012; Novella et al. 2013; Roebeling et al. 2012; Schneider et al. 2014). Most African countries have benefited from these freely available satellite-based and reanalysis products. Several such alternatives are now available, but these indirectly estimated products need to be validated before application. One of the most common questions in the scientific community is “which rainfall data set is appropriate for a given research sector?” Such questions need to be addressed carefully because rainfall is very sensitive to local conditions (e.g., topography), yet many studies still rely on recommendations that have not been verified. The availability of these data sets is not sufficient by itself; they must also be validated and verified.

Compared with the rate at which satellite-based and model output products are produced, relatively little validation work has been conducted in various regions. The Climate Hazards Group Infrared Precipitation with Stations (CHIRPS v2.0) is one of the popular products that have been validated in multiple regions, e.g., China (Bai et al. 2018), Pakistan (Ullah et al. 2019), and Africa (Nkunzimana et al. 2020a; Ayehu et al. 2018; Dinku et al. 2018). Some of the satellite-based products, including CHIRPS and the Tropical Applications of Meteorology Using Satellite Data and Ground-Based Observations (TAMSAT versions 2 and 3), were validated in different regions of East Africa (e.g., Dinku et al. 2011, 2007, 2008a, 2008b; Koutsouris et al. 2016; Nkunzimana et al. 2020b; Romilly et al. 2011). Reanalysis products have also been validated in Africa. For example, Maidment et al. (2013) validated the ERA-Interim and ERA-40 products in Uganda, while Zhang et al. (2013) validated eight different reanalysis products in South Africa.

The most recent validation work related to this study was conducted in East Africa. Dinku et al. (2018) evaluated the Climate Hazards Group Infrared Precipitation (CHIRP) and CHIRPS products over East Africa (Ethiopia, Kenya, Somalia, Uganda, Rwanda, and Tanzania) at daily, dekadal (10-day), and monthly time scales. They compared these products with TAMSAT (versions 2 and 3) and the African Rainfall Climatology version 2 (ARC2). The same products were evaluated at nearly the same time by Ayehu et al. (2018), but only over the upper Blue Nile river basin of Ethiopia. Other satellite-based and reanalysis products were also validated in Uganda (Maidment et al. 2013), Burundi (Nkunzimana et al. 2020a), and Tanzania (Koutsouris et al. 2016).

This paper validates three products against independent gauge observations at monthly, seasonal (the primary rainy season, June–September), and annual time scales in six selected regions of Ethiopia (Fig. 1b). The first product is CHIRPS, chosen because of its high resolution and its performance in the region at daily, dekadal, and monthly time scales (Ayehu et al. 2018; Dinku et al. 2018). The second is the latest release (version 3.1) of the Tropical Applications of Meteorology Using Satellite Data and Ground-Based Observations (TAMSAT) product. This product has the finest resolution available for Africa and is designed to provide accurate rainfall estimates for the continent. TAMSAT was reported to perform better than CHIRPS at the daily scale and nearly as well at the dekadal scale (Dinku et al. 2018). Version 3.1, which has not yet been validated in the region, is considered in this study. The third product is the fifth-generation reanalysis (ERA5) of the European Center for Medium-Range Weather Forecasts (ECMWF). Its predecessors, ERA-Interim and ERA-40, which are freely available at 0.75° spatial resolution, have been widely applied in various research sectors across the world. Despite this popularity, their appropriateness in Africa is still debated because modeled rainfall estimates are less accurate than satellite-based products (Diro et al. 2009; Lim and Ho 2000; Maidment et al. 2013; Poccard et al. 2000). ERA-Interim and ERA-40 were evaluated in Uganda (Maidment et al. 2013). ERA5 is selected for validation in this study for three reasons: (a) it has a finer resolution (0.25°) than the previous versions (0.75°); (b) to test whether the current version is improved and reliable in the region relative to the results reported by Maidment et al. (2013); and (c) to compare it with the two satellite-based products and see whether it can dispel the doubts raised in earlier studies (Diro et al. 2009; Lim and Ho 2000; Maidment et al. 2013; Poccard et al. 2000). Validating this product therefore helps establish whether it is reliable and how it performs relative to satellite-based products.

Fig. 1

Map of the study area: (a) land cover and (b) topography with the specific study sites. Panel (a) is based on a ten-class global land use/land cover (LULC) map for 2020 derived from ESA Sentinel-2 imagery at 10 m resolution. The map was produced by a deep learning model trained on over 5 billion hand-labeled Sentinel-2 pixels, sampled from over 20,000 sites distributed across all major biomes of the world (Esri 2021). Since snow/ice is not common in this region, only nine land cover types from the Esri land cover product were considered. Panel (b) shows the topography of the study area with the locations of the specific analysis sites

2 Study regions

The study region (Ethiopia) lies in the sub-tropics (3.30°–15°N, 33°–48°E) and experiences a tri-modal rainfall regime: October–January, February–May, and the main rainy season (June–September), which accounts for 50–80% of annual rainfall (Dubache et al. 2019a, 2019b). Most agricultural productivity therefore depends on rain during the June–September season. Of the six study sites selected, at least four receive their main rainfall in this season (Fig. 1b). The sites were chosen carefully, considering the region's complex topography (Fig. 1b) and the variety of land cover types (Fig. 1a). Additional information on the locations of the selected study sites is given in Table 1.

Table 1 Locations and their elevations of the selected study sites


3 Data and methods

3.1 Station data

The National Meteorological Agency of Ethiopia provided monthly mean rain gauge data for 1992–2009. After excluding the stations with large missing values, sixty-eight stations across the country were considered in this study. Seasonal (June–September) and annual mean rainfall data used in the study were derived from the given monthly mean. The well-known Kriging interpolation method has been applied to estimate the monthly, seasonal, and annual rainfall for the locations lacking gauge observations.
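The derivation of seasonal and annual means from the monthly series can be sketched as follows. This is a minimal illustration, not the authors' processing code: the function name `seasonal_and_annual_means`, the `(n_years, 12)` array layout, and the choice to skip missing months (NaN) when averaging are all assumptions made for the example.

```python
import numpy as np

def seasonal_and_annual_means(monthly, season_months=(6, 7, 8, 9)):
    """Derive seasonal and annual means from monthly mean rainfall.

    monthly: array of shape (n_years, 12), columns ordered January..December
             (assumed layout). Missing months may be NaN.
    season_months: calendar months of the season; June-September by default,
                   matching the main rainy season described in the text.
    Returns (seasonal_mean, annual_mean), each of shape (n_years,).
    """
    monthly = np.asarray(monthly, dtype=float)
    if monthly.ndim != 2 or monthly.shape[1] != 12:
        raise ValueError("expected an array of shape (n_years, 12)")
    cols = [m - 1 for m in season_months]        # calendar month -> column index
    seasonal = np.nanmean(monthly[:, cols], axis=1)
    annual = np.nanmean(monthly, axis=1)
    return seasonal, annual

# Usage with synthetic values (two years of made-up monthly means).
demo = np.arange(24, dtype=float).reshape(2, 12)
jjas, yearly = seasonal_and_annual_means(demo)
```

Ignoring NaN months is only one possible convention; a stricter alternative would be to mark any season or year with a missing month as missing.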

3.2 Satellite products

TAMSAT v3.1 and CHIRPS v2.0 are the two satellite-based products selected for this study. Both have high spatial resolution and cover the study region for an extended period.

TAMSAT is produced at the University of Reading to provide accurate, high-resolution rainfall estimates for the African continent (Maidment et al. 2014b; Maidment et al. 2017; Tarnavsky et al. 2014). The TAMSAT algorithm assumes that cold cloud-top temperatures of tropical storms (obtained from Meteosat thermal-infrared images) identify raining clouds. Previous versions of this product have been validated in some regions of Africa (Dinku et al. 2007; Herman et al. 1997; Jobard et al. 2011; Laurent et al. 1998; Maidment et al. 2013; Thorne et al. 2001; Tucker et al. 2001). The most recent version (TAMSAT v3.1) is therefore used in this study.

CHIRPS combines station data with various satellite products. It was developed by the U.S. Geological Survey and the Climate Hazards Group at the University of California and is available at daily, pentadal, dekadal, and monthly time scales from 1981 to near present (Funk et al. 2015a, 2015b, 2014). CHIRPS is one of the few recent satellite products that have been validated in parts of East Africa and recommended as more reliable for the region than other products (Ayehu et al. 2018; Dinku et al. 2018; Nkunzimana et al. 2020a). In this study, the monthly average (0.25°) is used for validation.

3.3 ERA5

ERA5 is the fifth-generation reanalysis data set produced by the European Center for Medium-Range Weather Forecasts (ECMWF); it extends the popular ERA family, which includes ERA-Interim and ERA-40 (Hersbach and Dee 2016; Hoffmann et al. 2019). According to Hersbach and Dee (2016), ERA5 builds on ERA-Interim with a number of changes and is available from 1979 to present. The horizontal resolution has improved from 79 km to 31 km, and the temporal resolution from six-hourly in ERA-Interim to hourly in ERA5 (Hersbach and Dee 2016). Previous versions were validated in Uganda, East Africa (Maidment et al. 2013). The monthly averaged product (0.25°) for 1992–2009 is used for validation in this study.

3.4 Data preparation

Point-scale gauge observations with few missing values were used to test the satellite-based and reanalysis products. Because a single gauge is not representative of an area average, it would be preferable to compare each pixel with the average of several gauges within it. However, given the scarcity of gauge stations and the fine resolution of the satellite-based and reanalysis data sets, each single-point gauge record was instead compared with the gridded value obtained by nearest-neighbor interpolation from the satellite/model grid to the station location. The six regions selected for validation in this study (Fig. 1b) are far apart, differ considerably in elevation (Table 1), and experience different climate conditions. Given the complex topography of Ethiopia and its variety of land cover types, the selected regions help in understanding the performance of these products in different climatic zones. For the country-wide comparison, the Kriging interpolation of the sixty-eight gauge stations was compared with the satellite-based and reanalysis products (Fig. 3).
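The nearest-neighbor matching of a gridded product to a gauge location can be illustrated with the short sketch below. It assumes a regular latitude/longitude grid stored as 1-D coordinate arrays; the function name `nearest_grid_value`, the array layout, and the station coordinates in the usage lines are hypothetical and are not taken from Table 1.

```python
import numpy as np

def nearest_grid_value(field, grid_lats, grid_lons, station_lat, station_lon):
    """Return the gridded value(s) at the cell nearest to a station.

    field: array whose last two axes are (lat, lon); leading axes such as
           time are preserved (assumed layout).
    grid_lats, grid_lons: 1-D cell-centre coordinates of a regular grid.
    Returns (values, (i, j)) where (i, j) are the selected lat/lon indices.
    """
    field = np.asarray(field, dtype=float)
    grid_lats = np.asarray(grid_lats, dtype=float)
    grid_lons = np.asarray(grid_lons, dtype=float)
    # On a regular lat/lon grid the nearest cell can be found per axis.
    i = int(np.argmin(np.abs(grid_lats - station_lat)))
    j = int(np.argmin(np.abs(grid_lons - station_lon)))
    return field[..., i, j], (i, j)

# Usage on a synthetic 0.25-degree grid covering the study domain.
lats = np.arange(3.0, 15.25, 0.25)
lons = np.arange(33.0, 48.25, 0.25)
synthetic = np.add.outer(lats, lons)             # made-up field: lat + lon
value, (i, j) = nearest_grid_value(synthetic, lats, lons, 9.03, 38.75)
```

Because distances are measured in degrees per axis, this is adequate at low latitudes on a regular grid; a great-circle distance would be a more careful choice for irregular grids.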

3.5 Statistics

Various skill metrics were applied to validate the satellite-based and reanalysis products against the independent gauge measurements. The skill metrics used in this study include the Index of Agreement (IA), simple linear regression (slope and coefficient of determination), Pearson’s correlation coefficient (r), Root Mean Square Error (RMSE), Nash and Sutcliffe efficiency (NSE), and BIAS. Detailed descriptions of these and related skill metrics for statistical performance evaluation are presented elsewhere (Dubache et al. 2019a, 2019b; Moriasi et al. 2007; Nageswararao et al. 2018; Nash et al. 1970; Singh et al. 2021). The mathematical expressions of the skill metrics used in this study are given in Eqs. 1–5.

$$IA = 1 - \frac{{\mathop \sum \nolimits_{i = 1}^{n} \left( {S_{i} - G_{i} } \right)^{2} }}{{\mathop \sum \nolimits_{i = 1}^{n} \left( {\left| {S_{i} - \overline{G}} \right| + \left| {G_{i} - \overline{G}} \right|} \right)^{2} }}$$
(1)
$$NSE = 1 - \frac{{\mathop \sum \nolimits_{i = 1}^{n} \left( {G_{i} - S_{i} } \right)^{2} }}{{\mathop \sum \nolimits_{i = 1}^{n} \left( {G_{i} - \overline{G} } \right)^{2} }}$$
(2)
$$RMSE = \sqrt {\frac{{\mathop \sum \nolimits_{i = 1}^{n} \left( {S_{i} - G_{i} } \right)^{2} }}{n}}$$
(3)
$$r = \frac{{\sum G_{i} S_{i} - n\overline{G}\overline{S}}}{{\sqrt {\left( {\sum G_{i}^{2} - n\overline{G}^{2} } \right)} \sqrt {\left( {\sum S_{i}^{2} - n\overline{S}^{2} } \right)} }}$$
(4)
$$BIAS = \frac{{\mathop \sum \nolimits_{i = 1}^{n} \left( {S_{i} } \right)}}{{\mathop \sum \nolimits_{i = 1}^{n} \left( {G_{i} } \right)}}$$
(5)

where i = 1, 2, …, n indexes the paired values and n is their number; S and \(\overline{S }\) denote the satellite or model estimates and their mean, and G and \(\overline{G }\) denote the gauge observations and their mean. RMSE measures the typical magnitude of the difference between the satellite- or model-based products and the corresponding gauge observations; a value close to 0 indicates better satellite or model performance. Pearson’s correlation coefficient (r) measures the linear correlation between the satellite- or model-based products and the rain gauge data, with values near −1 and 1 indicating a strong negative and positive correlation, respectively. BIAS measures how much the satellite and model products underestimate or overestimate relative to the gauge observations: values close to 1 indicate the best performance, whereas values below or above 1 indicate under- or overestimation, respectively. IA falls between 0 and 1, with values closer to 1 indicating better agreement. Finally, NSE compares the overall deviation of the estimates from the observations with the observation variance: 1 indicates a perfect match, values between 0 and 1 indicate acceptable performance, and values below 0 indicate poor performance.
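As an illustration, the five metrics of Eqs. 1–5, together with the regression slope and coefficient of determination, could be computed as in the sketch below. The function name `skill_metrics`, the dropping of pairs with missing values, and the choice to regress the estimate on the gauge are assumptions made for this example rather than details stated in the paper.

```python
import numpy as np

def skill_metrics(gauge, estimate):
    """Compute IA, NSE, RMSE, r, BIAS (Eqs. 1-5) plus regression slope and R^2.

    gauge: 1-D gauge observations G_i.
    estimate: 1-D satellite or model estimates S_i, paired with gauge.
    Pairs containing NaN are dropped (an assumption of this sketch).
    """
    g = np.asarray(gauge, dtype=float)
    s = np.asarray(estimate, dtype=float)
    if g.ndim != 1 or g.shape != s.shape:
        raise ValueError("gauge and estimate must be 1-D arrays of equal length")
    keep = ~(np.isnan(g) | np.isnan(s))
    g, s = g[keep], s[keep]

    g_mean = g.mean()
    sq_err = np.sum((s - g) ** 2)
    ia = 1.0 - sq_err / np.sum((np.abs(s - g_mean) + np.abs(g - g_mean)) ** 2)
    nse = 1.0 - sq_err / np.sum((g - g_mean) ** 2)
    rmse = np.sqrt(sq_err / g.size)
    r = np.corrcoef(g, s)[0, 1]                  # Pearson correlation, Eq. 4
    bias = s.sum() / g.sum()                     # ratio of totals, Eq. 5
    slope, intercept = np.polyfit(g, s, 1)       # estimate regressed on gauge
    return {"IA": ia, "NSE": nse, "RMSE": rmse, "r": r, "BIAS": bias,
            "slope": slope, "intercept": intercept, "R2": r ** 2}

# Usage with made-up numbers: an estimate that is exactly half the gauge value.
metrics = skill_metrics([10.0, 20.0, 30.0, 40.0], [5.0, 10.0, 15.0, 20.0])
```

In this synthetic case the correlation is perfect while BIAS is 0.5 and NSE is negative, which shows why correlation alone cannot reveal a systematic underestimation.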

4 Results

4.1 Climatology

All the products captured the temporal pattern of the ground observations well; however, the ERA5 reanalysis product showed considerable underestimation in all cases, especially during rainy months (Fig. 2). Sites located in the rift valley belt (Dire Dawa and Moyale) followed a bimodal rainfall distribution, while the rest showed a unimodal regime. March–April and July are rainy months in Dire Dawa, whereas Moyale receives high rainfall in April, October, and November. The other four sites experience high rainfall in June–September. Awassa is also located in the rift valley belt but has a unimodal regime. The rain in Awassa starts around April and ends around October, a pattern unlike that of the other sites. The satellite-based products showed slight overestimation during the rainy season at the Bole, Gonder, and Adwa sites but underestimated at the Dire Dawa site. The ERA5 reanalysis product showed underestimation at all locations throughout the study period. The earlier ERA-40 product was validated in Uganda and also underestimated dekadal rainfall (Maidment et al. 2013).

Fig. 2

Mean monthly climatology of Gauge, TAMSAT 3.1, CHIRPS, and ERA5 rainfall estimates over the selected study sites in Ethiopia (1992-2009)

The mean monthly rainfall from sixty-eight gauge stations with few missing values was considered, and the kriging interpolation method (Oliver and Webster 1990) was applied to estimate rainfall at locations with no observations (Fig. 3). Kriging was also applied to the seasonal and annual means derived from the mean monthly values. The spatial distribution in Fig. 3 shows that all the products reproduced the magnitudes of rainfall in the region, despite interpolation bias caused by the lack of observations in the southeastern part of the country. This region usually receives little rainfall and has no significant influence on the country's economy. Compared with the ERA5 reanalysis product, both satellite-based products better reflected the rainfall distribution at seasonal and annual scales.

Fig. 3

Gauge (Kriging interpolation from the 68 existing gauge observation stations), TAMSAT 3.1, CHIRPS, and ERA5 monthly mean, seasonal mean, and annual mean rainfall over Ethiopia (1992–2009)

4.2 Monthly mean rainfall

The mean monthly rainfall pattern was well captured by both the satellite-based products and the reanalysis product, although with different magnitudes of bias (Fig. 4). ERA5 showed considerable underestimation at all selected study sites, while the satellite products underestimated significantly at the Dire Dawa site, which is located in the eastern part of the country and experiences a bimodal rainfall distribution, and overestimated at the Bole site.

Fig. 4

Monthly mean rainfall pattern of Gauge, TAMSAT 3.1, CHIRPS, and ERA5 over the selected study sites in Ethiopia (1992-2009)

The regression results (Fig. 5) and the performance indices (Table 2) describe the relationship between the gauge observations and each product. ERA5 shows considerable underestimation and a relatively low correlation compared with the satellite products. Both satellite products underestimate markedly at Dire Dawa and Adwa (Fig. 5c and f); however, the large bias in Fig. 5f is due to outlying gauge observations in the last two years of the study period (see Fig. 4, Adwa).

Fig. 5

Linear regression between the monthly mean Gauge and TAMSAT, CHIRPS, ERA5 rainfall estimates in selected study sites in Ethiopia (1992-2009). The subplots (a), (b), (c), (d), (e), and (f) represent the selected study sites Bole, Awassa, Diredawa, Gonder, Moyale, and Adwa, respectively. The grey diagonal line stands for the 1:1 fitting line

Table 2 Performance indices result for the mean monthly rainfall estimates between gauge and TAMSAT, CHIRPS, and ERA5 products
Fig. 6

Seasonal mean (June–September) rainfall pattern of gauge, TAMSAT 3.1, CHIRPS, and ERA5 over the selected study sites in Ethiopia (1992–2009)

The scatter plots in Fig. 5 show underestimation of varying magnitude by both the satellite and reanalysis products in almost all cases, except at Bole, where the satellite products slightly overestimate. Despite differences in accuracy from site to site, CHIRPS showed better results in most cases, except at the Gonder site (Fig. 5d), where TAMSAT performed better. Both satellite products showed a large bias at Adwa (Fig. 5f) and a moderate bias at Dire Dawa (Fig. 5c). ERA5 was strongly negatively biased at all sites.

According to the skill metrics (Table 2), performance at the monthly time scale varied from site to site. Except at two sites (Moyale and Adwa), CHIRPS performed well at all locations, with a high correlation coefficient, coefficient of determination, and index of agreement, and low RMSE and BIAS values. For CHIRPS, in most cases the NSE values are above 0.5 and the slopes are between 0.6 and 1.2. TAMSAT followed CHIRPS closely in most cases, whereas ERA5 performed worse than the satellite products, yet its results were still significant in most cases.

4.3 Seasonal mean rainfall

June–September is the main rainy season in Ethiopia. Except for Moyale, the other five study sites in Fig. 6 are influenced by this season. Compared with the monthly mean, the seasonal mean rainfall pattern was captured less well by all products. The quantitative results in Table 3 and the regression plots (Fig. 7) confirm this. The negative bias of ERA5 at all study sites and the similar magnitudes of the two satellite-based products mirror the monthly results.

Table 3 Performance indices result for the mean seasonal rainfall estimates between gauge and TAMSAT, CHIRPS, and ERA5 products
Fig. 7

Linear regression between seasonal mean (June-September) gauge and TAMSAT, CHIRPS, ERA5 rainfall estimates in selected study sites in Ethiopia (1992-2009). The subplots (a), (b), (c), (d), (e), and (f) represent the selected study sites Bole, Awassa, Diredawa, Gonder, Moyale, and Adwa, respectively. The corresponding (g), (h), (i), (j), (k), and (l) stand for the same sites but excluding the ERA5 data set. The grey diagonal line stands for the 1:1 fitting line

Despite the difficulty in capturing the seasonal rainfall pattern adequately, CHIRPS still performed well in reproducing the phase of the gauge observations (Fig. 6). The regression plots (Fig. 7) and the performance indices in Table 3 indicate that CHIRPS performs slightly better than TAMSAT, except at the Awassa site, where TAMSAT performed better. ERA5 underestimates in all cases, but its results are still significant.

At the seasonal time scale, the products were less accurate than at the monthly scale. Nevertheless, CHIRPS performed well, with a seasonal average correlation coefficient of 0.59, RMSE of 0.47, and BIAS of 0.89. TAMSAT is next, with a correlation coefficient of 0.36, RMSE of 0.53, and BIAS of 0.76. ERA5 showed a better correlation in two individual cases; however, its overall average is lower, with a correlation coefficient of 0.40, RMSE of 0.89, and BIAS of 0.38. Finally, the quantitative results at the Moyale site (which is not rainy during this season) in Table 3 indicate that CHIRPS matched the gauge observations very well during normal conditions.

The scatter plots (Fig. 7), read together with the corresponding metrics in Table 3, confirm the sizeable negative bias of ERA5 at all study sites (Fig. 7a–f); the corresponding plots without ERA5 are shown in Fig. 7g–l. The plots show the over- and underestimation by the products and their difficulty in capturing the pattern of the seasonal rainfall distribution at each study site.

4.4 Annual mean rainfall

The performance of the products at the annual scale falls between that at the monthly and seasonal time scales. For example, all the products captured the general shape of the yearly mean rainfall distribution (Fig. 8), and the corresponding statistics in Table 4 show moderately good performance.

Fig. 8

Annual mean rainfall pattern of gauge, TAMSAT 3.1, CHIRPS, and ERA5 over the selected study sites in Ethiopia (1992-2009)

Table 4 Performance indices result for the mean annual rainfall estimates between gauge and TAMSAT, CHIRPS, and ERA5 products

The reanalysis product still underestimates but performs relatively better than at the seasonal scale. The regression subplots (Fig. 9g–l) show the results excluding the ERA5 product, which is strongly biased at all sites throughout the study period.

Fig. 9

Linear regression between the annual mean gauge and TAMSAT, CHIRPS, ERA5 rainfall estimates in selected study sites in Ethiopia (1992-2009). The subplots (a), (b), (c), (d), (e), and (f) represent the selected study sites Bole, Awassa, Diredawa, Gonder, Moyale, and Adwa, respectively. The corresponding (g), (h), (i), (j), (k), and (l) stand for the same sites but excluding the ERA5 data set. The grey diagonal line stands for the 1:1 fitting line

All the products perform relatively well at this scale (Table 4), with slight differences from site to site. CHIRPS performed best, with an average correlation coefficient of 0.63, RMSE of 0.17, and BIAS of 0.97. TAMSAT also performed well, with an average correlation coefficient of 0.60, RMSE of 0.17, and BIAS of 0.95. For ERA5, the average correlation coefficient, RMSE, and BIAS are 0.65, 0.44, and 0.42, respectively. The index of agreement is above average in most cases, but NSE values are good in only a few cases. For the detailed statistical results at individual study sites, see Tables 2, 3, and 4 for the monthly, seasonal, and annual time scales, respectively.

The monthly, seasonal, and annual means and totals of the gauge, TAMSAT, CHIRPS, and ERA5 data sets at all sites, as well as their differences, are presented in Table 5. The magnitude of the bias is evident from the difference between the gauge and each product. Consistent with the results above, the difference between gauge and ERA5 is large at all sites. TAMSAT and CHIRPS also show noticeable differences from the gauge in some cases, though smaller than those of ERA5.

Table 5 The difference in total and average values of the gauge against satellite/model products

5 Discussion

The observed monthly mean rainfall data were used to validate the TAMSAT, CHIRPS, and ERA5 precipitation products. Despite the considerable underestimation by ERA5, all three products captured the climatological pattern of rainfall at the study sites. The spatial pattern over the country also showed that the products could indicate the rainfall amounts in different regions. Statistically, the satellite products are much better than the reanalysis product, which showed significant underestimation at all study sites and time scales. All the products performed better at the monthly scale than at the seasonal and annual scales. CHIRPS showed slightly better accuracy than TAMSAT at the monthly scale. This agrees with a recent study in the region (Dinku et al. 2018), which reported that CHIRPS performed slightly better than TAMSAT (version 3) at dekadal and monthly scales. CHIRPS also performed better in capturing the shape and estimating the magnitude of the seasonal rainfall amounts. The same is true at the annual scale, even though TAMSAT showed slightly better performance in a few cases. Another study in an East African country, Burundi (Nkunzimana et al. 2020a), also reported on the performance of CHIRPS at the annual time scale relative to other satellite products. The poor performance of the satellite products in capturing the pattern and estimating the magnitude of seasonal rainfall can be attributed to various factors, such as the complex topography of the study region, the heavy seasonal rain in the region, and possible limitations in algorithm design. Derin and Yilmaz (2014) evaluated several satellite products globally and reported that all the satellite-based estimates have difficulty representing the precipitation gradient normal to elevated terrain. Evaluations of several satellite products worldwide also showed a general underestimation of heavy precipitation over eastern Africa (Sun et al. 2018; Thiemig et al. 2012).

Both CHIRPS and TAMSAT are well accepted and widely used in the region. TAMSAT might nevertheless be expected to outperform CHIRPS, especially in this study region, for several reasons. First, it has a much higher spatial resolution (0.0375°) than CHIRPS (0.25°, monthly) and other data sources that cover the study region. Second, although the two products have broadly similar algorithms, CHIRPS uses a fixed rain/no-rain temperature threshold, whereas TAMSAT uses a varying one (Dinku et al. 2018). Third, TAMSAT has the single goal of providing accurate rainfall estimates for Africa and accounts for more detailed continental conditions, whereas CHIRPS has a broader coverage. Finally, TAMSAT estimates have been validated over the complex terrain of Ethiopia (which is also the region of this study) and demonstrated good skill (Maidment et al. 2017). Even so, CHIRPS performs better in all cases except at the daily scale (Dinku et al. 2018).

Most satellite products include rain gauge measurements from the Global Telecommunications System, which carries only synoptic observations. The number of synoptic stations in the Global Telecommunications System is limited, and they report only sporadically (Dinku et al. 2014). About 20 gauge stations in Ethiopia were reported to be accessible through the Global Telecommunications System, whereas the country has over 500 gauges (Dinku et al. 2014). CHIRPS might therefore have had access to a larger number of gauges in the region during calibration than TAMSAT did. Another study in the region pointed out the difficulty of local calibration owing to limited access to rain gauge data (Dinku et al. 2011) and suggested that the products could be improved by blending with gauge observations. These may be some of the reasons for the limited accuracy of the satellite products.

The reanalysis product (ERA5) showed a large negative bias in almost all cases and was less accurate than the satellite products. This agrees with several previous studies (Diro et al. 2009; Lim and Ho 2000; Maidment et al. 2013; Poccard et al. 2000), which documented that reanalysis rainfall over Africa is less accurate than satellite-based products. It was also reported that general circulation models, which operate at resolutions coarser than 50 km and rely on parameterizations that approximate precipitation processes across a model grid square, cannot resolve the small-scale processes associated with tropical convection and precipitation (Maidment et al. 2014a), and that the uncertainties in these approximations can lead to large errors in the estimated rainfall (Allan et al. 2007; Flato et al. 2014; Pearson et al. 2014; Pope et al. 2000; Yang and Slingo 2001). The latest ERA5 product, which operates at around 30 km grid resolution, still showed large underestimation in all the cases considered in this study. The previous version of this product (ERA-40) was validated in Uganda at the dekadal time scale (Maidment et al. 2013) and also underestimated rainfall.

6 Conclusion

Historical records indicate that Ethiopia lost millions of lives to devastating droughts, most of which were directly associated with the failure of seasonal rainfall. Satellite-based and numerical model rainfall estimates are the best alternatives to the region's sparse and discontinuous ground-based observations, and they could allow the frequent drought and flood events to be better forecasted and monitored. According to previous studies, CHIRPS is a reliable product among the several freely available satellite-based rainfall estimates in the region. In this study, two recently produced high-resolution satellite-based products (CHIRPS and TAMSAT) and one recently released high-resolution reanalysis product (ERA5) were evaluated using gauge observations. The evaluation was conducted at monthly, seasonal, and annual time scales at six different sites in Ethiopia. The results indicated that CHIRPS performed well in all cases and that TAMSAT followed closely.

In contrast, the reanalysis product showed considerable underestimation in all cases. Given the small differences between them, both satellite-based products can be considered reliable in the region. Still, it would be better to compare both products against in situ observations before applying them in site-based research. Furthermore, the algorithms need further calibration with sufficient, high-quality gauge observations to improve the estimation of seasonal rainfall amounts.