1 Introduction

Drought is a complex phenomenon (Van Loon, 2015) that results from water shortage and has more direct and significant environmental, social, and economic impacts than any other major natural disaster (Dai, 2013; Erian et al., 2010, 2021; Liu et al., 2020a; Vogt et al., 2018; Zabihi et al., 2017). Droughts can arise either from climate extremes (e.g., advection of hot and dry air masses or prevailing anticyclonic conditions) or from the complex interaction of natural processes and high levels of human activity that affect the water balance (Erian et al., 2021; Van Loon et al., 2016). Drought types are distinguished by the variable in deficit: (i) precipitation (meteorological drought), (ii) streamflow (hydrological drought), (iii) soil moisture (agricultural drought), or (iv) any combination of these three drought types (Dracup et al., 1980). Drought occurs when the seasonal precipitation drops below the normal or long-term average (Wilhite, 2005). In Ethiopia, drought occurs in different seasons and in different regions of the country, when seasonal rainfall drops below normal by about 30–50% (Mera, 2018).

Global warming has played an important role in shortening the recurrence interval of droughts in Ethiopia, and it is believed to have increased the severity of drought impacts (Wilhite & Buchanan-Smith, 2005). This is reflected in the drought record: droughts occurred in Ethiopia on average once per decade from 1950 to the 1980s, whereas recently they have occurred about once every 3 years (Block, 2008). In addition, some drought events in Ethiopia have been linked to El Niño events, which commonly occur in the equatorial Pacific (Ewbank et al., 2019). Several studies have assessed drought in Ethiopia during the Belg and Kirmet seasons (Alemu et al., 2021; El Kenawy et al., 2016; Mohammed et al., 2018; Nasir et al., 2021). These studies showed an increase in the intensity and frequency of drought during the Belg season and a decrease during the Kirmet season, owing to the complex variations in temperature and precipitation.

Several drought monitoring indices have been introduced and employed by researchers in different hydrological, meteorological, and agricultural fields over Ethiopia, including the standardized precipitation index (SPI) (McKee, 1995; Mekonen et al., 2020), standardized precipitation evapotranspiration index (SPEI) (Beguería et al., 2014; Haile et al., 2020), reconnaissance drought index (RDI) (Mohammed & Yimam, 2021), drought severity index (DSI) (Kenea et al., 2020), standardized runoff index (SRI) (Pathak & Dodamani, 2016; Yisehak & Zenebe, 2021), Palmer drought severity index (PDSI) (Ayugi et al., 2020; Palmer, 1965), streamflow drought index (SDI) (Mabrouk et al., 2020), and others (Guo et al., 2016; Esfahanian et al., 2017). The choice of drought monitoring indices depends on the quantity and quality of the available climate data, aims or objectives of the study, computational simplicity, and the ability of the index to detect the spatiotemporal distributions and variations in drought events (Morid et al., 2006). SPI and SPEI are the most widely used meteorological drought indices, where only precipitation values are used for SPI computation, while the SPEI calculation considers the effects of both evapotranspiration and precipitation together (Singh & Dhanya, 2019). Accordingly, global warming is poorly considered or represented in the SPI because it does not take into account the effect of the temperature element (Venkataraman et al., 2016). Bai et al. (2020) presented a comparison between the SPEI and the self-calibrating Palmer Drought Severity Index (scPDSI), in which both indices consider global warming but with different mechanisms. Their findings revealed that SPEI should be the first choice for use in drought monitoring, mainly because of the high uncertainty and instability of the scPDSI.

However, most studies of drought in Ethiopia have limitations, such as relying on precipitation alone, without temperature, to compute the drought index, or covering only a small area or a short record period. Moreover, despite extensive studies of drought in Ethiopia, very few have investigated how gridded datasets perform against corresponding observations and how well they reproduce drought events over Ethiopia. Assessing the climatic parameters from gridded datasets helps to establish their accuracy for drought estimation and their reliability for drought monitoring in the study area. Reda et al. (2021) evaluated the reliability of nine gridded precipitation and temperature datasets, including the Climatic Research Unit (CRU) TS v4.03, against ground-based observations for estimating the drought index (SPI) over the upper Tekeze River basin in Ethiopia from 1982 to 2016. That study demonstrated that the CRU agrees well with the observed values, with an extremely strong correlation (R = 0.85) at the monthly timescale. In addition, the drought indices (SPI, SPEI, etc.) in drought studies over Ethiopia are mostly computed at timescales of 1, 3, 6, 9, 12, 24, and 48 months, except for the study by Mekonen et al. (2020), which used timescales of 1, 4, and 8 months.

Therefore, this study aims to (1) evaluate how well the CRU gridded dataset reproduces observed temperature and rainfall over Ethiopia during the period 1970–2005, (2) assess the suitability and robustness of the gridded dataset for estimating the SPEI drought index across Ethiopia, (3) compute the SPEI at a 4-month timescale to represent drought events during the two rainy seasons of Belg (February–May) and Kirmet (June–September) separately, (4) compute the SPEI at an 8-month timescale to represent drought events during these two rainy seasons together (February–September), and (5) interpret and analyze the SPEI estimated from the observed and CRU data at 13 stations across Ethiopia during the Belg (B), Kirmet (K), and Belg–Kirmet (B–K) seasons. Only the two rainy seasons of Belg and Kirmet are considered in this study, because the third season, Bega (October–January), is the dry season in Ethiopia.

2 The Study Area

Ethiopia is located in the northeastern part of the African continent within 3–15°N and 33–48°E and forms the main part of the Horn of Africa region. The country occupies an area of approximately 1.14 million km², and it is rich in geographical diversity, with high, rugged plateaus and outlying lowlands. As indicated in Fig. 1, the elevation in the country ranges from 160 m below sea level at the northern end of the Rift Valley to more than 4600 m above sea level in northern mountainous regions (Teshome & Zhang, 2019). Several factors affect the climate of Ethiopia, and one of these factors is the regular movement of the Intertropical Convergence Zone (ITCZ), which moves to the north between March and September and to the south between October and January (Lashkari & Jafari, 2021).

Fig. 1 The geographical distribution and characteristics of the chosen 13 stations in Ethiopia

Ethiopia's varied topography and strong contrasts in elevation, with the central plateau lying mostly between 1800 and 2500 m and the lowlands below 1500 m, result in a variety of climates, from very arid to very humid, typical of equatorial mountains. Precipitation also varies with latitude, decreasing from south to north, and annual precipitation ranges from less than 250 mm (and as low as 50 mm in the Danakil depression) to 2000 mm in the highlands (Fazzini et al., 2015). Temperature is much cooler in high areas, ranging between 6 °C and 26 °C, whereas in the lowlands it ranges from 25 °C to 30 °C (Mera, 2018). In winter, cool, dry trade winds blowing from the northeast to the southwest control the dry period (Bega). In spring, southwesterly winds from the Congo basin are responsible for the season of little rainfall (Belg), which can nevertheless bring relatively plentiful precipitation to the southern part of the country. In summer, the Guinean monsoon, consisting of warm and humid equatorial winds, leads to abundant rains (Kiremt) (Fazzini et al., 2015). Ethiopian agriculture is highly reliant on this rainfall, with less than 3% of the land under cereals being irrigated (Mann & Warner, 2017). Accordingly, the short rainfall Belg season (February–May) in the south and southeast is caused by the “monsoon winds” from the southern Indian Ocean, while the heavy rainfall of the Kirmet season (June–September) comes from the Atlantic Ocean and is related to southwesterly winds. Several parts of the central, northern, and eastern highlands have short-season rain from March to April (Mera, 2018; Seleshi & Zanke, 2004).

3 Data Collection

The spatial distribution and geographical characteristics of the 13 selected in situ meteorological stations, which cover different parts of Ethiopia, are illustrated in Fig. 1. These 13 stations were selected because they receive the highest annual average rainfall (> 600 mm) during the period 1970–2005, as shown in Table 1. The accumulated monthly rainfall (mm) from rain gauges and the monthly mean maximum and minimum temperatures (°C) at the 13 stations were obtained from the Ethiopian National Meteorological Agency (ENMA) for the period from 1970 to 2005. The stations are coded and ranked in descending order of observed annual rainfall, as indicated in Table 1. The different sensors, instruments, and platforms used to measure the various meteorological elements are summarized in detail in Gultepe et al. (2019).

Table 1 Station geoinformation, codes, and descending rank according to the annual rainfall

Moreover, the gridded (0.5° × 0.5°) monthly dataset for accumulated precipitation (mm) and mean maximum and minimum temperatures (°C) during the period from 1970 to 2005 is obtained from the Climatic Research Unit (CRU) TS version 4 (Harris et al., 2020). The CRU TS, which is produced by the National Centre for Atmospheric Science (NCAS) at the University of East Anglia’s CRU, is one of the most widely used gridded climate datasets.
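
Comparing a 0.5° × 0.5° grid with point stations requires choosing, for each station, a grid cell to represent it. The sketch below illustrates one simple option, selecting the cell whose centre is closest to the station. It is an illustration only, not the software used in this study; the cell-centre coordinates and the station coordinates are assumptions introduced for the example.

```python
import numpy as np

def nearest_cell(lat, lon, grid_lats, grid_lons):
    """Return (row, col) of the grid-cell centre closest to a station.

    On a regular latitude-longitude grid the nearest centre can be found
    independently along each axis.
    """
    i = int(np.abs(np.asarray(grid_lats) - lat).argmin())
    j = int(np.abs(np.asarray(grid_lons) - lon).argmin())
    return i, j

# Assumed 0.5-degree cell centres covering roughly 3-15N and 33-48E.
grid_lats = np.arange(3.25, 15.0, 0.5)
grid_lons = np.arange(33.25, 48.0, 0.5)

# Hypothetical station coordinates (not taken from Table 1).
i, j = nearest_cell(9.03, 38.75, grid_lats, grid_lons)
print(grid_lats[i], grid_lons[j])
```

The monthly series of that cell would then be paired with the station record for the same months.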

4 Methods

To ensure a precise comparison, the CRU dataset is remapped to the geographical locations of the 13 chosen stations using the nearest-neighbor remapping (remapnn) operator in the Climate Data Operators (CDO) software. The precipitation and temperature values obtained from the CRU gridded dataset are then validated against the observed data using robust statistical measures. The main goal of this validation is to evaluate the accuracy of the CRU for these two parameters for use in SPEI estimation during the period from 1970 to 2005. The SPEI values estimated from the CRU and from observations are also compared for the Belg (B), Kirmet (K), and Belg–Kirmet (B–K) seasons. The statistical measures are the root mean square error (RMSE), the mean percentage error (MPE%), and the Pearson correlation coefficient (R). The RMSE, as in Eq. (1), accounts for the scatter of the error distribution (Lara-Fanego et al., 2012), while the MPE%, as in Eq. (2), indicates the accuracy of the CRU data (Gong et al., 2020), where small values mean that CRU values are close to the observed values. R, as shown in Eq. (3), evaluates the linear correlation between the observed and CRU values; it ranges from −1 to +1, where ±1 indicates a perfect correlation and zero indicates no linear relationship. The value of R is categorized according to Liu et al. (2020b) as: (i) an extremely strong correlation (R ≥ 0.8); (ii) a strong correlation (0.6 ≤ R < 0.8); and (iii) a weak correlation (R < 0.6).

$$\mathrm{RMSE}=\sqrt{\frac{1}{n}\sum_{i=1}^{n}\varepsilon_{i}^{2}}$$
(1)
$$\mathrm{MPE}\%=\frac{100}{n}\sum_{i=1}^{n}\frac{\varepsilon_{i}}{(x_{o})_{i}}$$
(2)
$$R=\frac{\sum_{i=1}^{n}\left((x_{o})_{i}-\overline{x}_{o}\right)\left((x_{c})_{i}-\overline{x}_{c}\right)}{\sqrt{\sum_{i=1}^{n}\left((x_{o})_{i}-\overline{x}_{o}\right)^{2}}\sqrt{\sum_{i=1}^{n}\left((x_{c})_{i}-\overline{x}_{c}\right)^{2}}}$$
(3)

where \(\varepsilon_{i}=(x_{c})_{i}-(x_{o})_{i}\) is the residual, i.e., the difference between the CRU (\(x_{c}\)) and observed (\(x_{o}\)) values, n is the number of values (time steps), i = 1, 2, …, n indexes the monthly time steps, and \(\overline{x}_{o}\) and \(\overline{x}_{c}\) are the long-term means of the observed and CRU data, respectively.
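
As an illustration of Eqs. (1)–(3) and of the correlation classes of Liu et al. (2020b), the following minimal sketch computes the three measures for a pair of series. The function names and the sample values are hypothetical and are not data from this study.

```python
import math

def rmse(obs, cru):
    """Root mean square error, Eq. (1), with residual = CRU - observed."""
    n = len(obs)
    return math.sqrt(sum((c - o) ** 2 for o, c in zip(obs, cru)) / n)

def mpe_percent(obs, cru):
    """Mean percentage error, Eq. (2); undefined if any observed value is 0."""
    n = len(obs)
    return 100.0 / n * sum((c - o) / o for o, c in zip(obs, cru))

def pearson_r(obs, cru):
    """Pearson correlation coefficient, Eq. (3)."""
    n = len(obs)
    mean_o, mean_c = sum(obs) / n, sum(cru) / n
    cov = sum((o - mean_o) * (c - mean_c) for o, c in zip(obs, cru))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_c = sum((c - mean_c) ** 2 for c in cru)
    return cov / (math.sqrt(var_o) * math.sqrt(var_c))

def correlation_class(r):
    """Classes used in the text: R >= 0.8, 0.6 <= R < 0.8, R < 0.6."""
    if r >= 0.8:
        return "extremely strong"
    if r >= 0.6:
        return "strong"
    return "weak"

# Hypothetical seasonal precipitation totals (mm), for illustration only.
obs = [100.0, 200.0, 300.0, 400.0]
cru = [110.0, 190.0, 330.0, 400.0]
print(rmse(obs, cru), mpe_percent(obs, cru), correlation_class(pearson_r(obs, cru)))
```

A positive MPE% indicates that CRU overestimates the observed values on average, and a negative MPE% indicates underestimation. Because Eq. (2) divides by the observed value, it cannot be evaluated for time steps in which the observation is zero.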

Furthermore, the C++ program developed by the Spanish Scientific Research Council (CSIC) is used to estimate the SPEI from either the observed or the CRU dataset. This program requires a text data file (.txt) containing the monthly time series of precipitation and mean temperature for the study period, together with the station latitude in decimal degrees. The program manual, source, and examples are available at http://digital.csic.es/handle/10261/10002. The sequential steps of the SPEI calculation, which is based on the monthly difference between precipitation and potential evapotranspiration, are described in detail by Vicente-Serrano et al. (2010).

In this study, the SPEI is computed at two timescales: (i) at 4 months, the value of the SPEI in May is chosen to represent the aggregated value during the Belg season, and the value of the SPEI in September is chosen to represent the aggregated value during the Kirmet season; (ii) at 8 months, the SPEI value in September is chosen to represent the aggregated value during both the Belg and Kirmet seasons together. The SPEI values are categorized according to Hayes et al. (1999) and grouped into wet, normal, and dry events, as shown in Table 2.
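
The choice of months can be illustrated with a short sketch of the accumulation step alone. It assumes a monthly series of the difference between precipitation and potential evapotranspiration, here called `d`, and sums it over trailing 4- and 8-month windows. The values are placeholders, and the standardization that turns these sums into SPEI values (Vicente-Serrano et al., 2010) is deliberately omitted.

```python
def trailing_sum(values, k):
    """k-month trailing sums; None where fewer than k months are available."""
    out = []
    for t in range(len(values)):
        if t >= k - 1:
            out.append(sum(values[t - k + 1:t + 1]))
        else:
            out.append(None)
    return out

# Placeholder series for one year, January..December: each value equals the
# month number so that the windows are easy to check by hand.
d = [float(m) for m in range(1, 13)]

acc4 = trailing_sum(d, 4)
acc8 = trailing_sum(d, 8)

MAY, SEP = 4, 8              # zero-based month indices
belg = acc4[MAY]             # February-May window
kirmet = acc4[SEP]           # June-September window
belg_kirmet = acc8[SEP]      # February-September window
print(belg, kirmet, belg_kirmet)
```

The 4-month window ending in May covers February–May, the 4-month window ending in September covers June–September, and the 8-month window ending in September covers February–September, which is why one value per year is enough to represent each season.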

Table 2 SPEI categories and event clustering
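
The mapping from an SPEI value to a category and a cluster can be sketched as follows. The thresholds used here are the commonly cited class limits for standardized indices (±1.0, ±1.5, ±2.0) and are an assumption of this sketch; the limits listed in Table 2 take precedence.

```python
def spei_category(value):
    """Seven-class category of an SPEI value (assumed class limits)."""
    if value >= 2.0:
        return "extremely wet"
    if value >= 1.5:
        return "very wet"
    if value >= 1.0:
        return "moderately wet"
    if value > -1.0:
        return "normal"
    if value > -1.5:
        return "moderately dry"
    if value > -2.0:
        return "severely dry"
    return "extremely dry"

def spei_cluster(value):
    """Three-class cluster: all wet categories, normal, all dry categories."""
    if value >= 1.0:
        return "wet"
    if value <= -1.0:
        return "dry"
    return "normal"

for v in (2.04, 1.59, 1.02, 0.3, -1.2, -1.7, -2.5):
    print(v, spei_category(v), spei_cluster(v))
```

Two values in different categories, for example 1.02 and 2.04, can therefore still belong to the same cluster.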

The success percentages for dry (SPD%), normal (SPN%), and wet (SPW%) events and the general accuracy (GA%) of occurrence (Sankaranarayanan et al., 2020; Sayad et al., 2021) are used to measure the agreement between the SPEI cluster events estimated from CRU and from observed data, by counting the events that CRU assigns to the same cluster as the observations, as shown in Eqs. (4) and (5), respectively.

$$\mathrm{SPD}\%=\frac{\mathrm{NCD}}{\mathrm{NOD}}\times 100,\quad \mathrm{SPN}\%=\frac{\mathrm{NCN}}{\mathrm{NON}}\times 100,\quad \mathrm{SPW}\%=\frac{\mathrm{NCW}}{\mathrm{NOW}}\times 100$$
(4)
$$\mathrm{GA}\%=\frac{\mathrm{NCD}+\mathrm{NCN}+\mathrm{NCW}}{\mathrm{NOD}+\mathrm{NON}+\mathrm{NOW}}\times 100$$
(5)

where NCD, NCN, and NCW are the numbers of dry, normal, and wet events from CRU that coincide with dry, normal, and wet events, respectively, in the observations, while NOD, NON, and NOW are the numbers of dry, normal, and wet events in the observed data.
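
A minimal sketch of Eqs. (4) and (5) is given below, assuming that the observed and CRU events are available as two equally long sequences of cluster labels for the same years. The function name and the sample labels are hypothetical.

```python
def success_percentages(obs, cru):
    """SPD%, SPN%, SPW% (Eq. 4) and GA% (Eq. 5) from paired cluster labels."""
    keys = (("dry", "SPD%"), ("normal", "SPN%"), ("wet", "SPW%"))
    result = {}
    total_hits = 0
    for label, key in keys:
        n_obs = sum(1 for o in obs if o == label)             # NOD, NON, NOW
        n_hit = sum(1 for o, c in zip(obs, cru)
                    if o == label and c == label)             # NCD, NCN, NCW
        total_hits += n_hit
        # Left undefined when no observed event of this cluster exists.
        result[key] = 100.0 * n_hit / n_obs if n_obs else None
    result["GA%"] = 100.0 * total_hits / len(obs)
    return result

# Hypothetical labels for eight years, for illustration only.
obs = ["dry", "dry", "normal", "normal", "normal", "normal", "wet", "wet"]
cru = ["dry", "normal", "normal", "normal", "normal", "wet", "wet", "normal"]
print(success_percentages(obs, cru))
```

In this example one of two observed dry events, three of four observed normal events, and one of two observed wet events are reproduced, so five of the eight events fall in the correct cluster.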

5 Results and Discussion

5.1 Statistical Evaluation of CRU

The correlation coefficient (R) is calculated to assess the strength of the relationship between the observed and CRU data for precipitation, temperature, and SPEI at the 13 selected stations during the B, K, and B–K seasons over the study period (1970–2005), as shown in Table 3. The strength of the relationship is classified as extremely strong (R ≥ 0.8), strong (0.6 ≤ R < 0.8), or weak (R < 0.6) according to Liu et al. (2020b). For precipitation, extremely strong correlations are detected at 92% (12) and 85% (11) of the stations during the B and B–K seasons, respectively, whereas the other stations have strong correlations in both B and B–K. The K season showed extremely strong and strong correlations at most stations, with weak correlations only at St8 (0.53) and St9 (0.55). For temperature, a strong correlation was dominant during the B, K, and B–K seasons. For SPEI, an extremely strong correlation was found at seven stations (54%) in each of the three seasons, followed by a strong correlation at six (46%), four (31%), and four (31%) stations during the B, K, and B–K seasons, respectively. A weak correlation appeared at only two (15%) stations during the K and B–K seasons. Overall, the extremely strong correlation is dominant at most stations for both precipitation and SPEI, while the strong correlation is dominant at most stations for temperature.

Table 3 Correlation coefficient (R) between observed and CRU for precipitation, temperature, and SPEI

The percentage (%) of stations that have extremely strong, strong, and weak correlations for the three variables (precipitation, temperature, and SPEI) during the B, K, and B–K seasons is illustrated in Fig. 2. For precipitation, the extremely strong correlation has the highest station percentage during the three seasons (B, K, and B–K), followed by the strong correlation. For temperature, the strong correlation has the highest station percentage, followed by the weak correlation in the B and K seasons and by the extremely strong correlation in the B–K season. For SPEI, the extremely strong correlation has the dominant percentage, followed by the strong and weak correlations, in the three seasons. According to Table 4, the statistical parameters (RMSE and MPE%) used to evaluate the accuracy of the CRU dataset against observations showed that, for precipitation, the CRU has an MPE ranging from −30 to 73% and an RMSE from 4 to 99 mm during the B, K, and B–K seasons. The largest MPE (73%), corresponding to an RMSE of 47 mm, occurred at St10 in the K season, while the smallest MPE (−30%), corresponding to an RMSE of 99 mm, is found at St1 in the K season. The CRU underestimates the precipitation values at St1, St2, St3, and St7 during all seasons, while it overestimates the precipitation values at the other stations (St4–St6 and St8–St13) during all seasons except at St4 in the K season.

Fig. 2 Percentage (%) of the stations based on the correlation strength during B, K, and B–K for the three variables

Table 4 The RMSE and the MPE% for precipitation, temperature, and SPEI

The results also showed that the CRU overestimates the temperature at St1, St2, St5, St12, and St13, with MPE ranging from 2 to 26% and RMSE from 0.7 to 4 °C during all seasons, except at St13 in the B–K season (MPE = −4% and RMSE = 1 °C). The CRU underestimates the temperature at the rest of the stations, with MPE ranging from −0.7 to −22% and RMSE from 0.8 to 5 °C during all seasons, except for overestimations at St3 in the B season (MPE = 6% and RMSE = 1.5 °C) and at St11 in the K season (MPE = 0.6% and RMSE = 0.9 °C). The largest MPE is 26%, with an RMSE of 4 °C, at St4 in the K season, while the smallest MPE is −0.7%, with an RMSE of 1 °C, at St11 in the B season. Furthermore, the SPEI computed from CRU is lower than that estimated from observations in most seasons, with MPE ranging from −3 to −49% and RMSE from 0.32 to 0.96 during all seasons, except for some overestimations during the B (St3, St4, St7, and St11), K (St7 and St8), and B–K (St4, St5, St6, and St11) seasons. The largest MPE for SPEI is 55% at St3 and the largest RMSE is 0.96 at St1, both in the B season, while the smallest MPE and RMSE are −49% at St1 and 0.28 at St11, respectively, in the B–K season.

5.2 SPEI Frequency Analysis

The total number of occurrences (frequency) for each SPEI category at the selected 13 stations during the three seasons (B, K, and B–K) over the period 1970–2005 based on observed data is shown in Table 5, while the frequency of the three SPEI clusters (wet, normal, and dry) is shown in Fig. 3. The SPEI in the normal cluster has the largest frequency (60–80%) during the three seasons at all stations, while the extremely wet and extremely dry categories have the smallest frequency (3–6%). The moderately wet, moderately dry, very wet, and severely dry categories have moderate frequency between normal and extreme events.

Table 5 Frequency of SPEI categories at all stations during the B, K, and B–K seasons over the period 1970–2005
Fig. 3 Frequency of each SPEI cluster during the (a) B, (b) K, and (c) B–K seasons over the period from 1970 to 2005

Furthermore, the frequency of SPEI dry cluster (17–22%) in the B season is greater than the wet cluster (11–17%) at 69% (St2–St6, St11, and St12) of the stations, while the frequency of SPEI wet cluster (BW) at St7, St8, St9, and St13 is more than the SPEI dry cluster (BD). The frequency of SPEI dry cluster in the K season (KD) is higher than the SPEI wet cluster (KW) at St2, St5, St8, St9, St10, and St12, and vice versa at St6, St7, St11, and St13, while they are identical at St1, St3, and St4. Also, the frequency of the SPEI dry cluster in the B–K season ([B–K] D) is higher than the SPEI wet cluster ([B–K] W) at St2, St4, St5, St8, St10, St11, and St12, and vice versa at St3, St6, St9, and St13, while they are identical at St1 and St7. During the three seasons, the SPEI dry cluster has high frequency at St2, St5, St10, St11, and St12, while the SPEI wet cluster has high frequency at St7 and St13.

5.3 CRU Reliability for SPEI Estimation

The long-term ranges of the SPEI estimated from CRU and from observed data for each season during the period 1970–2005 at all stations are given in Table 6. The range of the SPEI values estimated from CRU is nearly equal to that from observed data, with small differences (within −1 to +1) at all stations. Such small differences indicate that the SPEI estimated from CRU generally remains in the same SPEI cluster (wet, normal, or dry) as that from observation, although it may fall in a different SPEI category (extreme, severe, or moderate) within that cluster.

Table 6 SPEI long-term ranges from CRU and observation during the period from 1970 to 2005

The largest difference (1.62) in the top positive SPEI values between CRU (1.59, very wet) and observation (3.21, extremely wet) occurred at St8 in the K season, which is still in the same SPEI cluster (wet events). Also, at St1 in the B–K season, the top positive value of CRU is 1.02 (moderately wet), and from observation it is 2.04 (extremely wet), with a difference (1.02) in the category only but still in the same SPEI wet cluster. This confirms the accuracy of the CRU data in estimating SPEI values that are within the same SPEI cluster but may differ in the category. Figures 4, 5, and 6 show the estimated SPEI and its clusters across the study period (1970–2005) for the B, K, and B–K seasons, respectively.

Fig. 4 The estimated SPEI clusters from observations (blue line) and CRU (red line) in the Belg (B) season during the period from 1970 to 2005

Fig. 5 The estimated SPEI clusters from observations (blue line) and CRU (red line) in the Kirmet (K) season during the period from 1970 to 2005

Fig. 6 The estimated SPEI clusters from observations (blue line) and CRU (red line) in the Belg–Kirmet (B–K) season during the period from 1970 to 2005

Although these three figures confirm the robust performance and reliability of the CRU dataset for estimating SPEI clusters, CRU records a few SPEI cluster events outside the observed cluster (misses or false alarms). One such false alarm appears at St6 in 1972, which CRU records as a wet cluster in B, K, and B–K, whereas the observations give a normal cluster in B and a dry cluster in K and B–K. At St10, observed wet clusters appear as CRU dry clusters in 2001 in the B season, 1996 in the K season, and 2001 in the B–K season. Furthermore, some observed dry clusters are detected as CRU wet clusters, such as at St8 in 1985 and 1994 in the K and B–K seasons, respectively, and at St10 in 1987 in the B–K season. Observed dry clusters are also detected as CRU wet clusters in 1988 at St11 and 2001 at St12 in the K season, and at St13 in 1989 in the B season and 2000 in the B–K season. Additionally, some observed wet clusters are detected as CRU normal clusters in the Belg season: in 2000 at St1, 1989 at St6, 2000 at St7, 1978 at St8, 2001 at St9, 1971, 1976, 1979, 1983, 1987, and 1988 at St10, and 1980 at St13.

The success percentages of the three SPEI clusters (dry [SPD%], normal [SPN%], and wet [SPW%]) between CRU and observation during the three seasons at all stations are illustrated in Table 7. In the B season, the CRU extremely strongly estimates (≥ 80%) the corresponding observed dry events at St3, St6, and St7, normal events at St2, St4–St7, St11, and St12, and wet events at St1, St11, and St12. It also strongly estimates the dry events at St2, St4, St5, and St12, the normal events at St1, St3, St10, and St13, and the wet events at St2, St5, St7, and St13, while it weakly estimates the events at the rest of the stations.

Table 7 Success percentage (%) of the estimated cluster events from CRU against observation in the same month

In the K season, the CRU extremely strongly estimates the number of dry events at St1–St5 and St7, normal events at St2, St5, St6, St7, and St12, and wet events at St2 and St7, while it strongly estimates the number of dry events at St6 and St13, normal events at St1, St3, St4, St8–St11, and St13, and wet events at St6, St11, and St12. In the B–K season, the CRU extremely strongly estimates the number of dry events at St5, normal events at St5 and St11–St13, and wet events at St5, St11, and St12, while it strongly estimates the number of dry events at St6, St11, and St12, normal events at St1–St4, St6, St7, St9, and St10, and wet events at St4 and St6.

Furthermore, the general accuracy of the CRU in estimating the number of all (dry, normal, and wet) events as compared with observations at all stations during the three seasons over the study period is shown in Table 8. In the B season, the general accuracy of all events for CRU shows an extremely strong estimation at St2, St5–St7, and St12, with GA ranging from 81 to 92%, while it strongly estimates all events at St1, St3, St4, St11, and St13, with GA ranging from 69 to 78%, and weakly estimates all events at St8, St9, and St10, with GA ranging from 50 to 56%. In the K season, the CRU extremely strongly estimates all events at St2 and St5–St7, with GA ranging from 81 to 97%, while it has strong general accuracy at St1, St3, St4, St9, St12, and St13, with GA ranging from 63 to 75% and has a weak general accuracy at St8, St10, and St11, with GA ranging from 53 to 58%. In the B–K season, the CRU has extremely strong general accuracy at St5, St11, and St12, with GA ranging from 83 to 86%, while strong general accuracy appears at St2, St4, St6, St7, St9, St10, and St13, with GA ranging from 61 to 75%, and weak general accuracy at St1, St3, and St8, with GA ranging from 46 to 53%.

Table 8 General accuracy (%) of the estimated cluster events from CRU against observation in the same month

A contingency table is used to cross-tabulate the SPEI cluster events (dry, normal, and wet) estimated from CRU against those from observation and to display their joint frequency distribution, as given in Table 9. It shows how the events in each of the three CRU clusters are distributed among the three observed clusters. The diagonal cells in Table 9, for a given season and station, represent the correct hits, where the CRU cluster matches the observed cluster, while the other cells show false-alarm clusters.
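
The construction of such a table can be sketched as follows, assuming paired sequences of observed and CRU cluster labels for one station and season. Rows correspond to the observed cluster and columns to the CRU cluster; the labels are hypothetical.

```python
from collections import Counter

CLUSTERS = ("dry", "normal", "wet")

def contingency_table(obs, cru):
    """3 x 3 table of counts: rows = observed cluster, columns = CRU cluster."""
    counts = Counter(zip(obs, cru))
    return [[counts[(o, c)] for c in CLUSTERS] for o in CLUSTERS]

# Hypothetical labels for eight years, for illustration only.
obs = ["dry", "dry", "normal", "normal", "normal", "normal", "wet", "wet"]
cru = ["dry", "normal", "normal", "normal", "normal", "wet", "wet", "normal"]

table = contingency_table(obs, cru)
hits = sum(table[k][k] for k in range(len(CLUSTERS)))   # diagonal cells
for label, row in zip(CLUSTERS, table):
    print(label, row)
print("correct hits:", hits, "of", len(obs))
```

Dividing each diagonal cell by its row total gives the percentage of observed events of that cluster that CRU hits.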

Table 9 Contingency table for the SPEI clusters between CRU and observed data

The correct hits for the three clusters account for a larger percentage than the false-alarm clusters at all stations during the three seasons, except for wet events in the K season and wet and dry events in the B–K season. In the B season, the CRU hits all observed wet events (100%) at St11 and St12 and most (60–85%) of the observed wet events at St1, St2, St5, St7, and St13, while it hits from 38 to 50% of the observed wet events at the rest of the stations, except at St10, where it misses all observed wet events. The CRU hits most (55–100%) of the observed normal events at all stations, while it hits from 63 to 88% of the observed dry events at St2, St3, St4, St5, St6, St7, and St12 and from 43 to 57% at St8, St9, St10, St11, and St13 during the B season. In the K season, the CRU hits all observed wet events at St2 and St7, from 63 to 67% at St6, St11, and St12, and from 20 to 57% at St3, St4, St5, St8, St9, St10, and St13, while it does not hit any of the observed wet events at St1. The CRU hits more than 67% of the observed normal events at all stations, while it hits all (100%) of the observed dry events at St1, St2, St5, and St7, from 60 to 83% at St3, St4, St6, and St13, and from 17 to 57% at St8, St9, St10, St11, and St12 during the K season. In the B–K season, the CRU hits all observed wet events at St11 and from 63 to 83% at St4, St5, St6, and St12, while it hits only from 25 to 57% at St3, St7, St8, St9, St10, and St13 and does not hit any of the observed wet events at St1 and St2. The CRU hits about 57 to 95% of the observed normal events at all stations, from 63 to 88% of the observed dry events at St5, St6, St11, and St12, and from 17 to 50% at St2, St3, St4, St7, St8, St9, and St10, while it does not hit any of the observed dry events at St1 during the B–K season.

6 Conclusions

Drought is a major natural disaster that has direct and significant impacts on the environmental, social, and economic sectors due to the shortage of precipitation or water resources. In Ethiopia, drought occurs during the different rainy seasons, particularly the Belg (February–May) and Kirmet (June–September), in different regions of the country, when the rainfall drops below normal by about 30–50%. In this study, the SPEI is used to monitor dry and wet events over Ethiopia at 4- and 8-month timescales during the period from 1970 to 2005, based on both observations and the Climatic Research Unit (CRU) dataset.

The estimation of SPEI is performed at a 4-month timescale to represent each of the Belg (B) and Kirmet (K) seasons separately, and at an 8-month timescale to represent both the Belg and Kirmet (B–K) seasons together. The evaluation of both temperature and rainfall from the CRU dataset is accomplished at 13 in situ meteorological stations in the different geographical Ethiopian regions, to assess the CRU accuracy and its reliability for SPEI estimation across Ethiopia. The correlation coefficient (R), RMSE, and MPE% statistical procedures are used for evaluating CRU against observation during the B, K, and B–K over the study period (1970–2005). An extremely strong correlation (R ≥ 0.8) is found at most stations, followed by a strong correlation (0.6 ≤ R < 0.8) for precipitation, and a strong correlation at most stations followed by an extremely strong correlation for temperature. For SPEI, the extremely strong correlation is dominant at most stations in the B season, and the strong correlation is dominant in the K season, while during the B–K season the weak correlation (R < 0.6) occurs at about 46% of the stations as compared with strong (31%) and extremely strong (23%). Moreover, the CRU overestimates the precipitation values at the stations from St3 to St13, with MPE ranging from 0.1 to 123% and RMSE from 34 to 39 mm during the three seasons (B, K, and B–K), while it underestimates the precipitation at St1 and St2, with MPE ranging from −6.5 to −35% and RMSE from 39 to 163 mm during the three seasons.

The results also show that the CRU overestimates the temperature at St1, St2, St5, St12, and St13, while it underestimates the temperature at the rest of the stations. Furthermore, the SPEI computed from CRU is greater than that estimated from observations at St3, St4, St7, St8, St9, St10, and St12, with RMSE ranging from 0.5 to 0.8 during the three seasons, except for some underestimations during B and B–K. Additionally, CRU demonstrates a high success percentage (SP) in the estimation of SPEI clusters (dry, normal, and wet) as well as high general accuracy (GA) in the estimation of dry and wet clusters together. It is concluded that the CRU is almost extremely strong (SP ≥ 80%) in estimating the number of dry, normal, and wet events, with some slight overestimations of up to about 50% relative to the events estimated from the observed data during the three seasons. Also, the GA of the CRU for estimating the total number of both dry and wet events together at most stations during the three seasons is almost extremely strong (GA ≥ 80%), with some overestimations reaching 30% over the events estimated from the observed data. It is also concluded that the range of the SPEI values estimated from CRU is nearly equal to that from observed data, with little difference (within −1 to +1) at all stations, implying that the SPEI estimated from CRU may represent a different category (extreme, severe, or moderate) but is still within the same SPEI cluster (wet, normal, or dry), except for a few events. Finally, the contingency table reveals that the correct hits for the three SPEI clusters from CRU are mostly larger than the false-alarm clusters at all stations during the three seasons, except for wet events in the K season and wet and dry events in the B–K season.