Introduction

Zimbabwe is one of the poorest countries in the world. A global business magazine has ranked Zimbabwe as the second poorest country in the world; see https://bulawayo24.com/index-id-news-sc-national-byo-70943.html. Its economy in recent years has been battered by drought, sanctions, the AIDS pandemic, mass unemployment and hyperinflation. One of the major factors has been the lack of rainfall. Zimbabwe has experienced many periods of drought. The most recent drought, in December 2019, ignited the worst hunger crisis the country has faced in nearly a decade. In November 2019, farmers received only 55% of normal rainfall. The hunger crisis affected 2.2 million people in urban areas and 5.5 million in rural ones.

The aim of this paper is to provide the first statistical analysis of annual highest monthly rainfall in Zimbabwe. The following questions and more can then be answered: What are the wettest areas with respect to annual highest monthly rainfall? What are the driest areas with respect to annual highest monthly rainfall? Which areas are most variable with respect to annual highest monthly rainfall? Which areas are least variable with respect to annual highest monthly rainfall? The answers to these questions and more could lead to actions (for example, increased agricultural production in wet areas and planting of drought-resistant crops in dry areas) which may help to improve the economy of Zimbabwe.

To the best knowledge of the authors, there have been no papers on maximum rainfall in Zimbabwe. A related paper on minimum rainfall is due to Chikobvu and Chifurira1. Focusing on maximum rather than minimum rainfall is more meaningful, because minimum rainfall will be mostly zero for a country like Zimbabwe.

However, there have been several papers focusing on rainfall (not maximum rainfall) in specific regions of Zimbabwe. For example, Mooring et al.2 examined the effect of rainfall on tick challenge at Kyle-Recreational-Park, Zimbabwe; Gargett et al.3 examined the influence of rainfall on Black Eagle breeding in the Matobo Hills, Zimbabwe; Bourgarel et al.4 studied the effects of annual rainfall and habitat types on the body mass of impala in the Zambezi Valley; Muchuru et al.5 assessed variability of rainfall over the Lake Kariba catchment area in the Zambezi river basin; Sibanda et al.6 studied long-term rainfall characteristics in the Mzingwane catchment of south-western Zimbabwe; and so on.

Many papers have been published on extreme rainfall from several other African countries. These papers have been written mostly by scientists from the West, with no collaboration with scientists based in Africa; see, for example, Williams et al.7,8,9, Williams and Kniveton10, Pohl et al.11, De Paola et al.12, Woodhams et al.13 and Finney et al.14. This adds to the sickening attitude that the West has looked to Africa only for exploitation: not stopping with the slave trade, not stopping with colonization, not stopping with the stealing of minerals to make, among other things, computers, the West continues scientific exploitation of Africa at an alarming level; see Wiegand et al.15. This paper is part of a crusade initiated by the second author to empower Africans to conduct their own research; see http://educateafrica.org/.

There are also hundreds if not thousands of papers published on extreme rainfall from outside of Africa. See, for example, Douka and Karacostas16 for Greece, Moccia et al.17 for Italy, and Ng et al.18,19 for Malaysia. But this paper was motivated by the lack of such papers for Zimbabwe and the lack of such papers written by African scientists.

The contents of the paper are organized as follows. The “Data” section describes the data. The “Method” section describes the method used to analyze the data. The “Results and discussion” section presents the results and their discussion. The paper is concluded in the “Conclusions” section.

Data

The data are monthly rainfall in millimetres for 103 stations in Zimbabwe. The station names and years of record are given in Table 1. The locations of the stations are shown in Fig. 1. The stations give a good representation of the geography of Zimbabwe. A large number of stations are located around Harare and Bulawayo, the two largest cities; closeups of the stations around these cities are also shown in Fig. 1. The length of records is reasonable for most stations. The 10th, 25th, 50th, 75th and 90th percentiles of the record length are 51, 65, 94, 106 and 114 years, respectively.

The data were obtained from the Meteorological Services Department, Harare. Many stations for recording rainfall have been discontinued (indicated by \(*\) in Table 1) because of the deteriorating economy and lack of resources.

Table 1 Station numbers, station names and years of record.
Figure 1 Location of stations as identified by the station numbers in Table 1 (top); location of stations around Bulawayo, station number 10 (bottom left); location of stations around Harare, station number 35 (bottom right). ggplot2 version 3.3.5, https://cran.r-project.org/web/packages/ggplot2/index.html was used for plotting.

The missing values for each station accounted for less than 10% of the total period. They were treated as missing in the data analysis reported in the “Results and discussion” section. The R software20 used for statistical modeling accounts for missing values. Data from neighboring stations were compared to check for gross inconsistencies. Duplicated data values were checked to see if they were real. Values more than two standard deviations from the mean were also checked to see if they were real.

The annual highest monthly rainfall for each year was recorded as the maximum of the twelve monthly values. Some summary statistics (mean, median, skewness, kurtosis, standard deviation, range, minimum and maximum) of the annual highest monthly rainfall are shown in Figs. 2 and 3. The interpolation used in these two and later figures uses the function interp in the R package interp20. The interpolation is based on the algorithms developed by Akima21,22 and Renka23 which are widely used.
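The construction of the annual highest monthly rainfall and its summary statistics can be sketched as follows. This is a minimal Python illustration, not the authors' R code: the data layout (a dictionary from year to twelve monthly totals, with `None` for a missing month), the function names and the toy numbers are all assumptions made for the example, and skewness and kurtosis are computed as simple moment ratios.

```python
import statistics

def annual_highest_monthly(monthly_by_year):
    """Map each year to the maximum of its non-missing monthly totals (mm).

    Years with no recorded months are skipped (an assumption of this sketch).
    """
    maxima = {}
    for year, months in monthly_by_year.items():
        observed = [v for v in months if v is not None]
        if observed:
            maxima[year] = max(observed)
    return maxima

def summary(values):
    """Mean, median, standard deviation, range, moment skewness and kurtosis."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "mean": mean,
        "median": statistics.median(values),
        "sd": statistics.stdev(values),
        "range": max(values) - min(values),
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,  # equals 3 for a normal distribution
    }

# Made-up monthly totals (mm), January to December; None marks a missing month.
toy = {
    2001: [180.0, 150.5, 90.0, 20.0, 5.0, 0.0, 0.0, 0.0, 2.0, 30.0, 95.0, 160.0],
    2002: [120.0, None, 60.0, 15.0, 0.0, 0.0, 0.0, 0.0, 0.0, 25.0, 110.0, 210.0],
    2003: [95.0, 130.0, 70.0, 10.0, 0.0, 0.0, 0.0, 0.0, 5.0, 40.0, 80.0, None],
}
annual_max = annual_highest_monthly(toy)
stats = summary(list(annual_max.values()))
```

Kurtosis is reported here on the scale where the normal distribution has the value 3, which matches the comparison with the normal distribution made in the text.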

Figure 2 Mean (top left), median (top right), skewness (bottom left) and kurtosis (bottom right) of the annual highest monthly rainfall in Zimbabwe. ggplot2 version 3.3.5, https://cran.r-project.org/web/packages/ggplot2/index.html was used for plotting.

Figure 3 Minimum (top left), maximum (top right), standard deviation (bottom left) and range (bottom right) of the annual highest monthly rainfall in Zimbabwe. ggplot2 version 3.3.5, https://cran.r-project.org/web/packages/ggplot2/index.html was used for plotting.

According to mean annual highest monthly rainfall, the wettest areas are those around Harare and that between Masvingo and Mutare. The driest areas are those bordering Botswana and South Africa. The picture is similar for median annual highest monthly rainfall.

Skewness of annual highest monthly rainfall is generally positive. Areas surrounding Chirundu and Kariba give the largest skewness. A small area in the north of the country has negative skewness.

Kurtosis of annual highest monthly rainfall is not far from that of the normal distribution for much of the country. Kurtosis values in areas surrounding Chirundu and Kariba are the largest and correspond to much heavier tails than the normal distribution.

According to standard deviation of annual highest monthly rainfall, the most variable areas are those surrounding Chirundu and Kariba. A big area covering the two major cities, Harare and Bulawayo, shows the least variability. The picture is similar for the range of annual highest monthly rainfall.

According to the minimum of annual highest monthly rainfall, the wettest areas are those close to Chegutu and Harare. The driest areas are those close to the Zambian border and Bulawayo. According to the maximum of annual highest monthly rainfall, the wettest areas are those surrounding Chirundu and Kariba. The driest areas are those surrounding Bulawayo and Renco.

Method

Let X denote a random variable representing the annual highest monthly rainfall. According to extreme value theory (see Leadbetter et al.24, Resnick25 and Embrechts et al.26), the cumulative distribution function of X can be approximated by

$$\begin{aligned} \displaystyle F_X (x) = \exp \left[ -\left( 1 + \xi \frac{x - \mu }{\sigma } \right) ^{-1 / \xi } \right] \end{aligned}$$
(1)

for \(\mu - \sigma / \xi \le x < \infty \) if \(\xi > 0\), \(-\infty< x < \infty \) if \(\xi = 0\) and \(-\infty < x \le \mu - \sigma / \xi \) if \(\xi < 0\), where \(-\infty< \mu < \infty \) denotes a location parameter, \(\sigma > 0\) denotes a scale parameter and \(-\infty< \xi < \infty \) denotes a shape parameter. Note that if \(\xi > 0\) then X has a heavy tail bounded below by \(\mu - \sigma / \xi \). If \(\xi < 0\) then X has a short tail bounded above by \(\mu - \sigma / \xi \).
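The distribution function (1) and its support can be written out directly. The sketch below is a plain Python illustration (the function name `gev_cdf` is an assumption for this example); it treats \(\xi = 0\) as the Gumbel limit \(\exp \{ -\exp [ -(x - \mu ) / \sigma ] \}\) and returns 0 or 1 outside the support described above.

```python
import math

def gev_cdf(x, mu, sigma, xi):
    """CDF of the GEV distribution in (1); xi = 0 is handled as the Gumbel limit."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    if xi == 0:
        return math.exp(-math.exp(-z))
    t = 1.0 + xi * z
    if t <= 0:
        # Outside the support: below the lower bound (xi > 0)
        # or above the upper bound (xi < 0), both equal to mu - sigma / xi.
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))
```

For any value of the shape parameter, the function evaluates to \(\exp (-1)\) at \(x = \mu \), which is a convenient sanity check.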

The distribution in (1) is known as the generalized extreme value (GEV) distribution. The GEV distribution was fitted to the data in the “Data” section by the method of maximum likelihood; see Coles27 for details. The command fgev in the R package evd20,28 was used to compute the maximum likelihood estimates. Other distributions (for example, the normal distribution) may provide better fits to the annual highest monthly rainfall, but the GEV distribution is theoretically justified.
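For readers who want to see what is being maximized, the following Python sketch writes out the GEV log-density and the negative log-likelihood of a sample. It is not the fgev implementation: the function names and the toy sample are assumptions for illustration, no optimizer is included, and any general-purpose minimizer could be applied to `gev_negative_log_likelihood`.

```python
import math

def gev_log_density(x, mu, sigma, xi):
    """Log-density of the GEV distribution; -inf outside the support."""
    z = (x - mu) / sigma
    if xi == 0:
        # Gumbel limit of the GEV density.
        return -math.log(sigma) - z - math.exp(-z)
    t = 1.0 + xi * z
    if t <= 0:
        return -math.inf
    return -math.log(sigma) - (1.0 + 1.0 / xi) * math.log(t) - t ** (-1.0 / xi)

def gev_negative_log_likelihood(params, sample):
    """Negative log-likelihood of (mu, sigma, xi) for independent observations."""
    mu, sigma, xi = params
    if sigma <= 0:
        return math.inf
    total = 0.0
    for x in sample:
        total += gev_log_density(x, mu, sigma, xi)
    return -total

# Made-up annual maxima (mm), for illustration only.
toy_maxima = [130.0, 180.0, 210.0, 155.0, 240.0, 170.0]
nll_reasonable = gev_negative_log_likelihood((160.0, 40.0, -0.1), toy_maxima)
nll_poor = gev_negative_log_likelihood((400.0, 40.0, -0.1), toy_maxima)
```

A parameter vector that places the bulk of the distribution far from the data, as in `nll_poor`, yields a much larger negative log-likelihood than one centred on the data.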

Let \(\widehat{\mu }\), \(\widehat{\sigma }\) and \(\widehat{\xi }\) denote the maximum likelihood estimates of \(\mu \), \(\sigma \) and \(\xi \), respectively. A quantity of interest based on (1) is the T-year return level loosely interpreted as the annual highest monthly rainfall expected on average once in every T years. Let \(x_T\) denote the T-year return level corresponding to (1). It must satisfy

$$\begin{aligned} \displaystyle F_X \left( x_T \right) = 1 - \frac{1}{T}. \end{aligned}$$
(2)

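Inverting (2) and replacing the parameters by their maximum likelihood estimates gives, for \(\widehat{\xi } \ne 0\),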

$$\begin{aligned} \displaystyle x_T = \widehat{\mu } + \frac{\widehat{\sigma }}{\widehat{\xi }} \left\{ \left[ -\log \left( 1 - \frac{1}{T} \right) \right] ^{-\widehat{\xi }} - 1 \right\} . \end{aligned}$$
(3)

See equation (3.4) in Coles27.
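The return level formula (3) is easy to check numerically. The Python sketch below is illustrative only: the function name and the parameter values are made up, and the case \(\widehat{\xi } = 0\) is handled through the Gumbel limit \(\widehat{\mu } - \widehat{\sigma } \log [ -\log (1 - 1/T) ]\).

```python
import math

def return_level(T, mu, sigma, xi):
    """T-year return level from (3); xi = 0 uses the Gumbel limit."""
    if T <= 1:
        raise ValueError("T must exceed 1")
    y = -math.log(1.0 - 1.0 / T)
    if xi == 0:
        return mu - sigma * math.log(y)
    return mu + (sigma / xi) * (y ** (-xi) - 1.0)

# Made-up parameter estimates (mm), for illustration only.
mu_hat, sigma_hat, xi_hat = 150.0, 40.0, -0.1
levels = {T: return_level(T, mu_hat, sigma_hat, xi_hat) for T in (2, 5, 10, 20, 50, 100)}
```

Substituting a computed return level back into (1) should return \(1 - 1/T\), and for a negative shape parameter every return level stays below the upper bound \(\widehat{\mu } - \widehat{\sigma } / \widehat{\xi }\).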

Ethical approval

All authors kept the ‘Ethical Responsibilities of Authors’.

Consent to participate

All authors gave explicit consent to participate in this study.

Consent to publish

All authors gave explicit consent to publish this manuscript.

Results and discussion

The GEV distribution was fitted to the annual highest monthly rainfall data from each of the 103 stations. The estimates \(\widehat{\xi }\) were found to be positive for sixteen of the 103 stations: Beitbridge, Bikita Agric, Buffalo Range, Buhera, Chisumbanje, Glendale Rail, Kezi, Lupane, Matopos Research Station, Middle Sabi Tanganda, Mphoengs, Nyamadhlovu, Rukomechi, Sawmills, Tashinga and West Nicholson. The fitted distribution of annual highest monthly rainfall for these stations is heavy tailed, meaning that it has no finite upper bound. The distribution of annual highest monthly rainfall for the remaining eighty-seven stations is short tailed and is bounded above by \(\widehat{\mu } - \widehat{\sigma } / \widehat{\xi }\), which will be referred to as the probable maximum of annual highest monthly rainfall. The estimates of the probable maximum of annual highest monthly rainfall and their standard errors are given in Table 2.
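The probable maximum and one common way of attaching a standard error to it can be sketched in Python as follows. The text does not state how the standard errors in Table 2 were obtained; the delta method used here is an assumption of this sketch, as are the function names, the parameter values and the covariance matrix.

```python
import math

def probable_maximum(mu, sigma, xi):
    """Upper end point mu - sigma / xi for xi < 0; unbounded otherwise."""
    if xi >= 0:
        return math.inf
    return mu - sigma / xi

def probable_maximum_se(sigma, xi, cov):
    """Delta-method standard error (an assumed approach, not from the paper).

    cov is the 3 x 3 covariance matrix of the estimates of (mu, sigma, xi).
    """
    # Gradient of g(mu, sigma, xi) = mu - sigma / xi.
    grad = [1.0, -1.0 / xi, sigma / xi ** 2]
    var = sum(grad[i] * cov[i][j] * grad[j] for i in range(3) for j in range(3))
    return math.sqrt(var)

# Made-up estimates and covariance matrix, for illustration only.
mu_hat, sigma_hat, xi_hat = 150.0, 40.0, -0.2
cov_hat = [
    [25.0, 0.0, 0.0],
    [0.0, 16.0, 0.0],
    [0.0, 0.0, 0.0025],
]
pm = probable_maximum(mu_hat, sigma_hat, xi_hat)
pm_se = probable_maximum_se(sigma_hat, xi_hat, cov_hat)
```

Because the gradient contains the term \(\sigma / \xi ^2\), a shape estimate close to zero inflates the standard error sharply, which is one reason large standard errors can accompany large probable maxima.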

Table 2 Estimates and standard errors of probable maximum of annual highest monthly rainfall.

The largest probable maximum of annual highest monthly rainfall is for Plumtree and the second largest is for Mutare Fire, but both have large standard errors. The smallest probable maximum is for Rupike and the second smallest is for Bulawayo Goetz.

In parallel to Table 2, the 100-year return levels of annual highest monthly rainfall for all of the stations were also computed. These estimates and their standard errors are given in Table 3. The largest return level is for Rukomechi and the second largest is for Chisengu, but the latter has a large standard error. The smallest return level is for Rupike and the second smallest is for Tuli Police.

Table 3 Estimates and standard errors of 100-year return level of annual highest monthly rainfall.

However, many of the locations in Tables 2 and 3 have large standard errors relative to the estimates of the probable maximum or 100-year return level. In Table 2, they are Acturus Mine, Centenary, Chimanimani DA, Chisengu, Chivhu, Eiffel Flats Blue, Forthergill, Gwanda Rail, Harare Airport, Harare Belvedere, Hwange National Park, Kwekwe, Makoholi, Mberengwa DA, Mutare Fire, Mvuma Arex, Nkayi, Odzi Police Rail, Plumtree, Rusape, Rutenga, Selous, Tuli Police, Trelawney West Enton, Tsholotsho and Victoria Falls. In Table 3, they are Buffalo Range, Buhera, Centenary, Chimanimani DA, Chisengu, Chisumbanje, Forthergill, Glendale Rail, Gwanda Rail, Harare Belvedere, Hwange National Park, Kezi, Lupane, Makoholi, Matopos Research Station, Mberengwa DA, Middle Sabi Tanganda, Mphoengs, Mutare Fire, Mvuma Arex, Nkayi, Nyamadhlovu, Odzi Police Rail, Plumtree, Rutenga, Tashinga, Tuli Police, Trelawney West Enton, Tsholotsho and West Nicholson. The conclusions for these locations should be treated with caution.

The fit of the GEV distribution for each station was checked by probability plots, quantile plots and the Kolmogorov–Smirnov test. The plots are shown in Figs. 4 and 5 for two of the stations. The plots were similar for other stations. The p-values of the Kolmogorov–Smirnov test for the two stations were 0.081 and 0.078. The p-values for other stations were greater than 0.05 too. Hence, the GEV distribution provides an adequate fit for all stations.
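The Kolmogorov–Smirnov check can be sketched in Python as below. This is an illustration rather than the authors' procedure: the function names are assumptions, the p-value uses the standard asymptotic series with a small-sample correction to the statistic, and it ignores the fact that the parameters were estimated from the same data, so it should be read as approximate.

```python
import math

def ks_statistic(sample, cdf):
    """Largest distance between the empirical CDF and a fitted CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

def ks_pvalue(d, n, terms=100):
    """Approximate p-value from the asymptotic Kolmogorov distribution."""
    lam = (math.sqrt(n) + 0.12 + 0.11 / math.sqrt(n)) * d
    if lam < 0.2:
        # The tail probability is indistinguishable from 1 in this region.
        return 1.0
    s = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
            for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * s))

# Toy check against the uniform distribution on [0, 1].
toy_sample = [0.1, 0.3, 0.5, 0.7, 0.9]
d_toy = ks_statistic(toy_sample, lambda x: min(1.0, max(0.0, x)))
p_toy = ks_pvalue(d_toy, len(toy_sample))
```

In practice the `cdf` argument would be the fitted GEV distribution function for the station, and a p-value above 0.05 would be read as no evidence against the fit.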

Figure 4 Probability (left) and quantile (right) plots for Bulawayo airport with 95% simulated confidence intervals (dashed lines). R software, version 4.1.2, https://www.r-project.org/ was used for plotting.

Figure 5 Probability (left) and quantile (right) plots for Harare airport with 95% simulated confidence intervals (dashed lines). R software, version 4.1.2, https://www.r-project.org/ was used for plotting.

After the goodness of fit had been checked, (3) was computed for every station and a range of values of T. Plots of \(x_T\) for \(T = 2, 5, 10, 20, 50, 100\) years are shown in Figs. 6 and 7.

Figure 6 Estimates of 2-year return level (first row, left), 2-year return level 10 years ahead (first row, middle), 2-year return level 20 years ahead (first row, right), 5-year return level (second row, left), 5-year return level 10 years ahead (second row, middle), 5-year return level 20 years ahead (second row, right), 10-year return level (third row, left), 10-year return level 10 years ahead (third row, middle) and 10-year return level 20 years ahead (third row, right). ggplot2 version 3.3.5, https://cran.r-project.org/web/packages/ggplot2/index.html was used for plotting.

Figure 7 Estimates of 20-year return level (first row, left), 20-year return level 10 years ahead (first row, middle), 20-year return level 20 years ahead (first row, right), 50-year return level (second row, left), 50-year return level 10 years ahead (second row, middle), 50-year return level 20 years ahead (second row, right), 100-year return level (third row, left), 100-year return level 10 years ahead (third row, middle) and 100-year return level 20 years ahead (third row, right). ggplot2 version 3.3.5, https://cran.r-project.org/web/packages/ggplot2/index.html was used for plotting.

According to the 2-year return level, the wettest areas are those around Shurugwi, those around Harare and that between Masvingo and Mutare. The driest areas are those bordering Botswana and South Africa. The picture for the 5-year return level is similar, but the wettest regions are smaller compared to those for the 2-year return level.

According to the 10-year return level, the wettest areas are those around Shurugwi, an area between Masvingo and Mutare and a northern area bordering Zambia. The driest areas are once again those bordering Botswana and South Africa. The picture for the 20-year return level is similar, but the wettest regions are smaller compared to those for the 10-year return level.

According to the 50-year return level, the wettest area is a northern area bordering Zambia. The driest areas are once again those bordering Botswana and South Africa. The picture for the 100-year return level is similar, but the wettest region is smaller compared to that for the 50-year return level.

Finally, trends in the annual highest monthly rainfall were investigated for each station. The distribution (1) with the location parameter \(\mu = a + b \times \mathrm{Year}\) was fitted, where b is the trend parameter. The significance of the trend was assessed by comparing the fit of this model with the earlier fit of the GEV distribution with constant \(\mu \). Models like \(\mu = a + b \times \mathrm{Year} + c \times \mathrm{Year}^2\) and \(\mu = \exp \left( a + b \times \mathrm{Year} \right) \) were also fitted, but they did not provide significantly better fits. The methodology used for fitting models like \(\mu = a + b \times \mathrm{Year}\) is described in Chapter 6 of Coles27.
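One standard way to compare nested fits of this kind is the likelihood ratio (deviance) test. The text does not spell out the exact test used, so the Python sketch below is an assumption about the procedure, with a hypothetical function name and made-up log-likelihood values; it uses the fact that, for one extra parameter, the deviance is compared with a chi-square distribution with one degree of freedom.

```python
import math

def lr_test_one_parameter(loglik_null, loglik_alt):
    """Deviance and p-value for a nested model with one extra parameter.

    loglik_null: maximized log-likelihood with constant location.
    loglik_alt:  maximized log-likelihood with location a + b * Year.
    """
    deviance = max(2.0 * (loglik_alt - loglik_null), 0.0)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(deviance / 2.0))
    return deviance, p_value

# Made-up maximized log-likelihoods, for illustration only.
deviance, p_value = lr_test_one_parameter(-520.3, -517.1)
trend_is_significant = p_value < 0.05
```

With these illustrative numbers the deviance is about 6.4, which exceeds the 5% critical value of roughly 3.84, so the trend would be declared significant.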

Table 4 Stations exhibiting significant trends in the location parameter.

Table 4 lists the station names, the estimates of a and b, and the corresponding p-values, all of which are less than 0.05 and hence indicate significant trends. For the stations not listed in Table 4, the p-values were greater than 0.05, hence trends were not significant. Only 15 of the 103 stations exhibit significant trends. Of the 15 stations, 12 exhibit negative trends. These stations are plotted in red in Figs. 6 and 7. The remaining 3 stations exhibit positive trends. These stations are plotted in blue in Figs. 6 and 7. The return level estimates 10 years ahead and 20 years ahead of the data records for \(T = 2, 5, 10, 20, 50, 100\) years are also shown in Figs. 6 and 7. The return level estimate m years ahead of the data records was computed using

$$\begin{aligned} \displaystyle x_T = \widehat{a} + \widehat{b} \left( \text{ Last } \text{ year } \text{ of } \text{ records } + m \right) + \frac{\widehat{\sigma }}{\widehat{\xi }} \left\{ \left[ -\log \left( 1 - \frac{1}{T} \right) \right] ^{-\widehat{\xi }} - 1 \right\} . \end{aligned}$$
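The projection formula above can be illustrated with a short Python sketch. The function name and all parameter values are made up for the example; the only change from the stationary return level is that the location is replaced by the fitted trend line evaluated m years after the last year of records.

```python
import math

def return_level_with_trend(T, a, b, sigma, xi, last_year, m):
    """T-year return level m years after the last year of records."""
    if T <= 1:
        raise ValueError("T must exceed 1")
    mu_future = a + b * (last_year + m)
    y = -math.log(1.0 - 1.0 / T)
    if xi == 0:
        # Gumbel limit of the return level formula.
        return mu_future - sigma * math.log(y)
    return mu_future + (sigma / xi) * (y ** (-xi) - 1.0)

# Made-up estimates for a station with a negative trend, for illustration only.
a_hat, b_hat, sigma_hat, xi_hat = 1150.0, -0.5, 40.0, -0.1
now = return_level_with_trend(100, a_hat, b_hat, sigma_hat, xi_hat, 2011, 0)
in_10 = return_level_with_trend(100, a_hat, b_hat, sigma_hat, xi_hat, 2011, 10)
in_20 = return_level_with_trend(100, a_hat, b_hat, sigma_hat, xi_hat, 2011, 20)
```

Because only the location depends on the year, every return level shifts by the same amount, \(\widehat{b} \times m\), regardless of T; with the illustrative slope of -0.5 mm per year the 100-year level drops by 5 mm per decade.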

The general pattern is that the weather is getting drier with time. However, the changes are statistically significant only at the 15 stations.

The negative trends may be due to climate change or other factors. But this must be treated with caution because seven of the fifteen stations have limited data: Gwanda Rail (1909–2011), Headlands Rail (1916–2013), Hwange Rail (1909–2011), Marula West (1909–2014), Rutenga (1955–2006), Rugare Tengwe Thurlaston (1952–2002) and Tuli Police (1898–2001).

Conclusions

This paper has provided the first statistical analysis of maximum rainfall in Zimbabwe involving data from 103 stations. The generalized extreme value distribution was shown to provide an adequate fit (as assessed by probability plots, quantile plots and Kolmogorov–Smirnov tests) to data from each station. Twelve of the stations (Beatrice Post Office, Chimanimani DA, Concession, Gwanda Rail, Headlands Rail, Hwange Rail, Lusulu, Rutenga, Shurugwi, Rugare Tengwe Thurlaston, Tuli Police and Victoria Falls) exhibit significant negative trends in maximum rainfall. Three of the stations (Chisengu, Marula West and Murehwa) exhibit significant positive trends in maximum rainfall. The remaining stations do not exhibit significant trends.

The wettest areas with respect to 2-year and 5-year return levels are those around Shurugwi, those around Harare and that between Masvingo and Mutare. The wettest areas with respect to 10-year and 20-year return levels are those around Shurugwi, an area between Masvingo and Mutare and a northern area bordering Zambia. The wettest area with respect to 50-year and 100-year return levels is a northern area bordering Zambia. Zimbabwe has taken measures to make good use of the wettest areas. For example, recently built dams include the Mutange dam built in the Gokwe area in 2016, the Tokwe Mukorsi dam built in the Masvingo area in 2017 and the Kunzvi dam built in the Goromonzi district in 2021.

The driest areas are those bordering Botswana and South Africa. Drought-resistant crops (including sunflower, millet, sorghum, bambara nuts and groundnuts) are being grown in these and other areas. Farmers are also using water-saving “drip irrigation” methods to grow crops. According to Wikipedia, drip irrigation is a “type of micro-irrigation system that has the potential to save water and nutrients by allowing water to drip slowly to the roots of plants, either from above the soil surface or buried below the surface”.

The results presented in this paper can inform positive actions by the Government of Zimbabwe: for example, more vegetables and other commodities less reliant on rain can be planted in areas showing negative trends; increased agricultural and electricity production based on water can take place in the wettest areas; increased electricity production based on solar energy can take place in the driest areas; and so on.