1 Introduction

Rainfall Prediction is one of the most difficult research topics globally (Oswal, 2019), but it is extremely important in water resource engineering for proper management of floods and droughts (Gao et al., 2020). Modeling and predicting rainfall is crucial for generating data as well as for providing information, which can then be used in a variety of applications, such as water resource management, hydrology, and agriculture (He et al., 2022). The prediction of rainfall and other climate conditions requires various models, depending on the time and spatial scales involved (Yusuf et al., 2014). For analyzing, simulating and forecasting hydrological variables, a variety of models, methods and techniques are found in literature review, e.g., Autoregressive Integrated Moving Average (ARIMA), Artificial Neural Network (ANN), Nearest-Neighbors (NN), Fuzzy System, Numerical Weather Prediction model, and Holt’s method (Holt, 1957). These methods, models and techniques can be selected typically based on the goals of research, the accessibility of input data, the quality of models, and certain predefined assumptions (Makridakis et al., 1998).

To guarantee a high degree of accuracy, researchers have evaluated the properties and characteristics of different models, so as to determine whether a certain model is appropriate for application in a given real-world situation. As a result, model selection becomes one of the major factors that influence the precision of the prediction data series. For example, Brath et al. (2002) used three models – ANN, ARIMA and NN – to enhance the prediction of the occurrence of floods caused by rainfall. With six hours' worth of rainfall data, it was found that ANN was the best at predicting rainfall. In a study by Kottegoda et al. (2004), a first-order Markov chain model successfully fitted the observed rainfall data in Italy; the model was built under the presumption that daily rainfall was dependent on the amount of rain that had fallen down the day before. Also using a Markov chain model, Barkotulla (2010) generated the daily rainfall occurrence based on transitional probability matrices; the model's parameters were acquired from historical daily rainfall records from 1980 to 2009. The results revealed that the model could successfully generate rainfall data.

Moreover, Liu et al. (2011) applied a Markov chain model to predict the daily rainfall series in 2004 based on the daily rainfall data collected in 2002 and 2003 in Tianjin, China. Because the predicted results met practical requirements, the model could be used as a weather generator to produce rainfall data for future periods. Chung et al. (2016) also used a Markov chain model for hourly rainfall data collected from 1985 to 2014 in Korea, finding that the rainfall occurrence was successfully fitted with the results of the Markov chain model. In addition, Fadhil et al. (2016) developed a stochastic rainfall generator model based on a first-order Markov chain model, by employing the data of a rainfall time series collected from 1976 to 2006 in Northwest Perak, Malaysia, with a two-state model (i.e., for dry and wet conditions) taken into account. In a word, first-order Markov models are generally deemed to be satisfactory, so it is justifiable to use such models to produce future rainfall series under various climate change scenarios.

Moreover, Gui and Shao (2017) and Zhou et al. (2017) applied certain Markov chain models to predict annual rainfall data series in Dangshan County and Shandong Province, China, respectively, finding that their proposed models had good prediction accuracy. Mahanta et al. (2018) applied a Markov chain model to the daily rainfall data collected at two stations (Dhaka and Chittagong) in Bangladesh. Examination of the behavior of the then-current daily rainfall data revealed that 56% of the days from May to October in Dhaka station are rainy, while 58% of the days in Chittagong are sunny. Malakoutian et al. (2021) used the rainfall data in six meteorological regions of North Cyprus from 1975 to 2014 to predict each meteorological region's yearly rainfall 5 years ahead. Three different models (i.e., Markov, ARIMA, and Holt-Winter) were adopted. The selected model for each region was then used to predict the rainfall for the five successive hydrologic years from 2014–2015 to 2018–2019.

Although most of the earlier rainfall forecasting studies have noted the effectiveness of the Markov model in forecasting rainfall, few studies have compared the outcomes of the prediction of Markov probability matrixes using various rainfall states with the outcomes of Markov chain models for forecasting rainfall in future periods. To address the gaps in previous research, this study sets the followings goals: (1) To provide additional insights into the changes in rainfall patterns; (2) To calculate the length of time needed for finding the steady-state probabilities in forecasting rains; (3) To predict and forecast rainfall in future periods; (4) To show how to use the first-order Markov chain model to create annual rainfall data for future times. This study makes the following research contributions: (1) A novel approach for creating prediction models is proposed using different rainfall states based on the Markov model; (2) The ability of the Markov model to predict and generate time series data is proved. The methodology used in this study can be applied to other regions of Iraq as well as to other nations.

2 Area and data of this study

Located in northern Iraq, and bordering Iran, Syria and Turkey, the Kurdistan Region comprises three main governorates: Duhok, Erbil and Sulaymaniya (Danilovich, 2016). With a Mediterranean climate (Hajani & Klari, 2022), the Kurdistan Region has hot and dry summers, but its temperature is mild in winter, with a very attractive spring season (WCG, 2019). Its average annual temperature is 32 °C, with about 71 mm of precipitation in a year (WCK, 2021). The daily average maximum temperature in Kurdistan goes up to 45 °C in the summer (in July) and the minimum goes down to 11 °C in January (Aziz et al., 2022). It is dry for 315 days across a year, with an average humidity of 26% and a UV index of 7 (Hajani et al., 2022). This study covers the historical daily rainfall data series for 32 years in the period of 1990–2021 at three rainfall stations in the Kurdistan region (see Fig. 1). The daily rainfall data are analyzed to identify the maximum precipitation in a year (i.e., 365 days), so as to use in the analysis of this study.

Fig. 1
figure 1

The location map of the study area

Table 1 shows the descriptive statistics of the three adopted stations, including their variance, standard deviation (Std. D.), coefficient of variation (C.V.), variance, coefficient of skewness (SK), and kurtosis. The mean AMR varies between 48.179 mm in the northern part of the study area (i.e., the Duhok station) and 62.998 mm in the southwestern part (i.e., the Sulaymaniya station). The minimum (Min) and maximum (Max) in Table 1 show that the lowest amount of rainfall (23.9 mm in 1996) occurred at the Erbil station and an extremely high amount occurred at the Duhok station (150 mm in 1993). The highest values of Std. D., C.V., Variance and Kurtosis are discovered at the Duhok station, indicating a significant variation in the rainfall data; and the highest Kurtosis value in the data set tends to have a clear peak close to the mean. The Sulaymaniya station has the highest value of SK, which clearly shows that this station is highly skewed, and its asymmetric tail also extends to the right of the mean value.

Table 1 The statistical description of the Duhok, Erbil and Sulaymaniya stations

3 Methodologies

3.1 Markov probability model

Markov analysis is a scientific method for studying and analyzing a phenomenon in the current period, so as to predict its behavior in the future (Kenton, 2021). The Markov process is a type of stochastic process that can, in essence, predict a random variable based solely on the current circumstances surrounding the variable, without the need to know the past (Rykov et al., 2010). The Markov chain contributes to situation forecasting by identifying a phenomenon from one period to the next with a Markov Matrix, which is known as Transition Probabilities Matrix (TPM) from the prior case (Jimoh & Webster, 1996). According to Eq. (1), the conditional prediction of any future state (\({X}_{n+1}\)) by using a Markov chain model is independent of the past state (\({X}_{0}\), …, \({X}_{n-1}\)), but depend only on the present state (\({X}_{n}\)) (Yusuf et al., 2014).

$$P\left({X}_{n+1}=j\mid {X}_{n}=i,{X}_{n-1}={i}_{n-1},\dots ,{X}_{1}={i}_{1},{X}_{0}={i}_{0}\right)=P\left({X}_{n+1}=j\mid {X}_{n}=i\right)={P}_{ij}.$$
(1)

In this study, the first-order Markov chain model was used. The three states adopted include: decrease—"d"; stability—"s"; and increase—"i". The flowchart of Markov modeling adopted in this study is given in Fig. 2. The transition probability matrix (Markov matrix) is defined as \({P}_{ij}\equiv P(j\mid i)\), where \(i,j\in\) the states. The Markov matrix is formed through observing the time series and identifying the number of transitions from one state to another for the three states under study ("d", "s", and "i"). In this study, depending on the AMR data for the three adopted stations, the transition forms of the Markov matrix are modeled as below:

  1. 1.

    State decrease—"d": the AMR data are below 50 mm.

  2. 2.

    State stability—"s": the AMR data are between 50 mm and 60 mm.

  3. 3.

    State increase—"i": the AMR data are above 60 mm.

Fig. 2
figure 2

The procedure of applying the Markov model

The transition between the states is described by the transition diagram and probability matrix (\(P\)) below:

$$P=\left[\begin{array}{ccc}{P}_{11}& {P}_{12}& {P}_{13}\\ {P}_{21}& {P}_{22}& {P}_{23}\\ {P}_{31}& {P}_{32}& {P}_{33}\end{array}\right].$$

The Maximum Likelihood Estimator (MLE) is used for estimating the transition probabilities in the Markov matrix (Jale et al., 2019; Zhang et al., 2014). The sum of the probabilities of the \(P\) matrix in each row must equal one. The MLE process involves dividing the elements in a row of the \(P\) matrix by the sum of all the elements in that row, so as to estimate each element in the Markov matrix.

3.2 Test for goodness of fit

In order to predict the AMR data series by using the Markov chain model, the validity of the proposed three states (i.e., decrease, stability and increase of rainfall) used in the Markov chain approach was tested in the following way: The null hypothesis H0: rainfall occurrences in consecutive years are independent; vs. The alternative hypothesis H1: rainfall occurrences in consecutive years are not independent (Garg & Singh, 2010). For the three suggested states in the Markov chain, two tests for goodness of fit, namely the WS test and the Chi-Squared (\(\chi^{2}\)) test (Wang & Martiz, 1990; Preacher, 2001; Jale et al., 2019), were made, as given in Eqs. (2) and (3) below:

$$\mathrm{WS}=\frac{A+B-1}{\sqrt{V(A+B-1)}} \quad N\left(\mathrm{0,1}\right),$$
(2)

where \(A={P}_{dd}+{P}_{ss}+{P}_{ii}; B={P}_{id}{P}_{di}+{P}_{si}{P}_{is}+{P}_{ds}{P}_{sd}-{P}_{dd}{P}_{ss}-{P}_{dd}{P}_{ii}-{P}_{ss}{P}_{ii}\). The variance (V) of \(\left(A+B-1\right)\) in Eq. (1) is given by\(V\left(A+B-1\right)=2{p}_{1}{p}_{2}{p}_{3}\left(\frac{1}{{n}_{d}{n}_{s}}+\frac{1}{{n}_{s}{n}_{i}}+\frac{1}{{n}_{i}{n}_{d}}\right)\), where \({n}_{d},{n}_{s}\) and \(, {n}_{r}\) are the numbers of states used in this study (i.e., "d", "s", and "i" states), while \({p}_{1},{p}_{2}\) and \({p}_{3}\) indicate the stationary probabilities, specifically:\({p}_{1}=[\left(1+p\right)+\frac{\left(1+s\right)p}{q{]}^{-1}}; {p}_{2}=\left[r+\frac{ps}{q}\right]{p}_{1}; {p}_{3}=[p/q]{p}_{1}\). \(p=\left[{P}_{di}+\frac{{P}_{si}\left(1-{P}_{dd}\right)}{{P}_{sd}}\right]\left(\frac{1}{1-{P}_{ii}}\right);\) \(q=1+\left[\frac{{P}_{si}{P}_{id}}{{P}_{sd}\left(1-{P}_{ii}\right)}\right];r=\left(\frac{{P}_{ds}}{1-{P}_{ss}}\right); s=\left(\frac{{P}_{is}}{1-{P}_{ss}}\right)\). The probability (\(P\)) transforming from a state of "d" into another state of "d" is represented by\({P}_{dd}\); that from a state of "s" into another "s" is represented by\({P}_{ss}\); that from a state of "i" into another "i" is represented by\({P}_{ii}\); that from a state of "d" into a state of "i" is represented by\({P}_{di}\); that from a state of "i" into a state of "d" is represented by\({P}_{id}\); that from a state of "s" into a state of "i" is represented by\({P}_{si}\); that from a state of "i" into a state of "s" is represented by\({P}_{is}\); that from a state of "d" into a state of "s" is represented by\({P}_{ds}\); and that from a state of "s" into a state of "d" is represented by\({P}_{sd}\).\({P}_{dd}\), \({P}_{ss}\), \({P}_{ii}\), \({P}_{di}\) ,\({P}_{id}\), \({P}_{si}\), \({P}_{is}\), \({P}_{ds}\) and \({P}_{sd}\) are the elements of the TPM.

According to the test procedure, the critical region is \(|\mathrm{WS}{|}_{c}\ge {Z}_{\alpha }\) at the \(\alpha\) level of significance, i.e., the null hypothesis is rejected if \(|\mathrm{WS}|\ge {Z}_{\alpha }\), where \({Z}_{\alpha }\) indicates the \(100(1-\alpha )\) lower percentage points of a standard normal distribution (Garg & Singh, 2010).

$${\chi }^{2}=\sum_{i=1}^{n}\frac{{\left({X}_{i}-{E}_{i}\right)}^{2}}{{E}_{i}},$$
(3)

where \({X}_{i}\) is the observed frequency of the data sample, \(E_{i}\) is the expected frequency of the data sample as calculated by \(E_{i} = F(X_{2} ) - F(X_{1} )\), and \(F\) is the cumulative distribution function of the probability distribution being tested. The test was conducted at a 5% significant level.

3.3 The initial state vector and the n-step transition state vectors

The initial state vector (\({\pi }^{0}\)) of the Markov chain is estimated from the sum elements of the rows of the TPM divided by the sum of all the elements of the TPM (Howard, 1971; Jain, 1986) as follows:

$${\pi }^{0}=\frac{\sum_{i=1}^{n}{P}_{i}}{\sum P}.$$
(4)

In this study, the n-step transition probability state vectors (\({\pi }^{n}\)) of the Markov chain are following Cox and Miller (1984) and Yusuf et al. (2014). In the equation above, \({P}_{i}^{n}\) is the probability that the annual maximum rainfall is in the ith state at the nth observation. In particular, \({\pi }^{\mathrm{n}}\) is the state vector of the Markov chain at the nth year (Jain, 1986; Yusuf et al., 2014), and it can be estimated by using the following equation:

$${\pi }^{(n+1)}={\pi }^{(n)}P,$$
(5)

where \(P\) is the TPM and \({\pi }^{n+1}\) is the state vector at the \((n+1)\)th data observation. After a different number of iterations, the n-step state vectors will be estimated as:

$${\pi }^{n}={\pi }^{0}{P}^{n}.$$
(6)

3.4 Equilibrium probabilities

The equilibrium probabilities \({\pi }_{d,}{\pi }_{s}\) and \({\pi }_{i}\) (of a "d", "s" and "i", respectively) are determined by resolving the stationary matrix equation (Garg & Singh, 2010; Jale et al., 2019). As a result, there is no change in the probability of being in any state over time, i.e., there is a limit of \({\mathrm{lim}}_{n\to \infty } {P}_{ij}^{(n)}={\pi }_{j}>0\), where \({\pi }_{j}\) satisfies only the following stable state equation: \({\pi }_{j}={\sum }_{i=1}^{m} {\pi }_{i}{P}_{ij}\) to \(j=1,\dots ,m\). Therefore, the linear system of equations (i.e., \({\pi }_{d}={\pi }_{d}{P}_{dd}+{\pi }_{s}{P}_{sd}+{\pi }_{i}{P}_{id},{\pi }_{s}={\pi }_{d}{P}_{ds}+{\pi }_{s}{P}_{ss}+{\pi }_{i}{P}_{is}\) and \({\pi }_{i}={\pi }_{d}{P}_{di}+{\pi }_{s}{P}_{si}+{\pi }_{i}{P}_{ii}\)) can be solved to obtain the estimators of the long-term equilibrium probabilities, together with the probability normalization condition \({\pi }_{d}+{\pi }_{s}+{\pi }_{i}=1\).

4 Using a Markov model in forecasting

The formula of a Markov model for generating a data series is based on the following procedures (Gupta, 1989).

  1. i.

    Transform data to the normal distribution

    Before starting the steps of building a random number generation model, it must be first confirmed that the time series used in the model is subject to the normal distribution. In this study, Anderson–Darling (AD) and Lilliefors (LT) tests for normality were applied (Anderson & Darling, 1954; Abdi & Molin, 2007). If the normal distribution cannot be achieved in a data series, the Box-Cox method will be applied for transformation (Box & Cox, 1964; Sakia, 1992). The forms of the Box-Cox transformation are given by:

    $$X\left(\lambda \right)=\frac{\left({X}^{\lambda }-1\right)}{\lambda } \quad {\mathrm{for}}\, \lambda \ne 0.$$
    (7)
    $$X\left(\lambda \right)=\mathrm{ln}(X) \quad {\mathrm{for}}\, \lambda =0.$$
    (8)

    The lambda (λ) value ranges from − 5 to 5, and the best λ value for the data is chosen after taking all possible values into account. The optimal value of λ appears where a normal distribution curve is most closely approximated (Chedded, 2020).

  2. ii.

    Make descriptive statistics

    Three important parameters were used during the analysis of the Markov model: the mean, standard deviation, and correlation coefficient, as determined by the following equations:

    $${\mu }_{X}=\frac{1}{n}\sum_{i=1}^{n}{X}_{i},$$
    (9)
    $${\sigma }_{X}=\sqrt{\frac{1}{n-1}\sum_{i=1}^{n}{({X}_{i}-{\mu }_{X})}^{2},}$$
    (10)
    $${r}_{k}=\frac{n\sum_{i=1}^{n-k}({X}_{i}-{\mu }_{X})({X}_{i+k}-{\mu }_{X})}{(n-k){\sigma }_{X}^{2}}; \quad k=1.$$
    (11)
  3. iii.

    Generate random numbers

    Random numbers (t) were generated in Microsoft Excel (Kuo, 2016; Maass et al., 1962) with the RAND () command. After a random variable t with an arithmetic mean of zero and unit variance was acquired, i.e., N (0, 1), it was then converted to follow the standard normal distribution according to the following equation (inverse error function).

    $${\mathrm{erf}}^{-1}\left(z\right)=\frac{1}{2}\sqrt{\pi }\left(z+\frac{\pi }{12}{z}^{3}+\frac{7{\pi }^{2}}{480}{z}^{5}+\frac{127{\pi }^{3}}{40320}{z}^{7}+\frac{4369{\pi }^{4}}{5806080}{z}^{9}+\cdots \right),$$
    (12)

    where z value can be found through the cumulative function distribution (CDF) function of the normal logarithm, as follows:

    $$\mathrm{CDF}=\frac{1}{2}+\frac{1}{2}{\mathrm{erf}}\left|\frac{\mathrm{ln}x-\mu }{\sqrt{{2\sigma }^{2}}}\right|=\mathrm{RAND}\left(\right),$$
    (13)
    $${\mathrm{erf}}\left(z\right)={\mathrm{erf}}\left|\frac{\mathrm{ln}x-\mu }{\sqrt{{2\sigma }^{2}}}\right|=\left[\mathrm{RAND}\left(\right)-0.5\right]*2,$$
    (14)
    $$z=2\left[\mathrm{RAND}\left(\right)-0.5\right].$$
    (15)

    As the normal logarithm of random numbers has properties of μ = 1 and σ = 1, therefore:

    $$\left|\frac{\mathrm{ln}x-1}{\sqrt{2}}\right|=z \to {\mathrm{erf}}^{-1}\left(z\right)=\frac{\mathrm{ln}x-1}{\sqrt{2}} \to \sqrt{2}{\mathrm{erf}}^{-1}\left(z\right)=\mathrm{ln}x-1$$

    Then the random number formula is:

    $$t=\mathrm{ln}x=\sqrt{2}{\mathrm{erf}}^{-1}\left(z\right)+1.$$
    (16)
  4. iv.

    Build the Markov model for the generation

    The Markov model's generalized form, as developed by Thomas and Fiering (Bin Muhammad, 2012), is represented by the following equation:

    $${X}_{i+1}={\mu }_{X}+{r}_{i}\left({X}_{i}-{\mu }_{X}\right)+{\sigma }_{x}{t}_{i}\sqrt{(1-{r}_{i}^{2})},$$
    (17)

    where \({X}_{i}\) is the rainfall value in time i, \({\mu }_{X}\) is the mean value of the data series, \({r}_{i}\) is the correlation coefficient, \({\sigma }_{x}\) is the standard normal deviation, and \({t}_{i}\) is the random number.

4.1 Measurements of the reliability of the forecasting

To assess the forecasting model in this study, the relative error (RE), root mean square error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE) were measured. Additionally, the Willmott (1981) index of agreement (D) was measured, which can be interpreted as the relative prediction error between the actual data and the fitted data from the Markov chain model. D = 1 indicates a perfect match, while D = 0 indicates no agreement at all. The errors mentioned above are expressed in the following equations:

$${\mathrm{RE}} =\sum_{t=1}^{n}\frac{{X}_{t}-{\hat{X}}_{t}}{{X}_{t}},$$
(18)
$$\mathrm{RMSD} = \sqrt{\frac{1}{n}\sum_{t=1}^{n}{\left({X}_{t}-{\hat{X}}_{t}\right)}^{2}},$$
(19)
$$\mathrm{MAE} =\frac{1}{n} \sum_{t=1}^{n}\left|{X}_{t}-{\hat{X}}_{t}\right|,$$
(20)
$$\mathrm{MAPE} = \frac{1}{n} \sum_{t=1}^{n}\frac{\left|{X}_{t}-{\hat{X}}_{t}\right|}{{X}_{t}}*100,$$
(21)
$$D=1-\frac{{\sum }_{i=1}^{n} {\left({X}_{t}-{\hat{X}}_{t}\right)}^{2}}{{\sum }_{i=1}^{n} {\left(\left|{\hat{X}}_{t}-{X}\right|+\left|{\bar{X}}_{t}-\bar{X}\right|\right)}^{2}} , \quad 0\le D\le 1,$$
(22)

where \({X}_{t}\) is the actual AMR value (mm), \({\hat{X}}_{t}\) is the estimated (forecasted) AMR value (mm), and \(\bar{X}\) is the average actual AMR (mm).

5 Results and discussion

5.1 The AMR data series

The plots for the time series of the AMR data at the three adopted stations during the period of 1990–2021 are presented in Fig. 3. As can be seen, the trend line moves downward at a rate of 0.215 mm/year at the Duhok station and at a rate of 0.044 mm/year at the Erbil station; for the Sulaymaniya station, however, the trend line moves upward at a rate of 0.794 mm/year, although the trend slopes of decreasing and increasing are quite small in magnitude.

Fig. 3
figure 3

The AMR data series for the three adopted stations

5.2 The transition probability matrix (TPM)

The data of AMR time series were used in this study to form the Markov matrix (TPM), with three states taken into account (i.e., "d", "s" and "i"). For these states, the TPM was sorted by the number of transitions from one state to another. To obtain the TPM, the elements of each row of the TPM were divided by the sum of the rows. The number of transfers in the states being examined is shown in Table 2. The Initial State Vector (\({\pi }^{0}\)) was estimated for each station based on the sum of the rows in Table 2. The TPM for the three adopted stations is shown in Table 3. The state-transition diagrams of the system for the three stations are shown in Fig. 4. The results of the TPM for the three adopted stations show that the probabilities of transitions from the state "i" to "d" and from the sate "d" to "i" are higher than for other states at the Duhok and Erbil stations; but at the Sulaymaniya station, the probability of transition from the state "i" to "d" is higher than for other states. The lowest value of the probability of transition is found for transition from the state "s" to "i" at the Duhok station.

Table 2 The transition count and initial state vector for the AMR behavior between "d", "s" and "i" states
Table 3 The TPM (\(P\)) of the AMR behavior for the three adopted stations
Fig. 4
figure 4

Diagrams of state transitions for the Duhok, Erbil and Sulaymaniya stations

5.3 The results of goodness-of-fit tests

The statistics of WS and \({\chi }^{2}\) tests (as described in Sect. 3) were used to measure the goodness of fit of the Markov chain to the AMR data for the three adopted stations (Duhok, Erbil, and Sulaymaniya). The estimated values of the WS and \({\chi }^{2}\) statistics as well as their p-values are presented in Table 4. As seen in Table 4, the results of these two statistics indicate that the Markov chain is a model appropriate to the AMR data series for the three adopted stations, and it is proven that the both tests satisfy a major property of the Markov chain model.

Table 4 Estimated values of the WS and \({\chi }^{2}\) statistics and their p-values

5.4 The transition probability for future periods

Depending on the initial state vector shown in Table 2, the average probability of moving from all transition count states to the state "d" for all the three stations is 0.427; that for the state "s" is 0.135; and that for the state "i" is 0.438. The probability prediction of all the states under study for a future period (e.g., 5 years) was estimated based on the initial state vector and Markov matrix, as described in Sect.  3 (i.e., \({\pi }^{1}={\pi }^{0}*P\), \({\pi }^{2}={\pi }^{1}*P\), …, \({\pi }^{5}={\pi }^{4}*P\)). The result is shown in Table 5.

Table 5 The probability prediction vectors for a certain future period in the Duhok, Erbil, and Sulaymaniya stations

5.5 The transition probability matrix for the n-steps (equilibrium state)

The equilibrium and stability state can be reached based on Formula (4) in iteration, as described in Sect.  3. In this study, MATLAB software was used to obtain accurate results of n-steps of the TPM. It should be noted that the TPM was corrected to 3 decimal places. As observed in Additional file 1: Table S1, when n increases, the results will become more similar. According to the results in Additional file 1: Table S1, the TPM will reach an equilibrium and stability state in 12 years for the Duhok station, in 11 years for the Erbil station, and in 9 years for the Sulaymaniya station. These probability matrices continue to infinity (if the current climate conditions are kept). After the process reaches a steady state as mentioned before, the limiting state vectors (by using Eq. (5)) are: for the Duhok station: \({\pi }_{\mathrm{Duhok}}^{(n\ge 12)} =\left[\begin{array}{ccc}0.464& 0.178& 0358\end{array}\right]\); for the Erbil station: \({\pi }_{\mathrm{Erbil}}^{(n\ge 11)}=\left[\begin{array}{ccc}0.448& 0.123& 0.\end{array}429\right]\); and for the Sulaymaniya station: \({\pi }_{\mathrm{Sulaymaniya}}^{(n\ge 9)}=\left[\begin{array}{ccc}0.387& 0.174& 0.439\end{array}\right]\). Therefore, the results of limiting state vectors show that the probabilities would drop slowly from 0.433 to 0.409 in about 11 years on average for the three adopted stations.

5.6 Steps of applying the Markov model in forecasting the AMR data

  1. i.

    The normality distribution of the data

    To find out whether the AMR data series follows a normal distribution or not, the normality tests (i.e., AD and LN) was conducted, as mentioned in Sect. 3, with the results demonstrated in Table 6. Whereas the null hypothesis for the tests is that the data were normally distributed, the alternate hypothesis is that the data did not come from a normal distribution. The p values of the both tests in Table 6 are less than 0.05, so the tests reject the hypothesis of normality. Hence, it can be concluded that the AMR data series for three adopted stations do not follow a normal distribution. The Box-Cox transformation method (Sect. 3) was used to transform non-normal AMR data, so as to meet a normal distribution, as shown in Table 7.

  2. ii.

    The statistical measures and features of the Markov model

    Table 6 Estimated values of the normality tests of AMR data series
    Table 7 The transformed function depends on the Lambda value (λ)

    Table 8 presents the summary of statistical results of the transformed AMR data for the three adopted stations. These statistical results are essential for generating the AMR data in the Markov model, as described in Sect. 3.

  3. iii.

    Generation of random numbers

    Table 8 Descriptive statistical measures of the transformed AMR data series

    At this stage, random numbers were generated in Microsoft Excel through its RAND () command, as mentioned in Sect. 3. For example, the procedures for generating random numbers for the years 1990–2000 are shown in Additional file 1: Table S2.

  4. iv.

    The application of the Markov Model for generating the AMR data

    The Markov model consists of two parts: the first part is deterministic and considers the effect of the previous statistical values on the model (in Subsection ii); and the second part is for random numbers and represents the random part of the model (in Subsection iii). By combining and adding these two parts, a Markov generation model is applied according to Eq. (17) in Sect. 3. It should be noted that for the purposes of using Eq. (17), the first-generation value was assumed to be the mean (μ) of the data series. The plots between the observed and predicted values in Fig. 5 for the Markov model on the transformed AMR data series for the three adopted stations indicates that the predicted values follow the observed data closely enough.

Fig. 5
figure 5

Time series data was generated by using the Markov model for the Duhok, Erbil and Sulaymaniya stations. The actual data are given in blue color, and the fitted values are in red line color

In addition, five other statistical tests (i.e., RE, MAE, MAPE, RMSE, and D) were conducted to evaluate the closeness between the actual AMR data and the fitted AMR data in the Markov chain model, as shown in Table 9. The lower values of RE, MAE, MSE and RMSE indicate a higher accuracy of the Markov chain model for each adopted station. The value of D is sensitive to the differences between the observed data and the generated data, and is also sensitive to certain changes in proportionality (Willmott et al., 1981). The results in Table 9 show that the D value between the observed data and the generated AMR from the Markov model for the three stations are greater than 80%, indicating that in the three adopted stations, the magnitudes of D were consistent with the independent and intuitive evaluations.

Table 9 The performance of the Markov model

With a D value of 0.871, the fitted value from the Markov model for the Sulaymaniya station is identified as a slightly "better" estimator of the observed variables than for the Duhok (D = 0.834) and Erbil (D = 0.829) stations. The results of the Willmott index (D) in Table 9 show a good match between the actual data and the generated AMR from the Markov chain model. Therefore, the generated AMR is acceptable at large percentages for the three adopted stations. The actual AMR data and the fitted AMR data by the Markov chain model are plotted in a scatterplot, as shown in Additional file 1: Fig. S1. A visual examination of the AMR time series scatterplot (Additional file 1: Fig. S1) confirms the results of the Willmott index.

The Markov chain model was used to generate the AMR data for a future period from 2022 to 2026 (five years), as shown in Table 10. It is found that, in general, the results of the generated AMR data for the future period confirm the future probability prediction vectors for the three stations as made in Sub Sect. 4.4. For example, for the Duhok station, it is found that the AMR data (Table 10) would decrease for the future period. This consists of the probability prediction vectors for the Duhok station (Table 5, and Sub Sect. 4.4), showcasing a decrease in the probability for the state "d" for most of the adopted future period more than for the "s" and "i" states.

Table 10 The AMR time series (mm) forecasted by using the Markov model for the period from 2022 to 2026

6 Conclusions

In this study, a Markov chain model was adopted to examine the pattern, distribution and forecast of the annual maximum rainfall (AMR) data at three selected stations (Duhok, Erbil, and Sulaymaniya) in the Kurdistan region of Iraq based on the rainfall data collected there within 32 years of 1990–2021. A Markov matrix (TPM) was formed based on the AMR time series data and sorted by the number of transitions from one state to another among the three states under study (i.e., decrease—"d"; stability—"s"; and increase—"i"). The Markov model was used to forecast the AMR data for several upcoming years (i.e., 2022–2026). To evaluate how well the Markov chain model fits the data, the Chi-square and WS tests were conducted. Additionally, the Lilliefors and Anderson–Darling tests for normality were used, while the Box-Cox transformation method was used to transform the non-normal AMR data to meet a normal distribution. To assess the Markov chain generation model, the following tests were conducted: the relative error (RE), the root mean square error (RMSE), the mean absolute error (MAE), the mean absolute percentage error (MAPE), and the Willmott index. The results of two goodness-of-fit tests (i.e., WS and \({\chi }^{2}\)) indicate that the Markov chain is an appropriate model for the AMR data series from the three adopted stations. As demonstrated, in the years of 2022–2026, the probability of the annual rainfall data decreasing will be 44%, with 16% of the annual rainfall keeping stable and 40% of the annual rainfall increasing. The TPM will reach an equilibrium and stability state after 12 years at the Duhok station, after 11 years at the Erbil station, and after 9 years at the Sulaymaniya station. In addition, it is shown that the probabilities will drop slowly from 0.433 to 0.409 for the AMR data series in about 11 years as an average for the three adopted stations. The predicted AMR data from the Markov model were compared with the observed AMR data, so as to determine the prediction precision, revealing that the RE, RMSE, MAE and MAPE test results between the observed data and the predicted data are less than 4%, which satisfies the criteria for forecast accuracy. In addition, the Willmott index shows a good match between the actual data and the generated AMR data from the Markov chain model. Therefore, the upcoming AMR data can be forecasted in this Markov model. The results of the generated AMR data for future periods were found able to confirm the probability prediction vectors for future periods at the three stations.