Evaluation of multivariate linear regression for reference evapotranspiration modeling in different climates of Iran

Sharafi, Saeed; Ghaleni, Mehdi Mohammadi

doi:10.1007/s00704-020-03473-0

Original Paper
Open access
Published: 07 January 2021

Volume 143, pages 1409–1423, (2021)

Theoretical and Applied Climatology

Saeed Sharafi¹ &
Mehdi Mohammadi Ghaleni²

Abstract

The study aimed to evaluate the accuracy of empirical equations (Hargreaves-Samani, HS; Irmak, IR; and Dalton, DT) and multivariate linear regression models (MLR1–6) for estimating reference evapotranspiration (ET_Ref) in different climates of Iran based on the Köppen method, including arid desert (Bw), semiarid (Bs), humid with mild winters (C), and humid with severe winters (D). For this purpose, monthly climatic data from 33 meteorological stations over the 30-year period 1990–2019 were used. Based on various meteorological data (minimum and maximum temperature, relative humidity, wind speed, solar radiation, extraterrestrial radiation, and vapor pressure deficit), six multivariate linear regression models and three empirical equations were used to estimate the reference evapotranspiration: MLR1, MLR2, and HS (temperature-based); MLR3 and IR (radiation-based); MLR4, MLR5, and DT (mass transfer-based); and MLR6 (combination-based). The results of these models were compared with the FAO₅₆ Penman-Monteith reference evapotranspiration as target data using the root mean square error (RMSE), mean absolute error (MAE), scatter index (SI), coefficient of determination (R²), and Nash-Sutcliffe efficiency (NSE). All MLR models gave better results than the empirical equations. Even the simplest regression model (MLR1), based only on minimum and maximum temperature data, was more accurate than the empirical equations. The highest and lowest accuracy were obtained with the MLR6 model and the HS empirical equation, with RMSE of 10.8–15.1 mm month⁻¹ and 22–28.3 mm month⁻¹, respectively. Also, radiation-based models such as IR had higher accuracy in Bw and Bs climates (MAE = 8.01–11.2 mm month⁻¹) than in C and D climates (MAE = 13.44–14.48 mm month⁻¹). In general, the regression models performed well in all climates from Bw to D, with SI < 0.2 (good to excellent).

1 Introduction

Accurate estimation of reference evapotranspiration (ET_Ref) is one of the priorities for estimating the water requirement of agricultural products. The complexity of the evapotranspiration process and its dependence on meteorological variables, the lack of access to all meteorological data, and the lack of generalizability of a single model across different climates have made it difficult to estimate this variable accurately. Simple models based on multivariate linear regression, which are easy to apply and can be chosen to match the available meteorological variables, can help in estimating ET_Ref appropriately in different climates. In general, ET_Ref can be estimated through direct and indirect methods and using mathematical models (Jing et al. 2019). Indirect methods for estimating ET_Ref began with the development of empirical equations such as Penman-Monteith (PM) (Monteith 1965). Based on the PM method, FAO₅₆ published a standard method for estimating ET_Ref. The PM method presented in FAO₅₆ has been used by many researchers as the base method for calculating ET_Ref (Güçlü et al. 2017; Saggi and Jain 2019; Shiri et al. 2019).

In recent years, artificial intelligence-based methods such as neural networks (Kisi et al. 2015; Gavili et al. 2018), support vector machines (Tabari et al. 2013), extreme learning machines (Abdullah et al. 2015), decision trees (Huang et al. 2019; Raza et al. 2020), and hybrid methods (Ehteram et al. 2019; Shiri et al. 2020; Tikhamarine et al. 2019; Zhu et al. 2020; Kim et al. 2014; Sanikhani et al. 2019; Mehdizadeh et al. 2017) have been widely applied to estimate ET_Ref. Alongside them, the multivariate linear regression method has been compared with empirical equations and soft computing methods and validated by many researchers (Reis et al. 2019; Kisi and Heddam 2019; Mattar and Alazba 2019; Tabari et al. 2012).

The appropriate method for estimating ET_Ref in each region depends on climatic conditions, required data, and related costs (Sharafi et al. 2016). Accordingly, Unes et al. (2020) predicted daily ET_Ref based on climatic conditions using empirical equations (Hargreaves-Samani, Ritchie, Turc, and Penman FAO 56), multilinear regression (MLR), and different data mining techniques (M5T, ANFIS, SVM). According to their results, the Turc empirical formula (radiation-based) performed better than the other empirical equations, the highest correlation coefficient was obtained for ANFIS, and the minimum errors were obtained for the radial basis function SVM. Also, Chen et al. (2020) estimated daily ET_Ref from limited meteorological data using three deep learning methods, two classical machine learning methods, and seven empirical equations. Their results showed that, when temperature-based features were available, the deep learning models performed markedly better than temperature-based empirical models, and when radiation-based or humidity-based features were available, all of the proposed deep and classical machine learning models outperformed radiation-based or humidity-based empirical equations beyond the study areas.

Yirga (2019) estimated ET_Ref by multiple linear regression. The MLR model could estimate and predict ET_Ref in the Megecha basin and can be used in similar data-sparse regions. Karbasi (2018) investigated Gaussian process regression (GPR) and Wavelet-GPR models to forecast multi-step ahead daily ET_Ref at the synoptic station of Zanjan (Iran). The results showed that the Wavelet-GPR model performed better during the warmer season than over the whole year.

dos Santos Farias et al. (2020) evaluated the performance of machine learning techniques and the stepwise multiple linear regression method for estimating daily ET_Ref with limited weather data in a Brazilian agricultural frontier. Their results showed that machine learning methods are robust in estimating ET_Ref, even in the absence of some variables. The use of artificial intelligence models for estimating ET_Ref with high accuracy has become prevalent in recent years, but the complexity of these models makes them difficult to apply in different regions. For this reason, the present paper uses multivariate linear regression models with different input data to develop a simple, comprehensive model that estimates ET_Ref in different climates of Iran with the minimum data required. Although most previous studies have investigated accuracy at individual meteorological stations (points), this research investigates the accuracy of multivariate linear regression models by climate in Iran. This innovation extends the results of this paper to similar climates around the world and also reduces the uncertainty that point-by-point fluctuations in climate variables introduce into the results.

2 Materials and methods

2.1 Study area

Iran is located in the northern hemisphere between longitudes 44° and 64° E and latitudes 25° and 40° N. Meteorological data from synoptic stations covering different climates were collected and analyzed. Some input records were incomplete or not available for some stations; therefore, only stations with a long statistical period were retained. Accordingly, the present study uses meteorological data from 33 synoptic stations in Iran over a statistical period of 30 years (1990–2019). Figure 1 shows the location of the studied stations along with their climatic classification based on the Köppen method (Köppen 1931), according to the values of air temperature and precipitation. According to the study of Sharafi and Karim (2020) (the Köppen climate classification), 7 stations (Bandar Abbas, Abadan, Ahwaz, Bushehr, Bam, Yazd, and Zabol), 7 stations (Zahedan, Semnan, Sabzevar, Kerman, Isfahan, Shahroud, and Torbat-e Heydarieh), 13 stations (Shiraz, Gorgan, Tehran, Khorramabad, Bandar Anzali, Birjand, Rasht, Kermanshah, Mashhad, Arak, Qazvin, Sanandaj, and Tabriz), and 6 stations (Shahrekord, Khoy, Saqez, Urmia, Hamedan, and Zanjan) are located in the arid desert (Bw), semiarid (Bs), humid with very mild winters (C), and humid with severe winters (D) zones, respectively (Fig. 1).

To calculate ET_Ref values using the empirical equations and regression models, monthly data on minimum temperature (T_min), maximum temperature (T_max), relative humidity (RH), wind speed (u), solar radiation (R_s), extraterrestrial radiation (R_a), and vapor pressure deficit (VPD) were used at the different stations. The statistical characteristics of these parameters are presented separately for each climate in Table 1.

Table 1 Statistical characteristics of meteorological variables

According to the results in Table 1, the average monthly ET_Ref decreased from 123.3 mm month⁻¹ in the arid desert climate (Bw) to 90.5 mm month⁻¹ in the humid climate with severe winters (D). The lowest and highest coefficients of variation (CV%) of ET_Ref, 39.2% and 53.8%, were reported for the Bw and D climates, respectively. Like that of ET_Ref, the coefficients of variation of the minimum and maximum temperatures increased from Bw to D climates (Table 1).

R, rainfall; RH, relative humidity; T_min, minimum temperature; T_max, maximum temperature; u, wind speed; R_s, solar radiation; R_a, extraterrestrial radiation; ET_Ref, reference evapotranspiration by Penman-Monteith (Allen et al. 1998); Min, minimum; Max, maximum; Ave, average; SD, standard deviation; CV (%), coefficient of variation; Kur, kurtosis; Sk, skewness

Figure 2 shows the Pearson correlation coefficients between ET_Ref and the other meteorological variables (average, maximum, and minimum temperature, vapor pressure deficit, wind speed, relative humidity, solar radiation, and extraterrestrial radiation) for the 33 study stations, grouped by the Köppen method. Pearson correlation coefficients of +1 and −1 indicate the strongest direct and inverse relationships, respectively, between the dependent variable (ET_Ref) and an independent variable (a meteorological variable). Figure 2 shows that ET_Ref was directly related to all meteorological variables except relative humidity: only the correlation coefficient between ET_Ref and RH was negative, indicating an inverse relationship. In all 33 stations studied, ET_Ref had its highest correlation coefficients with average, maximum, and minimum temperature and VPD. The correlation coefficients between ET_Ref and wind speed and relative humidity had the lowest values in most of the studied stations. ET_Ref in C and D climates (Fig. 2 c and d) was more strongly correlated with solar radiation (R_s) and extraterrestrial radiation (R_a) than in Bw and Bs climates (Fig. 2 a and b). In addition, based on Fig. 2 a and d, the correlation coefficients between ET_Ref and the meteorological variables varied strongly among stations with Bw climate and only slightly among stations with D climate.
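The correlation analysis described above can be illustrated with a minimal sketch. The six monthly values below are invented for illustration and are not taken from the study's stations; the variable names are likewise assumptions. The sketch uses NumPy's `corrcoef` to obtain the Pearson coefficient between ET_Ref and each variable.

```python
import numpy as np

# Hypothetical monthly series (illustrative values only, not study data).
t_mean = np.array([5.0, 10.0, 15.0, 20.0, 25.0, 30.0])      # average temperature, °C
rh = np.array([70.0, 65.0, 55.0, 45.0, 35.0, 30.0])          # relative humidity, %
et_ref = np.array([40.0, 60.0, 90.0, 120.0, 150.0, 170.0])   # ET_Ref, mm per month


def pearson_r(x, y):
    """Pearson correlation coefficient between two 1-D series."""
    return float(np.corrcoef(x, y)[0, 1])


r_temperature = pearson_r(et_ref, t_mean)  # expected to be close to +1 (direct)
r_humidity = pearson_r(et_ref, rh)         # expected to be close to -1 (inverse)
print(f"r(ET_Ref, T) = {r_temperature:.3f}, r(ET_Ref, RH) = {r_humidity:.3f}")
```

In this toy series the sign pattern mirrors the one reported for Fig. 2: a direct relationship with temperature and an inverse one with relative humidity.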

2.2 ET_Ref estimation models

2.2.1 Empirical equations

In the present paper, the FAO₅₆ Penman-Monteith (ET_Ref -PMF56) method was used as the target ET_Ref values to compare with the results of empirical equations and MLR models, which can be calculated as:

$$ {\mathrm{ET}}_{\mathrm{Ref}}=\frac{0.408\Delta \left({R}_n-G\right)+\gamma \frac{900}{T_a+273}{u}_2\left({e}_s-{e}_a\right)}{\Delta +\gamma \left(1+0.34{u}_2\right)} $$

(1)
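As a minimal sketch, Eq. (1) can be written as a single function. The function name and the input values are assumptions chosen for illustration, not data from the study. With daily radiation inputs the expression yields a daily value; converting to the monthly totals used in this paper (for example, by multiplying by the number of days in the month) is an assumption about the workflow rather than something stated here.

```python
def penman_monteith_fao56(delta, r_n, g, gamma, t_a, u2, e_s, e_a):
    """Evaluate Eq. (1), the FAO-56 Penman-Monteith expression.

    delta : slope of the saturation vapor pressure curve (kPa per °C)
    r_n   : net radiation (MJ m^-2 day^-1)
    g     : soil heat flux density (MJ m^-2 day^-1)
    gamma : psychrometric constant (kPa per °C)
    t_a   : average air temperature (°C)
    u2    : wind speed at 2 m height (m s^-1)
    e_s   : saturation vapor pressure (kPa)
    e_a   : actual vapor pressure (kPa)
    """
    radiation_term = 0.408 * delta * (r_n - g)
    aerodynamic_term = gamma * (900.0 / (t_a + 273.0)) * u2 * (e_s - e_a)
    return (radiation_term + aerodynamic_term) / (delta + gamma * (1.0 + 0.34 * u2))


# Illustrative inputs only (not taken from any station in the study).
et_day = penman_monteith_fao56(
    delta=0.15, r_n=12.0, g=0.0, gamma=0.066, t_a=20.0, u2=2.0, e_s=2.3, e_a=1.4
)
print(round(et_day, 2))
```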

For more information on this method, see FAO₅₆ (Allen et al. 1998). Empirical equations can be categorized as temperature-based, radiation-based, or mass transfer-based according to the data they use (Liu et al. 2017; Zhang et al. 2018). In this research, the empirical equations of Hargreaves-Samani (temperature-based, Eq. 2), Irmak (radiation-based, Eq. 3), and Dalton (mass transfer-based, Eq. 4) are used to estimate ET_Ref (Hargreaves and Samani 1985; Irmak et al. 2003; Dalton 1802).

$$ {\mathrm{ET}}_{\mathrm{Ref}}=a{R}_a\left({T}_a+b\right)\Delta {T}^{0.5} $$

(2)

$$ {\mathrm{ET}}_{\mathrm{Ref}}=a{R}_s\left(b{T}_a+c\right) $$

(3)

$$ {\mathrm{ET}}_{\mathrm{Ref}}=\left(a+{bu}_2\right) VPD $$

(4)

In these equations, ET_Ref is the reference evapotranspiration (mm month⁻¹); Δ, the slope of the saturation vapor pressure curve (kPa °C⁻¹); γ, the psychrometric constant (kPa °C⁻¹); R_n, net radiation (MJ m⁻² day⁻¹); G, soil heat flux density (MJ m⁻² day⁻¹); u₂, average wind speed at 2 m height (m s⁻¹); e_s, saturation vapor pressure (kPa); e_a, actual vapor pressure (kPa); VPD, vapor pressure deficit (kPa); α, 1.26; λ, latent heat of evaporation (MJ kg⁻¹); R_a, extraterrestrial radiation (mm month⁻¹); R_s, monthly solar radiation (MJ m⁻² month⁻¹); RH, relative humidity (%); T_a, average air temperature (°C); T_max, maximum air temperature (°C); T_min, minimum air temperature (°C); ΔT, the difference between T_max and T_min (°C); and a, b, and c are empirical coefficients.
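The three empirical forms in Eqs. (2) to (4) can be sketched as plain functions. The function names are hypothetical, and the coefficients a, b, and c are left as required arguments because their values are not given in this section. The sketch assumes that ΔT is the difference between T_max and T_min, the usual Hargreaves-Samani definition. The numbers in the usage lines are illustrative only; a = 0.0023 and b = 17.8 are the commonly cited original Hargreaves-Samani values, not coefficients reported by this study.

```python
import math


def hargreaves_samani(r_a, t_a, delta_t, a, b):
    """Eq. (2): ET_Ref = a * R_a * (T_a + b) * delta_T ** 0.5 (temperature-based)."""
    return a * r_a * (t_a + b) * math.sqrt(delta_t)


def irmak(r_s, t_a, a, b, c):
    """Eq. (3): ET_Ref = a * R_s * (b * T_a + c) (radiation-based)."""
    return a * r_s * (b * t_a + c)


def dalton(u2, vpd, a, b):
    """Eq. (4): ET_Ref = (a + b * u2) * VPD (mass transfer-based)."""
    return (a + b * u2) * vpd


# Illustrative calls with made-up inputs and coefficients.
et_hs = hargreaves_samani(r_a=10.0, t_a=20.0, delta_t=16.0, a=0.0023, b=17.8)
et_ir = irmak(r_s=20.0, t_a=25.0, a=0.1, b=0.02, c=0.5)
et_dt = dalton(u2=2.0, vpd=1.5, a=0.3, b=0.1)
print(et_hs, et_ir, et_dt)
```

Keeping the coefficients as arguments makes it straightforward to substitute locally calibrated values for each climate.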

2.2.2 Multivariate linear regression models

Multivariate linear regression is a method for modeling the relationship between several independent variables (average, maximum, and minimum temperature, vapor pressure deficit, wind speed, relative humidity, solar radiation, and extraterrestrial radiation) and a dependent variable (ET_Ref). By minimizing the mean square error, this method determines the empirical coefficients of the linear relationship between the dependent variable and the independent variables so that the linear model has the best fit to the target data. The general form of a multivariate linear regression equation is given in Eq. (5).

$$ \hat{Y}={b}_0+{b}_1{X}_1+{b}_2{X}_2+{b}_3{X}_3+\dots +{b}_m{X}_m $$

(5)

In this expression, $ \hat{Y} $ is the dependent variable, b₀ to b_m are empirical coefficients, and X₁ to X_m are independent variables. The accuracy of a multivariate linear regression model depends strongly on the number and type of its input variables. In the present study, the 6 multivariate linear regression models presented in Table 2 have been developed. The meteorological variables used by each empirical equation and each multivariate linear regression model are listed separately. MLR1, MLR2, and HS are defined as temperature-based models; MLR3 and IR as radiation-based; MLR4, MLR5, and DT as mass transfer-based; and MLR6 as a combination model (Table 2).
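A least-squares fit of Eq. (5) can be sketched with NumPy as follows. The helper names `fit_mlr` and `predict_mlr` are hypothetical, and the data are synthetic: the target is generated from made-up coefficients (10, 2, 3) so that the fit can be checked, and it does not represent any station. The two inputs stand in for T_min and T_max, the variables the paper names for MLR1.

```python
import numpy as np


def fit_mlr(X, y):
    """Return [b0, b1, ..., bm] minimizing the squared error of Eq. (5)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), X])  # leading column of ones for b0
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coeffs


def predict_mlr(coeffs, X):
    """Evaluate Y_hat = b0 + b1*X1 + ... + bm*Xm for each row of X."""
    X = np.asarray(X, dtype=float)
    return coeffs[0] + X @ coeffs[1:]


# Synthetic example: 360 "months" of two temperature inputs.
rng = np.random.default_rng(0)
t_min = rng.uniform(0.0, 25.0, size=360)
t_max = t_min + rng.uniform(5.0, 15.0, size=360)
X = np.column_stack([t_min, t_max])
y = 10.0 + 2.0 * t_min + 3.0 * t_max  # noise-free target from assumed coefficients

coeffs = fit_mlr(X, y)
y_hat = predict_mlr(coeffs, X)
print(np.round(coeffs, 3))
```

Because the synthetic target is exactly linear, the fit recovers the assumed coefficients; with real station data the residuals would be non-zero and would be assessed with the evaluation criteria of Section 2.3.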

Table 2 Characteristics of multivariate linear regression models for ET_Ref estimation

2.3 Evaluation criteria

In this study, 5 statistical criteria (RMSE, SI, MAE, NSE, and R²) were used to compare the results of the empirical equations and MLR models with ET_Ref-PMF56, based on Eqs. (6) to (10).

$$ \mathrm{Root}\ \mathrm{Mean}\ \mathrm{Square}\ \mathrm{Error}\ \left(\mathrm{RMSE}\right)\kern0.5em \mathrm{RMSE}=\sqrt{\frac{1}{N}\sum \limits_{i=1}^N{\left({ET}_{Ref_i}^{model}-{ET}_{Ref_i}^{PMF56}\right)}^2} $$

(6)

$$ \mathrm{Scatter}\ \mathrm{Index}\ \left(\mathrm{SI}\right)\kern0.5em \mathrm{SI}=\frac{RMSE}{{\overline{ET}}_{Ref}^{PMF56}} $$

(7)

$$ \mathrm{Mean}\ \mathrm{Absolute}\ \mathrm{Error}\ \left(\mathrm{MAE}\right)\kern0.5em \mathrm{MAE}=\frac{1}{N}\sum \limits_{i=1}^N\left|{ET}_{Ref_i}^{model}-{ET}_{Ref_i}^{PMF56}\right| $$

(8)

$$ \mathrm{Nash}-\mathrm{Sutcliffe}\ \mathrm{Efficiency}\ \left(\mathrm{NSE}\right)\kern0.5em \mathrm{NSE}=1-\left[\frac{\sum \limits_{i=1}^N{\left({ET}_{Ref_i}^{PMF56}-{ET}_{Ref_i}^{model}\right)}^2}{\sum \limits_{i=1}^N{\left({ET}_{Ref_i}^{PMF56}-{\overline{ET}}_{Ref}^{PMF56}\right)}^2}\right] $$

(9)

$$ \mathrm{Coefficient}\ \mathrm{of}\ \mathrm{determination}\ \left({R}^2\right)\kern0.5em {R}^2={\left[\frac{\sum \limits_{i=1}^N\left({ET}_{Ref_i}^{PMF56}-{\overline{ET}}_{Ref}^{PMF56}\right)\left({ET}_{Ref_i}^{model}-{\overline{ET}}_{Ref}^{model}\right)}{\sqrt{\left[\sum \limits_{i=1}^N{\left({ET}_{Ref_i}^{PMF56}-{\overline{ET}}_{Ref}^{PMF56}\right)}^2\right]\left[\sum \limits_{i=1}^N{\left({ET}_{Ref_i}^{model}-{\overline{ET}}_{Ref}^{model}\right)}^2\right]}}\right]}^2 $$

(10)

In Eqs. (6) to (10), $ {ET}_{Ref_i}^{PMF56} $ and $ {ET}_{Ref_i}^{model} $ are the ET_Ref based on PMF56 and the modeled ET_Ref, $ {\overline{ET}}_{Ref}^{PMF56} $ and $ {\overline{ET}}_{Ref}^{model} $ are the mean values of ET_Ref based on PMF56 and of the modeled ET_Ref, and N is the number of data points (360 months).
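The five criteria in Eqs. (6) to (10) can be computed together, as in the following sketch. The function name `evaluation_criteria` and the four-value series are assumptions made for illustration; the study itself uses N = 360 monthly values per station.

```python
import numpy as np


def evaluation_criteria(et_model, et_pmf56):
    """Compute RMSE, SI, MAE, NSE, and R2 following Eqs. (6) to (10)."""
    model = np.asarray(et_model, dtype=float)
    target = np.asarray(et_pmf56, dtype=float)
    error = model - target
    target_dev = target - target.mean()
    model_dev = model - model.mean()

    rmse = float(np.sqrt(np.mean(error ** 2)))                            # Eq. (6)
    si = rmse / float(target.mean())                                      # Eq. (7)
    mae = float(np.mean(np.abs(error)))                                   # Eq. (8)
    nse = 1.0 - float(np.sum(error ** 2) / np.sum(target_dev ** 2))       # Eq. (9)
    r = np.sum(target_dev * model_dev) / np.sqrt(
        np.sum(target_dev ** 2) * np.sum(model_dev ** 2)
    )
    r2 = float(r ** 2)                                                    # Eq. (10)
    return {"RMSE": rmse, "SI": si, "MAE": mae, "NSE": nse, "R2": r2}


# Made-up monthly values (mm per month) for illustration only.
criteria = evaluation_criteria(
    et_model=[110.0, 110.0, 150.0, 150.0],
    et_pmf56=[100.0, 120.0, 140.0, 160.0],
)
print(criteria)
```

For this toy series every error has magnitude 10 mm month⁻¹, so RMSE and MAE coincide, while NSE and R² both equal 0.8.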

The ideal value of RMSE, SI, and MAE is zero, and the ideal value of NSE and R² is unity. According to Li et al. (2013), model accuracy is classified by SI as follows:

$$ \begin{cases}\mathrm{Excellent}, & SI<0.1\\ \mathrm{Good}, & 0.1< SI<0.2\\ \mathrm{Fair}, & 0.2< SI<0.3\\ \mathrm{Poor}, & SI>0.3\end{cases} $$
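The SI classes above translate into a small lookup, sketched below. The function name is hypothetical. The classification as written uses strict inequalities, so it does not say where values of exactly 0.1, 0.2, or 0.3 belong; assigning them to the less favorable class is an assumption of this sketch.

```python
def classify_si(si):
    """Map a scatter index value to the accuracy classes of Li et al. (2013).

    Boundary values (exactly 0.1, 0.2, 0.3) fall into the less favorable
    class here; that tie-breaking rule is an assumption, not from the source.
    """
    if si < 0.1:
        return "Excellent"
    if si < 0.2:
        return "Good"
    if si < 0.3:
        return "Fair"
    return "Poor"


for value in (0.05, 0.15, 0.25, 0.35):
    print(value, classify_si(value))
```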

3 Results and discussion

In this paper, a mathematical equation was developed for each of the 6 predefined multivariate linear regression models (MLR1–6) to estimate ET_Ref more accurately in each of the four climates of Iran. The developed equations, listed in Table 3, relate various independent variables (meteorological parameters) to the dependent variable (ET_Ref). MLR1 and MLR5 were developed with two variables, MLR2 with three variables, MLR3 and MLR4 with four variables, and MLR6 with a combination of 2 to 7 variables (Table 3). The MLR6 model was developed as the best selection from the 7 independent variables, minimizing the error between $ {ET}_{Ref}^{PMF56} $ and $ {ET}_{Ref}^{model} $. For example, according to Eq. (32), $ {ET}_{Ref}^{model} $ has a linear relationship with the three variables T_min, u, and VPD in Bw climate (Table 3).

Table 3 Equations derived for multivariate linear regression models in different climates

Kiafar et al. (2017) compared gene expression programming (GEP) models with empirical models, which led to the development of new equations for each climatic region. Their research showed that the models developed for very dry (Bw) and humid (C and D) climates gave more accurate results. Also, based on their results, models that used more climatic variables estimated ET_Ref more accurately. Other researchers have reported similar results in the same climates (Traore et al. 2010; Ozkan et al. 2011; Huo et al. 2012).

To evaluate the relationship between ET_Ref and the meteorological variables, a t test at the 95% confidence level (α = 0.05) was used. The linear relationship between ET_Ref and an independent variable in the multivariate regression is significant at the 95% level if the t statistic is greater than 1.96 or less than −1.96 (Mattar and Alazba 2019). Figure 3 shows the t statistic values for all stations studied, grouped by climate according to the Köppen method.
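The significance test described above can be sketched for an ordinary least-squares fit as follows. The function name and the synthetic data are assumptions made for illustration: a VPD-like input drives the target, while a second input is pure noise. The t statistic of each coefficient is its estimate divided by its standard error, and the 1.96 threshold is the one quoted in the text.

```python
import numpy as np


def coefficient_t_stats(X, y):
    """Return t statistics for [intercept, b1, ..., bm] of an OLS fit."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), X])
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coeffs
    dof = design.shape[0] - design.shape[1]          # N minus number of coefficients
    sigma2 = residuals @ residuals / dof             # residual variance estimate
    covariance = sigma2 * np.linalg.inv(design.T @ design)
    std_err = np.sqrt(np.diag(covariance))
    return coeffs / std_err


# Synthetic 360-month example (values are illustrative, not station data).
rng = np.random.default_rng(1)
vpd = rng.uniform(0.5, 4.0, size=360)
unrelated = rng.normal(size=360)                     # has no effect on the target
et = 20.0 + 30.0 * vpd + rng.normal(scale=5.0, size=360)

t_stats = coefficient_t_stats(np.column_stack([vpd, unrelated]), et)
significant = np.abs(t_stats) > 1.96                 # threshold quoted in the text
print(np.round(t_stats, 2), significant)
```

The printed flags indicate which coefficients pass the threshold; an unrelated input is expected to fail it in most random draws, although about 5% of draws will pass by chance at this level.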

According to Fig. 3, in Bw climate the two variables R_a and R_s had no significant linear relationship with ET_Ref at any station at the 95% confidence level. Also in this climate, T_min had a significant relationship with ET_Ref only at the three stations of Bandar Abbas, Bushehr, and Yazd; the relationship was not significant at the four stations of Abadan, Ahvaz, Bam, and Zabol. In Bs climate (Fig. 3b), only the three variables T_min, u, and VPD had a significant linear relationship with ET_Ref at the 95% confidence level in most stations. The other terms, including the intercept, T_max, R_a, R_s, and RH, had a significant linear relationship with ET_Ref at the 95% level in two of the seven stations in this climate (Fig. 3b).

Figure 3c shows a significant linear relationship at the 95% level between ET_Ref and most meteorological variables in the stations with C climate. In most stations with D climate, there is no significant linear relationship between ET_Ref and the R_a, R_s, and RH variables (Fig. 3d). Overall, Fig. 3 shows a significant linear relationship between ET_Ref and the VPD, u, and T_min variables in 32, 28, and 25 of the 33 studied stations (α = 0.05), respectively. The variables least often significant were R_a and R_s, in only 8 and 10 of the studied stations, respectively (Fig. 3). According to the results of Shiri et al. (2013) and Khanmohammadi et al. (2018), models that used the RH variable showed better results in C and D climates, because the effect of RH on ET_Ref is greater in these climates. Also, based on the results of Yassin et al. (2016), the Irmak model (radiation-based) was more accurate in Bw and Bs climates. RMSE values for the Irmak model were 25.78% lower in arid climates than in humid climates.

Table 4 shows the average values of the RMSE and R² criteria for the empirical equations and MLR models, with stations grouped by Köppen climate. The highest and lowest accuracy in all climates were obtained with the MLR6 model and the empirical Hargreaves-Samani (HS) equation, respectively. The Irmak method had the best results in all climates in comparison with the other empirical equations. After the MLR6 model, which was estimated based on the linear relationship between ET_Ref and all climatic variables, the MLR4 model (based on minimum and maximum temperature, relative humidity, and wind speed) was the most accurate in Bw and Bs climates. In C and D climates, the MLR3 model (based on minimum and maximum temperatures and solar and extraterrestrial radiation) showed the best estimates of ET_Ref among the radiation-based models. This can be explained by the higher correlation of ET_Ref with the radiation variables in C and D climates (Fig. 2 c and d).

Table 4 RMSE and R² criteria for different models in different climates

Based on the results of this study, a significant linear relationship between ET_Ref and the VPD, u, and T_min variables was found in 32, 28, and 25 stations, respectively, and in most semiarid (Bs) stations these three variables were the only ones with a significant linear relationship with ET_Ref. In C climate, however, the linear relationship between ET_Ref and most meteorological variables was significant.

Furthermore, according to Table 4, the simplest regression model (i.e., MLR1) estimated ET_Ref more accurately than all of the empirical equations. In terms of RMSE, the lowest error values were reported for C climate, ranging from 13.3 to 22.0 mm month⁻¹. The highest R² values were also found for this climate, at around 0.88–0.96 (Table 4).

The results of Shiri et al. (2012) showed that the HS method is less accurate in estimating daily reference evapotranspiration in the north of Spain than the Priestley-Taylor empirical method, gene expression programming (GEP), and adaptive neuro-fuzzy inference system (ANFIS) methods. Their results corroborate the present study, in which the HS method had the lowest accuracy in all four climates of Iran.

Figure 4 shows the mean of the estimated ET_Ref values for the best empirical method (Irmak) and the best regression model (MLR6) in different climates, together with the NSE and MAE criteria. NSE values for the MLR6 model were 0.97 in all climates, which indicates the high accuracy of this model in estimating ET_Ref at all stations in the different climates of the country.

The results also showed that the highest and lowest accuracy in all climates were obtained with the MLR6 model and the empirical Hargreaves-Samani (HS) equation, respectively. Of the three empirical equations, the Irmak method showed the best results in all climates. After the MLR6 regression model, which was estimated based on the linear relationship between ET_Ref and all climatic variables, the MLR4 model showed the best estimates of ET_Ref in Bw and Bs climates, and the MLR3 model did so in C and D climates. Another noteworthy point was the high accuracy of the simplest regression model (i.e., MLR1) in estimating ET_Ref compared to all empirical equations.

Also, Fig. 4 shows that in terms of NSE and MAE criteria, the Irmak empirical method is more accurate in arid and semiarid climates (Bw and Bs) than in humid climates (C and D).

According to the SI value, the performance of a model can be divided into four levels, from excellent to poor. Figure 5 shows the SI values at different stations for all empirical equations and MLR models. The MLR6 model had excellent estimates (SI < 0.1) in 10 stations: Bushehr and Yazd (Bw); Zahedan, Kerman, and Isfahan (Bs); Shiraz, Gorgan, Birjand, and Qazvin (C); and Zanjan (D), and in this regard it had the highest accuracy in different climates. Next, the MLR3 model had the best estimate of ET_Ref in two stations in Bs climate (Zahedan and Kerman), three stations in C climate (Gorgan, Birjand, and Qazvin), and one station in D climate (Zanjan). In contrast, the HS equation had the highest SI values in the four stations of Ahwaz, Isfahan, Birjand, and Rasht, where its ET_Ref estimates were poor (SI > 0.3). In general, Fig. 5 shows that, except for the HS equation, the accuracy of the other methods in most of the studied stations was good in terms of SI (0.1 < SI < 0.2).

Figure 5 shows that the MLR6 model gives the best estimate of ET_Ref in almost all of the studied stations across the different climates. However, in a few stations such as Arak, a model other than MLR6 gave better ET_Ref estimates based on the scatter index. According to Fig. 5, the Irmak method was the most accurate of the three studied empirical equations in most stations in the different climates. Unes et al. (2020) likewise found that radiation-based empirical equations (Turc and Irmak) performed better than other empirical equations, which agrees with our results for the Irmak empirical equation (radiation-based).

4 Conclusion

Based on the results of the study, the MLR6 model was developed as the best multivariate linear regression model, with the minimum error between ET_Ref-PMF56 and the modeled ET_Ref. The MLR6 model showed a significant linear relationship between ET_Ref as the dependent variable and the meteorological variables as independent variables. The model had excellent estimates (SI < 0.1) in 10 stations: Bushehr and Yazd (Bw); Zahedan, Kerman, and Isfahan (Bs); Shiraz, Gorgan, Birjand, and Qazvin (C); and Zanjan (D), and in this regard it had the highest accuracy of estimation in different climates. Therefore, the development of multivariate linear regression models provided the necessary preconditions for evaluating the performance of the empirical ET_Ref equations in the study.

Generally, the MLR6 model shows efficient results under various climatic conditions. However, this method requires intensive data, and in developing countries many climatic variables are not readily available for all weather stations. The available data are also not always of reliable quality. There is therefore a need to develop approaches that can estimate ET_Ref precisely with the limited climatic data available. The study demonstrated that all MLR models (even MLR1, with only T_min and T_max as input data) gave reliable estimates of ET_Ref in different climates.

References

Abdullah SS, Malek MA, Abdullah NS, Kisi O, Yap KS (2015) Extreme learning machines: a new approach for prediction of reference evapotranspiration. Journal of Hydrology 527:184–195
Article Google Scholar
Allen RG, Pereira LS, Raes D, Smith M (1998) Crop evapotranspiration-guidelines for computing crop water requirements-FAO Irrigation and drainage paper 56 Fao. Rome 300:D05109
Google Scholar
Chen Z, Zhu Z, Jiang H, Sun S (2020) Estimating daily reference evapotranspiration based on limited meteorological data using deep learning and classical machine learning methods. Journal of Hydrology 591:125286
Article Google Scholar
Dalton J (1802) Experimental essays on the constitution of mixed gases Manchester Literary and Philosophical Society Memo 5:535–602
Google Scholar
dos Santos Farias DB, Althoff D, Rodrigues LN, Filgueiras R (2020) Performance evaluation of numerical and machine learning methods in estimating reference evapotranspiration in a Brazilian agricultural frontier Theoretical and Applied Climatology:1–12
Ehteram M et al (2019) An improved model based on the support vector machine and cuckoo algorithm for simulating reference evapotranspiration PloS one 14:e0217499
Google Scholar
Gavili S, Sanikhani H, Kisi O, Mahmoudi MH (2018) Evaluation of several soft computing methods in monthly evapotranspiration modelling. Meteorological Applications 25:128–138
Article Google Scholar
Güçlü YS, Subyani AM, Şen Z (2017) Regional fuzzy chain model for evapotranspiration estimation Journal of hydrology 544:233–241
Google Scholar
Hargreaves GH, Samani ZA (1985) Reference crop evapotranspiration from temperature. Applied engineering in agriculture 1:96–99
Article Google Scholar
Huang G, Wu L, Ma X, Zhang W, Fan J, Yu X, Zeng W, Zhou H (2019) Evaluation of CatBoost method for prediction of reference evapotranspiration in humid regions. J Hydrol 574:1029–1041
Article Google Scholar
Huo Z, Feng S, Kang S, Dai X (2012) Artificial neural network models for reference evapotranspiration in an arid area of northwest China Journal of arid environments 82:81–90
Google Scholar
Irmak S, Irmak A, Allen R, Jones J (2003) Solar and net radiation-based equations to estimate reference evapotranspiration in humid climates Journal of irrigation and drainage engineering 129:336–347
Google Scholar
Jing W et al. (2019) Implementation of evolutionary computing models for reference evapotranspiration modeling: short review, assessment and possible future research directions Engineering applications of computational fluid mechanics 13:811-823
Karbasi M (2018) Forecasting of multi-step ahead reference evapotranspiration using wavelet-Gaussian process regression model. Water Resour Manag 32:1035–1052
Khanmohammadi N, Rezaie H, Montaseri M, Behmanesh J (2018) The application of multiple linear regression method in reference evapotranspiration trend calculation. Stochastic Environmental Research and Risk Assessment 32:661–673
Kiafar H, Babazadeh H, Marti P, Kisi O, Landeras G, Karimi S, Shiri J (2017) Evaluating the generalizability of GEP models for estimating reference evapotranspiration in distant humid and arid locations. Theoretical and Applied Climatology 130:377–389
Kim S, Singh VP, Seo Y, Kim HS (2014) Modeling nonlinear monthly evapotranspiration using soft computing and data reconstruction techniques. Water Resources Management 28:185–206
Kisi O, Heddam S (2019) Evaporation modelling by heuristic regression approaches using only temperature data. Hydrological Sciences Journal 64:653–672
Kisi O, Sanikhani H, Zounemat-Kermani M, Niazi F (2015) Long-term monthly evapotranspiration modeling by several data-driven methods without climatic data. Computers and Electronics in Agriculture 115:66–77
Köppen W (1931) Grundriss der Klimakunde. de Gruyter
Li M-F, Tang X-P, Wu W, Liu H-B (2013) General models for estimating daily global solar radiation for different solar radiation zones in mainland China. Energy Conversion and Management 70:139–148
Liu X, Xu C, Zhong X, Li Y, Yuan X, Cao J (2017) Comparison of 16 models for reference crop evapotranspiration against weighing lysimeter measurement. Agric Water Manag 184:145–155
Mattar MA, Alazba A (2019) GEP and MLR approaches for the prediction of reference evapotranspiration. Neural Computing and Applications 31:5843–5855
Mehdizadeh S, Behmanesh J, Khalili K (2017) Using MARS, SVM, GEP and empirical equations for estimation of monthly mean reference evapotranspiration. Computers and Electronics in Agriculture 139:103–114
Monteith J (1965) The state and movement of water in living organisms. In: 19th Symposia of the Society for Experimental Biology. Cambridge University Press, London, pp 205–234
Ozkan C, Kisi O, Akay B (2011) Neural networks with artificial bee colony algorithm for modeling daily reference evapotranspiration. Irrigation Science 29:431–441
Raza A, Shoaib M, Khan A, Baig F, Faiz MA, Khan MM (2020) Application of non-conventional soft computing approaches for estimation of reference evapotranspiration in various climatic regions. Theoretical and Applied Climatology 139:1459–1477
Reis MM, da Silva AJ, Junior JZ, Santos LDT, Azevedo AM, Lopes ÉMG (2019) Empirical and learning machine approaches to estimating reference evapotranspiration based on temperature data. Computers and Electronics in Agriculture 165:104937
Saggi MK, Jain S (2019) Reference evapotranspiration estimation and modeling of the Punjab Northern India using deep learning. Computers and Electronics in Agriculture 156:387–398
Sanikhani H, Kisi O, Maroufpoor E, Yaseen ZM (2019) Temperature-based modeling of reference evapotranspiration using several artificial intelligence models: application of different modeling scenarios. Theoretical and Applied Climatology 135:449–462
Sharafi S, Karim NM (2020) Investigating trend changes of annual mean temperature and precipitation in Iran. Arabian Journal of Geosciences 13:1–11
Sharafi S, Ramroudi M, Nasiri M, Galavi M, Kamali GA (2016) Role of early warning systems for sustainable agriculture in Iran. Arabian Journal of Geosciences 9:734
Shiri J, Kişi Ö, Landeras G, López JJ, Nazemi AH, Stuyt LC (2012) Daily reference evapotranspiration modeling by using genetic programming approach in the Basque Country (Northern Spain). Journal of Hydrology 414:302–316
Shiri J, Nazemi AH, Sadraddini AA, Landeras G, Kisi O, Fard AF, Marti P (2013) Global cross-station assessment of neuro-fuzzy models for estimating daily reference evapotranspiration. Journal of Hydrology 480:46–57
Shiri J, Marti P, Karimi S, Landeras G (2019) Data splitting strategies for improving data driven models for reference evapotranspiration estimation among similar stations. Computers and Electronics in Agriculture 162:70–81
Shiri J, Zounemat-Kermani M, Kisi O, Mohsenzadeh Karimi S (2020) Comprehensive assessment of 12 soft computing approaches for modelling reference evapotranspiration in humid locations. Meteorol Appl 27:e1841
Tabari H, Kisi O, Ezani A, Talaee PH (2012) SVM, ANFIS, regression and climate based models for reference evapotranspiration modeling using limited climatic data in a semi-arid highland environment. Journal of Hydrology 444:78–89
Tabari H, Martinez C, Ezani A, Talaee PH (2013) Applicability of support vector machines and adaptive neurofuzzy inference system for modeling potato crop evapotranspiration. Irrigation Science 31:575–588
Tikhamarine Y, Malik A, Kumar A, Souag-Gamane D, Kisi O (2019) Estimation of monthly reference evapotranspiration using novel hybrid machine learning approaches. Hydrological Sciences Journal 64:1824–1842
Traore S, Wang Y-M, Kerh T (2010) Artificial neural network for modeling reference evapotranspiration complex process in Sudano-Sahelian zone. Agricultural Water Management 97:707–714
Unes F, Kaya YZ, Mamak M (2020) Daily reference evapotranspiration prediction based on climatic conditions applying different data mining techniques and empirical equations. Theoretical and Applied Climatology 141:763–773
Yassin MA, Alazba AA, Mattar MA (2016) Comparison between gene expression programming and traditional models for estimating evapotranspiration under hyper arid conditions. Water Resources 43:412–427
Yirga SA (2019) Modelling reference evapotranspiration for Megecha catchment by multiple linear regression. Modeling Earth Systems and Environment 5:471–477
Zhang D, Lin J, Peng Q, Wang D, Yang T, Sorooshian S, Liu X, Zhuang J (2018) Modeling and simulating of reservoir operation using the artificial neural network, support vector regression, deep learning algorithm. J Hydrol 565:720–736
Zhu B, Feng Y, Gong D, Jiang S, Zhao L, Cui N (2020) Hybrid particle swarm optimization with extreme learning machine for daily reference evapotranspiration prediction from limited climatic data. Computers and Electronics in Agriculture 173:105430

Author information

Authors and Affiliations

Department of Environment Science and Engineering, Arak University, Arak, Iran
Saeed Sharafi
Department of Water Science and Engineering, Arak University, Arak, Iran
Mehdi Mohammadi Ghaleni

Corresponding author

Correspondence to Saeed Sharafi.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Table 5 The values of t stat and p value of independent variables in multivariate linear regression analysis for synoptic stations of Iran

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Sharafi, S., Ghaleni, M.M. Evaluation of multivariate linear regression for reference evapotranspiration modeling in different climates of Iran. Theor Appl Climatol 143, 1409–1423 (2021). https://doi.org/10.1007/s00704-020-03473-0

Received: 07 September 2020
Accepted: 16 November 2020
Published: 07 January 2021
Issue Date: February 2021
DOI: https://doi.org/10.1007/s00704-020-03473-0

1 Introduction