Machine learning techniques in estimation of eggplant crop evapotranspiration

Cemek, Bilal; Tasan, Sevda; Canturk, Aslıhan; Tasan, Mehmet; Simsek, Halis

doi:10.1007/s13201-023-01942-1

Original Article
Open access
Published: 22 May 2023

Volume 13, article number 136, (2023)

Applied Water Science

Bilal Cemek¹,
Sevda Tasan¹,
Aslıhan Canturk²,
Mehmet Tasan² &
…
Halis Simsek³

Abstract

This study predicted the daily evapotranspiration of eggplant (Solanum melongena L.) under full and deficit irrigation in the Bafra district of Samsun province, Turkey, using machine learning methods. Artificial neural networks (ANN), deep neural networks (DNN), M5 model tree (M5Tree), random forest (RF), support vector machine (SVM), k-nearest neighbor (kNN), and adaptive boosting (AB) were investigated as machine learning approaches. Three evapotranspiration quantities served as the model outputs: (i) the reference evapotranspiration (ET_o), obtained from the Food and Agriculture Organization-56 Penman–Monteith equation; (ii) the crop evapotranspiration (ET_c), calculated by multiplying the reference evapotranspiration by the crop coefficient (K_c); and (iii) the actual evapotranspiration (ET_a), measured using the soil water balance between successive soil water measurements. The models' performance in ET_o estimation was highest when maximum and minimum temperature (T_max and T_min), wind speed (u₂), average relative humidity (RH_avg), solar radiation (R_s), and day of the year were used as inputs. The best performance was obtained with the ANN model, with a coefficient of determination (R²) of 0.984, a mean absolute error (MAE) of 0.098 mm d⁻¹, a root-mean-square error (RMSE) of 0.153 mm d⁻¹, and a Nash–Sutcliffe efficiency of 0.983. The models' performance in ET_c estimation improved significantly when leaf area index (LAI) and crop height (h_c) were added to the climate parameters (MAE and RMSE decreased by 22.6 and 23.2%, respectively). ET_c was also estimated with sufficient accuracy from plant traits (h_c and LAI) and average temperature (T_avg). The best statistical performance in estimating ET_a was obtained by the RF model using the climate parameters T_avg, u₂, RH_avg, and R_s. DNN proved to be the least successful of the seven models in predicting ET_o, ET_c, and ET_a.

Introduction

Evapotranspiration (ET) is the sum of water lost by transpiration from leaf surface and evaporation from soil surface (Allen et al. 1998). ET is a key component of the regional water budget and plays an important role in controlling interactions between the atmosphere, soil, and vegetation (Liu et al. 2013). Therefore, the correct calculation of ET is an important issue for the successful management of water resources in irrigated agriculture. The accurate determination of ET helps to plan and manage the irrigation and drainage systems and increases the irrigation water efficiency (Elbeltagi et al. 2020).

Direct and indirect methods are used to determine crop evapotranspiration. Direct methods include lysimeters, experimental field plots, soil moisture depletion monitoring, and measurement of flow into and out of the basin. Indirect methods consist of the aerodynamic method, the energy balance method, and their combination. Direct methods are costly and time-consuming compared to indirect methods. The crop evapotranspiration (ET_c) estimate is based on the correct calculation of the reference evapotranspiration (ET_o): ET_c is obtained by multiplying ET_o by the crop coefficient (K_c) (Jensen 1968). Many methods have been developed to estimate ET_o from available meteorological data (Hargreaves and Samani 1985; Monteith 1965; Allen et al. 1998; Odili et al. 2023). Temperature-based (Thornthwaite 1948; Blaney and Criddle 1950; Doorenbos and Pruitt 1977; Hargreaves and Samani 1985), radiation-based (Makkink 1957; Turc 1961; Priestly and Taylor 1972), mass transfer-based (Monteith 1965; Penman 1948), and combination-based (the Food and Agriculture Organization (FAO) 56 Penman–Monteith; Allen et al. 1998) models have been developed. The FAO-56 Penman–Monteith method (FAO-56 PM), which incorporates thermodynamic and aerodynamic effects, is more accurate than the other empirical models, and the FAO recommends it as the standard method for assessing ET_o. The major difficulties of the FAO-56 PM method are its requirement for a large amount of meteorological data (Feng et al. 2017; Fan et al. 2018), the high cost of climate measurement equipment, and the difficulty of measuring all the required climate data at every station. Moreover, the calculation of ET can be considered a complex, nonlinear regression process because it depends on a large number of meteorological variables, and it is very difficult to develop empirical models that accurately represent all these processes. Some researchers have suggested that machine learning methods can be used to predict plant water consumption because of their ability to handle nonlinear data (Wang et al. 2017; Mokari et al. 2022).

With the rapid development of machine learning algorithms, techniques such as the support vector machine (SVM) have recently been widely proposed for estimating ET_o, ET_c, and ET_a. However, machine learning models have been used more frequently to estimate ET_o than ET_c and ET_a (Chen et al. 2020; Wu et al. 2021; Dong et al. 2022). Ferreira et al. (2019) used artificial neural network (ANN) and SVM methods to estimate ET_o with limited meteorological data in Brazil. In their models, the meteorological data of the previous days were used as inputs along with the data of the current day. The ANN model with temperature and relative humidity data from the previous four days performed better than the SVM. Rahimikhoob (2014) compared the performance of the ANN and M5 tree models in estimating the reference evapotranspiration in an arid region and stated that both models were well suited to the study area, while the ANN predicted ET_o better than the M5 tree model. A comparison of CatBoost, RF, and SVM methods for estimating ET_o in a humid environment showed that the CatBoost model had a much lower computational cost and outperformed the other two models when complete input data were available (Huang et al. 2019). The FAO-56 K_c approach has been used to estimate daily and monthly ET_c with different machine learning models (Tang et al. 2018; Granata 2019; Yamaç and Todorovic 2020; Gong et al. 2021). Abrishami et al. (2019) estimated the ET_c of wheat and maize with the ANN model. Meteorological variables, leaf area index (LAI), and crop height were used as inputs to the models.

Abrishami et al. (2019) indicated that ANNs with two hidden layers are effective in estimating the ET_c of wheat and maize. Han et al. (2021) compared the back-propagation (BP) neural network with the multiple linear regression method to estimate ET_c measured by eddy covariance. The combination of the eddy covariance method and the BP model achieved a higher coefficient of determination (0.87) and accuracy (91.44%). Chen et al. (2020) compared three deep learning methods, namely deep neural network (DNN), temporal convolutional neural network (TCN), and long short-term memory neural network (LSTM), with SVM, RF, and empirical methods. The TCN and LSTM models outperformed the temperature-based empirical equations when temperature-based features were available, and all the proposed deep learning and machine learning models outperformed the radiation- or humidity-based empirical equations when radiation- or humidity-based features were available. Tang et al. (2018) compared genetic algorithm-optimized artificial neural network (GANN) and SVM models to simulate the ET_a in a rainfed cornfield using climate data, LAI, and crop height as the input parameters. In their study, the GANN models performed slightly better than the SVM models. Granata et al. (2020) compared RF, kNN, additive regression of decision stump (ARDS), and multilayer perceptron (MLP) in estimating the ET_a, and obtained the best performance with RF.

There are few studies applying different machine learning models to predict the ET_o, ET_c, and ET_a of eggplant grown under drip irrigation in field conditions in a semi-humid region. In this study, seven machine learning models, namely ANN, deep neural networks (DNN), M5 model tree (M5Tree), support vector machine (SVM), k-nearest neighbors (kNN), random forest (RF), and adaptive boosting (AB), were used to estimate ET_o, ET_c, and ET_a during the eggplant growing period, and their prediction performances were compared. Soil conditions such as volumetric soil water content (SWC), crop characteristics such as crop height (h_c), LAI, and canopy temperature (T_c), and climate parameters such as air temperature, relative humidity, wind speed, and solar radiation were used as inputs. The effects of different combinations of input variables on the performance of the models were also investigated. The study shows that quantities whose field measurement is time-consuming, laborious, and costly can be estimated quickly and at low cost using models with high predictive power. In addition, the effects of soil, crop, and climate variables, which are important parameters in irrigation scheduling, on crop evapotranspiration were evaluated by using combinations of these three groups of variables. This study provides a comprehensive view of ET and should be a useful source of information for engineers and scientists.

Materials and methods

Study area and soil measurements

The study was carried out at the experimental station of the Black Sea Agricultural Research Institute, located in the Central Black Sea Region of Turkey, during the eggplant growing seasons of 2015, 2016, and 2017. The experimental station is located at 41° 36′ 8″ N, 35° 55′ 8″ E (elevation 17 m). The climate is semi-humid, with average annual precipitation, temperature, and relative humidity of 715.5 mm, 14.46 °C, and 75.40%, respectively (MGM, 2018). Cereals (rice, corn, and wheat) and vegetables (eggplant, red pepper, tomato, cabbage, and watermelon) are predominantly grown in the Bafra Plain of Samsun province in Turkey. Soils in the experimental field formed over alluvial deposits.

Three disturbed and undisturbed soil samples were taken from 0–30, 30–60, 60–90, and 90–120 cm depths. Bulk density, soil texture, field capacity, and wilting point moisture content were determined using the methods explained in the literature (Blake and Hartge 1986; Bouyoucos 1951; Meyer and Gee 1999). Available phosphorus content was determined using the method described by Olsen (1954). Available micronutrients were determined using the method of Lindsay and Norvel (1978). Soil texture was clay (36% silt and 45% clay) in the 0–90 cm layer and clay loam (33% silt and 36% clay) in the 90–120 cm layer of the soil profile. Soil bulk density values varied between 1.26 and 1.35 g cm⁻³. Moisture content at field capacity (33 kPa) varied between 0.277 and 0.308 m³/m³, and moisture content at permanent wilting point (1500 kPa) varied between 0.163 and 0.197 m³/m³. Before the drip irrigation system was installed, the infiltration rate of the experimental area was determined as 10 mm/h using a double-ring infiltrometer according to the methodology described by Reynolds et al. (2002).

Crop management

The Aykara F1 eggplant (Solanum melongena L.) cultivar was used as the plant material. Eggplant seeds were sown in seedling trays at the beginning of April in each study year. When the seedlings reached 15–18 cm in height, they were transplanted to the field on May 15 in 2015 and 2017 and on May 22 in 2016, depending on rainfall. Based on laboratory soil analysis results, a total of 100 kg N/ha and 60 kg P₂O₅/ha was applied to the soil. All of the P₂O₅ and half of the N were applied at transplanting, and the remaining half of the N was applied in the later stages of crop growth.

Irrigation treatments and experimental design

Experimental data were obtained during a 3-year field study to investigate the most suitable irrigation program for eggplant in semi-humid climatic conditions. Five different water management strategies were investigated in the study. The irrigation levels were full irrigation (I₁), 75% of full irrigation (I₂), 50% of full irrigation (I₃), 25% of full irrigation (I₄), and rainfed-based irrigation (I₅). The amount of water applied in the full irrigation treatment was calculated as the amount of water required to bring the current moisture in the root zone to the field capacity. The amount of irrigation water applied to other treatments was gradually reduced. However, the actual evapotranspiration values considered in this study were obtained from full irrigation treatment (I₁) applied throughout the entire growing season.

Soil water content (SWC) was determined with a neutron meter (Model 503 DR, Campbell Pacific Nuclear, Martinez, CA), which was calibrated gravimetrically in each layer (0–30, 30–60, 60–90, 90–120, and 120–150 cm) before the season. The moisture content during the growing season was measured with the help of aluminum tubes placed 10 cm away from the crop rows, in the middle of each plot. In addition, the soil moisture content in the first 30 cm depth of the soil surface was determined by the gravimetric method (Köksal et al. 2011).

The total size of the experimental field was 990 m². The experiment was laid out in randomized blocks with three replications, and the treatments were placed in 15 plots. Each plot consisted of 10 rows and was 7 m wide and 7.2 m long. Seedlings were planted with 0.70 m interrow and 0.60 m intra-row spacing. The middle 6 rows were used for measurements to avoid edge effects. In addition, eggplant seedlings were planted between the plots to prevent the advection effect. In the conveyance unit of the drip irrigation system, polyethylene (PE) pipes were used for the main line, manifolds, and laterals. The diameter of the lateral pipes was 16 mm, with 2 L h⁻¹ in-line drippers spaced at 25 cm. One lateral was placed in each crop row. Water application in each treatment was controlled by a separate valve on the manifold pipeline that supplied water to the laterals. The control unit of the system comprised a fertilizer tank, sand-gravel filter, disc filter, manometer, water meter, valves, and fittings. Irrigation was carried out twice a week during the experiment.

Estimation of reference evapotranspiration (ET_o)

The FAO-56 Penman–Monteith (FAO-56 PM) method was used to calculate the daily ET_o. The FAO-56 PM method for the estimation of daily ET_o is described by Eq. (1) (Allen et al. 1998; ASCE-EWRI 2005).

$${\text{ET}}_{{\text{o}}} = \frac{{0.408\Delta \left( {R_{{\text{n}}} - G} \right) + \gamma \frac{900}{{T + 273}}u_{2} \left( {e_{{\text{s}}} - e_{{\text{a}}} } \right)}}{{\Delta + \gamma \left( {1 + 0.34u_{2} } \right)}}$$

(1)

where ET_o is the daily reference evapotranspiration (mm/d); G is the soil heat flux density (MJ/m² d); R_n is the net radiation at the crop surface (MJ/m² d); T is the mean daily air temperature at 2 m height (°C); u₂ is the wind speed at 2 m height (m/s); e_s is the saturation vapor pressure (kPa); e_a is the actual vapor pressure (kPa); e_s–e_a is the saturation vapor pressure deficit (kPa); Δ is the slope of the vapor pressure curve (kPa/°C); and γ is the psychrometric constant (kPa/°C).
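As a rough illustration of Eq. (1), the sketch below evaluates the FAO-56 PM expression for one day using the symbols defined above. The function name `fao56_pm_eto` and the sample inputs are assumptions made for illustration; they are not values from this study, and Δ, γ, R_n, G, e_s, and e_a are assumed to have been computed beforehand.

```python
def fao56_pm_eto(delta, gamma, r_n, g, t_mean, u2, e_s, e_a):
    """Daily reference evapotranspiration (mm/d) following Eq. (1).

    delta  : slope of the vapor pressure curve (kPa/degC)
    gamma  : psychrometric constant (kPa/degC)
    r_n, g : net radiation and soil heat flux density (MJ/m2 d)
    t_mean : mean daily air temperature at 2 m (degC)
    u2     : wind speed at 2 m (m/s)
    e_s    : saturation vapor pressure (kPa)
    e_a    : actual vapor pressure (kPa)
    """
    radiation_term = 0.408 * delta * (r_n - g)
    aerodynamic_term = gamma * (900.0 / (t_mean + 273.0)) * u2 * (e_s - e_a)
    return (radiation_term + aerodynamic_term) / (delta + gamma * (1.0 + 0.34 * u2))


if __name__ == "__main__":
    # Illustrative inputs only (hypothetical day, not data from the experiment).
    eto = fao56_pm_eto(delta=0.122, gamma=0.0666, r_n=13.28, g=0.14,
                       t_mean=16.9, u2=2.078, e_s=1.997, e_a=1.409)
    print(f"ET_o = {eto:.2f} mm/d")
```

Keeping the radiation and aerodynamic terms separate makes it easy to see which of the two drives ET_o on a given day.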

Crop evapotranspiration (ET_c) and crop coefficient (K_c)

The crop coefficient approach was used to calculate ET_c. The ET_c in the FAO-56 can be estimated by multiplying the ET_o value by the single crop coefficient (K_c) (Eq. 2).

$${\text{ET}}_{{\text{c}}} = K_{{\text{c}}} \times {\text{ET}}_{{\text{o}}}$$

(2)

The K_c values of FAO-56 method were adapted considering the climate data. The $K_{{{\text{c}}_{{{\text{ini}}}} }}$ was also determined according to FAO-56 (Allen et al. 1998). The $K_{{{\text{c}}_{{{\text{mid}}}} }}$ and $K_{{{\text{c}}_{{{\text{end}}}} }}$ values were adapted to the region using Eqs. 3 and 4.

$$K_{{{\text{c}}_{{{\text{mid}}}} }} = K_{{{\text{c}}_{{{\text{mid-FAO}}56}} }} + \left[ {0.04\left( {u_{2} - 2} \right) - 0.004\left( {{\text{RH}}_{\min } - 45} \right)} \right]\left[ \frac{h}{3} \right]^{0.3}$$

(3)

$$K_{{{\text{c}}_{{{\text{end}}}} }} = K_{{{\text{c}}_{{{\text{end-FAO}}56}} }} + \left[ {0.04\left( {u_{2} - 2} \right) - 0.004\left( {{\text{RH}}_{\min } - 45} \right)} \right]\left[ \frac{h}{3} \right]^{0.3}$$

(4)

where $K_{{{\text{c}}_{{{\text{mid-FAO}}56}} }}$ is the mid-season K_c value in FAO-56; $K_{{{\text{c}}_{{{\text{end-FAO}}56}} }}$ is the late-season K_c value in FAO-56; u₂ is the wind speed measured at 2 m height (m/s); RH_min is the lowest mean relative humidity in the relevant period (%); and h is the average crop height in the relevant period (m). The length of the phenological stages was recorded regularly during the experiment. In 2015, the initial stage lasted 31 days, the development stage 40 days, the mid-season stage 39 days, and the late-season stage 40 days. In 2016, the total growing season was 146 days, with stage lengths of 24, 39, 39, and 40 days. In 2017, the total growing season was 140 days (30, 38, 40, and 28 days). The $K_{{{\text{c}}_{{{\text{FAO-}}56}} }}$ values were then determined for each period from the relative humidity, wind speed, and crop height values. The $K_{{{\text{c}}_{{{\text{ini}}}} }}$, $K_{{{\text{c}}_{{{\text{mid}}}} }}$, and $K_{{{\text{c}}_{{{\text{end}}}} }}$ values in 2015 were 0.80, 1.07, and 0.88, respectively. Similarly, the values were 0.65, 1.08, and 0.85 in 2016 and 0.75, 1.04, and 0.87 in 2017.
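The adjustment in Eqs. (3) and (4) and the product in Eq. (2) can be sketched as follows. The function names `adjust_kc` and `crop_et` are hypothetical, and the tabulated K_c, wind speed, humidity, and crop height passed in the example are placeholders rather than values reported in this study.

```python
def adjust_kc(kc_fao56, u2, rh_min, h):
    """Adjust a tabulated FAO-56 K_c (mid or end) to local conditions, Eqs. (3)-(4).

    kc_fao56 : tabulated K_c value for the stage
    u2       : wind speed at 2 m height (m/s)
    rh_min   : minimum relative humidity for the stage (%)
    h        : average crop height for the stage (m)
    """
    climate_correction = 0.04 * (u2 - 2.0) - 0.004 * (rh_min - 45.0)
    return kc_fao56 + climate_correction * (h / 3.0) ** 0.3


def crop_et(kc, eto):
    """Crop evapotranspiration from Eq. (2): ET_c = K_c x ET_o (mm/d)."""
    return kc * eto


if __name__ == "__main__":
    # Placeholder inputs chosen only to exercise the formulas.
    kc_mid = adjust_kc(kc_fao56=1.05, u2=2.5, rh_min=50.0, h=0.9)
    print(f"adjusted K_c mid = {kc_mid:.3f}")
    print(f"ET_c = {crop_et(kc_mid, eto=5.0):.2f} mm/d")
```

Under the standard conditions u₂ = 2 m/s and RH_min = 45%, the correction term vanishes and the tabulated value is returned unchanged.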

Actual evapotranspiration (ET_a)

The ET_a was determined by direct measurement, using the soil water balance method, which is based on the principle of conservation of mass. The water balance method assumes that the difference between the amount of water entering and leaving a certain soil volume in a certain period of time is equal to the change in the soil water volume in the same time interval. The soil water balance equation used to estimate the daily crop evapotranspiration from the individual components of the balance is given in Eq. (5) (Jensen et al. 1990; Allen et al. 1998; Evett 2002).

$$P + I - D_{r} - R_{{\text{o}}} - {\text{ET}} \pm \left( {S_{{\text{e}}} - S_{{\text{b}}} } \right) = 0$$

(5)

where P is precipitation (mm); I is the irrigation water applied (mm); D_r is the drainage (mm); R_o is the runoff (mm); ET is actual evapotranspiration (mm); S_e is the soil moisture content at the end of the time interval between two irrigations (mm); and S_b is the soil moisture content at the beginning of the time interval between two irrigations (mm). The actual evapotranspiration value obtained with Eq. (5) is referred to as ET_a throughout the paper. The effective root depth of eggplant was taken as 60 cm in the calculations. Drip irrigation was used in the study. In the I₁ treatment (full irrigation), the soil water deficit was replenished to field capacity, and the irrigation water application rate was lower than the infiltration rate. Therefore, runoff did not occur and water did not drain below 60 cm during irrigations. There was a 3-m-deep drainage system in the study area, and groundwater close to the crop root zone was not observed during the experiment.
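Solving Eq. (5) for ET gives a one-line calculation, sketched below. The sign of the storage term is an assumption here: a decrease in root-zone storage between two measurements (S_e < S_b) is counted as water consumed. The function name and the sample numbers are hypothetical.

```python
def et_from_water_balance(p, i, s_b, s_e, d_r=0.0, r_o=0.0):
    """Actual evapotranspiration (mm) over one interval from Eq. (5).

    p, i     : precipitation and irrigation during the interval (mm)
    s_b, s_e : root-zone soil water at the beginning and end of the interval (mm)
    d_r, r_o : drainage and runoff (mm); both default to zero, as reported
               for the full irrigation treatment
    Assumed sign convention: ET = P + I - D_r - R_o - (S_e - S_b).
    """
    return p + i - d_r - r_o - (s_e - s_b)


def daily_et(et_interval, n_days):
    """Average daily ET_a (mm/d) over an interval of n_days."""
    return et_interval / n_days


if __name__ == "__main__":
    # Hypothetical interval: 5 mm rain, 20 mm irrigation, storage drops by 10 mm.
    et = et_from_water_balance(p=5.0, i=20.0, s_b=150.0, s_e=140.0)
    print(f"ET over interval = {et:.1f} mm, daily = {daily_et(et, 4):.2f} mm/d")
```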

Measurement of crop height (h_c) and leaf area index (LAI)

The h_c was measured at 15-day intervals during the eggplant growing season: eight times in each of 2015 and 2016 and five times in 2017. The distance from the soil surface to the top of the crop was measured with a ruler on 3 marked plants in each plot, and the average of these values was recorded as the crop height.

The LAI was measured directly in the field. Destructive harvesting was carried out 4 times in 2015 (22 June, 15 July, 12 August, and 10 September), 4 times in 2016 (26 July, 09 and 31 August, and 08 September), and 7 times in 2017 (18 July, 01 and 15 August, 08 and 26 September, 06 and 13 October). Leaves were removed from the plant for each treatment, placed on plain white A4 paper, and transferred to the computer as a scanned image. The images were then digitized in AutoCAD (Version 2016, Autodesk, San Rafael, CA, USA), and the total leaf area of each treatment was determined. The LAI was calculated as the ratio of total leaf area to the canopy projection area measured at 12:00. The LAI is a dimensionless quantity that characterizes crop canopies (LAI = leaf area/canopy projection area).

Calculation of growing degree days (GDD)

The growing degree day (GDD) is a weather-based indicator to assess crop development. The GDD is defined as the mean daily temperature (average of daily maximum and minimum temperatures) above a certain base temperature accumulated on a daily basis over a period of time. The base temperature varies among crops, and the value is derived from the growth habits of each specific crop. In this study, the base temperature was considered as 10 °C for eggplant (Fereres et al. 2012). The GDD is calculated using Eq. (6):

$${\text{For}}\;T_{{{\text{base}}}} \le \left( {\left( {T_{\max } + T_{\min } } \right)/2} \right) \le {\text{TG}}_{\max } \Rightarrow {\text{GDD}} = \sum\limits_{i = 1}^{n} {\left[ {\left( {\left( {T_{\max } + T_{\min } } \right)/2} \right) - T_{{{\text{base}}}} } \right]}$$

(6)

where n is the number of days from seedling planting to harvest; T_base is the base temperature for eggplant; T_max is the daily maximum air temperature; T_min is the daily minimum air temperature; and TG_max is the average air temperature at which crop growth stops.
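A minimal sketch of the accumulation in Eq. (6) is given below. Treating days whose mean temperature falls outside the range T_base to TG_max as contributing zero is an assumption, since the text does not say how such days are handled; the function name and the sample temperatures are likewise illustrative.

```python
def growing_degree_days(t_max, t_min, t_base=10.0, tg_max=None):
    """Accumulated growing degree days (degC d) following Eq. (6).

    t_max, t_min : sequences of daily maximum and minimum air temperature (degC)
    t_base       : base temperature (10 degC is the value used for eggplant)
    tg_max       : mean temperature above which growth stops; None disables the cap
    Assumption: days outside [t_base, tg_max] add nothing to the total.
    """
    total = 0.0
    for hi, lo in zip(t_max, t_min):
        t_mean = (hi + lo) / 2.0
        if t_mean < t_base:
            continue
        if tg_max is not None and t_mean > tg_max:
            continue
        total += t_mean - t_base
    return total


if __name__ == "__main__":
    # Three hypothetical days with daily means of 25, 15, and 8 degC.
    print(growing_degree_days([30.0, 20.0, 12.0], [20.0, 10.0, 4.0]))
```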

Canopy temperature measurements (T_c) and vapor pressure deficit (VPD)

Thermal images were captured using a thermal imager (Testo 875-2i, Testo, Germany) measuring in the spectral range of 8–14 µm, with a 32° × 23°/0.1 m lens, a 160 × 120 pixel detector, 3.3 mrad geometric resolution, and ≤ 0.08 °C thermal sensitivity. Canopy temperatures obtained with thermal images were measured once in two irrigations between 10:00 and 14:00 before the irrigation. The thermal camera was positioned in four directions, perpendicular and parallel to the crop row, covering the crop and the soil in view. The average temperature of the top leaves in the four directions was recorded as the canopy temperature of the eggplant. Average canopy temperature was determined after masking all other elements in the image, such as soil. Since the temperature of the crop surface was measured, the emissivity of the instrument was set to 0.9 (Jones et al. 2002). Wet and dry leaf surfaces were used as the reference surfaces (Leinonen and Jones 2004). The difference between canopy and air temperature (T_c–T_a) was used as an input for the estimation of the ET_a.

Vapor pressure deficit (VPD) is a climatic variable related to ambient temperature and relative humidity. It is the difference between the saturation vapor pressure and the actual vapor pressure at a given temperature, and it is an important indicator of atmospheric water demand for plants (Rawson et al. 1977). An increase in VPD increases the atmospheric demand for water. Although ET is expected to increase with atmospheric demand, plants can reduce ET by closing their stomata in response to increased VPD (Massmann et al. 2019). Changes in VPD directly or indirectly affect plant water consumption; therefore, VPD was also used as an input to estimate the ET_a.
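The text does not state how VPD was computed, so the sketch below is only one common way to obtain it from air temperature and relative humidity, assuming the saturation vapor pressure expression given in FAO-56 (Allen et al. 1998). The function names are hypothetical.

```python
import math


def saturation_vapor_pressure(t_air):
    """Saturation vapor pressure (kPa) at air temperature t_air (degC),
    using the FAO-56 expression e(T) = 0.6108 * exp(17.27 T / (T + 237.3))."""
    return 0.6108 * math.exp(17.27 * t_air / (t_air + 237.3))


def vapor_pressure_deficit(t_air, rh):
    """VPD (kPa) as saturation minus actual vapor pressure.

    Assumption: actual vapor pressure is taken as RH/100 times the
    saturation value at the same temperature.
    """
    e_s = saturation_vapor_pressure(t_air)
    e_a = e_s * rh / 100.0
    return e_s - e_a


if __name__ == "__main__":
    # Hypothetical conditions: 25 degC and 50% relative humidity.
    print(f"VPD = {vapor_pressure_deficit(25.0, 50.0):.3f} kPa")
```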

Machine learning models used in estimation of evapotranspiration

ANN is a mathematical model that emulates the human brain's ability to learn from experience (Haykin 1998). The ANN method can learn and predict complex processes with high accuracy. Several types of ANN are available to model various agricultural and environmental problems. In this study, a feedforward MLP with an input layer, a hidden layer, and an output layer was used to model evapotranspiration during the eggplant growing period. Hornik et al. (1989) suggested that a single hidden layer may be sufficient for accurate model prediction. The number of neurons in the hidden layer was determined by a trial-and-error approach. The neural network was trained using the Levenberg–Marquardt training algorithm. Tangent sigmoid (tansig) and linear (purelin) activation functions were used in the hidden and output layers, respectively.
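To make the network structure concrete, the following sketch computes the forward pass of a single-hidden-layer MLP with a tangent sigmoid activation on the hidden neurons and a linear output. It assumes that tansig acts on the hidden layer, it omits Levenberg–Marquardt training entirely, and the weights in the example are arbitrary placeholders rather than fitted values.

```python
import math


def mlp_forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """Forward pass of a one-hidden-layer MLP with a single output.

    inputs         : list of normalized input values
    hidden_weights : one weight list per hidden neuron (same length as inputs)
    hidden_biases  : one bias per hidden neuron
    output_weights : one weight per hidden neuron
    output_bias    : bias of the linear output neuron
    """
    # tansig is mathematically the hyperbolic tangent.
    hidden = [
        math.tanh(sum(w * x for w, x in zip(weights, inputs)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    # purelin: the output is a plain weighted sum of the hidden activations.
    return sum(w * h for w, h in zip(output_weights, hidden)) + output_bias


if __name__ == "__main__":
    # Two inputs, two hidden neurons, arbitrary placeholder weights.
    estimate = mlp_forward([0.4, 0.7],
                           hidden_weights=[[0.5, -0.2], [0.1, 0.3]],
                           hidden_biases=[0.0, 0.1],
                           output_weights=[0.8, -0.4],
                           output_bias=0.2)
    print(estimate)
```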

Deep learning has recently been used for agricultural applications in a variety of fields (Manikumari et al. 2022; Khaki et al. 2020). Dechter (1986) pioneered the use of the deep neural network (DNN), a machine learning method that follows a technique similar to the ANN model but uses more hidden layers (Kamilaris and Prenafeta-Boldú 2018). In this study, DNN models with three or four hidden layers and ReLU activation functions were created, and the traditional gradient-descent method was used to minimize the loss. Chen et al. (2020) provided comprehensive details on the DNN model and calculation procedure.

The M5 model tree (M5Tree), introduced by Quinlan (1992), is a subset of machine learning techniques. Decision tree-based methods are among the most well-known machine learning methods. The M5Tree predicts target variables as output in a model with a tree structure based on input data. Detailed information on the M5Tree model can be obtained from Quinlan (1992).

Random forest (RF) is an ensemble learning algorithm that can be used to model complex processes (Breiman 2001). In the RF, each tree depends on a set of random variables, and many regression trees are combined to form an ensemble (the forest). Before running the RF model, the number of trees in the forest and the number of attributes used in building each tree must be set. The accuracy of the estimation model depends primarily on these two parameters (Zhang and Wang 2009).

Support vector machine (SVM), introduced by Vapnik (2013), is a supervised learning model used for classification and regression. In SVM, linear or nonlinear models project the input vectors into a high-dimensional feature space, which allows complex input–output relationships to be defined in a relatively simple way (Wu et al. 2008a). The nonlinear radial basis function (RBF) kernel, which performs better than other kernel functions (Yamaç 2021), was used to estimate evapotranspiration in this study. The accuracy of the prediction model depends on the selection of optimal hyperparameters (C, γ) for the kernel. In this study, parameter C was determined through trial and error. Detailed information on the SVM model can be found in Vapnik (2013).

The k-Nearest neighbor (kNN) method is one of the most basic nonparametric machine learning methods for classification and regression problems (Cover and Hart 1967). Because the kNN algorithm is based on calculating the distances between two points within a data set, a method for calculating this distance is required. The Euclidean distance measure was used to calculate the distance between two points. The optimal choice of k depends on the data used in the model. For this reason, the optimum k value of different models created with different input parameters was different.
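The distance-based prediction described above can be sketched in a few lines. The example assumes an unweighted average of the k nearest training targets, which the text does not specify; the function name and the toy data are hypothetical.

```python
import math


def knn_predict(train_x, train_y, query, k=3):
    """Predict a target value as the mean of the k nearest training targets.

    train_x : list of feature tuples (already normalized)
    train_y : list of target values, one per training sample
    query   : feature tuple for which a prediction is wanted
    Distances are Euclidean; neighbors are weighted equally (an assumption).
    """
    by_distance = sorted(
        (math.dist(features, query), target)
        for features, target in zip(train_x, train_y)
    )
    nearest = by_distance[:k]
    return sum(target for _, target in nearest) / len(nearest)


if __name__ == "__main__":
    # Toy data: two nearby samples and one distant sample.
    x = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
    y = [1.0, 3.0, 10.0]
    print(knn_predict(x, y, query=(0.2, 0.0), k=2))
```

Because the result depends on k, the value would normally be tuned separately for each input combination, as noted above.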

The adaptive boosting (AB) algorithm, proposed by Freund and Schapire (1997), is one of the most widely used ensemble learning methods due to its simplicity, speed, and ease of implementation (Wu et al. 2008a, b). The AB algorithm operates by fitting a primary prediction function to the sum of the original data, calculating a prediction error, and then applying a weighted vector to the data based on the prediction error. The detailed information on the AB model can be found in Freund and Schapire (1997).

In this study, ET_o, ET_c, and ET_a parameters for eggplant grown in semi-humid climatic conditions were estimated by using seven different machine learning methods using different input variables and combinations (climate, soil, and crop properties). The hyperparameter settings of the ANN, DNN, kNN, SVM, M5Tree, RF, AB methods, and the models with different input datasets (Model 1–15) are provided in Table 7 in Appendix.

Selection of input parameters and data normalization

The success of machine learning models is directly related to factors such as the input combination, model structure, basic parameters, and performance criteria. The first step in developing a prediction model is to identify the input variables. Many factors affect plant water consumption, including climate, soil properties, and plant characteristics, and all three groups of factors were used in the study to create a simple and applicable approach for estimating ET_o, ET_c, and ET_a. The most important plant factors affecting ET are crop type, plant growth stage, LAI, and h_c (Liu et al. 2020). Different combinations of input variables were used to achieve the best prediction. The input combinations used to estimate ET_o, ET_c, and ET_a and to evaluate the performance of the seven machine learning models are presented in Table 1.

Table 1 Different input combinations used in the estimation of ET_o, ET_c, and ET_a with different machine learning methods

Full size table

All input and output variables were normalized in the range of 0–1 to meet the requirements of the machine learning models before the training and testing phases using Eq. (7).

$$X_{{{\text{norm}}}} = \frac{{X_{{\text{a}}} - X_{\min } }}{{X_{\max } - X_{\min } }}$$

(7)

where X_norm is the normalized value of a variable; X_a is the measured value of a variable; and X_max and X_min are the measured maximum and minimum values of a variable.
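Eq. (7) amounts to a min-max rescaling of each variable, sketched below together with the inverse mapping needed to express model outputs in their original units. The function names are hypothetical, and raising an error for a constant variable is a design choice of this sketch, not something stated in the text.

```python
def min_max_normalize(values):
    """Scale a sequence of measurements to the range 0-1 following Eq. (7)."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        raise ValueError("cannot normalize a variable with no variation")
    return [(x - x_min) / (x_max - x_min) for x in values]


def denormalize(x_norm, x_min, x_max):
    """Map a normalized value back to the original units."""
    return x_norm * (x_max - x_min) + x_min


if __name__ == "__main__":
    # Hypothetical daily ET values in mm/d.
    et_values = [2.0, 4.0, 6.0]
    print(min_max_normalize(et_values))
```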

The data used in the machine learning models were collected during the eggplant growing seasons of 2015, 2016, and 2017. The data collected during the experiment were pooled and randomly partitioned into 70% for training and 30% for testing, and k-fold cross-validation was used. In the k-fold cross-validation technique, the original dataset was randomly divided into k equally sized subsets (folds). In each round, a single subset was designated as the validation data to evaluate the model performance, and the remaining k − 1 subsets were used as training data. This process was repeated k times, and the average cross-validation error was used as the performance indicator. The k value was set to 10 in this study. Detailed information on this procedure can be obtained from Cemek et al. (2020).
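The k-fold procedure described above can be sketched with the standard library alone. The helper names, the fixed random seed, and the callback-style `fit`, `predict`, and `error` arguments are assumptions made for illustration; they do not describe the software used in the study.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (training_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Striding gives k folds whose sizes differ by at most one sample.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, validation


def cross_validated_error(x, y, fit, predict, error, k=10, seed=0):
    """Average validation error over k folds.

    fit(x_train, y_train) returns a model, predict(model, x_row) returns an
    estimate, and error(observed, estimated) returns a score for one fold.
    """
    fold_errors = []
    for train_idx, val_idx in k_fold_indices(len(x), k, seed):
        model = fit([x[i] for i in train_idx], [y[i] for i in train_idx])
        estimates = [predict(model, x[i]) for i in val_idx]
        fold_errors.append(error([y[i] for i in val_idx], estimates))
    return sum(fold_errors) / len(fold_errors)
```

Each sample appears in exactly one validation fold, so every observation is used for both training and validation across the k rounds.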

Evaluation of statistical model performance

A number of performance assessment methods are used to assess the precision of estimations and to compare the models. Statistical model performance criteria used in this study were coefficient of determination (R²), root-mean-square error (RMSE), mean absolute error (MAE), and Nash–Sutcliffe efficiency (NSE) (Nash and Sutcliffe 1970). The equations of the model performance criteria were defined as presented in Eqs. (8) to (11) (Waller 2003).

$$R^{2} = \frac{\left[ \sum\nolimits_{i = 1}^{n} \left( Z_{i^{*}} - \overline{Z_{i^{*}}} \right)\left( Z_{i} - \overline{Z_{i}} \right) \right]^{2}}{\sum\nolimits_{i = 1}^{n} \left( Z_{i^{*}} - \overline{Z_{i^{*}}} \right)^{2} \sum\nolimits_{i = 1}^{n} \left( Z_{i} - \overline{Z_{i}} \right)^{2}}$$

(8)

$${\text{RMSE}} = \sqrt {\frac{{\mathop \sum \nolimits_{i = 1}^{n} \left( {Z_{{i^{*} }} - Z_{i} } \right)^{2} }}{n}}$$

(9)

$${\text{MAE}} = \frac{1}{n}\mathop \sum \limits_{i = 1}^{n} \left| {Z_{{i^{*} }} - Z_{i} } \right|$$

(10)

$${\text{NSE}} = 1 - \frac{{\mathop \sum \nolimits_{i = 1}^{n} \left( {Z_{i} - Z_{{i^{*} }} } \right)^{2} }}{{\mathop \sum \nolimits_{i = 1}^{n} \left( {Z_{i} - \overline{{Z_{i} }} } \right)^{2} }}$$

(11)

where $Z_{i}$ is the measured value; $Z_{{i^{*} }}$ is the estimated value; $\overline{{Z_{i} }}$ is the mean of the measured values; $\overline{{Z_{{i^{*} }} }}$ is the mean of the estimated values; and $n$ is the number of data points. In addition, Taylor diagrams were used to examine the standard deviation (SD) and correlation coefficient (R) of the model-estimated values relative to the measured values.
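The four criteria can be computed as in the sketch below. The function names and the sample values are hypothetical, and R² is implemented here as the squared Pearson correlation between measured and estimated values, which is an assumption about how Eq. (8) is meant to be read.

```python
import math


def r_squared(obs, est):
    """Squared Pearson correlation between measured and estimated values."""
    mean_obs = sum(obs) / len(obs)
    mean_est = sum(est) / len(est)
    cross = sum((e - mean_est) * (o - mean_obs) for o, e in zip(obs, est))
    ss_est = sum((e - mean_est) ** 2 for e in est)
    ss_obs = sum((o - mean_obs) ** 2 for o in obs)
    return cross ** 2 / (ss_est * ss_obs)


def rmse(obs, est):
    """Root-mean-square error, Eq. (9)."""
    return math.sqrt(sum((e - o) ** 2 for o, e in zip(obs, est)) / len(obs))


def mae(obs, est):
    """Mean absolute error, Eq. (10)."""
    return sum(abs(e - o) for o, e in zip(obs, est)) / len(obs)


def nse(obs, est):
    """Nash-Sutcliffe efficiency, Eq. (11)."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - e) ** 2 for o, e in zip(obs, est))
    spread = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / spread


# Hypothetical daily ET values (mm per day), for illustration only.
measured = [2.0, 3.0, 4.0, 5.0]
estimated = [2.5, 2.5, 4.5, 4.5]
scores = {
    "R2": r_squared(measured, estimated),
    "RMSE": rmse(measured, estimated),
    "MAE": mae(measured, estimated),
    "NSE": nse(measured, estimated),
}
```

RMSE and MAE are in the units of the data (mm d⁻¹), whereas R² and NSE are dimensionless; an NSE of 1 indicates a perfect match and an NSE of 0 means the model is no better than the mean of the measurements.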

Results

Soil water content and evapotranspiration

The changes in precipitation, applied irrigation, and soil water content in the 60 cm effective root zone under full irrigation (I₁) during the 2015, 2016, and 2017 growing seasons are presented in Fig. 1. The total amount of irrigation water applied to eggplants under full irrigation during the 2015 growing season was 487 mm. Less irrigation water was applied in 2016 (310 mm) because that year was much rainier than the other two years. In 2017, a total of 416 mm of irrigation water was applied during the growing season. Because precipitation differed from year to year, the total amount of irrigation water applied varied greatly among the three growing seasons: the amount applied to the fully irrigated parcels in 2016 was 36 and 25% less than in 2015 and 2017, respectively. Seasonal precipitation was 83, 323, and 157 mm in 2015, 2016, and 2017, respectively. In 2016, 42% of the seasonal precipitation occurred at the beginning of the trial and 40% at the end of the season (Fig. 1). As a result, soil preparation and transplanting were delayed compared to the other two seasons.

The daily variations of ET_o, which is calculated from climatic parameters and reflects the climatic conditions of the trial years, during the growing seasons of 2015, 2016, and 2017 are presented in Fig. 2. The calculated daily ET_o values varied between 1.91 and 5.29 mm d⁻¹ in 2015, between 1.71 and 4.99 mm d⁻¹ in 2016, and between 2.19 and 5.6 mm d⁻¹ in 2017. The seasonal total ET_o values in 2015, 2016, and 2017 were 560, 544, and 525 mm, respectively. The relatively lower seasonal ET_o in 2017 compared to the other years is due to the shorter vegetation period (150 days in 2015, 150 days in 2016, and 139 days in 2017).

The daily changes of ET_c values calculated for the eggplant by the FAO-56 PM method during the growing season in 2015, 2016, and 2017 are shown in Fig. 3. The calculated daily ET_c values varied between 1.43 and 6.14 mm d⁻¹ in 2015, between 1.36 and 5.49 mm d⁻¹ in 2016, and between 1.87 and 6.16 mm d⁻¹ in 2017. Seasonal total ET_c values for eggplant in 2015, 2016, and 2017 were determined as 564, 491, and 544 mm, respectively.

The daily changes of ET_a values determined according to the water budget method during the growing season in 2015, 2016, and 2017 are shown in Fig. 4. The measured daily ET_a values varied between 1.35 and 5.56 mm d⁻¹ in 2015, between 1.25 and 4.87 mm d⁻¹ in 2016, and between 1.93 and 6.47 mm d⁻¹ in 2017. The seasonal total ET_a values measured for eggplant in 2015, 2016, and 2017 were 563, 487, and 558 mm, respectively. The variation in ET_c between the three growing seasons may be due to variation in the amount and distribution of precipitation, which has different effects on soil wetting and consequent evaporation losses and water uptake by the plant (Dar et al. 2017). The three-year average seasonal evapotranspiration amount (533 mm) calculated by the FAO-56 PM method was found to be very close to the three-year average seasonal evapotranspiration amount (536 mm) measured according to the water budget method.
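As a quick check of the three-year averages quoted above, using the seasonal totals reported in this section:

$$\overline{{{\text{ET}}_{{\text{c}}} }} = \frac{564 + 491 + 544}{3} = \frac{1599}{3} = 533\;{\text{mm}},\qquad \overline{{{\text{ET}}_{{\text{a}}} }} = \frac{563 + 487 + 558}{3} = \frac{1608}{3} = 536\;{\text{mm}}$$

The two averages differ by 3 mm, which is the basis for describing them as very close.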

When the scatter plot created using the 3-year data was examined, a good agreement was observed between the ET_c calculated by the FAO-56 PM method and the ET_a measured by the water budget method, with a high coefficient of determination (R² = 0.84) and a linear regression slope close to unity (0.95) (Fig. 5).

Modeling database

The descriptive statistics for the soil and climate parameters determined during the experiment, as well as for the daily climate, crop, and soil data used in modeling with machine learning, are given in Table 2. Daily reference evapotranspiration (ET_o) values for the training and test datasets ranged from 0.55 to 6.45 mm d⁻¹ and from 0.33 to 6.52 mm d⁻¹, respectively. Values between May 1 and November 30 in all three years were used to model the daily ET_o. Crop evapotranspiration values (ET_c) calculated by the FAO-56 PM method ranged between 1.36 and 5.50 mm d⁻¹ in the training dataset and between 1.57 and 6.16 mm d⁻¹ in the test dataset (Table 2). The daily actual evapotranspiration (ET_a) values measured by the water budget method, on the other hand, ranged between 1.26 and 5.50 mm d⁻¹ in the training dataset and between 1.56 and 6.47 mm d⁻¹ in the test dataset (for scenario 2). In the first scenario created for ET_a, the ET_a values of both the full and deficit irrigation treatments were used. Therefore, ET_a values for scenario 1 varied between 0.99 and 5.61 mm d⁻¹ in the training dataset and between 0.97 and 5.94 mm d⁻¹ in the test dataset.

Table 2 Descriptive statistics of input and output variables

Estimation of ET_o using different machine learning models

The ET_o values estimated by the seven machine learning models were compared to the ET_o values calculated using the FAO-56 PM method. The training and testing accuracies of the ANN, DNN, M5Tree, SVM, kNN, RF, and AB models with four different input combinations in ET_o estimation are given in Table 3.

Table 3 Statistical results of four different input combinations and seven different machine learning models during the training and testing in estimation of ET_o (bolded values are the models that presented the best results)

The performance of the machine learning models in the training and testing phases was evaluated using R², MAE, RMSE, and NSE values, which varied among the models and input combinations (Table 3). The ANN model outperformed the other models, while the performances of the AB and RF models were the lowest for all input combinations. Among the 3-input models, the estimates of the ANN4 and M5Tree4 models were very close. The RF model recorded the highest performance in the training phase but performed poorly in the testing phase, which indicates a certain degree of over-fitting in the RF models.

The highest prediction performance was obtained in Model 1 (T_min, T_max, T_avg, u₂, RH_avg, R_s, and DOY), while the lowest was obtained in Model 4, which used three inputs (T_avg, RH_avg, and DOY). The comparison of Model 4 and Model 3 showed that including u₂ and R_s in the models significantly improved the prediction performance: the MAE and RMSE values of the ANN model decreased from 0.401 to 0.176 (56.1%) and from 0.531 to 0.241 (54.6%), respectively, and the R² and NSE values increased from 0.802 to 0.959 (19.6%) and from 0.796 to 0.958 (20.4%), respectively. The removal of R_s from the input list in the ET_o estimation (Model 2) caused a significant decrease in the R² (12.8%) and NSE (12.7%) of the ANN2 model and a significant increase in its RMSE (65.5%). Estimating ET_o without R_s as an input variable therefore had a negative impact on model performance. Granata et al. (2020) reported similar findings related to R_s as an input variable. When Model 3 and Model 1 are compared for the ANN, the RMSE decreases by 36.5%, while the R² increases by 2.6%.
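The percentage changes quoted for the Model 4 to Model 3 comparison follow from a simple relative-change calculation, sketched below. The function name is hypothetical; the input numbers are the ANN statistics given in the paragraph above.

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# ANN statistics quoted in the text (Model 4 -> Model 3).
mae_change = percent_change(0.401, 0.176)   # negative: MAE decreased
rmse_change = percent_change(0.531, 0.241)  # negative: RMSE decreased
r2_change = percent_change(0.802, 0.959)    # positive: R2 increased
nse_change = percent_change(0.796, 0.958)   # positive: NSE increased
```

Rounded to one decimal place, the four results reproduce the reported 56.1, 54.6, 19.6, and 20.4%.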

The addition of the T_min and T_max variables to the models improved the accuracy compared to the use of T_avg alone, which demonstrates that including the maximum and minimum temperatures is preferable to using only the average temperature. Lu et al. (2018) reported similar results in the estimation of daily pan evaporation. The comparison of the model in which T_avg was the only input variable in the estimation of evaporation with the models using T_min and T_max as input variables revealed that the RMSE value decreased by 41.7% in the 2-input model. Therefore, the ANN1 model is the best machine learning model for the estimation of ET_o. However, collecting all climate data throughout an experiment may not always be possible, in which case limited data must be used in the estimation. For instance, the ANN4 model with 3 input variables still provided a quite reliable estimation of ET_o (NSE = 0.796).

The test-phase scatter plots of ET_o values estimated by the seven machine learning models versus FAO-56 PM values under four different input combinations, together with the line graphs of the best model, are shown in Fig. 6. The ET_o values estimated by the ANN, M5Tree, and kNN models had a distribution similar to the ET_o values calculated using the FAO-56 PM equation, which suggests that these three models estimate ET_o better. The RF, AB, DNN, and SVM models produced more scattered estimations than the other machine learning models. The highest estimation performance was obtained in Model 1, which used all meteorological variables (Fig. 6). The estimation accuracy of Model 3 was similar to that of Model 1 even though Model 3 did not use all meteorological variables. The machine learning models produced more diffuse predictions and moved away from the fitted line in the absence of R_s in Model 2 and of u₂ and R_s in Model 4. The line graphs revealed that the predicted series overlap better with the observed series when all meteorological variables are used in the models, while the predicted series fluctuate more when fewer input variables are used (Fig. 6). Missing parameters reduced the estimation performance of the models, while the addition of the R_s parameter improved it.

The Taylor diagram was used to compare the machine learning models (Fig. 7). The radial axis in a Taylor diagram represents the standard deviation, while the angular axis represents the correlation coefficient (Taylor 2001). Each point on the diagram represents the performance of a specific model, and the model closest to the reference point is considered the most accurate. The Taylor diagrams in Fig. 7 compare the input combinations and machine learning models in estimating ET_o during the testing phase. The ANN model is placed much closer to the reference point than the other machine learning models. The RF and AB models are placed furthest from the observed values for all input combinations and were therefore the worst predictive models. Model 1, in which all meteorological variables were used as input data, was the closest to the observed values in terms of standard deviation, correlation, and RMSE (Fig. 7).
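The quantities summarized by a Taylor diagram can be computed as in the sketch below. The function name and sample values are hypothetical. The distance from a model point to the reference point corresponds to the centered RMSE (the RMSE after removing the means), which is linked to the standard deviations and the correlation coefficient by the law of cosines (Taylor 2001).

```python
import math


def taylor_statistics(obs, est):
    """Return (sd_obs, sd_est, correlation, centered_rmse)."""
    n = len(obs)
    mean_obs = sum(obs) / n
    mean_est = sum(est) / n
    sd_obs = math.sqrt(sum((o - mean_obs) ** 2 for o in obs) / n)
    sd_est = math.sqrt(sum((e - mean_est) ** 2 for e in est) / n)
    covariance = sum((o - mean_obs) * (e - mean_est)
                     for o, e in zip(obs, est)) / n
    correlation = covariance / (sd_obs * sd_est)
    centered_rmse = math.sqrt(
        sum(((e - mean_est) - (o - mean_obs)) ** 2
            for o, e in zip(obs, est)) / n
    )
    return sd_obs, sd_est, correlation, centered_rmse


# Hypothetical daily ET values (mm per day), for illustration only.
measured = [1.5, 2.5, 3.5, 4.5, 5.5]
estimated = [1.8, 2.4, 3.9, 4.2, 5.2]
sd_o, sd_e, r, crmse = taylor_statistics(measured, estimated)

# Law of cosines behind the diagram geometry:
# crmse^2 = sd_o^2 + sd_e^2 - 2 * sd_o * sd_e * r
identity_rhs = sd_o ** 2 + sd_e ** 2 - 2.0 * sd_o * sd_e * r
```

Because of this identity, a single point on the diagram encodes three statistics at once, which is why the model nearest the reference point has both a standard deviation close to that of the measurements and a high correlation.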

Estimation of ET_c using different machine learning models

The ET_c value was calculated by multiplying ET_o by the crop coefficient (K_c), as reported by Allen et al. (1998). Soil water and salinity stress, plant density, disease and pests, weed infestation, or low productivity did not restrict plant growth and evapotranspiration. Climate data and plant growth indicators such as crop height, LAI, and CGDD were used as inputs in the estimation of ET_c. Four input datasets consisting of different combinations were prepared to estimate ET_c. The performances of the ANN, DNN, M5Tree, SVM, kNN, RF, and AB models during the training and testing phases in ET_c estimation are presented in Table 4. Performance varied significantly among the machine learning models and among the input combinations. The prediction performances of the RF and ANN models in Model 5 and Model 6 were very close to each other (Table 4). For Models 7 and 8, on the other hand, the ANN models had the best predictive performance during the testing phase (i.e., maximum R² and NSE and minimum MAE and RMSE values).
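The crop coefficient relation ET_c = K_c × ET_o that produces the target values can be sketched as follows. The function name is hypothetical, and the ET_o and K_c numbers are illustrative placeholders, not the coefficients used in this study.

```python
def crop_evapotranspiration(et_o, k_c):
    """ET_c = K_c * ET_o, with ET_o and ET_c in mm per day."""
    if et_o < 0 or k_c < 0:
        raise ValueError("ET_o and K_c must be non-negative")
    return k_c * et_o


# Hypothetical daily ET_o values (mm per day) and crop coefficients,
# chosen only to illustrate the calculation.
et_o_series = [3.0, 4.0, 5.0]
k_c_series = [0.6, 1.0, 1.1]
et_c_series = [crop_evapotranspiration(e, k)
               for e, k in zip(et_o_series, k_c_series)]
```

A K_c below 1 gives a crop water use lower than the reference surface, and a K_c above 1 gives a higher one.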

Table 4 The performances of four different input combinations and seven different machine learning models during training and testing phases in ET_c prediction (bolded values are the models that presented the best results)

The ANN7 model performed best when the T_avg, u₂, RH_avg, and R_s variables were used as inputs, while the RF6 model (with ANN6 very close) performed best when the T_avg, u₂, RH_avg, R_s, and CGDD variables were used as inputs. The performance of the models was improved by including cumulative growing degree days (CGDD) in addition to climate data. Including the CGDD variable in the ET_c estimation increased the R² and NSE of the model from 0.660 to 0.774 and from 0.656 to 0.771, respectively, while decreasing the MAE and RMSE values from 0.503 to 0.395 and from 0.676 to 0.551, respectively. Granata (2019) compared the performances of machine learning models in ET_c estimation and indicated that the accuracy of the RF model including T_avg, u₂, RH_avg, and R_s as input variables was the highest. In addition, the M5Tree model provided the best performance when soil moisture and sensible heat flux variables were added to the model. In this study, adding the LAI and crop height variables significantly improved the performance of the model. The MAE and RMSE values in Model 5 (in which eggplant characteristics were also used as input variables) were 27.8 and 26.8% lower than those obtained in Model 7 (only climatic parameters). The comparison of the prediction performances of Model 5 (with the highest number of variables) and Model 8 (with the lowest number of variables) revealed that the RF5 model had 4.01% higher R² and 5.4% higher NSE values (the ANN5 model is very close). The ET_c was estimated with sufficient accuracy using fewer climatic parameters (only average temperature) in addition to the plant characteristics (crop height and LAI). In general, the prediction performance of RF with a high number of input variables was very close to that of the ANN, while the prediction performance of RF decreased as the number of input variables decreased. The performance of RF in the ET_o prediction was the highest in the training phase of all models. However, after testing, the best prediction performance was obtained from the ANN and RF, while the lowest prediction accuracy was obtained from the SVM.

Comparisons of the measured and estimated ET_c values during the testing phase for the seven machine learning models and four input combinations are presented in scatter plots, and the model with the most accurate estimation is presented in a line chart (Fig. 8). The ANN model provided the best estimates with the lowest scattering for all input combinations except Models 5 and 6, in which the prediction performances of the RF and ANN models were similar. In addition, almost all machine learning models predicted ET_c values greater than 5.5 mm d⁻¹. Yamaç and Todorovic (2020) used three different machine learning methods to estimate crop evapotranspiration (ET_c) for potatoes and indicated that the ET_c values estimated by the AB and kNN methods were lower than the measured ET_c values and higher than the ET_c values estimated by the ANN method. In addition, the predicted series overlap better with the measured series when all meteorological variables and plant characteristics are used in the models (Model 5), while they disperse when fewer input variables are used (Model 7).

The Taylor diagrams comparing the input combinations and machine learning models in the estimation of eggplant evapotranspiration (ET_c) are given in Fig. 9. The correlation coefficients of Models 5 and 6 are close, and the ANN models have a standard deviation closer to the reference point (Fig. 9). For Models 7 and 8, the ANN had the lowest RMSE and the highest correlation coefficient among all machine learning models. The prediction performance criteria and visual inspections differed depending on the input variables and the machine learning models employed. Machine learning models provided highly accurate predictions with fewer climate parameters in the presence of crop height and LAI data.

Estimation of ET_a using different machine learning models

In this section, the actual evapotranspiration values measured with the soil water balance approach during the 3-year field study were estimated with seven different machine learning models, and the estimates were compared with the measurements. Transpiration and evaporation are combined in a single K_c coefficient, which relates to the plant properties and takes average soil evaporation into account. The average K_c coefficient is more useful than the K_c calculated in daily time frames using separate plant and soil coefficients for the development of basic irrigation programs and many hydrological balance studies. Therefore, ET_a was estimated with machine learning models using not only climate parameters but also soil and plant-specific data.

The prediction performances of two different datasets and seven different input combinations were compared in ET_a estimation with machine learning models. The first dataset included T_c–T_a, VPD, and SWC, and the second dataset included h_c, LAI, T_avg, u₂, RH_avg, and R_s (Table 1). With the first dataset, ET_a was estimated using four different input combinations. The training and testing accuracies of the ANN, DNN, M5Tree, SVM, kNN, RF, and AB models for ET_a prediction using the first dataset are given in Table 5. Overall, the ANN models outperformed the other six machine learning methods during the testing phase, followed by the AB and kNN methods. The lowest estimation performances were obtained with the SVM and DNN models. The results revealed that decreasing the number of input variables decreased the performance of the DNN method and increased its estimation error. The accuracy of the models increased with the addition of each new input variable, and the highest accuracy in ET_a prediction was obtained in Model 9, which included all three variables. The inclusion of the VPD variable in the model during the testing phase caused a 76 and 72% decrease in the MAE and RMSE values of the ANN9 model, respectively. At the same time, R² increased from 0.846 to 0.988 (16.8%). The addition of the SWC variable to the model caused a 60 and 57% decrease in MAE and RMSE, respectively, and the R² value increased from 0.933 to 0.993 (6%) (comparing ANN11 to ANN9). The RMSE value decreased from 0.812 to 0.137 (83%) and the NSE increased from 0.558 to 0.987 (76.8%) when T_c–T_a was added to the inputs of the ANN12 model to give the ANN9 model. The results showed that the T_c–T_a variable is the most effective in the estimation of ET_a, followed by VPD and SWC.

Table 5 The estimation performances of four different input combinations and seven different machine learning models during training and testing phases in the estimation of ET_a for the dataset 1 (bolded values are the models that presented the best results)

The performances of three different input combinations and seven different machine learning methods in ET_a estimation were compared using the second dataset. The performance criteria of the machine learning methods for measured versus predicted ET_a in both the training and testing phases are summarized in Table 6. When only climate parameters were used in Model 14 (T_avg, u₂, RH_avg, R_s), RF produced the most accurate prediction. The RF model outperformed other machine learning methods in ET_c prediction when the same input variables were used. Model 13, which included all six variables of the second dataset, provided the highest accuracy in ET_a estimation. The MAE value was 14.5% higher and the R² value was 7% lower in Model 14 (which included only climate parameters) than in Model 13, which included crop height and LAI as well as climate parameters. The MAE was 40.4% lower, and the R² and NSE values were 33% and 42.6% higher, respectively, for Model 13 (h_c, LAI, T_avg, u₂, RH_avg, and R_s) than for Model 15 (h_c, LAI, and T_avg). The climate parameters thus increased the accuracy of the models in ET_a estimation. Finally, the evaluation of both datasets revealed that the ANN9 model with T_c–T_a, VPD, and SWC inputs produced the best estimation results. The RF14 model provided adequate accuracy in the estimation of ET_a when only climatic data were available.

Table 6 The estimation performances of three different input combinations and seven different machine learning models during training and testing phases in the estimation of ET_a for the dataset 2 (bolded values are the models that presented the best results)

The scatter and line graphs of the ET_a values measured during the testing phase and estimated with the machine learning models are shown in Fig. 10 for the first dataset and in Fig. 11 for the second dataset. The estimations of the ANN model are less dispersed than those of the other methods for all scenarios. The accuracies of the ANN9 model for the first dataset and the ANN13 model for the second dataset were significantly better than those of the other machine learning methods (Fig. 10). The ANN9 model, on the other hand, had a distribution just above the fit line. In addition, the methods that provided the most scattered results were DNN and SVM. The estimated ET_a values deviated from the fit line as the measured ET_a values increased in both datasets. Therefore, the ANN9 model had slightly lower estimation accuracy for high ET_a values than for low values.

A Taylor diagram was used to analyze the standard deviation, RMSE, and correlation coefficient between the measured and estimated ET_a values during the testing phase for the ANN, DNN, M5Tree, SVM, kNN, RF, and AB models (Figs. 12, 13). The ANN model had a standard deviation closer to the measured value compared to the other machine learning methods and had the highest correlation coefficient and the lowest RMSE value. The results indicated that the ANN9 model (T_c–T_a, VPD, and SWC) provided the closest estimation of the ET_a values to the measured values.

Comparison of the stability of different input combinations and various machine learning methods in estimation of evapotranspiration

Four input combinations were developed to estimate ET_o, four to estimate ET_c using the crop coefficient approach, and seven to estimate ET_a using the soil water balance. A heat map comparison of the RMSE values of the ANN, DNN, M5Tree, SVM, kNN, RF, and AB models during the training and testing phases is shown in Fig. 14. Different input combinations had a significant impact on the prediction performance of each model, and the models with more input variables performed better overall. The RMSE values of Models 1 and 3, which used the solar radiation (R_s) variable for ET_o estimation, were lower in both the training and testing stages than the RMSE values of the models that did not use R_s. The RMSE values of all machine learning methods were very low for the combination of T_min, T_max, T_avg, u₂, RH_avg, R_s, and DOY. The performance of the DNN model was the lowest in the 2-input models (especially Model 11 and Model 12), which had the fewest inputs (Fig. 14). In general, the RF models outperformed the other machine learning models during the training stage.

The ANN model had the best prediction performance during the testing stage in all models except Models 5, 6, 13, 14, and 15, in which RF performed best. The RMSE increase from the training phase to the testing phase was significantly greater in the RF models than in the other machine learning models. In contrast, the increase in the ANN model was very small, which indicates that the ANN is the most stable method for estimating the ET_o, ET_c, and ET_a of eggplant.
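The stability criterion used here, the growth of RMSE from training to testing, can be expressed as in the sketch below. The model labels and RMSE pairs are hypothetical placeholders and do not reproduce the values in Fig. 14.

```python
def rmse_increase_percent(rmse_train, rmse_test):
    """Relative growth of RMSE from the training to the testing phase."""
    return (rmse_test - rmse_train) / rmse_train * 100.0


# Hypothetical (training RMSE, testing RMSE) pairs in mm per day.
rmse_pairs = {
    "model_A": (0.10, 0.30),  # large gap: fits training data, generalizes poorly
    "model_B": (0.20, 0.22),  # small gap: similar error on unseen data
}
increase = {name: rmse_increase_percent(train, test)
            for name, (train, test) in rmse_pairs.items()}
most_stable = min(increase, key=increase.get)
```

A model with a low training RMSE but a large train-to-test gap, as described for RF, is over-fitted, whereas a small gap, as described for the ANN, indicates stable behavior on new data.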

Discussion

The ET_o, ET_c, and ET_a prediction performances of different machine learning methods, including ANN models, were compared. The ANN model provided better predictions than the other machine learning models, and the models that used all input variables had lower RMSE and MAE values. The RF model performed better than the other machine learning models in the estimation of ET_c (h_c, LAI, T_avg, u₂, RH_avg, R_s) and ET_a (T_avg, u₂, RH_avg, R_s), and its estimation performance was close to that of the ANN model. Similar to our findings, Yamaç (2021) reported that the RF method provided the best prediction performance in the estimation model using all input data for sugar beet ET_c. In addition, a strong linear relationship was recorded between the model inputs and the ET_c values. The ET_c and K_c were positively correlated with T_max and T_min, while T_min was negatively correlated with RH_max and RH_min. On the other hand, it has been reported that the prediction performance of the ANN method is better even when an input variable with a lower correlation with ET_c is entered into the model.

In addition to the accuracy of machine learning models, the stability of the model is also important for a reliable estimate of evapotranspiration. When the RMSE values for the training and test datasets in this study were compared, the largest RMSE increase in the test dataset was found for the RF model in almost all input combinations. This increase reveals the instability of the RF models, as their prediction accuracy decreases significantly when data outside the training dataset are used for testing. Fan et al. (2018) compared different machine learning methods in estimating reference evapotranspiration and reported that kernel-based models such as SVM are generally more stable than tree-based (RF) machine learning models. Evapotranspiration is a complex, dynamic, and highly nonlinear hydrological phenomenon that is affected by a variety of meteorological factors as well as crop growth indicators (Shan et al. 2020). The results demonstrated that the ANN models can accurately model the complex nonlinear relationships between ET and meteorological factors, as well as crop growth indicators. The ANN models obtained in this study can be recommended for estimating the water consumption of eggplant grown in semi-humid regions, even when some input data are missing. In addition, plant water consumption can be estimated directly without the need for machine learning methods using the crop coefficient (K_c).

The DNN models performed worse than the other machine learning models in estimating the reference plant water consumption. This result contradicts the findings of Saggi and Jain (2019), who reported that the DNN model outperforms the RF model in ET_o prediction. The difference may be attributed to factors such as the location of the study and the hyperparameter settings of the models, which may affect the performance of ANN and DNN models. The worst prediction performance of the DNN models was obtained when the number of inputs decreased (models with two inputs). The requirement for large datasets during the training phase is the most significant disadvantage of deep learning models (Kamilaris and Prenafeta-Boldú 2018), and optimization problems can arise with small datasets. Chen et al. (2020) compared the performance of temporal convolution network (TCN) models for the estimation of maize evapotranspiration to that of long short-term memory networks (LSTM) and deep neural networks (DNN) and reported that the TCN model, which used 11 input variables, consistently outperformed the LSTM and DNN models for different input datasets. Therefore, the amount of data should be increased to improve the performance of deep learning models, and other deep learning techniques (such as LSTM and CNN) should also be tested in future studies.

The generalization capacity of evapotranspiration estimation models decreases with the decrease in the number of meteorological variables and crop growth indicators used as inputs in the models. The removal of R_s from the input variables in ET_o estimation significantly decreased the performance of models. The finding, in particular, demonstrates that R_s is an important variable that should be used in estimating the reference crop water consumption using machine learning models. The result also confirms the importance of using temperature as the only input variable in reference crop water consumption estimation models currently used in water resource planning. Finally, previous studies have shown that the ET_c can be influenced by various factors including crop type, crop height, leaf area index, canopy temperature, soil temperature, and climate parameters (Chen et al. 2020). Therefore, the prediction accuracy was improved by using eggplant plant characteristics such as crop height and leaf area index as inputs in modeling ET_c and ET_a.

Within the scope of the study, the total seasonal ET_a measured with the soil water budget in 2015 was 563 mm and the ET_c calculated according to FAO-56 PM was 564 mm, a difference of only 1 mm (0.2%). In 2016, ET_a was measured as 487 mm and ET_c was calculated as 491 mm, a difference of 4 mm (0.8%). Similarly, ET_a was 558 mm in 2017 and ET_c was calculated as 544 mm, a difference of 14 mm (2.5%). When the evapotranspiration values obtained by these two methods (soil water balance and FAO-56 PM) were compared, a high coefficient of determination (R² = 0.84) was found between them. It can be assumed that the ET values obtained by both methods are similar and that the observed differences of 0.2–2.5% can be considered negligible. Soubie et al. (2016) reported that the measurement of evapotranspiration with the soil water budget method depends on the degree of characterization of soil heterogeneity and drainage status. Soil water budget measurements are not practical because they require expensive instrumentation and a long time, are difficult to apply, and depend on farm conditions. Therefore, the FAO-56 PM method, which is widely used in many parts of the world where direct measurements are not available due to complexity or cost, may be preferred for ET estimation.
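The seasonal differences quoted above can be reproduced with the short sketch below. The variable names are hypothetical, and expressing each difference as a percentage of the measured ET_a is an assumption that matches the reported 0.2, 0.8, and 2.5%.

```python
# Seasonal totals from the text, in mm: year -> (ET_a measured, ET_c calculated).
seasonal_totals = {
    2015: (563, 564),
    2016: (487, 491),
    2017: (558, 544),
}

differences = {}
for year, (et_a, et_c) in seasonal_totals.items():
    abs_diff = abs(et_c - et_a)
    # Assumption: the percentage is taken relative to the measured ET_a.
    pct_diff = abs_diff / et_a * 100.0
    differences[year] = (abs_diff, round(pct_diff, 1))
```

In every season the FAO-56 PM estimate lies within a few percent of the soil water balance measurement, which supports treating the two methods as interchangeable at the seasonal scale.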

Conclusion

This study was performed to predict the irrigation water requirement of eggplant grown in a semi-humid region in northern Türkiye in 2015, 2016, and 2017 using machine learning methods. In order to predict the crop evapotranspiration of eggplant, the estimation performances of seven different machine learning algorithms, including ANN, DNN, M5Tree, SVM, kNN, RF, and AB models, were compared by considering different statistical criteria and graphical methods.

This study was carried out in three stages. In the first stage, climate parameters were used in the estimation of ET_o. In the second stage, both climate and crop parameters were used in the estimation of ET_c, and in the third stage, climate, plant, and soil properties were used as input parameters in the estimation of ET_a. In the models created for ET_o, the ANN1 model, in which all input variables (T_min, T_max, T_avg, u₂, RH_avg, R_s, and DOY) were used, showed the best prediction performance (NSE = 0.983; RMSE = 0.153 mm d⁻¹). In the models created for ET_c, the RF5 model (NSE = 0.816; RMSE = 0.495 mm d⁻¹), in which the variables h_c, LAI, T_avg, u₂, RH_avg, and R_s were used as inputs, showed the best estimation performance. The best model performance for ET_a was obtained from the ANN9 model (NSE = 0.987; RMSE = 0.137 mm d⁻¹), in which the T_c–T_a, VPD, and SWC variables were used as inputs.

Temperature and solar radiation were the most effective variables in the estimation of ET_o, and different combinations of these two variables increased the prediction performance of the models. Using crop height and leaf area index together with the climate data in the ET_c estimation considerably improved the performance of the model (RF5): R² increased by 19.6% and RMSE decreased by 26.8%.

In the estimation of ET_a values measured according to the water budget, water deficit applications were also considered as an input parameter in dataset 1 (T_c–T_a, VPD, SWC). The accuracy of the ET_a prediction models for dataset 1 increased considerably when a new variable was added as an input. When T_c–T_a data were added to the input variables used in the ANN12 model, the RMSE decreased from 0.812 (ANN12) to 0.137 mm d⁻¹ (ANN9), a reduction of 83%, and the NSE increased from 0.558 to 0.987 (76.8%). This shows that T_c–T_a was the variable with the greatest effect on ET_a.

In all three growing seasons (2015, 2016, and 2017), total ET_c and ET_a values were close to each other, and a high correlation was found between them. However, experimental and field data are needed to calibrate K_c in the ET_c formula to local conditions, which is why measurements of actual evapotranspiration (ET_a) remain important. It is concluded that machine learning models can be used successfully to predict ET_c and ET_a.
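To illustrate how field measurements could feed a local K_c calibration, the sketch below inverts the relation ET_c = K_c × ET_o and treats measured ET_a as the crop evapotranspiration. This is only a simplified assumption that would apply to periods without water stress; the function name and all numerical values are hypothetical and are not data from this study.

```python
def crop_coefficient(et_a_mm, et_o_mm):
    """Back-calculate K_c from ET_c = K_c * ET_o, taking measured ET_a as ET_c.

    Assumes the crop is not water-stressed, so that ET_a approximates ET_c.
    """
    if et_o_mm <= 0:
        raise ValueError("ET_o must be positive")
    return et_a_mm / et_o_mm


# Hypothetical daily values (mm d^-1), for illustration only.
et_o = [4.0, 5.0, 6.0]
et_a = [2.0, 4.5, 6.6]

# One locally derived K_c value per day, rounded to two decimals.
kc = [round(crop_coefficient(a, o), 2) for a, o in zip(et_a, et_o)]
print(kc)
```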

Abbreviations

ET_o:: Reference evapotranspiration
ET_c: Crop evapotranspiration:: Evapotranspiration calculated from the equation ET_c = K_c × ET_o
ET_a: Actual crop evapotranspiration:: The soil water balance-measured evapotranspiration
K_c:: Crop coefficient
FAO-56 PM:: FAO-56 Penman–Monteith
ANN:: Artificial neural networks
DNN:: Deep neural networks
M5Tree:: M5 model tree
SVM:: Support vector machine
kNN:: k-Nearest neighbor
RF:: Random forests
AB:: Adaptive boosting
RBF:: Radial basis function
DOY:: Day of year
CGDD:: Cumulative growing degree days
MAD:: Management-allowed depletion
FC:: Field capacity
PWP:: Permanent wilting point
SWC:: Soil water content
VPD:: Vapor pressure deficit
R²:: Coefficient of determination
MAE:: Mean absolute error
RMSE:: Root-mean-square error
NSE:: Nash–Sutcliffe efficiency

References

Abrishami N, Sepaskhah AR, Shahrokhnia MH (2019) Estimating wheat and maize daily evapotranspiration using artificial neural network. Theor Appl Climatol 135(3):945–958. https://doi.org/10.1007/s00704-018-2418-4
Allen RG, Pereira LS, Raes D, Smith M (1998) Crop evapotranspiration: guidelines for computing crop water requirements. FAO Irrigation and Drainage Paper 56. FAO, Rome 300:D05109
ASCE-EWRI (2005) The ASCE standardized reference evapotranspiration equation. ASCE-EWRI Standardization of Reference Evapotranspiration Task Committee Report
Blake GR, Hartge KH (1986) Bulk density. In: Methods of soil analysis. Part I, physical and mineralogical methods, pp 363–375. ASA and SSSA. Agronomy Monograph No: 9. Madison, Wisconsin USA
Blaney H, Criddle W (1950) Determining water needs from climatological data. USDA Soil Conservation Service. SOS-TP, pp 8–9
Bouyoucos GJ (1951) A recalibration of the hydrometer method for making mechanical analysis of soils. Agron J 43:434–438
Breiman L (2001) Random forests. Mach Learn 45(1):5–32
Cemek B, Ünlükara A, Kurunç A, Küçüktopcu E (2020) Leaf area modeling of bell pepper (Capsicum annuum L.) grown under different stress conditions by soft computing approaches. Comput Electron Agric 174:105514
Chen Z, Zhu Z, Jiang H, Sun S (2020) Estimating daily reference evapotranspiration based on limited meteorological data using deep learning and classical machine learning methods. J Hydrol 591:125286
Cover T, Hart P (1967) Nearest neighbor pattern classification. IEEE Trans Inf Theory 13(1):21–27. https://doi.org/10.1109/TIT.1967.1053964
Dar EA, Brar AS, Singh KB (2017) Water use and productivity of drip irrigated wheat under variable climatic and soil moisture regimes in North-West India. Agric Ecosyst Environ 248:9–19. https://doi.org/10.1016/j.agee.2017.07.019
Dechter R (1986) Learning while searching in constraint-satisfaction problems. University of California, Computer Science Department, Cognitive Systems Laboratory
Dong J, Zhu Y, Jia X, Han X, Qiao J, Bai C, Tang X (2022) Nation-scale reference evapotranspiration estimation by using deep learning and classical machine learning models in China. J Hydrol 604:127207
Doorenbos J, Pruitt WO (1977) Crop water requirements. Revised 1977. FAO Irrig Drain. Paper 24. FAO of the United Nations, Rome, p 144
Elbeltagi A, Zhang L, Deng J, Juma A, Wang K (2020) Modeling monthly crop coefficients of maize based on limited meteorological data: a case study in Nile Delta, Egypt. Comput Electron Agric 173:105368
Evett SR (2002) Water and energy balances at soil-plant-atmosphere interfaces. In: The soil physics companion, pp 127–188
Fan J, Yue W, Wu L, Zhang F, Cai H, Wang X, Xiang Y (2018) Evaluation of SVM, ELM and four tree-based ensemble models for predicting daily reference evapotranspiration using limited meteorological data in different climates of China. Agric for Meteorol 263:225–241
Feng Y, Cui N, Gong D, Zhang Q, Zhao L (2017) Evaluation of random forests and generalized regression neural networks for daily reference evapotranspiration modelling. Agric Water Manag 193:163–173
Fereres E, Goldhamer D, Sadras V (2012) Yield response to water of fruit trees and vines: guidelines. FAO Irrigation and Drainage Paper, pp 246–497
Ferreira LB, da Cunha FF, de Oliveira RA, Fernandes Filho EI (2019) Estimation of reference evapotranspiration in Brazil with limited meteorological data using ANN and SVM—a new approach. J Hydrol 572:556–570
Freund Y, Schapire RE (1997) A decision-theoretic generalization of on-line learning and an application to boosting. J Comput Syst Sci 55(1):119–139. https://doi.org/10.1006/jcss.1997.1504
Gong D, Hao W, Gao L, Feng Y, Cui N (2021) Extreme learning machine for reference crop evapotranspiration estimation: model optimization and spatiotemporal assessment across different climates in China. Comput Electron Agric 187:106294
Granata F (2019) Evapotranspiration evaluation models based on machine learning algorithms—a comparative study. Agric Water Manag 217:303–315. https://doi.org/10.1016/j.agwat.2019.03.015
Granata F, Gargano R, de Marinis G (2020) Artificial intelligence based approaches to evaluate actual evapotranspiration in wetlands. Sci Total Environ 703:135653. https://doi.org/10.1016/j.scitotenv.2019.135653
Han X, Wei Z, Zhang B, Li Y, Du T, Chen H (2021) Crop evapotranspiration prediction by considering dynamic change of crop coefficient and the precipitation effect in back-propagation neural network model. J Hydrol 596:126104. https://doi.org/10.1016/j.jhydrol.2021.126104
Hargreaves GH, Samani ZA (1985) Reference crop evapotranspiration from temperature. Appl Eng Agric 1:96–99
Haykin S (1998) Neural networks: a comprehensive foundation, 2nd edn. Prentice Hall PTR, Upper Saddle River
Hornik K, Stinchcombe M, White H (1989) Multilayer feedforward networks are universal approximators. Neural Netw 2(5):359–366
Huang G, Wu L, Ma X, Zhang W, Fan J, Yu X, Zhou H (2019) Evaluation of CatBoost method for prediction of reference evapotranspiration in humid regions. J Hydrol 574:1029–1041
Jensen ME (1968) Water consumption by agricultural plants. In: Kozlowski TT (ed) Water deficits and plant growth, vol 2. Academic Press, New York, pp 1–22
Jensen ME, Burman RD, Allen RG (1990) Evapotranspiration and irrigation water requirements. ASCE
Jones HG, Stoll M, Santos T, Sousa CD, Chaves MM, Grant OM (2002) Use of infrared thermography for monitoring stomatal closure in the field: application to grapevine. J Exp Bot 53(378):2249–2260. https://doi.org/10.1093/jxb/erf083
Kamilaris A, Prenafeta-Boldú FX (2018) Deep learning in agriculture: a survey. Comput Electron Agric 147:70–90. https://doi.org/10.1016/j.compag.2018.02.016
Khaki S, Wang L, Archontoulis SV (2020) A CNN-RNN framework for crop yield prediction. Front Plant Sci 10:1750. https://doi.org/10.3389/fpls.2019.01750
Köksal ES, Cemek B, Artık C, Temizel KE, Taşan M (2011) A new approach for neutron moisture meter calibration: artificial neural network. Irrig Sci 29:369–377
Leinonen I, Jones HG (2004) Combining thermal and visible imagery for estimating canopy temperature and identifying plant stress. J Exp Bot 55(401):1423–1431. https://doi.org/10.1093/jxb/erh146
Lindsay WL, Norvell W (1978) Development of a DTPA soil test for zinc, iron, manganese, and copper. Soil Sci Soc Am J 42(3):421–428
Liu SM, Xu ZW, Zhu ZL, Jia ZZ, Zhu MJ (2013) Measurements of evapotranspiration from eddy-covariance systems and large aperture scintillometers in the Hai River Basin, China. J Hydrol 487:24–38. https://doi.org/10.1016/j.jhydrol.2013.02.025
Liu J, Meng X, Ma Y, Liu X (2020) Introduce canopy temperature to evaluate actual evapotranspiration of green peppers using optimized ENN models. J Hydrol 590:125437. https://doi.org/10.1016/j.jhydrol.2020.125437
Lu X, Ju Y, Wu L, Fan J, Zhang F, Li Z (2018) Daily pan evaporation modeling from local and cross-station data using three tree-based machine learning models. J Hydrol 566:668–684. https://doi.org/10.1016/j.jhydrol.2018.09.055
Makkink G (1957) Testing the Penman formula by means of lysimeters. J Inst Water Eng Sci 11:277–288
Manikumari N, Vinodhini G, Murugappan A (2022) Modelling of reference evapotransipration using climatic parameters for irrigation scheduling using machine learning. ISH J Hydraul Eng 28(S1):272–281. https://doi.org/10.1080/09715010.2020.1771783
Massmann A, Gentine P, Lin C (2019) When does vapor pressure deficit drive or reduce evapotranspiration? J Adv Model Earth Syst 11(10):3305–3320. https://doi.org/10.1029/2019MS001790
Meyer PD, Gee GW (1999) Flux-based estimation of field capacity. J Geotech Geoenviron Eng 125(7):595–599
Mokari E, DuBois D, Samani Z, Mohebzadeh H, Djaman K (2022) Estimation of daily reference evapotranspiration with limited climatic data using machine learning approaches across different climate zones in New Mexico. Theor Appl Climatol 147(1):575–587
Monteith JL (1965) Evaporation and environment. In: Symposia of the society for experimental biology, vol 19, pp 205–234. Cambridge University Press (CUP), Cambridge.
Nash JE, Sutcliffe JV (1970) River flow forecasting through conceptual models part I—a discussion of principles. J Hydrol 10(3):282–290
Odili F, Bhushan S, Hatterman-Valenti H, Magallanes López AM, Green A, Simsek S, Vaddevolu UB, Simsek H (2023) Water table depth effect on growth and yield parameters of hard red spring wheat (Triticum aestivum L.): a lysimeter study. Appl Water Sci 13(2):65
Olsen SR (1954) Estimation of available phosphorus in soils by extraction with sodium bicarbonate. US Department of Agriculture, No:939, Washington D.C.
Penman HL (1948) Natural evaporation from open water, bare soil and grass. Proc R Soc Lond Ser A Math Phys Sci 193:120–145
Priestley CHB, Taylor R (1972) On the assessment of surface heat flux and evaporation using large-scale parameters. Mon Weather Rev 100:81–92
Quinlan JR (1992) Learning with continuous classes. In: Proceedings of Australian joint conference on artificial intelligence, pp 343–348. World Scientific Press, Singapore
Rahimikhoob A (2014) Comparison between M5 model tree and neural networks for estimating reference evapotranspiration in an arid environment. Water Resour Manag 28(3):657–669
Rawson HM, Begg JE, Woodward RG (1977) The effect of atmospheric humidity on photosynthesis, transpiration and water use efficiency of leaves of several plant species. Planta 134(1):5–10. https://doi.org/10.1007/BF00390086
Reynolds WD, Elrick DE, Youngs EG (2002) Single-ring and double- or concentric-ring infiltrometers. In: Dane JH, Topp GC (eds) Methods of soil analysis: part 4 physical methods. Soil Science Society of America, Madison, pp 821–826
Saggi MK, Jain S (2019) Reference evapotranspiration estimation and modeling of the Punjab Northern India using deep learning. Comput Electron Agric 156:387–398. https://doi.org/10.1016/j.compag.2018.11.031
Shan X, Cui N, Cai H, Hu X, Zhao L (2020) Estimation of summer maize evapotranspiration using MARS model in the semi-arid region of northwest China. Comput Electron Agric 174:105495. https://doi.org/10.1016/j.compag.2020.105495
Soubie R, Heinesch B, Granier A, Aubinet M, Vincke C (2016) Evapotranspiration assessment of a mixed temperate forest by four methods: Eddy covariance, soil water budget, analytical and model. Agric for Meteorol 228:191–204. https://doi.org/10.1016/j.agrformet.2016.07.001
Tang D, Feng Y, Gong D, Hao W, Cui N (2018) Evaluation of artificial intelligence models for actual crop evapotranspiration modeling in mulched and non-mulched maize croplands. Comput Electron Agric 152:375–384. https://doi.org/10.1016/j.compag.2018.07.029
Taylor KE (2001) Summarizing multiple aspects of model performance in a single diagram. J Geophys Res Atmos 106(D7):7183–7192
Thornthwaite CW (1948) An approach towards a rational classification of climate. Geogr Rev 38(1):55–94
Turc L (1961) Estimation of irrigation water requirements, potential evapotranspiration: a simple climatic formula evolved up to date. Ann Agron 12:13–49
Vapnik V (2013) The nature of statistical learning theory. Springer
Waller DL (2003) Operations management: a supply chain approach. Cengage Learning Business Press, London
Wang L, Kisi O, Zounemat-Kermani M, Li H (2017) Pan evaporation modeling using six different heuristic computing methods in different climates of China. J Hydrol 544:407–427
Wu CL, Chau KW, Li YS (2008a) River stage prediction based on a distributed support vector regression. J Hydrol 358(1–2):96–111. https://doi.org/10.1016/j.jhydrol.2008.05.028
Wu X, Kumar V, Ross Quinlan J, Ghosh J, Yang Q, Motoda H, Steinberg D (2008b) Top 10 algorithms in data mining. Knowl Inf Syst 14(1):1–37
Yamaç SS (2021) Artificial intelligence methods reliably predict crop evapotranspiration with different combinations of meteorological data for sugar beet in a semiarid area. Agric Water Manag 254:106968. https://doi.org/10.1016/j.agwat.2021.106968
Yamaç SS, Todorovic M (2020) Estimation of daily potato crop evapotranspiration using three different machine learning algorithms and four scenarios of available meteorological data. Agric Water Manag 228:105875
Zhang H, Wang M (2009) Search for the smallest random forest. Stat Interface 2(3):381


Funding

This work was financially supported by the Scientific and Technological Research Council of Turkiye (TUBITAK) under Grant [Number 114O538].

Author information

Authors and Affiliations

Department of Agricultural Structures and Irrigation, Faculty of Agriculture, Ondokuz Mayis University, Samsun, Turkey
Bilal Cemek & Sevda Tasan
Department of Soil and Water Resources, Black Sea Agricultural Research Institute, Samsun, Turkey
Aslıhan Canturk & Mehmet Tasan
Department of Agricultural and Biological Engineering, Purdue University, West Lafayette, IN, USA
Halis Simsek


Corresponding author

Correspondence to Bilal Cemek.

Ethics declarations

Conflict of interest

The authors declare that they have no competing interests.

Ethical approval

Not applicable.

Consent to participate

Not applicable.

Consent to publish

The authors consent to the publication of this research.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

See Table 7.

Table 7 Hyperparameters employed in chosen machine learning (ML) models

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.


About this article

Cite this article

Cemek, B., Tasan, S., Canturk, A. et al. Machine learning techniques in estimation of eggplant crop evapotranspiration. Appl Water Sci 13, 136 (2023). https://doi.org/10.1007/s13201-023-01942-1


Received: 03 February 2023
Accepted: 08 May 2023
Published: 22 May 2023
DOI: https://doi.org/10.1007/s13201-023-01942-1
