Estimation of Potato Water Footprint Using Machine Learning Algorithm Models in Arid Regions

Precise assessment of the water footprint is required to improve water consumption and crop yield efficiency in irrigated agriculture and to achieve sustainable water management. Although the Penman-Monteith method is more successful than other methods and is the most frequently used technique for calculating the water footprint, it requires a large number of meteorological parameters at different spatio-temporal scales, which are sometimes inaccessible in developing countries such as Egypt. Machine learning models are widely used to represent complicated phenomena because of their high performance in capturing non-linear relations between inputs and outputs. Therefore, the objectives of this research were to (1) develop and compare four machine learning models: support vector regression (SVR), random forest (RF), extreme gradient boost (XGB), and artificial neural network (ANN) over three potato governorates (Al-Gharbia, Al-Dakahlia, and Al-Beheira) in the Nile Delta of Egypt and (2) select the best model with the best combination of climate input variables. The available variables used for this study were maximum temperature (Tmax), minimum temperature (Tmin), average temperature (Tave), wind speed (WS), relative humidity (RH), precipitation (P), vapor pressure deficit (VPD), solar radiation (SR), sown area (SA), and crop coefficient (Kc) to predict the potato blue water footprint (BWF) during 1990–2016. Six scenarios (Sc1–Sc6) of input variables were used to test the weight of each variable in the four applied models. The results demonstrated that Sc5 with the XGB and ANN models gave the most promising results for predicting BWF in this arid region based on vapor pressure deficit, precipitation, solar radiation, and crop coefficient data, followed by Sc1. The created models produced comparatively superior outcomes and can contribute to the decision-making process for water management and development planners.


Introduction
Freshwater supplies around the world are under significant pressure as a result of increasing consumption and pollution (Steffen et al. 2015, Mekonnen and Hoekstra 2016). Agriculture consumes the largest amount of water, about 92% of the total world water usage (Hoekstra and Mekonnen 2012). In Egypt, a serious problem facing the water supply system is limited water resources and water shortages (Mohie El Din and Moussa 2016). As a result of climate change and rapid population growth, agricultural water resources are decreasing in regions worldwide, especially in the semi-arid and arid zones (Farg et al. 2012). A number of interventions have been carried out to enhance water use efficiency and crop yield in order to save water in irrigated agriculture and achieve water management sustainability (Ward and Pulido-Velazquez 2008). Many indicators are available for evaluating the sustainability of water usage and food production, for example water footprints, water shortages, and crop water productivity (Mekonnen and Hoekstra 2016; Liu et al. 2009). Water footprint (WF) gives an indication of the direct and indirect usage of water and is a metric for determining how much water a product consumes during its life cycle. Its components include green, blue, and grey water (Hoekstra and Mekonnen 2012). The precipitation water absorbed by plants (excluding runoff) is called the green WF. Water intake from rivers, reservoirs, and groundwater is represented by the blue WF, while the freshwater required to assimilate pollutants is represented by the grey WF. Water footprint in agriculture has been extensively researched, using a variety of crops and areas (Hoekstra and Mekonnen 2012).
Water footprint research has focused mainly on lowering the world average use of freshwater (Lovarelli et al. 2016). Local conditions, geographical area, atmosphere, and technology are all considered by the WF (Huang et al. 2012, Zhuo et al. 2016, Tuninetti et al. 2017). The water footprint, as a volumetric water-use indicator, can be calculated by dividing the actual evapotranspiration by the crop yield (Chapagain and Hoekstra 2008). The first parameter, potential crop evapotranspiration, is essential in water footprint calculations. Several mathematical methods are used to assess the reference evapotranspiration (ETo). However, the Food and Agriculture Organization's (FAO) FAO-56 Penman-Monteith method (Allen et al. 1998) is more effective than others because it can be used in a great variety of environments and climate scenarios due to its strong physical basis (Landeras et al. 2008). Although this method needs a wide range of spatiotemporal parameters (maximum and minimum air temperatures, wind speed (Mokhtar et al. 2020a, c), solar radiation, and vapor pressure deficit), it is the most reliable one, and the Penman-Monteith equation is the most commonly used for WF calculations (Chico et al. 2013; Hoekstra et al. 2009; Manzardo et al. 2014). Although these works produced good results, the method consumes much time, cost, data, and effort. Meanwhile, use of the Penman-Monteith method is restricted in areas lacking complete climatic variables, where empirical methods for evapotranspiration estimation are widely used instead (Feng et al. 2017, Feng et al. 2018).
Reference evapotranspiration is affected by a variety of meteorological variables, making it difficult to deal with the dynamic and nonlinear relationships between independent and dependent variables. As a result, developing empirical models that take into account all of these complex processes is a big challenge (Wu et al. 2019). Because of their high performance in modelling nonlinear input-output relationships (Xiao et al. 2019), machine learning methods have been used in several studies to describe complex hydrological processes (Wu et al. 2019, Feng et al. 2019, Mokhtar et al. 2021, Yaseen et al. 2018), including ETo estimation (Wang et al. 2017, Kisi and Sanikhani 2015). Machine learning has now been established as an artificial intelligence (AI) discipline, involving algorithms that capture relevant information from vast datasets and learn from it to make accurate calculations or predictions. During the last decades, various machine learning technologies have been shown to have considerable relevance in the domain of water sciences and technologies, such as artificial neural networks (ANN) (Landeras et al. 2008, Antonopoulos and Antonopoulos 2017), support vector regression (SVR) (Shiri et al. 2014), fuzzy logic models, neuro-fuzzy models, the random forest model (Elbeltagi et al.), and k-Nearest Neighbor (k-NN) (Heddam et al. 2014, Rehman et al. 2019). Machine learning approaches are nowadays widely used to predict ETo, actual water use, water resource variables, hydrological cycles, management of water resources, water quality, and storage activities (Elbeltagi et al. 2021a, Elbeltagi et al. 2021b, Mokhtar et al. 2021). Goyal et al. (2014) employed ANN and other machine learning approaches such as least squares support vector regression (LS-SVR) and fuzzy logic to estimate actual evaporation in subtropical climates. Artificial neural networks were also used effectively to evaluate ETo using incomplete meteorological data (Laaboudi et al. 2012). Machine learning algorithms have been used successfully to solve problems related to potato cultivation, such as predicting leaf water potential (Zakaluk et al. 2006), root development modeling (Delgoda et al. 2016), tuber growth (Fortin et al. 2010), and ETo estimation (Tabari et al. 2013; Sabziparvar and Tabari 2010; Yamaç and Todorovic 2020).
Therefore, the objectives of this research were to (1) develop and compare the accuracy of ETo estimation from limited weather input data by four machine learning models (SVR, RF, XGB, and ANN) over three potato governorates and (2) select the best model under different scenarios, that is, the one that achieves the highest accuracy and lowest error in forecasting the potato blue WF. This study can provide an innovative modeling method that will enhance efforts to forecast the WF, which in turn will support mitigation strategies such as water usage policies and food safety development plans.

Study Area and Datasets
The Nile Delta is considered Egypt's center of commercial and financial activities and is home to Egypt's most populated governorates, containing roughly half of the country's population. Despite accounting for just about 2% of Egypt's total land area, the Nile Delta is responsible for approximately 63% of the country's agricultural activities (Dumont and El-Shabrawy 2007). Three governorates in the Nile Delta, namely Al-Gharbia, Al-Dakahlia, and Al-Beheira (Figure 1), were selected as the governorates with the highest potato production during the period 1990-2016.
In addition, data on solar radiation (SR), soil moisture (SM), and vapor pressure deficit (VPD) for October to February over the period 1990-2016 were gathered from the Climatology Lab (https://climate.northwestknowledge.net/) (Adhikari et al. 2019). Moreover, actual potato yield data from 1990 to 2016 were collected from the agriculture directorates of the governorates, Economic Affairs Sector, Ministry of Agriculture and Land Reclamation.
Potato was selected for this study because it is one of Egypt's most significant crops, important for production, national consumption, and exports. In Egypt, the winter season runs from October to February of the following year and is the main cultivation season for export potato production (Gennari et al. 2019).

Water Footprint Calculations
Only the blue water footprint (BWF) was calculated in this study, as there is usually no rainfall in Egypt during the winter production season of potato. BWF was calculated using Eq. (1). The blue crop water requirement was estimated by multiplying the blue-water evapotranspiration accumulated over the growing period by 10, as indicated in Eq. (2) (Romaguera et al. 2010, Xinchun et al. 2018).

$$\mathrm{BWF} = \frac{\mathrm{CWR}_{blue}}{Y} \tag{1}$$

where $Y$ is the crop production (ton ha⁻¹) and $\mathrm{CWR}_{blue}$ is the blue crop water requirement (m³ ha⁻¹). The blue water consumption in the crop water requirement (CWR) was determined using the accumulated daily evapotranspiration (ET) (Mekonnen and Hoekstra 2011) as follows:

$$\mathrm{CWR}_{blue} = 10 \times \sum_{d=1}^{n_d} ET_{c,blue} \tag{2}$$

where $ET_{c,blue}$ is the daily crop evapotranspiration (mm), $n_d$ is the number of days in the growing period, and the factor 10 converts a water depth in mm into a volume per unit area in m³ ha⁻¹. $ET_{c,blue}$ was calculated using Eq. (3):

$$ET_{c,blue} = K_c \times ET_o \tag{3}$$

where $ET_o$ is the reference evapotranspiration (mm day⁻¹) and $K_c$ is the crop coefficient. Kc was adjusted for particular climatic conditions in environments where RHmin differed from 45% or where u2 was higher or lower than 2.0 m/s, following FAO Irrigation and Drainage Paper No. 56:

$$K_c = K_{c(Tab)} + \left[0.04\,(u_2 - 2) - 0.004\,(RH_{min} - 45)\right]\left(\frac{h}{3}\right)^{0.3}$$

where $K_{c(Tab)}$ is the tabulated crop coefficient, $u_2$ is the mean daily wind speed at 2 m height (m/s), $RH_{min}$ is the mean daily minimum relative humidity (%), and $h$ is the mean plant height (m). ETo calculator software (http://www.fao.org/land-water/databases-and-software/eto-calculator/en/), which is based on the Penman-Monteith equation, was used to estimate the reference evapotranspiration. The detailed data and computation procedure for ETo can be found in Mokhtar et al. (2020b, d).
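The calculation chain described above (climatic Kc adjustment, daily crop evapotranspiration, conversion to a volume per hectare, division by yield) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' workflow: the function names and all input values are hypothetical, and the Kc adjustment uses the FAO-56 mid-season form, which is assumed here to be the one applied in the study.

```python
# Illustrative sketch of the blue water footprint chain; all inputs are made up.

def adjust_kc(kc_table, u2, rh_min, height_m):
    """FAO-56 climatic adjustment of a tabulated Kc (mid-season form, assumed)."""
    correction = (0.04 * (u2 - 2.0) - 0.004 * (rh_min - 45.0)) * (height_m / 3.0) ** 0.3
    return kc_table + correction


def blue_water_footprint(eto_mm_per_day, kc_per_day, yield_ton_per_ha):
    """Return BWF in m3 per ton from daily ETo (mm/day), daily Kc, and yield (ton/ha)."""
    if len(eto_mm_per_day) != len(kc_per_day):
        raise ValueError("ETo and Kc series must have the same length")
    # Seasonal blue crop evapotranspiration: sum of Kc * ETo over the season (mm).
    et_c_blue_mm = sum(kc * eto for kc, eto in zip(kc_per_day, eto_mm_per_day))
    # 1 mm of water over 1 ha equals 10 m3, hence the factor 10.
    cwr_blue = 10.0 * et_c_blue_mm
    return cwr_blue / yield_ton_per_ha


# Hypothetical season: 100 days, ETo = 3 mm/day, Kc = 1.0, yield = 25 ton/ha.
bwf = blue_water_footprint([3.0] * 100, [1.0] * 100, 25.0)
print(bwf)  # 120.0 (m3 per ton)

# Under the reference conditions (u2 = 2 m/s, RHmin = 45%) the adjustment vanishes.
kc_reference = adjust_kc(1.15, u2=2.0, rh_min=45.0, height_m=0.6)
# Windier conditions raise Kc above the tabulated value.
kc_windy = adjust_kc(1.15, u2=3.0, rh_min=45.0, height_m=0.6)
```

In practice the daily ETo series would come from the ETo calculator or a Penman-Monteith implementation rather than a constant value.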

Machine Learning Approaches
In this study, support vector regression (SVR), random forest (RF) (Elbeltagi et al.), artificial neural network (ANN), and extreme gradient boost (XGB) were used to estimate the blue water footprint. The data were split into two groups: the first (75%) was used as training data to fit the models, while the second (25%) was used to validate the models by comparing the predicted blue WF values with the calculated values. Six scenarios (combinations) of input variables were developed to test the weight of each variable for the four applied models (Table 1).
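A 75/25 split and the selection of one scenario's input columns might look like the sketch below. It is a hedged illustration only: the text does not say whether the split was random or chronological, so a seeded random shuffle is assumed, the record values are invented, and the helper names `split_train_test` and `to_xy` are hypothetical. Only the Sc5 variable list (VPD, P, SR, Kc) is taken from the text.

```python
import random

# Inputs named for scenario Sc5 in the text; the other scenarios are listed in Table 1.
SC5_INPUTS = ["VPD", "P", "SR", "Kc"]


def split_train_test(records, train_fraction=0.75, seed=42):
    """Shuffle a copy of the records and split it into training and testing parts.

    A random split is an assumption; the original split strategy is not stated.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def to_xy(records, input_names, target_name="BWF"):
    """Turn a list of dict records into a feature matrix X and a target vector y."""
    X = [[record[name] for name in input_names] for record in records]
    y = [record[target_name] for record in records]
    return X, y


# Hypothetical records with made-up values (81 rows, e.g. 3 governorates x 27 seasons).
records = [
    {"VPD": 0.8 + 0.01 * i, "P": 0.0, "SR": 14.0 + 0.05 * i, "Kc": 1.0, "BWF": 110.0 + i}
    for i in range(81)
]

train, test = split_train_test(records)
X_train, y_train = to_xy(train, SC5_INPUTS)
X_test, y_test = to_xy(test, SC5_INPUTS)
print(len(train), len(test))  # 61 20
```

Each scenario would reuse the same split with a different `input_names` list, so that differences in model skill can be attributed to the inputs rather than to the sampling.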

Support Vector Regression (SVR)
SVR is a machine learning algorithm for data processing, analysis, and pattern detection that is commonly used for regression and forecasting. Support vector methods are also used in classification problems, where they rely on a flexible representation of class boundaries and automatic complexity control to reduce overfitting, and they identify a single global minimum. The SVR model performs regression using a set of kernel functions that implicitly map the original, lower-dimensional input data into a higher-dimensional feature space. SVR provides a unique solution because the optimization problem is convex, as opposed to the ANN model, which usually has several local minima. The general non-linear SVR is presented as follows:

$$f(x) = \sum_{i=1}^{N} \left(\alpha_i - \alpha_i^{*}\right) K(x_i, x) + b$$

where $f(x)$ is the relationship between the dependent and independent variables, $\alpha_i$ and $\alpha_i^{*}$ are the Lagrangian multipliers, $K(x_i, x)$ is the kernel function, and $b$ is the function bias.
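To make the kernel expansion concrete, the sketch below evaluates an SVR prediction from a given set of support vectors and multipliers. It does not train anything: the support vectors, multipliers, bias, and the choice of a radial basis function (RBF) kernel with `gamma = 0.5` are all hypothetical, since the kernel used in the study is not stated here.

```python
import math


def rbf_kernel(x_i, x, gamma=0.5):
    """Radial basis function kernel exp(-gamma * ||x_i - x||^2); gamma is assumed."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x_i, x))
    return math.exp(-gamma * squared_distance)


def svr_predict(x, support_vectors, alpha, alpha_star, bias, kernel=rbf_kernel):
    """Evaluate f(x) = sum_i (alpha_i - alpha_i*) K(x_i, x) + b."""
    weighted_kernels = (
        (a - a_star) * kernel(sv, x)
        for sv, a, a_star in zip(support_vectors, alpha, alpha_star)
    )
    return sum(weighted_kernels) + bias


# Hypothetical trained model with two support vectors.
support_vectors = [[0.0, 0.0], [1.0, 1.0]]
alpha = [0.8, 0.0]
alpha_star = [0.0, 0.3]
bias = 2.0

prediction = svr_predict([0.0, 0.0], support_vectors, alpha, alpha_star, bias)
```

The prediction depends on the training data only through the support vectors, which is why a trained SVR stays compact even when the training set is large.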

Random Forest (RF)
RF is an ensemble learning method for classification, regression, and clustering. It builds a collection of randomized decision trees and predicts either the mode of the classes (classification) or the mean of the individual trees' predictions (regression). The steps involved in creating an RF model are summarized as follows (Elbeltagi et al. 2021a):
• From the original data, draw n_tree bootstrap samples. Each bootstrap sample contains about 2/3 of the elements of the initial dataset.
• Grow an unpruned regression tree for each bootstrap sample: at each node, randomly sample a subset of the predictors and choose the best split among these variables, rather than the best split among all the predictors.
• Aggregate the predictions of the n_tree trees to predict new data (majority vote for classification, average for regression).
RF can be applied to large datasets with thousands of input variables, and it handles high-dimensional data well (Rodriguez-Galiano et al. 2012).

Artificial Neural Network
Goyal et al. (2014) defined ANN as a type of computer program that simulates how the human brain processes information and handles data. They are, in other words, digitized simulations of the human brain. ANN models are characterized by an initialization mechanism that transforms the input into output using interconnected information-processing units. Neural networks acquire knowledge by identifying correlations and data patterns. Input data are provided to the first layer of the neural network, where they are analyzed before being sent to the hidden layers. The information is then passed from the hidden layers to the final layer, which produces the output. ANNs, like people, learn from experience, through suitable training examples, rather than by explicit programming. They learn from data with known outcomes, adjusting their weights so that they can make good forecasts when the outcome is unknown. An ANN design generally comprises a set of input signals to the artificial neuron (x), a weighted sum of them (z) computed using the summing function and weights (w), and an activation function (f) that generates the output. Equation (6) shows the mathematical model, in which y denotes the output variable, x1, x2, …, xn are the input variables, w1, w2, ..., wn are the corresponding weights, θ(.) is the unit step function, wi is the weight associated with the ith input, and μ is the mean.
The generalized weight is represented as follows:

$$\tilde{w}_i = \frac{\partial \log\left(\dfrac{o(x)}{1 - o(x)}\right)}{\partial x_i}$$

where $o(x)$ is the predicted outcome probability for the covariate vector $x$, and the log-odds is the link function of the logistic regression model. The generalized weight expresses the influence of the covariate $x_i$ and is thus analogous to the ith regression parameter in regression models. Note, however, that the generalized weight depends on all other covariates.
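The neuron model and the generalized weight can be illustrated with a single logistic neuron. This is a hedged sketch, not the network used in the study: the weights, bias, and inputs are hypothetical, a sigmoid activation is assumed, and the generalized weight is approximated by a central finite difference of the log-odds. For a single logistic neuron the log-odds equals the weighted sum, so the generalized weight of input i should reduce to its weight.

```python
import math


def neuron_output(x, weights, bias):
    """Weighted sum of the inputs followed by a sigmoid activation (assumed)."""
    z = sum(w * x_i for w, x_i in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def log_odds(probability):
    """Logit of a probability strictly between 0 and 1."""
    return math.log(probability / (1.0 - probability))


def generalized_weight(x, weights, bias, index, step=1e-5):
    """Central-difference estimate of d log-odds(o(x)) / d x_index."""
    x_plus = list(x)
    x_minus = list(x)
    x_plus[index] += step
    x_minus[index] -= step
    upper = log_odds(neuron_output(x_plus, weights, bias))
    lower = log_odds(neuron_output(x_minus, weights, bias))
    return (upper - lower) / (2.0 * step)


# Hypothetical neuron with two inputs.
weights = [0.5, -0.25]
bias = 0.1
x = [1.0, 2.0]

output = neuron_output(x, weights, bias)
gw_first = generalized_weight(x, weights, bias, index=0)
gw_second = generalized_weight(x, weights, bias, index=1)
```

In a network with hidden layers the same finite-difference estimate would vary with the values of the other covariates, which is the dependence noted in the text.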

Extreme Gradient Boosting (XGB)
Chen and Guestrin (2016) proposed the XGB method, a new implementation of gradient boosting, in particular for K classification and regression trees (Chen and Guestrin 2016). It is based on the "boosting" idea, which combines the predictions of a group of weak learners through additive training strategies in order to develop a "strong" learner. XGB is designed to avoid overfitting while still optimizing computing resources. This is done by simplifying the objective functions so that they combine predictive and regularization terms while retaining a high computational speed. During the training process, XGB also performs parallel computations automatically. In the XGB additive learning procedure, the first learner is fitted to the entire input space, and then a second model is fitted to the residuals to overcome the weak learner's shortcomings. This fitting is repeated until the stopping criterion is met. The final prediction of the model is computed as the sum of the predictions of each learner. The general function is presented as follows:

$$f_i^{(t)} = \sum_{k=1}^{t} f_k(x_i) = f_i^{(t-1)} + f_t(x_i)$$

where $f_i^{(t)}$ and $f_i^{(t-1)}$ are the predictions at steps $t$ and $t-1$, $f_t(x_i)$ is the learner added at step $t$, and $x_i$ is the input variable. Chen and Guestrin (2016) give extensive details and calculations for the XGB algorithm.
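The additive training idea, in which each new learner is fitted to the residuals of the current ensemble, can be shown with a deliberately simplified sketch. This is plain residual boosting with one-split regression stumps on a single input, not the XGBoost algorithm itself: it has no regularization term, no second-order gradient information, and no parallelism. The data, the learning rate, and the number of rounds are hypothetical.

```python
def fit_stump(xs, residuals):
    """Fit a one-split regression stump that minimizes the squared error."""
    best = None
    # Candidate thresholds exclude the largest x so both sides stay non-empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = sum((r - left_mean) ** 2 for r in left)
        sse += sum((r - right_mean) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def boost(xs, ys, n_rounds=20, learning_rate=0.5):
    """Additive training: f(t) = f(t-1) + learning_rate * stump fitted to residuals."""
    base = sum(ys) / len(ys)
    learners = []
    predictions = [base] * len(xs)
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        learners.append(stump)
        predictions = [p + learning_rate * stump(x) for p, x in zip(predictions, xs)]
    return lambda x: base + learning_rate * sum(stump(x) for stump in learners)


def mean_squared_error(ys, predictions):
    return sum((y - p) ** 2 for y, p in zip(ys, predictions)) / len(ys)


# Hypothetical non-linear data.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [x ** 2 for x in xs]

model = boost(xs, ys)
baseline_mse = mean_squared_error(ys, [sum(ys) / len(ys)] * len(ys))
boosted_mse = mean_squared_error(ys, [model(x) for x in xs])
```

Comparing `boosted_mse` with `baseline_mse` shows how the sum of many weak learners improves on the constant starting prediction; a production model would also monitor the error on held-out data to decide when to stop.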

Evaluation of Model's Performance
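In this study, the root mean square error (RMSE), which is the sample standard deviation of the differences between predicted and actual values, was calculated as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(O_i - P_i\right)^2}$$

where $O_i$ and $P_i$ are the actual and forecasted values and $n$ is the number of observations.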
The mean absolute error (MAE) assesses the average magnitude of the errors in a set of predictions without considering their signs. The absolute differences between predicted and actual values are averaged over the test sample. MAE is calculated as:

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|O_i - P_i\right|$$

The mean bias error (MBE) (Springmann et al. 2018) was also used to assess the applied models:

$$\mathrm{MBE} = \frac{1}{n}\sum_{i=1}^{n}\left(P_i - O_i\right)$$

Moreover, the accuracy (A) and the coefficient of determination (R²) were calculated. A greater R² indicates higher prediction accuracy, while a lower RMSE is evidence of improved model performance.
The Nash-Sutcliffe model efficiency coefficient (NSE) is a normalized statistic that expresses the magnitude of the residual variance relative to the variance of the measured data:

$$\mathrm{NSE} = 1 - \frac{\sum_{i=1}^{n}\left(O_i - P_i\right)^2}{\sum_{i=1}^{n}\left(O_i - \bar{O}\right)^2}$$

where $O_i$ and $P_i$ are the actual and forecasted values and $\bar{O}$ is the mean of the actual values.
The range of Nash-Sutcliffe efficiency coefficient values and the scatter index (SI) for the accuracy of the models are shown in Table 2.
The mean absolute percentage error (MAPE) was also calculated. Additionally, the uncertainty at 95% (U95), which shows the model deviations and helps to assess substantial differences between the predicted and calculated BWF, gave more information on the efficacy of the model. MAPE and U95 are defined as:

$$\mathrm{MAPE} = \frac{100}{n}\sum_{i=1}^{n}\left|\frac{O_i - P_i}{O_i}\right|$$

$$U_{95} = 1.96\,\sqrt{\mathrm{SD}^2 + \mathrm{RMSE}^2}$$

where $O_i$ and $P_i$ are the actual and forecasted values, $\bar{O}$ and $\bar{P}$ are the actual and projected mean values, $n$ is the number of observations, and SD is the standard deviation of the differences between the estimated and calculated values.
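The error metrics described in this section are straightforward to compute once observed and predicted values are paired. The sketch below uses the usual textbook definitions, which are assumed (not confirmed) to match the authors' formulas; the sign convention for MBE (predicted minus observed) is likewise an assumption, and the sample values are invented.

```python
import math


def rmse(observed, predicted):
    """Root mean square error."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mae(observed, predicted):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def mbe(observed, predicted):
    """Mean bias error; positive values mean overestimation (assumed convention)."""
    return sum(p - o for o, p in zip(observed, predicted)) / len(observed)


def nse(observed, predicted):
    """Nash-Sutcliffe efficiency: 1 minus residual variance over observed variance."""
    mean_observed = sum(observed) / len(observed)
    residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    total = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - residual / total


def mape(observed, predicted):
    """Mean absolute percentage error, in percent; observed values must be non-zero."""
    n = len(observed)
    return 100.0 / n * sum(abs((o - p) / o) for o, p in zip(observed, predicted))


# Hypothetical BWF values in m3 per ton.
observed = [100.0, 110.0, 120.0, 130.0]
predicted = [102.0, 108.0, 123.0, 127.0]

scores = {
    "RMSE": rmse(observed, predicted),
    "MAE": mae(observed, predicted),
    "MBE": mbe(observed, predicted),
    "NSE": nse(observed, predicted),
    "MAPE": mape(observed, predicted),
}
```

In this example the positive and negative errors cancel, so MBE is zero even though MAE is 2.5, which illustrates why a bias measure should be read alongside a magnitude measure.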
Percent bias (PBIAS) was used to assess the prediction model's bias. The PBIAS metric quantifies the degree of deviation between the predicted value and the actual value (Barzegar et al. 2020). The ideal PBIAS value is 0.0; a positive value denotes a bias of overestimation, and a negative value indicates a bias of underestimation in the model (Aslan et al. 2022). The PBIAS metric is expressed as a percentage and is calculated using the following equation:

$$\mathrm{PBIAS} = 100 \times \frac{\sum_{i=1}^{n}\left(P_i - O_i\right)}{\sum_{i=1}^{n} O_i}$$

where $O_i$ and $P_i$ are the actual and forecasted values.

Table 2 The range of Nash-Sutcliffe efficiency coefficient (NSE) values and the scatter index (SI) for the accuracy of the models according to (Moriasi et al. 2007) and (Li et al. 2013)

Evaluation of the Machine Learning Models
The model performance statistics for BWF prediction under the climatic data scenarios are illustrated in Fig. 2a for the four applied models. The results showed that R² was higher than 0.70 for all scenarios except Sc4 in SVR, where it was 0.52. The highest R² value was observed for XGB and ANN in Sc6 (R² > 0.95), followed by Sc5 for both XGB and ANN (R² close to 0.90).
The SI was fair for Sc5 when using XGB and ANN, while it was poor for Sc4 using SVR (Fig. 2d). The findings showed that Sc5 was sufficient to estimate BWF with the XGB and ANN models when only vapor pressure deficit, precipitation, solar radiation, and crop coefficient data are available. The obtained results are in line with the findings of Xu et al. (2015), who showed that changes in temperature, vapor pressure deficit, relative humidity, wind speed, and sunshine duration have the strongest impact on ETo, which affects the BWF directly. Also, Elbeltagi et al. (2021b) stated that SR, RH, and VPD data integration using ANN gave the best estimation of the blue WF in Al-Daqahliyah Governorate.

Fig. 2 Performance statistics of ML models applied to the six distinct climate variable scenarios
By inspecting the uncertainty in the performance metrics, the results showed that there are significant differences in the performance of SVR, RF, XGB, and ANN (Table 3). The correlation coefficient (CC) values were higher than 0.80 for all scenarios and models except SVR, for which Sc4 gave a CC of 0.71, the lowest value. The scenarios that gave the highest CC were Sc6 for XGB, ANN, and RF, with correlations of 0.97, 0.95, and 0.95, respectively, followed by Sc5 for XGB, ANN, and RF, with values of 0.94, 0.94, and 0.93, respectively, while SVR in Sc1 gave a CC of 0.95. The MBE, MAE, and PBIAS values for the SVR model in Sc4 were −1.04 m³ ton⁻¹, 7.48 m³ ton⁻¹, and 4.1%, respectively. XGB had the lowest U95 value, 10% for Sc5, which demonstrates its higher performance over the other models, followed by 10.5% in Sc5 for the RF model (Table 3). The results showed that the highest R² was recorded for XGB, followed by the ANN model. Moreover, the results of all applied models are satisfactory in predicting the BWF over the study area. The current research outputs are in agreement with the findings of Elbeltagi et al. (2020), who applied the ANN model for forecasting maize BWF with R² > 0.95 using Tmean, WS, SR, VPD, H, Kc, and SM as ANN model inputs. Furthermore, Mokarram et al. (2021) investigated the WF of maize and reported a correlation coefficient (R²) of 0.75 using the MLP (multilayer perceptron)-ANN model. However, the current study showed better correlations when estimating the WF of potatoes.
For almost all scenarios, the XGB model consistently achieved accuracy above 95%, while the SVR model exhibited the lowest accuracy, notably around 44% in scenario Sc4. Moreover, the XGB model demonstrated the lowest mean absolute percentage error (MAPE) across all scenarios, followed closely by the RF model. Conversely, the SVR model, particularly in Sc4, displayed the highest MAPE value.

Fig. 3 Radar charts of the accuracy and mean absolute percentage error (MAPE) of the blue water footprint (BWF) for the applied ML models

The findings presented in Fig. 4 indicated that over 80% of the data was within the range of −20 to 20 for Sc1 across the RF, SVR, and XGB models and for Sc5 in the RF and XGB models. However, in Sc4, specifically with the SVR model, around 40% of the data points exceeded the range of −20 to 20 during testing, reflecting performance variations in the best model scenarios.
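The share of test points falling inside an error band, as reported above, can be computed as in the following sketch. It assumes that the −20 to 20 range refers to the percentage error of each prediction relative to the observed value; the sample values and function names are hypothetical.

```python
def percentage_errors(observed, predicted):
    """Signed percentage error of each prediction relative to the observed value."""
    return [100.0 * (p - o) / o for o, p in zip(observed, predicted)]


def share_within(errors, limit=20.0):
    """Percentage of errors whose absolute value does not exceed the limit."""
    inside = sum(1 for error in errors if abs(error) <= limit)
    return 100.0 * inside / len(errors)


# Hypothetical observed and predicted BWF values.
observed = [100.0, 100.0, 100.0, 100.0, 100.0]
predicted = [90.0, 105.0, 125.0, 119.0, 70.0]

errors = percentage_errors(observed, predicted)
share = share_within(errors)
print(share)  # 60.0
```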

Comparison of the Machine Learning Models
To better understand the distribution of the data and the ability of the selected models to predict BWF, the predicted and actual BWF values for the testing stage were presented and compared as scatter plots and box plots (Figs. 5 and 6).
The results showed that there are significant differences between the four applied models. The box plots show the error distribution based on four values: the first quartile (Q1), the third quartile (Q3), the interquartile range (IQR), and the median, which is indicated by a line within the box. The XGB model had the lowest error IQR, whereas the SVR model gave the highest IQR in Sc5 (Fig. 5). A low IQR indicates that the errors are concentrated close to zero, and a median line in the middle of the box suggests a symmetric, approximately normal error distribution.
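The quantities drawn in a box plot can be obtained with Python's standard `statistics` module, as sketched below. The two error lists are invented and merely stand in for a model with tightly clustered errors and one with widely spread errors; quartile conventions differ between tools, so the values may not match a given plotting library exactly.

```python
import statistics


def box_summary(errors):
    """Return Q1, median, Q3, and IQR of a list of prediction errors."""
    q1, median, q3 = statistics.quantiles(errors, n=4)
    return {"Q1": q1, "median": median, "Q3": q3, "IQR": q3 - q1}


# Hypothetical prediction errors (m3 per ton) for two models.
tight_errors = [-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
wide_errors = [-12.0, -8.0, -3.0, 0.0, 2.0, 6.0, 9.0, 15.0]

tight_summary = box_summary(tight_errors)
wide_summary = box_summary(wide_errors)
```

A smaller IQR, as in `tight_summary`, corresponds to the narrower box reported for XGB, while a larger IQR corresponds to the wider box reported for SVR.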
As shown in Fig. 6, the predicted BWF values have a very close distribution pattern to the actual BWF values in the four applied models for Sc5, and the R² values for SVR, RF, XGB, and ANN were 0.83, 0.87, 0.89, and 0.89, respectively. Based on the foregoing results, the best models for the entire region were the XGB and ANN models, although high performance can be expected from all the developed models.
Similar to the present study, Elbeltagi et al. (2020) employed a deep neural network (DNN) approach to simulate future water footprints based on monthly climatic data of minimum temperature (Tmin), maximum temperature (Tmax), precipitation (P), solar radiation (SR), soil moisture (SM), wind speed (WS), and vapor pressure deficit (VPD), and reported that their results will help optimize future water planning for the agricultural sector under climate change. Furthermore, Elbeltagi et al. (2021a) studied the spatiotemporal variability of the blue ET, which is a part of blue WF, based on the available climate data in this study. In addition, the models applied in the present study produced better results than those of Kersebaum et al. (2016), which showed that the WF differences across the seven crop models ranged from 15 to 49%. Furthermore, our results are acceptable compared to Garofalo et al. (2019), who employed four crop models for modeling WF in two areas and whose results showed that the variations in WF values were on average between 5 and 23% smaller than their real data. The model outcomes were also comparable to those reported by Karandish and Šimůnek (2019), who explained that their simulated WF results were 0.3 to 3.2% below those of the SALTMED model in respect to the observed maize WF.
No machine learning method is generally the best for all purposes. The performance of the various methods depends heavily on the size and structure of the data provided. The above algorithms were chosen because they are generally highly successful and very effective at learning complex, non-linear relations (Granata 2019).

Fig. 4 Percentage error of the applied ML models during the testing stage
Fig. 5 Boxplots illustrating the distribution of the BWF's estimate errors for the best model scenarios in the test section. Q1 is a lower error quartile, Q3 is a higher error quartile, and IQR is an interquartile range for each model
Fig. 6 Scatter plot of the applied models over the 6 scenarios for estimating the blue water footprint

Conclusion
There is an increasing interest in enhancing agricultural water productivity under decreasing water availability in order to fulfill the increasing global food demand with limited freshwater. The objective is to grow more crops with less water, hence lowering the WF per agricultural unit produced. In this study, BWF was estimated for potato crops from 1990 to 2016 in three governorates in the Nile Delta of Egypt. Four machine learning algorithms (SVR, RF, XGB, and ANN) were developed, and six scenarios (combinations) of input variables were used to test the weight of each variable when using the four applied models. The statistical indexes indicated successful WF estimation when using the XGB and ANN models, with accuracy of more than 90%, R² = 0.90, RMSE = 3.6 m³ t⁻¹, and very good NSE in the three governorates. The results demonstrated that the XGB and ANN models can estimate the BWF with acceptable accuracy when vapor pressure deficit, precipitation, solar radiation, and crop coefficient data are available (Sc5). The developed methodology may thus be a valuable decision tool for ensuring the sustainable management of agricultural water in the semi-arid zones.
Abbreviations SVR: support vector regression; RF: random forest; XGB: extreme gradient boost; ANN: artificial neural network; BWF: blue water footprint; NSE: Nash-Sutcliffe model efficiency coefficient; RMSE: root mean square error; MAE: the mean absolute error; MBE: the mean bias error; R²: the coefficient of determination; MAPE: mean absolute percentage error; U95: uncertainty with 95

Table 1
Summary combinations of blue water footprint for the developed models

Table 3
Model performance measures for blue water footprint estimation * CC, correlation coefficient