Abstract
Various critical applications, spanning from watershed management to agricultural planning and ecological sustainability, hinge upon the accurate prediction of reference evapotranspiration (ETo). In this context, our study aimed to enhance the accuracy of ETo prediction models by combining a variety of signal decomposition techniques with an Artificial Bee Colony (ABC)–artificial neural network (ANN) (codename: ABC–ANN). To this end, historical (1979–2014) daily climate variables, including maximum temperature, minimum temperature, mean temperature, wind speed, relative humidity, solar radiation, and precipitation from four arid and semi-arid regions in Egypt: Al-Qalyubiyah, Cairo, Damietta, and Port Said, were used. Six techniques, namely, Empirical Mode Decomposition, Variational Mode Decomposition, Ensemble Empirical Mode Decomposition, Local Mean Decomposition, Complete Ensemble Empirical Mode Decomposition with Adaptive Noise, and Empirical Wavelet Transform were used to evaluate signal decomposition efficiency in ETo prediction. Our results showed that the highest ETo prediction accuracy was obtained with ABC-ANN (Train R2: 0.990 and Test R2: 0.989), (Train R2: 0.986 and Test R2: 0.986), (Train R2: 0.991 and Test R2: 0.989) and (Train R2: 0.988 and Test R2: 0.987) for Al-Qalyubiyah, Cairo, Damietta, and Port Said, respectively. The impressive results of our hybrid model attest to its importance as a powerful tool for tackling the problems associated with ETo prediction.
Introduction
Evapotranspiration (ET) is a pivotal element of the water cycle, with significant influence on crop growth, agricultural yield (Elsadek et al. 2023), and micro-meteorological dynamics (Dong et al. 2021; Yi et al. 2022). It directly affects agricultural management through soil moisture levels, crop water requirements, and the formulation of effective irrigation strategies (Tang et al. 2018; Jiang et al. 2019). Lysimeters and eddy covariance systems are widely used for measuring ET. They give precise information, but they are time-consuming and expensive (Dong et al. 2021). The Food and Agriculture Organization (FAO) and the American Society of Civil Engineers endorse the standardized Penman–Monteith (PM) equation as the preferred method for estimating reference crop evapotranspiration (ETo). This equation, which integrates physiological and aerodynamic parameters, estimates ET from reference surfaces and has been validated across diverse climatic regions using lysimeter measurements (Allen et al. 1998). Nonetheless, its reliance on extensive meteorological datasets (such as solar radiation, wind speed, and relative humidity) poses challenges in regions where such data may be scarce (Lee et al. 2024). As an alternative, the Hargreaves (HG) equation, a simplified temperature-based equation that uses maximum, minimum, and average temperatures (Elsadek 2023; Elsadek et al. 2024), has been proposed. However, potential evapotranspiration (PET) estimates derived from the HG equation have shown discrepancies when compared to the PM equation across various geographical locations worldwide. As a result, numerous researchers have devised practical models that demand fewer climatic data inputs for estimating ETo and yield satisfactory outcomes across diverse climatic conditions (Tabari et al. 2013; Feng et al. 2017a; Zhang et al. 2018).
However, the estimates of these models may be uncertain because of the influence of local climates and geographic variations (Zhao et al. 2023).
Recently, machine learning (ML) techniques such as artificial neural networks (ANNs), extreme learning machines, random forests, gene expression programming, and extreme gradient boosting have shown satisfactory results in ETo modeling and prediction (Kumar et al. 2002; Landeras et al. 2008; Traore et al. 2010; Fan et al. 2018; Reis et al. 2019; Ferreira et al. 2019; Petkovi et al. 2020; Zhu et al. 2020; Gong et al. 2021; Chia et al. 2021). These models provide credible estimations of ETo, and they require fewer climatic data inputs in comparison with conventional methods (Feng et al. 2017b; Fan et al. 2019; Chia et al. 2021). These models have proven their effectiveness across different climates and geographic regions, making them valuable tools for ETo modeling (Ferreira and da Cunha 2020; Sowmya et al. 2020; Bellido-Jiménez et al. 2021; Kaya et al. 2021). Moreover, hybrid ML models that combine various algorithms are considered an efficient alternative model that enhances the reliability of ETo estimates (Reis et al. 2019; Zhu et al. 2020; Mohammadi and Mehdizadeh 2020; Gong et al. 2021; Roy 2021; Chia et al. 2021; Elzain et al. 2024; Granata et al. 2024).
In this study, various data preprocessing techniques were combined with the ABC-ANN algorithm for predicting daily ETo values based on several meteorological variables: minimum temperature (Tmin), maximum temperature (Tmax), mean temperature (Tmean), precipitation (P), wind speed (WS), relative humidity (RH), solar radiation (Rs), and extraterrestrial solar radiation (Ra). By using these decomposition techniques, the study aims to capture the intrinsic modes and underlying patterns in the climatic data, leading to a more accurate and robust prediction of ETo. The study also investigates how the signal processing approach affects the ETo prediction performance of the ML model, and the ETo predictions of the different models are compared to evaluate their performance. The ultimate goal is to provide a powerful and reliable tool for critical applications such as watershed management, agricultural planning, and ecological sustainability, thereby addressing the challenges associated with accurate ETo prediction in arid and semi-arid climates.
Material and methods
Study area and data collection
This research employed daily meteorological variables from 1979 to 2014 from four regions in Egypt, namely Al-Qalyubiyah, Cairo, Damietta, and Port Said. Figure 1 shows the locations of the case studies on a map of Egypt. We also employed ET datasets for these regions during the same period, which were calculated by the HG method because of limited meteorological variables. The Al-Qalyubiyah region has a Mediterranean climate and practices traditional agriculture, including crops such as wheat and vegetables. Cairo is known for its desert climate, its urbanization, and the cultivation of dates and citrus fruits. Damietta has an arid climate and is well known for rice (Elsadek 2023; Elsadek et al. 2024) and citrus cultivation. Port Said is strategically positioned at the entrance of the Suez Canal and engages in trade-related activities as well as wheat and barley farming. Geographical information for each station, including latitude, longitude, and elevation, is listed in Table 1.
Nine daily meteorological variables were used in this study: Tmin, Tmax, Tmean, P, WS, RH, Rs, Ra, and ETo. The first eight climatic variables were used as model inputs, while ETo was the target variable. The meteorological variables and their time series were selected as model inputs based on their correlation with ETo (Fig. 2) and on previous studies that demonstrated the effectiveness of these variables for ETo prediction (Abdallah et al. 2022; Mehdizadeh et al. 2021).
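As a rough illustration of how the HG method turns temperature data into an ETo value, the sketch below uses the widely cited Hargreaves and Samani form, ETo = 0.0023 · Ra · (Tmean + 17.8) · (Tmax − Tmin)^0.5, with Ra expressed as equivalent evaporation. The exact HG variant and coefficients used in this study are not stated, so the function name, the 0.408 unit conversion, and the sample inputs are assumptions for illustration only.

```python
import math


def hargreaves_eto(tmax_c, tmin_c, tmean_c, ra_mj):
    """Hargreaves-Samani style ETo estimate in mm/day (illustrative sketch).

    tmax_c, tmin_c, tmean_c: daily max, min and mean air temperature (deg C)
    ra_mj: extraterrestrial radiation Ra (MJ m^-2 day^-1)
    """
    # Convert Ra from MJ m^-2 day^-1 to equivalent evaporation in mm/day.
    ra_mm = 0.408 * ra_mj
    # Guard against a negative daily range caused by bad input data.
    temp_range = max(tmax_c - tmin_c, 0.0)
    return 0.0023 * ra_mm * (tmean_c + 17.8) * math.sqrt(temp_range)


# Hypothetical summer day (values are made up, not taken from the dataset).
eto_example = hargreaves_eto(tmax_c=35.0, tmin_c=21.0, tmean_c=28.0, ra_mj=40.0)
print(f"ETo = {eto_example:.2f} mm/day")
```

Because only temperatures and Ra enter the formula, and Ra follows from latitude and day of year, this kind of estimate remains available where wind, humidity, or radiation records are missing.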
Artificial neural network (ANN)
ANNs are computational systems that simulate how information is received, learned, and processed in the human brain (Hussan et al. 2022). They have many applications in hydrology (Tung et al. 2020; Yin et al. 2023a), where they are used to approximate unknown functions or to forecast future values from potentially noisy time series data (Bisoyi et al. 2019). Structurally, an ANN consists of simple components working in parallel, and its function, like that of natural neural processing, relies mainly on the links between the layers (Yaseen et al. 2016). The hidden layer is the key element of the ANN. It is located between the input and output layers, and each of its neurons receives a set of weighted inputs and generates an output by applying a specific activation function. For details on ANN training, the reader is referred to Yaseen et al. (2016).
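The forward computation described above can be sketched in a few lines. This is a minimal single-hidden-layer example, assuming a hyperbolic tangent activation in the hidden layer and a linear output neuron; the weights, layer sizes, and function name are hypothetical and do not correspond to the trained network in this study.

```python
import math


def ann_forward(inputs, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of a one-hidden-layer feedforward network.

    w_hidden: one weight list per hidden neuron (same length as inputs)
    b_hidden: one bias per hidden neuron
    w_out, b_out: weights and bias of the single linear output neuron
    """
    hidden = []
    for weights, bias in zip(w_hidden, b_hidden):
        # Weighted sum of the inputs followed by the tanh activation.
        z = sum(w * x for w, x in zip(weights, inputs)) + bias
        hidden.append(math.tanh(z))
    # Linear combination of the hidden activations gives the prediction.
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


# Two inputs, two hidden neurons (all numbers are placeholders).
y_hat = ann_forward(
    inputs=[0.4, -0.2],
    w_hidden=[[0.5, -1.0], [1.5, 0.3]],
    b_hidden=[0.1, -0.2],
    w_out=[0.8, -0.6],
    b_out=0.05,
)
print(y_hat)
```

Training then amounts to choosing the weights and biases (and, in structural optimization, the number of hidden neurons) that minimize the prediction error.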
Artificial Bee Colony (ABC)
The ABC is a metaheuristic optimization algorithm inspired by the foraging behavior of honeybees (Karaboga 2005). In ABC, the search process is modeled based on the food-foraging behavior of honeybee colonies. The algorithm maintains a population of artificial bees, where each bee represents a potential solution to the optimization problem. The ABC algorithm consists of three main components:
- Employed bees: These bees explore the search space by exploiting the information obtained from other bees in the population. They iteratively improve their solutions by selecting promising solutions found by other bees and making small modifications to them.
- Onlooker (keeper) bees: These bees choose which employed bees to follow based on the quality of their solutions. They observe the solutions generated by the employed bees and decide whether to explore similar regions of the search space or exploit different regions.
- Scout bees: These bees are responsible for exploring new areas of the search space that other bees have not visited. If an employed bee exhausts its search for a solution without finding an improvement, it becomes a scout bee and explores a new solution randomly.
The bees communicate the location of food sources by performing a series of motions referred to as a waggle dance. Once a food source has been exhausted, the bee exploiting it becomes a scout and searches randomly for new food sources (Karaboga and Basturk 2007, 2008; Iqbal et al. 2024).
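The three bee roles can be illustrated with a compact, generic ABC loop. The sketch below minimizes a toy sphere function rather than an ANN training error, and the function name `abc_minimize`, the population size, the abandonment `limit`, and the fitness transformation are illustrative assumptions, not the settings used in this study.

```python
import random


def abc_minimize(objective, bounds, n_sources=10, limit=20, max_iter=200, seed=0):
    """Minimal Artificial Bee Colony sketch for box-constrained minimization."""
    rng = random.Random(seed)
    dim = len(bounds)

    def random_source():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    sources = [random_source() for _ in range(n_sources)]
    costs = [objective(s) for s in sources]
    trials = [0] * n_sources  # consecutive failed improvements per source

    def try_improve(i):
        # Perturb one coordinate of source i relative to another random source.
        k = rng.choice([j for j in range(n_sources) if j != i])
        d = rng.randrange(dim)
        candidate = sources[i][:]
        candidate[d] += rng.uniform(-1.0, 1.0) * (sources[i][d] - sources[k][d])
        lo, hi = bounds[d]
        candidate[d] = min(max(candidate[d], lo), hi)
        cost = objective(candidate)
        if cost < costs[i]:  # greedy selection
            sources[i], costs[i], trials[i] = candidate, cost, 0
        else:
            trials[i] += 1

    best_cost = min(costs)
    best = sources[costs.index(best_cost)][:]

    for _ in range(max_iter):
        # Employed bees: one local search step per food source.
        for i in range(n_sources):
            try_improve(i)
        # Onlooker bees: favor sources with better (lower) cost.
        fitness = [1.0 / (1.0 + c) if c >= 0 else 1.0 + abs(c) for c in costs]
        for _ in range(n_sources):
            i = rng.choices(range(n_sources), weights=fitness)[0]
            try_improve(i)
        # Memorize the best solution before any source is abandoned.
        i_best = min(range(n_sources), key=costs.__getitem__)
        if costs[i_best] < best_cost:
            best_cost, best = costs[i_best], sources[i_best][:]
        # Scout bees: replace sources that stopped improving.
        for i in range(n_sources):
            if trials[i] > limit:
                sources[i] = random_source()
                costs[i] = objective(sources[i])
                trials[i] = 0

    return best, best_cost


def sphere(v):
    return sum(x * x for x in v)


solution, value = abc_minimize(sphere, [(-5.0, 5.0), (-5.0, 5.0)])
print(solution, value)
```

In a hybrid such as ABC-ANN, the objective would instead evaluate a network configuration (for example, its validation error), so each food source encodes a candidate set of network parameters.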
In this study, ABC-ANN, EMD, VMD, LMD, and EWT codes were created with Matlab 2019 software, while EEMD and CEEMDAN codes were produced with R software.
Empirical mode decomposition (EMD)
EMD is a widely applied adaptive method that uses instantaneous-frequency-based intrinsic mode functions (IMFs) to flexibly analyze nonlinear and non-stationary time series. Each IMF represents a basic oscillation in the signal, and its frequency and amplitude are not fixed but can vary with time. For a time series x(t), EMD adaptively separates the non-stationary signal into a series of IMFs ordered from high to low frequency, and the decomposed signal can be written as follows:
$$x\left( t \right) = \sum\limits_{i = 1}^{N} {C_{i} \left( t \right)} + r_{N} \left( t \right)$$(1)
in which Ci(t) is the ith IMF of x(t), and rN(t) is the residual of the signal (Kedadouche et al. 2016).
Variational mode decomposition (VMD)
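VMD non-recursively separates a real-valued multicomponent signal f into quasi-orthogonal band-limited sub-signals (modes), each of which is compact around a center frequency. The constrained variational problem can be formulated as follows:
$$\mathop {\min }\limits_{{\left\{ {u_{k} } \right\},\left\{ {w_{k} } \right\}}} \left\{ {\sum\limits_{k = 1}^{K} {\left\| {\partial_{t} \left[ {\left( {\delta \left( t \right) + \frac{j}{\pi t}} \right) * u_{k} \left( t \right)} \right]e^{{ - jw_{k} t}} } \right\|_{2}^{2} } } \right\}\quad {\text{s.t.}}\quad \sum\limits_{k = 1}^{K} {u_{k} } = f$$(2)
in which \(\left\{ {w_{k} } \right\} = \left\{ {w_{1} ,w_{2} , \ldots , w_{K} } \right\}\) are the center frequencies of the IMFs, and \(\left\{ {u_{k} } \right\} = \left\{ {u_{1} ,u_{2} , \ldots ,u_{K} } \right\}\) are the decomposed band-limited IMFs (Danandeh Mehr et al. 2023).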
Ensemble empirical mode decomposition (EEMD)
While EMD has been extensively applied in non-stationary and/or nonlinear signal analysis, its decomposition results are susceptible to several problems caused by mode mixing. The ensemble EMD (EEMD) method was introduced to overcome this problem. In EEMD, finite-amplitude white noise is added to the original signal, and the noisy signal is decomposed into IMFs with EMD. This is repeated over many trials with different noise realizations, and the average of the IMFs obtained from the trials is taken as the IMFs of the EEMD method. In this way, EEMD alleviates the mode mixing problem (Zhang et al. 2018).
Local mean decomposition (LMD)
Smith (2005) pioneered the LMD method and demonstrated its advantages in the processing of non-stationary and nonlinear signals. The basic principle of LMD in signal processing is as follows:
- The instantaneous amplitudes and frequencies of the signal are deduced from the initial signal using the moving average method (as in Eqs. (3) to (9)).
- The local mean and envelope functions are used to produce a set of product functions (PFs), each of which is the product of an envelope signal and a purely frequency-modulated signal from which an instantaneous frequency can be deduced (as in Eq. 10).
The procedure for calculating the LMD is shown below (Li et al. 2023; Yongbo et al. 2019):
(1) Compute the mean value mi of each pair of successive local extrema ni and ni+1 of the signal:
$$m_{i} = \frac{{n_{i} + n_{i + 1} }}{2}$$(3)
The envelope estimate ai of two successive local extrema is given below:
$$a_{i} = \frac{{\left| {n_{i} - n_{i + 1} } \right|}}{2}$$(4)
(2) To obtain a purely frequency-modulated signal from which a positive instantaneous frequency can be derived, the local mean function m11(t) is first obtained by smoothing the local means with the moving average method. Then, m11(t) is subtracted from the original signal x(t) to obtain the residual signal h11(t):
$$h_{11} \left( t \right) = x\left( t \right) - m_{11} \left( t \right)$$(5)
Furthermore, the envelope function a11(t) is generated by smoothing the envelope estimates with the moving average method.
(3) Divide h11(t) by a11(t):
$$s_{11} \left( t \right) = \frac{{h_{11} \left( t \right)}}{{a_{11} \left( t \right)}}.$$(6)
By repeating the above steps q times with s11(t) in place of the original signal, a purely frequency-modulated signal s1q(t) is derived whose envelope function satisfies a1(q+1)(t) ≈ 1:
$$\left\{ {\begin{array}{*{20}l} {h_{11} \left( t \right) = x\left( t \right) - m_{11} \left( t \right)} \\ {h_{12} \left( t \right) = s_{11} \left( t \right) - m_{12} \left( t \right)} \\ \vdots \\ {h_{1q} \left( t \right) = s_{{1\left( {q - 1} \right)}} \left( t \right) - m_{1q} \left( t \right)} \\ \end{array} } \right.$$(7)
in which
$$\left\{ {\begin{array}{*{20}l} {s_{11} \left( t \right) = h_{11} \left( t \right)/a_{11} \left( t \right)} \\ {s_{12} \left( t \right) = h_{12} \left( t \right)/a_{12} \left( t \right)} \\ \vdots \\ {s_{1q} \left( t \right) = h_{1q} \left( t \right)/a_{1q} \left( t \right)} \\ \end{array} } \right.$$(8)
The iterations finish once the resulting signal meets this envelope condition.
(4) The envelope signal a1(t) is given in Eq. (9), and the first component PF1(t) in Eq. (10):
$$a_{1} \left( t \right) = a_{11} \left( t \right) \times a_{12} \left( t \right) \times a_{13} \left( t \right) \times \cdots \times a_{1q} \left( t \right)$$(9)
$${\text{PF}}_{1} \left( t \right) = s_{1q} \left( t \right) \times a_{1} \left( t \right)$$(10)
(5) μ1(t) is obtained by subtracting PF1(t) from x(t):
$$\mu_{1} \left( t \right) = x\left( t \right) - {\text{PF}}_{1} \left( t \right)$$(11)
To complete the LMD, repeat steps 1 to 4 with μ1(t) as the new data and iterate this process k times until μk(t) remains fixed. The cycle concludes at this point, and μk(t) is taken as the residual (R). The original signal x(t) equals the sum of all PFs and the residual:
$$x\left( t \right) = \sum\limits_{p = 1}^{k} {{\text{PF}}_{p} \left( t \right)} + \mu_{k} \left( t \right)$$(12)
This method divides the signal into several parts and summarizes the complexity of the original signal according to their different characteristics. Once the signal has been decomposed by LMD, a set of PFs ordered from high to low frequency is obtained, so that a signal with complicated properties is reduced to pure PFs with clear properties.
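The first LMD step, computing local means (Eq. 3) and envelope estimates from successive extrema, can be sketched as follows. The envelope estimate is assumed here to be half the absolute difference between successive extrema, which is the usual LMD definition; the helper names are hypothetical, and the moving-average smoothing and the iteration over q and k are deliberately left out.

```python
def local_extrema(x):
    """Indices of interior local maxima and minima of a sequence."""
    return [
        i
        for i in range(1, len(x) - 1)
        if (x[i] - x[i - 1]) * (x[i + 1] - x[i]) < 0
    ]


def local_means_and_envelopes(x):
    """Local mean m_i and envelope estimate a_i for successive extrema pairs."""
    extrema = [x[i] for i in local_extrema(x)]
    pairs = list(zip(extrema[:-1], extrema[1:]))
    means = [(n_i + n_next) / 2 for n_i, n_next in pairs]  # Eq. (3)
    # Envelope estimate (assumed form): half the absolute extrema difference.
    envelopes = [abs(n_i - n_next) / 2 for n_i, n_next in pairs]
    return means, envelopes


# Symmetric zig-zag signal: extrema at +2, -2, +2.
signal = [0.0, 2.0, 0.0, -2.0, 0.0, 2.0, 0.0]
m, a = local_means_and_envelopes(signal)
print(m, a)
```

In a full implementation these piecewise values would be smoothed into the continuous functions m11(t) and a11(t) before the subtraction and division steps are applied.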
The complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN)
CEEMDAN is a more recent decomposition algorithm that overcomes the mode overlap problem of EMD (the overlapping of signal modes at multiple frequencies). By adaptively adding white noise at each stage of the decomposition, the white noise-induced reconstruction error of EEMD is eliminated by iteration, and the IMFs are acquired by calculating the residual (Ma et al. 2020). CEEMDAN improves decomposition accuracy and greatly reduces mode overlap. In the equations below, Ek(·) denotes the operator that extracts the k-th mode obtained by EMD, νi(n) is the i-th white-noise realization, and εk is the noise coefficient. The procedure is as follows:
(1) The noise-added realizations si(n) of the signal time series s(n) are decomposed, and \({\text{IMF}}_{1} \left( n \right)\) is calculated as the average of their first modes:
$${\text{IMF}}_{1} \left( n \right) = \frac{1}{I}\mathop \sum \limits_{i = 1}^{I} {\text{IMF}}_{1}^{i} \left( n \right) = \overline{{{\text{IMF}}_{1} \left( n \right)}}$$(13)
where I is the number of decompositions (noise realizations).
(2) Calculate the first residual sequence as follows:
$$r_{1} \left( n \right) = s\left( n \right) - {\text{IMF}}_{1} \left( n \right).$$(14)
(3) Adding ε1E1[νi(n)] to the first residual sequence gives the realizations r1(n) + ε1E1[νi(n)], i = 1, 2, …, I. Decomposing these realizations, \({\text{IMF}}_{2} \left( n \right)\) can be computed as follows:
$${\text{IMF}}_{2} \left( n \right) = \frac{1}{I}\mathop \sum \limits_{i = 1}^{I} E_{1} \left( {r_{1} \left( n \right) + \varepsilon_{1} E_{1} \left( {\nu^{i} \left( n \right)} \right)} \right).$$(15)
(4) After the k-th residual sequence is computed, the next modal component can be calculated as follows:
$$r_{k} \left( n \right) = r_{k - 1} \left( n \right) - {\text{IMF}}_{k} \left( n \right)$$(16)
$${\text{IMF}}_{k + 1} \left( n \right) = \frac{1}{I}\mathop \sum \limits_{i = 1}^{I} E_{1} \left( {r_{k} \left( n \right) + \varepsilon_{k} E_{k} \left( {\nu^{i} \left( n \right)} \right)} \right).$$(17)
(5) When the residual sequence is no longer suitable for further decomposition, the decomposition ends. After K modes have been extracted, the final residual sequence is:
$$R\left( n \right) = s\left( n \right) - \mathop \sum \limits_{k = 1}^{K} {\text{IMF}}_{k} \left( n \right).$$(18)
The signal s(n) can then be formulated as:
$$s\left( n \right) = \mathop \sum \limits_{k = 1}^{K} {\text{IMF}}_{k} \left( n \right) + R\left( n \right).$$(19)
Empirical wavelet transform (EWT)
The wavelet transform has emerged as a promising decomposition technique, demonstrating its potential to enhance the accuracy of hydrological models (Mehr et al. 2021). Gilles introduced EWT in 2013. EWT’s main idea is to derive amplitude-modulated-frequency-modulated (AM-FM) signals of a particular signal by adaptively designing a suitable wavelet filter bank. The EWT generates IMFs based on the information available in the signal spectrum. The process of decomposing the signal x(t) by using EWT can be expressed in the following way:
(1) Use the fast Fourier transform (FFT) algorithm to calculate the Fourier spectrum F(ω) of the original time series x(t).
(2) Segment the Fourier spectrum properly to extract distinct modes and determine the segment boundaries.
(3) Once the limits are detected by proper segmentation, compute the approximation and detail coefficients by applying the appropriate scaling function and empirical wavelets to each determined segment. The limits ωi can be defined as follows:
$$\omega_{i} = \frac{{f_{i} + f_{i + 1} }}{2}\;{\text{for}}\;1 \le i \le N - 1$$(20)
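The boundary rule in Eq. (20) is a simple midpoint computation. The sketch below assumes, following the usual EWT formulation, that fi are the frequencies of the N dominant local maxima of the Fourier spectrum sorted in ascending order; the function name and the sample frequencies are hypothetical.

```python
def ewt_boundaries(peak_frequencies):
    """Segment limits as midpoints between consecutive spectral peaks (Eq. 20)."""
    f = sorted(peak_frequencies)
    # N peaks give N - 1 interior boundaries.
    return [(f[i] + f[i + 1]) / 2 for i in range(len(f) - 1)]


# Three hypothetical spectral peaks (units are arbitrary).
limits = ewt_boundaries([3.0, 0.5, 1.5])
print(limits)
```

Each pair of consecutive limits then defines the support of one empirical wavelet filter, so the filter bank adapts to where the energy of the signal is actually concentrated.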
The code in the “Appendix” implements a comprehensive decomposition of a signal using an Empirical Orthogonal Function (EOF)-based LMD algorithm with an Empirical Optimal method (EOE). It iteratively extracts PFs until a stop criterion is met or the maximum number of PFs is reached. The PFs, amplitudes, and siftings are stored in arrays, and the Orthogonal Residual Trend (ORT) is computed. Figure 3 shows the flow chart of the applied modeling steps.
Results
Accurate estimation of ETo values is vital for managing water resources, agricultural production, dams, reservoir planning, irrigation projects, water treatment plants, hydroelectric generation, and ecological aspects. Within the scope of this study, a novel hybrid approach was proposed by combining nature-inspired ABC optimization and various signal processing techniques with neural networks to estimate ETo values. During the modeling phase, 70% of the data were used for training (1/1/1979–11/25/2003) and 30% for testing (11/26/2003–7/24/2014). Tmax, Tmin, Tmean, WS, RH, Rs, Ra, and precipitation were used as inputs in all models to obtain ETo. Since ET values depend on many meteorological parameters, all parameters were used in the models, and the correlation matrix was used to form the best model input combinations. Figures 4, 5, 6 and 7 show the correlation matrices (95% confidence level) used to choose the input variables for obtaining ETo with the hybrid ABC-ANN models based on the EMD, VMD, LMD, EEMD, CEEMDAN, and EWT techniques for the Al-Qalyubiyah, Cairo, Damietta, and Port Said stations. The relationships of Tmax, Tmin, Tmean, WS, RH, Rs, Ra, and P with ETo were evaluated. Solar radiation and the minimum, mean, and maximum temperatures were found to be very effective in forecasting ETo, whereas precipitation was only weakly correlated with ETo. Combinations of parameters that were significant at the 95% confidence level and positively correlated with ETo were employed to establish the input combinations of the hybrid models.
The input and target variables used during modeling are presented below. In all models, inputs are Tmax, Tmin, Tmean, WS, RH, Rs, and Ra. The ABC-ANN hybrid model has been established. The parameters set include a feedforward neural network as the network structure, a transfer function of the tangent sigmoid, and a maximum iteration of 200.
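The chronological 70/30 split described above can be reproduced from the stated dates alone. The sketch below assumes a gap-free daily record from 1 January 1979 to 24 July 2014 and uses only calendar dates as stand-ins for the actual observations; the variable names are illustrative.

```python
from datetime import date, timedelta

START = date(1979, 1, 1)
TRAIN_END = date(2003, 11, 25)  # last training day stated in the text
END = date(2014, 7, 24)         # last testing day stated in the text


def daily_dates(first, last):
    """All calendar days from first to last, inclusive."""
    return [first + timedelta(days=k) for k in range((last - first).days + 1)]


days = daily_dates(START, END)
# Split by date, not at random, so the test period follows the training period.
train_days = [d for d in days if d <= TRAIN_END]
test_days = [d for d in days if d > TRAIN_END]
train_share = len(train_days) / len(days)

print(len(train_days), len(test_days), round(train_share, 3))
```

Keeping the split chronological avoids leaking future information into training, which matters for strongly seasonal series such as daily ETo.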
Results of Al-Qalyubiyah station
Table 2 summarizes the performance analysis of the ETo prediction models at Al-Qalyubiyah station using three statistical criteria: MAE, MSE, and R2. Based on our analysis, the highest ETo prediction accuracy was obtained with ABC-ANN (Train R2: 0.990 and Test R2: 0.989), while the weakest results were obtained with LMD-ABC-ANN (Train R2: 0.872 and Test R2: 0.704). The ABC-ANN model outperformed all signal-processing-based models. In addition, the remaining data decomposition-based models ranked in the order EWT, CEEMDAN, VMD, and EMD.
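For reference, the three criteria can be computed as in the sketch below. It assumes that R2 is the coefficient of determination (one minus the ratio of residual to total sum of squares); the study may instead report the squared correlation coefficient, and the sample values are invented for illustration.

```python
def mae(observed, predicted):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def mse(observed, predicted):
    """Mean squared error."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def r2(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Made-up ETo values in mm/day, not taken from the study.
obs = [1.0, 2.0, 3.0, 4.0]
pred = [1.5, 2.0, 2.5, 4.0]
print(mae(obs, pred), mse(obs, pred), r2(obs, pred))
```

Lower MAE and MSE and an R2 closer to 1 indicate a better fit, which is how the models in the tables are ranked.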
Figure 8 compares the ETo prediction performance of the hybrid ABC-ANN models based on the EMD, VMD, LMD, EEMD, CEEMDAN, and EWT techniques at Al-Qalyubiyah station using a Taylor diagram. Model performances were evaluated using R, RMSE, and standard deviation values. Accordingly, the ABC-ANN model gave the most accurate prediction results in both the training and testing phases, while the LMD-ABC-ANN model showed the weakest prediction results (Fig. 8).
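A Taylor diagram can show three statistics at once because they are linked geometrically: the centered RMS difference E′, the two standard deviations, and the correlation R satisfy E′² = σo² + σp² − 2σoσpR. The sketch below computes these quantities and checks the identity, assuming population (divide by n) statistics and the centered form of the RMSE, which may differ from the exact quantities plotted in the study; the sample values are invented.

```python
import math


def taylor_stats(observed, predicted):
    """Return (std_obs, std_pred, correlation, centered RMS difference)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    dev_o = [o - mean_o for o in observed]
    dev_p = [p - mean_p for p in predicted]
    std_o = math.sqrt(sum(d * d for d in dev_o) / n)
    std_p = math.sqrt(sum(d * d for d in dev_p) / n)
    corr = sum(a * b for a, b in zip(dev_o, dev_p)) / (n * std_o * std_p)
    # Centered RMS difference: the means are removed before comparing.
    crmsd = math.sqrt(sum((b - a) ** 2 for a, b in zip(dev_o, dev_p)) / n)
    return std_o, std_p, corr, crmsd


std_o, std_p, corr, crmsd = taylor_stats(
    [1.0, 2.0, 3.0, 4.0, 5.0], [1.2, 1.9, 3.3, 3.8, 5.1]
)
# Law-of-cosines identity that underlies the diagram.
identity_gap = crmsd**2 - (std_o**2 + std_p**2 - 2 * std_o * std_p * corr)
print(std_o, std_p, corr, crmsd, identity_gap)
```

A model plots close to the reference point when its standard deviation matches that of the observations and its correlation is high, which is the sense in which ABC-ANN is judged the most accurate.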
Figure 9 shows the scatter diagrams of the training and test results at the Al-Qalyubiyah station. Scatter plots are graphical representations used to visualize the direction and form of the correlation between two data sets, as well as the data distribution. Regions with high dot density indicate greater data concentration, whereas regions with low dot density indicate less data. If the dots follow a downward trend, there is a negative correlation between the variables. If the dots are spread evenly along a straight line, the variables are linearly correlated. If the dots are randomly distributed, there is no correlation. In the scatter plots of the training data in Fig. 9, all models exhibit a positive and linear relationship with the actual values. The ABC-ANN predictions follow an approximately linear trend, which indicates that this model delivers superior results in ETo prediction. Furthermore, the hybrid EWT-ABC-ANN, VMD-ABC-ANN, and CEEMDAN-ABC-ANN approaches perform more accurately than EMD-ABC-ANN and LMD-ABC-ANN. In the scatter plots of the test results, the ABC-ANN algorithm again exhibits the most accurate and linear relationship and therefore the best outcomes, while EWT-ABC-ANN and CEEMDAN-ABC-ANN give the next-best predictions. The LMD-ABC-ANN algorithm shows the weakest ETo predictions, with a scattered, near-random distribution.
Violin diagrams of Al-Qalyubiyah station ETo prediction results are shown in Fig. 10. Accordingly, it was evaluated which model had higher accuracy in estimating ETo. For this purpose, the structure and distribution of violin shapes belonging to the estimated dataset and the actual dataset were examined. The ABC-ANN algorithm has the closest structure to the actual values in the training and testing phases. In addition, the LMD-ABC-ANN model has the furthest distribution from the actual values and has the weakest results.
Results of Cairo station
In Table 3, the ETo prediction models at Cairo station are analyzed according to various statistical criteria. The highest ETo prediction accuracy was obtained with ABC-ANN (Train R2: 0.986 and Test R2: 0.986), while the weakest results were obtained with LMD-ABC-ANN (Train R2: 0.557 and Test R2: 0.794). In addition, the performance of the ABC-ANN model was slightly weakened when it was combined with data decomposition techniques. Moreover, the remaining data decomposition-based models ranked in the order EWT, CEEMDAN, VMD, and EMD.
In Fig. 11, the ETo prediction performances of the hybrid ABC-ANN-based EMD, VMD, LMD, EEMD, CEEMDAN, and EWT models at Cairo station are evaluated with a Taylor diagram. Model performances were evaluated according to R, RMSE, and standard deviation values. The ABC-ANN model was the most successful, since it gave the highest R and lowest RMSE values in both the training and testing stages. In addition, the EWT-ABC-ANN model gave the second most accurate result.
Figure 12 displays scatter plots of ETo predictions for Cairo station during the training and testing phases. These graphs evaluate the prediction accuracy by examining the relationship and distribution between the predicted and actual time series. Based on these graphs, the ABC-ANN, EWT-ABC-ANN, and CEEMDAN-ABC-ANN algorithms produce similar and satisfactory prediction results during the training phase, as the predicted values align closely with the actual values along a linear trend. The other models generally exhibit weaker ETo predictions, with a more scattered, near-random distribution. A similar pattern emerges in the scatter plots of the testing phase. The ABC-ANN, EWT-ABC-ANN, and CEEMDAN-ABC-ANN hybrid algorithms again generate close and highly accurate prediction outputs, whereas the LMD-ABC-ANN, VMD-ABC-ANN, and EMD-ABC-ANN algorithms yield weaker predictions.
In Fig. 13, the performances of the ETo prediction models established at Cairo station are evaluated with violin diagrams, which compare the structure and distribution of the estimated dataset with those of the actual dataset. According to the analyses, the ABC-ANN algorithm showed the best estimation performance in the training and testing phases, since its distribution is closest to that of the actual values. In addition, the LMD-ABC-ANN and EWT-ABC-ANN models have the distributions furthest from the true values and the weakest results.
Results of Damietta station
In Table 4, the accuracy of the ETo prediction models at Damietta station is evaluated. Accordingly, the highest ETo prediction success was obtained with ABC-ANN (Train R2: 0.991 and Test R2: 0.989), while the weakest predictions were produced by LMD-ABC-ANN (Train R2: 0.385 and Test R2: 0.211). Moreover, the performance rankings of other data preprocessing algorithms, such as EWT, CEEMDAN, VMD, and EMD, are given.
Figure 14 depicts the evaluation of ETo prediction success for the Damietta station using a hybrid-ABC-ANN-based data decomposition algorithm, which is assessed with a Taylor diagram. This analysis indicates that the ABC-ANN model achieved the highest success rates in both the training and testing stages, as evidenced by the lowest RMSE and highest R values. Moreover, the CEEMDAN-ABC-ANN model displays the second most accurate outcome.
Scatter plots of ETo predictions for Damietta Station during the training and testing phases are depicted in Fig. 15. The prediction accuracy is evaluated through an analysis of the relationship and distribution between the predicted and actual time series, as depicted in these graphs. The graphs suggest that the ABC-ANN, EWT-ABC-ANN, and CEEMDAN-ABC-ANN algorithms yield comparable and satisfactory prediction results during the training phase. The scatter plots demonstrate a linear trend in which the predicted values closely align with the actual values. Upon examination of the scatter plots of the testing phase, a similar pattern is evident. The output predictions of the ABC-ANN, EWT-ABC-ANN, and CEEMDAN-ABC-ANN hybrid algorithms are also highly accurate and closely related. Nevertheless, the LMD-ABC-ANN, VMD-ABC-ANN, and EMD-ABC-ANN algorithms demonstrate inferior predictive capabilities compared to the remaining algorithms. Furthermore, it can be inferred that the LMD-ABC-ANN approach yields the least accurate estimates.
The ETo prediction models at Damietta station were evaluated in Fig. 16 using violin diagrams, which compare the structure and distribution of the estimated dataset with those of the actual dataset. The analyses show that the ABC-ANN algorithm delivers the best estimation performance in both the training and testing phases because its distribution is closest to that of the actual values. Furthermore, the LMD-ABC-ANN model, whose distribution is furthest from the true values, exhibits the weakest results.
Results of Port Said station
In Table 5, the ETo prediction results at the Port Said station are evaluated. Accordingly, the ABC-ANN model (Train R2: 0.988 and Test R2: 0.987), which has the highest R2 and lowest error values, has the highest ETo prediction accuracy, while the weakest results were obtained with LMD-ABC-ANN (Train R2: 0.850 and Test R2: 0.735). Moreover, the performance ranking of the signal processing techniques is CEEMDAN, EWT, VMD, and EMD.
The ETo prediction success of the hybrid ABC-ANN-based data decomposition algorithms at Port Said station was assessed with the Taylor diagram shown in Fig. 17. Model performances were assessed according to the statistical values in the diagram. The ABC-ANN model was the most successful, since it gave the highest R and lowest RMSE values in both the training and testing stages. In addition, the CEEMDAN-ABC-ANN model gave the second most accurate result.
The scatter plots of ETo predictions for Port Said station during the training and testing phases are shown in Fig. 18. The precision of the predictions is assessed by examining the correlation and dispersion between the predicted and actual time series, as illustrated in these charts. Based on the graphs, the ABC-ANN, EWT-ABC-ANN, and CEEMDAN-ABC-ANN algorithms offer comparable and satisfactory prediction outcomes throughout the training phase. A linear pattern is evident in the scatter plots, with the predicted values closely approximating the actual values. The other models typically give less accurate ETo forecasts, with a more scattered, near-random distribution. The scatter plots of the testing phase show a comparable pattern.
Furthermore, the forecast predictions obtained from the hybrid algorithms ABC-ANN, EWT-ABC-ANN, and CEEMDAN-ABC-ANN are remarkably precise and closely associated. Despite this, it should be pointed out that the predictive capabilities of the LMD-ABC-ANN, VMD-ABC-ANN, and EMD-ABC-ANN algorithms are inferior compared to the other algorithms. Moreover, one can conclude that the LMD-ABC-ANN technique results in the least accurate predictions.
Figure 19 evaluates the ETo prediction models at Port Said with violin diagrams, which show the structure and distribution of each dataset. In both the training and testing phases, the distribution of the ABC-ANN predictions was the closest to that of the actual values, indicating the best performance. The LMD-ABC-ANN model gave the least accurate results, the furthest from the true values.
Discussion
Accurate prediction of ETo is of paramount importance in water resource management. Traditional instrumental measuring methods, such as lysimeters, are expensive. On the other hand, well-documented empirical methods such as Penman–Monteith (PM) require diverse meteorological parameters, which makes them difficult to apply, particularly in sparsely gauged catchments such as those in Egypt. Our literature review indicated that previous applications of ML models to ETo estimation commonly rely on ordinary ML algorithms (e.g., ANNs, ELM, RF, and GEP) that may estimate ETo using fewer climatic predictors (Feng et al. 2017b; Fan et al. 2019; Gong et al. 2021). Recent studies have proposed hybrid ML models that combine various ML techniques to reduce estimation error (Reis et al. 2019; Chia et al. 2021). However, despite the complicated structure and time-consuming calculations of ensemble models, in some cases they show no significant difference from standalone methods (Katipoğlu et al. 2023). New ML algorithms with simpler structures and fast calculation times can help overcome these weaknesses of ensemble ML models. Although hybridization using signal preprocessing has commonly produced more accurate hydrological estimations (Mehr et al. 2021; Danandeh Mehr et al. 2023), the proposed ABC-ANN model consistently outperformed the hybrid models that used decomposed input variables.
Our findings demonstrated that the standalone ANN failed to forecast ETo effectively, especially in complex climatic conditions, as it could not fully capture the nonlinear relationships among the variables. The ABC algorithm, which optimizes the network structure by automatically selecting the number of hidden neurons, can capture the underlying pattern in the data. However, our study showed that signal decomposition methods sometimes struggle to capture temporal patterns such as seasonality and periodicity in ETo time series. The results from the four case-study areas clearly demonstrated that integrating data preprocessing techniques with a well-structured ANN model (here, ABC-ANN) not only increases the complexity of the evolved models but may also yield less accurate estimations. Accordingly, integrating ML models with optimization algorithms to identify the optimal parameters is crucial for accurate prediction. This conclusion agrees with previous studies by Zhu et al. (2020) and Roy et al. (2021), which, respectively, demonstrated the importance of structural optimization of the ELM and ANFIS for accurate ETo estimation. The effectiveness of ABC in structural optimization of ML models for hydrology is consistent with the findings of Huo et al. (2018), Ibrahim et al. (2022), and Katipoğlu et al. (2023). Likewise, the efficacy of other optimization algorithms in hydrological modeling aligns with the findings of Tikhamarine et al. (2020). Optimization algorithms can identify the best parameters more quickly and accurately than manual trial-and-error methods.
The ABC-ANN hybrid model, designed for predicting ETo, offers several notable advantages, including high accuracy with impressive R-squared values observed during both the training and testing phases, robustness against data noise using advanced signal processing techniques such as EEMD and CEEMDAN, and adaptability via ABC optimization, which fine-tunes the model to varying datasets and climatic conditions within the tested regions. Although the ABC-ANN hybrid model is innovative, it has some algorithmic limitations that could affect its wider use. Integrating ABC optimization with ANNs, along with multiple signal processing techniques, results in high computational complexity. This complexity requires significant computational resources and can lead to longer execution times. Consequently, the model’s applicability may be limited in real-time or resource-limited environments where quick and efficient processing is crucial.
Furthermore, the scalability of the model presents an additional challenge. To apply the model to more extensive or more diverse datasets or to use it in different geographical and climatic conditions, some adjustments may be needed to ensure accuracy and reliability. Moreover, the performance of the ABC algorithm heavily relies on fine-tuning its parameters, such as population size and limit. This sensitivity requires expertise in optimization algorithms, potentially limiting the accessibility and usability of the model for those without specialized knowledge. As a result, its practical application may be affected.
The data collection process for our hybrid ABC-ANN model faces several limitations that could affect its overall performance and applicability. Challenges arise from the availability and consistency of long-term climate datasets, including gaps, missing values, and variations in data collection methodologies over time. Such inconsistencies can degrade the model's performance and restrict the generalizability of the study's findings across different regions or contexts. Additionally, data representativeness plays a crucial role in ensuring the accuracy of the model. If the datasets fail to capture all relevant climate variations across different seasons and years, the model's robustness and its ability to accurately predict ETo under various climatic scenarios may be compromised. Furthermore, it is crucial to prioritize the accuracy and resolution of the meteorological data, which are obtained from weather stations or satellite observations. Any inaccuracies or biases in these data sources can result in erroneous estimations of ETo, which in turn can misguide water resource planning and management efforts.
Conclusion and future outlook
The main objective of this study was to develop a novel hybrid prediction model based on an ABC-integrated ANN model with various signal processing techniques for accurate ETo prediction in four Egyptian Governorates from 1979 to 2014. Model inputs included Tmin, Tmax, Tmean, P, WS, RH, Rs, and Ra to understand the complex relationships between weather conditions and ETo dynamics. Our findings indicated that the most effective factors in predicting monthly ETo time series were solar radiation and temperature (maximum, minimum, and mean). In contrast, the correlation between precipitation and ETo was weak, highlighting its limited influence on the prediction accuracy of the applied models. In addition, the hybrid ABC-ANN model proved capable of predicting ETo using various data decomposition techniques (EEMD, CEEMDAN, VMD, and EMD), demonstrating its effectiveness in capturing complex patterns within the datasets. The ABC-ANN model consistently outperformed the other methods, with EEMD and CEEMDAN showing more promising results. The thorough evaluation of the model's performance at different stations confirms the robustness and general applicability of the proposed hybrid approach. The ability of the ABC-ANN model to capture complex relationships within the meteorological data is underlined by its clear advantage over the signal processing-based models. The superior performance of the hybrid model confirms its importance as a powerful tool in addressing the challenges associated with ETo prediction when compared to conventional signal processing techniques. Compared to the other developed models, the optimal model achieved the highest ETo prediction accuracy in testing, with R2 values of 0.989, 0.986, 0.989, and 0.987 for Al-Qalyubiyah, Cairo, Damietta, and Port Said, respectively.
In the field of ETo estimation, hybrid models that combine the ABC algorithm with an ANN have been found to produce results close to reality (R2 values are usually around 0.99) because they optimize the neural network parameters and thereby increase model accuracy. Studies in the literature support these results (Katipoğlu et al. 2024). By adjusting the neural network parameters, ABC optimization also provides better adaptation to the variability in evapotranspiration processes, especially in arid and semiarid climates where such estimates are vital.
Nourani et al. (2014) showed that decomposed data significantly increase ANN performance by isolating important features from noisy datasets. In the present study, although the ABC-ANN and the data decomposition-based ABC-ANN models gave very close results, the ABC-ANN model produced slightly more accurate outputs. In this respect, our results do not agree with those of Nourani et al. (2014). This difference may be explained by the length of the data series used, the large number of data parameters, and the use of different input types.
Future research can investigate the suitability of this approach for other geographical regions and climates, leading to further advances in accurate ETo estimation and its diverse applications. As this study highlights, the combination of nature-inspired optimization and ML has the potential to capture intricate patterns within the data. ETo prediction could be better understood by including other meteorological variables, extending the approach to different regions, and evaluating the transferability of the model to different time scales. The accuracy of novel hybrid models that combine multi-verse optimizers, artificial algae algorithms, biogeography-based optimization, dragonfly optimization, flower pollination algorithms, monarch butterfly optimization, and various deep learning algorithms can be evaluated for predicting ETo values. It is also recommended to compare the performance of models established in different climatic regions using hybrid ML-based non-negative matrix factorization, independent component analysis, and time–frequency signal decomposition techniques.
Data availability
The datasets generated during and/or analyzed during the current study are available on reasonable request.
References
Abdallah M, Mohammadi B, Zaroug MA, Omer A, Cheraghalizadeh M, Eldow ME, Duan Z (2022) Reference evapotranspiration estimation in hyper-arid regions via D-vine copula based-quantile regression and comparison with empirical approaches and machine learning models. J Hydrol: Reg Stud 44:101259. https://doi.org/10.1016/j.ejrh.2022.101259
Allen RG, Pereira LS, Raes D, Smith M (1998) Crop Evapotranspiration. Guidelines for Computing Crop Evapotranspiration. FAO Irrigation and Drainage Paper no. 56, Rome, Italy
Bellido-Jiménez JA, Estévez J, García-Marín AP (2021) New machine learning approaches to improve reference evapotranspiration estimates using intra-daily temperature-based variables in a semi-arid region of Spain. Agric Water Manag. https://doi.org/10.1016/j.agwat.2020.106558
Chia MY, Huang YF, Koo CH (2021) Swarm-based optimization as stochastic training strategy for estimation of reference evapotranspiration using extreme learning machine. Agric Water Manag 243:106447. https://doi.org/10.1016/j.agwat.2020.106447
Danandeh Mehr A, Reihanifar M, Alee MM, Vazifehkhah Ghaffari MA, Safari MJS, Mohammadi B (2023) VMD-GP: a new evolutionary explicit model for meteorological drought prediction at ungauged catchments. Water 15(15):2686. https://doi.org/10.3390/w15152686
Dong J, Liu X, Huang G, Fan J, Wu L, Wu J (2021) Comparison of four bioinspired algorithms to optimize KNEA for predicting monthly reference evapotranspiration in different climate zones of China. Comput Electron Agric 186:106211. https://doi.org/10.1016/j.compag.2021.106211
Elsadek EA (2023) Study on the in-field water balance and the projected impacts of climate change on rice yields in the Nile River Delta. Hohai University
Elsadek E, Zhang K, Mousa A, Ezaz GT, Tola TL, Shaghaleh H, Hamad AAA, Alhaj Hamoud Y (2023) Study on the in-field water balance of direct-seeded rice with various irrigation regimes under arid climatic conditions in Egypt using the AquaCrop Model. Agronomy 13:609. https://doi.org/10.3390/agronomy13020609
Elsadek EA, Zhang K, Hamoud YA, Mousa A, Awad A, Abdallah M, Shaghaleh H, Hamad AAA, Jamil MT, Elbeltagi A (2024) Impacts of climate change on rice yields in the Nile River Delta of Egypt: a large-scale projection analysis based on CMIP6. Agric Water Manag 292:108673. https://doi.org/10.1016/j.agwat.2024.108673
Elzain HE, Abdalla OA, Abdallah M, Al-Maktoumi A, Eltayeb M, Abba SI (2024) Innovative approach for predicting daily reference evapotranspiration using improved shallow and deep learning models in a coastal region: a comparative study. J Environ Manag. https://doi.org/10.1016/j.jenvman.2024.120246
Fan J, Yue W, Wu L, Zhang F, Cai H, Wang X, Lu X, Xiang Y (2018) Evaluation of SVM, ELM and four tree-based ensemble models for predicting daily reference evapotranspiration using limited meteorological data in different climates of China. Agric Forest Meteorol 263:225–241. https://doi.org/10.1016/j.agrformet.2018.08.019
Fan J, Ma X, Wu L, Zhang F, Yu X, Zeng W (2019) Light Gradient Boosting Machine: an efficient soft computing model for estimating daily reference evapotranspiration with local and external meteorological data. Agric Water Manag 225:105758. https://doi.org/10.1016/j.agwat.2019.105758
Feng Y, Cui N, Gong D, Zhang Q, Zhao L (2017a) Evaluation of random forests and generalized regression neural networks for daily reference evapotranspiration modelling. Agric Water Manag 193:163–173. https://doi.org/10.1016/j.agwat.2017.08.003
Feng Y, Peng Y, Cui N, Gong D, Zhang K (2017b) Modeling reference evapotranspiration using extreme learning machine and generalized regression neural network only with temperature data. Comput Electron Agric 136:71–78. https://doi.org/10.1016/j.compag.2017.01.027
Ferreira LB, da Cunha FF, de Oliveira RA, Fernandes Filho EI (2019) Estimation of reference evapotranspiration in Brazil with limited meteorological data using ANN and SVM—a new approach. J Hydrol 572:556–570. https://doi.org/10.1016/j.jhydrol.2019.03.028
Ferreira LB, da Cunha FF (2020) New approach to estimate daily reference evapotranspiration based on hourly temperature and relative humidity using machine learning and deep learning. Agric Water Manag 234:106113. https://doi.org/10.1016/j.agwat.2020.106113
Gong D, Hao W, Gao L, Feng Y, Cui N (2021) Extreme learning machine for reference crop evapotranspiration estimation: model optimization and spatiotemporal assessment across different climates in China. Comput Electron Agric 187:106294. https://doi.org/10.1016/j.compag.2021.106294
Granata F, Di Nunno F, de Marinis G (2024) Advanced evapotranspiration forecasting in Central Italy: stacked MLP-RF algorithm and correlated Nystrom views with feature selection strategies. Comput Electron Agric. https://doi.org/10.1016/j.compag.2024.108887
Huo J, Liu L, Zhang Y (2018) An improved multi-cores parallel artificial Bee colony optimization algorithm for parameters calibration of hydrological model. Future Gener Comput Syst 81:492–504. https://doi.org/10.1016/j.future.2017.07.020
Hussan BK, Rashid ZN, Zeebaree SR, Zebari RR (2023) Optimal deep belief network enabled vulnerability detection on smart environment. J Smart Internet Things 2022(1):146–162. https://doi.org/10.2478/jsiot-2022-0010
Ibrahim KSMH, Huang YF, Ahmed AN, Koo CH, El-Shafie A (2022) A review of the hybrid artificial intelligence and optimization modelling of hydrological streamflow forecasting. Alex Eng J 61(1):279–303. https://doi.org/10.1016/j.aej.2021.04.100
Iqbal MS, Limon MFA, Kabir MM, Rabby MKM, Soeb MJA, Jubayer MF (2024) A hybrid optimization algorithm for improving load frequency control in interconnected power systems. Expert Syst Appl 249:123702. https://doi.org/10.1016/j.eswa.2024.123702
Jiang S, Liang C, Cui N, Zhao Lu, Du T, Hu X, Feng Yu, Guan J, Yi F (2019) Impacts of climatic variables on reference evapotranspiration during growing season in Southwest China. Agric Water Manag 216:365–378. https://doi.org/10.1016/j.agwat.2019.02.014
Karaboga D (2005) An idea based on honey bee swarm for numerical optimization. Technical Report TR06, Erciyes University, Engineering Faculty, Computer Engineering Department
Karaboga D, Basturk B (2007) Artificial Bee Colony (ABC) Optimization Algorithm for Solving Constrained Optimization Problems. In: Melin P, Castillo O, Aguilar LT, Kacprzyk J, Pedrycz W (eds) Foundations of Fuzzy Logic and Soft Computing. IFSA 2007. Lecture Notes in Computer Science, vol 4529. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72950-1_77
Karaboga D, Basturk B (2008) On the performance of artificial bee colony (ABC) algorithm. Appl Soft Comput 8(1):687–697. https://doi.org/10.1016/j.asoc.2007.05.007
Katipoğlu OM, Keblouti M, Mohammadi B (2023) Application of novel artificial bee colony optimized ANN and data preprocessing techniques for monthly streamflow estimation. Environ Sci Pollut Res. https://doi.org/10.1007/s11356-023-28678-4
Katipoğlu OM, Mohammadi B, Keblouti M (2024) Bee-inspired insights: Unleashing the potential of artificial bee colony optimized hybrid neural networks for enhanced groundwater level time series prediction. Environ Monit Assess 196(8):724. https://doi.org/10.1007/s10661-024-12838-1
Kedadouche M, Thomas M, Tahan A (2016) Empirical mode decomposition combined with empirical wavelets for extracting bearing frequencies in a noisy environment and early detection of defects. In: Advances in condition monitoring of machinery in non-stationary operations: proceedings of the fourth international conference on condition monitoring of machinery in non-stationary operations, CMMNO’2014, Lyon, France December 15–17, pp 151–165. Springer. https://doi.org/10.1007/978-3-319-20463-5_12
Kumar M, Raghuwanshi NS, Singh R, Wallender WW, Pruitt WO (2002) Estimating evapotranspiration using artificial neural network. J Irrig Drain Eng 128(4):224–233. https://doi.org/10.1061/(ASCE)0733-9437(2002)128:4(224)
Landeras G, Ortiz-Barredo A, Lopez JJ (2008) Comparison of artificial neural network models and empirical and semi-empirical equations for daily reference evapotranspiration estimation in the Basque Country (Northern Spain). Agric Water Manag 95(5):553–565. https://doi.org/10.1016/j.agwat.2007.12.011
Lee S, Moriasi DN, Mehr AD, Mirchi A (2024) Sensitivity of Standardized Precipitation and Evapotranspiration Index (SPEI) to the choice of SPEI probability distribution and evapotranspiration method. J Hydrol Reg Stud 53:101761. https://doi.org/10.1016/j.ejrh.2024.101761
Li J, Pang Z, Liu Y, Hu S, Jiang W, Tian L, Tian J (2023) Changes in groundwater dynamics and geochemical evolution induced by drainage reorganization: evidence from 81Kr and 36Cl dating of geothermal water in the Weihe Basin of China. Earth Planet Sci Lett 623:118425. https://doi.org/10.1016/j.epsl.2023.118425
Ma Z, Chen H, Wang J, Yang X, Yan R, Jia J, Xu W (2020) Application of hybrid model based on double decomposition, error correction and deep learning in short-term wind speed prediction. Energy Convers Manag 205:112345. https://doi.org/10.1016/j.enconman.2019.112345
Mehdizadeh S, Mohammadi B, Pham QB, Duan Z (2021) Development of boosted machine learning models for estimating daily reference evapotranspiration and comparison with empirical approaches. Water 13(24):3489. https://doi.org/10.3390/w13243489
Mehr AD, Safari MJS, Nourani V (2021) Wavelet packet-genetic programming: a new model for meteorological drought hindcasting. Teknik Dergi. https://doi.org/10.18400/tekderg.605453
Mohammadi B, Mehdizadeh S (2020) Modeling daily reference evapotranspiration via a novel approach based on support vector regression coupled with whale optimization algorithm. Agric Water Manag 237:106145. https://doi.org/10.1016/j.agwat.2020.106145
Nourani V, Baghanam AH, Adamowski J, Kisi O (2014) Applications of hybrid wavelet–artificial intelligence models in hydrology: a review. J Hydrol 514:358–377. https://doi.org/10.1016/j.jhydrol.2014.03.057
Bisoyi N, Gupta H, Padhy NP, Chakrapani GJ (2019) Prediction of daily sediment discharge using a back propagation neural network training algorithm: a case study of the Narmada River, India. Int J Sediment Res 34(2):125–135. https://doi.org/10.1016/j.ijsrc.2018.10.010
Petković B, Petković D, Kuzman B, Milovančević M, Wakil K, Ho LS, Jermsittiparsert K (2020) Neurofuzzy estimation of reference crop evapotranspiration by neuro fuzzy logic based on weather conditions. Comput Electron Agric 173:105358. https://doi.org/10.1016/j.compag.2020.105358
Reis MM, da Silva AJ, Zullo Junior J, Tuffi Santos LD, Azevedo AM, Lopes EMG (2019) Empirical and learning machine approaches to estimating reference evapotranspiration based on temperature data. Comput Electron Agric 165:104937. https://doi.org/10.1016/j.compag.2019.104937
Roy DK (2021) Long short-term memory networks to predict one-step ahead reference evapotranspiration in a subtropical climatic zone. Environ Process 8:911–941. https://doi.org/10.1007/s40710-021-00512-4
Roy DK, Lal A, Sarker KK, Saha KK, Datta B (2021) Optimization algorithms as training approaches for prediction of reference evapotranspiration using adaptive neuro fuzzy inference system. Agric Water Manag 255:107003
Smith JS (2005) The local mean decomposition and its application to EEG perception data. J R Soc Interface. https://doi.org/10.1098/rsif.2005.0058
Sowmya MR, Kumar MBS, Ambat SK (2020) Comparison of deep neural networks for reference evapotranspiration prediction using minimal meteorological data. In: 2020 Advanced computing and communication technologies for high performance applications (ACCTHPA), pp 27–33. https://doi.org/10.1109/ACCTHPA49271.2020.9213201
Tabari H, Grismer ME, Trajkovic S (2013) Comparative analysis of 31 reference evapotranspiration methods under humid conditions. Irrig Sci 31(2):107–117. https://doi.org/10.1007/s00271-011-0295-z
Tang D, Feng Yu, Gong D, Hao W, Cui N (2018) Evaluation of artificial intelligence models for actual crop evapotranspiration modeling in mulched and non-mulched maize croplands. Comput Electron Agric 152:375–384. https://doi.org/10.1016/j.compag.2018.07.029
Tikhamarine Y, Souag-Gamane D, Najah Ahmed A, Kisi O, El-Shafie A (2020) Improving artificial intelligence models accuracy for monthly streamflow forecasting using grey Wolf optimization (GWO) algorithm. J Hydrol 582:124435. https://doi.org/10.1016/j.jhydrol.2019.124435
Traore S, Wang Y-M, Kerh T (2010) Artificial neural network for modeling reference evapotranspiration complex process in Sudano-Sahelian zone. Agric Water Manag 97(5):707–714. https://doi.org/10.1016/j.agwat.2010.01.002
Wu T, Zhang W, Jiao X, Guo W, Hamoud YA (2021) Evaluation of stacking and blending ensemble learning methods for estimating daily reference evapotranspiration. Comput Electron Agric 184:106039. https://doi.org/10.1016/j.compag.2021.106039
Yaseen ZM, El-Shafie A, Afan HA, Hameed M, Mohtar WHM, Hussain A (2016) RBFNN versus FFNN for daily river flow forecasting at Johor River, Malaysia. Neural Comput Appl 27(6):1533–1542. https://doi.org/10.1007/s00521-015-1952-6
Yi J, Li H, Zhao Y, Shao M, Zhang H, Liu M (2022) Assessing soil water balance to optimize irrigation schedules of flood-irrigated maize fields with different cultivation histories in the arid region. Agric Water Manag 265:107543. https://doi.org/10.1016/j.agwat.2022.107543
Yin L, Wang L, Li T, Lu S, Yin Z, Liu X, Zheng W (2023a) U-Net-STN: a novel end-to-end lake boundary prediction model. Land 12(8):1602. https://doi.org/10.3390/land12081602
Yongbo LI, Shubin SI, Zhiliang LIU, Xihui L (2019) Review of local mean decomposition and its application in fault diagnosis of rotating machinery. J Syst Eng Electron 30(4):799–814. https://doi.org/10.21629/JSEE.2019.04.17
Zhang Q, Cui N, Feng Yu, Gong D, Hu X (2018) Improvement of Makkink model for reference evapotranspiration estimation using temperature data in Northwest China. J Hydrol 566:264–273. https://doi.org/10.1016/j.jhydrol.2018.09.021
Zhao Y, Wang H, Song B, Xue P, Zhang W, Peth S, Horn R (2023) Characterizing uncertainty in process-based hydraulic modeling, exemplified in a semiarid Inner Mongolia steppe. Geoderma 440:116713. https://doi.org/10.1016/j.geoderma.2023.116713
Zhu B, Feng Y, Gong D, Jiang S, Zhao L, Cui N (2020) Hybrid particle swarm optimization with extreme learning machine for daily reference evapotranspiration prediction from limited climatic data. Comput Electron Agric 173:105430. https://doi.org/10.1016/j.compag.2020.105430
Acknowledgements
Thanks to Grammarly Premium, which helps correct grammatical errors and spelling mistakes in the article.
Funding
Open access funding provided by The Science, Technology & Innovation Funding Authority (STDF) in cooperation with The Egyptian Knowledge Bank (EKB).
Author information
Contributions
Ahmed Elbeltagi: main idea for the paper, conceptualization, data collection, methodology, writing of the original draft, supervision, project administration, formal analysis, and funding acquisition; Okan Mert Katipoğlu: implementation of the modeling process, formal analysis, and preparation of the plots; Veysi Kartal: drafting of the model description and results; Ali Danandeh Mehr: writing of the introduction and discussion sections; Sabri Berhail: writing of the original draft, review, and editing; Elsayed Ahmed Elsadek and Ahmed Elbeltagi: review and editing. All authors read and approved the final version for submission.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethics approval
Not applicable.
Consent to participate
Not applicable.
Consent for publication
Not applicable.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix
Pseudo codes for ABC-ANN model
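As a rough illustration of how an ABC search can tune an ANN, the following minimal Python sketch lets the employed, onlooker, and scout bee phases minimize the mean squared error of a small one-hidden-layer network. It is a sketch under stated assumptions, not the authors' implementation: the helper names (`ann_predict`, `mse`, `abc_minimize`), the fixed hidden-layer size, the weight bounds, the default `colony_size` and `limit`, and the toy sine dataset are all hypothetical, and the sketch tunes weights only, whereas the study describes ABC as selecting the number of hidden neurons.

```python
import math
import random

def ann_predict(weights, x, n_hidden):
    """One-input, one-output network with a single tanh hidden layer.
    Layout: [input weights | hidden biases | output weights | output bias]."""
    out = weights[3 * n_hidden]
    for j in range(n_hidden):
        hidden = math.tanh(weights[j] * x + weights[n_hidden + j])
        out += weights[2 * n_hidden + j] * hidden
    return out

def mse(weights, data, n_hidden):
    return sum((ann_predict(weights, x, n_hidden) - y) ** 2
               for x, y in data) / len(data)

def abc_minimize(cost, dim, colony_size=20, limit=15, max_cycles=200,
                 bound=2.0, seed=0):
    """Basic ABC loop for a non-negative cost; half the colony is employed bees."""
    rng = random.Random(seed)
    n_sources = colony_size // 2
    sources = [[rng.uniform(-bound, bound) for _ in range(dim)]
               for _ in range(n_sources)]
    costs = [cost(s) for s in sources]
    trials = [0] * n_sources
    best_i = min(range(n_sources), key=costs.__getitem__)
    best, best_cost = sources[best_i][:], costs[best_i]
    history = [best_cost]

    def try_neighbor(i):
        # Perturb one dimension relative to a random partner: v = x + phi * (x - x_k)
        k = rng.choice([s for s in range(n_sources) if s != i])
        j = rng.randrange(dim)
        candidate = sources[i][:]
        candidate[j] += rng.uniform(-1.0, 1.0) * (sources[i][j] - sources[k][j])
        candidate[j] = max(-bound, min(bound, candidate[j]))
        c = cost(candidate)
        if c < costs[i]:                      # greedy selection
            sources[i], costs[i], trials[i] = candidate, c, 0
        else:
            trials[i] += 1

    for _ in range(max_cycles):
        for i in range(n_sources):            # employed bee phase
            try_neighbor(i)
        fitness = [1.0 / (1.0 + c) for c in costs]
        for _ in range(n_sources):            # onlooker phase, fitness-proportional
            try_neighbor(rng.choices(range(n_sources), weights=fitness)[0])
        best_i = min(range(n_sources), key=costs.__getitem__)
        if costs[best_i] < best_cost:         # memorize the best source so far
            best, best_cost = sources[best_i][:], costs[best_i]
        worn = max(range(n_sources), key=trials.__getitem__)
        if trials[worn] > limit:              # scout phase: abandon a stale source
            sources[worn] = [rng.uniform(-bound, bound) for _ in range(dim)]
            costs[worn] = cost(sources[worn])
            trials[worn] = 0
        history.append(best_cost)
    return best, best_cost, history

# Toy regression target (a sine curve), standing in for real training data
n_hidden = 3
data = [(x / 10.0, math.sin(x / 10.0)) for x in range(-15, 16)]
dim = 3 * n_hidden + 1
best_w, best_mse, history = abc_minimize(lambda w: mse(w, data, n_hidden), dim)
```

The `colony_size` and `limit` arguments correspond to the population size and limit parameters that the discussion identifies as requiring careful tuning; a structural search could wrap this routine in an outer loop over candidate values of `n_hidden`.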

Pseudo codes for separation of components
EMD component separation
[imf, residual, info] = emd(data)
EMD decomposes data into intrinsic mode functions (IMFs) and a residual.
EEMD component separation

The EEMD decomposition is performed using the eemd() function with various parameters such as the input data, the number of IMFs, ensemble size, noise strength, S number, number of siftings, random number generator seed, and number of threads. The resulting decomposition is stored in the variable imf.
CEEMDAN component separation

The CEEMDAN decomposition is performed using the ceemdan() function with various parameters such as the input data, the number of IMFs, ensemble size, noise strength, S number, number of siftings, random number generator seed, and number of threads.
VMD component separation
[imf, residual, info] = vmd(data)
VMD decomposes data into variational mode functions (VMFs) and a residual.
EWT component separation
[mra, cfs, wfb] = ewt(data)
EWT decomposes a signal into multiple scales with the multiresolution analysis (MRA) and wavelet filter banks (WFB).
LMD component separation
- The code begins by initializing the LMD with the provided input variables.
- The input signal x is transformed into a row vector and its length L_x is saved.
- The energy of the original signal E_x is computed.
- A copy of the original signal xx is saved.
- The variable i is set to 0 to track the number of extracted PFs.
- An empty array Nbits is created to store the number of iterations needed for each PF.
- Arrays PF, A, and S are pre-allocated to store the PFs, amplitudes, and siftings of each IMF.
- The main loop begins, where the algorithm iteratively extracts PFs until the maximum number of PFs (MAX_PFS) is reached or a stop criterion is met.
- Within each iteration, the function lmd_pf is called to compute the PF, amplitude, and sifting of the current IMF using the EOE approach.
- If no iterations are performed (N_Gpfi == 0), it means that the current PF is the signal itself, and the loop is terminated.
- The obtained PF, amplitude, and sifting are stored in the respective arrays.
- The current PF is subtracted from the signal x.
- The function stop_lmd is called to check if the stop criterion is met, indicating the end of the decomposition process.
- If the stop criterion is met, the current signal x is stored as the final PF, and the loop is terminated.
- The number of iterations needed for the current PF is appended to the Nbits array.
- After the loop ends, any extra space in the arrays is cleared.
- The function lmd_io is called to compute the Orthogonal Residual Trend (ORT) using the original signal xx and the extracted PFs PF.
- The execution of the code reaches its end.
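The outer loop described in the steps above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the code used in the study: the names `lmd_pf` and `stop_lmd` mirror those in the steps, but their bodies here are stand-ins that treat the signal endpoints as extrema, smooth the local mean and envelope with a moving average, and stop when fewer than three interior extrema remain. The remaining signal is returned as a residual instead of being stored as a final PF, and the ORT computation (`lmd_io`) is omitted.

```python
import math

EPS = 1e-8  # guards the division by the envelope

def extrema_indices(x):
    """Indices of interior local extrema, plus both endpoints (a simplification)."""
    idx = [0]
    for i in range(1, len(x) - 1):
        if (x[i] - x[i - 1]) * (x[i + 1] - x[i]) < 0:
            idx.append(i)
    idx.append(len(x) - 1)
    return idx

def smooth(values, width):
    """Centered moving average with the window truncated at the edges."""
    half = width // 2
    out = []
    for i in range(len(values)):
        window = values[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out

def mean_and_envelope(x):
    """Piecewise-constant local mean and magnitude between extrema, then smoothed."""
    ext = extrema_indices(x)
    mean = [0.0] * len(x)
    env = [0.0] * len(x)
    for a, b in zip(ext, ext[1:]):
        for i in range(a, b + 1):
            mean[i] = (x[a] + x[b]) / 2.0
            env[i] = abs(x[a] - x[b]) / 2.0
    width = max(3, max(b - a for a, b in zip(ext, ext[1:])) // 3)
    return smooth(mean, width), smooth(env, width)

def lmd_pf(x, max_sift=10, tol=1e-3):
    """Extract one product function: accumulated envelope times FM signal."""
    s = list(x)
    envelope = [1.0] * len(x)
    for n_iter in range(1, max_sift + 1):
        mean, env = mean_and_envelope(s)
        env = [max(e, EPS) for e in env]
        s = [(v - m) / e for v, m, e in zip(s, mean, env)]
        envelope = [a * e for a, e in zip(envelope, env)]
        if all(abs(e - 1.0) < tol for e in env):  # envelope is flat: s is purely FM
            break
    pf = [a * v for a, v in zip(envelope, s)]
    return pf, envelope, s, n_iter

def stop_lmd(x):
    """Stop when the remaining signal has fewer than three interior extrema."""
    return len(extrema_indices(x)) - 2 < 3

def lmd(signal, max_pfs=4):
    x = list(signal)
    pfs, n_iters = [], []
    while len(pfs) < max_pfs and not stop_lmd(x):
        pf, _, _, n_iter = lmd_pf(x)
        pfs.append(pf)
        n_iters.append(n_iter)                    # iterations needed for this PF
        x = [v - p for v, p in zip(x, pf)]        # subtract the PF from the signal
    return pfs, x, n_iters

# Synthetic two-tone test signal, for illustration only
n = 400
signal = [math.sin(2 * math.pi * 20 * i / n) + 0.5 * math.sin(2 * math.pi * 2 * i / n)
          for i in range(n)]
pfs, residual, n_iters = lmd(signal)
```

By construction, the extracted PFs and the residual sum back to the input signal, which gives a quick sanity check for any LMD implementation.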
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Elbeltagi, A., Katipoğlu, O.M., Kartal, V. et al. Advanced reference crop evapotranspiration prediction: a novel framework combining neural nets, bee optimization algorithm, and mode decomposition. Appl Water Sci 14, 256 (2024). https://doi.org/10.1007/s13201-024-02308-x
DOI: https://doi.org/10.1007/s13201-024-02308-x




















