A new approach to use of wavelet transform for baseflow separation of Karst springs (case study: Gamasiyab spring)

Today, groundwater is the essential resource the whole water sector demands. The spring baseflow discharge analysis can significantly help us better manage water resources. This study was focused on the separation baseflow discharge of the Gamasiyab karstic spring (western Iran) using the wavelet transform (WT). The recursive digital filter (RDF) method with different algorithms has been used to compare the approach performance. Forty-nine years (1968–2018) of the recorded springs’ daily discharge data were used for this aim. The isotopic content (IC) method was used to determine the best RDF algorithm and WT for baseflow separation. The IC of the monthly samples taken in 2017–2018 was measured. The results showed that the WT had the adjusted coefficient of determination (Radjusted2\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$R_{{{\text{adjusted}}}}^{2}$$\end{document}) of 0.95 and root mean square error (RMSE) of 0.552 (m3/s) compared to the IC index method. The best algorithm in the RDF method was related to the Eckhardt algorithm with Radjusted2\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$R_{{{\text{adjusted}}}}^{2}$$\end{document} of 0.97 and RMSE of 0.670 (m3/s), which had the best performance compared to the IC index method. Results showed that in most samples, the values of the separated baseflow by the WT are closer to the isotopic values. Also, based on the RMSE criterion, the WT has a lower RMSE than the RDF method and is known as the best method in this study. The results showed that the WT accurately separated the spring baseflow discharge and could split the baseflow of rivers.


Introduction
The karst aquifers have more suitable conditions for feeding compared to other types of aquifers (Simsek et al. 2008). However, hydrogeological studies of these aquifers are challenging due to the permeability distribution, the very heterogeneous hydraulic conductivity, and the spatial and temporal variation in the feed (Bonacci 1993;Einsiedl 2005). In this regard, karst springs can provide valuable information about aquifers' hydrogeological processes and internal structure, especially about complex groundwater flow systems where point measurement of wells is impossible.
Karst springs water is used commonly for the drinking, agricultural and environmental needs in mountainous areas. These springs have different discharge rates due to precipitation, snowmelt, and air temperature changes in different seasons. Separation of the share of baseflow in these springs, the primary source for water projects, is essential. Because most projects of environmental and water resources based on 264 Page 2 of 15 baseflow can be created and exploited, accurate estimation and share in this sector can help the country's water sector.
Various methods have been used to separate the baseflow. Baseflow separation methods are generally divided into two categories: (1) non-tracking methods and (2) tracking methods. Each of these methods involves different techniques. Non-tracking methods include graphical separation techniques and filtering separation techniques, and wavelet transform (WT). Tracking methods include isotopic and hydrochemical methods. Tracking methods are the most accurate ones. Using isotopic methods is costly and timeconsuming. Despite being fast, the graphic methods often do not provide acceptable results. Filtering methods are usually reasonably accurate and work quickly. So far, various filters have been proposed to separate the baseflow, including hydrograph separation (Hysep), Champman, Lyne and Hollick, Eckhardt, Furey and Gupta, exponentially weighted moving average (EWMA), one, two, and three parameters or the IHACRES.
WT can be considered another filtering method that the study by Salerno and Tartari (2009) used. They presented an approach with surface hydrological modeling and wavelet analysis to understand the baseflow components from the river discharge in the karst environment. In this study, the potential of wavelet analysis to help define the nature and behavior of karst share in the river flow was investigated to improve the future performance of surface hydrological modeling.
This section refers to some of the studies that have been done in the field of baseflow separation: Aksoy et al. (2009) used the baseflow separation method, the method of the United Kingdom Institute of hydrology (UKIH), together with the recursive digital filter (RDF) separation method and by merging the two approaches, a technique called filtered smoothed minima baseflow separation method (FUKIH) was presented. Using this method in several basins in the western Black Sea, they concluded that the RDF and FUKIH methods provide a more realistic baseflow structure due to the lack of linear interpolation. Falamarzi et al. (2014) used ADUKIH and RDF methods to separate the river baseflow in Iran, Abol Abbas basin. Their results showed that RDF has outperformed than ADUKIH method. Rudra et al. (2015) examined baseflow indices to identify a baseflow and rapid flow in 150 basins in Ontario (Canada). By using the PART, RDF, UKIH, local minimum, fixed distances, and moving distances methods, they stated that BFI levels above 80% occur in group A and B soils and occur less than 80% in groups C and D. Minea (2017) examined the baseflow for three basins in the northeastern region of Romania. He used local least, RDF, one-parameter algorithms, Chapman, Eckhardt, Talaksen, and WHAT models. The results showed that Eckhardt and Chapman's algorithms were the more appropriate approach. Mali and Mohammadi (2014) evaluated the baseflow estimation methods in Pirghar and Dimeh karstic springs in the Chaharmahal and Bakhtiari province of Iran. They considered the changes in the baseflow rate based on the spring hydrochemical as a criterion for evaluation and the basis for comparison to select the most appropriate RDF method and conventional methods of separating the baseflow of karst springs. The most appropriate method for flow separation among the numerical algorithm digital methods was reported to be the Eckhardt filter method. Kazemi and Ghermez Cheshmeh (2016) calculated the baseflow based on the flow continuity curve index for rivers in Iran, Caspian. They used minimum local methods, fixed intervals, sliding intervals, RDF, one-parameter, two-parameter, and Chapman methods. Their results showed that the RDF was a suitable method. Mehri et al. (2017) compared the river baseflow separation methods and seasonal changes in several watersheds in Ardabil province. They used the local minimum, fixed, sliding, and RDF methods, including one-, two-, and three-parameter algorithms, Lyne and Hollick, Chapman, and EWMA. Their results showed that the two-parameter algorithm performed better. Hessari et al. (2019) compared the effect of different methods of filtering daily flows to separate the baseflow in the rivers west of Urmia Lake. They found Lyne and Hollick and Eckhardt algorithms suitable with a 0.9 filter.
Using digital filtering methods, Cheng et al. (2021) used different filtering parameters to separate the baseflow of the Heihe River Basin (HRB) in Northwest China. Results showed that baseflow at the midstream Zhengyxia and downstream Yangyangchi stations had a significant positive and negative correlation with population and effective irrigation area, respectively. Although the baseflow separation has been investigated for rivers, this study examined the baseflow divergence for the Gamasiyab karst spring hydrograph. The WT has been used in various fields, but the study aimed to use this tool to separate the baseflow. This study provided an equation for baseflow separation using the WT for the first time. This equation can be used in all springs and rivers. Also, the isotopic method was used to evaluate the accuracy using the monthly sampled data in 2017-18.

Study area
The Gamasiyab is the most crucial spring in the Garin Mountain area, located 20 km away from Nahavand city in western Iran. Its average daily discharge is about four m 3 /s from a cold water temperature of 5 °C. Figure 1 shows the location of the Gamasiyab spring in Hamedan province and Iran. The Gamasiyab spring discharge is recorded by Page 3 of 15 264 two hydrometric stations of the Sang-e-Sorakh (station 1, since 1969) and the Varayeneh canal (station 2, since 2006). Station 1 with 1786 m elevation above sea level located in 48º23′23″ E and 34º03′33″ N. Station 2 with 1852 m elevations above sea level located in 48º22′38″ E and 34º02′53″ N. The statistical parameters of the stations used are presented in Table 1 on a daily scale. The daily spring discharge varied between 37.97 and 0.3 m 3 /s. The basin of the Sange-Sorakh station is about 60 km 2 . Based on the Varayeneh meteorology station, the average annual precipitation of the area is about 538 mm.
The water from precipitation in the Garin Mountains is divided into two parts. A part of it is surface runoff that goes through valleys and waterways and joins rivers.
Another aspect of it penetrates them through discontinuities and pores of rocks. It flows through seams, cracks, and interconnected dissolution cavities into the rock mass and finally appears as springs such as Gamasiyab spring.
Based on hydrogeological characteristics, rock units in this area are divided into two categories of sedimentary and non-sedimentary rocks. The non-sedimentary rocks include a set of intrusive, pyroclastic, and metamorphic rocks, usually of low or impermeable permeability. The limestone rocks and marl-sandstone periodicity, sedimentary rocks form the study area, divided into two categories of carbonate and non-carbonate rocks. Many karst cavities and fractures have created a very high potential for  water storage in the limestones of Garin Mountain and have made rich groundwater aquifers in this area.

Data preparation
The box plot method was used to examine the outliers. The data were converted to numbers between zero and one using the standardization method to coordinate the data better. According to Solgi (2014), Eq. 1 is used for standardization: In Eq. (1), x is the desired data, x the average data, x max the maximum data, x min the minimum data, and y the standardized data.

The proposed approach
Three approaches involve (1) the IC method, (2) the RDF algorithms, and (3) the WT used for baseflow separation. Because each of the algorithms of the RDF method had different parameters, the parameters of each algorithm were initially optimized based on the IC method in the water year 2017-18. Then, the optimized parameters were used for 49 years.

Isotopic content (IC)
First, in the water year 2017-2018, spring water and rain were sampled. Then, the mass spectrometer measured oxygen's isotopic composition (IC). In this research, the IC of water has been determined using an FT-IR 100 spectrometer in the laboratory of Mesbah Energy Company of Iran.

Wavelet transform (WT)
The WT is one of the efficient mathematical transformations in signal processing. Wavelets are mathematical functions that provide a time-scale form of time series and their relationships for analyzing time series that include variables and non-constants. Wavelet analysis can display different aspects of data, breakpoints, and discontinuities that other signal analysis methods may not. The wavelet function has two essential properties of oscillation and is short term. (x) is a wavelet function (WF) if and only if its Fourier transform ( ) satisfies the following condition (Mallat 1998): (1) This condition is the admissibility condition for the ψ(x) wavelet. The above equation can be considered equivalent to the following formula: This property of a function with a mean of zero is not limited, and many functions can be called the WF based on their functions. (x) is the mother WT that the functions used in the analysis are analyzed, resized, and displaced with two mathematical operations of translation and scale during the signal.
Finally, the wavelet coefficients can be calculated at any point of the signal (b) and for any value of the scale (a) in the following Eq. (Nourani et al. 2009).
In Eq. (5), a scale and b translate the function. The value of T is obtained for different values of a and b. Whenever T has the highest positive value, it has the highest adaptation too. Also, there is no adaptation for T, equal to zero, and there is an inverse adaptation or maximum difference for negative T values.
Because the effect of time series data is differentiated before entering different models and the initial signal is decomposed into several sub-signals, it is possible to use an analysis that includes short-term and long-term effects, which optimizes the model in future evaluations and estimates. For more information on this field, please refer to (Mallat 1998;Foufoula-Georgiou and Kumar 1994).
In this study, the baseflow separation using WT is done as follows: First, the discharge time series of the Gamasiyab spring was standardized using Eq. (1). These standardized series were then decomposed at different decomposition levels using the previous sections' wavelet functions. From the created sub-series, the details of the last level (details) and the general sub-series (approximant) were selected to separate the baseflow. The baseflow is the sum of these two sub-series. The baseflow is calculated according to Eq.
In Eq. (6), j is the decomposition level, Q b (ji) are the baseflow at the time i and the level j, d ji is the detail subseries at the level j and at the time i and, a ji is the total subseries at the decomposition level at the time i. For example, if the decomposition level is 5, the baseflow can be calculated using the general and partial subseries of level 5. Given that the minimum baseflow cannot be less than the minimum flow, and the maximum baseflow cannot be more than the full discharge, conditions according to Eq. (7) and (8) are applied to prevent the violation of these conditions, Q min and Q max are minimum and maximum discharges, respectively.
One of the critical and essential points about selecting the WFs is the nature of the phenomenon and its time series. Therefore, models of WFs that can be geometrically well matched to the time series curve can perform the mapping operation better, and the results will be better. According to the test of different WFs and the point mentioned above, 5 WFs named Db2, Db4, Coif1, Sym3, and Haar have been used. Depending on the time scale of the research, which is daily, the degree of decomposition is 5-9.

Recursive digital filter (RDF)
In the RDF method, the total flow is decomposed into two components: fast flow, baseflow, and signal processing techniques applied to eliminate the high-frequency fast flow signal and the low-frequency baseflow signal in the flow time series. These filters can be used on long-term continuous flows. The algorithms examined in this study are as follows: RDF method-one-parameter algorithm This algorithm is expressed according to Eq. (9). As the name of this algorithm indicates, it has a parameter k. This parameter, related to the watershed, is called the return constant. In this relation q b(i) is the baseflow filtered at the time i (m 3 /s), the baseflow filtered at the time i−1 (m 3 /s), q (i) the total flow at the time i (m 3 /s) (Chapman and Maxwell 1996).

Two-parameter algorithm
This algorithm includes the separation variability parameter (C) equivalent to 1−k, and k is the return constant. This algorithm is expressed according to Eq. (10) (Boughton 1993): Three-parameter IHACRES algorithm This algorithm is an extended form of a two-parameter algorithm, defined to separate the algorithmic flow that divides the adequate precipita- tion (u) into fast and slow components. In this algorithm, the two parameters q b and q f , which represent the direct runoff flow and the baseflow, respectively, are as follows.
In these relations, s and s are the filter parameters. The suffixes q and s refer to quick and slow flow, respectively. It should be noted that the value of s is negative. By removing u from these equations, the direct runoff expression is the difference between the baseflow and the total flow. The baseflow is obtained from Eq. (13) Eckhardt algorithm This algorithm has two parameters: (1) landing curve constant α and (2) maximum value of the baseflow index BFI max , which is not measurable but has been optimized by adapting the results of other methods. In Eq. (15), which is used to separate the baseflow, the condition q b(i) ≤ q (i) must be met. In this filter, the hydrological characteristics of the flow and basin are considered to some extent. To calibrate the BFI max parameter, the use of tracking data has been suggested for calibration (Eckhardt 2008).
In this method, when the constant return parameter is 0.9-0.95, the value of the BFI max parameter is 0.8 for a permanent river with a porous aquifer, 0.5 for a non-permanent river with a porous aquifer, and 0.25 for permanent rivers with a rocky aquifer.

Lyne and Hollick algorithm
This algorithm is expressed as Eq. (16). In this relation, q f (i) is the fast-filtered flow for the i-th sampling moment, q f (i−1) is the filtered fast flow for the moment before sampling i, and α is the filter constant (Lyne and Hollick 1979). Chapman algorithm This algorithm, expressed as Eq. (17), is an extended form of the Lyne and Hollick algorithm. In this equation, q f (i) is the fast-filtered flow for the i-th moment of sampling, q f (i−1) is the fast-filtered flow for the moment before sampling i, and α is the filter constant (Chapman 1999).

Furey and Gupta algorithm
In this algorithm, expressed as Eq. (18), q (i−d−1) is the total flow for the moment before sampling i, taking into account the delay (m 3 /s), q b(i−d−1) is the filtered baseflow for the moment before sampling i taking into account the delay (m 3 /s), c 1 is the physical parameter of the filter related to the surface flow, and c 3 is the physical parameter of the filter related to the underground flow, d is the delay time, and 1−γ is the basin landing constant (Furey and Gupta 2001).

EWMA algorithm In this algorithm, for each period t, the baseflow q b(t) is obtained from a set of time series with
Eq. (19). Eq. α is the constant filter parameter.

Evaluation criteria
Three criteria, coefficient of determination ( R 2 ), root mean square error (RMSE), and adjusted coefficient of determination ( R 2 adjusted ), have been used to evaluate different algorithms in this study. These criteria are expressed as the following relations: In the above equations, n is the number of data, i the counter variable, Q iobs the observational data, Q iobs the average observational data, Q ipre the computational data and Q ipre the average computational data. The coefficient R 2 shows the degree of conformity of the models' data with the actual data. RMSE expresses the root mean square errors of computational and observational data. A lower rate indicates better training and simulation of data.
The criterion of R 2 adjusted , in addition to the coefficient R 2 , is also a function of the number of samples and the number of parameters used, so it has more accuracy, and overfitting occurs less in it. The closer the R 2 adjusted to one, the better the result will be.

Results and discussion
In this section, the results of baseflow separation are initially presented using the IC method, which is the primary method in this study. Then, the separation results are presented using the WT and RDF method with their different algorithms, and at the end, a comparison is made between the used methods.

Results of IC method
The results of the baseflow separation of the Gamasiyab spring based on isotopic values δ 18 O and δ 2 H in the water year 2017-18 are presented in Fig. 2. The results show that the share of baseflow based on the isotopic value of δ 18 O and δ 2 H reaches 94.5-89.2% of the total discharge, respectively. The share of surface flow based on the isotopic value of δ 18 O and δ 2 H is 5.5-10.8% of the total discharge.

WT results
The results of using different WFs at different levels are presented in Table 2; in the Gamasiyab spring, the best performance according to the three evaluation criteria is related to the Db4 wavelet function at the decomposition level 6. In this superior combination, the R 2 is 0.95, the RMSE is 0.55, and the R 2 adjusted is 0.95. Table 3 presents the statistical parameters obtained from the baseflow separation with this WF. According to this table, the average baseflow and surface flow separated in the statistical period are 94.5% and 5.5%, respectively. Also, the maximum baseflow in the Gamasiyab spring in the statistical period equals 13.36 m 3 /s. According to Table 3, the average baseflow and surface flow separated in the water year 2017-18 are equal to 0.95-0.5%, respectively. Also, the maximum baseflow in this water year equals 10.33 m 3 /s. Figure 3 shows the daily discharge of the Gamasiyab spring in the statistical period of 1969-2018 along with its separated baseflow using the Db4 wavelet function at the decomposition level 6. Figure 4 shows the discharge of the Gamasiyab spring with its baseflow in 2017-2018.
Page 7 of 15 264 Figure 5 presents the results of baseflow separation by WT and isotopic method. According to this figure, during peak discharge, the WT performance has decreased slightly, and in other cases, it has performed well.

Results of the RDF method
Each algorithm of the RDF method has at least one filter parameter. To optimize the filter parameters in these algorithms, isotopic values and separated baseflows were used on the days of the water year 2017-2018. The optimal value of the filter parameter in each algorithm was determined using the RMSE function. Then, the filter parameter was used for the whole statistical period. BFI + 3.0 and programming in Excel were used to calculate these algorithms.

Results of RDF method-one-parameter algorithm
The results of using this algorithm to separate the baseflow of the Gamasiyab spring for the optimal values of the filter parameter are presented in Table 4. This algorithm has an optimal filter value of 0.78. Using this algorithm, the baseflow index and surface water are calculated as 50-50% of the total discharge, respectively.

Results of RDF method-two-parameter algorithm
The results of this algorithm for separating baseflow of the Gamasiyab spring for the optimal values of k and C filter parameters are presented in Table 5. The other parameter, C, is equal to 1−k. This algorithm has an optimal value of filters of 0.755-0.245. Using this algorithm, the baseflow index and surface water are calculated as 50-50% of the total discharge, respectively.

Results of RDF method-three-parameter algorithm
The results of this method for separating the baseflow of the Gamasiyab spring for optimal values of α, k, and C filter parameters are presented in Table 6. This algorithm has optimal values of filters 0.81, 0.408, and 0.667. Using this algorithm, the baseflow index and surface water are calculated as 96-4% of the total discharge, respectively.

Results of RDF method-Lyne and Hollick algorithm
The results of this algorithm for separating the baseflow of the Gamasiyab spring for the optimal values of the α filter parameter are presented in Table 7. This algorithm has an optimal filter value of 0.6. Using this algorithm, the baseflow index and surface water are calculated at 98% and 2%, respectively.

Results of RDF method-Chapman algorithm
The results of this algorithm for separating the baseflow of the Gamasiyab spring for the optimal values of the α filter parameter are presented in Table 8. This algorithm has an optimal filter value of 0.78. Using this algorithm, the baseflow index and surface water are calculated as 51% and 49%, respectively.

Results of RDF method-Eckhardt algorithm
The results of this algorithm for separating the baseflow of the Gamasiyab spring for the optimal values of the α filter parameter are presented in Table 9. The value of the BFI max parameter in this method is considered 0.80. This algorithm has an optimal filter value of 0.85. Using this algorithm, the baseflow index and surface water are calculated at 80% and 20%, respectively.

Results of RDF method-EWMA algorithm
The results of this algorithm for separating the baseflow of the Gamasiyab spring for the optimal values of the α filter parameter are presented in Table 10. This algorithm has an optimal filter value of 0.755. This algorithm calculates the baseflow index and surface water at 100-0%. In other words, this method considers all discharge as the baseflow.

Results of RDF method-Furey and Gupta algorithm
The results of this algorithm for separating the baseflow of the Gamasiyab spring for the optimal values of the α filter parameter are presented in Table 11. Considering the area of about 60 km 2 of the Gamasiyab catchment area, parameter d is equal to 4 h, parameter C is equal to 0.5, the optimal value of parameter α is equal to 0.8, and the value of the gamma parameter  is equal to 1−α, i.e., 0.2. This algorithm calculates the baseflow index and surface water at 100-0%. In other words, this method considers all discharge as the baseflow. Table 12 presents the performance of different algorithms of the RDF method based on IC-stable oxygen 18 for the Gamasiyab spring. According to Table 12, it can be seen that the performance of three-parameter algorithms, Lyne and Hollick, Eckhardt, EWMA, and Furey and Gupta, has been excellent in terms of statistical indicators, but it should be noted that some of these algorithms considered all flows as baseflows, which need further investigation. This is further explored in Fig. 6.

Comparison of the results of different algorithms of the RDF method
According to Fig. 6, it can be seen that one-parameter and two-parameter algorithms have less estimation of the isotopic method for the baseflow in the Gamasiyab spring. The Chapman algorithm also estimates the baseflow less than the IC method. According to Fig. 6, it can be seen that in the three-parameter algorithm, the maximum point estimate of baseflow was higher than the isotopic method in the Gamasiyab spring. Eckhardt's algorithm performed better in estimating the baseflow than the IC method. According to Fig. 6, it can be seen that the Lyne and Hollick, EWMA, and Furey and Gupta algorithms have considered all flows as baseflows. Examining Table 12 and Fig. 6, it can be concluded that the Eckhardt algorithm performed more logically and better than the baseflow based on the IC method for the Gamasiyab spring.

Comparison of the results of different methods
This section compares the best performance in the two WT and RDF methods. Table 13 compares used methods of baseflow separation in 2017-2018. Based on this table, performance WT is better than the RDF method. In Fig. 7, the two methods are compared with the IC index method for the water year 2017-2018. As seen in most samples, the values of the separated baseflow by the WT are closer to the isotopic values. Due to the RMSE of both methods, the WT also has a lower RMSE and is known as the best method for this study. Figure 3 shows the baseflow separation by WT for the whole statistical period.
This study provided an equation for baseflow separation using the WT for the first time. This equation can be used in all springs and rivers. Since the tracking methods are the most accurate, using them is costly and time-consuming. In this study, by examining different methods, in the end, the WT was provided for baseflow separation in springs and rivers. Since no study was found that used this approach to baseflow separation, it was not possible to compare the results of this study with other studies.

Conclusions
In this study, the WT was used as a new tool to separate the baseflow of the Gamasiyab spring. The daily discharge of 49 years (the water year 1968-1969 to 2017-2018) was used for the Gamasiyab spring. Different RDF method algorithms based on the IC of samples taken in the water year 2017-2018 were used to investigate the accuracy of WT in separating the spring baseflow. In the Gamasiyab spring, the Db4 wavelet function performed best at decomposition level 6. The results showed that the average separated baseflow using WT in the Gamasiyab spring was 3.78 m 3 /s. In other baseflow separation methods, some of the initial and final data of the statistical period were lost, and the baseflow was not provided for them, but the WT tool, while having high accuracy in separating the baseflow, provided the baseflow for each day. The results showed that the WT had a high accuracy in separating the baseflow in the Gamasiyab spring and could be used to separate the baseflow of springs and rivers.