A comparative study of shallow groundwater level simulation with three time series models in a coastal aquifer of South China
 1.8k Downloads
 2 Citations
Abstract
Accurate and reliable groundwater level forecasting models can help ensure the sustainable use of a watershed’s aquifers for urban and rural water supply. In this paper, three time series analysis methods, Holt–Winters (HW), integrated time series (ITS), and seasonal autoregressive integrated moving average (SARIMA), are explored to simulate the groundwater level in a coastal aquifer, China. The monthly groundwater table depth data collected in a long time series from 2000 to 2011 are simulated and compared with those three time series models. The error criteria are estimated using coefficient of determination (R ^{2}), Nash–Sutcliffe model efficiency coefficient (E), and rootmeansquared error. The results indicate that three models are all accurate in reproducing the historical time series of groundwater levels. The comparisons of three models show that HW model is more accurate in predicting the groundwater levels than SARIMA and ITS models. It is recommended that additional studies explore this proposed method, which can be used in turn to facilitate the development and implementation of more effective and sustainable groundwater management strategies.
Keywords
Groundwater table Time series analysis Holt–Winters ARIMA ITSIntroduction
Groundwater is often one of the major sources of water supply for domestic, agricultural, and industrial users. In some areas, it is taken as the only dependable source of supply because of its ready availability. However, groundwater supplies for agricultural, industrial, and municipal purposes have been overexploited in many parts of the world. Various consequences of unsustainable groundwater utilization and management have been of great concerns globally, especially in developing countries (Konikow and Kendy 2005). The consequences of aquifer depletion can lead to local water rationing, excessive reductions in yields, wells going dry or producing erratic groundwater quality changes, changes in flow patterns of groundwater in the inflow of poorer quality water and sea water intrusion in coastal areas, and other harmful environmental side effects such as major waterlevel declines, reduction in water in streams and lakes, increased pumping costs, land subsidence, and decreased well yields have been a great concern to the water managers, engineers, and stakeholders (Adamowski and Chan 2011; Konikow and Kendy 2005; USGS 2010). As a result, many watersheds are experiencing severe environmental, social, and financial problems (Tsanis et al. 2008). As water demand will likely increase in the short and long term, there will be increasing pressures on groundwater resources (Sethi et al. 2010). Therefore, a constant monitoring of the groundwater levels is extremely important. Meanwhile, groundwater systems possess features such as complexity, nonlinearity, being multiscale and random, all governed by natural and/or anthropogenic factors, which complicate the dynamic predictions. Therefore, many hydrological models have been developed to simulate this complex process. Models based on their involvement of physical characteristics generally fall into three main categories: black box models, conceptual models, and physicalbased models (Nourani et al. 2010). The wellforecasted water levels in advance may help the administrators to plan better the groundwater utilization. Also, for an overall development of the basin, a continuous forecast of the groundwater level is required to effectively use any simulation model for water management (Nayak et al. 2006). The common used models for simulating groundwater level include BP neural network, wavelet random coupling model, the gray time series combination model, and time series model. Among which, time series model is the most common and suitable one, however, with different features, such as the integrated time series (ITS) model, the autoregressive moving average (ARMA) model, the autoregressive integrated moving average (ARIMA) model, the seasonal autoregressive moving average (SARMA) model, and the periodic autoregressive (PAR) model (Ahn 2000; Wong et al. 2007; Yang et al. 2009). One of the widely used time series models is the ARMA model, which provide a parsimonious description of a (weakly) stationary stochastic process autoregression and the second for the moving average (McNeil et al. 2005). Autoregressive integrated moving average model (ARIMA) and seasonal autoregressive integrated moving average (SARIMA) models are extensions of ARMA class in order to include more realistic dynamics, in particular, respectively, nonstationarity in mean and seasonal behaviors (Behnia and Rezaeian 2015).
This paper demonstrates a case study on how to utilize time series analysis to predict groundwater table in a coastal island, South China. We evaluate and compare the potential of the three time series models [Holt–Winters (HW), SARIMA and ITS] in the study area. The objectives of the present study are (1) to apply and compare the advantages and disadvantages of these three models on simulating groundwater levels and (2) to provide some useful insights and for the reasonable exploitation and sustainable utilization of groundwater.
Methodology
Holt–Winters model
Exponential smoothing methods are among the most widely used forecasting techniques in industry and business, in particular the HW methods that allow us to deal with univariate time series which contain both trend and seasonal factors. Their popularity is due to their simple model formulation and good forecasting results (Gardner 1985).
HW is the label we frequently give to a set of procedures that form the core of the exponential smoothing family of forecasting methods. The basic structures were provided by Holt in 1957 and his student Winters in 1960 (Holt 1957). Its basic idea is to decompose a time series into a linear trend component, seasonal variation component, and random change component, incorporating the exponential smoothing algorithm; the longterm trend (S _{ t }), trends incremental (b _{ t }), and seasonal changes (I _{ t }) are estimated; and then, a predictive model is established to extrapolate the predicted value. Holt’s method is widely used for forecasting as reported in the literatures (Bermúdez et al. 2010; Gelper et al. 2010). It is an extended single exponential smoothing, which allows forecasting data with nonconstant trends and seasonal variations. Thus, it is also capable for detecting trend in different time periods. Kamruzzaman et al. (2011) applied Holt–Winters seasonal forecasting method to find the evidence of nonstationarity in rainfall and temperature. It was claimed that the Holt–Winters method is capable in tracking changes in the level, trend, and also seasonality; the influence of the random motion can be moderately filtered. Therefore, it is particularly suitable for time series prediction containing the trend and seasonal variation.
ARIMA model
ARIMA, also known as Box–Jenkins models (Box and Jenkins 1976), has been a very popular type of time series forecast models in hydrological field. It has the function of transforming a nonstationary time series into a stationary time series, by regressing the lag of independent variable, the present value, and lagged values of random error. ARIMA model, depending on the smoothness of the original sequence and the different part in regression model, includes the moving average (MA), autoregressive process (AR), autoregressive moving average process (ARMA), and ARIMA process.
In general, an ARIMA model is characterized by ARIMA (p, d, q), where, p, q, and d denote the order of autoregression, integration (differencing), and moving average, respectively. The corresponding seasonal multiplicative ARIMA model is represented by ARIMA (p, d, q) × (P, D, Q)^{ s } with P, D, and Q denoting the seasonal autoregression, integration (differencing), and moving average, respectively (see Box and Jenkins 1976; Tankersley and Graham 1993).
Moving average model MA (q)
ARIMA models cannot really cope with seasonal behavior; we see that, compared with ARMA models, ARIMA (p, d, q) only models time series with trends. We will incorporate now seasonal behavior and present a general definition of the seasonal ARIMA models. The idea behind the seasonal ARIMA is to look at what are the best explanatory variables to model a seasonal pattern.
 p

the order of the local or regular AR term
 d

the number of local differences
 q

the order of local or regular MA term
 P

the order of periodic or stationary AR term
 D

the number of periodic differences
 Q

the order of periodic or stationary MA term
 s

the time period of the series.
 (i)
Preparation of data including transformations and differencing.
 (ii)
Identification of the potential models by looking at the sample autocorrelations and the partial autocorrelations.
 (iii)
Estimation of the unknown parameters.
 (iv)
Checking the adequacy of fitted model by performing normal probability plot, and creating a model.
 (v)
Forecast future outcomes based on the known data.
Integrated time series model
The process of modeling involves extracting the components from the known sequence H(t) (t = 1, 2, 3,…, n). The extraction order is the trend component with periodic component, followed by the random component. After the mathematical model has been developed and overlaid linearly, model (10) can be obtained (Yang et al. 2009).
Multiple regression method can be used to determine the undetermined coefficient \( c_{0} ,c_{1} ,c_{2} , \ldots ,c_{k} \) and order k. The method is to use Excel software regression analysis templates to implement. To test the fitting result, the trend curve fitting correlation coefficient R at a certain level of significance is needed to be calculated.
In practical applications, the estimation of the first six harmonics can meet the precision requirement.
If \( s_{j}^{2} = a_{j}^{2} + b_{j}^{2} > 4s^{2} \frac{{\ln \frac{j}{\alpha }}}{n} \), the wave was not significant; otherwise, where \( s^{2} \) is the variance, the calculation formula is: \( s^{2} = \frac{1}{n  1}\sum_{t = 1}^{n} (X_{t}\overline{{X_{t} }} )^{2} \); \( \alpha \) is the significance level test (generally 5 %).
Random component is the last one to be extracted. It can be influenced by many uncertain factors, such as noise. It can be extracted with autoregression method.
Criteria of performance evaluation
The Nash–Sutcliffe model efficiency coefficient is used to assess the predictive power of hydrological models (PulidoCalvo and GutierrezEstrada 2009). An efficiency of 0 (E = 0) indicates that the model predictions are as accurate as the mean of the observed data, whereas an efficiency less than zero (E < 0) occurs when the observed mean is a better predictor than the model or, in other words, when the residual variance (described by the numerator in the expression above) is larger than the data variance (described by the denominator). Essentially, the closer the model efficiency is to 1, the more accurate the model is.
In Eqs. (15), (16), and (17): N is the number of data points used, \( \overline{y}_{i} \) is the average value of observed values over N, \( y_{i} \) is the observed monthly groundwater level, and \( \widehat{y}_{i} \) is the forecasted groundwater level from the model.
Study area and data description
Groundwater level data in the study area was obtained by monitoring the groundwater level each 5 days. The monitoring period continued from January 2000 to December 2011.
The main geological coverage of the study area is coastal plain, plateau, and hilly region. Of which alluvial plain consisting sand and gravel with thin clay accounts for more than 80 % where all the observations wells are located. The study area can be considered as an independent hydrogeological unit due to the sea surrounding on the four sides. Water yield property differs greatly because of lithology and thickness of the aquifer. Groundwater type is coarse porous water, recharged predominantly by rainfall infiltration.
Modeling
To analyze and forecast groundwater table in Dongshan County with three time series methods mentioned above, taking longterm observation well with number 3506260025 as an example, in which monthly average groundwater level was monitored during 2000–2011, the data set from 2000–2009 is used for model establishment, and those of 2010–2011 is used for predicting the dynamic change.
Holt–Winters model
Multiplicative method is selected in EViews 6.0 software to establish HW model, the software automatically select the minimum of sum squared error (SSE) and mean square error (MSE) as the best prediction model. After calculation, when α = 0.98, β = 0, γ = 0, the SSE and MSE were found to get the minimum; therefore, this combination of smoothing coefficient was considered as the optimal prediction model, fitting curve between observed values and the calculated values was shown in Fig. 10.
SARIMA model
According to the description in “ARIMA model,” SARIMA \( (p, d, q) \times (P, D, Q)^{s} \) is created using SPSS 18.0 software with the monthly average groundwater table depth monitored during 2000–2009. In identifying a seasonal model, the first step is to determine whether or not a seasonal difference is needed, in addition to or perhaps instead of a nonseasonal difference. Time series plots and autocorrelation function (ACF) and partial autocorrelation function (PACF) plots for all possible combinations of 0 or 1 nonseasonal difference and 0 or 1 seasonal difference should be considered. The detailed procedure of establishing ARIMA \( \left( {p, d, q} \right) \times \left( {P, D, Q} \right)^{s} \) is described in the following section.
Integrated time series model
ITS model was implemented with Excel 2007. Through decomposition procedures of groundwater level series in “Integrated time series model,” the predicted groundwater table can be obtained by adding the three components; Fig. 7 demonstrates that the groundwater table shows an increasing trend year by year. In natural condition, the groundwater depth should be steady or vary to a certain extent and cannot increase or decrease continuously. So the trend component could reflect the degree of exploitation by human.
Then, Fig. 10 shows a comparison of observed and calculated groundwater levels for training period by three models. It can be seen that all three models reproduced the observed time series well enough.
Results and discussions
Statistical results of accuracy evaluation for groundwater level simulation during 2010–2011
Parameter  Holt–Winters  SARIMA  ITS 

R ^{2}  0.997249  0.996683  0.993105 
E  0.167941  −0.003479  −1.09388 
RMSE  0.180867  0.198626  0.286918 
It can be seen in Fig. 11 that these three models are all suitable to predict groundwater levels. HW model was found to outperform SARIMA and ITS model, which can be observed both from the fitness between the observed and predicted values in Fig. 11 and statistical results of performance evaluation in Table 1. HW model has smaller RMSE than SARIMA and ITS models. The negative values of E for ARIMA and ITS models indicate that the observed is better than model results; however, ARIMA model performs better than ITS model since a smaller absolute E value is obtained.
Since groundwater level dynamic is a complex response to many factors, time series analysis can reflect the influence of human behavior, rainfall and solar activity. With its advantage of being easily implemented, its potential in analyzing and forecasting groundwater level dynamic is overwhelming.
Conclusions
Time series techniques have been widely used in environmental contexts. In particular, its application in hydrological forecasting has been a common practice. This paper explores the utilization of three time series models, HW, SARIMA model, and ITS model, and their potential for forecasting groundwater levels is investigated with a case study in a shallow aquifer of a coastal island, China. The monitored longterm observation monthly groundwater table depth data series from 2000 to 2011 are used in model setup and prediction. The capability to make precise predictions for each model was evaluated with statistical error criteria, coefficient of determination (R ^{2}), Nash–Sutcliffe model efficiency coefficient (E), and RMSE. The results indicate that three models are all accurate in reproducing the historical time series of groundwater levels. The comparison of three models shows that HW model is more accurate in predicting the groundwater levels than SARIMA and ITS models.
HW model is a more sophisticated method of forecasting than methods of moving average and exponential smoothing. The method can include seasonality, which is important, and the overall trend. It is an extension of Holt method of a threeparameter exponential smoothing. It means the method is characterized by three parameters that must be selected to get the forecast. α, β, and γ are parameters that must be selected before the forecasting. The choice of parameters was carried out using “smart” enumeration and minimization of errors on the known data.
Time series analysis methods, HW, ITS, and SARIMA, are explored to simulate the groundwater level in a coastal aquifer, China. The error criteria are estimated using coefficient of determination (R ^{2}), Nash–Sutcliffe model efficiency coefficient (E), and RMSE. The results showed that three models can accurately predict the water table. However, SARIMA model demonstrates more reliable capability compared with ITS and HW model.
Notes
Acknowledgments
This research was financially supported by National Natural Science Foundation of China (Grant number 41402202) and Specialized Research Fund for the Doctoral Program of Higher Education (20130061120084).
References
 Adamowski J, Chan HF (2011) A wavelet neural network conjunction model for groundwater level forecasting. J Hydrol 407:28–40CrossRefGoogle Scholar
 Ahn H (2000) Modeling of groundwater heads based on secondorder difference time series models. J Hydrol 234(1–2):82–94CrossRefGoogle Scholar
 Akaike H (1969) Fitting autoregressive models for prediction. Ann Inst Stat Math 21(1):243–247CrossRefGoogle Scholar
 Behnia N, Rezaeian F (2015) Coupling wavelet transform with time series models to estimate groundwater level. Arab J Geosci. doi: 10.1007/s1251701518290 Google Scholar
 Bermúdez JD, Segura JV, Vercheri E (2010) Bayesian forecasting with the Holt–Winters model. J Oper Res Soc 61:164–171CrossRefGoogle Scholar
 Box GEP, Jenkins GM (1976) Series analysis forecasting and control. PrenticeHall Inc., LondonGoogle Scholar
 CastellanoMéndez M, GonzálezManteiga W, FebreroBande M, ParadaSánchez JM, LozanoCalderón R (2004) Modelling of the monthly and daily behavior of the runoff of the Xallas river using Box–Jenkins and neural networks methods. J Hydrol 296(1–4):38–58CrossRefGoogle Scholar
 Gardner ES Jr (1985) Exponential smoothing, the state of the art. J Forecast 4:1–28CrossRefGoogle Scholar
 Gelper S, Fried R, Croux C (2010) Robust forecasting with exponential and Holt–Winters smoothing. J Forecast 29:285–300Google Scholar
 Holt CC (1957) Forecasting trends and seasonals by exponentially weighted averages. In: Carnegie Institute of Technology. Pittsburgh ONR memorandum no. 52Google Scholar
 Konikow LF, Kendy E (2005) Groundwater depletion, a global problem. Hydrogeol J 13(1):317–320CrossRefGoogle Scholar
 McNeil AJ, Frey R, Embrechts P (2005) Quantitative risk management: concepts, techniques, and tools. Princeton University Press, PrincetonGoogle Scholar
 Nayak P, Rao YRS, Sudheer KP (2006) Groundwater level forecasting in a shallow aquifer using artificial neural network approach. Water Resour Manag 20(1):77–90CrossRefGoogle Scholar
 Nourani V, Ejlali RG, Alami MT (2010) Spatiotemporal groundwater level forecasting in coastal aquifers by hybrid Artificial Neural NetworkGeostatisics model: a case study. Environ Eng Sci 28:217CrossRefGoogle Scholar
 PulidoCalvo I, GutierrezEstrada JC (2009) Improved irrigation water demand forecasting using a softcomputing hybrid model. Biosyst Eng 102(2):202–218CrossRefGoogle Scholar
 Sethi RR, Kumar A, Sharma SP, Verma HC (2010) Prediction of water table depth in a hard rock basin by using artificial neural network. Int J Water Resour Environ Eng 2(4):95–102Google Scholar
 Sreekanth P, Geethanjali DN, Sreedevi PD, Ahmed S, Kumar NR, Jayanthi PDK (2009) Forecasting groundwater level using artificial neural networks. Curr Sci 96(7):933–939Google Scholar
 Tsanis IK, Coulibay P, Daliakopoulos N (2008) Improving groundwater level forecasting with a feedforward neural network and linearly regressed projected precipitation. J Hydroinformatics 10(4):317–330CrossRefGoogle Scholar
 USGS (2010) http://ga.water.usgs.gov/edu/gwdepletion.html. Accessed 02 July 2010
 Wong H, Wai CI, Zhang RQ, Xia J (2007) Nonparametric time series models for hydrological forecasting. J Hydrol 332:337–347CrossRefGoogle Scholar
 Yang ZP, Lu WX, Long YQ, Li P (2009) Application and comparison of two prediction models for groundwater levels. A case study in Western Jilin Province, China. J Arid Environ 73:487–492CrossRefGoogle Scholar
 Zhao Y, Wang H (2007) Forecasting total circular volume of air railway transport with Holt–Winters model. J Civ Aviat Univ China 25(2):1–11Google Scholar
 Zheng DW (1989) Time series analysis and studies of earth rotation. Prog Astron 7(2):118–124Google Scholar
Copyright information
Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.