Introduction

Groundwater is a valuable resource for domestic, irrigation and industrial uses. In China, a large share of the water supply comes from groundwater, which further increases its importance. It is therefore essential to predict the dynamics of the groundwater table in order to protect and sustain groundwater resources. At the natural scale, the groundwater level reflects the dynamic balance of storage and is affected by many factors, such as recharge driven by climatic processes and discharge to surface water. The groundwater system is inherently characterized by complexity, nonlinearity, multiple time scales and randomness, and is influenced by natural and/or anthropogenic factors, all of which complicate dynamic prediction (Yang et al. 2015).

The literature indicates that the empirical time series models proposed by Box and Jenkins (1976) and Hipel and McLeod (1994) can be used to predict long time series of water table depth, and other empirical approaches have been widely applied to water table depth prediction (Knotters and van Walsum 1997). Although conceptual and physically based models are the main tools for describing hydrological variables and understanding the physical processes taking place in a system, they have practical limitations. When data are insufficient and obtaining accurate predictions matters more than representing the underlying physics, empirical models remain a good alternative and can provide useful results without costly calibration time (Daliakopoulos et al. 2005; Zhao et al. 2014). Unfortunately, empirical models are not adequate for making predictions when the dynamic behaviour of the hydrological system changes with time (Bierkens 1988). Subsequently, several non-empirical models have been proposed for modelling groundwater table depth (Bras and Rodriguez-Iturbe 1985; Lin and Lee 1992; Brockwell and Davis 2010; Doglioni et al. 2010). Time series models and artificial neural network (ANN) models are ‘black box’ models capable of representing a dynamic system. In recent years, artificial neural networks have been proposed as a promising alternative for time series forecasting, and many successful applications have shown that they provide an attractive tool for time series modelling. Among them, the radial basis function neural network (RBFNN) is widely used for nonlinear system identification; it is characterized by a simpler structure, faster convergence, fewer parameters and smaller extrapolation errors, and it is computationally efficient (Girosi and Poggio 1990; Xie et al. 2011).

The theory of the grey system was established during the 1980s for the purpose of making quantitative predictions. Systems that lack information, such as structure, operating mechanism and behaviour documentation, are referred to as grey systems, where “grey” means poor, incomplete or uncertain information. Grey theory has received increasing application in the field of hydrology (Xu et al. 2008). Several grey models exist; among them, the GM (1, 1) model is relatively simple yet can achieve high prediction precision (Yang et al. 2015). The GM (1, 1) model is part of a multidisciplinary theory dealing with systems for which information is lacking. From this point of view, the dynamics of the groundwater level can be regarded as a typical grey system problem, and the GM (1, 1) model can reflect its changing features well. It is particularly suited to the analysis and modelling of short time series with few statistical data and incomplete information about the system, and it has been widely applied (Deng 2002).

To date, no comparison between the GM (1, 1) time series model and the RBFNN model for the prediction of groundwater level depth has been reported. In this study, we evaluate the potential of both models for simulating groundwater tables in an aquifer in Longyan City, Fujian Province, Southeast China, and compare the simulated results using the root mean square error (RMSE), mean absolute error (MAE) and R2.

Methodology

RBFNN model

Neural networks have gone through two major development periods: the early 1960s and the mid-1980s. Many types of artificial neural networks (ANNs) have since been used for time series forecasting, and they were a key development in the field of machine learning. Artificial neural networks were inspired by biological findings on the behaviour of the brain as a network of units called neurons (Rumelhart et al. 1986).

Architecture of radial basis function neural network

A radial basis function neural network is composed of a large number of simple, highly interconnected artificial neurons organized into several layers, i.e. an input layer (X), a hidden layer (H) and an output layer (Y) (Govindaraju and Rao 2000; Haykin 1999). Figure 1 shows the neural network’s topology.

Fig. 1

Structure of the radial basis function neural network model

RBFNN learning

The RBF neural network learning algorithm aims to determine three parameters: \(c_{i}\) (the centre of the ith unit in the hidden layer), \(\sigma\) (the width parameter) and \(\omega_{ij}\) (the connecting weight between the ith hidden unit and the jth output unit) (Huang et al. 2003).

Input layer An input pattern enters the input layer and passes through a direct (identity) transfer function, so the output of the input layer is the same as the input pattern. The number of nodes in the input layer equals the dimension of the input vector L.

The output of the input layer for element Ii (i = 1 to L) is therefore simply Ii.

Hidden layer The hidden layer transforms the data from the input space to the hidden space using a nonlinear function. Among the many possible activation functions, the most commonly used is the Gaussian function (Schwenker et al. 2001), which can be defined as follows:

$$R\left( {x_{p} - c_{i} } \right) = \exp \left( { - \frac{1}{{2\sigma^{2} }}\, ||x_{p} - c_{i} ||^{2}} \right),$$
(1)

where \(||x_{p} - c_{i}||\) denotes the Euclidean norm; \(c_{i}\) is the centre of the ith unit in the hidden layer; \(\sigma\) is the width parameter; \(R\left( {x_{p} - c_{i} } \right)\) is the response of the ith hidden unit to the input pattern \(x_{p}\); and h is the number of hidden units (Wang et al. 2013).

The output layer is linear and serves as a summation unit. The activity of the jth unit in the output layer \(y_{j}\) can be calculated according to:

$$y_{j} = \mathop \sum \limits_{i = 1}^{h} \omega_{ij} \exp \left( { - \frac{1}{{2\sigma^{2} }}\, ||x_{p} - c_{i}||^{2} } \right),$$
(2)

where \(\omega_{ij}\) is the connecting weight between the ith hidden unit and the jth output unit.

In brief, the RBF neural network model learning is constructed following three steps:

Step 1. Initialize the centres using a clustering method.

Step 2. Compute the centre width \(\sigma\), which can be obtained from

$$\sigma_{i} = \frac{{c_{\hbox{max} } }}{{\sqrt {2h} }}\quad i = 1,2, \ldots ,h,$$
(3)

where \(c_{\hbox{max} }\) is the maximum distance between the centres of the hidden units.

Step 3. Collect the hidden-layer responses in a matrix \(G\) with elements

$$G_{pi} = \exp \left( { - \frac{h}{{c_{\hbox{max} }^{2} }}\, ||x_{p} - c_{i} ||^{2} } \right)\quad i = 1,2, \ldots ,h;\quad p = 1,2,3, \ldots ,P ,$$
(4)

and calculate the connecting weights between the hidden units and the output unit by least squares estimation, \(\omega = \left( {G^{T} G} \right)^{ - 1} G^{T} y\), where \(y\) is the vector of target outputs.
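For concreteness, these three steps can be sketched in MATLAB as follows. This is only an illustrative implementation of the formulation above (k-means standing in for the clustering of Step 1, the width of Eq. (3) and a least-squares solve for the output weights; kmeans, pdist and pdist2 come from the Statistics and Machine Learning Toolbox); it is not the routine used in this study.

```matlab
function [c, sigma, w] = rbf_train(X, y, h)
% X: P-by-L matrix of input patterns, y: P-by-1 target vector, h: number of hidden units.
[~, c] = kmeans(X, h);                           % Step 1: centres from a clustering method
cmax   = max(pdist(c));                          % maximum distance between centres
sigma  = cmax / sqrt(2 * h);                     % Step 2: width, Eq. (3)
G      = exp(-pdist2(X, c).^2 / (2 * sigma^2));  % hidden responses, Eqs. (1) and (4)
w      = G \ y;                                  % Step 3: output weights by least squares
end

function yhat = rbf_predict(X, c, sigma, w)
% Network output, Eq. (2): weighted sum of the Gaussian hidden responses.
yhat = exp(-pdist2(X, c).^2 / (2 * sigma^2)) * w;
end
```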

Evaluation criteria

To evaluate the effectiveness of each network in its ability to make precise predictions, the root mean square error (RMSE) criterion is used in this paper. It is calculated by:

$${\text{RMSE}} = \sqrt {\frac{1}{n}\mathop \sum \limits_{i = 1}^{n} (y_{i} - \bar{y}_{i} )^{2} } ,$$
(5)

where \(y_{i}\) is the observed data, \(\bar{y}_{i}\) the estimated data and n the number of observations. The lower the values of RMSE, the more precise is the prediction.

GM (1, 1) model

There are three kinds of information: white information is fully known, grey information is partly known and black information is completely unknown (Deng 1982, 1989). The GM (1, 1) model belongs to a multidisciplinary theory dealing with systems that lack information; GM (1, 1) denotes a first-order grey differential equation model with a single variable. The dynamics of the groundwater level is controlled by many interrelated factors and is complicated and not well understood; from this point of view, grey system theory provides one way to study such a system (Xu et al. 2008). The modelling process of the grey system theory can be summarized as follows:

  1.

    Suppose there is a series of discrete nonnegative data as

    $$X^{\left( 0 \right)} \left( m \right) = \left\{ {x^{\left( 0 \right)} \left( 1 \right),x^{\left( 0 \right)} \left( 2 \right), \ldots ,x^{\left( 0 \right)} \left( n \right)} \right\}.$$
    (6)
  2.

    Accumulate the discrete data above once to obtain a new series, that is

    $$X^{\left( 1 \right)} \left( m \right) = \left\{ {x^{\left( 1 \right)} \left( 1 \right),x^{\left( 1 \right)} \left( 2 \right), \ldots ,x^{\left( 1 \right)} \left( n \right)} \right\},$$
    (7)

    where \(x^{\left( 1 \right)} \left( m \right) = \mathop \sum \limits_{i = 1}^{m} x^{\left( 0 \right)} \left( i \right), m = 1,2, \ldots ,n.\)

  3.

    According to the GM (1, 1) model, the differential equation of the new sequence can be described as follows:

    $$\frac{{{\text{d}}x^{\left( 1 \right)} \left( t \right)}}{{{\text{d}}t}} + ax^{\left( 1 \right)} \left( t \right) = b,\quad t \in [0,\infty ).$$
    (8)
  4.

    Suppose \(\hat{a} = \left( {a,b} \right)^{T}\); then \(\hat{a}\) can be calculated by the least squares estimation as

    $$\hat{a} = \left( {a,b} \right)^{T} = \left( {B^{T} B} \right)^{ - 1} B^{T} Y,$$
    (9)
    $${\text{in which}}\quad B = \begin{pmatrix} { - \frac{1}{2}\left( {x^{\left( 1 \right)} \left( 1 \right) + x^{\left( 1 \right)} \left( 2 \right)} \right)} & 1 \\ { - \frac{1}{2}\left( {x^{\left( 1 \right)} \left( 2 \right) + x^{\left( 1 \right)} \left( 3 \right)} \right)} & 1 \\ \vdots & \vdots \\ { - \frac{1}{2}\left( {x^{\left( 1 \right)} \left( {n - 1} \right) + x^{\left( 1 \right)} \left( n \right)} \right)} & 1 \\ \end{pmatrix},\quad Y = \begin{pmatrix} {x^{\left( 0 \right)} \left( 2 \right)} \\ {x^{\left( 0 \right)} \left( 3 \right)} \\ \vdots \\ {x^{\left( 0 \right)} \left( n \right)} \\ \end{pmatrix}.$$
    (10)
  5.

    The approximate time response function for \(\hat{x}^{\left( 1 \right)}\) is as follows:

    $$\hat{x}^{\left( 1 \right)} \left( {m + 1} \right) = \left[ {x^{\left( 0 \right)} \left( 1 \right) - \frac{b}{a}} \right]e^{ - am} + \frac{b}{a}.$$
    (11)
  6.

    \(\hat{x}^{\left( 0 \right)}\) can be restored as

    $$\hat{x}^{\left( 0 \right)} \left( {1 + t} \right) = \hat{x}^{\left( 1 \right)} \left( {1 + t} \right) - \hat{x}^{\left( 1 \right)} \left( t \right).$$
    (12)

Thus, the grey forecasting model of \(\hat{x}^{\left( 0 \right)}\) is as follows:

$$\hat{x}^{\left( 0 \right)} \left( m \right) = (1 - e^{a} )\left[ {x^{\left( 0 \right)} \left( 1 \right) - \frac{b}{a}} \right]e^{ - a(m - 1)} .$$
(13)
  7.

    Before forecasting the groundwater level, the after-test residue method should be used to test the accuracy of the method (Chen et al. 1994).

Absolute error of samples:

$$\varepsilon^{\left( 0 \right)} \left( k \right) = x^{\left( 0 \right)} \left( k \right) - \hat{x}^{\left( 0 \right)} (k).$$
(14)

The mean of \(\varepsilon^{\left( 0 \right)} \left( k \right)\) and \(x^{\left( 0 \right)} \left( k \right)\):

$$\bar{\varepsilon } = \frac{1}{n}\mathop \sum \limits_{k = 1}^{n} \varepsilon^{\left( 0 \right)} (k),$$
(15)
$$\bar{x} = \frac{1}{n}\mathop \sum \limits_{k = 1}^{n} x^{\left( 0 \right)} \left( k \right).$$
(16)

The variance of \(\varepsilon^{\left( 0 \right)} \left( k \right)\) and \(x^{\left( 0 \right)} \left( k \right)\):

$$S_{1}^{2} = \frac{1}{n}\mathop \sum \limits_{k = 1}^{n} (\varepsilon^{(0)} \left( k \right) - \bar{\varepsilon })^{2} ,$$
(17)
$$S_{2}^{2} = \frac{1}{n}\mathop \sum \limits_{k = 1}^{n} (x^{\left( 0 \right)} (k) - \bar{x})^{2} .$$
(18)

The accuracy of the model can be examined by the small-error probability:

$$P = P\left\{ {\left| {\varepsilon^{(0)} \left( k \right) - \bar{\varepsilon }} \right| < 0.6745S_{2} } \right\}.$$
(19)

The posterior error of the model is:

$$C = \frac{{S_{1} }}{{S_{2} }}.$$
(20)

The precision of the model = max {the grade of P, the grade of C}.

The value ranges of P and C that determine the accuracy grade of the GM (1, 1) model are shown in Table 1.

Table 1 The predicted grade for the GM (1, 1) model
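As a reference implementation of the modelling steps (6)–(13) and the after-test residue check (14)–(20), the following MATLAB sketch uses only base MATLAB; it is illustrative, written from the equations above rather than taken from the code used in this study.

```matlab
function [a, b] = gm11_fit(x0)
% Fit GM(1,1) to a nonnegative series x0, Eqs. (6)-(10).
x0 = x0(:);  n = numel(x0);
x1 = cumsum(x0);                          % 1-AGO series, Eq. (7)
z1 = 0.5 * (x1(1:n-1) + x1(2:n));         % background values
B  = [-z1, ones(n-1, 1)];                 % Eq. (10)
Y  = x0(2:n);
ab = (B' * B) \ (B' * Y);                 % least squares estimate, Eq. (9)
a  = ab(1);  b = ab(2);
end

function x0_hat = gm11_forecast(x0, a, b, steps)
% Time response and restoration, Eqs. (11)-(12).
m      = (0:steps-1)';
x1_hat = (x0(1) - b/a) * exp(-a * m) + b/a;   % Eq. (11)
x0_hat = [x0(1); diff(x1_hat)];               % Eq. (12)
end

function [C, P] = gm11_check(x0, x0_hat)
% After-test residue check, Eqs. (14)-(20).
e  = x0(:) - x0_hat(:);                       % absolute errors, Eq. (14)
S1 = std(e, 1);                               % Eq. (17), 1/n normalisation
S2 = std(x0(:), 1);                           % Eq. (18)
C  = S1 / S2;                                 % posterior error ratio, Eq. (20)
P  = mean(abs(e - mean(e)) < 0.6745 * S2);    % small-error probability, Eq. (19)
end
```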

Application

Study area

Longyan City is located in the western part of Fujian Province in Southeast China, between 115°51′E–117°45′E and 24°23′N–26°02′N. It consists of Changting County, Shanghang County, Yongding District, Liancheng County, Wuping County, Zhangping City and Xinluo District, and covers an area of about 19,027 km². Figure 2 shows the outline location map of the study area. The area is characterized by a subtropical marine monsoon climate. The annual average rainfall is about 1457.87 mm and the average evaporation about 1530.33 mm. Rainfall is concentrated from April to September, accounting for 74.5–80 % of the annual precipitation.

Fig. 2

The outlined location map of the study area

Modelling

The RBFNN model

Preparations for neural network

Considering the dynamic change of groundwater, its influencing factors and the actual situation in the study area, well #1138 is taken as an example for groundwater level simulation. As the aquifer is unconfined, the groundwater level is influenced by many factors, including the river, runoff, precipitation, evaporation, artificial groundwater extraction and so on. Given the limitations of the monitored data, the numbers of input and output layer neurons are 2 and 1, respectively. The monitored items are X1 (precipitation), X2 (evaporation) and Y (groundwater level). The number of hidden layer neurons is adjusted during RBF network training.

To avoid the errors between different units in the sample data, the original data should be standardized as follows:

$$x_{j}^{'} = \frac{{x_{j} }}{{x_{j\hbox{max} } + x_{j\hbox{min} } }},$$
(21)

where \(x_{j}^{'}\) is the standardized value of the sample; \(x_{j}\) the original value of the sample; \(x_{j\hbox{max} }\) the maximal value of \(x_{j}\); and \(x_{j\hbox{min} }\) the minimal value of \(x_{j}\). With this transformation, each input lies in the range 0–1 (Zhang et al. 2012). After running the model, the final prediction results are recovered with Eq. (22):

$$x_{j} = \hat{x}^{'}_{j} \left( {x_{j\hbox{max} } + x_{j\hbox{min} } } \right),$$
(22)

where \(\hat{x}^{'}_{j}\) is the simulated value of \(x_{j}^{'}\).
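A minimal MATLAB sketch of this scaling, applied column-wise to a sample matrix x, is shown below (illustrative; xhat_scaled denotes the simulated, scaled output to be converted back to original units):

```matlab
denom    = max(x) + min(x);       % x_jmax + x_jmin for each column j
x_scaled = x ./ denom;            % Eq. (21)
% ... train and simulate the network on the scaled data ...
x_back   = xhat_scaled .* denom;  % Eq. (22): return to the original units
```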

The monthly average groundwater tables are used as samples, a total of 108 from January 2003 to December 2011. The 80 samples from January 2003 to August 2009 are used as training samples, and the remaining 28 samples (September 2009 to December 2011) as the testing samples.

In this case, the MATLAB platform is employed to construct the training and checking sets, pre-process the original data and evaluate the results of the neural network.

Its function format can be defined as follows:

$${\text{net}} = {\text{newrb}}\left( {p,t,{\text{goal}},{\text{spread}},{\text{MN}},{\text{DF}}} \right),$$

where p and t are the input vector and target vector, respectively; goal = 0.0001 is the mean squared error goal; spread = 3.5 is the spread of the radial basis functions; MN = 80 is the maximum number of neurons; and DF = 1 is the number of neurons to add between displays.
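With the parameter values listed above, the call would take the following form (shown for concreteness; p is the 2-by-N matrix of standardized inputs, t the 1-by-N vector of standardized groundwater levels and p_test the test inputs):

```matlab
goal   = 0.0001;             % mean squared error goal
spread = 3.5;                % spread of the radial basis functions
MN     = 80;                 % maximum number of neurons
DF     = 1;                  % number of neurons to add between displays
net    = newrb(p, t, goal, spread, MN, DF);   % requires the Deep Learning Toolbox
y_sim  = sim(net, p_test);   % simulate the trained network on the test inputs
```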

RBFNN training and testing

The 28 samples from September 2009 to December 2011, numbered No. 1 to No. 28, were used to test the trained RBFNN. By comparing the calculated and observed values of the groundwater level, we can judge the performance of the network. During the test period, the RBFNN model is used to compute the monthly groundwater level for observation well #1138. Figure 3 shows the median absolute percentage error (MdAPE).

The maximum median absolute percentage error of the network over the 28 test samples is 0.253 %, and the root mean square error (RMSE) between the values computed by the RBFNN model and the observed data is 0.307. These low errors, together with the reasonable match between the computed and observed groundwater levels shown in Fig. 3, indicate that the model can be used to predict the monthly groundwater level.

Fig. 3

The median absolute percentage error of the test samples by the RBFNN model

GM (1, 1) model

The GM (1, 1) model is a classical model among the grey forecasting models. Following the modelling steps described in “GM (1, 1) model”, the same well #1138 used for the RBFNN model is taken as an example to test the model. Taking the January data for 2003–2011 as the original series, we obtain the following results.

  1.

    The observed data are converted into a new data series by a preliminary transformation called AGO (accumulated generating operation):

    $$X^{\left( 0 \right)} \left( m \right) = \left\{ {343.847,343.335,344.971,344.104,343.933,343.708,343.003,343.971,343.435} \right\},$$
    $$X^{\left( 1 \right)} \left( m \right) = \left\{ {x^{\left( 1 \right)} \left( 1 \right),x^{\left( 1 \right)} \left( 2 \right), \ldots ,x^{\left( 1 \right)} \left( 9 \right)} \right\} = \{ 343.383,686.112,1028.828,1371.743,1714.868,2057.490,2400.027,2742.672,3085.378\} .$$
  2.

    a and b are calculated using least squares estimation:

    $$\hat{a} = 0.0001,\quad \hat{b} = 343.851.$$
  3.

    The groundwater level prediction model of January is:

    $$\hat{x}^{(0)} \left( t \right) = 342.864e^{ - 0.0001(t + 1)} .$$
    (23)

Therefore, using the predictor formula (23), we can obtain the predicted groundwater level for each January from 2003 to 2011. Figure 4 shows the median absolute percentage error (MdAPE) of the predicted groundwater levels.
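Using the sketch functions given in “GM (1, 1) model”, the January model can be reproduced approximately as follows (illustrative; small differences from the reported parameters are possible):

```matlab
x0     = [343.847 343.335 344.971 344.104 343.933 343.708 343.003 343.971 343.435];
[a, b] = gm11_fit(x0);                        % parameters of the January model
x0_hat = gm11_forecast(x0, a, b, numel(x0));  % predicted January levels, 2003-2011
[C, P] = gm11_check(x0, x0_hat);              % posterior error test, graded with Table 1
```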

Fig. 4

The median absolute percentage error of the test samples by the GM (1, 1) model

Figure 4 illustrates that the maximal MdAPE of the predicted results is less than 0.5 %. For the after-test check of the prediction model, \(0.35 < C = \frac{{S_{1} }}{{S_{2} }} = 0.491 \le 0.5\) and P = 1 > 0.95, and the model precision is grade I. It can be seen in Fig. 5 that the model follows the same tendency as the observed groundwater level, so this model is reliable and accurate and can be used to predict the groundwater level.

Fig. 5

The observed and forecast groundwater levels by the GM (1, 1) model

Results and discussions

To assess the models’ performance, the 28 monthly average groundwater levels monitored from September 2009 to December 2011 were used to make forecasts with the two models. The comparison of the observed groundwater levels with those forecasted by the RBFNN and GM (1, 1) models is given in Fig. 6, which shows that the groundwater levels forecasted by the RBFNN model fit the observed values better. To evaluate the accuracy of each model quantitatively, the root mean square error (RMSE), mean absolute error (MAE) and correlation coefficient (R2) are computed. They are defined as

$${\text{RMSE}} = \sqrt {\frac{{\mathop \sum \nolimits_{t = 1}^{n} (x_{t} - \hat{x}_{t} )^{2} }}{n}} ,$$
(24)
$${\text{MAE}} = \mathop \sum \limits_{t = 1}^{n} \frac{{\left| {x_{t} - \hat{x}_{t} } \right|}}{n},$$
(25)
$$R^{2} = 1 - \frac{{\mathop \sum \nolimits_{t = 1}^{n} \left( {x_{t} - \hat{x}_{t} } \right)^{2} }}{{\mathop \sum \nolimits_{t = 1}^{n} x_{t}^{2} - \frac{{\left( {\mathop \sum \nolimits_{t = 1}^{n} x_{t} } \right)^{2} }}{n}}},$$
(26)

where \(\hat{x}_{t}\) is the estimated value at time t, \(x_{t}\) the observed value at time t and n the number of time steps.
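For reference, these criteria translate directly into MATLAB for column vectors x (observed) and xhat (forecast) of length n; the snippet below is a direct transcription of Eqs. (24)–(26):

```matlab
n    = numel(x);
rmse = sqrt(sum((x - xhat).^2) / n);                        % Eq. (24)
mae  = sum(abs(x - xhat)) / n;                              % Eq. (25)
r2   = 1 - sum((x - xhat).^2) / (sum(x.^2) - sum(x)^2 / n); % Eq. (26)
```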

Fig. 6

The observed groundwater levels and those forecasted by the RBFNN and GM (1, 1) models

It is known that the RMSE describes the average magnitude of the errors between the observed values and the calculated results. The MAE is the average of the absolute errors and measures how close the simulated values are to the observed values. The lower the RMSE and MAE, the more precise the prediction. R2 measures the degree of correlation between the observed and simulated values; a perfect fit between the observed and estimated values would give R2 = 1. Table 2 summarizes the accuracy of the forecast models. Both models developed in this paper fit the observations well and can be used to predict the monthly groundwater level. However, the RMSE values of the RBFNN and GM (1, 1) models are 0.30715 and 0.41941, and the MAE values are 0.24233 and 0.30560, respectively. These results indicate that the RBFNN model fits better than the GM (1, 1) model for this case study (Fig. 7).

Table 2 Model prediction accuracy results
Fig. 7

Groundwater level forecast results

Conclusions

In this paper, the radial basis function neural network (RBFNN) and GM (1, 1) models are employed to predict monthly groundwater level fluctuations and to investigate the suitability of the two models. Their effectiveness and capability of predicting groundwater levels are assessed with the RMSE, MAE and R2. The results indicate that both models reproduce the groundwater levels accurately. However, the RMSE, MAE and R2 values show that the RBFNN model is more competent in forecasting the groundwater level than the GM (1, 1) model. The RBFNN model, built on historical monitoring data of the groundwater level, predicts the future state of the groundwater system from past behaviour and is therefore applicable to areas with long-term monitoring data. It has been widely used for nonlinear system identification because of its simple topological structure and its ability to reveal how learning proceeds in an explicit manner. The GM (1, 1) model belongs to a multidisciplinary theory dealing with systems that lack information, using a black–grey–white colour spectrum to describe a complex system whose characteristics are only partially known or known with uncertainty. However, in the GM (1, 1) model, the parameters a and b are fixed once determined and do not change with time regardless of the number of values, which limits GM (1, 1) to short-term forecasts. As more factors enter the system over time, the accuracy of the prediction model weakens as the forecast moves farther from the origin. Despite the higher reliability of the RBFNN model, overfitting remains a problem that needs further study.