
Deep learning-based uncertainty quantification of groundwater level predictions

  • Original Paper
  • Published in Stochastic Environmental Research and Risk Assessment

Abstract

Due to the uncertainty underlying groundwater level (GWL) modeling, point predictions of GWLs do not provide sufficient information for decision-making and management purposes. Estimating prediction intervals (PIs) for groundwater models can therefore be an important step toward sustainable water resources management. In this paper, PIs were estimated for the GWLs of selected piezometers of the Ardabil plain in northwestern Iran and the Qorveh–Dehgolan plain in western Iran, using bootstrap methods based on artificial neural networks (ANNs). For this purpose, the classic feedforward neural network (FFNN) and the deep learning (DL)-based long short-term memory (LSTM) network were used as the ANN bases, with the classic bootstrap and the moving blocks bootstrap (MBB) as the bootstrap variants. Monthly GWL data of selected piezometers, as well as hydrologic data of the related stations in both plains, were used for training and validating the models. The results showed that in the point prediction task the LSTM outperforms both the seasonal autoregressive integrated moving average model with exogenous inputs (SARIMAX), which is linear, and the classic FFNN. Moreover, in terms of PI performance, the LSTM-based MBB (MBLSTM) achieved on average a 30% lower coverage width criterion (CWC) than the FFNN-based MBB (MBFN) and a 40% lower CWC than the FFNN-based classic bootstrap (BFN). In addition, PIs estimated for piezometers situated in areas with high transmissivity yielded 55% lower CWC than PIs estimated for piezometers located in areas with lower transmissivity.


Availability of data and material

The data will be available upon request.

Code availability

The code will be available upon request.


Funding

None.

Author information

Corresponding author

Correspondence to Vahid Nourani.

Ethics declarations

Conflicts of interest

The authors declare no conflict of interest.

Ethical approval

Not applicable.

Consent to participate

Not applicable.

Consent to publish

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix 1: Details of point prediction models

1.1 Feedforward neural network

The target data for the FFNN models were the GWL at time step t, and the inputs were selected from the GWL, average monthly temperature and precipitation data of previous time steps (e.g., t−1, t−2, …). The functional form of the FFNN model can be presented as:

$$ GWL_{t} = f_{FFNN} \left( {GWL_{t - 1} ,GWL_{t - 2} , \ldots ,GWL_{t - i} ,P_{t - 1} ,P_{t - 2} , \ldots ,P_{t - j} ,T_{t - 1} ,T_{t - 2} , \ldots ,T_{t - k} } \right) $$
(17)

where \(GWL_{t}\), \(P_{t}\) and \(T_{t}\) represent the GWL, average monthly precipitation and average monthly temperature at time step t, respectively, and i, j and k are the dominant lag times of each feature. These lags are determined using the mutual information (MI) measure between potential inputs and targets, calculated via Eqs. 18–20.

$$ H\left( X \right) = - \mathop \sum \limits_{x} p\left( x \right)\log p\left( x \right) $$
(18)
$$ H\left( {X,Y} \right) = - \mathop \sum \limits_{x,y} p\left( {x,y} \right)\log p\left( {x,y} \right) $$
(19)
$$ MI\left( {X,Y} \right) = H\left( X \right) + H\left( Y \right) - H\left( {X,Y} \right) $$
(20)

where X and Y are discrete random variables, p is the probability mass function and H denotes the entropy. However, because the LSTM can find the optimal dependencies dynamically at each prediction (Gers et al. 2002), MI-based lag time selection is not applied to the LSTM inputs.
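
As an illustration, the following Python sketch shows one way the MI measure of Eqs. 18–20 could be used to screen candidate lags of a predictor. The histogram-based discretization, the maximum lag of 12 months and the selection threshold are assumptions made for this example, not values reported in the paper.

    import numpy as np

    def entropy(values, bins=10):
        """Shannon entropy of a discretized series (Eq. 18)."""
        counts, _ = np.histogram(values, bins=bins)
        p = counts / counts.sum()
        p = p[p > 0]
        return -np.sum(p * np.log(p))

    def joint_entropy(x, y, bins=10):
        """Joint Shannon entropy of two discretized series (Eq. 19)."""
        counts, _, _ = np.histogram2d(x, y, bins=bins)
        p = counts / counts.sum()
        p = p[p > 0]
        return -np.sum(p * np.log(p))

    def mutual_information(x, y, bins=10):
        """MI(X, Y) = H(X) + H(Y) - H(X, Y)  (Eq. 20)."""
        return entropy(x, bins) + entropy(y, bins) - joint_entropy(x, y, bins)

    def dominant_lags(feature, target, max_lag=12, bins=10, threshold=0.05):
        """Rank candidate lags of `feature` by MI with `target`.

        `max_lag`, `bins` and `threshold` are illustrative choices.
        """
        scores = {}
        for lag in range(1, max_lag + 1):
            x = feature[:-lag]   # feature at time t - lag
            y = target[lag:]     # target at time t
            scores[lag] = mutual_information(x, y, bins)
        return {lag: mi for lag, mi in scores.items() if mi > threshold}

For the FFNN inputs of Eq. 17, such a routine would be run once per feature (GWL, precipitation, temperature) against the target GWL series to obtain i, j and k.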

1.2 Long short-term memory

The LSTM cell consists of four gates, each with an activation function; the gates are trainable and act as neurons. These are the input, forget, modulation and output gates (see Fig. 9).

Fig. 9 LSTM cell architecture (Nourani & Behfar 2021)

The process in an LSTM cell is described via Eqs. 21–25 (Hochreiter and Schmidhuber 1997) as:

$$ f_{t} = \sigma_{f} \left( {W_{data}^{f} x_{t} + W_{state}^{f} h_{t - 1} + b^{f} } \right)\quad \left( {forget\;gate} \right) $$
(21)
$$ i_{t} = \sigma_{i} \left( {W_{data}^{i} x_{t} + W_{state}^{i} h_{t - 1} + b^{i} } \right)\quad \left( {input\;gate} \right) $$
(22)
$$ g_{t} = \sigma_{g} \left( {W_{data}^{g} x_{t} + W_{state}^{g} h_{t - 1} + b^{g} } \right)\quad \left( {modulation\;gate} \right) $$
(23)
$$ o_{t} = \sigma_{o} \left( {W_{data}^{o} x_{t} + W_{state}^{o} h_{t - 1} + b^{o} } \right)\quad \left( {output\;gate} \right) $$
(24)

where W and b are the weights and biases of each gate, and h and x are the hidden state and the input vector, respectively. These gates enable the LSTM to decide whether to forget information (via the weights) or pass it to the next cell as:

$$ c_{t} = f_{t} \odot c_{t - 1} + i_{t} \odot g_{t} $$
(25)

where ft, it and gt are the resultant vectors of the forget, input and modulation gates, respectively (Eqs. 21–23). The cell state ct is updated via Eq. 25, in which ⊙ denotes element-wise multiplication, and:

$$ h_{t} = o_{t} \odot \sigma_{h} \left( {c_{t} } \right) $$
(26)

where σh is the non-linear activation function, ot is computed by the output gate (Eq. 24), and the hidden state ht derived from the output gate is passed to the next cell (Rußwurm and Körner 2017).
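
As a concrete illustration of Eqs. 21–26, the following NumPy sketch advances a single LSTM cell by one time step. The use of the logistic sigmoid for the forget, input and output gates, and of tanh for the modulation gate and σh, follows the standard convention and is an assumption of this sketch rather than a detail reported in the paper.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def lstm_cell_step(x_t, h_prev, c_prev, W_data, W_state, b):
        """One LSTM cell step following Eqs. 21-26.

        W_data, W_state and b are dicts keyed by gate name
        ('f', 'i', 'g', 'o').
        """
        f_t = sigmoid(W_data['f'] @ x_t + W_state['f'] @ h_prev + b['f'])  # forget gate (Eq. 21)
        i_t = sigmoid(W_data['i'] @ x_t + W_state['i'] @ h_prev + b['i'])  # input gate (Eq. 22)
        g_t = np.tanh(W_data['g'] @ x_t + W_state['g'] @ h_prev + b['g'])  # modulation gate (Eq. 23)
        o_t = sigmoid(W_data['o'] @ x_t + W_state['o'] @ h_prev + b['o'])  # output gate (Eq. 24)
        c_t = f_t * c_prev + i_t * g_t   # cell state update (Eq. 25)
        h_t = o_t * np.tanh(c_t)         # hidden state (Eq. 26)
        return h_t, c_t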

For the point prediction purpose, one sequence input layer is used, which takes the columns of input data. It is followed by an LSTM layer, which captures long-term dependencies by finding the optimal dependency between time steps (i.e., dynamic inclusion of lag times). A fully connected layer then refines the regression and sets the number of outputs. Finally, a regression layer calculates the error between the targets and the generated values and back-propagates its effect.
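
The layer stack described above can be sketched as follows, here in PyTorch (the layer terminology in the text resembles MATLAB's Deep Learning Toolbox, so this is an equivalent reconstruction rather than the authors' code). The hidden size, number of features and optimizer are illustrative assumptions.

    import torch
    import torch.nn as nn

    class LSTMPointPredictor(nn.Module):
        """Sequence input -> LSTM -> fully connected layer, trained with
        a regression (MSE) loss, mirroring the stack described above."""

        def __init__(self, n_features, hidden_size=32, n_outputs=1):
            super().__init__()
            self.lstm = nn.LSTM(input_size=n_features,
                                hidden_size=hidden_size, batch_first=True)
            self.fc = nn.Linear(hidden_size, n_outputs)

        def forward(self, x):            # x: (batch, time, n_features)
            out, _ = self.lstm(x)        # dependencies across time steps
            return self.fc(out[:, -1])   # predict GWL from the last hidden state

    model = LSTMPointPredictor(n_features=3)  # e.g. GWL, precipitation, temperature
    loss_fn = nn.MSELoss()                    # plays the "regression layer" role
    optimizer = torch.optim.Adam(model.parameters())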

1.3 SARIMAX

The general formula of the SARIMAX(p, d, q)(P, D, Q)S model is defined as (Fathi et al. 2019):

$$ \phi_{p} \left( B \right)\Phi_{P} \left( {B^{S} } \right)\left( {1 - B} \right)^{d} \left( {1 - B^{S} } \right)^{D} Y_{t} = c + \mathop \sum \limits_{i = 0}^{n} X_{t,i} \beta_{i} + \theta_{q} \left( B \right)\Theta_{Q} \left( {B^{S} } \right)\varepsilon_{t} $$
(27)

where Yt is the value of the target time series (GWL) at time t and εt is the residual at time t; p, q and d are the orders of the non-seasonal autoregressive, moving average and differencing parts, respectively, while P, Q and D are the corresponding seasonal orders. Also, \(\phi\) and \(\Phi\) are the non-seasonal and seasonal autoregressive coefficients, and \(\theta\) and \(\Theta\) are the non-seasonal and seasonal moving average coefficients, respectively. B is the back-shift operator (BYt = Yt−1), \(\beta_{i}\) is the coefficient of the ith exogenous time series, and Xt,i is the value of the ith exogenous time series (precipitation and temperature) at time t. Finally, S is the seasonal period length (12 for monthly data).
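
A minimal sketch of fitting such a model with the statsmodels library follows. The file and column names and the (p, d, q)(P, D, Q, S) orders are placeholders, since the paper tunes these hyper-parameters per piezometer (Table 8).

    import pandas as pd
    from statsmodels.tsa.statespace.sarimax import SARIMAX

    # Monthly GWL series with exogenous precipitation and temperature;
    # file and column names are hypothetical placeholders.
    df = pd.read_csv("piezometer.csv", index_col="date", parse_dates=True)

    model = SARIMAX(
        df["gwl"],
        exog=df[["precipitation", "temperature"]],
        order=(1, 1, 1),               # (p, d, q): non-seasonal orders
        seasonal_order=(1, 0, 1, 12),  # (P, D, Q, S): S = 12 for monthly data
    )
    result = model.fit(disp=False)

    # Forecasting requires future exogenous values; here last year's
    # observations are reused purely for illustration.
    future_exog = df[["precipitation", "temperature"]].iloc[-12:]
    forecast = result.forecast(steps=12, exog=future_exog)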

Appendix 2: The concept of prediction intervals

The targets in modeling are expressed as (Khosravi et al. 2011a):

$$ t_{i} = y_{i} + e_{i} $$
(28)

where ti is the ith target and ei is the error that moves the target away from the true regression value yi. Assuming the error is independent and identically distributed (iid) and that \(\hat{y}_{i}\) is the model output estimating the mean of the regression values, we have:

$$ t_{i} - \hat{y}_{i} = \left[ {y_{i} - \hat{y}_{i} } \right] + e_{i} $$
(29)

While CIs deal only with the variance of \(\left[ {y_{i} - \hat{y}_{i} } \right]\), PIs handle the difference between the observed values ti and the estimated values \(\hat{y}_{i}\). Thus, PIs cover more uncertainty sources and are always wider than CIs; therefore:

$$ PI = CI + e_{i} $$
(30)
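
To connect this concept with the bootstrap methods used in the paper, the following sketch shows one way an ensemble of models trained on moving-block-bootstrap resamples could yield empirical PIs at an 80% nominal confidence level. The `train_fn` interface, block size and ensemble size are illustrative assumptions; the paper's exact MBB procedure and its CWC-based evaluation may differ in detail.

    import numpy as np

    def moving_block_bootstrap(n, block_size):
        """Resample indices 0..n-1 as a concatenation of randomly chosen
        overlapping blocks, preserving short-range autocorrelation."""
        n_blocks = int(np.ceil(n / block_size))
        starts = np.random.randint(0, n - block_size + 1, size=n_blocks)
        idx = np.concatenate([np.arange(s, s + block_size) for s in starts])
        return idx[:n]

    def bootstrap_prediction_interval(train_fn, X_train, y_train, X_new,
                                      n_boot=100, block_size=12, alpha=0.2):
        """PIs from an ensemble of models fitted to MBB resamples;
        an 80% nominal PI corresponds to alpha = 0.2."""
        preds = []
        for _ in range(n_boot):
            idx = moving_block_bootstrap(len(y_train), block_size)
            model = train_fn(X_train[idx], y_train[idx])  # e.g. fit an FFNN or LSTM
            preds.append(model.predict(X_new))
        preds = np.stack(preds)
        lower = np.percentile(preds, 100 * alpha / 2, axis=0)
        upper = np.percentile(preds, 100 * (1 - alpha / 2), axis=0)
        return lower, upper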

Appendix 3: Tables and figures

See Tables 5, 6, 7, 8.

Table 5 Information of selected piezometric wells and time series of GW depths in the study areas
Table 6 Selected lag times for predictors of modeling via FFNN, determined by MI for both plains
Table 7 The optimal hyper-parameters of neural networks for point predictions of GWLs in the Ardabil and Qorveh–Dehgolan plains
Table 8 The optimal hyper-parameters of SARIMAX for point predictions of GWLs in the Ardabil and Qorveh–Dehgolan plains

See Figs. 10, 11, 12, 13, 14.

Fig. 10 Comparison of observed GWLs and point predictions of the validation data for piezometers a P3, b P4 and c P5, together with the average monthly precipitation at the Ardabil synoptic station

Fig. 11 Comparison of observed values and PIs estimated with an 80% nominal confidence level by the ANN-based bootstrap methods at the validation step for a P3, b P4 and c P5, together with the average monthly precipitation at the Ardabil synoptic station

Fig. 12 Comparison of observed values, point predictions and PIs estimated with an 80% nominal confidence level by FFNN, BFN and MBFN at the validation step for a Q1, b Q2 and c Q3, together with the average monthly precipitation at the Qorveh synoptic station

Fig. 13 Comparison of observed values, point predictions and PIs estimated with an 80% nominal confidence level by LSTM and MBLSTM at the validation step for a Q1, b Q2 and c Q3, together with the average monthly precipitation at the Qorveh synoptic station

Fig. 14 Box plots of observed values, point predictions and PI values of GWLs at a P3, b P4, c P5, d Q1, e Q2 and f Q3


About this article


Cite this article

Nourani, V., Khodkar, K., Paknezhad, N.J. et al. Deep learning-based uncertainty quantification of groundwater level predictions. Stoch Environ Res Risk Assess 36, 3081–3107 (2022). https://doi.org/10.1007/s00477-022-02181-7
