1 Introduction

Clay swelling, a common phenomenon observed in soils and sedimentary rocks, can generate damaging stresses during wetting or drying cycles. Identification of factors affecting the swelling is important for understanding and ultimately preventing the damage (Wangler et al. 2008). In clay–sulfate rocks, the swelling has caused unforeseen problems in various geotechnical projects leading to lengthy operational disruptions and costly remediation measures (Anagnostou et al. 2010). Although rock swelling has been known for more than a century since the construction of the Schanz railway tunnel in Baden-Württemberg in Germany, the underlying processes are not, yet, fully understood (Schädlich et al. 2013; Wanninger 2020). In general, the swelling is triggered by an inflow of water into clay–sulfate rocks which leads to volume increase by expansive hydration of the mineral anhydrite (Butscher et al. 2016). Once the swelling is triggered, the ground heave may continue to develop for many years and it is practically impossible to stop the swelling by currently known engineering methods (Fleuchaus et al. 2017; Jarzyna 2022; Schweizer et al. 2019).

The threats imposed by swelling of clay–sulfate rocks are mostly encountered during the construction of railway and road tunnels and viaducts. Major problems occurred in tunnels, e.g., the Wagenburg, Engelberg, and Belchen tunnels, constructed in the Triassic Grabfeld Formation (formerly Gipskeuper = “gypsum Keuper”), which is commonly found around the Stuttgart metropolitan area in south-western Germany and Jura Mountains in north-western Switzerland (Anagnostou et al. 2010, 2015; Butscher et al. 2011; Sass et al. 2010). For example, in the Wagenburg tunnel, which connects the center with the eastern part of Stuttgart, heaving of the unsupported tunnel floor of 1.1 m was measured over a span of 25 years (Berdugo et al. 2009; Wanninger 2020). In Spain, tunnels constructed to provide a high-speed railway from Madrid to Barcelona were affected by extreme sulfate-related heave (Alonso et al. 2013; Ramon et al. 2017).

The swelling behavior is mainly attributed to (1) osmotic water uptake of clay minerals driven by a concentration gradient between the clay matrix and the free pore water (Madsen et al. 1991), (2) hydration of clay minerals, i.e., crystalline swelling (Madsen et al. 1989), and (3) the chemical transformation of anhydrite into gypsum, i.e, gypsification of anhydrite, which is the key process controlling the swelling phenomenon (Butscher et al. 2011; Jarzyna 2022). The latter is accompanied by a volume increase of up to 61% (Butscher et al. 2016). However, should the expansion be prevented, e.g., by an inverted arch in tunneling, the rock may exert high swelling pressures on the lining and damage the structure through heaving of tunnel sections (Wanninger 2020).

A swelling law stating the stress–strain relation as well as the time dependency of the swelling process allows predicting the mechanical behavior of clay–sulfate rocks. For pure clay rocks, a distinct linear relationship between the swelling strains and the logarithm of swelling stresses has been established, widely known as “Grob’s law” (Grob 1972; Madsen et al. 1989). However, laboratory experiments illustrated that Grob’s law is inadequate for describing the swelling behavior of clay–sulfate rocks (Pimentel 2007). The relation of swelling strains to stresses for clay–sulfate rock is unknown so far due to the limited number of swelling experiments and the fact that none of the experiments reached equilibrium conditions. Systematic experimental studies are scarce, because gypsification of anhydrite can take several years to complete, and experiments are usually terminated prematurely (Butscher et al. 2016; Schweizer et al. 2019; Wanninger 2020).

Numerical models considering coupled thermal, hydraulic, mechanical and chemical processes are essential to understand the swelling behavior of clay–sulfate rocks comprehensively. Several finite element models have already been developed to address the swelling in various settings in Austria, Germany, Spain, and Switzerland (e.g., Anagnostou 1993; Ramon et al. 2017; Schädlich et al. 2013; Wittke 2014). Although these models successfully simulated the swelling deformations, their generalization capability remains limited because of the existing gaps in the overall understanding of the swelling process (Butscher et al. 2016, 2018). A lack of knowledge still exists with respect to coupling between hydraulic, mechanical, and chemical processes due to a lack of long-term experiments that would deliver a database for the development of a coupled model (Wanninger 2020).

Machine learning (ML) has been successfully applied for a wide range of problems in the field of geotechnical engineering. ML models are non-parametric statistical tools that do not require any pre-assumptions between inputs and outputs and are capable of handling complex nonlinear input–output relationships. Recently, Zhang et al. (2021) adopted a long short-term memory learning method to reproduce stress–strain behavior in clay soils. Taherdangkoo et al. (2022) developed a neural network model to determine the solubility of anhydrite and gypsum minerals in aqueous solutions with implications for swelling of clay–sulfate rocks. In this context, a predictive ML model based on supervised learning can be applied to rank the relative importance of parameters influencing the hydro-mechanical swelling of clay–sulfate rocks.

We employed a hydro-mechanical model, which depends on the water content of the rock, to study the swelling behavior of clay–sulfate rocks. The geodetic heave measurements at the Staufen study site were used to calibrate the model and evaluate the performance of the modeling approach. We then conducted a sensitivity analysis of the material properties of the swelling layer, namely the maximum swelling pressure, Young’s modulus, Poisson’s ratio, overburden thickness, and the initial volumetric water content. We compiled a synthetic dataset using the heave modeling data to build a ML model, specifically a least-squares boosting ensemble. Finally, we employed the ML model to rank the relative importance of parameters influencing the HM swelling of clay–sulfate rocks.

2 Study Site

2.1 Overview

The study site is located in the city of Staufen in south-west Germany, where geothermal drilling in clay–sulfate rocks of the Gipskeuper Formation caused severe damage to infrastructure with costs so far exceeding 100 million euros (Sass et al. 2010; Fleuchaus et al. 2017). In 2007, seven wells with a depth of up to 140 m were drilled to install borehole heat exchangers (BHE). The wellbore integrity failure of at least one BHE created a hydraulic connection between a separated artesian aquifer and the overlying Gipskeuper Formation, which in turn led to an upward flow of water. The fractures and discontinuities within the swelling layer further facilitated the lateral flow of water triggering the chemical transformation of anhydrite embedded within the clay layer into gypsum. This process called “gypsification of anhydrite”, led to a net increase in rock volume (Sass et al. 2010; Butscher et al. 2016; Fleuchaus et al. 2017). Since the operation, a ground heave with uplift rates of up to 11 mm month\(^{-1}\) (LGRB 2010), has been observed resulting in a total ground heave of up to 0.6 m, measured between 2007 and 2018 (Fleuchaus et al. 2017).

Countermeasures to mitigate the swelling process were initiated in 2009. These measures included re-grouting of the defective BHE and installation of pumping wells into the artesian aquifer below the Gipskeuper Formation to reduce the hydraulic potential in the aquifer, which led to the lowering of water inflow into the clay–sulfate rocks (LGRB 2010; Grimmet et al. 2014; Ruch et al. 2013; Sass et al. 2010). Although the countermeasure effectively decreased the heave rates at the ground surface, it failed to completely stop further swelling heave. A possible explanation is that the swelling continues until all the water intruding into the swelling layer is consumed by the gypsification of anhydrite (Butscher et al. 2016; Jarzyna 2022).

2.2 Geodetic Data

A geodetic monitoring network including 106 observation locations was set up to observe surface deformation in the affected area (LGRB 2010, 2012). The monitoring campaign started on 14 January 2008 with 14 observation locations, and thereafter, the rest of 106 observation locations were successively installed until 18 May 2010. During the monitoring campaign (from January 2008 to September 2011), the sampling was repeated 49 times at irregular intervals of 11–63 days.

The frequency distribution of the measured heave values in both time and space obtained from the geodetic monitoring network, which corresponds to a total of 3431 observation points (Schweizer et al. 2018), is depicted in Fig. 1. All heave and subsidence values were referenced to the absolute (and in a few cases projected) elevation at the time before the BHE drilling. The analysis showed that only a very small ground heave close to zero was measured at around 1400 observation points. The maximum measured heave was 0.4106 m, and the average value was 0.0785 m. The heave at 85 points was equal to or higher than 0.3 m. Small subsidence at some observation points (a total of 66 points), which surround the heaved area, was also measured. The maximum measured subsidence was 0.0038 m.

The heave measurements (3431 observation points) were used to calibrate the numerical model and evaluate the accuracy of the presented modeling approach. However, in the compiled dataset (Schweizer et al. 2018), the number of measured heave data was limited at the start of the campaign, and more data were acquired after installing all the observation points. The analysis showed that the dataset has high variability and ambiguity for various distances from the center, indicated by the large spread of heave values (see Sect. 4.1). The heave body is distinctly anisotropic, especially in proximity to the center, which further makes the model’s performance evaluation difficult.

Fig. 1
figure 1

Histogram plot of geodetic heave measurements at the Staufen study site obtained between 14 January 2008 and 12 September 2011

3 Methodology

3.1 Coupled Hydro-Mechanical Model

3.1.1 Motivation

Several numerical models have been developed to study the swelling behavior of clay–sulfate rocks. Grob’s law was primarily implemented in most of these models to simulate mechanical deformations. For instance, Anagnostou (1992, 1993) developed a coupled HM model to study the effect of seepage flow on deformation patterns around tunnels excavated in swelling rocks. Wittke (2003) developed a coupled HM model, the swelling law coupled with the seepage flow, employing a water uptake coefficient to take into account the water transport and gypsification of anhydrite. The HM model was calibrated in a test gallery at the Freudenstein tunnel in Germany. Schädlich et al. (2013) developed a mechanical model to study the swelling deformations along the first tube of the Pfändertunnel in Austria. Ramon et al. (2013) implemented gypsum crystal growth in a standard coupled HM formulation for saturated porous media to simulate the heave of the central pillars of a railway bridge. Ramon et al. (2017) developed a hydro-chemo-mechanical model to simulate the swelling behavior observed in the Lilla tunnel in Spain.

The swelling heave observed at Freudenstein tunnel, Lilla tunnel, Lochviller, Pont de Candí, and Staufen was triggered by a change in hydrologic conditions (Alonso et al. 2013; Butscher et al. 2016; Ramon et al. 2017; Sass et al. 2010; Wittke 2014), leading to ingress of water into the swelling zone. The water uptake of clay–sulfate rocks is controlled by the water consumption of the clay and the dissolution of anhydrite in groundwater, which depends on the ionic compositions of the groundwater, and the subsequent crystallization of gypsum. The gypsification of anhydrite, which delays the water uptake of clay–sulfate rocks, is governed by transport processes such as convection of ions within the groundwater (Butscher et al. 2016; Wittke 2014). Therefore, we employed an HM model that takes into account the water uptake of unleached clay–sulfate rocks to describe the swelling processes. Unlike the previous HM modeling studies that implemented the Grob’s law (e.g., Schädlich et al. 2013; Wittke 2014), Richards’ equation coupled to a deformation process with linear kinematics was used to simulate the hydro-mechanical behavior of clay–sulfate rocks.

3.1.2 Summary of Governing Equations

In summary, the governing equations are as follows:

3.1.2.1 The hydraulic process

Richards (1931) equation was used to describe flow in porous media, which can be written as (Kafle et al. 2022):

$$\begin{aligned}&\rho _\text {w}\left[ \phi +\frac{p_\text {w}S_\text {w}(\alpha _\text {B}-\phi )}{K_\text {s}}\right] {\dot{S}}_\text {w}+\rho _\text {w}S_\text {w}\left[ \frac{\phi }{K_\text {w}}+S_\text {w}\frac{\alpha _\text {B}-\phi }{K_\text {s}}\right] {\dot{p}}_\text {w}\nonumber \\&+\nabla \cdot {\mathbf {q}}+\alpha _\text {B}\rho _\text {w}S_\text {w}\nabla \cdot \mathbf {{\dot{u}}}=0 \end{aligned}$$
(1)

with

$$\begin{aligned} {\mathbf {q}} = -\rho _\text {w}\frac{{\mathbf {K}}}{\mu _\text {w}}\left( \nabla p_\text {w}-\rho _\text {w}{\mathbf {g}}\right) , \end{aligned}$$
(2)

where \(\phi\) is the porosity, \(\rho _\text {w}\) the water density, \(\mu _\text {w}\) the water viscosity, and \(S_\text {w}\) the water saturation. \(p_\text {w}\) is the water pressure, \({\mathbf {g}}\) the gravity acceleration vector, \({\mathbf {K}}\) the intrinsic permeability, and \({\mathbf {u}}\) the solid skeleton velocity. The Biot–Willis coefficient is \(\alpha _\text {B}\), and the bulk moduli of the solid and water phase are \(K_\text {s}\) and \(K_\text {w}\), respectively.

Richards’ equation is solved for fluid pressure as the primary variable. The constitutive relation between the capillary pressure and the degree of saturation, i.e., the water retention curve, is required to close the system of equations. The accuracy of the coupled solution depends on the water retention curve. Thus, appropriate selection of the constitutive relation for the material and expected stress conditions is crucial (Fredlund 2006). The relation between capillary pressure \(p_\text {cap}\) and saturation was described by van Genuchten (1980) as follows:

$$\begin{aligned} p_\text {cap} = p_\text {B}({S_\text {eff}}^{-\frac{1}{m}}-1)^{1-m}, \end{aligned}$$
(3)

where \(p_\text {B}\) expressed by \(\rho _\text {w}g/\alpha\) is a parameter related to air entry pressure, and \(\alpha\) is the air entry value. m is a pore size distribution parameter. The effective saturation \(S_\text {eff}\) is described as

$$\begin{aligned} S_\text {eff} = \frac{S_\text {w}-S_\text {r}}{S_\text {max}-S_\text {r}}, \end{aligned}$$
(4)

where \(S_\text {max}\) is the maximum water saturation, and \(S_r\) is the residual water saturation of the rock medium. The volumetric water content \(\theta\) was defined as \(S_\text {w} \phi\). The in situ state is characterized by \(\theta _\text {i} = S_\text {wi}\phi\) and \(S_\text {wi}\) is the initial water saturation. The volumetric water content at saturation \(\theta _\text {s} = S_\text {max}\phi\) corresponds to a state in which the swelling process, i.e., hydration of the clay rock and the gypsification of anhydrite fraction embedded within the clay, is completed. Note that the complete transformation of anhydrite into gypsum may only occur where the clay and anhydrite fractions are adequetly distributed with the swelling rock.

3.1.2.2 The mechanical process

The mechanical model solves the equilibrium conditions

$$\begin{aligned} \nabla \cdot \varvec{\sigma }+\left[ \rho _\text {s}(1-\phi )+S_\text {w}\rho _\text {w}\phi \right] {\mathbf {g}}={\varvec{0}}, \end{aligned}$$
(5)

where \(\rho _\text {s}\) is the intrinsic solid density and \(\varvec{\sigma }\) is the total stress tensor, which relates to the effective stress tensor \(\varvec{\sigma }'\) by the extended Bishop’s model:

$$\begin{aligned} \varvec{\sigma }= -\chi (S_\text {w}) p_\text {w} {\mathbf {I}} + \varvec{\sigma }' - \sigma ^\text {sw} {\mathbf {I}}, \end{aligned}$$
(6)

where \(\chi (S_\text {w})\) is Bishop’s function, often set equal to the water saturation. \({\textbf {I}}\) is the unit tensor, and \(\sigma ^\text {sw}\) is the swelling pressure. We employed a swelling model often employed for swelling clay rocks (Graupner et al. 2018; Schäfers et al. 2020), which relates the swelling stress to the maximum swelling pressure \(\sigma ^\text {sw}_\text {max}\) by saturation changes as follows:

$$\begin{aligned} \sigma ^\text {sw} = \sigma ^\text {sw}_\text {max}(S_\text {w}-S_\text {wi}). \end{aligned}$$
(7)

For simplicity, this relationship is kept linear in this study, but non-linear forms are possible. All processes described above were realized using OpenGeoSys (OGS) (Kolditz et al. 2012; Bilke et al. 2019), which is a finite element method-based simulator for coupled problems in porous media.

3.1.3 Model Setup

The axisymmetric 2D domain has a length of 240 m and a vertical extent of 104.5 m, representing the thickness of the swelling layer (42.5 m) and the overburden layer (62 m). The latter is obtained by lumping all layers above the swelling layer into one. The sedimentary layers are assumed to be homogeneous in our model. The simulation consists of three phases: (1) the steady-state initial condition, (2) the geothermal drilling operations on 3 September 2007 leading to the inflow of water into the swelling layer, and (3) the mitigation measures that started on 4 November 2009, after which the water flow to the swelling layer was stopped. The total simulation time (phases 2 and 3) was set to 1500 days, the second phase lasted 790 days, and the third phase lasted 710 days (LGRB 2010, 2012). During the leakage period, water flows into the swelling layer \(({-104.5}\,\hbox {m}\le y \le {-62}\,\hbox {m})\) from the left side of the domain with an inflow rate of \(1.3 . 10^{-1}\) \({\hbox {kg s}^{-1}}\) (Schweizer et al. 2018). During the mitigation phase, the leakage (water inflow) is stopped, while the mechanical behavior (heave) is observed further.

The domain is initially at the in situ state (\(\theta _\text {i} = S_\text {wi}\phi\)) with an initial pore pressure corresponding to \(p_\text {w} = - p_\text {cap}(S_\text {w}(t=0))\). The gravity was neglected, and thus the initial pressure was constant throughout the domain. The model was used to calculate water pressure and stress changes with respect to the initial in situ state. We acknowledge that with deactivating gravity, depth-dependent hydraulic properties remain constant throughout the model domain. However, it should be noted that we did not aim to model the classical case of two fluids with different densities (gas and water), but rather using the initial volumetric water content as a proxy for the inflowing water that triggers the swelling. With this approach, we are able to account for the circumstance that swelling does not occur prior to the ingress of water into the swelling layer (Schweizer et al. 2018).

3.1.4 Boundary Conditions and Material Properties

The pressure at the top of the domain was held constant at its initial value to allow for free drainage. The left side \(({-104.5}\,\mathrm{m}\le y \le {-62}\,\mathrm{m})\) was set to a fixed inflow rate (\(1.3 . 10^{-1}\) \({\hbox {kg s}^{-1}}\)) during the first 790 days, and then it was set to a no-flow boundary to account for the mitigation period. The inflowing water flow rate was defined on the basis of Schweizer et al. (2018) findings. The ambient and inflowing water was assumed as pure water, having a density of 1000 \({\hbox {kg m}^{-3}}\). These assumptions differ from the compositions of water samples taken from groundwater wells (LGRB 2010). For instance, the ambient groundwater within the unleached gypsum Keuper has high calcium sulfate concentrations, saturated with respect to gypsum, and slightly undersaturated with respect to anhydrite. These significant differences in water compositions have negligible influence on the modeling, as the chemical reactions have been ignored in the present study. The left side \(({-62}\,{m}\le y\le {0}\,{m})\) was set to a no-flow boundary. The right side \(({-104.5}\,{m}\le y\le {0}\,{m})\) was set to a pressure boundary equal to the initial pressure. Regarding the mechanical boundary conditions, the lateral and bottom sides were fixed in their normal direction (roller). The top boundary was set to be traction free.

The behavior of the medium was considered to be isotropic. The values of two elastic constants, Young’s modulus E and Poisson’s ratio \(\nu\), as well as hydrogeological parameters were taken from various sources (Benz et al. 2012; LGRB 2010, 2012; Schweizer et al. 2018). Mechanical and hydrogeological parameters of the overburden layer were calculated by the weighted arithmetic mean of the geological layers, which can be used to determine the average properties of layered, parallel beds. The material parameters used for the numerical simulation are listed in Table 1.

Table 1 Material parameter values. The values of intrinsic permeability are adjusted after the model's calibration

3.1.5 Sensitivity Analysis

We used direct point measurements of heave data from the city of Staufen (Schweizer et al. 2018; LGRB 2010, 2012) to calibrate the HM model. No field information about the initial volumetric water content within the domain was available, and therefore this parameter was selected as a calibrated parameter. Following Schweizer et al. (2018), the intrinsic permeability was also chosen as a calibrated parameter. The result of the simulation with the calibrated HM model was considered as the base case scenario. A sensitivity analysis was designed to examine the influence of the model parameters on the swelling behavior of our model (Table 2). The results give insight into the importance of a wide range of possible field conditions on the swelling behavior of clay–sulfate rocks. The influence of Young’s modulus, Poisson’s ratio, overburden thickness, maximum swelling pressure, porosity, and the initial volumetric water content on the observed uplift on the ground surface was evaluated. Note that the material properties of the overburden layer have remained unchanged. Ranges of parameter values considered in the sensitivity analysis listed in (Table 2) are not limited to the Staufen site, and were obtained from previous studies (Alonso et al. 2013; Butscher et al. 2011, 2018; Schädlich et al. 2013; Wittke 2003). We applied the one-at-a-time (OAT) method (Morris 1991) to analyze the influence of changing the values of each chosen parameter separately. The OAT method provides a local sensitivity measure, while being computationally cheap compared to multiple parameter variations at a time. It provides relevant information about parameter sensitivities as it derives the independent effects of each parameter on model results. For a discussion of this and other methods in the context of coupled models, see cf. Chaudhry et al. (2021) and references therein.

Table 2 Parameter values of the swelling layer used for the sensitivity analysis

3.2 Machine Learning Model

3.2.1 Least-Squares Boosting Ensemble

We employed the least-squares gradient boosting (LSBoost), an ensemble of two powerful algorithms, boosting and trees (Friedman 2001; Hastie et al. 2009; Taherdangkoo et al. 2021), to construct a predictive model able to determine the maximum heave observed at the study site Staufen. The LSBoost incorporates important advantages of tree algorithms, handling predictor variables of various types, and accommodating missing values based on surrogate splitting. Previous research has shown that the LSBoost is capable of handling complex regression issues such as non-linearity, high dimensional, and small sample size (Breiman 1996; Elith et al. 2008). The algorithm assigns a different weight to each weak learner to minimize the mean-square error between the target value y and the predicted value \({\tilde{y}}\) as (Hastie et al. 2009):

$$\begin{aligned} {\tilde{y}} = {\bar{y}}(x) + w \sum _{n=1}^{N} \rho _{n}L_{n}(x), \end{aligned}$$
(8)

where x is the predictor variable, \({\bar{y}}\) is the mean of y, w with \((0 \le w \le 1)\) is the learning rate, N is the total number of learners, and \(\rho _{n}\) is the weight for the \(n^\text {th}\) learner. The LSBoost can directly quantify the importance or relative influence of each feature/predictor from a trained model because the algorithm chooses a single feature at each level to approximate the alignment accuracy residual (Hastie et al. 2009; Moting 2019). A high value of any feature indicates the high effect of that feature in estimating the predicted value. The LSBoost algorithm was implemented in MATLAB 2021b.

3.2.2 Bayesian Optimization

We employed Bayesian optimization algorithm to tune hyper-parameters, i.e., user-defined parameters, of the LSBoost. This algorithm searches to find the global minimum of an objective function f(z) as (Gelbart et al. 2014; Snoek et al. 2012):

$$\begin{aligned} z^{*} = \arg \min f(z), z \in A, \end{aligned}$$
(9)

where A is the search space of z, i.e., set of hyper-parameters. The expected improvement acquisition function u(z) was used in the implementation of the optimization algorithm. This function evaluates f at the point z where the highest improvement upon \(f^{'}\), the current minimum observed f, is expected. This corresponds to (Bull 2011):

$$\begin{aligned} u(z) = \max (0,f^{'}-f(z)). \end{aligned}$$
(10)

3.2.3 Dataset

We used the obtained heave data from the sensitivity analysis to compile a synthetic dataset to build the machine learning model. The LSBoost considers Young’s modulus E, Poisson’s ratio \(\nu\), overburden thickness \(H_{o}\), maximum swelling pressure \(\sigma _\text {max}^\text {sw}\), porosity \(\phi\), and initial volumetric water content \(\theta_\text {i}\) of the swelling layer as input parameters, while the corresponding maximum modeled heave observed at the ground surface is considered as the output parameter. We used k-fold cross-validation to randomly divide the dataset into k groups of roughly equal size. Then, k-1 groups of the dataset were used for the model development and the remaining group was used to validate its performance. Therefore, each data sample had a chance to contribute to both the training and validation phases (Refaeilzadeh et al. 2009). In this study, fivefold cross-validation was used.

4 Results and Discussion

4.1 Hydro-Mechanical Modeling

We ran several model simulations to calibrate the initial volumetric water content within the model domain as well as the intrinsic permeabilities of both swelling and overburden layers. These parameters were calibrated against the heave measurements presented in Fig. 1. Accordingly, the initial volumetric water content was set to 0.0924, and the intrinsic permeabilities of the swelling and overburden layers were set to \({2 . 10^{-13}}\,\hbox {m}^2\) and \({8 . 10^{-13}}\,\hbox {m}^2\), respectively.

The simulated heave after 1500 days (Fig. 2) shows that the maximum swelling occurs in the vicinity of the inflow boundary, representing the defective BHE drilling. Previous studies (e.g., Serafeimidis et al. 2013, 2014) demonstrated that hydration times of low porosity thick anhydrite layers may exceed centuries in duration. Therefore, there is still a large potential for future uplift of the ground surface by the further inflow of water within the swelling layer. Although the stop of water inflow with the start of the mitigation measures decreases the swelling rate, it takes a long time until the swelling is completed, and thus the ground surface continues to heave. The ground heave at distances of more than 180 m away from the center is negligible during the simulation time.

The modeling indicates that the total vertical displacement at the ground surface is smaller than the displacement above the swelling layer (Fig. 2), showing the elastic behavior of the overburden layer. The maximum vertical displacement at the top of the swelling layer is around 0.42 m, while it is 0.39 m m at the ground surface. Thicker overburden layers could further decrease the overall heave at the ground surface. The vertical displacement within the swelling layer is lower at greater depths due to the increase of vertical stress. The influence of swelling on the porosity of the swelling layer is evident, and the changes are more significant at greater depths Fig. 3.

Fig. 2
figure 2

Vertical displacement (heave) of the model domain after 1500 days simulation time

Fig. 3
figure 3

Porosity within the model domain after 1500 days simulation time

4.2 Comparative Analysis

Figure 4 compares simulated heave at the ground surface with measurements at two points in time, at the end of the leakage and simulation periods. Note that measurements for locations close to the center of the heave cone \((0 \le \text {distance} \le {12.5}\,\hbox {m})\) were unavailable (LGRB 2010; Schweizer et al. 2018). As mentioned earlier, the large spread of measured uplift values in the proximity of the center indicates the anisotropic shape of the heave body (ellipsoidal shape), which cannot be reproduced by the isotropic model. This causes overestimation of the measured uplift values depending on the measurement’s location. The HM model captures the overall shape of the heave body with the accuracy needed for the development of meaningful countermeasures.

Generally, the two simulation phases are characterized by a change in water inflow conditions, which directly affects the heave development. The water content of the rock has a strong effect on the swelling, also evident in Eq. 7. The swelling proceeds faster during the first simulation phase due to the inflow of water into the swelling layer, and then with the change in water inflow conditions, i.e., start of mitigating measures, the swelling rate and accordingly the uplift rate at the ground decreases. The heave at the end of the first (790 days) and second (1500 days) simulation phases are 0.32 m and 0.385 m, respectively, while the field observations are 0.232 m and 0.365 m (Fig. 5). The difference between the final simulated and observed heave values is minor. The findings are consistent with field observations showing that, once the swelling is triggered by water inflow, it may continue to develop for many years, but the swelling rate would significantly decline over years (Alonso et al. 2013; Anagnostou et al. 2015; Fleuchaus et al. 2017; LGRB 2010).

Fig. 4
figure 4

Comparison of geodetic heave measurements with the simulated heave for two points in time: after mitigation measures (red) and at the end of the simulation (blue)

Comparing the development of the heave with time at 12.5 m distance from the center (Fig. 5) shows a systematic overestimation of heave by the model. The deviation is higher at early simulation times due to the strong dependency of swelling on changes in the volumetric water content. The deviation between the modeling and measured values decreases towards the end of the simulation with the heave–time curve approaching a plateau. In general, the shape of the heave-time curve is in line with previous HM modeling studies, e.g., Wahlen et al. (2009).

Fig. 5
figure 5

Comparing modeled and measured heave at the distance of 12.5 m from the center. This measuring point was chosen because of (1) being close to the center, and (2) having the most heave measurements data in time required for the comparison analysis

The analysis of modeling residuals, e.g., the difference between the measured and modeled values, for 893 points in space and time displayed in Fig. 6 shows that for most locations the prediction residuals are less than 0.13 m. The value of residuals varies according to the measuring locations from the center, which can be attributed to the anisotropic shape of the heave body. The histogram plot illustrates that the modeling residuals are zero at around 200 measuring points (Fig. 7). However, some locations are associated with relatively high prediction errors.

The prediction error can be attributed to (1) the anisotropy of the heave body (ellipsoidal shape). The anisotropic distribution of measured heave data (Fig. 4) cannot be reproduced with a axisymmetric isotropic model. (2) The swelling behavior of clay–sulfate rocks is controlled by coupled hydraulic, chemical and mechanical processes that hardly can be reflected by a general constitutive law such as the one chosen here. The modeling suggests that a more complex constitutive law should be employed to describe the relationship between pressure (stress) and heave (strain) resulting from the swelling process. However, the relation of swelling strains to stresses is unknown so far, especially, the amount of stresses and strains caused by gypsification of anhydrite (Wanninger 2020). (3) The chemical transformation of anhydrite into gypsum is the key process controlling the swelling phenomena. The coupling of hydraulic, mechanical, and chemical (HMC) processes has to be considered to obtain more accurate results. Some authors have addressed this problem in the past, e.g., Ramon et al. (2017), but the accurate coupling of hydro-mechanical processes to the chemical processes is still unknown (Butscher et al. 2016; Wanninger 2020).

Fig. 6
figure 6

Analysis of model prediction residuals in calculating heave values for 893 measuring points at various distances from the center. The residuals at each location (from 15 to 195 m away from the center with 10 m interval) were calculated for all heave values in time. The median is indicated by the central red marks, while the bottom and top edges of the box indicate the 25th and 75th percentiles. The outliers are indicated with the \('+'\) symbol

Fig. 7
figure 7

Frequency of modeling prediction residuals in calculating heave values for 893 measuring points in both time and space

4.3 Sensitivity Analysis

The sensitivity analysis was designed to examine field uncertainties and draw a meaningful conclusion on the hydro-mechanical behavior of clay–sulfate rocks. A total of 144 simulations were performed by altering the parameter values listed in Table 2. Each parameter value was varied while the others were kept equal to the base case. Figure 8 depicts the maximum observed heave (at zero distance from the center) at the ground surface after 1500 days of simulation time.

The stiffness of the rock mass is usually an experience-based estimate as rock heterogeneity may influence the testing results. The sensitivity of the heave to the rock stiffness is studied through Young’s modulus, which was varied between 300 and 3000 MPa. In the numerical model, a low Young’s modulus increases the magnitude of the observed heave at the ground surface. The results indicate that the swelling is highly sensitive to rock stiffness should E be lower than 1500 MPa. Larger values of the rock stiffness \((E > {1500}\,\hbox {MPa})\) have a relatively small influence on swelling deformations. This transition range depends on the maximum swelling pressure.

The reported \(\nu\) values for clay–sulfate rocks fall in the range of 0.2 to 0.35 (e.g., Ramon et al. 2017; Schädlich et al. 2013; Wittke 2003), however, a wider range was considered to provide a deeper understanding on the influence of this parameter. Poisson’s ratio describes lateral deformation, while Young’s modulus is related to deformation in the direction of the applied load. Although exhibiting a different behavior in comparison with Young’s modulus, the swelling also decreases with the increase of \(\nu\) and with it the rock’s bulk modulus. Should \(\nu\) approach the limiting value of 0.5 (perfect incompressibility), the vertical swelling is negligible, and therefore, the observed heave at the ground surface becomes approximately zero.

The maximum swelling pressure of clay–sulfate rocks varies largely in laboratory swelling tests (e.g., Steiner 1993), and therefore, it was varied between 3.2 and 13 MPa. The swelling and subsequently the magnitude of the observed heave are notably sensitive to \(\sigma _\text {max}^\text {sw}\), which is primarily due to higher swelling strains directly imposed underneath the overburden layer. The results show a linear increase of the heave in the numerical analysis with the increase of \(\sigma _\text {max}^\text {sw}\) value, also evident in Eq. 7.

The swelling decreased monotonically with increasing the depth of the swelling layer. This is caused by increasing the bending stiffness of the overburden layer. The maximum observed heave for the case of free swelling, i.e., without considering an overburden layer, equals 0.47 m, which is approximately 4 times higher in comparison to the case having an overburden thickness of 200 m. This implies that the swelling heave is much more pronounced when the swelling layer is situated at shallow depths.

The sensitivity of swelling to the initial volumetric water content was evaluated by varying its value between 0.0077 to 0.053. The relationship between swelling stress and initial volumetric water content is linear. A threshold value of \(\theta_{i}=0.018\), which is related to the water retention curve, exists in the numerical model, below which the swelling remains almost constant. The modeling shows that, above this threshold, swelling decreases with a rise in the initial volumetric water content because the difference between final and initial volumetric water content becomes smaller, subsequently leading to lower swelling stress.

Fig. 8
figure 8

Sensitivity analysis of various parameters on heave values calculated from HM modeling. The Staufen study site was used as the base case (red color). The maximum heave at the ground surface is plotted, varying Young’s modulus, Poisson’s ratio, maximum swelling pressure, overburden thickness, porosity, and initial volumetric water content

4.4 Relative Importance of Parameters

The data obtained from the sensitivity analysis were used to build a synthetic dataset. The scatter plot (Fig. 9) shows that the heave determined by the ML model agrees well with the results of the HM model (data points point close to the 1:1 reference line), illustrating the good performance of the LSBoost model in predicting the maximum heave values obtained from the HM modeling. The model efficiently identifies the existing patterns in the input data to predict the unseen values accurately. The coefficient of determination (\(R^2\)) and root mean square error (RMSE) values equal to 0.98 and 0.239, respectively. The optimized hyper-parameter values for the number of learners, learning rate, and the minimum leaf size are 460, 0.087, and 1, respectively.

The relative impact of parameters (Fig. 10) on the heave calculations indicates that the maximum swelling pressure and Young’s modulus of the swelling layer were the most important parameters controlling the swelling; the former had a 2.35 times higher influence than the latter. This was followed by the Poisson’s ratio, the initial volumetric water content and the overburden thickness. The ranking demonstrated that the influence of the two latter parameters on the swelling is minimal compared to the other parameters. Schädlich et al. (2013) developed a mechanical model based on a semi-logarithmic relation between swelling stress and strain (Grob 1972), and conducted a sensitivity analysis on a few parameters including the maximum swelling pressure and Young’s modulus. Our findings regarding the influence of the aforementioned parameters are consistent with Schädlich et al. (2013). While Schädlich et al. (2013) considered a semi-logarithmic swelling law, the heave was shown to increase almost linearly with the increase of the maximum swelling pressure, which is consistent with our results.

Fig. 9
figure 9

Regression plot of the LSBoost model predicted heave values versus HM modeling values

Fig. 10
figure 10

Relative importance of various parameters on maximum heave values obtained from HM modeling

We calculated the partial dependence (Hastie et al. 2009) between the predictor variables and the heave calculated using the LSBoost model. Figure 11 shows the two-variable partial dependence of heave value on joint values of Young’s modulus and maximum swelling pressure. Heave value has a very strong partial dependence on maximum swelling pressure that is consistent with its importance ranking. The strong partial dependence of the heave on Young’s modulus is evident. These outcomes confirm the relative importance analysis depicted in Fig. 10.

Fig. 11
figure 11

Partial dependence of heave value on Young’s modulus and maximum swelling pressure

5 Summary and Conclusions

We developed a coupled hydro-mechanical model to reproduce the observed heave at the city of Staufen, south-west Germany. Unlike previous HM modeling studies that implemented a semi-logarithmic constitutive law, Richards’ equation coupled to a water content-dependent deformation process with linear kinematics was used to describe the behavior of clay–sulfate rocks. Direct heave measurements (3431 sampling points) obtained from the study site were used to calibrate the model and evaluate the performance of the modeling approach. We then conducted a sensitivity analysis on material properties of the swelling layer, namely Young’s modulus, Poisson’s ratio, overburden thickness, maximum swelling pressure, porosity, and the initial volumetric water content, to gain a better insight into the importance of parameters controlling hydro-mechanical swelling of clay–sulfate rocks. Finally, we employed a least-squares boosting ensemble (LSBoost) model tuned with a Bayesian optimization algorithm to rank the importance of the aforementioned key parameters on heave calculations. The following conclusions can be drawn from our modeling:

  1. 1.

    The fully coupled hydro-mechanical model is capable of reproducing the swelling behavior of clay–sulfate rocks with an accuracy sufficient for practical applications.

  2. 2.

    Comparing the modeled and measured heave data from the study site showed that the modeling approach has some deficiencies, which led to an overestimation of the heave at various distances from the center. The heave development with time cannot be exactly reproduced with the linear constitutive model, and thus a more complex relation between swelling stress and strain is needed to improve the modeling accuracy.

  3. 3.

    In general, the HM model is not limited to describe the swelling of clay–sulfate rocks and can also be used for other types of clay rocks as well as clays, in which the swelling is controlled by increase of the water content.

  4. 4.

    The LSBoost model tuned with Bayesian optimization performs very well in determining the maximum heave values obtained from hydro-mechanical calculations, indicated by high \(R^2\) and low RMSE values. Having a high predictive accuracy, the LSBoost model can serve as an additional tool for calculating the potential heave of the ground surface due to rock swelling in geotechnical projects.

  5. 5.

    The calculated heave is highly sensitive to the maximum swelling pressure, and with much less degree is influenced by the value of Young’s modulus. The Poisson’s ratio, overburden thickness, and initial volumetric water content of the rock influence the swelling with a much lower impact.