Abstract
Estimating soil properties from the mechanical reaction to a displacement is a common strategy, used not only in in situ soil characterization (e.g., pressuremeter and dilatometer tests) but also by biological organisms (e.g., roots, earthworms, razor clams), which sense stresses to explore the subsurface. Still, the absence of analytical solutions to predict the stress and deformation fields around cavities subject to geostatic stress has prevented the development of characterization methods that resemble the strategies adopted by nature. We use the finite element method (FEM) to model the displacement-controlled expansion of cavities under a wide range of stress conditions and soil properties. The radial stress distribution at the cavity wall during expansion is extracted. Then, methods are proposed to prepare, transform and use such stress distributions to back-calculate the far-field stresses and the mechanical parameters of the material around the cavity (Mohr-Coulomb friction angle \(\phi \), Young’s modulus E). Results show that: (i) the initial stress distribution around the cavity can be fitted to a sum of cosines to estimate the far-field stresses; (ii) by encoding the stress distribution as intensity images, in addition to certain scalar parameters, convolutional neural networks can consistently and accurately back-calculate the friction angle and Young’s modulus of the soil.
The study of the distribution of stresses around expanding (or contracting) cavities has many applications in geomechanics, from tunneling and underground exploration to in situ soil characterization and resource extraction. Even though the problem scales are quite different, a commonality between these applications is that they can be modelled considering a cylindrical cavity that is either pressurized (e.g., the pressuremeter test) or deformed (e.g., cone penetration and tunneling). Moreover, cavity expansion mechanisms are not only used for characterization and design by engineers, but also for burrowing purposes by natural organisms. For instance, roots [1], earthworms [3, 11], razor clams [45] and sandfish [22] all use expanding cavities for exploration/navigation toward paths of least resistance or maximum nutrient yield, anchoring during excavation and assessing the mechanical stability of tunnel networks. These similarities between engineered and biological systems have sparked a wave of scientific interest in bioinspiration applied to geotechnics [23]. The recent developments in robotics, in addition to the rise of Machine Learning (ML) applications in underground exploration and soil characterization, have opened the possibility to explore the development of autonomous devices that can burrow and sense underground.
This study focuses on the back-calculation of soil parameters from the radial stress at the wall of a circular cavity subject to displacement-controlled expansion under geostatic stress. To the authors’ best knowledge, there is no closed-form solution available at shallow depth (at which the state of stress may not be considered biaxial). We simulate cavity expansion for various soil properties and depths with a 2D plane-strain Finite Element model. The simulation input parameters, together with the calculated radial stress at the cavity wall during the expansion, are used as inputs to ML algorithms to assess the sensitivity of the FEM model, back-calculate the far-field stress and estimate soil constitutive parameters. The main contributions made in this work are the following:

A program was written to automatically launch Finite Element Method (FEM) simulations of displacement-controlled cavity expansion under a wide range of stress conditions and soil properties, and extract the radial stress distribution at the cavity wall during expansion. The results of the FEM simulations constitute a database that can be used as input to an ML-based algorithm.

Methods are proposed to prepare, transform and use such stress distributions to back-calculate the far-field stresses and the mechanical parameters of the material around the cavity (Mohr-Coulomb friction angle \(\phi \), Young’s modulus E).

The performance of eight ML techniques in estimating the soil’s friction angle is assessed, and then the same ML models are tested to predict the soil’s Young’s modulus. It is shown that similar convolutional neural networks can accurately back-calculate the friction angle and the Young’s modulus of the material.

The combination of FEM and ML is expected to advance the technology of autonomous devices that can burrow and sense underground, considering that such devices could self-anchor and/or sense through the displacement-controlled expansion of an embedded cavity.
The paper is structured as follows. We first present literature reviews of cavity expansion (Sect. 1.1) and machine learning (Sect. 1.2) applications in geotechnics. Section 2 describes the general workflow and explains the numerical modeling approach. Section 3 provides an overview of the ML algorithms used in this study. Section 4 summarizes our results for the far-field stress, the Mohr-Coulomb friction angle and the Young’s modulus, and discusses the performance of the ML models tested. Section 5 concludes the manuscript by summarizing the findings and limitations and proposing possible directions for future work.
Related work
Cavity expansion in geotechnics
Cavity expansion is an active field of research in geomechanics given its applications to in situ testing [18, 20], resource withdrawal and tunneling [2, 46]. For applications that deal with vertical cavities at relatively large depths, the state of stress can be reduced to plane strain (no change in strain along the body of the cavity) and in-plane isotropic stress conditions. Under such conditions, the expansion problem can be reduced to a one-dimensional (radial) problem since both the distribution of stresses and the deformation of the cavity are radially isotropic [48]. However, for horizontal (or inclined) cavities, or vertical cavities under particular tectonic or sedimentation conditions, the in-plane state of stress may not be isotropic (for instance, the state of stress may be biaxial or geostatic). Therefore, the displacement-controlled expansion of a circular cavity yields a non-isotropic stress distribution, and similarly, the pressure-controlled expansion of a circular cavity generates a non-isotropic displacement of the cavity wall, which significantly increases the complexity of the analytical solutions.
The analytical solutions that give the closest stress estimates in such non-isotropic plane stress conditions are those developed for tunneling applications and in situ testing [48], for instance to interpret the pressuremeter test (PMT) [47], the cone penetration test (CPT) [35, 36] and the dilatometer test (DMT) [53]. Analytical solutions were also established to predict rock or soil behavior around vertical extraction wells relevant to hydraulic fracturing and geothermal foundations, and around horizontal shafts for deep tunneling, microtunneling and horizontal directional drilling (HDD) [12].
Still, the solutions mentioned above are developed for infinite domains, i.e., the size of the cavity is relatively small (or the cavity is deep) compared to the dimensions of the domain it is embedded in. Therefore, there is no effect of free surfaces, and the vertical and horizontal far-field stresses can be considered constant along the cavity, which results in isotropic or biaxial stress conditions. When analyzing the problem of shallow cavities, these assumptions do not necessarily hold true, since the gradient of stresses around the cavity becomes significant and the geostatic stress field can no longer be modelled as a biaxial state. Furthermore, experimental studies on shallow cavities such as [16, 31] and limit analysis formulations such as [16] have shown that the failure mechanism of shallow horizontal cavities is dominated by the development of shear planes that follow a catenary/parabolic shape, resembling a passive trapdoor mechanism, see [7, 40, 42].
These considerations on the mechanical response of the cavity significantly increase the complexity of the mathematical problem, for which, to the authors’ best knowledge, there is no available closed-form analytical solution. For this reason, we use the FEM to capture the response of cavities during expansion. We then test ML algorithms in order to estimate the far-field stress and soil parameters from the response of the model. Section 1.2 presents an overview of other applications of ML in geotechnics.
Machine learning applications in geotechnics
The use of ML in geotechnics has seen exponential growth over the last decade [50]. ML has indeed proven to be a useful tool to provide estimations of soil properties and/or limit loads for geotechnical applications where no closed-form analytical solution exists. ML algorithms do not replace analytical solutions; they are comparable to empirical equations and correlations. For a given application, there is usually no consensus on which ML algorithm or data preparation methodology works best, and therefore some authors have written reviews that compare the advantages/limitations of several approaches. For instance, Lary et al. [17] reviewed applications of ML in geosciences and remote sensing, Zhang et al. [51] summarized applications of ML in the constitutive modeling of soils, and Wang and Sun [50] reviewed applications targeted toward modeling of soil properties. Although the range of applications is wide, most studies can be grouped into three categories: i) Estimation of mechanical properties of a system for specific loading conditions and soil type, ii) Estimation of a set of design parameters (e.g., limit load, factor of safety) from the response of a soil to stimuli and iii) Generation and/or calibration of constitutive models. The third category, which goes beyond the scope of this paper, is concerned with the creation, validation and tuning of constitutive models, as shown in the reviews presented in [43, 44, 51].
Estimation of mechanical properties typically resorts to either numerical models or field/lab data as the training set. Material mechanical properties are set as the output so as to create models that can later predict soil parameters. Applications using lab/field data include the prediction of shear strength in cohesive soils using neural networks (NN) from experimental data [33], the estimation of Cam-clay parameters [26], and the estimation of the overconsolidation ratio (OCR) from piezocone penetration tests [15]. Studies in which numerical models are used to generate training data include the combination of NN and gradient descent to identify constitutive model parameters from self-boring pressuremeter tests [27], and the use of the output of FEM simulations of piezocone penetration as the training data of a NN in order to estimate soil parameters [28].
Estimation of design parameters usually aims to calculate a factor of safety or a limit load, using either field data or numerical results as training sets. Some of the studies that use field and lab data include the work of Sulewska [39], which discusses six different applications of NN to predict displacements (settlement, consolidation) and the limit load (bearing capacity) in different applications. Khatibi et al. [13] estimated the shear wave velocity in incomplete data sets, achieving good agreement between predicted and measured downhole pressures, which helped assess wellbore stability. Zhang et al. [49] used data from a tunneling project in Changsha city (China) to predict surface settlement. Samui and Sitharam [37] trained a model that can predict the liquefaction susceptibility of a site. Lu et al. [19] estimated the pullback force during horizontal directional drilling using a genetic algorithm combined with support vector machines (SVM). Kardani et al. [10] estimated the bearing capacity of piles, using field data as the training set. When numerical models are used to generate training data, the numerical model maps the inputs, i.e., the geometry and loading conditions of the problem, to the desired output, typically a factor of safety or a limit load. Numerical models may be computationally expensive and/or rely on advanced/licensed software, making them inconvenient for fast estimations. ML models trained on numerical simulations are fast, reliable alternative methods that allow quick estimations. For instance, Makasis et al. [21] generated a model to design thermal piles. He et al. [8] focused on a reliability analysis of spatially variable slopes, training an ML model on relatively few numerical simulations, and showing the ability of the model to produce results comparable to those of the complex numerical model.
Numerical model
The finite element models used for the present study are built using Abaqus software. The 2D, plane strain model approximates the cross section of a cylindrical cavity, which is assumed to be long in relation to its diameter so that plane strain conditions hold. Due to the symmetry of the problem with respect to the vertical axis, and in order to reduce computational cost, only half of the domain is modelled (see Fig. 1). The top boundary of the domain represents the free surface, symmetric boundary conditions are set on the left wall of the domain (cavity’s side), and fixed boundary conditions are set on the right and bottom boundaries, see Fig. 1a. The diameter of the cavity (\(D_c\)) is fixed for all the simulations at 1.0 m; this diameter is an intermediate value between soil probing instruments, usually in the range of tens of centimeters, and infrastructure construction devices such as tunneling machines. The depth of the cavity (H) is measured from the surface to the center of the cavity and is varied between 5 and 50 times the cavity diameter (5 to 50 m). The height of the model (\(H_m\)) was set to \(H_m = H + 25 D_c\), while its width (\(W_m\)) was set to \(W_m = 1.5 H_m\); these dimensions were chosen according to [54] to avoid boundary effects. Each model was created and meshed automatically using a Python routine that exploits the scripting capabilities of Abaqus. The model was split into two sections prior to meshing. A first region concentric to the cavity with a diameter of \(6D_c\) was assigned 4-node bilinear elements arranged in a structured mesh with 36 elements around the cavity (see Fig. 1c). A second region covering the rest of the domain was assigned 3-node linear elements arranged in a non-structured mesh. The number of elements in the mesh ranged between 1,415 and 4,340 depending on the depth of the cavity (see Fig. 1b). The geometry of the models and an example of mesh are shown in Fig. 1.
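The sizing rules above can be collected in a small helper. The following is only a sketch: the function and class names are hypothetical and are not taken from the authors' Abaqus scripts; only the relations \(H_m = H + 25 D_c\), \(W_m = 1.5 H_m\), the \(6D_c\) structured region and the 5 to 50 \(D_c\) depth range come from the text.

```python
from dataclasses import dataclass

CAVITY_DIAMETER_M = 1.0  # D_c, fixed at 1.0 m in the text


@dataclass(frozen=True)
class DomainGeometry:
    depth_m: float              # H, surface to cavity center
    height_m: float             # H_m
    width_m: float              # W_m
    structured_zone_diameter_m: float  # region meshed with 4-node elements


def domain_geometry(depth_m, cavity_diameter_m=CAVITY_DIAMETER_M):
    """Return the model dimensions for a cavity centered at depth_m."""
    if not 5 * cavity_diameter_m <= depth_m <= 50 * cavity_diameter_m:
        raise ValueError("depth is outside the 5 to 50 D_c range used in the study")
    height_m = depth_m + 25 * cavity_diameter_m   # H_m = H + 25 D_c
    width_m = 1.5 * height_m                      # W_m = 1.5 H_m
    return DomainGeometry(depth_m, height_m, width_m, 6 * cavity_diameter_m)


geometry = domain_geometry(10.0)
print(geometry)
```

For a cavity at 10 m depth this gives a 35 m high and 52.5 m wide domain.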
Model input parameters
We used the Mohr-Coulomb (MC) soil constitutive model because of its simplicity and flexibility. After creating the geometry of the model, soil properties were assigned to the elements of the mesh, including: soil density (d), Young’s modulus (E), Poisson’s ratio (\(\nu \)), MC friction angle (\(\phi \)) and MC dilation angle (\(\psi \)). In the following sections, we refer to the soil density (d) and unit weight (\(\gamma \)) of the material interchangeably, keeping in mind that these two parameters are linearly related by \(\gamma = g \cdot d\), where g is the gravity acceleration. All the elements of the mesh were assigned the same soil properties, the values of which were selected randomly from a uniform distribution spanning ranges suggested in [14, 40] for frictional soils (silty sands to gravels); see Table 1. The distributions of the different variables were assumed independent, which allowed us to ignore correlations between soil parameters.
A fixed value of MC cohesion (c) of 5 kPa was used in every simulation in order to improve the convergence of the models. Although there is no risk of achieving a state of stress at a corner of the MC yield surface in plane strain, the simulations were run with the Drucker-Prager (DP) model, using a matching smooth yield surface (e.g., [4, 9, 25, 38, 52]). The Python scripts developed to sample constitutive parameters for the present study can thus be used as they are for future 3D simulations.
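Independent uniform sampling of the soil parameters can be sketched as follows. The numerical ranges below are placeholders chosen only so that the example runs; they are not the values of Table 1, and the dictionary keys and function name are hypothetical.

```python
import random

# PLACEHOLDER ranges for illustration only; the actual ranges are those of Table 1.
PARAMETER_RANGES = {
    "density_kg_m3": (1600.0, 2100.0),
    "young_modulus_kpa": (10.0e3, 100.0e3),
    "poisson_ratio": (0.20, 0.40),
    "friction_angle_deg": (25.0, 45.0),
    "dilation_angle_deg": (0.0, 15.0),
}
GRAVITY_M_S2 = 9.81


def sample_soil_parameters(rng):
    """Draw each parameter independently from a uniform distribution."""
    params = {name: rng.uniform(low, high)
              for name, (low, high) in PARAMETER_RANGES.items()}
    # Unit weight follows from density: gamma = g * d (converted to kN/m^3).
    params["unit_weight_kn_m3"] = GRAVITY_M_S2 * params["density_kg_m3"] / 1000.0
    return params


rng = random.Random(42)  # fixed seed so that a batch of models is reproducible
print(sample_soil_parameters(rng))
```

Because each variable is drawn on its own, the sample contains no built-in correlation between, for example, stiffness and friction angle, which matches the independence assumption stated above.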
Simulation procedure
The computations were performed in two steps. First, the displacement of the cavity wall was fixed to zero and a geostatic stress step was used to create a linear stress gradient controlled by the unit weight of the material and increasing in the direction of gravity (see Fig. 1a), i.e., from the free surface (zero vertical stress) to the bottom of the simulation domain. The resulting horizontal stress was consistent with the conditions of the soil at rest. We verified that the lateral earth pressure coefficient was \(K_o = 1 - \sin (\phi )\) from the simulation results after the geostatic step, which Abaqus performs without inducing deformations to the model. The density and MC friction angle of the material are two of the variables assigned randomly in the models, and therefore, our set of simulations covers a wide range of pre-expansion stress conditions at comparable cavity depths. The expansion of the cavity was simulated in a second step, displacing the nodes around the cavity radially away from the cavity center at a constant rate, keeping the cavity circular. The maximum radial expansion was set to a large value (\(10 D_c\)) in order to push the simulation to the maximum expansion of the cavity before it fails. A total of 1,500 simulations were performed using a machine with dual Intel Xeon Gold 6148 (2.4 GHz, 3.7 GHz turbo) processors, and each simulation was run in serial instances of Abaqus Standard (implicit). On average, each simulation took about 5 minutes to complete, for a total of about 125 hours of computing time.
Since the objective of the study is to test whether the data captured at the wall of the cavity during expansion can be used to infer soil behavior, we used the radial stress at each orientation around the cavity and at every value of radial expansion/deformation (\(\sigma _{r}(r=D_c/2,\theta ,\epsilon _r)\)) as a data set that can be given as input to the ML algorithm to back-calculate soil properties. Examples of \(\sigma _{r}(r=D_c/2,\theta ,\epsilon _r)\) data sets obtained by FEM simulation are shown in Fig. 2.
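The pre-expansion stress state produced by the geostatic step can be written down directly. The sketch below assumes a homogeneous soil with no pore pressure, so that the vertical stress is the unit weight times the depth, and uses the at-rest relation \(K_o = 1 - \sin (\phi )\) checked in the text; the function name is hypothetical.

```python
import math


def geostatic_stress(depth_m, unit_weight_kn_m3, friction_angle_deg):
    """Return (sigma_v, sigma_h, K_o) in kPa at a given depth below the free surface."""
    k_o = 1.0 - math.sin(math.radians(friction_angle_deg))  # at-rest coefficient
    sigma_v = unit_weight_kn_m3 * depth_m                    # linear increase with depth
    sigma_h = k_o * sigma_v
    return sigma_v, sigma_h, k_o


sigma_v, sigma_h, k_o = geostatic_stress(depth_m=10.0, unit_weight_kn_m3=18.0,
                                         friction_angle_deg=30.0)
print(f"sigma_v = {sigma_v:.1f} kPa, sigma_h = {sigma_h:.1f} kPa, K_o = {k_o:.2f}")
```

Since both the density and the friction angle are sampled randomly, two cavities at the same depth can start from quite different stress states, which is the variability the simulation set is meant to cover.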
Machine learning algorithms
In our study, we focus on predicting the Mohr-Coulomb (MC) friction angle (\(\phi \)) and the Young’s modulus (E). We assume that the far-field stresses (\(\sigma _v, \sigma _h, K_o\)), cavity depth (H), and soil density (d) are known, as they can be inferred from the initial stress distribution (see Sect. 4.1). Therefore, we used them together with the radial stress distribution during expansion (\(\sigma _r(r=D_c/2,\theta ,\epsilon _r)\)) as predictive variables of the tested models.
ML models with higher complexity tend to be more accurate, but less interpretable [34]. Besides accuracy, interpretability is also important in our study, as we are interested in inferring the relationship between cavity expansion and soil properties. Therefore, in order to find the simplest model that can capture the mechanism, we trained and evaluated 8 different machine learning approaches with increasing complexity (Table 2). In the first model, we used a mean score predictor as our naïve baseline model. This model simply uses the average of \(\phi \) in the training set (with n samples) to predict every sample j in the test set, as follows:
\[ {\hat{\phi }}_j = \frac{1}{n} \sum _{i=1}^{n} \phi _i \tag{1} \]
This mean score naïve baseline model assesses how well the distribution of \(\phi \) in the training set alone can predict unseen test samples. In addition, we test two linear regression models with different input features. The first only uses H and d, while the second uses H, d as well as \(\sigma _r(r=D_c/2,\theta ,\epsilon _r)\). We also explored various types of neural networks. Compared to linear regression models, a two-layer fully-connected neural network can learn nonlinear relationships from the input features. Lastly, we transformed the surfaces described by \(\sigma _r(r=D_c/2,\theta ,\epsilon _r)\) (see Fig. 2 for examples) into grayscale images, represented as a matrix with pixel values in the range [0, 1], and tested three six-layer Convolutional Neural Networks (CNNs) with different featurizations. Our results show that CNNs trained with H, d and \(\sigma _r(r=D_c/2,\theta ,\epsilon _r)\) represented as images have the highest accuracy.
Data processing
We generated all our data through FEM simulations (see Sect. 2.2). For each simulation, H, d, \(\phi \) and E are scalar values; \(\sigma _r(r=D_c/2,\theta ,\epsilon _r)\) is represented as a matrix \(\in \) \({\mathbb {R}}^{n \times 37}\), where n is the number of radial expansion steps in the FEM simulation (discretized values of \(\epsilon _r\)), and the 37 columns refer to different orientations around the cavity with a spacing of \(5^{\circ }\), each one corresponding to a node around the cavity. We denote by \({\sigma _r}^{i,j}\) the stress (numerical scalar) at node j at step i. We randomly split 1,363 simulation results into a training set with 817 simulations, a validation set with 273 simulations, and a test set with 273 simulations (6:2:2 ratio). We used the validation set to choose the optimal hyperparameters for some models (e.g., the regularization power in the linear regression model); we also monitored the performance of CNNs on the validation set during training and interrupted the training process when the performance stopped improving for 20 consecutive epochs, a common workflow to avoid overfitting [5]. Lastly, we evaluated ML algorithm performance by comparing the predictions of the ML models with the actual values calculated by FEM in the test set. We used the same dataset split to train and evaluate all 8 ML models.
The distributions of H, d, and \(\phi \) are shown in Fig. 3. For all models, we rescaled H and d to the range [0, 1] by linearly transforming them based on their minimums and maximums in the training set. For the stress field matrices \(\sigma _r\), we explored different featurization methods. For CNNs, we treated \(\sigma _r\) as 2D grayscale images where the stress field value at orientation j and expansion step i can be thought of as a pixel value. CNNs are designed for image analysis: with convolution transformations, CNNs can learn the spatial relationships across pixels [29], which encode cavity locations and simulation steps in our framework. Our hypothesis is that the spatial relationship between cavity location and simulation step is useful for predicting soil properties. Since linear regression models and fully-connected neural networks cannot use 2D matrices as input features, we vectorized these 2D matrices of \(\sigma _r\) into 1D vectors by stacking matrix rows together (row-major flattening).
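The split and the stopping rule described above can be sketched with the standard library alone. This is an illustration, not the authors' code: the helper names are hypothetical, and the stopping criterion is read here as "no new best validation error for 20 consecutive epochs".

```python
import random


def split_indices(n_samples, n_train, n_val, seed=0):
    """Shuffle simulation indices once and cut them into train/validation/test."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


class EarlyStopping:
    """Signal a stop once the validation error has not improved for `patience` epochs."""

    def __init__(self, patience=20):
        self.patience = patience
        self.best = float("inf")
        self.stale_epochs = 0

    def should_stop(self, val_error):
        if val_error < self.best:
            self.best = val_error
            self.stale_epochs = 0
        else:
            self.stale_epochs += 1
        return self.stale_epochs >= self.patience


train, val, test = split_indices(1363, n_train=817, n_val=273)
print(len(train), len(val), len(test))
```

Reusing the same seed for every model keeps the comparison between the eight approaches on identical subsets, as stated in the text.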
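The two scalar and matrix transformations described above are short enough to sketch directly. The function names are hypothetical; the scaling bounds are taken from the training set only, so validation or test values can fall slightly outside [0, 1].

```python
def fit_min_max(train_values):
    """Return a function that maps values to [0, 1] using training-set bounds."""
    low, high = min(train_values), max(train_values)
    if high == low:
        raise ValueError("cannot scale a constant feature")
    return lambda value: (value - low) / (high - low)


def flatten_row_major(matrix):
    """Stack the rows of a 2D stress matrix into a single 1D feature vector."""
    return [value for row in matrix for value in row]


scale_depth = fit_min_max([5.0, 20.0, 50.0])   # depths H in the training set
print(scale_depth(27.5))

sigma_r = [[1.0, 2.0, 3.0],                     # expansion step 0, three orientations
           [4.0, 5.0, 6.0]]                     # expansion step 1
print(flatten_row_major(sigma_r))
```

Flattening discards the explicit neighborhood structure between orientations and expansion steps, which is the information the CNN featurization is meant to retain.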
Linear models
We used a trivial mean-score predictor as a baseline model. The model computes the average \(\phi \) in the training set, and then directly uses it as prediction of all samples in the validation set and test set. Linear regression is a popular and interpretable approach to study the relationship between soil properties and cavity behavior [50]. We applied Lasso regularization [41] to all of our linear regressions, as it can shrink the coefficients of less predictive features to zero and improve model generalizability. The first regression model only uses two numerical values (cavity depth H and soil density d) as input features, while the second model adds vectorized stress fields \(\sigma _r\) as an extra feature (Table 2).
The regularization power \(\lambda \) is a hyperparameter, where larger \(\lambda \) imposes stronger regularization. We tune \(\lambda \) on the validation set through grid search using the Python package scikit-learn [32]; we choose the \(\lambda \) that gives the smallest Mean Absolute Error (MAE) in our final model. The MAE is defined according to:
\[ \text {MAE} = \frac{1}{n} \sum _{i=1}^{n} \left| {\hat{\phi }}_i - \phi _i \right| \tag{2} \]
where n is the size of the dataset, \({\hat{\phi }}\) is the predicted value of a parameter (e.g., friction angle) and \(\phi \) is the true parameter value, taken as input in the FEM. Models with smaller MAE are more predictive.
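As a worked example of the error metric and of the mean-score baseline, the sketch below uses made-up friction angles (not results from the study) and hypothetical function names.

```python
def mean_absolute_error(true_values, predicted_values):
    """Average absolute difference between predictions and true values."""
    n = len(true_values)
    return sum(abs(p - t) for t, p in zip(true_values, predicted_values)) / n


def mean_baseline(train_targets, n_predictions):
    """Predict the training-set mean for every unseen sample."""
    mean = sum(train_targets) / len(train_targets)
    return [mean] * n_predictions


# Illustrative values in degrees, chosen only to make the arithmetic easy to follow.
phi_train = [30.0, 35.0, 40.0]
phi_test = [32.0, 38.0, 41.0]
phi_pred = mean_baseline(phi_train, len(phi_test))   # 35 degrees for every sample
print(mean_absolute_error(phi_test, phi_pred))        # (3 + 3 + 6) / 3 = 4.0
```

Any model that uses the stress data should beat this number; otherwise it has learned nothing beyond the distribution of the training targets.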
Fully-connected neural network
We developed two simple fully-connected neural networks (NN), with two hidden layers each, using the Python package PyTorch [30]. Multilayer NNs can approximate any continuous function [6]. Therefore, compared to linear models (Sect. 3.2), NNs are more powerful ML models that can learn nonlinear relationships between the input features and MC friction angle \(\phi \). Similar to our experimental design for linear models, one NN model only uses two input features: cavity depth H and soil density d, and the second NN uses H, d, and vectorized \(\sigma _r\) as input (Table 2). The hyperparameters (number of hidden neurons, learning rate, and batch size) are tuned on the validation set through grid search. We chose the combination that yielded the smallest MAE score as our final model.
Convolutional neural network
Using the Python package PyTorch [30], we developed three CNN models that share a similar model architecture (Fig. 4) but have different featurizations. In our CNN model, we have six convolutional layers where each layer has six \(3\times 3\) kernels. After every 2 consecutive convolutional layers, we insert one max-pooling layer to introduce regularization in the model. Finally, we have two fully-connected layers to nonlinearly transform the image representations into a numerical value. One CNN takes the stress field images with dimension \(256\times 256\) generated by resizing and standardizing \(\sigma _r\) matrices. The second CNN takes these \(256\times 256\) images as input, and then standardized H and d are concatenated in the first fully-connected layer (Fig. 4). The third CNN has the same architecture as the second CNN, but for each row (cavity orientation), we linearly interpolated the values at the 37 cavity nodes to have values at every degree. For each column (radial expansion), we linearly interpolated the values in the range 0 to \(1\%\) to 100 data points. In that way, we increase the size of our images while keeping a consistent mapping, i.e., each row and column number refers to the same orientation and radial expansion across simulations, unlike the first and second CNNs, where we resized the images without interpolation.
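The resampling used for the third CNN can be sketched as two passes of one-dimensional linear interpolation. The sketch stores the stress matrix with one row per expansion step and 37 values per row, following the \(n \times 37\) convention of the data processing section (the image may be transposed in practice). The function names, and the choice to hold the end values outside the sampled range, are assumptions.

```python
from bisect import bisect_right


def interp_linear(x, xs, ys):
    """Piecewise-linear interpolation at x; xs must be strictly increasing."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    k = bisect_right(xs, x) - 1
    t = (x - xs[k]) / (xs[k + 1] - xs[k])
    return ys[k] + t * (ys[k + 1] - ys[k])


def resample_stress(strain_pct, sigma_r, n_strain=100, max_strain_pct=1.0):
    """Map an n x 37 stress matrix onto a fixed 100 x 181 grid.

    strain_pct: radial expansion (in %) of each of the n rows, increasing.
    sigma_r:    n rows of 37 radial stresses at 5-degree spacing (0 to 180 degrees).
    """
    node_angles = [5.0 * j for j in range(37)]
    fine_angles = [float(a) for a in range(181)]          # one value per degree
    by_angle = [[interp_linear(a, node_angles, row) for a in fine_angles]
                for row in sigma_r]
    columns = list(zip(*by_angle))                         # one tuple per orientation
    targets = [max_strain_pct * i / (n_strain - 1) for i in range(n_strain)]
    return [[interp_linear(e, strain_pct, col) for col in columns]
            for e in targets]


# Synthetic field, linear in both orientation and expansion, for illustration only.
steps = [0.0, 0.3, 0.7, 1.0]
field = [[10.0 + 2.0 * (5.0 * j) + 100.0 * e for j in range(37)] for e in steps]
image = resample_stress(steps, field)
print(len(image), len(image[0]))
```

With this mapping, pixel (i, j) always refers to the same expansion level and orientation, whatever number of increments the solver needed for a given simulation.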
Results and discussion
Our results are organized as follows. In Sect. 4.1, the pre-expansion radial stress distribution at the cavity wall is used to estimate the far-field stresses around the cavity and unit weight of the soil. In Sect. 4.2, the radial stress distribution during expansion (\(\sigma _r\)) is used to back-calculate the MC friction angle (\(\phi \)), while Sect. 4.3 uses a similar framework to back-calculate the Young’s modulus (E) of the material.
Estimation of far-field stresses
The pre-expansion radial stress distribution at the cavity wall (\({\sigma _{r}}(r=D_c/2,\theta ,\varepsilon _r = 0)\)) is used to calculate the far-field stresses in each simulation. A first inspection of the initial stress distributions showed that the radial stress could be parametrized as a sum of sinusoidal/cosine signals. In fact, this observation was previously made and exploited by [24] for cavity expansion problems in elasticity. Figure 5 shows an example of the pre-expansion radial stress distribution for two simulations with similar far-field vertical stress \(\sigma _v\) but different far-field horizontal stress \(\sigma _h\).
Due to the presence of geostatic stress conditions, the initial radial stress distribution has a period of \(2\pi \) (one revolution around the circular cavity). We used the following Fourier series to fit the initial stress distribution around the cavity:
\[ \sigma _{r}(\theta ) = \sum _{j=0}^{4} a_j \cos (j\theta ) \tag{3} \]
where \(\theta \) is the orientation angle measured from the bottom of the cavity. By choosing this convention we have \(\sigma _{r}(\theta ) = \sigma _{r}(-\theta )\). The order of the fit (\(j=4\)) is the smallest value that could accurately capture the absence of biaxial symmetry in the radial stress distribution. Every fit of the initial stress distribution has coefficients of determination \(R^2 > 0.94\). Next, from Eq. 3, and knowing the fit coefficients \(a_j\) for every model, estimations for the radial stress at the invert (\(\theta = 0\)), waist (\(\theta = \pi /2\)) and crown (\(\theta = \pi \)) of the cavity were found from the following relations:
\[ \sigma _B = \sigma _r(0) = a_0 + a_1 + a_2 + a_3 + a_4, \quad \sigma _W = \sigma _r(\pi /2) = a_0 - a_2 + a_4, \quad \sigma _C = \sigma _r(\pi ) = a_0 - a_1 + a_2 - a_3 + a_4 \]
in which the subscripts B, W and C correspond to the invert (bottom), waist (side) and crown (top) of the cavity respectively.
The vertical (\(\sigma _v\)) far-field stress at the depth of the cavity center (H) can be estimated as the average between the radial stress at the crown and invert of the cavity. Similarly, the horizontal (\(\sigma _h\)) far-field stress at the depth H can be estimated as the radial stress at the waist of the cavity. From there, assuming that the depth of the cavity is known and that the stress state follows geostatic conditions, the unit weight of the material and the lateral earth pressure coefficient can be estimated too, according to the following equations:
\[ \sigma _v = \frac{\sigma _C + \sigma _B}{2}, \quad \sigma _h = \sigma _W, \quad \gamma = \frac{\sigma _v}{H}, \quad K_o = \frac{\sigma _h}{\sigma _v} \]
The estimation of the far-field stresses (respectively, soil unit weight and lateral earth pressure coefficient) is assessed against the actual values set as input in the FEM simulations in Fig. 6 (respectively, Fig. 7).
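The chain from wall stresses to far-field estimates can be sketched as follows. Several points are assumptions rather than the authors' procedure: the series is read as an order-4 cosine sum with coefficients \(a_0\) to \(a_4\); the coefficients are obtained by a trapezoid-weighted cosine projection on the 37 equally spaced nodes (the text does not say how the fit was computed); and the unit weight and \(K_o\) are taken as \(\sigma _v/H\) and \(\sigma _h/\sigma _v\). Function names are hypothetical.

```python
import math


def fit_cosine_series(sigma_r, order=4):
    """Coefficients a_0..a_order of sum_j a_j cos(j theta), theta from 0 to pi.

    sigma_r holds the wall stress at equally spaced nodes from invert to crown.
    Uses the discrete orthogonality of cosines with half weights at both ends
    (valid for order smaller than the number of intervals).
    """
    n = len(sigma_r) - 1
    coeffs = []
    for j in range(order + 1):
        total = 0.0
        for k, stress in enumerate(sigma_r):
            weight = 0.5 if k in (0, n) else 1.0
            total += weight * stress * math.cos(j * math.pi * k / n)
        coeffs.append(total / n if j == 0 else 2.0 * total / n)
    return coeffs


def far_field_estimates(coeffs, depth_m):
    """Far-field stresses, unit weight and K_o from the fitted coefficients."""
    sigma_b = sum(coeffs)                                               # theta = 0
    sigma_w = sum(a * math.cos(j * math.pi / 2) for j, a in enumerate(coeffs))
    sigma_c = sum(a * (-1) ** j for j, a in enumerate(coeffs))          # theta = pi
    sigma_v = 0.5 * (sigma_b + sigma_c)
    sigma_h = sigma_w
    return {"sigma_v": sigma_v, "sigma_h": sigma_h,
            "unit_weight": sigma_v / depth_m, "k_o": sigma_h / sigma_v}


# Synthetic wall stress (kPa) for illustration: mean 75, gradient 10, biaxial part 25.
thetas = [math.pi * k / 36 for k in range(37)]
wall_stress = [75.0 + 10.0 * math.cos(t) + 25.0 * math.cos(2 * t) for t in thetas]
print(far_field_estimates(fit_cosine_series(wall_stress), depth_m=5.0))
```

In this synthetic case the odd coefficient \(a_1\) carries the vertical stress gradient and cancels when the crown and invert are averaged, which is why the average recovers the stress at the depth of the cavity center.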
Lastly, the gradient of stresses present under geostatic stress conditions causes a lack of symmetry of the radial response between the top and bottom of the cavity. In order to compare and quantify the symmetry (or lack thereof) we define the coefficient of symmetry \(S^2\) as follows:
where \(\sigma _r(\theta )\) is the radial stress at an orientation \(\theta \) around the cavity wall, and \(\bar{\sigma _r}\) is the mean radial stress across all the orientations between \([0, \pi ]\).
\(S^2\) is similar to the coefficient of determination \(R^2\) and measures the difference between the radial stress at the top (waist to crown) and the bottom (waist to invert) of the cavity, and compares it to the difference between the entire radial stress distribution at the cavity wall and its average value. Similar to \(R^2\), the coefficient \(S^2\) has a maximum value of 1 when there is perfect symmetry between both sides of the distribution. Figure 8 shows the distribution of the values of \(S^2\) as a function of the depth of the cavity H and the lateral stress coefficient \(K_o\) (variables which had the highest influence on the symmetry index).
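One possible reading of this definition is sketched below. The exact expression for \(S^2\) is not reproduced here, so the form used in the code is an assumption: one minus the squared mismatch between mirrored orientations (\(\theta \) and \(\pi - \theta \)), divided by the squared deviation of the whole distribution from its mean. The function name is hypothetical.

```python
import math


def symmetry_coefficient(sigma_r):
    """Assumed form of S^2 for wall stresses sampled from invert (0) to crown (pi).

    Returns 1.0 when the lower half (invert to waist) mirrors the upper half
    (crown to waist) exactly, and smaller values as the mismatch grows.
    """
    n = len(sigma_r)
    mean = sum(sigma_r) / n
    mismatch = sum((sigma_r[k] - sigma_r[n - 1 - k]) ** 2 for k in range(n // 2))
    spread = sum((stress - mean) ** 2 for stress in sigma_r)
    if spread == 0.0:
        return 1.0  # a uniform distribution is trivially symmetric
    return 1.0 - mismatch / spread


thetas = [math.pi * k / 36 for k in range(37)]
biaxial = [75.0 + 25.0 * math.cos(2 * t) for t in thetas]
shallow = [75.0 + 10.0 * math.cos(t) + 25.0 * math.cos(2 * t) for t in thetas]
print(symmetry_coefficient(biaxial), symmetry_coefficient(shallow))
```

In this synthetic example, adding a \(\cos \theta \) term (a stand-in for the vertical stress gradient) is what pulls the coefficient below 1, consistent with the depth dependence discussed in the text.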
Results from Fig. 8 show that the depth of the cavity has a significant effect on the symmetry of the response. This was expected, since the stress gradient across the height of the cavity is significant for shallow cavities, and reduces in importance as the depth of the cavity increases. In fact, models in which the depth was more than 15 times the cavity diameter (15 m) had \(S^2 \ge 0.95\), suggesting that at such depths, the assumption of biaxial stress conditions may be acceptable. However, as the depth of the cavities decreases (i.e., the cavities are shallower), there is a large variability of the symmetry of the response, which is controlled by the value of the lateral earth pressure coefficient. Low values of \(K_o\) display a higher symmetry of the pre-expansion stress distribution due to a larger difference between the magnitude of the vertical and horizontal far-field stress.
Estimation of MC friction angle \(\phi \)
To back-calculate the MC friction angle \(\phi \) of the material, we used the eight ML models described in Table 2. As explained above, the models use known parameters (cavity depth H, soil density d, and the radial stress distribution at the cavity wall \(\sigma _r(r=D_c/2,\theta ,\epsilon _r)\)) to predict the MC friction angle \(\phi \). As the dependent variable \(\phi \) is a continuous value, we use the MAE to measure the performance of the different models (see Eq. 2).
The MAEs during training, validation (tuning of hyperparameters) and testing are reported in Table 3. Results show that increasing the complexity of the models results in better performance. For instance, with the same input parameters, fully connected neural networks perform better than linear regression models, and CNNs outperform fully connected neural networks across the different evaluation subsets. The largest performance improvement occurs when CNN models are included, which suggests that the nonlinear spatial relationships across cavity locations and simulation steps are critical to accurately predict the MC friction angle \(\phi \). The resulting test MAE of the CNN model is \(0.49^\circ \). This means that, on average, the estimated friction angle \({\hat{\phi }}\) is within 0.5 degrees of the true value \(\phi \).
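For readers less familiar with the metric, the following minimal sketch shows how a mean absolute error could be computed for a set of friction angle predictions. The function name and the numerical values are illustrative assumptions and are not taken from the simulations of this study.

```python
def mean_absolute_error(y_true, y_pred):
    """Average of |y_true - y_pred| over all samples."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Illustrative friction angles in degrees (not data from the study)
phi_true = [30.0, 35.0, 40.0]
phi_pred = [30.4, 34.5, 40.6]
mae = mean_absolute_error(phi_true, phi_pred)
print(f"MAE = {mae:.2f} degrees")
```

Because the MAE keeps the units of the predicted variable, a value of about 0.5 can be read directly as an average error of half a degree.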
Interestingly, using linear interpolation to modify the input stress field images does not improve the CNN performance. The interpolation modifies the grey scale images obtained from \(\sigma _r\) in such a way that every row number (orientation around the cavity) and every column number correspond to the same values across simulations. This suggests that the performance of the model is controlled by the ‘shape’ of the radial stress distribution, not the magnitude of the expansion.
Estimation of Young’s modulus E
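The resampling step described above could be sketched as follows. This is a minimal illustration that assumes each image row holds the radial stress at one orientation and each column corresponds to one radial strain level; the helper names `interp_linear` and `resample_image`, and the sample values, are hypothetical and do not reproduce the routine used in this study.

```python
from bisect import bisect_right


def interp_linear(x_new, xs, ys):
    """Piecewise-linear interpolation of (xs, ys) at x_new.

    xs must be strictly increasing; values outside the range are clamped.
    """
    if x_new <= xs[0]:
        return ys[0]
    if x_new >= xs[-1]:
        return ys[-1]
    k = bisect_right(xs, x_new)  # xs[k - 1] <= x_new < xs[k]
    weight = (x_new - xs[k - 1]) / (xs[k] - xs[k - 1])
    return ys[k - 1] + weight * (ys[k] - ys[k - 1])


def resample_image(image, strains, common_strains):
    """Resample every row of a stress image onto a shared strain grid.

    image[i][j] is the stress at orientation i and strain strains[j].
    """
    return [[interp_linear(e, strains, row) for e in common_strains]
            for row in image]


# One simulation sampled at uneven strain levels (illustrative values)
strains = [0.0, 0.1, 0.3]
image = [[100.0, 110.0, 130.0],
         [200.0, 220.0, 260.0]]
common = [0.0, 0.1, 0.2, 0.3]
resampled = resample_image(image, strains, common)
print(resampled)
```

After resampling, a given column index refers to the same strain level in every simulation, so differences between images reflect the stress response rather than the sampling of the expansion.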
In this section, we apply some of the models used in Sect. 4.2 to infer the value of the Young’s modulus (E) of the soil mass without further training, hence testing the generalizability of the framework.
We introduce two scalar parameters, \(M_w\) and \(M_c\), which correspond to the slope of the radial stress/strain curve between the initial radial stress and the stress at a radial deformation \(\varepsilon _r=0.1\%\), measured according to the following equation:
\[ M_j = \frac{\sigma _r(\theta _j, \varepsilon _r = 0.1\%) - \sigma _r(\theta _j, \varepsilon _r = 0)}{0.1\%} \]
where the subscript j corresponds to the orientations at the waist (\(M_w\)) and crown (\(M_c\)). We applied the data processing pipeline (Sect. 3.1) and the model architectures of the linear models (Sect. 3.2) and the CNN (Sect. 3.4) to predict E, this time with the addition of \(M_w\) and \(M_c\), which are calculated for each simulation and fed to our CNN models together with H and d before the fully connected layers (Fig. 4). \(M_w\) and \(M_c\) are intuitive measures of the stiffness of the material around the cavity, and their inclusion significantly improved the performance of the prediction models. The Mean Absolute Percentage Error (MAPE) is calculated as:
\[ \text {MAPE} = \frac{100\%}{n} \sum _{i=1}^{n} \left| \frac{E_i - {\hat{E}}_i}{E_i} \right| \tag{8} \]
where n is the number of samples in the evaluated subset.
The ML experiment results are summarized in Table 4, where we see a pattern of model performance similar to that of the estimation of the MC friction angle \(\phi \). It is worth noting that in this experiment we use a slightly different metric to measure the performance of the predictor. The MAPE (Eq. 8) is a relative measure of the error between the predicted Young’s modulus \({\hat{E}}\) and its true value E. Therefore, the performance scores in Table 4 have a different scale than the scores in Table 3. The performance of the models is consistent with our findings in Table 3: (1) The CNN model yields the best prediction performance; (2) The inclusion of the stress field distribution \(\sigma _r\) improves the accuracy of the predictions of E; (3) The MAPE of the estimation is under \(2\%\).
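The two quantities used in this section could be evaluated along the following lines. This is a minimal sketch that assumes the slope is a secant taken between zero radial strain and \(\varepsilon _r=0.1\%\) (expressed as the fraction 0.001); the function names and all numerical values are illustrative assumptions rather than results of this study.

```python
def stiffness_slope(sigma_initial, sigma_at_strain, strain=0.001):
    """Secant slope of the radial stress/strain curve.

    sigma_initial: radial stress before expansion at one orientation.
    sigma_at_strain: radial stress at the given radial strain
    (0.001, i.e. 0.1 %, by default). The result has units of stress.
    """
    return (sigma_at_strain - sigma_initial) / strain


def mean_absolute_percentage_error(y_true, y_pred):
    """Mean of |y_true - y_pred| / |y_true|, expressed in percent."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((t - p) / t) for t, p in zip(y_true, y_pred))
    return 100.0 * total / len(y_true)


# Illustrative values only: stresses in kPa, moduli in MPa
m_w = stiffness_slope(50.0, 70.0)  # waist orientation
m_c = stiffness_slope(40.0, 55.0)  # crown orientation
e_true = [20.0, 50.0, 100.0]
e_pred = [20.4, 49.0, 101.0]
mape = mean_absolute_percentage_error(e_true, e_pred)
print(m_w, m_c, f"MAPE = {mape:.2f} %")
```

Because the MAPE normalises each error by the true value, it remains comparable across soils whose moduli differ by an order of magnitude, which is not the case for an absolute metric such as the MAE.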
Influence of noise in ML models
The ML models presented in Sects. 4.2 and 4.3 are trained and tested with ‘clean’ data obtained from FEM simulations. However, to assess the applicability of this method to data acquired from field measurements, we test the influence of noise on the accuracy of the CNN models (which yield the highest accuracy).
Gaussian noise, i.e., random values sampled from a normal distribution N(0, std) with zero mean and standard deviation std, is added to each pixel of the stress field images \(\sigma _r\). Ten different levels of noise are tested by increasing std from 0.1 to 1 (\(std = 0.1, 0.2, ..., 1.0\)). Resulting pixel values that fall below or above the normalized range [0, 1] are set to 0 and 1, respectively.
Resulting datasets \((\sigma _r)_{std} = \sigma _r + N(0,std)\) are used to train, validate and test the CNN models that predict \(\phi \) and E. The accuracy of each model, in terms of the MAPE of the estimation of E and the MAE of the estimation of \(\phi \), for each value of std, are summarized in Fig. 9.
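The noise injection described above can be sketched as follows, assuming that the stress images are already normalized to [0, 1] and stored as nested lists. The function name `add_gaussian_noise`, the fixed seed, and the small sample image are assumptions introduced for illustration only.

```python
import random


def add_gaussian_noise(image, std, rng):
    """Add N(0, std) noise to every pixel and clip the result to [0, 1]."""
    return [[min(1.0, max(0.0, pixel + rng.gauss(0.0, std)))
             for pixel in row]
            for row in image]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
clean = [[0.0, 0.25, 0.5],
         [0.75, 1.0, 0.5]]  # illustrative normalized stress image

# Ten noise levels: std = 0.1, 0.2, ..., 1.0
noise_levels = [round(0.1 * k, 1) for k in range(1, 11)]
noisy = {std: add_gaussian_noise(clean, std, rng) for std in noise_levels}
print(noisy[0.1])
```

Note that, for the larger values of std, a substantial fraction of the pixels ends up clipped at 0 or 1, so the effective perturbation of the image does not grow in proportion to std.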
The estimation of \(\phi \) is more sensitive to noise: its MAE increases monotonically from \(0.48^{\circ }\) (with no noise added) to \(2.7^{\circ }\) with \(std = 1.0\). The MAPE of the estimation of E increases from \(1.84\%\) (with no noise) to about \(2.3\%\) for \(std \ge 0.2\), after which the error appears to plateau.
The perceived lower sensitivity to noise of the estimation of E is partially attributed to the fact that the MAPE is a percentage metric, as opposed to the MAE used for \(\phi \). In addition, the high performance of the model suggests that the extra features \(M_w\) and \(M_c\) improve the robustness of the model, offsetting the influence of std.
Conclusions
This manuscript presents a novel framework that couples FEM and ML to back-calculate the far-field stresses and soil properties from the radial stress field (\(\sigma _r\)) at the wall of a circular cavity during displacement-controlled expansion under non-negligible geostatic stress gradients. We exploited the resemblance between the initial radial stress distribution around the cavity and Fourier-like functions to propose a simple, yet accurate way to back-calculate the far-field stresses around the cavity. Moreover, such a simple representation of the initial stress distribution (controlled by 4 scalar parameters) can also be used to assess the existence of geostatic or biaxial stress conditions, based on the symmetry (or lack thereof) of the stress distributions between the top and bottom parts of the cavity. We then evaluated eight different ML models of increasing complexity in order to find the simplest, most interpretable method that can accurately back-calculate the MC friction angle of the material around the cavity (\(\phi \)). Results show that using image-like data that encode the radial stress distribution as a function of radial deformation and orientation angle significantly improves the accuracy of the models. Interestingly, we observed that the power of the encoded images to predict \(\phi \) is controlled by the ‘shape’ of \(\sigma _r\) rather than by its magnitude. Lastly, we tested the flexibility of the prediction framework used to back-calculate \(\phi \) by reusing it to predict the Young’s modulus of the material (E). Having a single, flexible framework that can consistently back-calculate different soil properties significantly reduces the time that must be invested in data preparation and model selection. The best-performing model, a convolutional neural network that uses \(\sigma _r\) and intuitive cavity stiffness parameters (\(M_w\), \(M_c\)), has a mean absolute percentage error (MAPE) under \(2\%\).
Although there is no practical way to interpret the mechanisms occurring inside the neural network, we hypothesise that the encoded images from \(\sigma _r\) and the parameters \(M_w\), \(M_c\) provide ‘shape’ and magnitude information respectively, which are then combined to estimate the Young’s modulus of the material. The CNN models proposed in this study were found to have little sensitivity to Gaussian noise: the MAE of \(\phi \) (respectively, the MAPE of E) increases from \(0.5^\circ \) (respectively, from 1.84%) without noise to \(2.75^\circ \) (respectively, to 2.3%) with noise of unit standard deviation. This relative insensitivity to noise is indicative of the robustness of the proposed CNNs.
The merits of the current study are not limited to the particular problem of cavity expansion nor to the use of machine learning algorithms in geotechnical problems. The current study can indeed be used as a guide to generate, prepare and transform mechanical data (i.e., radial stress distribution \(\sigma _r\)) for use as training data in ML algorithms. To that end, we have used different fitting techniques and prediction models, all informed by a mechanical understanding of the problem at stake and by intuitive choices that may improve the performance of prediction algorithms. Future work stemming from this study includes testing the effect of the less ideal conditions encountered in physical experiments and validating the methods presented in this study against actual experimental and/or field data. Another possible extension of this work is the exploration of other applications where no analytical solutions exist yet. Still, as with any other study that applies data-science-based methods to mechanical problems, careful guiding, interpretation and validation of the results are fundamental.
Data availability
The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.
Acknowledgments
This material is based upon work supported by the National Science Foundation under Grant No. 1935548. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation. Funding was provided by UKRI NERC grant NE/T010983/1.
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Patino-Ramirez, F., Wang, Z.J., Chau, D.H. et al. Back-calculation of soil parameters from displacement-controlled cavity expansion under geostatic stress by FEM and machine learning. Acta Geotech. (2022). https://doi.org/10.1007/s1144002201698z
DOI: https://doi.org/10.1007/s1144002201698z
Keywords
 Cavity expansion
 Finite element method
 Machine learning
 Non-biaxial stress conditions
 Parameter back-calculation