Back-calculation of soil parameters from displacement-controlled cavity expansion under geostatic stress by FEM and machine learning

Estimating soil properties from the mechanical reaction to a displacement is a common strategy, used not only in in situ soil characterization (e.g., pressuremeter and dilatometer tests) but also by biological organisms (e.g., roots, earthworms, razor clams), which sense stresses to explore the subsurface. Still, the absence of analytical solutions to predict the stress and deformation fields around cavities subject to geostatic stress, has prevented the development of characterization methods that resemble the strategies adopted by nature. We use the finite element method (FEM) to model the displacement-controlled expansion of cavities under a wide range of stress conditions and soil properties. The radial stress distribution at the cavity wall during expansion is extracted. Then, methods are proposed to prepare, transform and use such stress distributions to back-calculate the far field stresses and the mechanical parameters of the material around the cavity (Mohr-Coulomb friction angle ϕ\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\phi $$\end{document}, Young’s modulus E). Results show that: (i) The initial stress distribution around the cavity can be fitted to a sum of cosines to estimate the far field stresses; (ii) By encoding the stress distribution as intensity images, in addition to certain scalar parameters, convolutional neural networks can consistently and accurately back-calculate the friction angle and Young’s modulus of the soil.

The study of the distribution of stresses around expanding (or contracting) cavities has many applications in geomechanics, from tunneling and underground exploration to in situ soil characterization and resource extraction. Even though the problem scales are quite different, a commonality between these applications is that they can be modelled considering a cylindrical cavity that is either pressurized (e.g., the pressuremeter test) or deformed (e.g., cone penetration and tunneling). Moreover, cavity expansion mechanisms are not only used for characterization and design by engineers, but also for burrowing purposes by natural organisms. For instance, roots [1], earthworms [3,11], razor clams [45] and sandfish [22] all use expanding cavities for exploration/navigation toward paths of least resistance or maximum nutrient yield, anchoring during excavation and assessing the mechanical stability of tunnel networks. These similarities between engineered and biological systems have sparked a wave of scientific interest in bio-inspiration applied to geotechnics [23]. The recent developments in robotics, in addition to the rise of Machine Learning (ML) applications in underground exploration and soil characterization, have opened the possibility to explore the development of autonomous devices that can burrow and sense underground.
This study focuses on the back-calculation of soil parameters from the radial stress at the wall of circular cavity subject to displacement-controlled expansion, under geostatic stress. To the authors' best knowledge, there is no closed-form solution available at shallow depth (at which, the state of stress may not be considered biaxial). We simulate cavity expansion for various soil properties and depths with a 2D plane-strain Finite Element model. The simulation input parameters, together with the calculated radial stress at the cavity wall during the expansion are used as inputs to ML algorithms to assess the sensitivity of the FEM model, back-calculate the far-field stress and estimate soil constitutive parameters. The main contributions made in this work are the following: • A program was written to automatically launch Finite Element Method (FEM) simulations of displacementcontrolled cavity expansion under a wide range of stress conditions and soil properties, and extract the radial stress distribution at the cavity wall during expansion.
The results of the FEM simulations constitute a database that can be used for input in a ML-based algorithm. • Methods are proposed to prepare, transform and use such stress distributions to back-calculate the far field stresses and the mechanical parameters of the material around the cavity (Mohr-Coulomb friction angle /, Young's modulus E). • The performance of eight ML techniques in estimating the soil's friction angle is assessed, and then, the same ML models are tested to predict the soil's Young's modulus. It is shown that similar convolutional neural networks can accurately back calculate the friction angle and the Young's modulus of the material. • The combination of FEM and ML is expected to advance the technology of autonomous devices that can burrow and sense underground, considering that such devices could self-anchor and/or sense through the displacement controlled expansion of an embedded cavity.
The paper is structured as follows. We first present literature reviews of cavity expansion (Sect. 1.1) and machine learning (Sect. 1.2) applications in geotechnics. Section 2 describes the general workflow and explains the numerical modeling approach. Section 3 provides an overview of the ML algorithms used in this study. Section 4 summarizes our results for the far-field stress, the Mohr-Coulomb friction angle and the Young's modulus, and discusses the performance of the ML models tested. Section 5 concludes the manuscript summarizing the findings, limitations and proposing possible directions of future work.
1 Related work

Cavity expansion in geotechnics
Cavity expansion is an active field of research in geomechanics given its applications to in situ testing [18,20], resource withdrawal and tunneling [2,46]. For applications that deal with vertical cavities at relatively large depths, the state of stress can be reduced to plane strain (no change in strain along the body of the cavity) and in-plane isotropic stress conditions. Under such conditions, the expansion problem can be reduced to a one-dimensional (radial) problem since both the distribution of stresses and the deformation of the cavity are radially isotropic [48]. However, for horizontal (or inclined) cavities, or vertical cavities under particular tectonic or sedimentation conditions, the in-plane state of stress may not be isotropic (for instance, the state of stress may be biaxial or geostatic). Therefore, the displacement-controlled expansion of a circular cavity yields a non-isotropic stress distribution, and similarly, the pressure-controlled expansion of a circular cavity generates a non-isotropic displacement of the cavity wall-increasing significantly the complexity of the analytical solutions. The analytical solutions that give the closest stress estimates in such non-isotropic plane stress conditions are those developed for tunneling applications and in situ testing [48], for instance to interpret the pressuremeter test (PMT) [47], the cone penetration test (CPT) [35,36] and the dilatometer test (DMT) [53]. Analytical solutions were also established to predict rock or soil behavior around vertical extraction wells relevant to hydraulic fracturing and geothermal foundations, and around horizontal shafts for deep tunneling, micro-tunneling and horizontal directional drilling (HDD) [12].
Still, the solutions mentioned above are developed for infinite domains, e.g., the size of the cavity is relatively small (or deep) compared to the dimensions of the domain it is embedded in. Therefore, there is no effect of free surfaces, and the vertical and horizontal far field stresses can be considered constant along the cavity, which results in isotropic or biaxial stress conditions. When analyzing the problem of shallow cavities, these assumptions do not necessarily hold true, since the gradient of stresses around the cavity becomes significant and the geostatic stress field can no longer be modelled as a biaxial state. Furthermore, experimental studies on shallow cavities such as [16,31] and limit analysis formulations such as [16], have shown that the failure mechanism of shallow horizontal cavities is dominated by the development of shear planes that follow a catenary/parabolic shape, resembling a passive trapdoor mechanism, see [7,40,42].
These considerations on the mechanical response of the cavity increase significantly the complexity of the mathematical problem, for which, to the authors' best knowledge, there is no available closed-from analytical solution. For this reason, we use the FEM to capture the response of cavities during expansion. We then test ML algorithms in order to estimate the far-field stress and soil parameters from the response of the model. Section 1.2 presents an overview of other applications of ML in geotechnics.

Machine learning applications in geotechnics
The use of ML in geotechnics has seen exponential growth over the last decade [50]. ML has indeed proven to be a useful tool to provide estimations of soil properties and/or limit loads for geotechnical applications where no closedform analytical solutions exists. ML algorithms do not replace analytical solutions; they are comparable to empirical equations and correlations. For a given application, there is usually no consensus on which ML algorithm or data preparation methodology works best, and therefore some authors have written reviews that compare the advantages/limitations of several approaches. For instance, Lary et al. [17] reviewed applications of ML in geo-sciences and remote sensing, Zhang et al. [51] summarized applications of ML in the constitutive modeling of soils, and Wang and Sun [50] reviewed applications targeted toward modeling of soil properties. Although the range of applications is wide, most studies can be grouped into three categories: i) Estimation of mechanical properties of a system for specific loading conditions and soil type, ii) Estimation of a set of design parameters (e.g., limit load, factor of safety) from the response of a soil to stimuli and iii) Generation and/or calibration of constitutive models. The third category, which goes beyond the scope of this paper, is concerned with the creation, validation and tuning of constitutive models, as shown in the reviews presented in [43,44,51].
Estimation of mechanical properties typically resorts to either numerical models or field/lab data as the training set. Material mechanical properties are set as the output so as to create models that can later predict soil parameters. Applications using lab/field data include the prediction of shear strength in cohesive soils using neural networks (NN) from experimental data [33], the estimation of Cam-clay parameters [26], and the estimation of the over consolidation ratio (OCR) from piezocone penetration tests [15]. Studies in which numerical models are used to generate training data include the combination of NN and gradientdescent to identify constitutive model parameters from self-boring pressuremeter tests [27], and the use of the output of FEM simulations of piezocone penetration as the training data of a NN in order to estimate soil parameters [28].
Estimation of design parameters usually aims to calculate a factor of safety or a limit load, using either field data or numerical results as training sets. Some of the studies that use field and lab data include the work of Sulewska [39], which discusses six different applications of NN to predict displacements (settlement, consolidation) and the limit load (bearing capacity) in different applications. Khatibi et al. [13] estimated the shear wave velocity in incomplete data sets achieving good agreement between predicted and measured downhole pressures, which helped assessing wellbore stability. Zhang et al. [49] used data from a tunneling project in Changsha city (China) to predict surface settlement. Samui and Sitharam [37] trained a model that can predict the liquefaction susceptibility of a site. Lu et al. [19] estimated the pullback force during horizontal directional drilling using a genetic algorithm combined with support vector machines (SVM). Kardani et al. [10] estimated the bearing capacity of piles, using field data as the training set. When numerical models are used to generate training data, the numerical model maps the inputs, i.e., the geometry and loading conditions of the problem, to the desired output, typically a factor of safety or a limit load. Numerical models may be computationally expensive and/or rely on advanced/licensed software, making them inconvenient for fast estimations. ML models trained on numerical simulations are fast, reliable alternative methods that allow quick estimations. For instance, Makasis et al. [21] generated a model to design thermal piles. He et al. [8] focused on a reliability analysis of spatially variable slopes, training a ML model on relatively few numerical simulations, and showing the ability of the model to produce results comparable to those of the complex numerical model.

Numerical model
The finite element models used for the present study are built using Abaqus software. The 2D, plane strain model approximates the cross section of a cylindrical cavity, which is assumed to be long in relation to its diameter so that plane strain conditions hold. Due to the symmetry of the problem with respect to the vertical axis, and in order to reduce computational cost, only half of the domain is modelled (see probing instruments, usually in the range of tens of centimeters, and the range of infrastructure construction devices such as tunneling machines. The depth of the cavity (H) is measured from the surface to the center of the cavity and is varied between 5 and 50 times the cavity diameter (5 to 50 m). The height of the model (H m ) was set to H m ¼ H þ 25D c , while its width (W m ) was set to W m ¼ 1:5H m -these dimensions were chosen according to [54] to avoid boundary effects. Each model was created and meshed automatically using a Python routine that exploits the scripting capabilities of Abaqus. The model was split into two sections prior to meshing. A first region concentric to the cavity with a diameter of 6D c was assigned 4-node bi-linear elements arranged in a structured mesh with 36 elements around the cavity (see Fig. 1c. A second region covering the rest of the domain was assigned 3-node linear elements arranged in a non-structured mesh. The number of elements in the mesh ranged between 1, 415 and 4, 340 depending on the depth of the cavity (see

Model input parameters
We used the Mohr-Coulomb (MC) soil constitutive model because of its simplicity and flexibility. After creating the geometry of the model, soil properties were assigned to the elements of the mesh, including: soil density (d), Young's modulus (E), Poisson's ratio (m), Mohr-Coulomb (MC) friction angle (/) and MC dilation angle (w). In the following sections, we refer to the soil density (d) and unit weight (c) of the material interchangeably, keeping in mind that these two parameters are linearly related by where g is the gravity acceleration. All the elements of the mesh were assigned the same soil properties, the values of which were selected randomly from a uniform distribution spanning ranges suggested in [14,40] for frictional soils (silty sands to gravels)-see Table 1. The distributions of the different variables were assumed independent, which allowed us to ignore correlations between soil parameters.
A fixed value of MC cohesion (c) of 5 kPa was used in every simulation in order to improve the convergence of the models. Although there is no risk of achieving a state of stress at a corner of the MC yield surface in plane strain, the simulations were run with the Drucker-Prager (DP) model, using a matching smooth yield surface (e.g., [4,9,25,38,52]). The Python scripts developed to sample  constitutive parameters for the present study can thus be used as they are for future 3D simulations.

Simulation procedure
The computations were performed in two steps. First, the displacement of the cavity wall was fixed to zero and a geostatic stress step was used to create a linear stress gradient controlled by the unit weight of the material and increasing in the direction of gravity (see Fig. 1a, i.e., from the free surface (zero vertical stress) to the bottom of the simulation domain. The resulting horizontal stress was consistent with the conditions of the soil at rest. We verified that the lateral earth pressure coefficient was K o ¼ 1 À sinð/Þ from the simulation results after the geostatic step, which Abaqus performs without inducing deformations to the model. The density and MC friction angle of the material are two of the variables assigned randomly in the models, and therefore, our set of simulations covers a wide range of pre-expansion stress conditions at comparable cavity depths. The expansion of the cavity was simulated in a second step, displacing the nodes around the cavity radially away from the cavity center at a constant rate, keeping the cavity circular. The maximum radial expansion was set to a large value (10D c ) in order to push the simulation to the maximum expansion of the cavity before it fails. A total of 1,500 simulations were performed using a machine with a Dual Intel Xeon Gold 6148 (2.4 GHz, 3.7 GHz turbo) processors, and each simulation was run in serial instances of Abaqus Standard (implicit). In average, each simulation took about 5 minutes to complete for a total of about 125 hours of computing time.
Since the objective of the study is to test whether the data captured at the wall of the cavity during expansion can be used to infer soil behavior, we used the radial stress at each orientation around the cavity and at every value of radial expansion/deformation (r r ðr ¼ D c =2; h; r Þ) as a data set that can be given as input to the ML algorithm to back-calculate soil properties. Examples of r r ðr ¼ D c =2; h; r Þ data sets obtained by FEM simulation are shown in Fig. 2.

Machine learning algorithms
In our study, we focus on predicting the Mohr-Coulomb (MC) friction angle (/) and the Young's modulus (E). We assume that the far field stresses (r v ; r h ; K o ), cavity depth (H), and soil density (d) are known, as they can be inferred from the initial stress distribution (see Sect 4.1). Therefore, we used them together with the radial stress distribution during expansion (r r ðr ¼ D c =2; h; r Þ) as predictive variables of the tested models.
ML models with higher complexity tend to be more accurate, but less interpretable [34]. Besides accuracy, interpretability is also important in our study, as we are interested in inferring the relationship between cavity expansion and soil properties. Therefore, in order to find the simplest model that can capture the mechanism, we trained and evaluated 8 different machine learning approaches with increasing complexity (Table 2). In the first model, we used a mean score predictor as our naïve baseline model. This model simply uses the average of / in the training set (with n samples) to predict every sample j in the test set, as follows:

Data processing
We generated all our data through FEM simulations (See Sect. 2.2). For each simulation, H, d, / and E are scalar values; where n is the number of radial expansion steps in the FEM simulation (discretized values of r ), and the 37 columns refer to different orientations around the cavity with a spacing of 5 , each one corresponding to a node around the cavity. We note r r i;j the stress (numerical scalar) at node j at step i. We randomly split 1,363 simulation results into a training set with 817 simulations, a validation set with 273 simulations, and a test set with 273 simulations (6:2:2 ratio). We used the validation set to choose the optimal hyperparameters for some models (e.g., the regularization power in the linear regression model); we also monitored the performance of CNNs on the validation set during training and interrupted the training process when the performance stopped to improve in 20 consecutive epochs-common workflow to avoid overfitting [5]. Lastly, we evaluated ML algorithm performance by comparing the predictions of the ML models with the actual values calculated by FEM in the test set. We used the same dataset split to train and evaluate all 8 ML models.
The distribution of H, d, and / are shown in Fig. 3. For all models, we standardized H and d into a range of [0, 1] by linearly transforming them based on their minimums and maximums in the training set. For the stress field matrices r r , we explored different featurization methods. For CNNs, we treated r r as 2D gray-scale images where the stress field value at orientation j and expansion step i can be thought as a pixel value. CNNs are designed for image analysis-with convolution transformations, CNNs can learn the spatial relationships across pixels [29], which encode cavity locations and simulation steps in our framework. Our hypothesis is that the spatial relationship between cavity location and simulation step is useful for predicting soil properties. Since linear regression models and fully-connected neural networks cannot use 2D matrices as input features, we vectorized these 2D matrices of r r into 1D vectors by stacking matrix rows together (row-major flattening).

Linear models
We used a trivial mean-score predictor as a baseline model. The model computes the average / in the training set, and then directly uses it as prediction of all samples in the validation set and test set. Linear regression is a popular and interpretable approach to study the relationship between soil properties and cavity behavior [50]. We applied Lasso regularization [41] to all of our linear regressions, as it can shrink the coefficients of less predictive features to zero and improve model generalizability. The first regression model only uses two numerical values (cavity depth H and soil density d) as input features, while the second model adds vectorized stress fields r r as an extra feature ( Table 2).
The regularization power k is a hyper-parameter, where larger k imposes stronger regularization. We tune k on the validation set through grid-search using the Python package scikit-learn [32]; we choose the k that gives the smallest Mean Absolute Error (MAE) in our final model. The MAE is defined according to: where n is the size of the dataset,/ is the predicted value of a parameter (e.g., friction angle) and / is the true parameter value, taken as input in the FEM. Models with smaller MAE are more predictive.

Fully-connected neural network
We developed two simple fully-connected neural networks (NN), with two hidden layers each, using the Python package PyTorch [30]. Multi-layer NNs can approximate any continuous functions [6]. Therefore, compared to linear models (Sect. 3.2), NNs are more powerful ML models that can learn nonlinear relationships between the input features and MC friction angle /. Similar to our experimental design for linear models, one NN model only uses two input features: cavity depth H and soil density d, and the second NN uses H, d, and vectorized r r as input (Table 2). The hyper-parameters: hidden neuron numbers, learning rate, and batch size, are tuned on the validation set through grid-search. We chose the combination that yielded the smallest MAE score as our final model.

Convolutional neural network
We developed three CNN models that share a similar model architecture (Fig. 4) but have different featurizations with the Python package PyTorch [30]. In our CNN model, we have six convolutional layers where each layer has six 3 Â 3 kernels. After every 2 consecutive convolutional layers, we insert one max-pooling layer to introduce regularization in the model. Finally, we have two fully-connected layers to nonlinearly transform the image representations into a numerical value. One CNN takes the stress field images with dimension 256 Â 256 generated by resizing and standardizing r r matrices. The second CNN takes these 256 Â 256 images as input, and then standardized H and d are concatenated in the first fully-connected layer (Fig. 4). The third CNN has the same architecture as the second CNN, but for each row (cavity orientation), we linearly interpolated the values at the 37 cavity nodes to have values at every degree. For each column (radial expansion), we linearly interpolated the values in the range 0 to 1% to 100 data points. In that way, we increase the size of our images while keeping a consistent mapping, i.e., each row and column numbers refer to the same orientation and radial expansion across simulations, unlike the first and second CNN where we resized the images without interpolation.

Results and discussion
Our results are organized as follows. In Sect. 4.1, the preexpansion radial stress distribution at the cavity wall is used to estimate the far-field stresses around the cavity and unit weight of the soil. In Sect. 4.2, the radial stress distribution during expansion (r r ) is used to back-calculate the MC friction angle (/), while Sect. 4.3 uses a similar

Estimation of far-field stresses
The pre-expansion radial stress distribution at the cavity wall (r r ðr ¼ D c =2; h; e r ¼ 0Þ) is used to calculate the far field stresses in each simulation. An initial inspection of the initial stress distributions showed that the radial stress could be parametrized as a sum of sinusoidal/cosine signals. In fact, this observation was previously found and exploited by [24] for cavity expansion problems in elasticity. Figure 5 shows an example of the pre-expansion radial stress distribution for two simulations with similar far-field vertical stress r v but different far-field horizontal stress r h . Due to the presence of geostatic stress conditions, the initial radial stress distribution has a period of 2p (one revolution around the circular cavity). We used the following Fourier series to fit the initial stress distribution around the cavity: where h is the orientation angle measured from the bottom of the cavity. By choosing this convention we have r r ðhÞ ¼ r r ðÀhÞ. The order of the fit (j ¼ 4) is the smallest value that could accurately capture the absence of biaxial symmetry in the radial stress distribution. Every fit of the initial stress distribution has coefficients of determination R 2 [ 0:94. Next, from Eq. 3, and knowing the fit coefficients a j for every model, estimations for the radial stress at the invert (h ¼ 0), waist (h ¼ p=2) and crown (h ¼ p) of the cavity were found from the following relations: in which the subscripts B, W and C correspond to the invert (bottom), waist (side) and crown (top) of the cavity respectively. The vertical (r v ) far-field stress at the depth of the cavity center (H) can be estimated as the average between the radial stress at the crown and invert of the cavity. Similarly, the horizontal (r h ) far-field stress at the depth H can be estimated as radial stress at the waist of the cavity. And from there, assuming that the depth of the cavity is known and that the stress state follows geostatic conditions, the unit weight of the material and the lateral earth pressure coefficient can be estimated too, according to the following equations: The estimation of the far-field stresses (respectively, soil unit weight and lateral earth pressure coefficient) is assessed against the actual values set as input in the FEM simulations in Fig. 6 (respectively, Fig. 7). Lastly, the gradient of stresses present under geostatic stress conditions causes a lack of symmetry of the radial response between the top and bottom of the cavity. In order to compare and quantify the symmetry (or lack thereof) we define the coefficient of symmetry S 2 as follows: where r r ðhÞ is the radial stress at an orientation h around the cavity wall, and r r is the mean radial stress across all the orientations between ½0; p. S 2 is similar to the coefficient of determination R 2 and measures the difference between the radial stress at the top (waist to crown) and the bottom (waist to invert) of the cavity, and compares it to the difference between the entire radial stress distribution at the cavity wall and its average value. Similar to R 2 , the coefficient S 2 has a maximum value of 1 when there is perfect symmetry between both sides of the distribution. Figure 8 shows the distribution of the values of S 2 as a function of the depth of the cavity H and the lateral stress coefficient K o (variables which had the highest influence on the symmetry index).
Results from Fig. 8 show that the depth of the cavity has a significant effect on the symmetry of the response. This was expected, since the stress gradient across the height of the cavity is significant for shallow cavities, and reduces in importance as the depth of the cavity increases. In fact, models in which the depth was more than 15 times the cavity diameter (15m) had S 2 [ ¼ 0:95, suggesting that at such depths, the assumption of biaxial stress conditions may be acceptable. However, as the depth of the cavities decreases (i.e., the cavities are more shallow) there is a large variability of the symmetry of the response, which is controlled by the value of the lateral earth pressure coefficient. Low values of K o display a higher symmetry of the  pre-expansion stress distribution due to a larger difference between the magnitude of the vertical and horizontal farfield stress.

Estimation of MC friction angle /
In order to back-calculate the MC friction angle / of the material, we used the eight ML models described in Table 2. As explained above, the models use known parameters: cavity depth H, soil density d, and the radial stress distribution at the cavity wall r r ðr ¼ D c =2; h; r Þ) in order to predict the MC friction angle /. As the dependent variable / is a continuous value, we use the MAE to measure the performance of the different models (see Eq. 2). The MAEs during training, validation (tuning of hyperparameters) and testing are reported in Table 3. Results show that increasing the complexity of the models results in better performance. For instance, with the same input parameters, fully-connected neural networks have better performance than linear regression models, and CNNs outperforms fully-connected neural networks across different evaluation subsets. We see the largest performance improvement when including CNN models, which suggests that the nonlinear spatial relationships across cavity locations and simulation steps are critical to accurately predict the MC friction angle /. The resulting test MAE of the CNN model is 0:49 . This means that in average, the error in the friction angle estimation/ is within 0.5 degrees from the true value /.
Interestingly, using linear interpolation to modify the input stress field images does not improve the CNN performance. The interpolation modifies the grey scale images obtained from r r , in such a way that every row number (orientation around the cavity) and every column number correspond to the same values across simulations. This fact suggests that the performance of the model is controlled by the 'shape' of the radial stress distribution, not the magnitude of the expansion.

Estimation of young's modulus E
In this section, we apply some of the models used in Sect. 4.2 to infer the value of the Young's modulus (E) of the soil mass without further training, hence testing the generalizability of the framework. We introduce two scalar parameters, M w and M c , which correspond to the slope of the radial stress/strain curve between the initial radial stress and the stress at a radial deformation e r ¼ 0:1%, measured according to the following equation: where the subscript j corresponds to the orientations at the waist (M w ) and crown (M c ). We applied the data processing pipeline (Sect. 3.1), model architectures of linear models (Sect. 3.2) and CNN (Sect. 3.4) to predict E, this time with the addition of M w and M c , calculated for each one of the simulations, and fed to our CNN models together with H and d before the fully-connected layers (Fig. 4). M w and M c are intuitive measures of the stiffness of the material around the cavity and their inclusion significantly improved the performance of the prediction models. The Mean Absolute Percentage Error (MAPE) is calculated as: The ML experiment results are summarized in Table 4, where we see a similar pattern of model performance as in the estimation of MC friction angle /. It is worth noting that in this experiment, we use a slightly different metric to measure the performance of the predictor. The MAPE (Eq. 8) is a relative measurement of error between the predicted Young's modulusÊ compared to its true value E. Therefore, the performance scores in Table 4 have a different scale than the scores in Table 3. The performance of the models resonates with our findings in Table 3: (1) The CNN model yields the best prediction performance; (2) The inclusion of the stress field distribution r r improves the accuracy of the predictions of E; (3) The MAPE of the estimation is under 2%.

Influence of noise in ML models
The ML models presented in Sects. 4.2 and 4.3 are tested and trained with 'clean' data obtained from FEM simulations. However, to test the applicability of this method to data acquired from field measurement we test the influence of noise in the accuracy of the CNN models (which yield the highest accuracy).
Gaussian noise, i.e., random values sampled from a normal distribution N(0, std) with zero mean and standard deviation std, are added to each pixel of the stress field images r r . Ten different levels of noise are tested by increasing std from 0.1 to 1 (std ¼ 0:1; 0:2; :::; 1:0). Resulting pixel values outside the normalized range [0, 1] are assigned values of 0 and 1 respectively.
Resulting datasets ðr r Þ std ¼ r r þ Nð0; stdÞ are used to train, validate and test the CNN models that predict / and E. The accuracy of each model, in terms of the MAPE of the estimation of E and the MAE of the estimation of /, for each value of std, are summarized in Fig. 9.
The estimation of / is more sensitive to noise, increasing monotonically from a MAE of 0:48 (with no noise added), to 2:7 with std ¼ 1:0. The MAPE of the estimation of E increases from 1:84% (with no noise) to about 2:3% for std ! 0:2, after which the estimation appears to plateau.
The perceived lower sensitivity to noise of the estimation of E is partially attributed to the fact that the MAPE is a percentage metric, as opposed to the MAE used for /. In addition, the high performance of the model suggests that the extra features M w and M c improve the robustness of the model, offsetting the influence of std.

Conclusions
This manuscript presents a novel framework that couples FEM and ML to back-calculate the far-field stresses and soil properties from the radial stress field (r r ) at the wall of a circular cavity during displacement-controlled expansion under non-negligible geostatic stress gradients. We exploited the resemblance of the initial radial stress distribution around the cavity and Fourier-like functions to propose a simple, yet accurate way to back-calculate farfield stresses around the cavity. Moreover, such simple representation of the initial stress distribution (controlled by 4 scalar parameters), can also be used to assess the existence of geostatic or biaxial stress conditions, based on the symmetry (or lack thereof) of the stress distributions between the top and bottom parts of the cavity. We then evaluated eight different ML models of increasing complexity in order to find the simplest, most interpretable method that can accurately back-calculate the MC friction angle of the material around the cavity (/). Results show that using image-like data that encodes the radial stress distribution as a function of radial deformation and orientation angle significantly improves the accuracy of the models. Interestingly, we observed that the predictive power of the encoded images to predict / is controlled by the 'shape' of r r rather than by its magnitude. Lastly, we tested the flexibility of the prediction framework used to back-calculate /, by re-using it to predict the Young's modulus of the material (E). Having a consistent and flexible framework that can consistently back-calculate different soil properties significantly reduces the time that must be invested in data preparation and model selection.
The best-performing model has a mean absolute percentage error (MAPE) under 2%, using a convolutional neural network that uses r r and intuitive cavity stiffness parameters (M w , M c ). Although there is no practical way to interpret the mechanisms occurring inside the neural network, we hypothesise that the encoded images from r r and the parameters M w , M c provide 'shape' and magnitude information respectively, which is then combined to estimate the Young's modulus of the material. The CNN models proposed in this study were found to have little sensitivity to Gaussian noise: the MAE of / (respectively, the MAPE of E) increases from 0.5 o (respectively, from 1.84%) without noise to 2.75 o (respectively, to 2.3%) with noise of unit standard deviation. This relative insensitivity to noise is indicative of the robustness of the proposed CNNs.
The merits of the current study are not limited to the particular problem of cavity expansion nor to the use of machine learning algorithms in geotechnical problems. The current study can indeed be used as a guide to generate, prepare and transform mechanical data (i.e., radial stress distribution r r ) for use as training data in ML algorithms. To that end, we have used different fitting techniques and prediction models, all informed by a mechanical understanding of the problem at stake, and intuitive choices that may improve the performance of prediction algorithms. Future work spanning from this study includes testing the effect of less ideal conditions encountered in physical experiments and validating the methods presented in this study against actual experimental and/or field data. Another possible extension of this work is the exploration of other applications where no analytical solutions exist yet. Still, as with any other study that applies data-science-based methods to mechanical problems, careful guiding, interpretation and validation of the results is fundamental. material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation. Funding was provided by UKRI NERC grant NE/T010983/1.

Data availability
The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.

Declarations
Conflict of interest The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons. org/licenses/by/4.0/.