Introduction

In geotechnical engineering, one of the principal subjects of investigation is the determination of the limit state of the system comprising a shallow foundation and the underlying soil, in terms of the failure load and the respective displacement field. In the corresponding scientific publications this subject is analyzed by considering the physical system as deterministic, starting from the classical ideas of Terzaghi [1]. Publications in later years tackle the estimation of the response of this physical system through simple material laws such as the Coulomb friction law, among others [2,3,4,5,6]. In more recent years, more elaborate constitutive models and more complex FEM or other computational models have been employed to improve the qualitative and quantitative estimation of the response of the footing settlement and the soil mass [7,8,9,10,11,12]. Furthermore, there are publications that investigate the problem under discussion by adopting the random nature of the input variables, starting from the seminal works [13, 14]. Following the evolution of computer science, more elaborate works that consider the stochastic nature of the physical problem have been written [11, 15,16,17,18,19,20,21,22,23,24,25]. By means of a deterministic system one can define the failure mechanism of the soil domain. Subsequently, the failure loads can be calculated and normalized. This investigation has been incorporated into the foundation design laws through the foundation shape factors \(S_i\), where i=q,c,\(\gamma\), which account for the influence of a lateral vertical load, the cohesion of the soil, and the total weight of the geomaterial and the size of the settlement, respectively. In addition, the friction factors \(N_i\) have been proposed to complete the estimation of the limit load of the foundation.
When the physical system is considered uncertain, the influence of the randomness of input material variables, such as Young's modulus and the hydraulic conductivity, can be quantified reliably and rapidly thanks to the vast scientific progress in computational mechanics, materials and computer science. The relevant literature has tackled the spatial variability of the input by generating random fields through the spectral representation method or the Karhunen Loeve series [13,14,15, 17, 18, 20, 23, 25], and by combining a set of deterministic shape functions with nodal random variables [11, 16, 19, 21, 22, 24]. Random input vectors may be generated with independent pseudorandom procedures or with stratified sampling methods such as Latin hypercube sampling (LHS) [26, 27]. Variability estimation in geomechanics and the determination of the limit state of the soil mass have led to reliability analysis of shallow foundations with the aid of the probability density function (PDF) of the failure load and displacement field, in parallel with the randomness of the starting point of the Meyerhoff spline [28,29,30,31].

Machine Learning (ML) methods such as neural networks (NN) have been increasingly employed in all aspects of engineering in recent years, as stated in [32,33,34,35,36]. The influential research of [35] introduced the so-called Physics Informed Neural Network (PINN) and highlighted its advantages in terms of reliability and calculation time. The concept of predicting the behaviour of a collection of continuum bodies without performing the direct analysis, but with a model such as a NN, is of great importance in sciences and engineering for implementing a safe and economic construction. Enriching the NN with experimental or in situ data or with computer simulations has become considerably easier, and as a consequence accuracy can be increased at an affordable computational cost, which is one significant gain of the method. Furthermore, highly integrated deep learning tools such as TensorFlow and PyTorch [37, 38] provide parallel computing capabilities. Constructing and implementing a PINN in these open source programs can yield high efficiency and reliability, which in some problems can exceed those of the conventional finite element method (FEM). Alternative approaches are the eXtended PINNs (XPINNs) introduced in [39], Variational PINNs [40] and Parallel PINNs [41]. In geotechnical engineering this theory has been applied to various problems. Examples include constitutive modelling formulation [42, 43], soil parameter determination [44,45,46], prediction of cohesionless soil liquefaction [47], and infrastructure response, such as tunnels [48,49,50,51,52,53] or the behaviour of a building when a landslide occurs [54, 55]. Regarding shallow foundations, surrogate modelling has been formulated for a caisson foundation located on a cohesionless soil mass [56].
All the referenced literature has led to the collection and amalgamation of a vast amount of data on quantities such as the stresses, the strains, the displacements, the limit load or the failure envelope. These quantities may refer to a certain Gauss point of the total soil mass. This scientific progress can aid engineering design and allow the appropriate decisions to be taken faster and more reliably. By enriching the collected data with detailed analyses or in situ investigation, the algorithmic stability and efficiency of the computed estimations increase.

A collection of FNNs for the prediction of the limit state of shallow foundations on cohesive geomaterials is presented, built from preceding Monte Carlo analyses of the authors [28,29,30,31]. The FNN method was selected for the formulation of the Machine Learning models because, among several alternative methods that were applied, such as Convolutional Neural Networks and the Random Forest Minimization procedure for obtaining the model tuning variables, FNN modelling provided the minimum \(L_2\) error after the final convergence of each model. The material values considered as random variables and random fields are the parameter of elasticity, namely the compressibility factor \(\kappa\), the parameter of plastic evolution, namely the critical state line slope c, and the parameter governing the Darcy flow law, namely the permeability k. The input parameters of these FNN models are the eccentricities along the global X and Y directions over the foundation area, as well as the obliquity angle of the foundation load with respect to the horizontal direction. The output quantities are the axial force, the horizontal and vertical displacement of the settlement and the corresponding rotation at limit state. The loads are static and are applied to non porous and porous soil masses, which leads to the u-p numerical formulation being solved in all load cases. Furthermore, the material constitutive modelling proposed by [57] results in reliable FNNs. A convergence analysis over the training epochs is presented, justifying the quick convergence, the reliability of the model and its adaptivity to be enriched with more data. Smaller values of the ultimate axial load \(N_{ult}\) and greater values of the failure displacements and rotations, which are the most unfavourable situations, are given by the FNNs formed through Monte Carlo simulations with a Karhunen Loeve sum when the pore-soil-fluid interaction is considered.
When the water presence is negligible, larger displacements are estimated when \(\kappa\) is linear along the depth of the soil mass. In this paper, the FNN theory is first presented briefly. The deterministic physical problem and the u-p formulation are described afterwards, along with the material constitutive modelling adopted. Subsequently, the constructed FNNs are given and discussed before the final concluding remarks of the work.

Construction of a feed forward neural network

A feed-forward neural network (FNN) is defined as a collection of interconnected processing units called neurons, organized in an input layer, an output layer and a series of intermediate concealed layers. Consider \(NN^{k_l}: \ \mathbb {R}^{dd_0} \longrightarrow \mathbb {R}^{dd_{k_l+1}}\) to be a FNN with \(k_l\) concealed layers, with the \(h^{th}\) concealed layer incorporating \(nn_h\) neurons, for \(h=1,2,...,k_l\). The numbers of neurons of the input and output layers are, respectively, \(nn_0 =dd_0\) and \(nn_{k_l+1} = dd_{k_l+1}\). To each layer apart from the input a weight matrix and a bias vector, denoted \(\varvec{We}_h\) and \(\varvec{Bi}_h\), respectively, are assigned; these are the parameters of the model. The input is denoted \(\varvec{u}_0 \in \mathbb {R}^{dd_0}\) and the output vector of the \(h^{th}\) layer \(\varvec{u}_h \in \mathbb {R}^{nn_h}\), for \(h=1, 2,..., k_l+1\). A FNN scheme with one concealed layer is depicted in Fig. 1.

Fig. 1

A neural network constructed through the feed forward procedure with one concealed layer

An intermediate layer, h, may be prescribed through the recursive equation:

$$\begin{aligned} \varvec{u}_{h} = D\delta _{h}(\varvec{We}_{h}\varvec{u}_{h-1} + \varvec{Bi}_{h}), \hspace{10pt} \forall h \in \{1,2,...,k_l+1\} \end{aligned}$$
(1)

\(D\delta _h(\cdot )\) is a non-linear relation and is used layer-wise. Subsequently, the application of a FNN may be regarded as a mapping procedure for data of the input \(\varvec{u}_0 \in \mathbb {R}^{dd_0}\) to data of the output \(\varvec{u}_{k_l+1} \in \mathbb {R}^{dd_{k_l+1}}\), with the use of (1).
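As an illustration of the recursion in (1), a minimal NumPy sketch of the forward mapping is given here. The layer sizes, the tanh relation on the concealed layer, the identity relation on the output layer and the random initialization are assumptions made only for this example and are not taken from the models of this work; the names `init_params` and `forward` are hypothetical.

```python
import numpy as np

def init_params(layer_sizes, seed=0):
    """Create one (We_h, Bi_h) pair per layer h = 1..k_l+1 (illustrative initialization)."""
    rng = np.random.default_rng(seed)
    params = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        We = rng.normal(0.0, 1.0 / np.sqrt(n_in), size=(n_out, n_in))
        Bi = np.zeros(n_out)
        params.append((We, Bi))
    return params

def forward(u0, params):
    """Apply u_h = D_h(We_h u_{h-1} + Bi_h) recursively, as in Eq. (1)."""
    u = np.asarray(u0, dtype=float)
    last = len(params) - 1
    for h, (We, Bi) in enumerate(params):
        z = We @ u + Bi
        # Assumption: tanh on concealed layers, identity on the output layer.
        u = z if h == last else np.tanh(z)
    return u

# Three inputs, one concealed layer of 8 neurons, one output (assumed sizes).
params = init_params([3, 8, 1])
output = forward([0.0, 0.0, 1.0], params)
```

The list `params` plays the role of the model parameters \(\varvec{We}_h\), \(\varvec{Bi}_h\); deeper networks follow by extending the list of layer sizes.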

The model parameters are selected through the method known as supervised learning. In this method, the FNN is fed with examples, each consisting of an input value and a corresponding output value called the label, and the parameters are then adjusted so as to minimize the divergence between the target values and the estimated outputs. The divergence is measured through a loss function, \(E(\varvec{We}; \varvec{Bi})\), such as the mean squared error. If a continuous function is considered and a dataset \(\{\varvec{o_1}^{(i)}, \varvec{t_1}^{(i)}\}_{i=1}^{N_d}\) is defined, the loss function is given by:

$$\begin{aligned} E(\varvec{We}; \varvec{Bi}) = \frac{1}{N_d} \sum _{i=1}^{N_d} {|\varvec{u}_{k_l+1}(\varvec{o_1}^{(i)}) - \varvec{t_1}^{(i)}|}^2 \end{aligned}$$
(2)

where \(\{\varvec{o_1}^{(i)}\}_{i=1}^{N_d}\) are the inputs, \(\{\varvec{t_1}^{(i)}\}_{i=1}^{N_d}\) are the targets and \(N_d\) is the number of data pairs.

Since the activation functions are nonlinear, the minimization of (2) is a non-convex problem, and it may only be solved with iterative numerical schemes such as quasi Newton schemes [58] and stochastic gradient methods [59]. The FNN formulation is adopted hereinafter for the derivation of the NNs that estimate the limit axial force and the displacement field of a footing settlement situated in a cohesive soil domain.
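A minimal sketch of the iterative minimization of the loss (2) is given here for a network with one concealed layer. For brevity it uses plain full-batch gradient descent, which is a simpler stand-in for the quasi Newton and stochastic gradient schemes cited above. The data set is synthetic (a smooth toy target, not results of this work), and the layer size, learning rate and number of epochs are assumed values.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data set {o^(i), t^(i)}: a smooth toy target, NOT results of this work.
N_d = 64
O = rng.uniform(-1.0, 1.0, size=(N_d, 3))
T = (np.sin(O[:, 0]) + 0.5 * O[:, 1] * O[:, 2]).reshape(-1, 1)

def init_params(n_in=3, n_hidden=8, n_out=1):
    """Weights stored as (inputs x outputs), i.e. transposed with respect to Eq. (1)."""
    return [rng.normal(0.0, 0.5, size=(n_in, n_hidden)), np.zeros(n_hidden),
            rng.normal(0.0, 0.5, size=(n_hidden, n_out)), np.zeros(n_out)]

def loss_and_grads(params, O, T):
    """Mean squared loss of Eq. (2) and its gradients obtained by backpropagation."""
    We1, Bi1, We2, Bi2 = params
    A1 = np.tanh(O @ We1 + Bi1)            # concealed layer output
    Y = A1 @ We2 + Bi2                     # network output
    R = Y - T                              # residuals
    E = np.mean(np.sum(R**2, axis=1))
    dY = 2.0 * R / len(O)
    gWe2, gBi2 = A1.T @ dY, dY.sum(axis=0)
    dZ1 = (dY @ We2.T) * (1.0 - A1**2)     # derivative of tanh
    gWe1, gBi1 = O.T @ dZ1, dZ1.sum(axis=0)
    return E, [gWe1, gBi1, gWe2, gBi2]

params = init_params()
learning_rate = 0.05                       # assumed value
history = []
for epoch in range(2000):
    E, grads = loss_and_grads(params, O, T)
    history.append(E)
    params = [p - learning_rate * g for p, g in zip(params, grads)]
```

The list `history` records the loss per epoch and can be plotted to inspect convergence in the same spirit as the epoch versus \(L_2\) loss curves discussed later.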

The deterministic physical system. The u-p numerical scheme and the stress strain material law

A loaded cohesive soil mass may exhibit interaction between the soil skeleton and the pore fluid in fully or partly saturated volumes, named porous media. The numerical problems that simulate their behaviour are named porous media problems. The general mathematical system is called the Biot set of partial differential equations. In low frequency load cases, the Biot problem is reduced to a calculation algorithm of lower computational expense. The u-p set of equations comprises the conservation of the total fluid mass and the momentum equilibrium of the soil with Darcian porous behaviour, together with the boundary relations of the soil and the constitutive law of the stresses and strains, and it is a stable calculation scheme compared to the Biot problem. The u-p set of equations is used hereinafter, as static loads are applied to the clayey soil mass. For most structured geomaterials the u-p scheme is appropriate in most of the physical load cases, as proven in [60].

The discrete form of the u-p numerical scheme is obtained with the Galerkin method, and the set of equations is [61, 62]:

$$\begin{aligned} \textbf{M}\ddot{\textbf{x}}+\textbf{C}\dot{\textbf{x}}+\textbf{K}\textbf{x}=\textbf{f} \end{aligned}$$
(3)

The total matrix for stiffness \(\textbf{K}\), matrix for mass \(\textbf{M}\) and matrix for damping \(\textbf{C}\) are given below:

$$\begin{aligned} \textbf{M}=\begin{bmatrix} \mathbf {M_S} &{} \textbf{0} \\ \textbf{0} &{} \textbf{0} \end{bmatrix} \end{aligned}$$
(4)

\(\mathbf {M_S}\) is the consistent mass matrix of the solid skeleton. With \(\rho _d\) the total soil density and \(\mathbf { N^u }\) the shape functions of the displacement field, the mass matrix is calculated as:

$$\begin{aligned} \mathbf {M_S}= & {} \int _{V} \mathbf { N^u } ^T \mathbf { N^u } \rho _d dv \end{aligned}$$
(5)
$$\begin{aligned} \textbf{C}= & {} \begin{bmatrix} \mathbf {C_S} &{} \textbf{0} \\ \mathbf {Q^T_c} &{} \textbf{S} \end{bmatrix} \end{aligned}$$
(6)

\(\mathbf {C_S}\) accounts for solid skeleton damping Rayleigh matrix whereas

$$\begin{aligned} \textbf{K}=\begin{bmatrix} \mathbf {K_S} &{} \mathbf {-Q_c} \\ \textbf{0} &{} \textbf{H} \end{bmatrix} \end{aligned}$$
(7)

\(\mathbf {K_S}\) accounts for solid skeleton stiffness matrix. \(\textbf{E, B}\) account for the elasticity and deformation matrices. Consequently, \(\mathbf {K_S}\) is the well known:

$$\begin{aligned} \mathbf {K_S}=\int _{V} \textbf{ B} ^T \textbf{ E B} dv \end{aligned}$$
(8)

The total matrices also have the following components. The matrix that couples the set of equations is \(\mathbf {Q_c}=\int _{V} \textbf{B}^T \textbf{m} \mathbf {N^p} dv\), where \(\textbf{m}\) stands for the unity matrix and \(\mathbf {N^p}\) are the shape functions for pore pressure. If \(\textbf{k}\) is the matrix of permeability, the permeability matrix is \(\textbf{H}=\int _{V} (\bigtriangledown \mathbf {N^p})^T \textbf{k} \bigtriangledown \mathbf {N^p} dv\). The saturation matrix is \(\textbf{S}=\int _{V} (\mathbf {N^p})^T \frac{1}{Q}\mathbf { N^p } dv\), where Q is directly related to the bulk moduli of the soil skeleton and the pore fluid. Finally, the loading vector divided by the total mixture density, \(\textbf{b}\), provides an equivalent force vector \(\mathbf {f_S}=\int _{V} (\mathbf {N^p})^T \bigtriangledown ^T ({\textbf {k}} {\textbf {b}}) dv\). This set of equations is solved with time integration procedures such as the Newmark method. Furthermore, the force array and the array of independent variables are composed as:

$$\begin{aligned} \textbf{f}=\begin{bmatrix} \mathbf {f_S} \\ \textbf{0} \end{bmatrix}, \quad \textbf{x}=\begin{bmatrix} \textbf{u} \\ \textbf{p} \end{bmatrix} \end{aligned}$$
(9)
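The block structure of relations (4), (6) and (7) can be sketched with NumPy as follows. The sub-matrices used in the usage lines are random placeholders of assumed sizes, not element matrices of the analysed model, and the function name `assemble_up_matrices` is hypothetical.

```python
import numpy as np

def assemble_up_matrices(M_S, C_S, K_S, Q_c, S, H):
    """Assemble the global M, C, K of Eqs. (4), (6), (7) from their sub-blocks."""
    n_u = K_S.shape[0]    # number of displacement degrees of freedom
    n_p = H.shape[0]      # number of pore pressure degrees of freedom
    Z_up = np.zeros((n_u, n_p))
    Z_pu = np.zeros((n_p, n_u))
    Z_pp = np.zeros((n_p, n_p))
    M = np.block([[M_S, Z_up], [Z_pu, Z_pp]])
    C = np.block([[C_S, Z_up], [Q_c.T, S]])
    K = np.block([[K_S, -Q_c], [Z_pu, H]])
    return M, C, K

# Placeholder sub-matrices (assumed sizes: 6 displacement and 2 pressure unknowns).
rng = np.random.default_rng(0)
n_u, n_p = 6, 2
M_S, C_S, K_S = (rng.random((n_u, n_u)) for _ in range(3))
Q_c = rng.random((n_u, n_p))
S, H = rng.random((n_p, n_p)), rng.random((n_p, n_p))
M, C, K = assemble_up_matrices(M_S, C_S, K_S, Q_c, S, H)
```

The unknown array \(\textbf{x}\) of (9) then stacks the `n_u` displacement values above the `n_p` pore pressure values, matching the row ordering of the assembled matrices.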

Stress–strain law. Plastic yield function and bond strength function

The material constitutive model incorporated hereinafter is a modified Cam Clay type yield function from critical state theory for structured cohesive soils. The stresses considered are the effective stresses of the solid skeleton rather than the respective stresses at the pores. The model defines two surfaces: the plastic yield function (PYF), which bounds the elastic region, and the bond strength function (BSF), which describes the admissible positions of the PYF [57, 63,64,65,66]. The size of the BSF is related to the shape of the cohesive microslates. For a stress tensor that lies within the BSF circumference, the deterioration gradient of the cohesive soil is at its largest value. The functions are of ellipsoidal shape and may therefore intersect at only one point.

The mathematical representation of a function of this model is given by:

$$\begin{aligned} f_h(p_{hy},\mathbf {s_d},p_{L-hy},\mathbf {s_{L-d}}, a_0)=\frac{1}{c^2} (\mathbf {s-s_{L-d}}):(\mathbf {s-s_{L-d}})+(p_{hy}-p_{L-hy})^2-(\xi _a a_0)^2=0 \end{aligned}$$
(10)

The stress \(\mathbf {\sigma }\) consists of a volumetric part \(p_{hy}\) and a distortional part \(\mathbf {s_d}\), whilst the point L, which is the center of the ellipse, consists of the volumetric part \(p_{L-hy}\) and the distortional part \(\mathbf {s_{L-d}}\). \(a_0\) is half of the larger diameter of the BSF, and the reduction of the PYF in relation to the BSF is expressed through the value of \(\xi _a\). If \(\mathbf {s_{L-d}=0}\), \(p_{L-hy}=a_0\) and \(\xi _a=1\), the function reduces to the BSF:

$$\begin{aligned} f_h(p_{hy},\mathbf {s_d},p_{L-hy},\mathbf {s_{L-d}},a_0) =F(p_{hy},\mathbf {s_d},a_0)=\frac{1}{c^2} \textbf{s}:\textbf{s}+(p_{hy}-a_0)^2-a_0^2=0 \end{aligned}$$
(11)

Nonetheless, when the volumetric part diverges from \(a_0\) and the distortional part of the center of the ellipse is not zero, the PYF is the following:

$$\begin{aligned} f_g(p_{hy},\mathbf {s_d},p_{L-hy},\mathbf {s_{L-d}},a_0) =f_p(p_{hy},\mathbf {s_d},p_{L-hy},\mathbf {s_{L-d}},a_0) \end{aligned}$$
(12)
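For illustration, the value of the function in (10) can be evaluated numerically as in the sketch below. The function name `yield_value`, the stress units (kPa) and the numerical values of \(a_0\) and the stress states are assumptions chosen only to check the BSF special case of (11); the value of c is the deterministic one quoted later in the text.

```python
import numpy as np

def yield_value(p_hy, s_d, p_L, s_L, a0, xi_a, c):
    """Value of Eq. (10): zero on the surface, negative inside, positive outside."""
    ds = np.asarray(s_d, dtype=float) - np.asarray(s_L, dtype=float)
    # np.tensordot with axes=2 is the double contraction ds : ds.
    return np.tensordot(ds, ds, axes=2) / c**2 + (p_hy - p_L)**2 - (xi_a * a0)**2

a0, c = 100.0, 0.7336            # a0 is an assumed half-size in kPa
zero = np.zeros((3, 3))

# BSF of Eq. (11): center at (a0, 0) and xi_a = 1.
f_origin = yield_value(0.0, zero, a0, zero, a0, 1.0, c)   # stress-free state
f_center = yield_value(a0, zero, a0, zero, a0, 1.0, c)    # center of the ellipse

# Traceless distortional stress with s:s = (c*a0)^2, i.e. the top of the ellipse.
q = c * a0 * np.sqrt(1.5)
s_top = q * np.diag([2.0 / 3.0, -1.0 / 3.0, -1.0 / 3.0])
f_top = yield_value(a0, s_top, a0, zero, a0, 1.0, c)
```

Both `f_origin` and `f_top` vanish up to round-off because these stress states lie on the BSF, while `f_center` is negative because the center of the ellipse lies inside it.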

The elasticity of the soil is considered isotropic and poroelastic. The bulk modulus, which is directly related to the shear modulus as a consequence of a fixed Poisson ratio, is as follows:

$$\begin{aligned} K_{bulk}=\frac{\nu p_{hy}}{\kappa } \end{aligned}$$
(13)

\(\nu\) is the specific volume of the soil.
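A short numerical sketch of relation (13) follows. The specific volume, mean effective stress, \(\kappa\) and Poisson ratio used below are illustrative assumptions and are not values of the analyses; the Poisson ratio is named `poisson_ratio` in the code because \(\nu\) denotes the specific volume here, and the shear modulus follows from the standard isotropic elasticity relation.

```python
def bulk_modulus(specific_volume, p_hy, kappa):
    """Eq. (13): K_bulk = v * p_hy / kappa, in the stress units of p_hy."""
    return specific_volume * p_hy / kappa

def shear_modulus(K_bulk, poisson_ratio):
    """Isotropic elasticity: G = 3 K (1 - 2 mu) / (2 (1 + mu))."""
    return 3.0 * K_bulk * (1.0 - 2.0 * poisson_ratio) / (2.0 * (1.0 + poisson_ratio))

# Illustrative values only (assumed, not taken from the analyses).
K_example = bulk_modulus(specific_volume=2.0, p_hy=100.0, kappa=0.01)   # kPa
G_example = shear_modulus(K_example, poisson_ratio=0.25)                # kPa
```

With a fixed Poisson ratio the ratio `G_example / K_example` is constant, which is the proportionality between the two moduli mentioned above.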

The framework analysed in chapters 2-4 is adopted in the analyses presented in chapter 5. It was chosen in order to maximize the accuracy of the numerical results and the reliability of the constructed FNNs. The stress strain relation assumed for the medium, together with the stochastic finite element method implementation, increases the reliability of the variability estimation and of the statistical moments estimation, and subsequently the precision of the NNs, at a small calculation cost in terms of the epochs needed for training.

Data accumulation from Monte Carlo stochastic analyses and formulation of the feed forward neural networks for the prediction of footing settlement limit state.

Data acquisition and numerical analyses performed

The aforementioned numerical and physical framework is used in the porous medium simulations depicted in Figs. 2 and 3, which are defined by relations (3). The output responses are the load perpendicular to the area of the shallow foundation, the nodal displacements along the global directions X (horizontal) and Z (vertical) of the points A, B, C, D of Figs. 2 and 3, and the rotation of the shallow foundation. The foundation is implemented by applying only the nodal forces that provide the same energy as the linear stress distribution when a certain value of axial force and moments is assumed. The forces are the nodal values \(q_1-q_4\), which correspond to B, C, A and D. The forces are applied on the surface ABCD, of size \(1\times 1\ m^2\). The vector of eccentricities and obliquity of the footing force \((e_x=\frac{M_x}{N}, e_y=\frac{M_y}{N}, \theta _q)\) takes the following values: \((0,0,90^o), (\frac{h_0}{12},0,90^o), (\frac{h_0}{6},0,90^o), (\frac{h_0}{3},0,90^o), (\frac{h_0}{6},\frac{h_0}{6},90^o), (\frac{h_0}{3},\frac{h_0}{6},90^o), (\frac{h_0}{3},\frac{h_0}{3},90^o), (0,0,0^o), (0,0,30^o), (0,0,45^o), (0,0,60^o)\), where \(h_0\) is the respective length of the footing settlement for each non central force.

The FEM mesh is formed with 8 node hexahedral finite elements with linear interpolation functions for pore pressures and displacements, which results in a quantitatively reliable response estimation [67, 68]. The dimensions of the FEM mesh along the global X, Y, Z directions are \(l_x=5 m, l_y=5 m, l_z=4 m\). The mesh used was compared with more refined meshes, and the difference in the output response was found to be of the order of 5\(\%\), which is acceptable. The geostatic stresses are set as initial conditions: \(\sigma _v=\gamma z\), \(\sigma _x=\sigma _y=100 kPa\). These initial conditions were set to correspond to an overconsolidated clay and to a realistic bulk modulus of the order of 20 MPa. The total simulation time is one day, so that quasi static conditions are attained, and the time step is dt=0.001 d. The displacement response over time was examined, and it was confirmed that the soil mass exhibits static behaviour. The boundary relations are: \(\textbf{u}_{x}(z=h)=\textbf{u}_{y}(z=h)=\textbf{u}_{z}(z=h)=\textbf{0}\), and the remaining boundary surfaces are free to respond. The input material variability consists of the compressibility factor \(\kappa\), the critical state line inclination c and the permeability factor k.
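The eleven load cases listed above can be collected in a single input array, as in the sketch below. The value \(h_0=1\) m is inferred from the stated size of the loaded surface ABCD and is an assumption of this example, as are the array name `load_cases` and the column ordering.

```python
import numpy as np

h0 = 1.0  # assumed footing side length in m (loaded surface ABCD of 1 x 1 m^2)

# Columns: e_x [m], e_y [m], theta_q [deg]
load_cases = np.array([
    [0.0,     0.0,     90.0],
    [h0 / 12, 0.0,     90.0],
    [h0 / 6,  0.0,     90.0],
    [h0 / 3,  0.0,     90.0],
    [h0 / 6,  h0 / 6,  90.0],
    [h0 / 3,  h0 / 6,  90.0],
    [h0 / 3,  h0 / 3,  90.0],
    [0.0,     0.0,      0.0],
    [0.0,     0.0,     30.0],
    [0.0,     0.0,     45.0],
    [0.0,     0.0,     60.0],
])

n_cases, n_inputs = load_cases.shape
n_oblique = int(np.sum(load_cases[:, 2] < 90.0))   # load cases with oblique force
```

Each row is one input triad \((e_x, e_y, \theta _q)\); the four rows with \(\theta _q<90^o\) are the central oblique load cases.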

Fig. 2

Illustration of the loading of the problem for non oblique forces. \(q_1\) is the maximum of \(q_1, q_2, q_3, q_4\) and \(q_2\) is the minimum of \(q_1, q_2, q_3, q_4\). Furthermore, \(l_x=5\,m, l_y=5\,m, l_z=4\,m\)

Fig. 3

Illustration of the loading of the problem for non oblique forces. \(q_1\) is the maximum of \(q_1, q_2, q_3, q_4\) and \(q_2\) is the minimum of \(q_1, q_2, q_3, q_4\). Furthermore, \(l_x=5\,m, l_y=5\,m, l_z=4\,m\)

The compressibility factor \(\kappa\) may have a spatial distribution with respect to depth that is constant (\(\kappa _{{C}}\)) or linear (\(\kappa _{{L}}\)). In the \(\kappa _{{L}}\) case, \(\kappa _{z=0}=0.008686\) and the ratio \(R=\frac{\kappa _{z=max}}{\kappa _{z=0}}\) follows a truncated normal PDF. A linear variation of the compressibility factor with depth is a usual assumption for this material variable. It reflects that, as depth increases, the bulk modulus increases and \(\kappa\) decreases. Additionally, the value of \(\kappa\) at the top is assumed deterministic hereinafter, because in the upper part of the soil mass the material input value is easy to determine. The ratio has mean value \(\mu _{R}=0.469\) and coefficient of variation (CoV) 0.25, so \(\kappa _{z=max,mean}=0.004074\). These values are used so that the stiffness of the solid mixture corresponds to a shear velocity of 200 \(\frac{m}{s}\). The bulk and shear moduli are proportional, as a consequence of the constant Poisson ratio. Thus, \(\kappa\) is related to the shear velocity. If \(\kappa\) is constant through the soil mass, its mean value is \(\kappa _{\mu }=0.004074\) and the CoV is 0.25.
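The linear \(\kappa _{{L}}\) case can be sketched as follows, using the mean and CoV of R given above and the depth \(l_z=4\) m. The truncation bounds of the normal PDF are not stated in the text, so a truncation at three standard deviations is assumed; the sampling here is a plain pseudorandom draw and the function name `kappa_linear` is hypothetical.

```python
import numpy as np
from scipy.stats import truncnorm

kappa_top = 0.008686          # deterministic value at z = 0
mu_R, cov_R = 0.469, 0.25
sigma_R = cov_R * mu_R
z_max = 4.0                   # soil depth l_z in m

def kappa_linear(z, R):
    """Linear profile with kappa(0) = kappa_top and kappa(z_max) = R * kappa_top."""
    return kappa_top * (1.0 + (R - 1.0) * np.asarray(z) / z_max)

# Assumed truncation at +/- 3 standard deviations (bounds are not given in the text).
a, b = -3.0, 3.0
R_samples = truncnorm.rvs(a, b, loc=mu_R, scale=sigma_R, size=100, random_state=0)

z = np.linspace(0.0, z_max, 5)
profiles = np.array([kappa_linear(z, R) for R in R_samples])   # shape (100, 5)
kappa_bottom_mean = kappa_linear(z_max, mu_R)
```

Evaluating the profile at `z_max` with the mean ratio gives \(0.008686 \times 0.469 \approx 0.004074\), which reproduces the value of \(\kappa _{z=max,mean}\) quoted above.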

The critical state slope c is set as spatially constant and may be obtained from a random variable PDF or set to a non stochastic value. In the random variable case \(c_{{R}}\), the friction angle \(\phi _0\) follows a truncated normal distribution with mean value \(\mu _{\phi }=23^o\) and standard deviation \(\sigma _{\phi }=2^o\), and the random vector of \(\phi _0\) is collected with the LHS method. The set of \(\phi _0\) has values that apply to most natural clays [57]. The discrete random set of \(\phi _0\) is obtained by sampling the standard normal distribution with Latin Hypercube Sampling and then transforming the samples to the truncated normal PDF. Then, c is estimated through \(c=\sqrt{\frac{2}{3}} \frac{6\sin (\phi _0)}{3-\sin (\phi _0)}\). If c is set as deterministic, denoted \(c_{{D}}\), c=0.7336 for friction angle \(\mu _{\phi }=23^o\).
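The sampling of \(\phi _0\) and the evaluation of c can be sketched as follows. The truncation bounds are not stated in the text and are assumed here at three standard deviations; the Latin Hypercube sample is mapped to the truncated normal in one step through the inverse CDF, which is a compact variant of the two-step transformation described above. The names `c_from_phi` and `t` are hypothetical.

```python
import numpy as np
from scipy.stats import norm, qmc

def c_from_phi(phi_deg):
    """Critical state slope c = sqrt(2/3) * 6 sin(phi) / (3 - sin(phi))."""
    s = np.sin(np.radians(phi_deg))
    return np.sqrt(2.0 / 3.0) * 6.0 * s / (3.0 - s)

mu_phi, sigma_phi = 23.0, 2.0    # degrees
n_samples = 100
t = 3.0                          # assumed truncation at +/- 3 sigma (not stated in the text)

# Latin Hypercube sample of uniforms: one point in each of the n_samples strata.
u = qmc.LatinHypercube(d=1, seed=0).random(n_samples).ravel()

# Inverse CDF restricted to [-t, t] gives truncated standard normal samples.
lo, hi = norm.cdf(-t), norm.cdf(t)
xi = norm.ppf(lo + u * (hi - lo))

phi = mu_phi + sigma_phi * xi
c_samples = c_from_phi(phi)
c_det = c_from_phi(mu_phi)
```

For \(\phi _0=23^o\) the function returns approximately 0.7336, in agreement with the deterministic value \(c_{{D}}\) given above.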

The permeability k is set as spatially constant. Its value either follows a continuous random PDF or is non stochastic. In the random variable case \(k_{{R}}\), the mean value is \(\mu _{k}=10^{-8}\) and the CoV is \(CoV_{k}=0.25\). If k is set as deterministic, \(k_{{D}}\), \(k=10^{-8}\). It should be noted that the values of the material constitutive parameters analyzed, as well as the calibration parameters of the material constitutive model, were chosen to represent a cohesive soil of an overconsolidated clay with OCR=4. This corresponds to a fairly stiff clay, which justifies the choice of a shallow foundation for the support of an infrastructure.

Table 1 Simulations of non porous medium accumulated
Table 2 Simulations of porous medium accumulated

The simulations performed are of two kinds: the solid simulations, where the pore-fluid interaction is of negligible magnitude, and the porous simulations, where the interaction of the fluid-soil mixture is considerable. The solid simulations, denoted (\(\textbf{S}\)), are listed in Table 1, using the linear (L) or constant (C) assumption for \(\kappa\) and the deterministic (D) or random variable (R) relation for c. The porous simulations are listed in Table 2, using the constant (C), linear (L) and random field (RF) cases for \(\kappa\); the deterministic (D), random continuous variable (R) and random process field (RF) relations for c; and the deterministic (D), random process field (RF) and random continuous variable (R) relations for k. The abbreviation of each Monte Carlo analysis and of the corresponding neural network is also given.

In the stochastic functions the following mean values are applied: \(\kappa _{mean}=0.008686\), \(c_{mean}\)=0.7336 and \(k_{mean}=10^{-8} \frac{m^3\,s}{Mgr}\), according to [60, 62, 69]. The standard deviations applied are: \(\sigma _{\kappa }=0.25\kappa _{mean}\), \(\sigma _{\phi }=2^o\) and \(\sigma _k=0.25 k_{mean}\). The autocorrelation function \(C_h=e^{\frac{-|\Delta x|}{b}}\) is chosen in all random process simulations. The correlation length has three alternative values, b=2 m ( \(k_{{RF}{{2}}}\)), b=4 m ( \(k_{{RF}{{4}}}\)) and b=8 m ( \(k_{{RF}{{8}}}\)). The spatial representations \(\kappa _{{L}}\) and \(\kappa _{{C}}\) of \(\kappa\), together with the random variable simulations for all material variables, correspond to random variable analysis. A constant deterministic simulation for c is used. When a random process field (RF) is selected, the Karhunen Loeve sum is used and an arbitrary realization of the process is defined with the finite sum of the method, using an autocovariance function of exponential type. Random variables \(\xi\) following the standard normal PDF were obtained through Latin Hypercube sampling, and the input random vectors were then formed. The eigenfunctions and eigenvalues of \(C_h\) are given by a direct set of equations, since the Fredholm integral eigenproblem has an analytical solution for this type of \(C_h\). An illustration of the stochastic processes used hereinafter, for different correlation lengths, is given in Fig. 4. It should be noted that, for convenience, the functions normalized with respect to the mean value are shown in the realizations.
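A truncated Karhunen Loeve sum for the exponential autocorrelation can be sketched as follows. The analytical eigen-solution mentioned above is not reproduced; as a swapped-in technique, the eigenproblem is solved numerically on a depth grid (Nystrom discretization with trapezoidal weights). The grid resolution, the Gaussian form of the field, the pseudorandom draw of \(\xi\) and the function names are assumptions of this example.

```python
import numpy as np

def kl_modes(z, corr_length, n_terms=8):
    """Leading eigenpairs of C_h = exp(-|dz|/b) discretized on the uniform grid z."""
    dz = np.abs(z[:, None] - z[None, :])
    C = np.exp(-dz / corr_length)
    w = np.full(len(z), z[1] - z[0])
    w[0] *= 0.5
    w[-1] *= 0.5                                  # trapezoidal quadrature weights
    sw = np.sqrt(w)
    # Symmetrized weighted eigenproblem (Nystrom form of the Fredholm equation).
    lam, vec = np.linalg.eigh(sw[:, None] * C * sw[None, :])
    order = np.argsort(lam)[::-1][:n_terms]
    return lam[order], vec[:, order] / sw[:, None]

def sample_field(mean, cov, lam, phi, xi):
    """Truncated KL sum: mean * (1 + cov * sum_i sqrt(lam_i) phi_i(z) xi_i)."""
    return mean * (1.0 + cov * (phi * np.sqrt(lam)) @ xi)

z = np.linspace(0.0, 4.0, 81)                     # depth grid over l_z = 4 m
rng = np.random.default_rng(0)
xi = rng.standard_normal(8)                       # plain pseudorandom draw, not LHS
fields = {}
for b in (2.0, 4.0, 8.0):
    lam, phi = kl_modes(z, b)
    fields[b] = sample_field(0.008686, 0.25, lam, phi, xi)
```

Dividing each entry of `fields` by the mean gives normalized realizations of the kind shown in Fig. 4; the sum of the retained eigenvalues relative to the domain length indicates how much of the variance the eight terms capture.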

Fig. 4

Realization of the random fields constructed from the Karhunen Loeve series, when the correlation lengths are 2 m, 4 m, 8 m. In the X axis the soil depth with respect to the upper soil surface is denoted in m. In Y axis the normalized value of a certain material parameter (\(\kappa\), c, k) with respect to the corresponding mean value is given

The forces are static; the total time and the time step correspond to a static simulation of the analysis domain, and eight eigenfunctions of \(C_h\) are adopted. Limit state is considered to be reached if a Gauss point exhibits a softening response, in other words \(H<0\), where H is the modulus of plasticity. Each Monte Carlo analysis consisted of 100 deterministic analyses, and the input random vectors were obtained with Latin Hypercube Sampling. This sample size was proven sufficient for attaining convergence of the statistical moments of the output response, as described and confirmed in the preceding literature of the authors. Specifically, 1000 samples were simulated in a Monte Carlo analysis and their first two statistical moments were compared with those from 100 deterministic simulations. The relative divergence is no more than \(5\%\), thus 100 samples are adequate for predicting the statistical moments of the output response. Each material variable is correlated only with itself, thus the correlation matrix is diagonal.
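The comparison of the first two statistical moments between 100 and 1000 samples can be sketched as below. The function `toy_response` is a hypothetical smooth function standing in for one deterministic FEM analysis, so the numbers it produces are not results of this work; only the structure of the check (Latin Hypercube input, sample mean and standard deviation, relative divergence against the \(5\%\) threshold) mirrors the procedure described above.

```python
import numpy as np
from scipy.stats import norm, qmc

def lhs_standard_normal(n, d, seed):
    """Latin Hypercube sample mapped to independent standard normal variables."""
    u = qmc.LatinHypercube(d=d, seed=seed).random(n)
    return norm.ppf(u)

def toy_response(xi):
    """Hypothetical smooth response standing in for one deterministic FEM analysis."""
    return 700.0 * np.exp(-0.1 * xi[:, 0]) * (1.0 + 0.05 * xi[:, 1])

def moments(n, seed):
    """Sample mean and standard deviation of the response for n LHS samples."""
    y = toy_response(lhs_standard_normal(n, d=2, seed=seed))
    return y.mean(), y.std(ddof=1)

mean_100, std_100 = moments(100, seed=1)
mean_1000, std_1000 = moments(1000, seed=2)

rel_mean = abs(mean_100 - mean_1000) / abs(mean_1000)
rel_std = abs(std_100 - std_1000) / std_1000
converged = bool(rel_mean <= 0.05 and rel_std <= 0.05)
```

In an actual study `toy_response` would be replaced by the finite element solver, and the flag `converged` would indicate whether the smaller sample reproduces both moments within the tolerance.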

From the Monte Carlo simulations described, the FNN procedure presented in chapter 2 is adopted in order to obtain the FNNs that estimate the limit axial force \(N_{ult}\), the vertical and horizontal displacements \(u_{y}, u_{x}\), respectively, and the rotation of the footing settlement at limit state, Rotation. The training aimed to minimize the error described by (2). Each Monte Carlo analysis results in a distinct series of FNNs, with 4 distinct FNNs formed, one for each output. This has been done for porous and non porous media. The collection of FNNs not only provides the prediction of the soil domain response but also illustrates the effect of the spatial uncertainty of the input material parameters and portrays the qualitative influence of each input variability on the corresponding output uncertainty.

The validation of the stochastic analyses, as referenced in the respective publications [28,29,30,31], is the following. At first, the deterministic finite element model was validated by comparing the results provided by the open source computational mechanics code MSolve of the National Technical University of Athens, the link to which may be found in the Declarations section, with the respective results from ANSYS. Subsequently, once the deterministic model was verified, the stochastic model was verified through comparison of the probability density functions of the output with the analytical or semianalytical solutions for the variables discussed. Finally, once the stochastic modelling was verified, the validity of the Neural Networks was verified through the validation dataset. The training dataset was formed from all the mean values of the points discussed in the first paragraph of section 5.1. The number of input vectors in the dataset is 11, referring to the triad of eccentricities and obliquity angle \((e_x=\frac{M_x}{N}, e_y=\frac{M_y}{N}, \theta _q)\). The validation dataset was formed from respective analyses in the vicinity of the 11 points. Consequently, the standard validation procedure of FNN theory was adopted. In this way, all deterministic, stochastic and Machine Learning models were tested and verified.

Output response and corresponding assessment

Hereinafter, the results and the respective assessment follow. In Table 3, the convergence analysis of the formulated Neural Networks (NN) is presented. The \(L_2\) error is at most of the order of \(10^{-5}\), demonstrating the accuracy of the FNNs. In the Appendix, the 3 FNNs that estimate the unfavorable values of the axial load \(N_{ult}\), the horizontal and vertical displacements \(u_{x}\) and \(u_{y}\), and the rotation Rotation of the shallow foundation in porous analyses, and the 2 FNNs for the medium without pore pressure, are presented in Figs. 5 to 32, each comprising 5 smaller figures. The 5 subfigures are the schematic illustration of the FNN in 3 dimensions, the epochs required for convergence together with the model \(L_2\) loss, and the projections of the FNN onto the input axis x, which is the eccentricity \(e_x\), the input axis y, which is the eccentricity \(e_y\), and the input axis z, which is the obliquity angle \(\theta _q\) in relation to the horizontal direction. Each FNN is formulated with dataset points that correspond to the mean value of a Monte Carlo simulation.

The assessment of the convergence of the model parameters reveals that the greatest number of epochs is 92,200, which provides an \(L_{2}\) error of \(1.12 \times 10^{-7}\). Furthermore, the largest error is \(1.96\times 10^{-5}\), obtained when the number of training epochs is of the order of 50,000. Taking this into consideration, together with the relevant literature [56], it can be concluded that the selection of FNNs is appropriate for the problem under discussion. Moreover, this method is not only effortless but also convenient and adaptive for improving the FNN with more data, such as real in situ or laboratory tests as well as further computational data from FEM analysis. It is a future research goal to enrich this highly reliable model with real world data and to investigate the effect of the data enrichment on the prediction of the limit state of shallow foundations on a cohesive soil domain.

The assembly of the dataset and the formulation of the FNNs lead to important deductions for computational geomechanics. In what follows, the most unfavorable value estimated by each FNN is given together with the input triad that produces it. The remaining estimated response values are also reported and assessed. In addition, a comparative assessment is provided in order to identify the most detrimental situations for a shallow foundation subjected to static load and founded on a stochastic soil domain consisting of cohesive geomaterial. These results are quantitatively accurate owing to the high accuracy of the material stress-strain law and the increased reliability of the neural network construction.
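
The search for the most unfavorable value and its input triad can be sketched as an evaluation of a surrogate over a grid of \((e_x, e_y, \theta)\). The function `toy_n_ult` below is a hypothetical placeholder, not one of the trained FNNs, and its constants, the grid values and the helper name `critical_case` are assumptions chosen only to make the example runnable.

```python
import itertools
import math

def toy_n_ult(ex_ratio, ey_ratio, theta_deg):
    """Placeholder surrogate in kN (NOT a trained FNN from this study)."""
    scale = (1.0 - ex_ratio) * (1.0 - ey_ratio)
    return 1000.0 + 2000.0 * math.sin(math.radians(theta_deg)) * scale

def critical_case(surrogate, ex_values, ey_values, theta_values, smaller_is_worse):
    """Evaluate the surrogate on the grid.

    Returns the critical input triad, the critical value and the range
    (maximum minus minimum) of the estimated output over the grid.
    """
    triads = list(itertools.product(ex_values, ey_values, theta_values))
    values = [surrogate(*triad) for triad in triads]
    pick = min if smaller_is_worse else max
    idx = pick(range(len(values)), key=values.__getitem__)
    return triads[idx], values[idx], max(values) - min(values)

if __name__ == "__main__":
    # Assumed grid: eccentricities as fractions of the footing dimension, angles in degrees.
    ex_grid = (0.0, 1.0 / 6.0, 1.0 / 3.0)
    ey_grid = (0.0, 1.0 / 6.0, 1.0 / 3.0)
    theta_grid = (0, 30, 60, 90)
    triad, value, value_range = critical_case(
        toy_n_ult, ex_grid, ey_grid, theta_grid, smaller_is_worse=True
    )
    print("critical triad:", triad)
    print("critical value [kN]:", value)
    print("range [kN]:", value_range)
```

For the failure load the smallest value is the unfavorable one, so `smaller_is_worse=True`; for the displacements and the rotation the same helper would be called with `smaller_is_worse=False`.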

The neural network with ordinal number 10, which corresponds to the Monte Carlo stochastic analysis \(\textbf{P}\)- \(\kappa _{{RF}}\)- \(c_{{RF}}\)- \(k_{{RF}{{4}}}\), is found to give the most detrimental situation, that is the minimum value, for \(N_{ult}\). For this output variable, the critical value is 642.80 kN, whilst the remaining output values \(u_{y}\), \(u_{x}\) and Rotation are 0.0030 m, 0.0528 m and \(1.41\times 10^{-5}\) rad, respectively. This situation is obtained when the input triad \((e_x,e_y,\theta )\) is (0,0,0). Furthermore, the range of \(N_{ult}\) is fairly large, approximately \(3500-640=2860\) kN, which leads to the conclusion that the FNN can estimate a wide variety of possible situations. Consequently, the horizontal central load is the most critical case for the foundation failure load, and the remaining output values can be described as moderate for the vertical displacement and the foundation rotation and large for the horizontal displacement. This analysis is depicted in Figs. 9, 10, 11 and 12.

The neural network with ordinal number 11, which corresponds to the Monte Carlo simulation \(\textbf{P}\)- \(\kappa _{{RF}}\)- \(c_{{RF}}\)- \(k_{{RF}{{8}}}\), is found to give the most detrimental situation, that is the maximum value, for \(u_x\). For the horizontal displacement, the critical value is 0.0898 m, while the remaining output values \(N_{ult}\), \(u_{y}\) and Rotation are 736.817 kN, 0.0124 m and \(4\times 10^{-4}\) rad, respectively. This is obtained when the input vector is \((e_x,e_y,\theta )\)=(0,0,\(30^{\circ}\)). Furthermore, the range of \(u_x\) is fairly large, estimated as 0.0898 m, which leads to the deduction that the variety of estimations of the FNN is substantial. As a result, the moderately oblique central load is the most unfavorable case for the foundation failure load, and the remaining output values may be characterized as modest for the vertical displacement, the foundation rotation and the ultimate load, which means that a complex failure mode occurs. This analysis is portrayed in Figs. 13, 14, 15 and 16.

The neural network with ordinal number 9, which corresponds to the Monte Carlo simulation \(\textbf{P}\)- \(\kappa _{{RF}}\)- \(c_{{RF}}\)- \(k_{{RF}{{2}}}\), is found to have the most unfavorable value, that is the largest value, for \(u_{y}\). For this variable, the critical value is 0.0641 m, while the remaining output values \(N_{ult}\), \(u_{x}\) and Rotation are 3605.381 kN, 0.0622 m and \(2\times 10^{-4}\) rad, respectively. This situation is obtained when the input triad \((e_x,e_y,\theta )\) is (0,0,\(90^{\circ}\)). The range of \(u_y\) is considered moderate, since it is estimated as 0.0641 m. Consequently, the vertical central load is the most critical case for the foundation failure load, and the remaining output variables can be described as moderate for the foundation rotation and large for the horizontal displacement and the failure load of the footing. This analysis is presented in Figs. 5, 6, 7 and 8.

The neural network with ordinal number 9, which corresponds to the Monte Carlo simulation \(\textbf{P}\)- \(\kappa _{{RF}}\)- \(c_{{RF}}\)- \(k_{{RF}{{2}}}\), also provides the most detrimental situation, that is the largest value, for Rotation. For this output variable, the critical value is 0.0243 rad, whilst the remaining output values \(N_{ult}\), \(u_{y}\) and \(u_{x}\) are 1837.24 kN, 0.0005 m and 0.0531 m, respectively. This situation occurs when the input triad \((e_x,e_y,\theta )\) is (\(\frac{h}{3}\),0,0). The range of Rotation is estimated as 0.0243 rad, which is characterized as moderate, and it therefore represents a substantial number of real-world situations. Consequently, the horizontal load with large eccentricity is the most detrimental case for the foundation rotation, and the remaining output values can be described as moderate for the vertical displacement and large for the horizontal displacement and the foundation failure load. This analysis is depicted in Figs. 5, 6, 7 and 8.

The neural network with ordinal number d1, which corresponds to the Monte Carlo simulation \(\textbf{S}\)- \(\kappa _{{L}}\)- \(c_{{R}}\), is found to have the most detrimental situation, that is the smallest value, for \(N_{ult}\). For this output variable, the critical value is 643.56 kN, while the remaining output values \(u_{y}\), \(u_{x}\) and Rotation are 0.0029 m, 0.0408 m and \(1.8\times 10^{-7}\) rad, respectively. This occurs when the input triad \((e_x,e_y,\theta )\) is (0,0,0). The range of \(N_{ult}\) is estimated as approximately \(3900-640=3260\) kN, which is considered fairly large. Consequently, the variety of real-world situations represented by the FNN can be characterized as significant. When a horizontal central load is applied to the shallow foundation, the most unfavorable situation for the foundation failure load is predicted by this NN, and the remaining output values can be described as moderate for the vertical displacement and the foundation rotation and large for the horizontal displacement. This analysis is portrayed in Figs. 17, 18, 19 and 20.

The neural network with ordinal number d1, which corresponds to the Monte Carlo simulation \(\textbf{S}\)- \(\kappa _{{L}}\)- \(c_{{R}}\), also results in the most unfavorable situation, that is the largest value, for \(u_{y}\). For this output variable, the critical value is 0.0488 m, while the remaining output values \(N_{ult}\), \(u_{x}\) and Rotation are 3851.63 kN, 0.0472 m and \(2\times 10^{-4}\) rad, respectively. This situation occurs when the input triad \((e_x,e_y,\theta )\) is (0,0,\(90^{\circ}\)). The range of \(u_y\) is estimated as 0.0488 m, which is considered moderate to large. As a result, the vertical central load is the case for which the NN most probably predicts the most critical situation for the foundation failure load, and the 3 other output variables can be described as moderate for the foundation rotation and large for the horizontal displacement and the failure load of the footing. This analysis is presented in Figs. 17, 18, 19 and 20.

The neural network with ordinal number d2, which corresponds to the Monte Carlo simulation \(\textbf{S}\)- \(\kappa _{{L}}\)- \(c_{{D}}\), is found to have the most unfavorable value, that is the largest value, for \(u_{x}\). For this output variable, the critical value is 0.0655 m, while the remaining output values \(N_{ult}\), \(u_{y}\) and Rotation are 759.06 kN, 0.0108 m and \(3\times 10^{-5}\) rad, respectively. This situation occurs when the input triad \((e_x,e_y,\theta )\) is (0,0,\(30^{\circ}\)). The range of \(u_x\) is estimated as 0.0655 m, which is considered moderate to large. As a result, the moderately oblique central load is the most unfavorable case for the foundation failure load, and the remaining output values can be described as moderate for the vertical displacement, the foundation rotation and the ultimate load, which means that a complex failure mode occurs. This analysis is depicted in Figs. 21, 22, 23 and 24.

The neural network with ordinal number d2, which corresponds to the Monte Carlo simulation \(\textbf{S}\)- \(\kappa _{{L}}\)- \(c_{{D}}\), also has the most unfavorable value, that is the greatest value, for Rotation. For this output variable, the critical value is 0.0222 rad, while the remaining output values \(N_{ult}\), \(u_{y}\) and \(u_{x}\) are 2174.30 kN, 0.0466 m and \(1.41\times 10^{-4}\) rad, respectively. This situation occurs when the input triad \((e_x,e_y,\theta )\) is (\(\frac{h}{3}\),0,\(90^{\circ}\)). The range of Rotation is estimated as 0.0222 rad, which is considered moderate to large. This leads to the conclusion that the variety of real-world situations portrayed by the FNNs in all displacements and rotations is substantial. As a consequence, the vertical load with high eccentricity is the most unfavorable case for the failure load of the footing, and the remaining output values can be described as moderate for the foundation rotation and large for the failure load and the vertical displacement. This analysis is presented in Figs. 21, 22, 23 and 24.

The aforementioned findings lead to two important deductions. In porous analyses, the most critical situation regarding \(N_{ult}\), \(u_{x}\), \(u_{y}\) and Rotation is obtained when the Karhunen Loeve random field representation is assumed for the input randomness. If the depth of the soil domain coincides with the correlation length (b=4 m), the ultimate load is the smallest and thus the most unfavorable situation for this variable occurs. A decrease of the correlation length maximizes the vertical displacement and the rotation at failure, whilst an increase of the correlation length gives greater horizontal displacement estimations. Consequently, as the correlation length diminishes, which maximizes the uncertainty, the variability of the limit displacements increases. If the pore water pressure is neglected, the worst-case scenario regarding \(N_{ult}\), \(u_{x}\), \(u_{y}\) and Rotation occurs when a linear distribution over depth is considered for \(\kappa\). Specifically, for the limit load and the vertical failure displacement, more unfavorable situations are predicted by the neural networks when the critical state line inclination is treated as a random variable, whilst for the horizontal failure displacement and the rotation the critical situation is obtained when c is deterministic. As a result, the effect of c is more pronounced on the limit load, as expected, and on the corresponding vertical displacement field, whereas the effect of \(\kappa\) is more evident in the estimation of the displacement field at failure. These results indicate, quantitatively and qualitatively, the effect of the uncertainty of each material variable on the response of the soil domain, and they extend similar results obtained in previous literature by estimating the response without requiring new analyses. The range of all FNN estimations can be considered substantial, covering an acceptable variety of real-world situations in geotechnical engineering.
It should also be highlighted that, for a model that is the most detrimental for two output variables simultaneously, the optimum estimations of the FNNs do not have exactly the same values, as expected. As a concluding remark, the FNNs have the advantage that they can be easily improved by collecting more computational or experimental data. This work thereby highlights the value of variability estimation in engineering and in computational geoengineering.
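
The comparison between the two families of analyses can be tabulated directly from the critical values quoted above. The sketch below is only a bookkeeping aid: the numbers are those reported in the preceding paragraphs, while the data layout and the helper name `governing_cases` are assumptions for illustration.

```python
# Critical values quoted in the text: (analysis, network, output, value, unit).
REPORTED = [
    ("porous", "10", "N_ult", 642.80, "kN"),
    ("porous", "11", "u_x", 0.0898, "m"),
    ("porous", "9", "u_y", 0.0641, "m"),
    ("porous", "9", "Rotation", 0.0243, "rad"),
    ("no pore pressure", "d1", "N_ult", 643.56, "kN"),
    ("no pore pressure", "d1", "u_y", 0.0488, "m"),
    ("no pore pressure", "d2", "u_x", 0.0655, "m"),
    ("no pore pressure", "d2", "Rotation", 0.0222, "rad"),
]

# The failure load is unfavorable when small; displacements and rotation when large.
SMALLER_IS_WORSE = {"N_ult"}

def governing_cases(rows):
    """For each output, keep the row with the most unfavorable critical value."""
    governing = {}
    for row in rows:
        output, value = row[2], row[3]
        current = governing.get(output)
        if current is None:
            governing[output] = row
            continue
        if output in SMALLER_IS_WORSE:
            is_worse = value < current[3]
        else:
            is_worse = value > current[3]
        if is_worse:
            governing[output] = row
    return governing

if __name__ == "__main__":
    for output, (analysis, network, _, value, unit) in governing_cases(REPORTED).items():
        print(f"{output:9s} {value:10.4f} {unit:4s} network {network} ({analysis})")
```

With the values quoted in this section, the more unfavorable critical value of each of the four outputs comes from a network of the porous analyses, although for \(N_{ult}\) the two families differ by less than 1 kN.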

Concluding remarks

A series of feed forward neural networks predicting the limit state of shallow foundations on structured geomaterials is presented. The underlying data are taken from preceding Monte Carlo analyses that estimate the limit load of the footing under static loading conditions. Furthermore, estimations of the displacement field at the time the soil reaches its bearing capacity are provided. Specifically, neural networks for the horizontal and vertical displacements of the footing are given, along with a prediction of the corresponding rotation of the foundation. The material constitutive model of the stress–strain relation is a high fidelity model for cohesive soils, which provides a reliable stress-strain law in terms of quantitative results and thus a reliable load–displacement response in static and dynamic loading. Therefore, the estimations of the neural networks are considered to be reliable and attested.

In terms of convergence, these FNNs are constructed with a small number of epochs, and the \(L_{2}\) error is relatively low for the accuracy needed in a geomechanics application. No more than 100,000 epochs are needed for a loss of at most \(10^{-5}\). The prediction of the displacement field demands a significantly smaller number of epochs than the prediction of the failure load. Furthermore, a lower error is obtained for the neural networks of the horizontal displacement at failure and a higher error for those of the rotation of the shallow foundation. This indicates the advantage of the method used for obtaining the neural networks and also highlights that it can be easily enriched with more experimental or computational data. The enrichment of the neural networks with experimental data and the comparison with those presented in this article is left for future work, and the conclusions of that research will be of significant importance.

In terms of the results presented, in analyses that take into account the soil-pore fluid interaction, the most unfavorable situation for the failure load and the failure displacement field occurs when the uncertainty of the input material parameters is represented with the Karhunen Loeve sum. In this case, the most detrimental limit load is on the order of 642 kN, while the displacements are on the order of 0.05 m and the rotations are negligible. The most unfavorable displacements are on the order of 0.09 m, with a corresponding axial load in the vicinity of 735 kN and negligible rotations. When the uncertainty increases through the reduction of the correlation length, the uncertainty of the limit displacement field increases. For analyses that neglect the water pressure, the values of the failure load and the failure displacement field are generally most unfavorable when a linear relation with respect to depth is set for the compressibility factor. In this case, the most detrimental limit load is on the order of 643 kN, while the displacements are on the order of 0.04 m and the rotations are negligible. The most unfavorable displacements are on the order of 0.07 m, with a corresponding axial load in the vicinity of 3800 kN and negligible rotations. All of these may be attributed to the fact that the mean value of a Monte Carlo simulation was adopted as a data point of the constructed neural networks. This is in accordance with the previous publications of the authors. If c is deterministic, the horizontal displacement and the rotation of the foundation take their largest values. If c is treated as a continuous random variable, the vertical displacement and the failure load of the footing take their most unfavorable values.