Globally supported surrogate model based on support vector regression for nonlinear structural engineering applications

This work presents a global surrogate modelling of mechanical systems with elasto-plastic material behaviour based on support vector regression (SVR). In general, the main challenge in surrogate modelling is to construct an approximation model with the ability to capture the non-smooth behaviour of the system under interest. This paper investigates the ability of the SVR to deal with discontinuous and high non-smooth outputs. Two different kernel functions, namely the Gaussian and Matèrn 5/2 kernel functions, are examined and compared through one-dimensional, purely phenomenological elasto-plastic case. Thereafter, an essential part of this paper is addressed towards the application of the SVR for the two-dimensional elasto-plastic case preceded by a finite element method. In this study, the SVR computational cost is reduced by using anisotropic training grid where the number of points are only increased in the direction of the most important input parameters. Finally, the SVR accuracy is improved by smoothing the response surface based on the linear regression. The SVR is constructed using an in-house MATLAB code, while Abaqus is used as a finite element solver.

the model's response is often replaced by a simplified mathematical model known as surrogate model, also referred to as meta-model. In general, the surrogate model is built by considering the original model as a black-box model which is evaluated only on a limited number of simulations. Accordingly, the approximated model can be built to resemble the behaviour of the high-fidelity model while preserving accuracy and being computationally cheaper to evaluate. Indeed, particularly over the last decade, different surrogate modelling techniques have been proposed in the literature such as proper orthogonal decomposition (POD) [1][2][3][4][5], polynomial response surfaces [6,7], polynomial chaos expansion (PCE) [8][9][10], Gaussian process models [11][12][13] and support vector regression (SVR) [14][15][16]. Among all the mentioned surrogate model techniques, SVR has attracted considerable attention in the literature due to its potential ability, efficiency and robustness in approximating the model response in various engineering problems. For instance, SVR was applied for structural crashworthiness design optimization in [17]. Pan et al. [18] utilised SVR as a surrogate model for lightweight vehicle design. The least square support vector regression meta-model technique for sheet metal forming optimization was proposed in [19]. Moustapha et al. applied SVR for nonlinear structural analysis in [20]. More recently, SVR was applied for structural reliability analysis in [21] and for measuring the water quality parameters in [22]. Moreover, a comprehensive comparison between SVR and various meta-model techniques including response surface methodology, kriging, radial basis functions and multivariate adaptive regression splines was examined in [23]. The study shows that SVR outperformed all other meta-model techniques in terms of prediction accuracy and robustness.
In fact, many engineering problems involve non-smooth responses which exhibit discontinuity or sharp changes for certain combinations of the input parameters. This behaviour often occurs in the context of nonlinear finite element method where the output response may behave non-smoothly in the parametric space. Consequently, approximating the non-smooth response of the underlying physical problem by a smooth global surrogate model might lead to large errors and poor prediction capability. This issue has been addressed in the literature in [24][25][26]. To overcome this problem, several attempts have been proposed in the literature based on data mining techniques such as clustering [27] which automatically determines the input domain of similar responses. In addition, the decomposition of the input space was applied in [28] where the different behaviours of the response surface are locally approximated. The multi-element approaches were applied in [29][30][31] where the parametric space is discretised into non-overlapping elements. Thereafter, the surrogate model is constructed element-wise which weakens the non-smoothness influence of the response within each element. However, these techniques might impose additional computations to tackle the non-smooth behaviour of the output. In this paper, the optimisation of SVR parameters is investigated in order to build a single and global surrogate model with high capability to capture the non-smooth responses. The developed SVR model is examined through nonlinear elasto-plastic problem where the output of interest exhibits highly non-smooth behaviour in the inputs parametric space. In this paper, the accuracy and the computational cost of SVR are studied and compared for two types of kernel functions, namely Gaussian and Matèrn 5/2 kernels. More details about the behaviour of different types of kernel functions can be found in [32]. Thereafter, the computational cost of SVR is reduced by applying anisotropic training grid where the training points are only increased in the input parameter where the response behaves highly nonlinear. Finally, this paper provides a smoothing technique of the non-smooth output response using linear regression. The smoothing method can be applied to enhance the accuracy and prediction capability of SVR, in addition to reducing the computational cost.

Support vector regression
The main concept of SVR is to generate a surrogate model of the underlying complex physical response which depends only on a given training set S = {x i , y i } n i=1 of n training points x i ∈ R d with corresponding output y i ∈ R. The SVR aims at approximating a linear regression function where · , · denotes the inner product, w and b ∈ R are the regression parameters to be estimated using the training data x and φ (x) is the transformation of x from the input space into the so called feature space F. Here, the vector x can be mapped into the feature space by using a kernel function K x i , x j that defines an inner product in this feature space as Accordingly, the solution vector w can be rewritten in its dual representation as [33] whereα i and α i are the Lagrange multipliers representing the dual variables. Substituting Eqs. (3) and (2) into Eq. (1), the regression function becomes withα = α 1 , · · · ,α n T and k = [k 1 , · · · , k n ] T where k i = K (x i , x). The variablesα and b are determined by means of an optimisation problem. For this purpose, the ε-intensive loss function is introduced as follows [34]: Evaluating Eq. (5) at the training data points x i leads to the definition of the slack variables The interpretation of the parameter ε and the slack variables ξ i andξ i are clarified in Fig. 1. It can be seen that, SVR aims to construct a function such that the training points are inside a tube of given radius ε. The slack variables ξ andξ indicate how far a training point lies outside the ε-tube. For instance, ξ = 0 if the corresponding training point lies above the tube. In contrast, ξ = 0 if a training point is located below the tube. Finally, both slack variables take the value ξ =ξ = 0 if the training points are inside the tube. Here, the points that are not strictly located inside the ε-tube are called the support vectors. Typically, the SVR formulation written as an optimisation problem consists of minimising the ε-intensive loss function through the calculation of its norm 1 2 w 2 as min w,b,ξ ,ξ where C is a positive weighting constant known as the box constraint.
The primary optimisation problem represented in Eq. (7) can be transformed into its dual form with the aid of the Lagrange multiplier technique as [35,36] where and by satisfying the Karush-Kuhn-Tucker complementary conditions The solution of the optimisation problem results in the dual variablesα = α 1 · · ·α n T withα i =α i − α i and an optimal bias term b that satisfies the following equations [37] In general, the parameters C and ε can be chosen as and where iqr(Y) is the interquartile range of the response spectrum Y = {y 1 , · · · , y n } of the input training data set.

Error measures
In order to ensure the quality and the prediction accuracy of the surrogate model, two main error measures are considered in this study. First, the overall quality of the SVR approximation is determined by using a relative error measure where n * is the number of the test data points and i is the relative absolute error, given by where y i = y x * i is the exact response evaluated at the test data points x * ,ŷ i =ŷ x * i is the SVR surrogate model evaluated at the same test points, andỹ is the average output obtained from the exact responsẽ Second, a local error estimator known as Chebyshev norm estimator is used to determine the local prediction accuracy of SVR where e i is the absolute error defined as the absolute difference between each data point evaluated for the exact response and the SVR surrogate model

Numerical studies
This numerical study aims to construct a global surrogate model using SVR that is capable of capturing highly non-smooth and nonlinear responses. In order to determine the appropriate kernel function K and the optimal values of the SVR parameters C and ε, a one-dimensional phenomenological elasto-plastic model is firstly considered in this study. The model consists of three inputs parameters, namely the total strain ε, the hardening parameter k and the yield stress σ y . Thereafter, the optimised SVR parameters are applied for a twodimensional four-point bending beam where the response of interest exhibits highly non-smooth behaviour in the parametric space. Furthermore, the computational cost of the SVR is reduced by using anisotropic training grid. In addition, the accuracy of the constructed SVR is improved by smoothing the response surface using linear regression.

One-dimensional elasto-plastic model
Consider a one-dimensional pure phenomenological elasto-plastic model where the strain ε takes a scalar quantity. The total strain ε can be additively decomposed into an elastic and plastic strain as The splitting is done by an return-mapping algorithm as described specifically for the one-dimensional case in [38]. In this example, the associated plastic strain ε p is considered as the quantity of interests by assigning the total strain ε, the hardening parameter k and the yield stress σ y as input parameters denoted by x = ε, k, σ y . The variation ranges of the inputs parameters are given in Table 1. The Young's modulus is fixed at E = 210000 MPa.
First, Cartesian grids are used as training-and test points, e.g. with eight points in each dimension  The training grid (20) and the separation of the parametric space into an elastic region (ε p = 0) and a plastic region (ε p >= 0) are shown in Fig. 2.
The reference response surface shown in Fig. 3 is obtained from the exact solution by fixing the value of the total strain at ε = 0, 005 and ε = 0, 0028, respectively. It can be seen that the exact response appears to have only nonlinear behaviour in Fig. 3a and begins to behave non-smoothly (e.g. C 0 -continuous in the elastic region where ε p = 0) by decreasing the value of the total strain, as shown in Fig. 3b.
In order to adopt a kernel function that can capture the non-smooth transition of the response surface in the input parametric space, two kernel functions are applied in this example.
First, the Gaussian kernel function defined as is applied to build the surrogate model using SVR. The training points are distributed uniformly in the parametric space with ten points in each dimension. The exact response surface and the approximated one obtained from SVR at a fixed value of the total ε = 0.0028 are clarified in Fig. 4b. It can be seen that the Gaussian kernel provides a qualitatively sufficient approximation of the exact response surface in the range where ε p = 0, and it exhibits a small deviation in the nonlinear region where ε p > 0. Now, the Matèrn 5/2 kernel function defined as  is applied to the same problem and compared with the Gaussian kernel using the error criteria discussed in Sect. 2.1. In both kernels, the correlation length is set to ρ = 1, and the optimal values of the SVR parameters, namely the box constraint C and the tube width ε, are experimentally determined as The absolute error ε max and the relative error ε rel using the Gaussian and Matèrn kernel functions with respect to the number of training points are shown in Fig. 5. It can be seen that the Matèrn kernel function outperforms the Gaussian kernel for both error measures. The Gaussian kernel exhibits high oscillations in the absolute error and an increment in the relative error by increasing the number of the training points (n > 500). On the contrary, the Matèrn kernel provides higher accuracy by increasing the number of the training points for the both errors. However, it shows a slight oscillation for the number of training points (n > 10 4 ) due to the over-fitting phenomenon.

Four-point bending beam with elasto-plastic material behaviour
In this numerical study, the SVR is investigated on a nonlinear elasto-plastic analysis of a four-point bending test. The 200 mm × 20 mm beam is subjected to displacement-controlled loading at x = 60 mm and x = 140 mm, respectively. The beam is supported at x = 10 mm and x = 190 mm. The Poisson's ratio υ = 0.3 and a von Mises ideal plasticity yield criterion with a plane stress assumption are considered in this example. The height of non-plastified cross-section d el is considered as a quantity of interest by assuming the Young's modulus E, a yield stress σ y and the applied displacementū as inputs parameters. The variation ranges of the inputs parameters are given in Table 2.
The contour plot of the equivalent plastic strain ε p eq obtained from FEM simulation at a fixed value of displacementū = 0.2 mm is plotted in Fig. 6.
The non-plastified cross-section height at the centre of the beam is measured at the centre of gravity (centroid) of the non-plastified elements as where N e | ε p eq =0 is the number of elements with zero centroid equivalent plastic strain and l e is the element length. In general, the equivalent plastic strain can be estimated as where ε p denotes the plastic strain tensor. For the sake of higher precision, the equivalent plastic strain is evaluated at the centroid of each element where super convergence is guaranteed, as it is clarified in Fig. 7.
In this example, the beam height at the centre contains N e = 100 finite elements with an element width length equal to l e = 0, 2 mm. The exact response surface obtained from n = 8000 FEM simulations using Cartesian grid distributed equally in the input parametric space is calcified in Fig. 8. It can be observed that the response surface exhibits highly non-smooth and discontinuous in addition to linear and nearly flat behaviour for certain combinations of the input parameters. This discontinuity is mainly caused due to the fact that the non-plastified height of the beam d el is measured by counting the number of element N e with the zero equivalent plastic strain ε p eq along the beam cross section. Consequently, d el can only change by the element length l e for different combination of the input parameters. In order to approximate this response surface, the SVR model is built using the optimal parameters of the box constraint C and the tube width ε (see eqs. (23) and (24)) using the Matèrn 5/2 kernel function.
The accuracy of the SVR approximation using the error criteria discussed in Sect. 2.1 is given in Fig. 9. It is evident that the SVR approximation shows higher local error where the response behaves discontinuously,    On the contrary, the global relative error is rel = 0.017 for the same number of training points. In fact, the response surface shown in Fig. 8 behaves approximately linearly with respect to the yield stress σ y and the Young's modulus E, whereas it shows nonlinearity with respect to the displacement loadingū. Therefore, a natural choice to reduce the computational cost of the SVR model is to assign a higher number of training points only in the direction of the input parameterū. The reduction in the training points in specific dimensions can be applied by using anisotropic training grids. For the sake of comparison with the previously used isotropic training grids, the number of training points in the anisotropic grid is reduced by half in the parameter directions σ y and E, while it remains at the same number of training points in the direction ofū. A comparison between the isotropic and anisotropic training grid is shown Fig. 10.
The accuracy of the SVR approximation using the anisotropic training grid in comparison with the isotropic one is given in Fig. 11. It can be seen that the anisotropic training grid provides approximately the same accuracy as the isotropic grid while reducing the computational cost by a factor of four. For instance, it requires only n = 1900 training points to achieve max ≈ 0, 4 mm absolute error and rel ≈ 0, 017 relative error, whereas the isotropic grid requires n = 6890 training points to obtain the same accuracy.
The computational time required for SVR surrogate model to obtain the target absolute error max ≈ 0, 43 mm using anisotropic training grid is given in Table 3. Table 3 shows that after constructing the SVR surrogate model, the computational time is reduced to 0.18 s compared with 2 s computation time to run a single FEM simulation. This allows a reduction in the computational cost by a factor of 11.

Smoothing the response surface
In fact, the non-plastified height of the beam d el discussed in Sect. 3.2 is mainly measured by counting the number of finite elements N e with zero centroid equivalent plastic strain ε p eq = 0 using Eq. (25). This mesh dependency causes the response surface to behave discontinuously in the parametric space since the nonplastified height d e can only change by the element length l e . This discontinuous behaviour can be weakened by estimating the non-plastified height of the beam depending on the number of integration points with zero equivalent plastic strain. However, this might only weaken the non-smoothness without relieving the mesh dependency of the response surface d el . Alternatively, this section provides a measure of the equivalent plastic strain ε p eq at the centre of the beam depending on the state of all integration points across the beam cross section, including the ones with ε p eq > 0 by using two linear regression functions, defined as and where the signs (+, −) indicate the positive and negative global y-coordinates of the integration points, respectively, and α and β are the linear combination coefficients to be estimated. Once the linear combinations coefficients are estimated using least squares method, the non-plastified height d el can be simply calculated as An example of the regression functions along the beam height is given in Fig. 12.
A comparison of the response surface obtained from n = 8000 combinations of the input parameters using different measures of the equivalent plastic strain ε p eq , namely at the centroid of the element, integration points and by using linear regression, is plotted in Fig. 13. Figure 13c shows that using the linear regression to measure the equivalent plastic strain provides a smooth transition of the quantity of interest d el in the parametric space unlike the centroid and integration points      Figs. 13a and b, respectively. The SVR approximation of the non-plastified height d el is given in Fig. 14.
First, it can be observed from the global relative error rel , plotted in Fig. 14b, that the smoothed curve using linear regression provides a significant improvement in the accuracy of the SVR approximation compared with the centre point and the integration points. On the contrary, it exhibits approximately identical local accuracy with the integration points, as shown in Fig. 14a. This is mainly due to the fact that the quantity of interest d el remains constant at d el = 20 mm when the beam is in a fully elastic state for different combination of the input parameters. Thereafter, it exhibits C 0 transition in the parametric space where the cross-section height begins to be partially plastified where d el < 20 mm, as clarified in Fig. 13.

Conclusion
This study has presented a global surrogate modelling for mechanical systems with nonlinear elasto-plastic material behaviour based on support vector regression (SVR). First, SVR is applied to a purely phenomenological one-dimensional material model. In the numerical study, it is shown that the Matèrn 5/2 kernel function with appropriately optimised parameters tends to have more accurate representation of the output (the plastic strain ε p ). It outperforms the Gaussian kernel. The optimal values of SVR parameters such as the box constraint C and tube width ε are determined in the first numerical study. The results obtained from the one-dimensional elasto-plastic example are therefore applied to continuum elasto-plasticity within the framework of the finite element method. The four-point bending with elasto-plastic material behaviour is chosen as a technical application. The quantity of interest is considered as the non-plastified cross-sectional height d el by representing the displacement magnitudeū, the Young's modulus E and the yield stress σ y as inputs parameters. It is shown that the post-processing and the element size of the finite element mesh have a significant impact on the accuracy of the SVR approximation. Consequently, it causes the quantity of interest to behave discontinuously in the parametric space. To overcome this problem, the quantity of interest d el is smoothed over the input space by using linear regression. The results show that SVR can provide relatively a sufficient approximation of a highly nonlinear and non-smooth responses surfaces while the computational cost is reduced by a factor of 11. Finally, it is worth mentioning that only three input parameters are considered in this paper. However, the accuracy and the performance of the SVR surrogate model might be affected for high-dimensional non-smooth engineering problems.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Funding Open Access funding enabled and organized by Projekt DEAL.