1 Introduction

Computational fluid dynamics (CFD) and finite-element (FE) analyses are powerful tools to solve engineering problems for which no analytical solution can be found. However, for nonlinear problems with many degrees of freedom these methods can be computationally costly. Consequently, the applicability of such models for direct use in inverse analyses, (robust) optimization or control algorithms is limited. For these purposes a surrogate model can be constructed [16, 25]. Surrogate models are cheap-to-evaluate approximation models that mimic the output of expensive models. The CFD and FE models can be considered as a black box mapping from an input space to an output space. The relations between inputs and outputs can be approximated by fitting a function to input–output data obtained from evaluations of the considered FE or CFD model [8].

In many cases, surrogate modelling (also referred to as metamodelling) is used to map a multidimensional input space to a scalar output space. To construct surrogate models, the expensive model is evaluated on predefined sample points in the input parameter space. The sample points and the results of the expensive model together are referred to as the training set.

Many different metamodelling techniques exist [16, 18, 25, 38]. In our work, metamodels of metal forming processes are studied. These processes are a suitable benchmark because they involve strong nonlinearities. For example, in the work of Wei et al. [36], the wrinkling tendency in a forming process of a car sheet part is expressed as a scalar function of process parameters. In their work the response of an FE-model is replaced with a response surface model to optimize the process. In the work of Wiebenga et al. [37] an industrial V-bending process is optimized. In the V-bending process the flange shape is defined by the main angle, a scalar, which is modelled using a kriging metamodel. Other examples in which metamodels replace metal-forming process simulations use polynomial regression or response surface methodology (RSM) [4, 14, 15, 27, 30, 33, 36], support vector regression (SVR) [31, 34, 35], Multivariate Adaptive Regression Splines (MARS) [3, 21], kriging [6, 32], radial basis functions (RBF) [29] and neural networks [20, 28]. Some of the aforementioned metamodelling techniques, such as kriging and RBF, interpolate the existing data. Other techniques, such as RSM and SVR, are regression methods. In this work we will focus on interpolating metamodelling techniques and we will refer to the output that is being interpolated as the interpolant.

In several applications, it is required to model a result array instead of a scalar. Then the interpolant consists of more than one variable and the output field may even contain different physical quantities at the same time. When the training set for full field surrogate models is large, reduction techniques can be applied. A commonly used reduction technique is the Proper Orthogonal Decomposition. Proper Orthogonal Decomposition (POD) is an umbrella term that includes Principal Component Analysis (PCA), the Karhunen–Loève Expansion (KLE) and Singular Value Decomposition (SVD) [19]. The goal of the decomposition is to obtain a low-dimensional representation of the output field. This is done by seeking the predominant modes in the output data of the training set and using these as a new basis.

POD is usually combined with RBF interpolation. For example, in Hamdaoui et al. [11], a POD-based surrogate is used to describe the displacement field of the stamping of an axisymmetric cup to model the major and minor strain in a Forming Limit Diagram. In Dang et al. [6] the displacement field is described by a POD-based surrogate model to optimize the shape of a metal product after springback.

Most metamodels have hyperparameters associated with them that influence the fitting quality of the metamodels in the output space. Different criteria can be used to choose these parameters in order to optimize the fitting quality of a metamodel. We will refer to the criterion that is used for optimizing the hyperparameters as the fitting criterion. In this work, we will focus on RBF as an interpolation method. To quantify the quality of the interpolation, different fitting criteria can be used, such as the likelihood criterion for kriging interpolation [17], or a criterion based on the Leave-One-Out (LOO) cross-validation values for RBF interpolation [24].

The key question in this work is how to efficiently create an accurate reproduction of a result array consisting of multiple physical fields using RBF interpolation. Three different approaches for the construction of such a surrogate model are presented. The main difference between the approaches is the interpolant. The first approach is based on a direct interpolation of the result array on the input space. The second and third approaches involve model reduction. In these approaches a reduced basis is determined using singular value decomposition (SVD). The left singular vectors will be the basis vectors in the truncated basis. By projecting the result arrays onto the basis vectors, the amplitudes of the result arrays in the new basis can be found. These amplitudes corresponding to different basis vectors are interpolated in two different ways, by means of array and scalar interpolation. For array interpolation, the interpolant will be the array that collects the amplitudes corresponding to all basis vectors in the basis. For scalar interpolation, the interpolant will be a scalar that describes the amplitude corresponding to a single basis vector. In the case of array interpolation all amplitudes are interpolated simultaneously, whereas in the case of scalar interpolation the amplitudes of the separate basis vectors are interpolated independently. In this work, the hyperparameters corresponding to the different approaches for surrogate model construction will be optimized. To do so, a new quality measure that takes the physical parts into account is used as a fitting criterion.

When a reduced order model is constructed, the number of basis vectors to be retained in the truncated basis has to be determined. While inclusion of more basis vectors generally increases the model's accuracy, it can also add more noise, especially through the higher order basis vectors [26]. A new criterion for choosing the number of basis vectors to be included in the truncated basis of the reduced order models is proposed.

This paper is organized as follows: in Sect. 2, three approaches for surrogate model construction are described. Thereafter, Sect. 3 describes the quality measure and the different sources of error that can be distinguished. Furthermore, it is described how the quality measure and sources of error can be used in a fitting criterion, as well as in a criterion to choose the number of basis vectors in the truncated basis. A demonstrator process is introduced in Sect. 4. In Sect. 5, the performance of the three approaches is compared.

2 Constructing different surrogate models

Three surrogate models of a result field consisting of different physical parts are constructed. The described approaches are generic and can be applied to any result field. As an example, the output of a single FE simulation consisting of M variables is considered. The variables can be stored in an \(M \times 1\) result vector \({\textbf{y}}\). In this case, it is assumed that the output is fully described by three physical fields: the displacement \(\textrm{u}\), the equivalent plastic strain \(\upvarepsilon\) and the stress \(\upsigma\). The different physical parts are partitioned as subarrays in the result array:

$$\begin{aligned} {\textbf{y}} = \left\{ \, \begin{array}{c} \left\{ {\textbf{y}}_\textrm{u} \right\} \\ \left\{ {\textbf{y}}_\upvarepsilon \right\} \\ \left\{ {\textbf{y}}_\upsigma \right\} \\ \end{array}\,\right\} , \end{aligned}$$
(1)

in which \({\textbf{y}}_\textrm{u}\) contains \(M_\textrm{u}\) displacements, \({\textbf{y}}_\upvarepsilon\) contains \(M_\upvarepsilon\) equivalent plastic strains and \({\textbf{y}}_\upsigma\) contains \(M_\upsigma\) stress components.

To construct the different surrogate models, a training set needs to be obtained. The training set is obtained by sampling the input space and evaluating the black-box model, e.g., an FE simulation, on the sample points \({\textbf{x}}_i\).

The training set consists of the set of inputs \({\textbf{X}}\) and the corresponding result arrays that are collected in the matrix \({\textbf{Y}}\):

$$\begin{aligned} \begin{array}{ccc} {\textbf{X}} = \left[ \begin{array}{c} \\ \ldots \\ \\ \end{array} \left\{ \, \begin{array}{c} \\ {\textbf{x}}_i \\ \\ \end{array}\,\right\} \begin{array}{c} \\ \ldots \\ \\ \end{array} \right] &{} \xrightarrow {\text {FEA}} &{} {\textbf{Y}} = \left[ \begin{array}{c} \\ \ldots \\ \\ \end{array} \left\{ \, \begin{array}{c} \\ {\textbf{y}}_i \\ \\ \end{array}\,\right\} \begin{array}{c} \\ \ldots \\ \\ \end{array} \right] \\ \,\,\,\,\,\,\,\,\,\, N_\textrm{dim} \times N_\textrm{exp} &{} &{} \,\,\,\,\,\,\,\,\,\, M \times N_\textrm{exp} \end{array}, \end{aligned}$$
(2)

in which \(N_\textrm{dim}\) is the dimensionality of the input space and \(N_\textrm{exp}\) is the number of sample points or experiments in the training set. We will refer to the matrix \({\textbf{Y}}\), in which the result arrays are collected, as the snapshot matrix.

Three different surrogate models will be constructed. An overview of the construction procedures for the different surrogate models is presented in Fig. 1. The first surrogate model is obtained by directly interpolating the result array on the input space, as will be described in Sect. 2.2. This method is referred to as ‘Direct interpolation’. In this method the interpolant will be directly related to the result array. The full result space that is available based on the training set will be used for interpolation.

The second and third surrogate models are reduced order models, in which singular value decomposition (SVD) is used to reduce the output space. In an SVD the predominant modes in the output data are sought to form a new orthogonal basis. As the first vectors in this basis capture most of the variation, the basis can be truncated to K basis vectors. The output fields are projected onto this truncated basis to find the amplitudes. These amplitudes will be interpolated to find a continuous surrogate model. To construct the second surrogate model, the amplitudes corresponding to all basis vectors are interpolated on the input space as one array, as described in Sect. 2.4.1. Hence, the interpolant is a K-dimensional array and there will be only one set of optimized hyperparameters for all the amplitudes corresponding to different basis vectors. This method is referred to as ‘SVD, array interpolation’.

For the construction of the third surrogate model, the amplitude function is interpolated for each basis vector separately, as will be described in Sect. 2.4.2. Hence, the interpolants are K scalars. The hyperparameters are optimized for the amplitude function of each basis direction. This method is referred to as ‘SVD, scalar interpolation’ and is depicted with the rightmost path in Fig. 1.

The reduced order models are cheaper to evaluate than the surrogate model based on ‘Direct interpolation’. However, as no reduction is applied in the ‘Direct interpolation’ model, it is expected to be the most accurate. The reduced order model that is based on scalar interpolation of the amplitudes has more freedom to optimize the model parameters. Although this flexibility in the fitting procedure increases the risk of overfitting [23], SVD combined with scalar interpolation is expected to result in a more accurate surrogate model than SVD combined with array interpolation.

Fig. 1 Flowchart of the construction of the three different surrogate models

2.1 Preprocessing the snapshot matrix

To obtain more accurate surrogate models, the output data in the training set will be preprocessed. The function \(f_*(\cdot )\) transforms the snapshot matrix into the preprocessed snapshot matrix \({\textbf{Y}}_*\), where the subscript denotes the applied preprocessing. The function that reverses the applied preprocessing is called the post-processing function and is denoted with \(f_*^{-1}(\cdot )\).

In all three approaches the mean is subtracted from the snapshot matrix. The mean of each row in the snapshot matrix is calculated using

$$\begin{aligned} \overline{{\textbf{y}}} = \frac{1}{N_\textrm{exp}}{\textbf{Y}}{\textbf{1}}_{N_\textrm{exp}}, \end{aligned}$$
(3)

in which \({\textbf{1}}_{N_\textrm{exp}}\) is an \(N_\textrm{exp} \times 1\) vector of ones, following the notation used by Pronzato [22]. The zero centered snapshot matrix will be

$$\begin{aligned} {\textbf{Y}}_0 = f_0({\textbf{Y}}) = {\textbf{Y}} - \overline{{\textbf{y}}}{\textbf{1}}_{N_\textrm{exp}}^T. \end{aligned}$$
(4)

When the output data consist of different physical parts, e.g., the displacement, strain and stress as shown in Eq. (1), the data of the physical parts can be stored in one snapshot matrix or in separate snapshot matrices. These snapshot matrices can be reduced by means of a Proper Orthogonal Decomposition. It has been shown in earlier work that decomposing the different physical parts in one matrix improves the overall accuracy of the surrogate model [7].

Typically, the strain components are in the order of 10\(^{-1}\), while the stress components are in the order of 10\(^1\)–10\(^2\) MPa (or 10\(^7\)–10\(^8\) Pa). When the different physical quantities are of different orders of magnitude, scaling must be applied to improve the decomposition. Without scaling, the decomposition will be dominated by the component with the highest numerical values [7]. Guénot [10] proposed to scale the snapshot matrix with the range, mean or standard deviation. In this work it is chosen to scale each physical part by its range of observed values in the data set.

For example, the scaling constant \(s_\textrm{u}\) for the displacement field is calculated as

$$\begin{aligned} s_\textrm{u} = \left( \max \left( {\textbf{Y}}_{\textrm{u}} - \overline{{\textbf{y}}}_\textrm{u} {\textbf{1}}^\textrm{T}_{N_\textrm{exp}} \right) - \min \left( {\textbf{Y}}_{\textrm{u}} - \overline{{\textbf{y}}}_\textrm{u} {\textbf{1}}^\textrm{T}_{N_\textrm{exp}} \right) \right) ^{-1}. \end{aligned}$$
(5)

Now we can define a scaling array \({\textbf{s}} = \{s_m\}\) as

$$\begin{aligned} s_m = \left\{ \, \begin{array}{ll} s_\textrm{u} &{}m = 1,\ldots ,M_\textrm{u}\\ s_\upvarepsilon &{}m = M_\textrm{u} + (1,\ldots ,M_\upvarepsilon ) \\ s_\upsigma &{}m = M_\textrm{u} + M_\upvarepsilon + (1,\ldots ,M_\upsigma ) \\ \end{array} \right. . \end{aligned}$$
(6)

The preprocessed snapshot matrix takes the following form:

$$\begin{aligned} {\textbf{Y}}_\textrm{scaled} = f_\textrm{scaled}({\textbf{Y}}) = \textrm{diag}({\textbf{s}}) \left[ {\textbf{Y}} -\overline{{\textbf{y}}}{\textbf{1}}_{N_\textrm{exp}}^T \right] . \end{aligned}$$
(7)
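
As an illustration, the preprocessing of Eqs. (3)–(7) can be written compactly in a few lines of NumPy. The sketch below uses small, randomly generated physical parts; the sizes and variable names (`Y`, `parts`, `s`) are illustrative only and not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: M_u displacements, M_eps strains and M_sig stresses,
# stacked as in Eq. (1), for N_exp training simulations.
M_u, M_eps, M_sig, N_exp = 6, 4, 8, 10
Y = np.vstack([rng.normal(0.0, 1.0, (M_u, N_exp)),       # displacements, O(1) mm
               rng.normal(0.0, 0.1, (M_eps, N_exp)),     # strains, O(10^-1)
               rng.normal(0.0, 100.0, (M_sig, N_exp))])  # stresses, O(10^2) MPa

# Zero-center each row (Eqs. (3) and (4)).
y_mean = Y.mean(axis=1, keepdims=True)
Y0 = Y - y_mean

# Range scaling per physical part (Eqs. (5) and (6)): one constant per part,
# the reciprocal of the range of the centered values of that part.
parts = [slice(0, M_u),
         slice(M_u, M_u + M_eps),
         slice(M_u + M_eps, M_u + M_eps + M_sig)]
s = np.empty(M_u + M_eps + M_sig)
for p in parts:
    s[p] = 1.0 / (Y0[p].max() - Y0[p].min())

# Preprocessed snapshot matrix (Eq. (7)).
Y_scaled = s[:, None] * Y0
```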

2.2 Direct interpolation

The first method to construct a surrogate model directly interpolates the result array on the input parameter space. To construct a continuous surrogate model of the result array, the data in the zero centered snapshot matrix from Eq. (4) are interpolated. The corresponding interpolant is an array.

To perform the interpolation, radial basis functions (RBF) will be used. In RBF interpolation, a basis function is placed at the location of each data point in the input parameter space. We will derive the radial basis interpolation of an arbitrary interpolant \(\hat{{\textbf{f}}}({\textbf{x}})\), described by an \(M \times 1\) output array.

The RBF interpolation of interpolant \(\hat{{\textbf{f}}}({\textbf{x}})\) will be the sum of the weighted radial basis functions of each sample point:

$$\begin{aligned} \begin{array}{cccccccc} \hat{{\textbf{f}}}({\textbf{x}}) &{}=&{} \displaystyle \sum _{i=1}^{N_{\textrm{exp}}} &{} {\textbf{w}}_i &{} g_i({\textbf{x}}) &{}=&{} {\textbf{W}} &{} {\textbf{g}}({\textbf{x}}) \\ M \times 1 &{}&{} &{} M \times 1 &{} 1 \times 1 &{}&{} M \times N_\textrm{exp} &{} N_\textrm{exp} \times 1 \end{array}, \end{aligned}$$
(8)

in which \(g_i({\textbf{x}})\) is a radial basis function that depends on the Euclidean distance \(\Vert \cdot \Vert\) between an arbitrary point \({\textbf{x}}\) and sample point \({\textbf{x}}_i\) in the training set, \({\textbf{w}}_i\) is the array that collects the M weights for the radial basis function \(g_i({\textbf{x}})\), and \({\textbf{W}}\) is the weight matrix that collects the M weights of all \(N_\textrm{exp}\) radial basis functions. The weights for all basis functions can be solved for using the interpolation requirement:

$$\begin{aligned} \hat{{\textbf{f}}}({\textbf{x}}_j) = {\textbf{f}}_j. \end{aligned}$$
(9)

This leads to the following linear system of equations:

$$\begin{aligned} \begin{array}{ccccc} {\textbf{W}} &{} {\textbf{G}} &{} = &{} {\textbf{F}} &{} \\ M \times N_\textrm{exp} &{} N_\textrm{exp} \times N_\textrm{exp} &{} &{} M \times N_\textrm{exp} &{} \end{array}, \end{aligned}$$
(10)

in which \(G_{ij} = g_i({\textbf{x}}_j)\) and \({\textbf{F}}\) is the matrix with output training data of the interpolant.

Different choices can be made for the basis function \(g_i({\textbf{x}})\). In many studies it has been shown that the performance of the multiquadric RBF for scalar interpolation is generally good [9, 16]. In a comparative study performed by Hamim [12] on the application of RBF in POD-based surrogate models, it was shown that multiquadric RBFs perform best. The predictive accuracy of the interpolation is improved by application of global scaling parameters \(\mathbf {\uptheta }\) [13]. The global scaling parameters scale the parameter space in each dimension.

The multiquadric RBF with global scaling has the following form:

$$\begin{aligned} g_i({\textbf{x}};\mathbf {\uptheta }) = \sqrt{1 + \left( \Vert \textrm{diag}(\mathbf {\uptheta }) ({\textbf{x}} - {\textbf{x}}_i ) \Vert \right) ^2}. \end{aligned}$$
(11)

The global scaling parameters \(\mathbf {\uptheta }\) are the hyperparameters that will be optimized based on a scalar error measure, as will be described in Sect. 3.

To construct the surrogate model based on direct interpolation, the zero centered snapshot matrix is used as the matrix with output training data of the interpolant \({\textbf{F}} = {\textbf{Y}}_0\). The approximated result array using direct interpolation will be:

$$\begin{aligned} \hat{{\textbf{y}}}({\textbf{x}}) = \bar{{\textbf{y}}} + {\textbf{W}} {\textbf{g}}({\textbf{x}};\mathbf {\uptheta }) \quad \text { in which: } {\textbf{W}} = \left( {\textbf{Y}} - \bar{{\textbf{y}}}{\textbf{1}}_{N_\textrm{exp}}^T \right) {\textbf{G}}^{-1}. \end{aligned}$$
(12)

Scaling of the snapshot matrix has no influence on the approximation of the result vector when direct interpolation is used to construct the surrogate model.
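
For concreteness, a minimal NumPy sketch of the direct-interpolation model of Eqs. (8)–(12) is given below. The multiquadric basis of Eq. (11) is used; the global scaling parameters `theta` are kept fixed here, whereas in Sect. 3 they are optimized. All data are synthetic and the function names are our own.

```python
import numpy as np

def mq_basis(X_eval, X_train, theta):
    """Multiquadric RBF of Eq. (11) with global scaling parameters theta.
    X_eval: N_dim x P evaluation points, X_train: N_dim x N_exp sample points.
    Returns the N_exp x P matrix with entries g_i(x_j)."""
    diff = theta[:, None, None] * (X_eval[:, None, :] - X_train[:, :, None])
    return np.sqrt(1.0 + np.sum(diff**2, axis=0))

rng = np.random.default_rng(1)
N_dim, N_exp, M = 2, 20, 50
X = rng.uniform(size=(N_dim, N_exp))        # training inputs
Y = rng.normal(size=(M, N_exp))             # snapshot matrix (toy data)
y_mean = Y.mean(axis=1, keepdims=True)
theta = np.ones(N_dim)                      # hyperparameters, fixed for this sketch

# Solve W G = Y0 for the weight matrix (Eqs. (10) and (12)).
G = mq_basis(X, X, theta)                   # G_ij = g_i(x_j)
W = np.linalg.solve(G.T, (Y - y_mean).T).T  # M x N_exp

# Approximate the result vector at a new point (Eq. (12)).
x_new = rng.uniform(size=(N_dim, 1))
y_hat = y_mean + W @ mq_basis(x_new, X, theta)
```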

2.3 Reduction of the snapshot matrix

When the number of variables M in the result array for which a surrogate model is constructed is large, storage issues may arise. In that case reduction techniques can be applied to construct the surrogate model. Another benefit of using reduction methods in the construction of surrogate models is that noise in the output field is reduced [26].

In this work, singular value decomposition (SVD) is used to find the proper orthogonal basis vectors [1]. The SVD of the preprocessed snapshot matrix takes the following form:

$$\begin{aligned} \begin{array}{cccc} {\textbf{Y}}_{\textrm{scaled}} = {\varvec{\Phi }} {\textbf{D}} {\textbf{V}}^T = &{} \, \left[ \begin{array}{c} \\ \ldots \\ \\ \end{array} \left\{ \, \begin{array}{c} \\ \varvec{\upvarphi }_{n} \\ \\ \end{array}\,\right\} \begin{array}{c} \\ \ldots \\ \\ \end{array} \right] &{} \,\left[ \begin{array}{ccc} {{\ddots }} &{} &{} 0 \\ &{} d_{n} &{} \\ 0 &{} &{} {{\ddots }} \end{array} \right] &{} \,\left[ \begin{array}{ccc} &{} {\vdots } &{} \\ \left\{ \right. &{} {\textbf{v}}^T_{n} &{} \left. \right\} \\ &{} {\vdots } &{} \end{array} \right] \\ &{} M \times N_\textrm{exp} &{} N_\textrm{exp} \times N_\textrm{exp} &{} N_\textrm{exp} \times N_\textrm{exp}. \end{array} \end{aligned}$$
(13)

The preprocessed snapshot matrix is decomposed into three matrices: \({\varvec{\Phi }}\) that contains the left singular vectors \(\varvec{\upvarphi }_n\) as its columns, \({\textbf{D}}\) that contains the singular values \(d_n\) on its diagonal and \({\textbf{V}}\) that contains the right singular vectors \({\textbf{v}}_n\). The subscript n denotes the n-th direction in the basis. The columns of the left singular vector matrix span an orthogonal coordinate system in the result space and will be used as a basis for the data.

As the singular values are sorted from largest to smallest, most of the information is captured by the first singular vectors. The basis can therefore be truncated, such that it contains only the first K basis vectors. The truncated basis with K basis vectors is defined as

$$\begin{aligned} \begin{aligned} {\varvec{\Phi }}^{[K]}&= \left[ \left\{ \, \begin{array}{c} \\ \varvec{\upvarphi }_1 \\ \\ \end{array}\,\right\} \begin{array}{c} \\ \ldots \\ \\ \end{array} \left\{ \, \begin{array}{c} \\ \varvec{\upvarphi }_K \\ \\ \end{array}\right\} \right] \\&\,\quad \quad \quad \quad \quad M \times K \end{aligned}. \end{aligned}$$
(14)

Because the mean has been subtracted from the snapshot matrix, the maximum number of available basis vectors will be \(N_\textrm{exp}-1\). The projections of the result vector on the basis vectors are referred to as amplitudes. The amplitudes are found by multiplying the singular values with the right singular vectors:

$$\begin{aligned} \begin{aligned} {\textbf{A}}^{[K]} = {\textbf{D}}^{[K]} {\textbf{V}}^{[K]T}&= \left[ \begin{array}{ccc} &{} {\vdots } &{} \\ \left\{ \right. &{} \varvec{\upalpha }^T_n &{} \left. \right\} \\ &{} {\vdots } &{} \end{array} \right] \\&\,\quad \quad \,\,\,K \times N_\textrm{exp} \end{aligned}. \end{aligned}$$
(15)

The vector \(\varvec{\upalpha }_n\) collects the amplitudes of the \(N_\textrm{exp}\) result vectors corresponding to basis vector \(\varvec{\upvarphi }_n\). The K-rank approximation of the preprocessed snapshot matrix, \({\textbf{Y}}_\textrm{scaled}^{[K]}\), can now be written as

$$\begin{aligned} {\textbf{Y}}_\textrm{scaled}^{[K]} = {\varvec{\Phi }}^{[K]} {\textbf{A}}^{[K]}. \end{aligned}$$
(16)

By rewriting Eq. (7) and substituting Eq. (16) the K-rank approximation \({\textbf{Y}}^{[K]}\) of the snapshot matrix can be found:

$$\begin{aligned} {\textbf{Y}}^{[K]} = \overline{{\textbf{y}}}{\textbf{1}}_{N_\textrm{exp}}^T + \textrm{diag}({\textbf{s}})^{-1}\left( {\varvec{\Phi }}^{[K]}{\textbf{A}}^{[K]} \right) . \end{aligned}$$
(17)

The ith column in this matrix is the K-rank approximation \({\textbf{y}}_{i}^{[K]}\) of the result vector of experiment i:

$$\begin{aligned} {\textbf{y}}^{[K]}_{i} = \overline{{\textbf{y}}} + \textrm{diag}({\textbf{s}})^{-1}\left( \sum \limits _{n=1}^K \varvec{\upvarphi }_n \alpha _{ni} \right) . \end{aligned}$$
(18)

Similarly, the result vector of experiment i that is approximated solely with basis vector \(\varvec{\upvarphi }_n\) is defined as

$$\begin{aligned} {\textbf{y}}^{n}_{i} = \overline{{\textbf{y}}} + \textrm{diag}({\textbf{s}})^{-1}\left( \varvec{\upvarphi }_n \alpha _{ni} \right) . \end{aligned}$$
(19)

This approximation will be used to optimize the hyperparameters of the surrogate model constructed using ‘SVD, scalar interpolation’ in Sect. 3.4.3.
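
A short sketch of the reduction step of Eqs. (13)–(16), again with synthetic data; NumPy's `svd` returns the singular values sorted in descending order, as assumed above.

```python
import numpy as np

rng = np.random.default_rng(2)
M, N_exp, K = 50, 20, 5
Y_scaled = rng.normal(size=(M, N_exp))            # preprocessed snapshot matrix
Y_scaled -= Y_scaled.mean(axis=1, keepdims=True)  # zero-centered as in Eq. (7)

# Thin SVD (Eq. (13)).
Phi, d, Vt = np.linalg.svd(Y_scaled, full_matrices=False)

# Truncated basis (Eq. (14)) and amplitude matrix (Eq. (15)).
Phi_K = Phi[:, :K]                # M x K left singular vectors
A_K = np.diag(d[:K]) @ Vt[:K, :]  # K x N_exp amplitudes

# K-rank approximation of the preprocessed snapshot matrix (Eq. (16)).
Y_scaled_K = Phi_K @ A_K
```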

2.4 Amplitude interpolation

When SVD is applied to construct the surrogate models, the amplitudes \({\textbf{A}}\) corresponding to the different basis vectors will be interpolated to form a continuous surrogate model on the input space. This can be done using the same method as described in Sect. 2.2. The amplitudes can be interpolated using two different approaches, array interpolation or scalar interpolation. When array interpolation is used, the interpolant describes multiple amplitudes collected in an array. When scalar interpolation is used, the interpolant is the amplitude corresponding to one basis vector. For each basis vector to be included in the prediction, a separate scalar amplitude interpolation is fitted.

2.4.1 Array interpolation

If the amplitudes are interpolated using array interpolation, all amplitudes corresponding to the K basis vectors will be interpolated at once. There will be only one vector with global scaling parameters \(\varvec{\uptheta }\) for all K basis vectors. The interpolant as described in Eq. (8) will be the array with K interpolated amplitudes, \(\hat{{\textbf{f}}}({\textbf{x}}) = \hat{\varvec{\upalpha }}({\textbf{x}})\). The expression for the interpolant will be

$$\begin{aligned} \begin{array}{cccc} \hat{\varvec{\upalpha }}^{[K]}({\textbf{x}}) &{}=&{} {\textbf{W}}^{[K]} &{} {\textbf{g}}({\textbf{x}}{;}\mathbf {\uptheta }) \\ K \times 1 &{}&{} K \times N_\textrm{exp} &{} N_\textrm{exp} \times 1 \end{array}. \end{aligned}$$
(20)

The matrix with output training data of the interpolant that is used to calculate the weights is the amplitude matrix \({\textbf{A}}^{[K]}\) as given in Eq. (15). Following Eq. (10) the matrix with weights is found using

$$\begin{aligned} \begin{array}{cccc} {\textbf{W}}^{[K]} &{}=&{} {\textbf{A}}^{[K]} &{}{\textbf{G}}^{-1} \\ K \times N_\textrm{exp} &{}&{} K \times N_\textrm{exp} &{} N_\textrm{exp} \times N_\textrm{exp} \end{array}. \end{aligned}$$
(21)

The approximated result vector using SVD and array interpolation will be

$$\begin{aligned} \hat{{\textbf{y}}}({\textbf{x}}) = \bar{{\textbf{y}}} + \textrm{diag}({\textbf{s}})^{-1}\left( {\varvec{\Phi }}^{[K]} \hat{\varvec{\upalpha }}^{[K]}({\textbf{x}}) \right) . \end{aligned}$$
(22)

2.4.2 Scalar interpolation

The other possibility is to interpolate the amplitude of each basis direction separately. Then, the interpolant is a scalar function for each basis vector and there will be a vector with global scaling parameters \(\varvec{\uptheta }\) for each basis vector. The reasoning behind optimizing the RBF for each basis vector separately is that different modes may be expected to have different dependencies on the input parameters, which potentially enhances the model accuracy when the RBFs of different modes are optimized separately. The interpolant as described in Eq. (8) will be a scalar with the amplitude corresponding to basis vector n, \({\hat{f}}({\textbf{x}}) = {\hat{\upalpha }}_n({\textbf{x}})\). The expression for the interpolant will be

$$\begin{aligned} \begin{array}{cccc} {\hat{\upalpha }}_n({\textbf{x}}) &{}=&{} {\textbf{w}} &{} {\textbf{g}}({\textbf{x}}{;}\mathbf {\uptheta }_n) \\ 1 \times 1 &{}&{} 1 \times N_\textrm{exp} &{} N_\textrm{exp} \times 1 \end{array}. \end{aligned}$$
(23)

The matrix with output training data of the interpolant that is used to calculate the weights is the amplitude array \(\varvec{\upalpha }^T_n\), a row in the amplitude matrix given in Eq. (15). Following Eq. (10) the array with weights is found using

$$\begin{aligned} \begin{array}{cccc} {\textbf{w}} &{}=&{} \varvec{\upalpha }^T_n &{}{\textbf{G}}^{-1} \\ 1 \times N_\textrm{exp} &{}&{} 1 \times N_\textrm{exp} &{} N_\textrm{exp} \times N_\textrm{exp} \end{array}. \end{aligned}$$
(24)

Hence, the approximated result vector using SVD and scalar interpolation is

$$\begin{aligned} \hat{{\textbf{y}}}({\textbf{x}}) = \bar{{\textbf{y}}} + \textrm{diag}({\textbf{s}})^{-1}\left( \sum \limits _{n=1}^K \varvec{\upvarphi }_n {\hat{\upalpha }}_n({\textbf{x}}) \right) . \end{aligned}$$
(25)
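
The two amplitude-interpolation variants differ only in how the weights and hyperparameters are organized. The sketch below contrasts Eqs. (20)–(22) with Eqs. (23)–(25); it reuses the `mq_basis` helper from the direct-interpolation sketch, and the remaining names (`Phi_K`, `A_K`, `y_mean`, `s`) refer to the earlier sketches as well.

```python
import numpy as np

# Reuses mq_basis from the direct-interpolation sketch; Phi_K (M x K),
# A_K (K x N_exp), X (N_dim x N_exp), y_mean (M x 1) and s (M,) as before.

def predict_array(x, X, Phi_K, A_K, y_mean, s, theta):
    """SVD + array interpolation (Eqs. (20)-(22)): one theta for all K amplitudes."""
    G = mq_basis(X, X, theta)
    W = np.linalg.solve(G.T, A_K.T).T             # K x N_exp weights, Eq. (21)
    a_hat = W @ mq_basis(x, X, theta)             # K x 1 amplitudes, Eq. (20)
    return y_mean + (Phi_K @ a_hat) / s[:, None]  # Eq. (22)

def predict_scalar(x, X, Phi_K, A_K, y_mean, s, thetas):
    """SVD + scalar interpolation (Eqs. (23)-(25)): one theta_n per basis vector."""
    a_hat = np.empty((Phi_K.shape[1], 1))
    for n, theta_n in enumerate(thetas):
        G = mq_basis(X, X, theta_n)
        w = np.linalg.solve(G.T, A_K[n])          # N_exp weights, Eq. (24)
        a_hat[n] = w @ mq_basis(x, X, theta_n)    # Eq. (23)
    return y_mean + (Phi_K @ a_hat) / s[:, None]  # Eq. (25)
```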

3 Optimization of surrogate model parameters

This section describes how to find the optimal values of the global scaling parameters \(\varvec{\uptheta }\) and how to choose the number of basis vectors K for the reduced order models. To do so, first a scalar quality measure that takes into account the different physical parts is introduced in Sect. 3.1. This quality measure will be used as a fitting criterion to find the optimal values of the global scaling parameters and will also be used to define a criterion for the number of basis vectors to be included in the reduced basis. To optimize the global scaling parameters and choose the number of basis vectors, it is important to understand the different sources of error in the surrogate models. The error between a reference solution \({\textbf{y}}_q\) and its approximated result is an \(M\times 1\) array, which is referred to as the error array \(\varvec{\upepsilon }_q = \{ \epsilon _{mq} \}\).

In Sect. 5.2, the scalar quality measure that is defined in this section will also be used to assess the quality of the obtained surrogate models based on a validation set. Therefore, the reference solution \({\textbf{y}}_q\) can either be a result array in the training set, or a result array in a validation set. The resulting error arrays are denoted with \(\tilde{\varvec{\upepsilon }}_q\) when based on the training set and \(\varvec{\upepsilon }_q\) when based on the validation set.

Different sources of error can be distinguished; they are annotated with an upper prescript. For the model that is based on direct interpolation the total error, denoted with \(\delta\) (Sect. 3.2), is mainly the interpolation error. In Sect. 5.2, it will be shown that there is an additional small error due to the inherent sparsity of the data. For the reduced order models two sources of error are distinguished: the interpolation error denoted with \(\iota\) (Sect. 3.4) and the truncation error denoted with \(\kappa\) (Sect. 3.3). The total error is denoted with \(\tau\) (Sect. 3.2).

For all approaches, a quality measure based on the error due to interpolation will be used to optimize the global scaling parameters \(\varvec{\uptheta }\). In Sect. 3.5 the relation between the interpolation error and the truncation error is used to choose the number K of basis vectors.

3.1 Quality measure considering physical parts

A quality measure is defined based on a reference solution \({\textbf{y}}_q\) such that it can be used to calculate the error with respect to the training set as well as with respect to a validation set.

The error array contains the differences between the reference solution and the approximated result in components of different physical parts. Similar to the result vector itself, the error array can be partitioned in a part corresponding to the displacement field, \(\varvec{\upepsilon }_{{\textrm{u}},q}\), the equivalent plastic strain field, \(\varvec{\upepsilon }_{\upvarepsilon ,q}\), and the stress tensor field, \(\varvec{\upepsilon }_{\upsigma ,q}\).

To obtain a scalar measure for the quality of the result field the Fraction of Variance Unexplained (\(\textrm{FVU}\)) is used. The fraction of variance unexplained is defined as the Sum of Squared Errors (\(\textrm{SSE}\)) divided by the Total Sum of Squares (\(\textrm{SST}\)) of the zero centered data. For example the \(\textrm{FVU}\) in the displacement field is

$$\begin{aligned} \textrm{FVU}_\textrm{u} = \frac{\textrm{SSE}_\textrm{u}}{\textrm{SST}_\textrm{u}} = \frac{ \sum \limits _{q = 1}^{Q} ( \Vert \varvec{\upepsilon }_{\textrm{u},q} \Vert _2 )^2 }{ \sum \limits _{q = 1}^{Q} (\Vert {\textbf{y}}_{\textrm{u},q} - \bar{{\textbf{y}}}_{\textrm{u}} \Vert _2)^2 } = \frac{ \sum \limits _{q = 1}^{Q} \sum \limits _{m = 1}^{M_{\textrm{u}}} (\epsilon _{{\textrm{u}},mq})^2 }{ \sum \limits _{q = 1}^{Q} \sum \limits _{m = 1}^{M_{\textrm{u}}} ( y_{\textrm{u},mq} - {\bar{y}}_{\textrm{u},m})^2 }. \end{aligned}$$
(26)

The \(\textrm{FVU}\) in the equivalent plastic strain and the stress are calculated equivalently. The total \(\textrm{FVU}\) can be defined as

$$\begin{aligned} \textrm{FVU} = \frac{1}{3} \left( \textrm{FVU}_{{\textrm{u}}} + \textrm{FVU}_{\upvarepsilon } + \textrm{FVU}_{{\upsigma }} \right) . \end{aligned}$$
(27)

Due to the normalization in Eq. (26) the three different parts are equally important in the error measure, independent of the dimension of each physical field and of the magnitudes of the values.

Again, a distinction is made depending on whether the \(\textrm{FVU}\) is based on the training set or on the validation set. The \(\textrm{FVU}\) based on the training set is denoted with \(\widetilde{\textrm{FVU}}\).
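
A sketch of the quality measure of Eqs. (26) and (27) is given below. The function takes the error arrays and reference solutions for all Q evaluation points; `parts` holds the row slices of the physical parts, as in the preprocessing sketch. Taking the mean \(\bar{{\textbf{y}}}\) over the supplied reference set is an assumption of this sketch.

```python
import numpy as np

def fvu_total(E, Y_ref, parts):
    """Total FVU of Eqs. (26) and (27).
    E: M x Q error arrays, Y_ref: M x Q reference solutions,
    parts: list of row slices, one per physical part."""
    y_bar = Y_ref.mean(axis=1, keepdims=True)
    fvus = []
    for p in parts:
        sse = np.sum(E[p] ** 2)                   # sum of squared errors
        sst = np.sum((Y_ref[p] - y_bar[p]) ** 2)  # total sum of squares
        fvus.append(sse / sst)
    return np.mean(fvus)                          # equal weight per part, Eq. (27)
```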

3.2 Total reconstruction error

The total reconstruction error of the direct model \({}^{\delta }{\varvec{\upepsilon }}\) and of the reduced order models \({}^{\tau }{\varvec{\upepsilon }}\) are defined as the difference between a reference solution \({\textbf{y}}_p\) and its interpolated approximation \(\hat{{\textbf{y}}}({\textbf{x}}_p)\). The total reconstruction error of a reference solution approximated using the direct interpolation is

$$\begin{aligned} {}^{\delta }{\varvec{\upepsilon }}_p = {\textbf{y}}_p - \hat{{\textbf{y}}}({\textbf{x}}_p). \end{aligned}$$
(28)

For the reduced order models, truncation of the result space will occur by removing the higher modes. The total reconstruction error of the reduced order models therefore depends on the number of basis vectors that are included in the truncated basis. The total reconstruction error of a reference solution in the validation set \({\textbf{y}}_p\) approximated using K basis vectors will be

$$\begin{aligned} {}^{\tau }{\varvec{\upepsilon }}_{p}^{[K]} = {\textbf{y}}_p - \hat{{\textbf{y}}}^{[K]}({\textbf{x}}_p). \end{aligned}$$
(29)

Note that this error array describes the error fields in the different physical parts, and can be substituted into Eqs. (26) and (27) to calculate a scalar error measure that describes the quality of the model, which is denoted with \({}^\tau \textrm{FVU}\). For the reduced order models the \({}^\tau \textrm{FVU}\) is a function of the number of basis vectors included in the basis, hence we write \({}^\tau \textrm{FVU}^{[K]}\) to indicate the number of basis vectors included.

3.3 Truncation error

The truncation error of the training set is the difference between the result vector and its K-rank approximation as defined in Eq. (18). The two reduced order models (with array and scalar interpolation) use the same basis in construction, only the interpolation of the amplitudes is different. The truncation error in the training set for both reduced order models can be defined as

$$\begin{aligned} \begin{aligned} {}^{\kappa }\tilde{\varvec{\upepsilon }}_i^{[K]}&= {\textbf{y}}_i - {\textbf{y}}_i^{[K]} \\&= {\textbf{y}}_i - \left[ \bar{{\textbf{y}}} + \textrm{diag}({\textbf{s}})^{-1}\left( \sum \limits _{n=1}^K \varvec{\upvarphi }_n \alpha _{ni} \right) \right] \end{aligned}. \end{aligned}$$
(30)

As proposed in [2], a result vector from the validation set (which they call a supplementary observation) can be mapped onto the truncated basis using a least-squares projection. The amplitude corresponding to basis direction n for reference solution \({\textbf{y}}_p\) can be found using

$$\begin{aligned} \upalpha _{np} = \varvec{\upvarphi }_n^{T}\left[ \textrm{diag}({\textbf{s}})({\textbf{y}}_p - \bar{{\textbf{y}}}) \right] . \end{aligned}$$
(31)

The reference solution can be approximated in the truncated basis with K basis vectors as

$$\begin{aligned} {\textbf{y}}_p^{[K]} = \bar{{\textbf{y}}} + \textrm{diag}({\textbf{s}})^{-1} \left( {\varvec{\Phi }}^{[K]}{\varvec{\Phi }}^{[K]T}\left[ \textrm{diag}({\textbf{s}})({\textbf{y}}_p - \bar{{\textbf{y}}}) \right] \right) . \end{aligned}$$
(32)

We define the truncation error as the difference between the reference solution and its approximation using K basis vectors. The truncation error for reference solution \({\textbf{y}}_p\) of the validation set will be

$$\begin{aligned} {}^{\kappa }{\varvec{\upepsilon }}_p^{[K]} = {\textbf{y}}_p - {\textbf{y}}_p^{[K]}, \end{aligned}$$
(33)

where \({\textbf{y}}_p^{[K]}\) is the projection of the reference solution onto the truncated basis.
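
A sketch of the least-squares projection of Eqs. (31)–(33), using an orthonormal toy basis in place of the truncated SVD basis; `s` and `y_mean` stand for the scaling array and mean from the preprocessing step.

```python
import numpy as np

def truncation_error(y_p, y_mean, s, Phi_K):
    """Truncation error of Eq. (33): project the reference solution onto the
    truncated basis (Eqs. (31) and (32)) and subtract the projection."""
    a = Phi_K.T @ (s * (y_p - y_mean))  # amplitudes, Eq. (31)
    y_p_K = y_mean + (Phi_K @ a) / s    # back to physical scale, Eq. (32)
    return y_p - y_p_K

# Toy usage with an orthonormal basis standing in for the SVD basis.
rng = np.random.default_rng(3)
M, K = 50, 5
Phi_K, _ = np.linalg.qr(rng.normal(size=(M, K)))
y_mean, y_p = rng.normal(size=M), rng.normal(size=M)
s = np.full(M, 0.5)
eps_trunc = truncation_error(y_p, y_mean, s, Phi_K)
```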

3.4 Interpolation error

The interpolation error \({}^{\iota }{\varvec{\upepsilon }}\) is a measure of the error within the result space spanned by the basis vectors that are included in the model. For a reference solution in the validation set, the interpolation error is defined as the difference between the reference solution approximated using K basis vectors, as defined in Eq. (32), and the approximated result vector. The interpolation error of validation sample \({\textbf{y}}({\textbf{x}}_p)\) reconstructed using K basis vectors will be

$$\begin{aligned} {}^{\iota }{\varvec{\upepsilon }}_p^{[K]} = {\textbf{y}}_p^{[K]} - \hat{{\textbf{y}}}^{[K]}({\textbf{x}}_p). \end{aligned}$$
(34)

To assess the interpolation error in the training set, Leave-One-Out cross validation data will be used. In Leave-One-Out (LOO) cross validation one sample point and the corresponding result are removed from the training set and the surrogate model is reconstructed based on the retained \(N_\textrm{exp}-1\) training points. The surrogate model constructed without training point i is designated as \(\hat{{\textbf{f}}}^{-i}({\textbf{x}})\). By evaluating the obtained surrogate model at the point \({\textbf{x}}_i\) and comparing it with the corresponding result field \({\textbf{f}}_{i}\), the LOO-error is calculated. This procedure is repeated for all \(N_\textrm{exp}\) training points to perform a cross-validation. The calculation of the LOO-error proposed in [24] can be extended to calculate the LOO-error of a result array. For an arbitrary interpolant \({\textbf{f}}({\textbf{x}})\) the error of leaving out training point i is

$$\begin{aligned} { \tilde{{\textbf{e}}}_{i} } = {\textbf{f}}_{i} - \hat{{\textbf{f}}}^{-i}({\textbf{x}}_i) = \frac{ {\textbf{w}}_i }{ \left( {\textbf{G}}^{-1} \right) _{ii} }, \end{aligned}$$
(35)

in which \(\tilde{{\textbf{e}}}_{i}\) is the array with errors, which has the same size as the interpolant.
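
Eq. (35) means the full LOO cross-validation never has to be carried out explicitly: once the weight matrix and \({\textbf{G}}^{-1}\) are available, all \(N_\textrm{exp}\) LOO error arrays follow from a single division, as in this sketch.

```python
import numpy as np

def loo_errors(W, G):
    """Closed-form LOO errors of Eq. (35): column i of the weight matrix
    divided by the i-th diagonal entry of G^{-1}. Returns an M x N_exp array."""
    g_inv_diag = np.diag(np.linalg.inv(G))
    return W / g_inv_diag[None, :]

# Toy usage: a multiquadric system with N_exp = 8 training points.
rng = np.random.default_rng(4)
X = rng.uniform(size=(2, 8))
r = np.linalg.norm(X[:, :, None] - X[:, None, :], axis=0)
G = np.sqrt(1.0 + r ** 2)                       # Eq. (11) with theta = 1
W = rng.normal(size=(5, 8)) @ np.linalg.inv(G)  # weights of a toy interpolant
E_loo = loo_errors(W, G)                        # one LOO error array per sample
```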

The interpolation error based on leaving out point \({\textbf{x}}_i\) in one of the surrogate models is defined as

$$\begin{aligned} { {}^{\iota }\tilde{\varvec{\upepsilon }}_i} = {\textbf{y}}_i - \hat{{\textbf{y}}}^{-i}({\textbf{x}}_i). \end{aligned}$$
(36)

In the following sections, the interpolation errors in the different surrogate models are derived by factoring out the interpolant so that Eq. (35) can be substituted. To calculate the interpolation error in the different surrogate models based on this LOO-error three assumptions are made. First it is assumed that the mean \(\bar{{\textbf{y}}}\) does not change when sample i is left out of the training set. Secondly, for the reduced order models, it is assumed that the obtained basis vectors do not change when sample i is left out of the training set. Lastly, it is assumed that the global scaling parameters \(\mathbf {\uptheta }\) stay the same during the LOO cross validation.

The \(\textrm{FVU}\) and \(\textrm{FVU}^{[K]}\) of the interpolation errors presented in the next sections are used as the scalar measure that will be minimized to determine the global scaling parameters \(\varvec{\uptheta }\).

3.4.1 Model 1: direct interpolation

Using direct interpolation, the interpolation error based on the training set is the total error. The error of leaving out point \({\textbf{x}}_i\) is found by substituting Eq. (12) into the definition of the interpolation error in Eq. (36) and factoring out the interpolant, so that Eq. (35) can be substituted. For the surrogate model with direct interpolation this gives the following error:

$$\begin{aligned} \begin{aligned} {{}^{\delta }}\tilde{\varvec{\upepsilon }}_i&= {\textbf{y}}_i - \hat{{\textbf{y}}}^{-i}({\textbf{x}}_i) \\&= \left[ \bar{{\textbf{y}}} + {\textbf{y}}_{0,i} \right] - \left[ \bar{{\textbf{y}}} + \left( \hat{{\textbf{y}}}_0^{-i}({\textbf{x}}_i) \right) \right] \\&= \left[ {\textbf{y}}_{0,i} - \hat{{\textbf{y}}}_0^{-i}({\textbf{x}}_i) \right] \\&= \left[ \frac{ {\textbf{w}}_i }{ \left( {\textbf{G}}^{-1} \right) _{ii} } \right] \end{aligned}, \end{aligned}$$
(37)

in which \({\textbf{w}}_i\) is the \(M \times 1\) column of the matrix with weights corresponding to experiment i. This error array for the different physical parts is substituted into Eqs. (26) and (27) to calculate the total \(\textrm{FVU}\), which is a scalar error measure that describes the quality of the model. The \(\textrm{FVU}\) due to interpolation in the training set will be minimized to find the best fitting metamodel parameters. Hence, the \(N_\textrm{dim}\) components of the global scaling parameter vector \(\mathbf {\uptheta }\) are found using

$$\begin{aligned} {{\,\textrm{argmin}\,}}_{\varvec{\uptheta }} {}^{{\delta }}\widetilde{\textrm{FVU}}. \end{aligned}$$
(38)
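
Putting the pieces together, the hyperparameter optimization of Eq. (38) can be sketched as below. It reuses `mq_basis`, `loo_errors` and `fvu_total` from the earlier sketches; the log-parametrization (to keep \(\varvec{\uptheta }\) positive) and the Nelder–Mead optimizer are choices of this sketch, not prescribed by the paper.

```python
import numpy as np
from scipy.optimize import minimize

def loo_fvu(log_theta, X, Y0, parts):
    """Objective of Eq. (38): the training-set FVU of the LOO errors of the
    direct-interpolation model (Eq. (37)). Reuses mq_basis, loo_errors and
    fvu_total from the sketches above."""
    theta = np.exp(log_theta)         # enforce theta > 0
    G = mq_basis(X, X, theta)
    W = np.linalg.solve(G.T, Y0.T).T  # weights of Eq. (12)
    return fvu_total(loo_errors(W, G), Y0, parts)

# Hypothetical usage, given training inputs X, centered snapshots Y0 and parts:
# res = minimize(loo_fvu, np.zeros(X.shape[0]), args=(X, Y0, parts),
#                method="Nelder-Mead")
# theta_opt = np.exp(res.x)
```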

3.4.2 Model 2: SVD and array interpolation

The error of leaving out point \({\textbf{x}}_i\) in the training set can again be derived by substituting the obtained approximation into Eq. (36) and factoring out the interpolant. The interpolation error depends on the number of basis vectors K included in the surrogate model. The error is therefore calculated between the K-rank approximation of \({\textbf{y}}_i\) as defined in Eq. (18) and the approximation from the surrogate model fitted without sample point i evaluated at sample point i, that we call \(\hat{{\textbf{y}}}^{-i,[K]}({\textbf{x}}_i)\). By substituting Eqs. (18) and (22) into Eq. (36) the interpolation error of the model based on SVD and array interpolation is found:

$$\begin{aligned} \begin{aligned} {}^{\iota }\tilde{\varvec{\upepsilon }}_i^{[K]}&= {\textbf{y}}_i^{[K]} - \hat{{\textbf{y}}}^{-i,[K]}({\textbf{x}}_i) \\&= \left[ \bar{{\textbf{y}}} + \textrm{diag}({\textbf{s}})^{-1}\left( {\varvec{\Phi }}^{[K]}\varvec{\upalpha }^{[K]}_{i} \right) \right] \\&\quad - \left[ \bar{{\textbf{y}}} + \textrm{diag}({\textbf{s}})^{-1}\left( {\varvec{\Phi }}^{[K]}\hat{\varvec{\upalpha }}^{-i,[K]} ({\textbf{x}}_i) \right) \right] \\&=\ \textrm{diag}({\textbf{s}})^{-1}\left( {\varvec{\Phi }}^{[K]} \left[ \varvec{\upalpha }^{[K]}_{i} - \hat{\varvec{\upalpha }}^{-i,[K]}({\textbf{x}}_i)\right] \right) \\&=\ \textrm{diag}({\textbf{s}})^{-1}\left( {\varvec{\Phi }}^{[K]} \left[ \frac{ {\textbf{w}}_i }{ \left( {\textbf{G}}^{-1} \right) _{ii} } \right] \right) , \end{aligned} \end{aligned}$$
(39)

in which \(\varvec{\upalpha }^{[K]}_{i}\) is the column in the amplitude matrix (Eq. (15)) with K amplitudes corresponding to experiment i. Again, the errors in the different physical parts are substituted into Eqs. (26) and (27) to calculate a scalar error measure that describes the quality of the model.

For the surrogate model based on array interpolation a global scaling parameter vector \(\varvec{\uptheta }\) is sought for each truncation of the basis:

$$\begin{aligned} \varvec{\uptheta }{^{[K]}} = {{\,\textrm{argmin}\,}}_{\varvec{\uptheta }} {}^{\iota }\widetilde{\textrm{FVU}}^{[K]} \text { with: } K \in \{1,\ldots ,N_\textrm{exp}-1\}. \end{aligned}$$
(40)

3.4.3 Model 3: SVD and scalar interpolation

The interpolation error for the model based on SVD and scalar interpolation is found by substituting Eq. (19) and (25) into Eq. (36) and factoring out the interpolant. The interpolation error of the surrogate model with SVD and scalar interpolation is

$$\begin{aligned} \begin{aligned} {}^{\iota }\tilde{\varvec{\upepsilon }}_i^{n}&= {\textbf{y}}_i^{n} - \hat{{\textbf{y}}}^{-i,n}({\textbf{x}}_i) \\&= \left[ \bar{{\textbf{y}}} + \textrm{diag}({\textbf{s}})^{-1} \left( \varvec{\upvarphi }_n \alpha _{ni} \right) \right] \\&\quad - \left[ \bar{{\textbf{y}}} + \textrm{diag}({\textbf{s}})^{-1}\left( \varvec{\upvarphi }_n {\hat{\upalpha }}^{-i}_n({\textbf{x}}_i) \right) \right] \\&= \ \textrm{diag}({\textbf{s}})^{-1}\left( \varvec{\upvarphi }_n \left[ \alpha _{ni} - {\hat{\upalpha }}^{-i}_n({\textbf{x}}_i) \right] \right) \\&=\ \textrm{diag}({\textbf{s}})^{-1}\left( \varvec{\upvarphi }_n \left[ \frac{ w_{n,i} }{ \left( {\textbf{G}}^{-1} \right) _{n,ii} } \right] \right) . \end{aligned} \end{aligned}$$
(41)

Again this error array can be used to calculate a scalar error measure in the physical domain (Eq. (27)) in order to find the global scaling parameter vector \(\varvec{\uptheta }_n\) for each basis direction:

$$\begin{aligned} \varvec{\uptheta }_n = {{\,\textrm{argmin}\,}}_{\varvec{\uptheta }} {}^{\iota }\widetilde{\textrm{FVU}}^{n} \text { with: } n \in \{1,\ldots ,N_\textrm{exp}-1\}. \end{aligned}$$
(42)

Note that for \(K = 1\) in Eq. (40) and \(n = 1\) in Eq. (42) the amplitude array interpolation and scalar interpolation result in the same global scaling parameters.

3.5 Choosing the number of basis vectors

The number of basis vectors K in the truncated basis can be freely chosen. With smaller K the model is cheaper to evaluate, but less accurate. With larger K the model will be more accurate, but also more expensive to evaluate and it will require more storage space. The higher modes are dominated by noise and contribute only little to the accuracy of the surrogate model, or may even deteriorate it.

To choose the number of basis vectors to be retained in the truncated basis, the ratio between the sum of the K included squared singular values and the sum of all squared singular values is often used [5, 12]. This ratio, referred to as the cumulative energy, is defined as:

$$\begin{aligned} C_K = \frac{\sum _{n=1}^{K} d_{n}^2 }{\sum _{n=1}^{N_\textrm{exp}-1} d_{n}^2 }. \end{aligned}$$
(43)

The cumulative energy that is considered sufficient can be set beforehand and is called the cut-off ratio or threshold. For example, in the work of Hamim [12] the cut-off ratio is set to 99%, whereas in the work of Buljak [5] it is set to 99.999%. It has been shown in earlier work that the cumulative energy is not a proper indicator of the quality of the surrogate model [7].

The goal is to find the right balance between added accuracy and added error. It is therefore proposed to choose the number of basis vectors based on a ratio \(R_\textrm{K}\) between the interpolation error \({}^\iota \widetilde{\textrm{FVU}}^{[K]}\) and the truncation error \({}^\kappa \widetilde{\textrm{FVU}}^{[K]}\):

$$\begin{aligned} R_\textrm{K} = \frac{ {}^\kappa \widetilde{\textrm{FVU}}^{[K]} }{ {}^\iota \widetilde{\textrm{FVU}}^{[K]} }. \end{aligned}$$
(44)

The interpolation error \({}^\iota \widetilde{\textrm{FVU}}^{[K]}\) is the component of the error in the subspace that is spanned by the modes 1 to K. Additional basis vectors will not reduce the interpolation error. On the other hand, the truncation error \({}^\kappa \widetilde{\textrm{FVU}}^{[K]}\) is the component of the error that is spanned by the modes \(K+1\) to \(N_\textrm{exp}-1\), and that can therefore be potentially reduced by adding more basis vectors. By choosing a threshold on the ratio \(R_K\), no more basis vectors will be added when the potential improvement by enlarging the basis is less than \(R_K\) times the error that is already present in the model.
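
Both truncation criteria are straightforward to evaluate once the training-set error measures are available. The sketch below computes the cumulative energy of Eq. (43) and selects K from the ratio of Eq. (44); the arrays of per-K FVU values are assumed to have been computed beforehand.

```python
import numpy as np

def cumulative_energy(d):
    """Cumulative energy C_K of Eq. (43) for singular values d."""
    e = np.asarray(d, dtype=float) ** 2
    return np.cumsum(e) / e.sum()

def choose_K(fvu_trunc, fvu_interp, threshold=0.1):
    """Smallest K for which the ratio R_K of Eq. (44) drops below the
    threshold. fvu_trunc[k] and fvu_interp[k] hold the training-set FVUs
    for a basis truncated to K = k + 1 vectors."""
    R = np.asarray(fvu_trunc) / np.asarray(fvu_interp)
    below = np.nonzero(R < threshold)[0]
    return int(below[0]) + 1 if below.size else len(R)
```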

4 Demonstrator process: bending

To compare the different methods a demonstrator process is introduced in this section. For demonstration purposes we will investigate the bending of a metal flap. A schematic representation of the process can be found in Fig. 2. In the process a sheet metal workpiece with initial sheet thickness (\(x_1\)) is bent downward to the final punch depth (\(x_2\)). Thereafter the punch is released and the workpiece springs back.

Fig. 2 Schematic representation of the demonstrator process and the Finite Element mesh [7]

The process is described by two input variables, the sheet thickness (\(x_1\)) and the final punch depth (\(x_2\)). A point in the design space can be denoted as: \({\textbf{x}} = \{ x_1, x_2 \}^T\).

The process is modelled with a 2D plane strain model using the FE analysis software MSC Marc/Mentat version 2016. The sheet metal is modelled with an elastic-plastic isotropic material model with the von Mises yield criterion and a tabular hardening relation between flow stress and equivalent plastic strain. The workpiece is meshed using 1200 quadrilateral elements (\(N_\textrm{elem}\)) and 1296 nodes (\(N_\textrm{nod}\)). The elements are fully integrated using four integration points per element with constant dilatational strain. The solution of the FE simulation, including importing the data into MATLAB 2019a, takes approximately 45 s.

The output of the FE-model consists of the nodal displacement field, the equivalent plastic strain field, and the stress tensor field in the integration points. Hence the result of an FE simulation is collected in an \(M \times 1\) result vector:

$$\begin{aligned} {\textbf{y}} = \left\{ \, \begin{array}{c} \left\{ {\textbf{y}}_\textrm{u} \right\} \\ \left\{ {\textbf{y}}_\upvarepsilon \right\} \\ \left\{ {\textbf{y}}_\upsigma \right\} \\ \end{array}\,\right\} , \end{aligned}$$
(45)

With two degrees of freedom per node, \(M_\textrm{u} = 2 \cdot N_\textrm{nod} = 2592\). With four integration points per element, \(M_\upvarepsilon = 4 \cdot N_\textrm{elem} = 4800\). With four integration points per element and 4 independent components of the stress tensor due to the plane strain analysis, \(M_\upsigma = 4 \cdot 4 \cdot N_\textrm{elem} =\) 19,200. The overall size of a result vector is \(M = M_\textrm{u} + M_\upvarepsilon + M_\upsigma =\) 26,592.

4.1 Dimensions and range of input and output

The input and output parameters of the model have different ranges and dimensions. The nominal thickness \(x_1\) is 0.3 mm; the thickness varies between 0.295 mm and 0.305 mm. The punch depth varies between 1.5 mm and 1.6 mm. These ranges in input result in a variation in output. The displacement field is given in millimetres and varies between \(-\,1.69\) mm and \({+}\,0.07\) mm. The equivalent plastic strain field is dimensionless and varies between 0 and 0.48. Lastly, the stress tensor values are in MPa and vary between \(-661\) MPa and \({+}566\) MPa.

4.2 Constructing surrogate models of the demonstrator

To obtain a training set, a star-point design is combined with a Latin Hypercube Sample (LHS).

To investigate the influence of the sample size on the surrogate model accuracy, four different sample sizes are used. The numbers of sample points in the LHS designs are 15, 35, 55 and 75. To rule out the dependency on the distribution of the sample points in the input space, five different LHS designs per sample size are used. For the demonstrator process with \(N_\textrm{dim}=2\), the star-point design consists of \(N_\textrm{star} = 2 \cdot N_\textrm{dim} + 1 = 5\) sample points. The total number of experiments in one training set will be

$$\begin{aligned} N_\textrm{exp} = N_\textrm{star} + N_\textrm{lhs} = 2 \cdot N_\textrm{dim} + 1 + N_\textrm{lhs}. \end{aligned}$$
(46)

Consequently, the used sample sizes are 20, 40, 60 and 80. The designs of the different samplings are optimized by generating 250 different LHS designs and picking the 5 best based on the maximum minimum distance between sample points of the combined star-point and LHS design.
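
A sketch of this sampling strategy is given below, using `scipy.stats.qmc` as a stand-in for whatever sampling tool was actually used. The star design is assumed here to consist of the centre point plus the axis extremes of the unit hypercube, and the sketch keeps only the single best of the candidate designs, whereas the study retains the five best per sample size.

```python
import numpy as np
from scipy.stats import qmc
from scipy.spatial.distance import pdist

def star_design(n_dim):
    """Star-point design: centre plus the two axis extremes per dimension,
    2 * n_dim + 1 points in the unit hypercube (an assumption of this sketch)."""
    pts = [np.full(n_dim, 0.5)]
    for i in range(n_dim):
        for v in (0.0, 1.0):
            p = np.full(n_dim, 0.5)
            p[i] = v
            pts.append(p)
    return np.array(pts)

def best_combined_design(n_dim, n_lhs, n_candidates=250, seed=0):
    """Combine candidate LHS designs with the star design and keep the one
    with the largest minimum inter-point distance (maximin criterion)."""
    star = star_design(n_dim)
    best, best_dist = None, -np.inf
    for k in range(n_candidates):
        lhs = qmc.LatinHypercube(d=n_dim, seed=seed + k).random(n_lhs)
        design = np.vstack([star, lhs])
        d_min = pdist(design).min()
        if d_min > best_dist:
            best, best_dist = design, d_min
    return best  # N_exp x n_dim points, with N_exp as in Eq. (46)

X_train = best_combined_design(n_dim=2, n_lhs=15)  # N_exp = 20, as in the text
```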

Because each sample size is replicated with five different LHS designs, the total number of FE-simulations is \(5+5\cdot (15+35+55+75) = 905\). The results of the FE-simulations will lead to a total of 20 training sets for which the global scaling parameters are optimized based on the interpolation errors as presented in Eqs. (38), (40) and (42) for direct interpolation, SVD+array interpolation and SVD+scalar interpolation respectively.

5 Results using different surrogate models

The Fraction of Variance Unexplained (\(\textrm{FVU}\)) as described in Sect. 3.1 is used to compare the different surrogate models after the hyperparameters have been optimized. First the models are analyzed based on the training set using the Leave-One-Out errors in Sect. 5.1. To obtain a validation set, the input space is sampled with more sample points and the corresponding simulations are performed. The results based on the validation set are presented in Sect. 5.2. In Sect. 5.3 the selection of the number of basis vectors, based on the criterion in Eq. (44), is reviewed.

5.1 Results based on the training set

Figure 3 shows the different errors based on the training data as derived in Sects. 3.3 and 3.4. By including more basis vectors in the reduced basis, the \({}^\kappa \widetilde{\textrm{FVU}}\) due to truncation becomes smaller. With a full basis the \({}^\kappa \widetilde{\textrm{FVU}}\) due to truncation will be 0, as the training data exist in the subspace that is spanned by the full basis.

Fig. 3 Mean \(\textrm{FVU}\) calculated using the training set of surrogate models based on five different training sets constructed using a zero centered and scaled snapshot matrix. The shaded area represents the range over all training sets

As it should be, the interpolation error with \(K = 1\) is the same for the models using SVD with array and scalar interpolation. Including more basis vectors in the model increases the \({}^{\iota }\widetilde{\textrm{FVU}}\) due to interpolation, as the part of the result space that is spanned by the interpolation model increases with each addition of a mode. With each added mode (i.e. direction in the result space), an interpolation error (\(\ge 0\)) is added, which leads to a gradual increase of the interpolation error. The model with direct interpolation indicates the maximum \({}^{\iota }\widetilde{\textrm{FVU}}\) in the training set. The \({}^{\iota }\widetilde{\textrm{FVU}}\) of the model based on SVD with array interpolation increases towards the \({}^{{\delta }}\widetilde{\textrm{FVU}}\) of the model with direct interpolation.

The \({}^{\iota }\widetilde{\textrm{FVU}}\) due to interpolation converges to a constant value when more basis vectors are added. If this constant value is reached, including more basis vectors does not improve the model anymore. As expected, using more sample points in the training set decreases the \({}^{\iota }\widetilde{\textrm{FVU}}\) due to interpolation and the \({}^{\delta }\widetilde{\textrm{FVU}}\) of the model based on direct interpolation. In other words, larger training sets result in better models. Using more sample points also decreases the bandwidth of the results, hence the quality of the model becomes less dependent on the sampling of the training set.

5.2 Results with a validation set

A validation set is obtained on a \(9\times 9\) grid, on a range from 10 to 90% in the normalized input parameter space of the model. To obtain the validation set an additional \(N_\textrm{val} = 81\) simulations were performed. Only the centre point of the grid was included in the initial training sets. Figure 4 shows the different types of \(\textrm{FVU}\) based on the validation set.

Fig. 4 Mean \(\textrm{FVU}\) calculated using a validation set of surrogate models based on five different training sets constructed using a zero centered and scaled snapshot matrix. The shaded area represents the range over all training sets

The trends in the \(\textrm{FVU}\) in Fig. 4 are similar to the trends in \(\textrm{FVU}\) based on the training set in Fig. 3. The following observations made based on the error in the training set, also hold for the \(\textrm{FVU}\) based on the validation set. First, by including more basis vectors in the reduced basis, the \({}^\kappa \textrm{FVU}\) due to truncation becomes smaller. However, note that the \({}^\kappa \textrm{FVU}\) due to truncation does not drop to 0 when all basis vectors are included in the basis.

The truncation error calculated based on the training set gives a good approximation of the true \({}^\kappa \textrm{FVU}\) for the first basis vectors. For higher numbers of basis vectors, the truncation error from the training set is an underestimation of the actual truncation error that is determined with the validation set.

Second, the interpolation error for the models using SVD with array and scalar interpolation are equal for a basis truncated to \(K = 1\). Including more basis vectors in the model increases the \({}^{\iota }\textrm{FVU}\) due to interpolation. The \({}^{\iota }\textrm{FVU}\) due to interpolation and the total \({}^\tau \textrm{FVU}\) converge to a constant value when more basis vectors are added. Using more sample points in the training set decreases the \(\textrm{FVU}\) and also decreases the bandwidth of the results.

The total \({}^\tau \textrm{FVU}\) of the model based on SVD with array interpolation decreases towards the \({}^\delta \textrm{FVU}\) of the model with direct interpolation. Based on the estimated \({}^{\iota }\widetilde{\textrm{FVU}}\) due to interpolation as determined with the training set, it would be expected that SVD with scalar interpolation performs best of all methods. However, the results from the validation data set indicate the contrary. The total \({}^\tau \textrm{FVU}\) based on the validation set is higher, meaning less variance is explained. This can be due to overfitting, especially in the higher modes. As found by Rao et al. [23], the increased number of hyperparameters can lead to overfitting and underestimation of the LOO error. For noisy data, a general trend will perform better in predicting new data than an overfitted metamodel. Generally, the \({}^{\iota }\widetilde{\textrm{FVU}}\) due to interpolation calculated based on the training set is higher than the \({}^{\iota }\textrm{FVU}\) calculated based on the validation set. Hence, the error based on the training set overestimates the error based on the validation set.

5.3 Selection of the number of basis vectors

The cumulative energy in Eq. (43) and the ratio in Eq. (44) are calculated based on the training set and are used to determine where to truncate the reduced basis. In the top row of Fig. 5 the cumulative energy of Eq. (43) is plotted for both reduced order models. Thresholds on the cumulative energy \(C_K\) of \(99.9\%\) and \(99.99\%\) are used to truncate the reduced bases. The middle row shows the ratio of Eq. (44) for the surrogate models based on SVD with scalar and array interpolation. A threshold on the ratio \(R_K\) of 0.1 is chosen to truncate the reduced basis. With this threshold, the error due to truncation over all \(K\) included basis vectors is 10 times smaller than the error due to interpolation, so the potential benefit of adding more basis vectors is at most a 10% reduction of the error that is already in the model due to interpolation. The models based on SVD and array interpolation reach this threshold faster than the models based on SVD and scalar interpolation.
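
In code, both truncation criteria amount to a threshold scan over the candidate values of \(K\). A minimal sketch, with the definitions of \(C_K\) and \(R_K\) paraphrased from Eqs. (43) and (44) and the training-set \(\textrm{FVU}\) estimates assumed to be precomputed for every candidate \(K\):

```python
import numpy as np

def select_K(sigma, fvu_trunc, fvu_interp, c_thr=0.999, r_thr=0.1):
    """Number of basis vectors K selected by either truncation criterion.

    sigma      : singular values of the snapshot matrix
    fvu_trunc  : training-set FVU due to truncation, entry k-1 belongs to K = k
    fvu_interp : training-set FVU due to interpolation, same indexing

    Assumed definitions (paraphrasing Eqs. (43) and (44)):
      C_K = sum_{k<=K} sigma_k^2 / sum_k sigma_k^2
      R_K = trunc-FVU_K / interp-FVU_K
    """
    C = np.cumsum(sigma**2) / np.sum(sigma**2)
    K_energy = int(np.argmax(C > c_thr)) + 1   # first K exceeding the energy threshold

    R = np.asarray(fvu_trunc) / np.asarray(fvu_interp)
    K_ratio = int(np.argmax(R < r_thr)) + 1    # first K where the truncation error is
                                               # 10x smaller than the interpolation
                                               # error (assumes the threshold is met)
    return K_energy, K_ratio
```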

In the bottom row of Fig. 5 the ratio between the error in the direct model, \({}^{\delta}\textrm{FVU}\), and the total error in the reduced order models, \({}^{\tau}\textrm{FVU}\), both based on the validation set, is plotted. A ratio of one indicates that the direct model and the reduced order model have the same performance, and a ratio \(>1\) indicates that the truncated model performs better than the direct model.

Fig. 5

Top row: cumulative energy. Middle row: ratio between the \(\widetilde{\textrm{FVU}}\) due to truncation (\(\varvec{\kappa}\)) and interpolation (\(\varvec{\iota}\)) based on the training set. \(R_K = 0.1\) is indicated with a black line. Bottom row: ratio between the \(\textrm{FVU}\) in the direct model (\(\varvec{\delta}\)) and the total error (\(\varvec{\tau}\)) in the reduced order models. Array interpolation is plotted with a blue solid line, scalar interpolation with a red dashed line

The number of basis vectors \(K\) at which the model is truncated based on the criterion \(R_K < 0.1\) is indicated with a black dot. These truncated models perform equally well as, or slightly better than, the direct interpolation models in the case of array interpolation, except for the models based on the training sets with \(N_\textrm{exp} = 40\). With more samples in the training set, the difference in performance between the reduced order models with array and scalar interpolation becomes smaller.

The \({}^\iota \widetilde{\textrm{FVU}}\) due to interpolation based on the training set of the models with a basis truncated according to \(R_K < 0.1\), \(C_K > 99.9\%\) and \(C_K > 99.99\%\) is presented in Fig. 6a. On average, the modelling approach with SVD and scalar interpolation has the lowest interpolation error \({}^{\iota }\widetilde{\textrm{FVU}}\). This is a result of the larger number of hyperparameters, which increases the flexibility of the interpolation, as a separate set of global scaling parameters \(\mathbf {\uptheta }\) is determined for each basis vector. In the case of array interpolation, the average reduction of the data over all datasets using the criterion \(R_K < 0.1\) is 56%, which lies between the reductions obtained with \(C_K > 99.9\%\) (average reduction of 63%) and with \(C_K > 99.99\%\) (average reduction of 30%).
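
The difference between the two approaches can be made concrete with a small sketch. SciPy's RBFInterpolator is used here only as a stand-in for the paper's interpolator, with a single Gaussian shape parameter playing the role of the scaling parameters \(\mathbf{\uptheta}\); the data and parameter values are illustrative, not taken from this work:

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

rng = np.random.default_rng(0)
X = rng.uniform(0.1, 0.9, (40, 2))        # e.g. N_exp = 40 training inputs
A = rng.standard_normal((40, 5))          # amplitudes of K = 5 basis vectors

# Array interpolation: one interpolant, and hence one shared set of
# hyperparameters, for all K amplitude columns at once.
array_model = RBFInterpolator(X, A, kernel='gaussian', epsilon=2.0)

# Scalar interpolation: a separate interpolant, each with its own
# independently tuned hyperparameter, per amplitude column.
eps_per_mode = [1.0, 1.5, 2.0, 2.5, 3.0]  # e.g. tuned mode-by-mode via LOO
scalar_models = [RBFInterpolator(X, A[:, k], kernel='gaussian',
                                 epsilon=eps_per_mode[k])
                 for k in range(A.shape[1])]
```

The extra per-mode flexibility is precisely what allows scalar interpolation to fit the training set more closely, and equally what makes it prone to overfitting the noisy, low-energy modes.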

Figure 6b displays the total \({}^{\tau}\textrm{FVU}\) based on the validation set for the surrogate models truncated based on the criterion \(R_K < 0.1\), \(C_K > 99.9\%\), \(C_K > 99.99\%\) and using a full basis (the complete results can be found in Appendix 1). For all data sets, the model based on SVD+scalar interpolation has a larger error than the model based on direct interpolation and the models based on SVD+array interpolation, which indicates some amount of overfitting for the SVD+scalar interpolation models.

When comparing the different truncation criteria for the case of SVD+array interpolation using the validation data set, it is found that the \(\textrm{FVU}\) with the criterion \(R_K < 0.1\) is on average 5.5% lower than with the criterion \(C_K > 99.9\%\). In comparison to the criterion \(C_K > 99.99\%\), the \(\textrm{FVU}\) is on average 0.1% lower, which is not a significant difference in accuracy. However, the number of basis vectors used is significantly lower for the new criterion.

Fig. 6

Fraction of Variance Unexplained: a due to interpolation, based on the training set (\({}^\iota \widetilde{\textrm{FVU}}\)), and b total, based on the validation set (\({}^\tau \textrm{FVU}\))

6 Conclusions

The LOO error as proposed by Rippa [24] has been successfully extended for use in error estimation of array surrogate models and SVD-based surrogate models. When comparing the Fraction of Variance Unexplained (\(\textrm{FVU}\)) based on the training set (Fig. 3) and the \(\textrm{FVU}\) based on a validation set (Fig. 4) for a sheet bending demonstration problem modelled with FE, it is found that the errors based on the training set overestimate the interpolation error. The truncation error calculated from the training set underestimates the truncation error in the validation set when many basis vectors are included, while it is a good estimate of the truncation error when few basis vectors are used.

Based on the interpolation error in the training set, the surrogate model constructed using SVD with scalar interpolation was expected to perform best. However, based on the validation set, this method of surrogate model construction has the largest \(\textrm{FVU}\) and thus the lowest performance. With larger training sets, the performance of these surrogate models approaches that of the surrogate models constructed using direct interpolation and using SVD with array interpolation. Nevertheless, it is recommended to use SVD with array interpolation.

If the amplitudes are interpolated using array interpolation, model reduction can be applied without loss of accuracy compared to a model based on direct interpolation. The models based on SVD with array interpolation perform similarly to, and sometimes even better than, the models based on direct interpolation, depending on where the basis is truncated.

To determine where to truncate the model, a new criterion is proposed in this work. It is shown that the ratio between the truncation error and the interpolation error in the training set (Eq. (44)) can be used effectively to balance model reduction against accuracy. This is shown in a comparison with the commonly used cumulative energy (Eq. (43)), which only accounts for the truncation error and does not consider the interpolation error. The \(\textrm{FVU}\) in the validation data set with the new criterion \(R_K < 0.1\) is on average 5.5% lower than the \(\textrm{FVU}\) when using a cumulative energy threshold of \(C_K > 99.9\%\), whereas it is comparable to the \(\textrm{FVU}\) when using a tighter threshold on the cumulative energy of \(C_K > 99.99\%\). However, the reduction of the data set is significantly larger with the new criterion: on average 56% instead of 30% with the cumulative energy threshold of \(C_K > 99.99\%\). Therefore, it is concluded that a threshold of \(R_K < 0.1\) is appropriate for determining the number of basis vectors for the given problem.