1 Introduction

Sheet-metal forming is one of the essential manufacturing processes for structural and body parts in various industries, for example, the automotive industry. Essentially, a thin metal sheet is plastically deformed into its desired shape by means of forming tools. Not only is the forming process itself governed by a number of process parameters; material and shape parameters of the component also influence the success of the forming operation. Numerical methods such as finite element (FE) methods have been developed since the 1960s and have been applied in industrial use since about the 1980s. An early overview can, for example, be found in Makinouchi (1996). More recently, inverse methods have been proposed to save computational resources while still being able to predict the manufacturability of components (Lee and Huh 1997, 1998; Guo et al. 2000). For an overview of more recent developments in simulation methods for sheet metal forming, interested readers are referred to review articles on the topic, such as those by Ablat and Qattawi (2016) or by Andrade-Campos et al. (2022).

Along with the development of improved simulation methods, new optimization methods for structural problems were suggested. One common challenge in applying such multi-query algorithms to structural problems is the often infeasible amount of computational resources required for running an FE simulation with every evaluation. Modern optimization approaches, such as efficient global optimization [EGO; Jones et al. (1998)], which were specifically designed to reduce the required evaluations, can partially solve this problem. The idea of EGO is to first fit a surrogate model from the initial design of experiments (DoE). Typically, a kriging model (Krige 1951; Matheron 1963; Sacks et al. 1989) is used due to its inherent error approximation. Subsequently, this surrogate model is iteratively improved using an infill criterion that determines new sample locations. The most popular criterion is the originally proposed expected improvement [EI; Jones et al. (1998)], while several other options can be found, for example, in Jones (2001). A more detailed review of this type of surrogate-based optimization is given in Forrester and Keane (2009).

More recently, in an effort to further reduce the computational requirements of the optimization scheme, EGO and kriging were extended to so-called multi-fidelity schemes. Here, the accurate, high-fidelity simulation model is complemented by some form of low-fidelity model, which is usually less accurate but significantly cheaper to evaluate. In the present work, a multi-fidelity variant of EGO is utilized that is based on hierarchical kriging (HK), a multi-fidelity extension to kriging suggested by Han and Görtz (2012), and an infill criterion called variable-fidelity expected improvement [VF-EI; Zhang et al. (2018)]. Interested readers are referred to previous work on multi-fidelity surrogate models as well as optimization for more information [e.g., Forrester et al. (2007); Park et al. (2016)].

Since the 1990s, different optimization approaches have also been applied to sheet metal forming. Ohata et al. (1998) optimized a two-stage deep-drawing process using three design variables in incremental forming simulations. Guo et al. (2000) utilized an inverse approach to optimize the blank shape for manufacturability. A surrogate-based optimization approach was suggested by Jansson et al. (2005) for the design of drawbeads and validated with experimental data. Different surrogate-based schemes, including kriging, were used to optimize a time-dependent blankholder force curve by Jakumeit et al. (2005). An overview of some of the earlier applications of optimization schemes to sheet-metal-forming problems is given by Wifi et al. (2007). More recently, a multi-fidelity optimization scheme for drawbead design combining both incremental high-fidelity forming simulations and a more efficient low-fidelity simulation has been proposed by Sun et al. (2010). Although the initial work is based on polynomial regression, the authors later extended the approach to other metamodels such as kriging using an artificial bee colony optimization algorithm (Sun et al. 2012).

In the present work, we apply a modern HK-based multi-fidelity optimization approach to an exemplary problem on the manufacturability of a deep-drawn cross-die component. We compare three different objective functions, which have all been proposed in the literature in a similar form. There are two main goals when comparing the performance of multi-fidelity algorithms in the context of this work. First, the multi-fidelity approach should reduce the overall computational effort of the optimization process. Second, it should not lead to significantly worse results compared to optimization using only high-fidelity simulations. We aim to establish the applicability of a modern multi-fidelity optimization approach to sheet metal forming and work out possible differences between the objective functions.

The present work is structured as follows. In Sect. 2, the multi-fidelity optimization approach utilizing HK and VF-EI is introduced. In Sect. 3, the numerical example studied here is presented. The objective functions compared here are introduced in Sect. 4 along with the definition of the optimization problem. The performance of the algorithms is compared and discussed in Sect. 5. Finally, all findings are summarized, and an outlook into possible future work is given in Sect. 6.

2 Multi-fidelity optimization

In the following, the multi-fidelity EGO approach based on HK and VF-EI is introduced, which will be used in this work. At the top level, this surrogate-based optimization scheme can be divided into two parts. First, a design of experiments is used to generate design samples to subsequently fit the surrogate model. Here, HK is utilized because it has been shown to yield better error approximations compared to other multi-fidelity kriging approaches (Han and Görtz 2012). Subsequently, adaptive samples are added, whereby their location is determined through maximization of an infill criterion on the previously created surrogate model. Here, VF-EI is applied because it has been shown to perform very well in application problems [see, for example, Zhang et al. (2018); Ruan et al. (2020)]. A schematic representation of the optimization scheme applied here is depicted in Fig. 1.

Fig. 1 Schematic representation of the optimization scheme applied in the present work [adapted from Zhang et al. (2018)]

All steps of the outlined process are now explained in more detail. The first step, as in any population-based optimization scheme, is the DoE. DoE is an active field of research, as there is no unique ‘best’ way to distribute these initial samples apart from the rather vague goal of good coverage of the design space. Interested readers are referred to one of the review articles, such as Garud et al. (2017), for more information on different DoE methods and quality criteria for DoE.

In the present work, an optimal Latin hypercube (OLH) approach is used as it shows great performance in lower-dimensional applications. A Latin hypercube design (LHD) is commonly constructed as follows. When looking for N samples in d dimensions, each dimension of the design space is divided into N bins of equal probability. N cells of the total \(N^d\) created cells are then randomly selected so that each bin of each dimension only contains a single selected cell (McKay et al. 1979). Within each selected cell, a single sample is placed either in the center or randomly located [compare Rajabi et al. (2015)], whereby the former case is used here.
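
For illustration, a minimal centered LHD sampler on the unit hypercube might look as follows. This is a sketch with NumPy, not the implementation used in this work:

```python
import numpy as np

def latin_hypercube(n_samples, n_dim, centered=True, seed=None):
    """Minimal (centered) Latin hypercube design on the unit cube [0, 1]^d."""
    rng = np.random.default_rng(seed)
    samples = np.empty((n_samples, n_dim))
    for k in range(n_dim):
        perm = rng.permutation(n_samples)            # one sample per bin and dimension
        offset = 0.5 if centered else rng.random(n_samples)
        samples[:, k] = (perm + offset) / n_samples  # bin index -> coordinate in [0, 1)
    return samples
```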

Initial Latin hypercube designs may still suffer from problems, such as correlations. Optimal Latin hypercube (OLH) provides a remedy by incrementally improving DoE quality according to a space-filling criterion. In the present work, a simulated annealing algorithm consisting of random pairwise and coordinate-wise swaps is utilized. New samples are always accepted if they improve the space-filling criterion and are accepted with a certain probability if they do not offer an improvement (Morris and Mitchell 1995). Other optimization approaches for OLH include deterministic sample selection (Ye et al. 2000) and Enhanced Stochastic Evolutionary algorithm as suggested by Jin et al. (2003). An overview of more recent developments around OLH can also be found in Viana (2015).
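
A correspondingly minimal simulated-annealing improvement could be sketched as follows. Note that the plain maximin distance is used here as the space-filling criterion as a simplification of the \(\varphi_p\) criterion of Morris and Mitchell (1995), and all names and default parameters are illustrative:

```python
import numpy as np
from scipy.spatial.distance import pdist

def maximin(design):
    """Space-filling criterion: smallest pairwise distance (to be maximized)."""
    return pdist(design).min()

def anneal_lhd(design, n_iter=5000, t0=0.05, cooling=0.999, seed=None):
    """Improve an LHD by random pairwise swaps within a single column; swapping
    two entries of one column preserves the Latin hypercube property."""
    rng = np.random.default_rng(seed)
    cur, cur_c = design.copy(), maximin(design)
    best, best_c = cur, cur_c
    t = t0
    for _ in range(n_iter):
        cand = cur.copy()
        k = rng.integers(cand.shape[1])                    # pick a dimension
        i, j = rng.choice(cand.shape[0], size=2, replace=False)
        cand[[i, j], k] = cand[[j, i], k]                  # swap two entries
        c = maximin(cand)
        if c > cur_c or rng.random() < np.exp((c - cur_c) / t):
            cur, cur_c = cand, c                           # accept (possibly worse) design
            if c > best_c:
                best, best_c = cand, c
        t *= cooling                                       # geometric cooling schedule
    return best
```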

The second step of EGO is to fit an initial HK model of the objective function over the d-dimensional design space based on the calculated sampling data. Readers are referred to the original publications (Krige 1951; Matheron 1963; Sacks et al. 1989) as well as more recent textbooks [such as Rasmussen and Williams (2005)] for more information on kriging in general, and to Han and Görtz (2012) for a detailed derivation of the HK predictor. The idea of HK is to first create a kriging model for the low-fidelity function. Therefore, consider a random process for the low-fidelity (LF) function

$$\begin{aligned} Y_{\mathrm{{LF}}} (\varvec{x}) = \beta _{0,\mathrm{{LF}}} + Z_{\mathrm{{LF}}} (\varvec{x}), \end{aligned}$$
(1)

where \(\beta _{0,\mathrm{{LF}}}\) is an unknown constant and \(Z_{\mathrm{{LF}}} (\varvec{x})\) is a stationary random process. Furthermore, a sample dataset \((\varvec{S}_{\mathrm{{LF}}}, \varvec{y}_{S,\mathrm{{LF}}})\) consisting of \(m_{\mathrm{{LF}}}\) samples with input variable data \(\varvec{S}_{\mathrm{{LF}}} \in \mathbb {R}^{m_{\mathrm{{LF}}} \times d}\) and the corresponding output \(\varvec{y}_{S,\mathrm{{LF}}} \in \mathbb {R}^{m_{\mathrm{{LF}}}}\) is required.

To predict points based on the random process and the sampling dataset, the correlation between sample points is modeled through a so-called kernel. Over the years, many different kernel functions with varying properties have been suggested. Here, a squared-exponential kernel, also called a Gaussian radial-basis function (RBF) kernel, is utilized due to its smoothness and infinite differentiability:

$$\begin{aligned} R\Big (\varvec{x}^{(i)}, \varvec{x}^{(j)}\Big ) = \prod _{k=1}^d \text {exp} \left( - \theta _k \Big \vert x_k^{(i)} - x_k^{(j)}\Big \vert ^2\right) , \end{aligned}$$
(2)

where \(\theta _k\) denotes the kernel length scale, which constitutes the hyperparameter(s) of the kriging surrogate model. The kernel function depicted above is called anisotropic because there is a separate length scale parameter for each design space dimension. In the present work, an isotropic kernel is chosen, where the hyperparameter \(\theta _k = \theta\) is a scalar, that is, independent of the coordinate dimension. Other popular kernel function choices can be found in textbooks such as Rasmussen and Williams (2005) or in popular software implementations of kriging [for example, Pedregosa et al. (2011) or GPy (2012)]. Given the sample dataset, the kriging model is fitted by running a separate optimization for the kernel hyperparameter \(\theta\). Here, differential evolution (Storn and Price 1997) is used due to its simplicity and good global search characteristics. However, more advanced approaches for hyperparameter optimization have been suggested [see, for example, Toal et al. (2008) for more information].
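
As an illustration, the isotropic kernel of Eq. (2) and a differential-evolution search for its hyperparameter via the concentrated log-likelihood of an ordinary kriging model could be sketched as follows. This is a simplified stand-in for the model fitting performed here; the search bounds are illustrative:

```python
import numpy as np
from scipy.optimize import differential_evolution

def rbf_kernel(X1, X2, theta):
    """Isotropic squared-exponential correlation, Eq. (2) with theta_k = theta."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-theta * d2)

def neg_log_likelihood(log10_theta, X, y):
    """Concentrated negative log-likelihood of an ordinary kriging model."""
    m = len(y)
    R = rbf_kernel(X, X, 10.0 ** log10_theta[0]) + 1e-10 * np.eye(m)  # jitter
    Ri = np.linalg.inv(R)
    one = np.ones(m)
    beta0 = (one @ Ri @ y) / (one @ Ri @ one)   # generalized least-squares mean
    res = y - beta0
    sigma2 = (res @ Ri @ res) / m               # concentrated process variance
    return 0.5 * (m * np.log(sigma2) + np.linalg.slogdet(R)[1])

# Global hyperparameter search over log10(theta) with differential evolution:
# result = differential_evolution(neg_log_likelihood, [(-3.0, 3.0)], args=(X, y))
# theta = 10.0 ** result.x[0]
```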

With the representation of the random process, the sampling data, and the kernel function, the low-fidelity predictor for a new point \(\varvec{x}\) can be written as follows:

$$\begin{aligned} \begin{aligned} \hat{y}_{\mathrm{{LF}}} ( \varvec{x} )&= \beta _{0,\mathrm{{LF}}} + \varvec{r}_{\mathrm{{LF}}}^T (\varvec{x}) \varvec{R}_{\mathrm{{LF}}}^{-1} ( \varvec{y}_{S,\mathrm{{LF}}} - \beta _{0,\mathrm{{LF}}} \varvec{1} ),\\ \text {with}&\quad \beta _{0,\text {LF}} = \Big (\varvec{1}^T \varvec{R}_{\text {LF}}^{-1} \varvec{1} \Big )^{-1} \varvec{1}^T \varvec{R}_{\text {LF}}^{-1} \varvec{y}_{S,\text {LF}}, \\ \text {and}&\quad \varvec{r}_{\text {LF}} = \left[ R\big (\varvec{x}, \varvec{x}^{(1)}\big ), ..., R\big (\varvec{x}, \varvec{x}^{(m_{\text {LF}})}\big )\right] \in \mathbb {R}^{m_{\text {LF}}} \end{aligned} \end{aligned},$$
(3)

where \(\varvec{r}_{\text {LF}}\) is the correlation vector between the sample data and the new point, \(\varvec{R}_{\text {LF}} \in \mathbb {R}^{m_{\text {LF}} \times m_{\text {LF}}}\) represents the correlation matrix between the sample data points, and \(\varvec{1} \in \mathbb {R} ^{m_{\text {LF}}}\) is a column vector filled with ones.
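
Reusing `rbf_kernel` from the sketch above, the low-fidelity predictor of Eq. (3) translates almost literally into code. This naive version recomputes and inverts the correlation matrix on every call; production code would cache a Cholesky factorization when fitting the model:

```python
def lf_predict(x_new, X_lf, y_lf, theta):
    """Low-fidelity kriging predictor, Eq. (3), for a single point x_new (shape (d,))."""
    m_lf = len(y_lf)
    R = rbf_kernel(X_lf, X_lf, theta) + 1e-10 * np.eye(m_lf)
    Ri = np.linalg.inv(R)
    one = np.ones(m_lf)
    beta0 = (one @ Ri @ y_lf) / (one @ Ri @ one)         # Eq. (3), second line
    r = rbf_kernel(x_new[None, :], X_lf, theta).ravel()  # correlation vector r_LF
    return beta0 + r @ Ri @ (y_lf - beta0 * one)         # Eq. (3), first line
```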

With the low-fidelity predictor \(\hat{y}_{\text {LF}} ( \varvec{x} )\), the hierarchical kriging model can be constructed, which is based on a random process representing the high-fidelity function:

$$\begin{aligned} Y (\varvec{x}) = \beta _0 \hat{y}_{\text {LF}} ( \varvec{x} ) + Z (\varvec{x}). \end{aligned}$$
(4)

\(\beta _0\) is an unknown scaling factor applied to the low-fidelity predictor to represent the trend term of the model, and \(Z (\varvec{x})\) is a stationary random process. With the high-fidelity sample dataset \((\varvec{S}, \varvec{y}_S)\) consisting of m samples with input variable data \(\varvec{S} \in \mathbb {R}^{m \times d}\), the corresponding output \(\varvec{y}_S \in \mathbb {R}^m\), and the kernel function \(R(\varvec{x}^{(i)}, \varvec{x}^{(j)})\) as defined above, the HK predictor for the high-fidelity function is given by

$$\begin{aligned} \begin{aligned} \hat{y} ( \varvec{x} ) = \beta _{0} \hat{y}_{\text {LF}} ( \varvec{x} ) + \varvec{r}^T (\varvec{x}) \varvec{R}^{-1} ( \varvec{y}_{S} - \beta _{0} \varvec{F} )\\ \text {with} \quad \beta _{0} = \big (\varvec{F}^T \varvec{R}^{-1} \varvec{F} \big )^{-1} \varvec{F}^T \varvec{R}^{-1} \varvec{y}_{S}. \end{aligned} \end{aligned}$$
(5)

Here, \(\beta _0\) indicates the correlation between high- and low-fidelity models and \(\varvec{F} = [\hat{y}_{\text {LF}} ( \varvec{x}^{(1)} ), ..., \hat{y}_{\text {LF}} ( \varvec{x}^{(m)} )]^T, \forall \varvec{x}^{(i)} \in \varvec{S}\) represents the low-fidelity predictions at the high-fidelity sample points. \(\varvec{r} \in \mathbb {R}^m\) and \(\varvec{R} \in \mathbb {R} ^{m \times m}\) are defined analogously to the low-fidelity predictor. In the HK predictor, only \(\hat{y} ( \varvec{x} )\) and \(\varvec{r} (\varvec{x})\) depend on the location of the new point. All other factors can be calculated when fitting the model.
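
Continuing the sketches above, the HK predictor of Eq. (5) can be written as follows, with the low-fidelity predictor acting as the model trend scaled by \(\beta_0\):

```python
def hk_predict(x_new, X_hf, y_hf, X_lf, y_lf, theta_lf, theta_hf):
    """Hierarchical kriging predictor, Eq. (5), for a single point x_new."""
    m = len(y_hf)
    R = rbf_kernel(X_hf, X_hf, theta_hf) + 1e-10 * np.eye(m)
    Ri = np.linalg.inv(R)
    # F: low-fidelity predictions at the high-fidelity sample points
    F = np.array([lf_predict(x, X_lf, y_lf, theta_lf) for x in X_hf])
    beta0 = (F @ Ri @ y_hf) / (F @ Ri @ F)               # Eq. (5), second line
    r = rbf_kernel(x_new[None, :], X_hf, theta_hf).ravel()
    return beta0 * lf_predict(x_new, X_lf, y_lf, theta_lf) \
        + r @ Ri @ (y_hf - beta0 * F)                    # Eq. (5), first line
```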

Another important quantity that is needed in the later steps of the optimization process is the mean-squared error (MSE) of the HK prediction. It is written here in terms of \(\sigma ^2\), the process variance of \(Z (\varvec{x})\):

$$\begin{aligned} \begin{aligned} \text {MSE}( \hat{y} ( \varvec{x} ))&= \sigma ^2 \Bigg ( 1.0 - \varvec{r}^{T} \varvec{R}^{-1} \varvec{r} \\ {}&\quad + \left[ \varvec{r}^{T} \varvec{R}^{-1} \varvec{F} - \hat{y}_{\text {LF}} \right] ^{2} \left( \varvec{F}^T \varvec{R}^{-1} \varvec{F} \right) ^{-1} \Bigg ). \end{aligned} \end{aligned}$$
(6)
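
Eq. (6) can be sketched analogously, reusing the helpers above and treating the process variance \(\sigma^2\) as a given parameter (in practice it is estimated when fitting the model):

```python
def hk_mse(x_new, X_hf, X_lf, y_lf, theta_lf, theta_hf, sigma2):
    """MSE of the HK prediction, Eq. (6), given the process variance sigma2."""
    m = X_hf.shape[0]
    R = rbf_kernel(X_hf, X_hf, theta_hf) + 1e-10 * np.eye(m)
    Ri = np.linalg.inv(R)
    F = np.array([lf_predict(x, X_lf, y_lf, theta_lf) for x in X_hf])
    r = rbf_kernel(x_new[None, :], X_hf, theta_hf).ravel()
    bias = (r @ Ri @ F - lf_predict(x_new, X_lf, y_lf, theta_lf)) ** 2 / (F @ Ri @ F)
    return sigma2 * (1.0 - r @ Ri @ r + bias)
```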

Based on the initial HK model, several iterations are performed to adaptively improve it until a specific termination criterion is reached. The termination criteria will be covered below. The location of the new adaptive samples is determined by optimizing an infill criterion. When considering multi-fidelity optimization, there are generally two options regarding infill criteria: First, a ‘classic’ single-fidelity infill criterion can be chosen. Among them, expected improvement, which was used by Jones et al. (1998) to introduce EGO, remains the most popular. However, several different criteria have been suggested over the years [see, for example, Jones (2001) or Forrester and Keane (2009) for overviews]. The prime disadvantage of this option is that only high-fidelity samples can be added adaptively. Therefore, a multi-fidelity infill criterion is utilized here, called variable-fidelity expected improvement (Zhang et al. 2018). It is essentially a multi-fidelity extension of standard EI and is very similar in its formulation to another criterion called augmented EI (Huang et al. 2006). Here, it is favored over the latter because it is free of empirical parameters. More discussion of the comparison between these two criteria can be found in the original publication suggesting VF-EI. VF-EI is defined at location \(\varvec{x}\) and fidelity level L as follows:

$$\begin{aligned} \begin{aligned} \text {EI}_{vf}&(\varvec{x}, L) = \\ {}&{\left\{ \begin{array}{ll} s(\varvec{x}, L) \left[ u \Phi \left( u \right) + \phi \left( u \right) \right] , &{} \text {if} \,\, s(\varvec{x}, L) >0 \\ 0, &{} \text {if} \,\, s(\varvec{x}, L) =0 \end{array}\right. }, \end{aligned} \end{aligned}$$
(7)

where \(u = \frac{y_{\text {min}} - \hat{y} (\varvec{x})}{s(\varvec{x}, L)}\) and \(y_{\text {min}}\) is the currently best feasible high-fidelity function value. \(\Phi (\bullet )\) represents the cumulative distribution function of the standard normal distribution and \(\phi (\bullet )\) its probability density function. The term \(s(\varvec{x}, L)\) denotes the uncertainty of the HK model. The previously introduced scaling factor between fidelity levels \(\beta _0\) is used here to model the uncertainty in the high-fidelity prediction caused by the low-fidelity predictor:

$$\begin{aligned} s^2 (\varvec{x}, L) = {\left\{ \begin{array}{ll} \beta _0^2 \cdot \text {MSE}( \hat{y}_{\text {LF}} ( \varvec{x} )), &{}L=0 \; \text {(low fidelity)}\\ \text {MSE}( \hat{y} ( \varvec{x} )), &{}L=1 \; \text {(high fidelity)} \end{array}\right. }. \end{aligned}$$
(8)

\(\text {MSE}( \hat{y} ( \varvec{x} ))\) and \(\text {MSE}( \hat{y}_{\text {LF}} ( \varvec{x} ))\) are the MSEs of the high- and low-fidelity kriging predictors, respectively.

The two summands in Eq. (7) can be identified with exploration and exploitation. The first term \(\left( y_{\text {min}} - \hat{y} (\varvec{x}) \right) \Phi (u)\) is dominated by the improvement of the solution \(\hat{y} (\varvec{x})\) and, thus, represents exploitation, while the second term \(s(\varvec{x}, L) \phi (u)\) represents exploration because it is dominated by the uncertainty of the solution \(s(\varvec{x}, L)\).
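
Both Eqs. (7) and (8) fit in a few lines. The following sketch assumes the MSE values and \(\beta_0\) are available from the fitted model, as in the helpers above:

```python
import numpy as np
from scipy.stats import norm

def vf_ei(y_min, y_hat, s):
    """Variable-fidelity expected improvement, Eq. (7)."""
    if s <= 0.0:                                 # second branch of Eq. (7)
        return 0.0
    u = (y_min - y_hat) / s
    return s * (u * norm.cdf(u) + norm.pdf(u))   # exploitation + exploration

def s_fidelity(level, beta0, mse_lf, mse_hf):
    """Fidelity-dependent uncertainty s(x, L), Eq. (8)."""
    return np.sqrt(beta0 ** 2 * mse_lf) if level == 0 else np.sqrt(mse_hf)
```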

Due to the highly multimodal nature of the EI functions, differential evolution (Storn and Price 1997) is selected for optimization of the infill criterion in the present work.

Two different criteria are used to determine the end of the optimization. First, a minimum allowable value is specified for the optimized infill criterion. Second, a maximum total number of (high-fidelity) objective function evaluations is defined. The values of the criteria are problem dependent and are listed below with the definitions of the problems. The first criterion can be seen as convergence of the algorithm to an (at least near-) optimal point with little expectation of improvement from adding further samples. The second criterion represents budget restrictions on optimization run time that are commonly encountered in application use cases.
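
To make the interplay of predictor, error estimate, infill criterion, and termination criteria concrete, the following toy loop optimizes the 1-D Forrester function together with the low-fidelity companion commonly used with it in the multi-fidelity literature [e.g., Forrester et al. (2007)], reusing the sketches above. For brevity, a grid search stands in for the differential-evolution maximization of VF-EI, hyperparameters and the process variance are fixed, and only the high-fidelity branch of Eq. (8) is sampled adaptively; none of this reflects the actual implementation used in this work:

```python
import numpy as np

f_hi = lambda x: (6.0 * x - 2.0) ** 2 * np.sin(12.0 * x - 4.0)  # Forrester function
f_lo = lambda x: 0.5 * f_hi(x) + 10.0 * (x - 0.5) - 5.0         # cheap companion

X_lf = np.linspace(0.0, 1.0, 11)[:, None]; y_lf = f_lo(X_lf).ravel()
X_hf = np.array([[0.0], [0.4], [1.0]]);    y_hf = f_hi(X_hf).ravel()
theta_lf = theta_hf = 10.0; sigma2 = 1.0   # fixed for brevity; normally re-estimated

cand = np.linspace(0.0, 1.0, 201)[:, None] # grid search stands in for DE
for _ in range(10):                        # budget: second termination criterion
    y_min = y_hf.min()
    ei = np.array([vf_ei(y_min,
                         hk_predict(x, X_hf, y_hf, X_lf, y_lf, theta_lf, theta_hf),
                         np.sqrt(max(hk_mse(x, X_hf, X_lf, y_lf,
                                            theta_lf, theta_hf, sigma2), 0.0)))
                   for x in cand])
    if ei.max() < 1e-5:                    # threshold: first termination criterion
        break
    x_new = cand[ei.argmax()]
    X_hf = np.vstack([X_hf, x_new])        # 'run' the expensive simulation
    y_hf = np.append(y_hf, f_hi(x_new))

print(f"best design: x = {X_hf[y_hf.argmin(), 0]:.3f}, f = {y_hf.min():.3f}")
```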

The optimization algorithm, along with a part for DoE, is implemented in an in-house Python code from previous work by the authors (Komeilizadeh et al. 2022; Kaps et al. 2022). HK model generation and kernel implementation are based on the scikit-learn library (Pedregosa et al. 2011).

3 Numerical example

Based on the objective functions which are introduced in Sect. 4, two different optimization schemes are compared for an exemplary numerical problem introduced in the following. Here, a cross-die deep-drawing simulation comparable to the one studied by Hoque and Duddeck (2021) is used as a basis for the optimization problem. An exemplary configuration of the final component is shown in Fig. 2. All tooling is modeled as rigid bodies, while the sheet blank is made of steel. Coulomb friction is assumed between the blank and the tools. More detailed information on modeling parameters and numerical values can be found in Table 2 in the Appendix. An incremental explicit simulation in LS-Dyna is utilized for the high-fidelity model. An exemplary high-fidelity simulation model at the initial time step is shown in Appendix Fig. 13. The simulation of deep drawing itself consists of multiple process steps. Initially, the punch moves into contact with the sheet metal blank. The blank itself is pressed against the die by a blankholder. Then, as the punch moves further, the deep drawing, i.e., the nonlinear forming of the blank into its desired shape, is driven by the different contacts between the tools and the blank. Finally, when the punch is removed, an elastic springback of the component occurs. This last step is not considered in the present work because it concerns dimensional accuracy, while the focus here lies on formability. For more detailed overviews of the various aspects of sheet-metal-forming simulation, readers are referred to the available textbooks on the topic, e.g., Banabic (2010).

As the low-fidelity model, the inverse implicit one-step capability of LS-Dyna based on a more coarsely meshed component is utilized. The idea of the inverse one-step approach is to use deformation theory to calculate stresses, strains, and thicknesses in the formed component given the final geometry. A more detailed derivation including the governing equations of the approach can be found, for example, in the original publication (Lee and Huh 1997). The maximum element size is set to 5 mm for the low-fidelity model, compared to 3 mm for the high-fidelity model. The smallest element size after adaptive mesh refinement in the high-fidelity model is 0.5 mm. Simulation times for the high- and low-fidelity simulation models are 10–20 min and 1–2 min, respectively, when running on eight cores. Exact values also vary depending on the choice of design variables.

Fig. 2 Exemplary geometry of the cross-die component studied here. The slant depth, i.e., design variable \(x_2\) used here, is indicated in white

For the optimization problem studied here, a total of six design variables are selected. All variables and their specified limits are summarized in Table 1. From the literature, the design variables in sheet-metal-forming optimization problems can be divided into three categories: geometry parameters [see, for example, Guo et al. (2000) or Kishor and Kumar (2002)], process parameters [for example, blank holding force, Obermeyer and Majlessi (1998)], and material parameters. In the present work, the design variables are chosen from all three categories by way of example. As geometry parameters, the thickness of the initial sheet metal, the slant depth of the cruciform, which is equivalent to the drawing depth of the process, and the die radius are chosen. The Lankford coefficient, which represents the normal anisotropy of the material, is varied as a material parameter. The Coulomb friction coefficient and the constant blankholder force (BHF) are the two remaining design variables in the class of process parameters.

Previous work has shown that it can be beneficial to vary the BHF in the forming process [e.g., Jakumeit et al. (2005)]. Here, it is kept constant for the sake of simplicity. For the same reason, all design variables in this exemplary problem are considered continuous, even though, for example, the sheet metal thickness or the Lankford coefficient might be more realistically treated as a discrete variable.

The design variable ranges in the present example are deliberately chosen to be challenging, in the sense that they will likely not yield a manufacturable component during the optimization. The main reason is that the objective functions \(f_1\) and \(f_2\) are not capable of distinguishing manufacturable components from each other. Both functions take a constant value of zero for manufacturable components. Therefore, manufacturable components in the design space would limit the comparability between objective functions.

Table 1 Overview of the design variables specified for the optimization problem considered here

4 Objective functions

The three different objective functions used to assess the formability of a component are introduced in the following. All functions are defined here to be minimized during optimization. All three functions have been previously used, sometimes with slight variations, in the literature. Therefore, while the presentation here is kept brief, interested readers are referred to the various original publications for further discussion.

The first two objectives make use of the so-called forming limit diagram (FLD). It includes the forming limit curve (FLC) representing the onset of localized necking in the sheet metal component, as well as a limit curve for the onset of wrinkling. Since their first mention in the 1960s (Goodwin 1968; Keeler 1968), many different variants of FLDs have been studied to remedy some of the initial shortcomings such as strain-path effects on the FLC. One such example is the extension of FLCs to nonlinear strain paths and multi-step forming processes by Volk and Suh (2013). Readers are referred to previous works reviewing the topic in more detail [e.g., Paul (2013) or Obermeyer and Majlessi (1998)]. An exemplary FLD used in the present work is shown in Fig. 3. Here, the major and minor true strains for each element of the deep-drawn sheet metal component are plotted against each other. The red line is the FLC and the pink line represents the wrinkling limit curve (WLC). Each dot represents an element in the simulation model.

Fig. 3 Exemplary forming limit diagram for a cross-die component. The red line is the FLC and the pink line represents the WLC. Each dot represents an element of the simulation model. Blue and yellow colors indicate crack and wrinkling risk areas, respectively. Orange color shows severe thinning area. The black line marks the limit of strain definitions, i.e., \(\epsilon _1 \ge \epsilon _2\). (Color figure online)

The first objective function is based on counting the elements in the different categories of the FLD. The objective function \(f_1\) is then defined as the share of ‘bad’ elements:

$$\begin{aligned} \begin{aligned} f_1&= \frac{N_{\text {bad}}}{N}, \text {where} \\ N_{\text {bad}}&= N_\text {C} + N_{\text {CR}} + N_\text {W} + N_{\text {WR}} + N_{\text {TH}} \end{aligned}, \end{aligned}$$
(9)

where N is the total number of elements. \(N_\text {C}\), \(N_{\text {CR}}\), \(N_\text {W}\), \(N_{\text {WR}},\) and \(N_{\text {TH}}\) represent the number of elements in the crack, crack risk, wrinkling, wrinkling tendency, and severe thinning categories of the FLD, respectively (compare Fig. 3). This approach is somewhat similar to two of the four criteria suggested by Jakumeit et al. (2005).
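
In code, \(f_1\) reduces to a one-liner once the FLD post-processing has assigned each element to a category. A minimal sketch, where the label names are purely illustrative:

```python
import numpy as np

def f1_share_of_bad_elements(categories):
    """Objective f1, Eq. (9): share of elements in any critical FLD category."""
    bad = {"crack", "crack_risk", "wrinkle", "wrinkle_risk", "thinning"}
    return float(np.mean([c in bad for c in categories]))
```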

The second objective function assessed in the present work is based on the average distance of bad elements to the respective limiting curves. A very similar approach was originally suggested by Naceur et al. (2004) and applied in a multi-fidelity setup by Sun et al. (2010). In the present work, the different distances are weighted by the area of the respective element [compare Schenk and Hillmann (2004)]. The FLC and WLC are defined here as black-box functions. Given the values of the major and minor true strains of the element e, these functions return the points on the curves \(\varvec{\hat{\epsilon }}^{\text {FLC}}\) and \(\varvec{\hat{\epsilon }}^{\text {WLC}}\) required for the calculation of the distance. For elements within the domain of the limiting curve, a vertical distance is calculated. The Euclidean distance is utilized for the remaining elements. Therefore, the distance function is defined as follows:

$$\begin{aligned} d \left( \varvec{\epsilon }^e, \varvec{\hat{\epsilon }}^{\text {LC}} \right) = {\left\{ \begin{array}{ll} \vert \epsilon _1^e - \hat{\epsilon }_1^{\text {LC}} \vert , &{} \epsilon _{2}^e \in \mathbb {D}_{\text {LC}} \\ \Vert \varvec{\epsilon }^e -\varvec{\hat{\epsilon }}^{\text {LC}} \Vert _2, &{} \text {else} \\ \end{array}\right. }. \end{aligned}$$
(10)

\(\varvec{\hat{\epsilon }}^{\text {LC}}\) represents the major and minor strains of the point on the respective limiting curve (LC), while \(\mathbb {D}_{\text {LC}}\) represents the domain of the curve. Therefore, the second objective function is given by

$$\begin{aligned} \begin{aligned} f_2&= f_{2,\text {C}} + w f_{2,\text {W}}, \text {where}\\ f_{2,\text {C}}&= {\left\{ \begin{array}{ll} \frac{\sum _{e=1}^{N_\text {C}} d \left( \varvec{\epsilon }^e, \varvec{\hat{\epsilon }}^{\text {FLC}} \right) A^e }{\sum _{e=1}^{N_\text {C}} A^e}, &{} \epsilon _{1}^e > \epsilon ^{\text {FLC}}_1 \\ 0, &{} \text {else} \\ \end{array}\right. } \\ f_{2,\text {W}}&= {\left\{ \begin{array}{ll} \frac{\sum _{e=1}^{N_\text {W}} d \left( \varvec{\epsilon }^e, \varvec{\hat{\epsilon }}^{\text {WLC}} \right) A^e }{\sum _{e=1}^{N_\text {W}} A^e}, &{} \epsilon _{1}^e < \epsilon ^{\text {WLC}}_1 \\ 0, &{} \text {else} \end{array}\right. } \end{aligned}. \end{aligned}$$
(11)

Here, w is a weighting factor balancing the contributions of crack and wrinkling elements. It is set to \(w=0.1\) following the suggestion made in Sun et al. (2010). Initial studies with different values of w were performed for the present application problem. It was found that the value of w does not have a significant impact on the optimization results here; the chosen value, however, provides a reasonable balance between the two contributions. The element area is given by \(A^e\).
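
A sketch of Eqs. (10) and (11) is given below, assuming the limit-curve evaluation and the element classification have already been performed by the FLD post-processing; the argument names are illustrative:

```python
import numpy as np

def limit_curve_distance(eps_e, eps_lc, in_domain):
    """Distance of an element's strain state to a limit curve, Eq. (10):
    vertical distance inside the curve's domain, Euclidean distance outside."""
    eps_e, eps_lc = np.asarray(eps_e), np.asarray(eps_lc)
    if in_domain:
        return float(abs(eps_e[0] - eps_lc[0]))
    return float(np.linalg.norm(eps_e - eps_lc))

def f2_weighted_violation(d_crack, a_crack, d_wrinkle, a_wrinkle, w=0.1):
    """Objective f2, Eq. (11): area-weighted mean limit-curve violations of the
    cracked and wrinkled elements, combined with weighting factor w."""
    def term(d, a):
        d, a = np.asarray(d, float), np.asarray(a, float)
        return float((d * a).sum() / a.sum()) if d.size else 0.0
    return term(d_crack, a_crack) + w * term(d_wrinkle, a_wrinkle)
```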

The third objective function \(f_3\) is the thickness variation in the drawn component. This indicator has been widely used for many years [e.g., Guo et al. (2000); Naceur et al. (2001); Sattari et al. (2007)]. The definition used here is very similar to that given by Guo et al. (2000) for the special case \(p=2\):

$$\begin{aligned} f_3 = \left( \frac{1}{N} \sum _{e=1}^N (h_t^e - h_0)^2 \right) ^{\frac{1}{2}}. \end{aligned}$$
(12)

N represents the number of elements, \(h_0\) is the initial constant thickness of the sheet metal, and the elemental sheet thickness at the final simulation time step is given by \(h_t^e\). This function is intuitive because a decrease in thickness during the simulation can lead to necking, while an increase in thickness may be correlated with wrinkling. These are the two main failure modes in sheet metal forming. In contrast to the first two objective functions, this function also allows for the comparison of components considered as manufacturable. Objective functions \(f_1\) and \(f_2\) are always zero when a component is considered formable, whereas function \(f_3\) is not.
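
Eq. (12) is straightforward to implement given the final thickness field; a minimal sketch:

```python
import numpy as np

def f3_thickness_variation(h_final, h0):
    """Objective f3, Eq. (12): RMS deviation of the final element thicknesses
    h_final from the constant initial sheet thickness h0."""
    h_final = np.asarray(h_final, float)
    return float(np.sqrt(np.mean((h_final - h0) ** 2)))
```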

Defining the lower and upper bounds of the j-th design variable as \(\underline{x}_j\) and \(\bar{x}_j\), respectively, (compare Table 1) and considering the three objective functions \(f_i (\varvec{x} )\) introduced above, the three optimization problems considered here can be formulated as follows:

$$\begin{aligned}&\min _{\varvec{x}}\quad \quad f_i(\varvec{x}), \end{aligned}$$
(13a)
$$\begin{aligned}&\text {where} \quad \underline{x}_j \le x_j \le \bar{x}_j, \quad j = 1, 2, 3, 4, 5, 6 \end{aligned}.$$
(13b)

Here, i takes the values 1, 2, or 3, depending on the objective function considered. The formulation \(f_i(\varvec{x})\) includes the whole simulation workflow: depending on the fidelity level and the design variable values, either a high-fidelity or a low-fidelity simulation model is generated and evaluated, and the resulting strain field is used to calculate the respective objective function value.

Termination criteria for this problem are set to \(10^{-5}\) for the infill criterion threshold and 100 for the maximum number of iterative evaluations of the objective function.

5 Results

In the following, the results of the optimization scheme proposed by Zhang et al. (2018) are presented for the three different objective functions on the given deep-drawing problem, discussed separately, and finally compared. As a reference, a single-fidelity optimization technique based on kriging, EI, and the high-fidelity simulation model is utilized. The latter is referred to as HF in the following, while the multi-fidelity scheme is called MF.

In general, both techniques are evaluated for the quality, consistency, and computational requirements of the results. Two points are the focus of the present work. First, we assess how each objective function performs in the optimizations and how the optimization results between different objective functions differ. Second, we establish whether the presented multi-fidelity optimization approach shows potential for sheet-metal-forming problems.

All high- and low-fidelity simulations are performed on the same computer using the FE software LS-Dyna distributed across eight cores. Each optimization run is repeated ten times to ensure the reliability of the assessment. Unless explicitly stated otherwise, all objective function values listed below are based on the high-fidelity model. For completeness, the average number of simulation model calls for both fidelity levels and all objective functions is listed in the Appendix Table 3.

5.1 Objective function \(f_1\)

As a first step in assessing the results for objective function \(f_1\), convergence and termination criteria are checked. Only seven of the 20 optimization runs terminate due to reaching the threshold infill criterion, whereas all others run into the maximum number of allowed iterations. Three and four of these seven runs occur with the HF and MF techniques, respectively. A convergence plot showing the best current objective function value over the high-fidelity evaluations is shown in Fig. 4. Each gray curve represents repetitions of MF, and each black curve represents HF. The diagram shows generally good convergence behavior, indicating that the different termination criteria encountered may not be problematic per se. However, it should be noted that there is considerable variation between repetitions of the same optimization technique. Possible reasons for these differences are discussed in the following, after presenting the actual optimization results.

Fig. 4 Objective function \(f_1\): Convergence plot for ten repetitions of the two optimization methods. Mean of the single-fidelity runs is shown as solid black line, multi-fidelity runs as dashed green line. The colored areas represent upper and lower bounds. The first 50 and 20 evaluations are part of the initial design of experiments for HF and MF, respectively. (Color figure online)

The results of the optimization of objective function \(f_1\), which is based on counting the share of bad elements in the FLD, are shown in a parallel coordinates plot in Fig. 5. The two techniques HF and MF are compared. Each curve represents the optimized result of a single optimization run. The color scale indicates the value of the objective function. The different design variables are listed on the x-axis and their normalized ranges on the y-axis. The values of the objective function here are mostly in the range between 0.38 and 0.40, with a total of six exceptions above or below that (see, for example, the dark blue curve for HF or the yellow curve for MF). The average best objective function value is slightly lower for HF compared to MF. As this difference is smaller than the variation between repetitions of the same technique, it is not considered significant here. Interestingly, the values of the design variables do not reflect the consistency of the objective function results. In fact, \(x_3\) is the only variable for which a consistent optimal value of around 6.45 mm is found. Intuitively, a higher value for the die radius \(x_3\) should be beneficial to prevent cracks during the drawing process. Here, it is lower because a high number of elements lie in the wrinkling range, where a lower radius can be better. All other design variables vary significantly between different repetitions for both optimization techniques.

Fig. 5 Objective function \(f_1\): Parallel coordinates plot comparing ten repetitions of the two optimization methods. Design variable values on the y-axis are normalized, actual boundaries can be found in Table 1. The color scale indicates objective function values of the respective results (lower is better). (Color figure online)

The fact that both the convergence plot and the results show significant variations between repetitions across both optimization approaches indicates that the objective function itself might be the problem. Recall that the approach for this objective was to determine the share of ‘bad’ elements in the FLD. This rather naive approach is tempting because it is very easy to implement and understand. However, it has a number of downsides that could contribute to the inconsistent results reported here. First, counting the categorized elements neglects the degree to which an element violates, for example, the forming limit. In reality, it could make a big difference whether an element lies barely above the FLC or far beyond it. Second, the intermediate categorization of elements reduces the influence that the design variables have on the objective function. To illustrate this point, imagine an element slightly above the FLC (i.e., in the group of cracked elements) in the baseline configuration. Now imagine that this element experiences an increased major strain due to, for example, an increase in the blankholder force. Ideally, this worsening state should be reflected in some way in the value of the objective function. However, for \(f_1\) the objective value would not change because the element was already in the group of cracked elements before. This reduced influence of the design variables makes any attempt at optimization significantly harder. We believe that this insufficient definition of \(f_1\) is responsible for the inconsistent results reported here. It also leads to ‘optimized’ designs that are quite different from those found with the other two objective functions, which will be presented below.

For completeness, it should be mentioned that the MF approach yields on average a time reduction of around 50% for optimization compared to HF. The exact numbers can be found in Table 4 in the Appendix.

For the objective function \(f_1\), it can be concluded that the optimization technique MF is capable of significantly speeding up the optimization process for the present example problem without reducing the quality of the results. However, the objective function itself is not ideally defined for optimization because the influence of the design variables on the objective function is limited. This leads to very inconsistent optimization results, regardless of the technique used.

5.2 Objective function \(f_2\)

The discussion of objective function \(f_2\), which is defined as the weighted average distance of cracked and wrinkled elements to the FLC and WLC in the FLD, respectively, also starts with a check of termination conditions and convergence. Here, all MF runs and seven of the ten HF runs terminate due to the threshold infill criterion, indicating good convergence. The remaining three optimizations are terminated after reaching the maximum number of iterations. A convergence plot for this objective function in the same style as above is shown in Fig. 6. Here, repetitions of the same algorithm converge to similar values fairly consistently. In addition, the algorithm reaches values close to the final optimum in very few adaptive high-fidelity evaluations. Recall that the adaptive phase starts after 50 and 20 high-fidelity evaluations for HF and MF, respectively.

Fig. 6 Objective function \(f_2\): Convergence plot for ten repetitions of the two optimization methods. Mean of the single-fidelity runs is shown as solid black line, multi-fidelity runs as dashed green line. The colored areas represent upper and lower bounds. The first 50 and 20 evaluations are part of the initial design of experiments for HF and MF, respectively. (Color figure online)

A comparison of the optimization results between HF and MF is shown in Fig. 7. The optimal values of the objective function here are consistently between 0.34 and 0.38, with only one overall worse value in MF. Optimal objective function values can also be linked to certain design variable values. \(x_1\) and \(x_2\) are at or close to their lower bounds of 0.8 mm and 12 mm, respectively. \(x_3\) is either at its upper limit of 9 mm or around 7.5 mm, and \(x_4\) is consistently at its upper limit of 2.5, while \(x_5\) and \(x_6\) vary across their entire range between optimizations. The variation in the latter variables indicates that their influence on the objective function is limited. For the Coulomb friction coefficient \(x_5\), this is less surprising, as its range was defined rather narrowly. For the blankholder force \(x_6\), most of the results lie in the lower half of its range, so the best explanation is that its influence diminishes below a certain threshold. Overall, the quality of the results between the two optimization methods is very similar, although there is slightly more variation with MF.

Fig. 7 Objective function \(f_2\): Parallel coordinates plot comparing ten repetitions of the two optimization methods. Design variable values on the y-axis are normalized, actual boundaries can be found in Table 1. The color scale indicates objective function values of the respective results (lower is better). (Color figure online)

To illustrate the progress made during optimization, two designs evaluated during an MF optimization run are chosen as representatives. The first is an initial evaluation of the optimization with an objective function value of 2.49, and the second is the optimized result of the same optimization with an objective function value of 0.35. FLDs for these two simulations are depicted in Fig. 8. Significant improvements can be seen, particularly for cracked elements, but also in the wrinkling regime. It should be noted that the optimized result is nevertheless not considered manufacturable, which is expected from the definition of the design variable limits (compare Sect. 3).

Fig. 8 Objective function \(f_2\): Comparison of two forming limit diagrams (FLD) of an early simulation and the optimized result of the same optimization run. For the latter, results are also shown mapped onto the final geometry. More details on FLDs can be found in Sect. 4

In terms of computational requirements, MF runs need on average around 46% less time to terminate than HF. Detailed numbers are listed in Appendix Table 4. These results match fairly well with previously reported time savings of multi-fidelity optimization schemes in structural optimization problems; compare, for example, Acar et al. (2020) or Kaps et al. (2022), where the authors used multi-fidelity schemes in automotive crashworthiness examples.

In conclusion, with objective function \(f_2\) in the present example problem, MF produces results of comparable quality to HF, although they show slightly more variation, while significantly reducing the optimization time. The results with this objective are also significantly more consistent than with \(f_1\), indicating that it is a more suitable objective function for this type of problem.

5.3 Objective function \(f_3\)

The results of the thickness variation objective function \(f_3\) are shown in Fig. 9. Here, good values of the objective function below 0.05 appear to depend on variables \(x_1\), \(x_2\), \(x_5\), and \(x_6\) being close to their respective lower bounds, while \(x_4\) is at its upper bound of 2.5 and \(x_3\) lies between 7.5 and 8 mm. We believe that these results are to be expected when considering the average thickness reduction of the component over the deep-drawing process. The design variable values are also remarkably similar to those observed for \(f_2\). The only exception is the friction coefficient \(x_5\), which is significantly more consistent in its optimal values close to its lower boundary for the present objective function. Interestingly, for this objective function, the MF approach yields better and more consistent results than HF.

Fig. 9 Objective function \(f_3\): Parallel coordinates plot comparing ten repetitions of the two optimization methods. Design variable values on the y-axis are normalized, actual boundaries can be found in Table 1. The color scale indicates objective function values of the respective results (lower is better). (Color figure online)

Before looking at the run times and more details of the comparison, convergence information is reported to ensure reliability of the results. For this objective function, all runs of the HF method and seven of the ten MF runs terminate due to the infill criterion threshold. The other three runs reach the maximum number of iterations. This indicates good convergence of the algorithms. The full convergence plot for all runs, which can be found in Fig. 10, confirms this observation.

Fig. 10 Objective function \(f_3\): Convergence plot for ten repetitions of the two optimization methods. Mean of the single-fidelity runs is shown as solid black line, multi-fidelity runs as dashed green line. The colored areas represent upper and lower bounds. The first 50 and 20 evaluations are part of the initial design of experiments for HF and MF, respectively. (Color figure online)

The FLD for the best result obtained by the MF technique among all repetitions is shown in Fig. 11. As the differences between the repetitions are marginal here, it is also representative of other optimized results from the MF method. The component depicted here is not considered manufacturable (see Sect. 3). However, this FLD confirms the remarkable similarity between the results of \(f_2\) and \(f_3\) (compare Fig. 12c).

Fig. 11 Objective function \(f_3\): Forming limit diagram of the overall best result obtained with MF technique. Results are also shown on the component geometry

For this objective function, optimizations performed with the MF technique need on average about 31% more time than HF (detailed values are listed in Appendix Table 4). Together with the better and more consistent results found with MF, these findings are unexpected. Usually, the aim of a multi-fidelity optimization scheme is to reduce the computational effort while not significantly impairing result quality. In this case, the opposite happens, making further investigation of these findings necessary. Several observations can be made as to why MF performs better in this case. First, looking at the convergence plots (see Fig. 10), HF terminates after fewer high-fidelity evaluations than MF for all optimization runs. Usually, the opposite would be expected (compare, for example, Fig. 6). Additionally, all HF runs terminate due to the infill criterion threshold, indicating that the algorithm converged and no significant improvements are expected. These two observations suggest that the kriging model in HF may not be sufficiently accurate and that the optimizer may be stuck at a local optimum. For MF, this appears to be less of a problem. Apparently, the additional function evaluations performed with the low-fidelity model, which is added as a trend term into the HK surrogate model, help the optimizer avoid local optima by better resolving them in the surrogate model.

Overall, the results for this objective function are surprising, as MF outperforms HF but also requires more computation time. The best current explanation is a higher quality of the surrogate model in MF avoiding local minima. We believe that more detailed investigations going beyond the scope of the present work are necessary here. It should also be confirmed whether these results can be repeated for different components and/or design variables.

5.4 Discussion

After presenting the optimization results for all objective functions, some points of discussion will be given below.

Boxplots comparing the optimization results for all three objective functions are shown in Fig. 12. The objective function \(f_1\) is found not to be well suited to optimize the example problem chosen here. Both \(f_2\) and \(f_3\) show good performance in the optimization problem, and the results between the two are remarkably similar. However, the conclusions drawn from them differ somewhat. The objective function \(f_2\) nicely illustrates the potential of a multi-fidelity optimization technique. With very little influence on the result quality, it speeds up the optimization by a factor of two in the present example. A Wilcoxon rank sum test shows that the null hypothesis of equal results for the two methods cannot be rejected at a \(5\%\) significance level (\(U=51\), \(p=0.97\)). For the objective function \(f_3\), the same statistical test confirms that the null hypothesis of equal results between the two methods can be rejected at a \(5\%\) significance level (\(U=2\), \(p<0.001\)). Thus, the multi-fidelity approach even outperforms the classic single-fidelity method for this objective, while also requiring more time to produce results.

Fig. 12 Box plots comparing results of the different optimization methods. Each method was repeated ten times per objective function

The objective functions \(f_1\) and \(f_2\) are not capable of distinguishing manufacturable components from each other. This issue is avoided in the present work through the definition of the optimization problem. Also, it may not be so relevant in practical applications where the main priority is obtaining a manufacturable component. However, in other contexts, such as fitting machine learning models, it could be a challenge. Another possible problem that was observed here is the mesh dependency, especially of objective functions \(f_1\) and \(f_2\). The values of these functions change if the exact same component is meshed differently. In the present work, this is not a concern because all high-fidelity models are meshed the same way. The low-fidelity models used for the MF approach are only utilized as a trend term in the HK surrogate model and, thus, are not directly included in comparisons. However, this should be considered in other applications.

6 Conclusions

In the present work, an exemplary cross-die deep-drawing optimization problem is investigated with respect to different objective functions and the use of a multi-fidelity efficient global optimization technique. For the former, three different objective functions are defined, all of which have been previously applied in the literature, at least in slightly modified form and primarily with single-fidelity techniques. For the latter, a multi-fidelity efficient global optimization scheme based on hierarchical kriging and variable-fidelity expected improvement is proposed here, which has been successfully used in various fields of application such as fluid mechanics or automotive crashworthiness.

Two of the three objective functions are based on forming limit diagrams that are commonly used in sheet metal forming to determine the manufacturability of components. The first is based on naively counting elements of different classifications and minimizing the share of bad elements. The second function is defined as minimizing the average violation of the forming and wrinkling limit curves for critical elements. The third objective function to be minimized is the average thickness reduction in the component during the deep-drawing process.

The first objective function is found to be hard to optimize consistently with both the multi-fidelity and a reference single-fidelity efficient global optimization method. The limited influence of the design variables on the objective function is identified as one of the main reasons. The second objective function shows consistent result quality across the two optimization techniques and highlights the capability of the multi-fidelity scheme to speed up computation times by a factor of up to two. The time gains observed here match well with results previously reported for multi-fidelity optimizations in other fields of application. The third objective function shows surprising results in that the multi-fidelity technique delivers better and more consistent results compared to the single-fidelity reference approach while also increasing the computation times by a factor of approximately 1.3. The currently most likely explanation is the better predictive quality of the surrogate model due to the overall higher number of objective function evaluations in the multi-fidelity compared to the single-fidelity technique. However, we believe that these last results warrant a more detailed investigation, which could be interesting for future work.

In addition, we believe that it is worthwhile to further expand the use of multi-fidelity optimization schemes in the field of sheet metal forming. On the basis of the results of the present work, we identified a number of additional ideas that we believe to be interesting for future work.

  • A number of improvements to the multi-fidelity approach used here have been suggested, which should also be applied to a sheet-metal-forming problem to assess their potential in this field of application.

  • Similarly, a number of different multi-fidelity optimization techniques have been suggested, which should be compared against each other in a sheet-metal-forming problem.

  • The results presented here should be confirmed on larger and more complex deep-drawing components.

  • Objective functions based on forming limit diagrams cannot distinguish between manufacturable components. This might lead to challenges for the optimizer in more realistic problems. An effort should be made to adapt these functions, as they are quite intuitive and easily understandable for a human.