1 Introduction

In many problems, some of the input variables of a process show variations from measurement to measurement and they can be described using a probabilistic distribution. Such disturbances which are referred to as noise are a challenge in optimizing the processes. The main concern is to predict their effects on the statistics of the response and to minimize the sensitivity of the response to these noise variables. This problem is very common in various disciplines including engineering, physics, biology and economy. For instance, the uncertainty of the response due to uncertain inputs has been assessed in large-scale energy-economic policy models (Kann and Weyant 2000), zero-defect manufacturing (Myklebust 2013), maintenance modelling (Gao and Zhang 2008), automotive industry (Wurm and Bestle 2016), study of groundwater flow (Dettinger and Wilson 1981), engineering design (Kang et al. 2012), weather forecast (Palmer 2000) and health related issues (Barchiesi et al. 2011).

For optimization under the influence of uncertainty, the input variables are considered in two categories. The variables that can be adjusted to get an optimum process are called design variables and those which are noisy and difficult to control are referred to as noise variables. The goal of robust optimization is to find a set of design variables at which the process is least sensitive to the disturbances.

Computer models are often used for optimization of processes. Optimizing a process using computer models requires many evaluations of the model. Running those models usually requires significant computational effort. Often an approximate model of the process, a so-called metamodel, is built from results of computer simulations. The metamodels are then used to efficiently perform the optimization in the presence of uncertainty. Moreover a single infill criterion or multiple infill criteria can be used to further improve the metamodel and enhance the search for a robust optimum (ur Rehman and Langelaar 2017).

A major challenge in robust optimization is the lack of information about the noise parameters. In most applications it is cumbersome to collect information about noise variables. For instance, in metal forming processes the scatter of material properties (noise variable), is mostly gathered using a very limited number of tests due to costs, availability of equipments and time (Goedel and Merklein 2011). Consequently, simplified models are used in simulations (de Souza and Rolfe 2010; Prates et al. 2018). When the information about noise is limited, it is a common practice to use simple input probability distributions such as a normal distribution (Atzema et al. 2009; de Souza and Rolfe 2010; Marretta and Di Lorenzo 2010; Wiebenga et al. 2012). However, in practice we encounter non-normal input probability distributions. For example, in steel production, the same steel grades are manufactured from different batches and over different production lines. The products from each batch or production line could have different mean and standard deviation. In addition to that, a product might be upgraded or downgraded to a different grade after characterization. This basically leads to multimodal distributions of material properties of steels of the same grade. To observe these phenomena, a large number of tests is required to get statistically significant results, which is time consuming and costly. For example, after testing about 6000 samples Hora et al. (2011) showed that the ultimate tensile strength of DC06 steel grade follows a multimodal distribution.

In some applications, it is relatively easy to gather the scatter data. However, the challenge is how to efficiently use the collected noise data to find the statistics of the response and to perform robust optimization. The propagation of input uncertainty through a metamodel is often performed using Monte Carlo (MC) method (Putko et al. 2002; Zhou et al. 2018b). Subsequently, the mean and standard deviation of response are used to define the objective function and reliability constraints in robust optimization (Cui et al. 2014; Du et al. 2008; Zhao et al. 2015). In Nejadseyfi et al. (2019b) it was shown that finding the robust optimum design can be further improved by reformulating the objective function and constraints.

The purpose of this study is to show how to handle the collected noise data in terms of correlations and probability distribution. Principal component analysis (PCA) is done for dimension reduction and to avoid sampling of physically unrealistic noise parameter values. Having uncorrelated noise allows for efficient calculation of noise propagation. This efficiency is even more enhanced by analytical calculation of uncertainty. For analytical noise propagation, it is crucial to describe the noise in uncorrelated basis and therefore principal component analysis is chosen to be used in combination with the analytical integration. Finally, modified formulations for robust optimization objective and reliability constraints are presented and implemented in Sect. 3. Two examples are presented to validate the method and show the applicability in industrial problems. In the first example, a basketball free throw in windy weather condition is presented. The real wind data are collected and analysed to find an optimum throw setting and validate the proposed methods. This example is an unconstrained robust optimization problem. In the second example the forming of the B-pillar automotive component is optimized while the material properties are noisy. Thinning which is a forming limit is considered as constraint of this problem and therefore the second example is categorized as a constrained robust optimization problem. For this example, 49 coils of DP800 steel were characterized using tensile tests and the correlations between those input variables are investigated. The results are then presented and discussed in Sect. 4.

2 Defining problems

2.1 First example: basketball free throw in windy weather condition

The problem of basketball free throw in windy weather condition (Nejadseyfi et al. 2019a) is presented to address the handling of noise parameters based on collected data. Figure 1 illustrates the side and top views of the projectile motion of a basketball.

Fig. 1
figure 1

Projectile motion of a basketball in xz and xy plane

The position of the basketball center is indicated by a vector \({\mathbf {r}}\). The ball has velocity \({\mathbf {v}}\) and is subject to gravity as well as air resistance. The equation of motion then is:

$$\begin{aligned} m \dot{{\mathbf {v}}} = -mg {\mathbf {e}}_{{\mathrm {z}}} - \frac{1}{2} \rho C_{\mathrm {D}} A | {\mathbf {v}}- {\mathbf {w}}| \left( {\mathbf {v}}- {\mathbf {w}}\right) , \end{aligned}$$
(1)

where m is the ball’s mass, g is the acceleration of gravity, \({\mathbf {e}}_{{\mathrm {z}}}\) is the unit vector in vertical direction, \(\rho\) is air density, \(C_{{\mathrm {D}}}\) is the coefficient of air drag, A is the ball’s frontal area and \({\mathbf {w}}\) is the wind velocity. Together with the relation between position and velocity, \(\dot{{\mathbf {r}}} = {\mathbf {v}}\), a system of first order differential equations:

$$\begin{aligned} \begin{Bmatrix} \dot{{\mathbf {r}}} \\ \dot{{\mathbf {v}}}\end{Bmatrix} = \begin{Bmatrix} {\mathbf {v}}\\ -g {\mathbf {e}}_{{\mathrm {z}}} - \frac{\rho C_{\mathrm {D}} A | {\mathbf {v}}- {\mathbf {w}}|}{2m} \left( {\mathbf {v}}- {\mathbf {w}}\right) \end{Bmatrix} \end{aligned}$$
(2)

is obtained, which is integrated by a standard Runge–Kutta routine.

The values used for different parameters are: \(m=0.625\) kg, \(g=9.82\) m/s\(^2\), \(\rho =1\) kg/m\(^3\), \(C_{{\mathrm {D}}}=0.5\), \(A=0.045\) m\(^2\).

The design parameters for this problem are initial velocity (\(V_0\)), throw angle in xz plane (\(\theta _{xz}\)) and throw angle in xy plane (\(\theta _{xy}\)). The ranges of these parameters are mentioned in Table 1.

Table 1 Design parameters in basketball free throw

A front wind is considered as noise during the projectile motion of the basketball. Wind magnitude, \(V_w\), and the angle of wind in xy plane, \(\gamma _{w}\), are considered as noise variables. The variation of wind direction and speed are collected over a 20-days period of time in 10-min intervals from a weather station at Caribbean Netherlands (Web). Some data points are not recorded due to unavailability of the station at the time of generating the report. Figure 2 shows the distribution of the wind velocity and wind speed for 2861 observations. As can be seen in this figure, noise variables slightly deviate from a normal distribution fit and the presence of two modes is noticeable.

Fig. 2
figure 2

Scatter of wind data and correlation between wind direction and speed

The mutual linear correlation coefficient for two parameters is also shown in this figure which is calculated by:

$$\begin{aligned} \rho _\mathrm {X,Y}& =\frac{{\mathrm {Cov}}({\mathbf {X,Y}})}{\sigma _{\mathrm {X}}\sigma _{\mathrm {Y}}} \end{aligned}$$
(3)

where \({\mathbf {X}}\) and \({\mathbf {Y}}\) are two random variables centered about their mean, \(\sigma _{\mathrm {X}}\) and \(\sigma _{\mathrm {Y}}\) are their standard deviations, and \({\mathrm {Cov}}({\mathbf {X,Y}})\) is the covariance, the degree of linear relationship between \({\mathbf {X}}\) and \({\mathbf {Y}}\). The value for the correlation coefficient ranges from \(-1\) to 1 and it indicates the amount of correlation between two variables. In Fig. 2 the mutual linear correlation coefficient for \(V_w\) and \(\gamma _{w}\) is also shown.

The robust optimization procedure to handle non-normal and correlated data is presented in Sect. 3.

2.2 Second example: forming of a B-pillar

DP800 steel is being used in car body design to produce the pillars between windows and/or doors which are designated alphabetically from the front: A-, B- and C-pillar. The B-pillar is located between front and rear doors and thus contributes to the protection of the passengers against side impact collision. In the lower part of the B-pillar there are many potential forming problems, thus this geometry represents a critical test case. A finite element (FE) model of this part is used for optimization. As shown in Fig. 3, the metal sheet is placed on the die and the blank holder is then moved towards the metal to clamp the sheet. The punch stroke forms the part and the part is removed from the die. A product of this process is shown in Fig. 3b. As shown in this figure, the design variables are blank-holder force (\(F_{bh}\)) and blank corner radius (R). The ranges of the design variables are specified in Table 2. They are specified using their upper and lower bounds and they can be adjusted in that range. In this process, the noise originates from the material.

Fig. 3
figure 3

a FE simulation setup and b the product of the forming process

Table 2 The ranges of the design variables

The main response of the model is the final angle of the profile after springback (\(\theta\)) and the constraint is defined on the maximum percentage thinning (\(\delta\)).

Optimization of material use and production process pushes the process to the limits. Variations in material properties can cause these limits to be passed and therefore products to be scrapped (Ben-Tal and Nemirovski 2002; Enevoldsen and Sorensen 1994). For robust optimization by simulation, accurate characterization of material properties is needed. Not only nominal values of properties, but also their distribution needs to be provided. Therefore, a detailed stochastic description of the material parameters is required based on coil-to-coil variation.

2.2.1 Material characterization

Material characterization was performed on 49 samples obtained from 49 coils of DP800 steel sheet. Those coils were collected from various production lines at Tata Steel. For each sample, the thickness of the sheet, t, was measured. Then the tensile properties in \(0^\circ\), \(45^\circ\) and \(90^\circ\) were obtained as shown in Fig. 4.

Fig. 4
figure 4

Tensile test samples prepared in three directions

The Swift hardening law is fitted to the results of the tensile test in rolling direction. The hardening is described using two parameters, strength coefficient (K) and strain hardening exponent (n) and it is assumed that yield strain \(\varepsilon _0=0.002\).

It was shown by Gomes et al. (2005) that anisotropy has a considerable effect on springback, and therefore, for an accurate shape prediction, anisotropy cannot be neglected. To describe anisotropy in this work, the method proposed by Abspoel et al. (2017) is used as it requires only parameters of standard uniaxial tests. Abspoel et al. (2017) confirmed that for a wide range of materials, the stress points in biaxial, plane strain and shear modes can be predicted using the input variables obtained from only tensile tests at \(0^\circ\), \(45^\circ\) and \(90^\circ\) to rolling direction. They used Vegter yield locus (Vegter and van den Boogaard 2006) and therefore, all measured stresses are normalized to the uniaxial stress in the rolling direction providing the stress factors, such that:

$$\begin{aligned} \begin{aligned} F_{00}&=S_{00}/S_{00}=1\\ F_{45}&=S_{45}/S_{00}\\ F_{90}&=S_{90}/S_{00} \end{aligned} \end{aligned}$$
(4)

in which, \(S_{00}\), \(S_{45}\) and \(S_{90}\) are the ultimate tensile strength (Rm) divided by uniform elongation (Ag) in each direction. The parameters \(r_{00}\), \(r_{45}\), and \(r_{90}\) are defined as the plastic strain in thickness direction (\(\varepsilon _{t}\)) divided by the plastic strain in width direction (\(\varepsilon _\mathrm {w}\)) in uniaxial tensile tests.

Figure 5 shows the variations of the material parameters around their mean value as obtained for 49 coils. The fitted normal distribution to each data set is shown in this figure. It is apparent that for some parameters the deviation of the fit from the underlying population is significant. In addition, there are some correlations between the parameters that are shown in Fig. 6. In this figure, the parameter names are on the diagonal and the bottom triangle shows the mutual linear correlation coefficients for various material parameters calculated by Eq. (3). A larger font in the bottom triangle indicate a higher mutual linear correlation among the parameters.

Fig. 5
figure 5

Variation of material parameters

Fig. 6
figure 6

The variation of material properties within \(\pm 3\sigma\) box for 49 coils after subtracting their mean value (top triangle), the parameter names (diagonal), and mutual linear correlation coefficients (bottom triangle)

Inspection of Fig. 6 reveals that strong correlations are present between \(F_{45}\) and \(F_{90}\), K and \(r_{00}\), K and \(r_{90}\), and between K and t. The modification of robust optimization procedure to handle non-normal and correlated data is presented in Sect. 3.

3 Methods

3.1 Correlations

The purpose of correlation study is to get an uncorrelated subspace, to possibly reduce the parameter space and to avoid sampling of physically-unlikely parameter combinations when the design of experiments (DOE) is created to make a metamodel. In this work, the covariance PCA is used to find the principal components (PCs) of the noise variables as described by Wiebenga et al. (2014). The advantage of PCA over other methods such as copula based methods which are used to estimate the joint distribution of noise parameters, is that PCA provides uncorrelated and univariate basis for the noise. The uncorrelated basis can then be used for dimension reduction using sensitivity analysis. Moreover, for analytical propagation of uncertainty which is presented in 3.2, it is crucial to describe the noise variables using uncorrelated basis. It is notable that in this work linearly uncorrelated principal components are used to describe noise data. If prediction uncertainties from PCA are of concern, higher order principal component analysis or multi-linear principal component analysis can be implemented in the presented framework without additional amendments.

The orthogonal transformation from the correlated data to linearly uncorrelated PCs is schematically shown in Fig. 7a, b. The DOE is then created in the PC space (Fig. 7c) avoiding physically-unlikely parameters sets, the hashed area shown in Fig. 7d. A set of physically-unlikely parameters can lead to an overestimation of the response scatter during the robust optimization (Wiebenga et al. 2014). For simulations, a back transformation from PC domain to the physical parameter domain is required.

Fig. 7
figure 7

Schematic of sampling from correlated parameters

3.2 Noise propagation for a non-normal input

It has been shown in Nejadseyfi et al. (2019a) that resorting to the analytical approach during robust optimization and sequential improvement of metamodel increases the efficiency and accuracy of the search for robust optimum. In this section, the analytical method is extended for non-normal distributions of input variables.

3.2.1 Noise propagation for a normally-distributed input

The input vector consists of \(n_{\mathrm {x}}\) design variables, \({\mathbf {x}}\), and \(n_{\mathrm {z}}\) noise variables, \({\mathbf {z}}\). The response of the process, r, is defined as \(r({\mathbf {x,z}})\). The mean and variance of the response are expressed by:

$$\begin{aligned} \mu _{\mathrm {r}}({\mathbf {x}})& = \int _{{\mathbf {z}}} r({\mathbf {x,z}})p({\mathbf {z}})d{\mathbf {z}} \end{aligned}$$
(5)
$$\begin{aligned} \sigma _{\mathrm {r}}^2({\mathbf {x}})&= \int _{{\mathbf {z}}} [r({\mathbf {x,z}})-\mu _{\mathrm {r}}({\mathbf {X}})]^2p({\mathbf {z}})d{\mathbf {z}} \end{aligned}$$
(6)

In these equations, \(p({\mathbf {z}})\) is the probability distribution function (PDF) of the noise variables. To use these equations, it is assumed that the noise variables are statistically independent. In this work this assumption is valid since PCA is used. It has been shown that if \(r({\mathbf {x}},{\mathbf {z}})\) can be expressed as a sum of tensor-product basis functions the results of univariate integrals can be combined to evaluate multivariate integrals and obtain the response moments (Chen et al. 2005). The mean and standard deviation of the response is then calculated using:

$$\begin{aligned} \mu _{\mathrm {r}}({\mathbf {x}})&= a_{0}+\sum _{i=1}^{N}\left\{ a_{i}\prod _{p=1}^{n_{\mathrm {x}}} b_{ip}(x_{p})\prod _{q=1}^{n_{\mathrm {z}}}C1_{iq}\right\} \end{aligned}$$
(7)
$$\begin{aligned} \sigma _{\mathrm {r}}^{2}({\mathbf {x}})&= \sum _{i=1}^{N}\sum _{j=1}^{N}a_{i}a_{j}\prod _{p=1}^{n_{\mathrm {X}}}b_{ip}(x_{p})b_{jp}(x_{p})\nonumber \\&\quad \times \left( \prod _{q=1}^{n_{\mathrm {z}}}C2_{ijq}-\prod _{q=1}^{n_{\mathrm {z}}}C1_{iq}C1_{jq}\right) \end{aligned}$$
(8)

where \(a_0\) is a constant, N is the number of basis functions, \(a_i\) are weighting factors, \(b_{ip}\) are the basis functions of the metamodel and \(C1_{iq}\) and \(C2_{ijq}\) are defined as:

$$\begin{aligned} C1_{iq}&= \int _{z_q} b_{iq}(z_{q})p(z_{q})\ dz_{q} \end{aligned}$$
(9)
$$\begin{aligned} C2_{ijq}& = \int _{z_q} b_{iq}(z_{q})b_{jq}(z_{q})p(z_{q})\ dz_{q} \end{aligned}$$
(10)

Once \(C1_{iq}\) and \(C2_{ijq}\) are known for the specific metamodel using basis functions and probability distribution, the moments of the response can be evaluated on each design point. A normal probability distribution for input noise is described by:

$$\begin{aligned} \begin{aligned} p(z_q)={\mathcal {N}}(z_{q};\mu _{q},\sigma _{q}) =\frac{1}{\sigma _{q}\sqrt{2\pi }}e^{-\frac{{(z_{q}-\mu _{q})}^{2}}{2\sigma _{q}^{2}}} \end{aligned} \end{aligned}$$
(11)

where \(\mu _{q}\) and \(\sigma _{q}\) are the mean and standards deviation of the input noise data. The basis functions for an ordinary Kriging are:

$$\begin{aligned} \begin{aligned} b_{iP}(x_{P})=e^{-\vartheta _{P}^2(x_{P}-x_{iP})^2} \end{aligned} \end{aligned}$$
(12)

In this equation the values for \(\vartheta _P\) are obtained from fitting a Kriging metamodel to the discrete response. Substituting Eqs. (11) and (12) into Eqs. (9) and (10) leads to \(C1_{iq}\) and \(C2_{ijq}\) (see “Appendix”) for normal distribution and Kriging metamodel:

$$\begin{aligned} C1_{iq}^{{\mathrm {Krig}},{\mathcal {N}}}& = \frac{1}{\sqrt{2\sigma _{q}^{2}\vartheta _{q}^{2}+1}}\nonumber \\&\quad \times \exp \left( \frac{-\vartheta _{q}^{2}}{2\sigma _{q}^{2} \vartheta _{q}^{2}+1}(\mu _{q}-z_{iq})^2\right) \end{aligned}$$
(13)
$$\begin{aligned} C2_{ijq}^{{\mathrm {Krig}},{\mathcal {N}}}& = \frac{1}{\sqrt{4\sigma _{q}^{2} \vartheta _{q}^{2}+1}}\exp \Bigg ( \frac{-\vartheta _{q}^2}{4\sigma _{q}^{2}\vartheta _{q}^{2}+1} \nonumber \\&\quad \times [(\mu _{q}-z_{iq})^2+(\mu _{q}-z_{jq})^2 \nonumber \\&\quad +2\sigma _{q}^{2}\vartheta _{q}^{2}(z_{iq}-z_{jq})^2] \Bigg ) \end{aligned}$$
(14)

The significant advantage of this method is that Eqs. (13) and (14) are evaluated only once for a given noise and therefore it enhances the efficiency of the search for robust optimum in design space.

3.2.2 Noise propagation for non-normal input

To perform robust optimization using analytical method, it is necessary to describe the input noise using a probability density function. Based on data presented in Sect. 2 a multimodal normal distribution is preferred to construct the estimates of the input noise as it can be handled analytically. In the previous section Eqs. (9) and (10) are describing the general form of \(C1_{iq}\) and \(C2_{ijq}\) coefficients. In this section, we alter \(C1_{iq}\) and \(C2_{ijq}\) for bimodal normal input. Although the derivations in this section are for bimodal normal input, the procedure remains the same for a multimodal normal distribution.

A bimodal distribution is expressed by sum of two uni-modal distributions:

$$\begin{aligned} \begin{aligned} p(z_{q})=\omega _q p_1(z_{q})+(1-\omega _q)&p_2(z_{q}) \quad 0\le \omega _q\le 1 \end{aligned} \end{aligned}$$
(15)

If it is assumed that \(p_1\) and \(p_2\) are two normal distributions that are characterized by parameters \(\mu _q^{[1]}, \sigma _q^{[1]}, \mu _q^{[2]}\) and \(\sigma _q^{[2]}\), a bimodal normal distribution is defined by:

$$\begin{aligned} \mathrm {{\mathcal {B}}}(z_{q})& = \omega _q {\mathcal {N}}(z_{q};\mu _q^{[1]},\sigma _q^{[1]})\nonumber \\&\quad +(1-\omega _q) {\mathcal {N}}(z_{q};\mu _q^{[2]},\sigma _q^{[2]}) \end{aligned}$$
(16)
$$\begin{aligned} {\mathrm {CDF}}^\mathrm {{\mathcal {B}}}(z_{q})& = \omega _q {\mathrm {CDF}}^{{\mathcal {N}}}(z_{q};\mu _q^{[1]},\sigma _q^{[1]})\nonumber \\&\quad +(1-\omega _q) {\mathrm {CDF}}^{{\mathcal {N}}}(z_{q};\mu _q^{[2]},\sigma _q^{[2]}) \end{aligned}$$
(17)

where \(\omega _q\) determines the weighting factor for each of those normal distributions, as shown schematically in Fig. 8. The superscripts, [1] and [2], are related to each mode of the distribution and they are put in brackets to distinguish them from exponents. Furthermore, CDF refers to the cumulative distribution function. For the sake of brevity, throughout this paper, we use the term bimodal distribution to refer to bimodal normal distribution, \({\mathcal {B}}\). By substituting Eq. (16) into Eqs. (9) and (10):

$$\begin{aligned} C1_{iq}^{{\mathrm {Krig}},{{\mathcal {B}}}}&=\int _{z_q} b_{iq}(z_{q})\Big (\omega _q{\mathcal {N}}(z_q;\mu _q^{[1]},\sigma _q^{[1]})\nonumber \\&\quad +(1-\omega _q){\mathcal {N}}(z_q;\mu _q^{[2]},\sigma _q^{[2]})\Big )dz_{q}\nonumber \\&=\omega _q C1_{iq}^{{\mathrm {Krig}},{\mathcal {N}}[1]}+(1-\omega _q)C1_{iq}^{{\mathrm {Krig}},{\mathcal {N}}[2]} \end{aligned}$$
(18)
$$\begin{aligned} C2_{ijq}^{{\mathrm {Krig}},{\mathcal {B}}}&=\int _{z_q} b_{iq}(z_{q})b_{jq}(z_{q})\nonumber \\&\quad \times \Big (\omega _q{\mathcal {N}}(z_q;\mu _q^{[1]},\sigma _q^{[1]})\nonumber \\&\quad +(1-\omega _q){\mathcal {N}}(z_q;\mu _q^{[2]},\sigma _q^{[2]})\Big )dz_{q}\nonumber \\&=\omega _q C2_{iq}^{{\mathrm {Krig}},{\mathcal {N}}[1]}+(1-\omega _q)C2_{iq}^{{\mathrm {Krig}}, {\mathcal {N}}[2]} \end{aligned}$$
(19)
Fig. 8
figure 8

a Probability distribution of a bimodal input (\(\omega _q\)=0.5). b Cumulative distribution of a bimodal input (\(\omega _q\)=0.5)

In these equations \(C1^{{\mathrm {Krig}},{\mathcal {N}}}_{iq}\), \(C2^{{\mathrm {Krig}},{\mathcal {N}}}_{ijq}\) are the coefficients for the combination of basis functions and each input mode. Knowing C1 and C2 for normal distribution and a specific metamodel, it is feasible to extend the analytical calculation of uncertainty for a bimodal or a multimodal distribution in a straightforward manner for that metamodel.

The propagation of a bimodal input will lead to a multimodal response. Considering \(n_z\) noise variables and assuming that each noise variable follows a bimodal distribution, the response consists of \(2^{n_z}\) modes. As an example, assume there are two bimodal noise variables as \(z_1\) with a parameter set of (\([\mu _1^{[1]},\sigma _1^{[1]}],[\mu _1^{[2]},\sigma _1^{[2]}],\omega _{1}\)) and \(z_2\) with a parameter set of (\([\mu _2^{[1]},\sigma _2^{[1]}],[\mu _2^{[2]},\sigma _2^{[2]}],\omega _{2}\)). The response modes consist of the propagation of each pair of combined normal distributions in (\(z_1,z_2\)) noise space which are: (\([\mu _1^{[1]},\sigma _1^{[1]}]\) , \([\mu _2^{[1]},\sigma _2^{[1]}]\)); (\([\mu _1^{[1]},\sigma _1^{[1]}]\) , \([\mu _2^{[2]},\sigma _2^{[2]}]\)); (\([\mu _1^{[2]},\sigma _1^{[2]}]\) , \([\mu _2^{[1]},\sigma _2^{[1]}]\)); and (\([\mu _1^{[2]},\sigma _1^{[2]}]\) , \([\mu _2^{[2]},\sigma _2^{[2]}]\)). The weights of the response modes are \(\omega _{1}\omega _{2}\), \(\omega _1(1-\omega _{2})\), \((1-\omega _{1})\omega _{2}\), and \((1-\omega _{1})(1-\omega _{2})\), respectively. The combination of all response modes must be considered in the robustness and in the analysis of constraints.

3.3 Multimodal response and reformulation of the robust optimization problem

Most robust optimization approaches define the variability of response and the reliability of constraints based on mean and standard deviation of the response. To show how this approach can undermine the robust optimization procedure, Fig. 9 shows the propagation of a one-dimensional dataset, z, that follows a bimodal normal distributions through a non-linear function. The input data set is shown using bar plot and it is propagated to obtain the response bar plot. If only mean and standard deviation of the response are used for robust optimization, the variability of the main response and the tails of the constraint response distribution (reliability constraints) are not accurately predicted using a normal density function (dashed curve). In such case, it is important to modify the objective function and reliability constraints formulations. As it is possible that in some special cases the modes of output are overlapping, the proposed method should be able to handle this situation by generalizing existing methods.

Fig. 9
figure 9

Propagation of a bimodal normal distributions and the resulting response distribution

The robustness or variability of the response is directly related to the upper and lower tails of the response distribution. For instance, for an \(n\sigma\) process design, the defect rate is the number of products outside \(\pm n\sigma\) range of a normally distributed response (Koch et al. 2004).

The upper bound constraints are generally defined by:

$$\begin{aligned} \begin{aligned} \mu _{{\mathrm {r}}_{\mathrm {c}}}({\mathbf {x}})+n\sigma _{{\mathrm {r}}_{\mathrm {c}}}({\mathbf {x}})<C_\mathrm {g} \end{aligned} \end{aligned}$$
(20)

and when there is a lower bound constraint:

$$\begin{aligned} \begin{aligned} \mu _{{\mathrm {r}}_{\mathrm {c}}}({\mathbf {x}})-n\sigma _{{\mathrm {r}}_{\mathrm {c}}}({\mathbf {x}})>C_\mathrm {g} \end{aligned} \end{aligned}$$
(21)

where the choice of n is related to the required reliability level of a process. For a robustness measure a combination of mean and standard deviation is generally used such as the method presented in Koch et al. (2004):

$$\begin{aligned} \begin{aligned} \min \ \Bigg (\Big (\mu _{\mathrm {r}}({\mathbf {x}})-C_{\mathrm {r}}\Big )^2+w\sigma _{\mathrm {r}}^2({\mathbf {x}})\Bigg ) \end{aligned} \end{aligned}$$
(22)

where \(C_{\mathrm {r}}\) is the target value. In this equation, w is a weighting factor to adjust the optimization objective function between ’mean on target’ and ’minimal response variation’. For a multimodal response, the response variability and reliability constraints must be calculated differently. In this work, U and L are defined to correct the \(n\sigma\) method considering the upper tail and lower tail shift for a multimodal response as shown in Fig. 10 schematically. This figure shows the PDF and CDF of a dataset which follows a bimodal distribution. The dashed line represents a normal distribution fit and it is apparent that fitting a normal distribution to a dataset which follows a bimodal distribution leads to a big deviation in the prediction of the tails of the distribution, and therefore, it causes an overestimation or an underestimation of the variability and the reliability of the satisfaction of constraints.

Fig. 10
figure 10

Schematic of shift of the tails for a bimodal distribution compared to equivalent normal distribution (a) and the difference between cumulative distribution of equivalent normal and bimodal probability distribution functions (b)

Based on the this explanation, U and L are used in evaluation of the the constraints. In case of upper bound:

$$\begin{aligned} \begin{aligned} \mu _{{\mathrm {r}}_{\mathrm {c}}}({\mathbf {x}})+U(\varvec{\mu }_{{\mathrm {r}}_{\mathrm {c}}}, \varvec{\sigma }_{{\mathrm {r}}_{\mathrm {c}}},\varvec{\omega }_{{\mathrm {r}}_{\mathrm {c}}};n) \sigma _{{\mathrm {r}}_{\mathrm {c}}}({\mathbf {x}})<C_\mathrm {g} \end{aligned} \end{aligned}$$
(23)

and in case of lower bound:

$$\begin{aligned} \begin{aligned} \mu _{{\mathrm {r}}_{\mathrm {c}}}({\mathbf {x}})+L(\varvec{\mu }_{{\mathrm {r}}_{\mathrm {c}}}, \varvec{\sigma }_{{\mathrm {r}}_{\mathrm {c}}},\varvec{\omega }_{{\mathrm {r}}_{\mathrm {c}}};n) \sigma _{{\mathrm {r}}_{\mathrm {c}}}({\mathbf {x}})>C_\mathrm {g} \end{aligned} \end{aligned}$$
(24)

In these equations L and U are the correction parameters to \(n\sigma\), that can be calculated using:

$$\begin{aligned} \begin{aligned} L^{n\sigma }&=({\mathrm {CDF}}^{\mathcal {B}})^{-1}\big (\varvec{\mu }_{\mathrm {r}}, \varvec{\sigma }_{\mathrm {r}},\varvec{\omega };{\mathrm {CDF}}^{{\mathcal {N}}}(-n\sigma )\big )\\ U^{n\sigma }&=({\mathrm {CDF}}^{\mathcal {B}})^{-1}\big (\varvec{\mu }_{\mathrm {r}}, \varvec{\sigma }_{\mathrm {r}},\varvec{\omega };{\mathrm {CDF}}^{{\mathcal {N}}}(n\sigma )\big ) \end{aligned} \end{aligned}$$
(25)

When, for example, a reliability according to a \(3\sigma\) process is required:

$$\begin{aligned} \begin{aligned} L^{3\sigma }&=({\mathrm {CDF}}^{\mathcal {B}})^{-1}\big (\varvec{\mu }_{\mathrm {r}}, \varvec{\sigma }_{\mathrm {r}},\varvec{\omega };{\mathrm {CDF}}^{{\mathcal {N}}}(-3\sigma )\big )\\ U^{3\sigma }&=({\mathrm {CDF}}^{\mathcal {B}})^{-1}\big (\varvec{\mu }_{\mathrm {r}}, \varvec{\sigma }_{\mathrm {r}},\varvec{\omega };{\mathrm {CDF}}^{{\mathcal {N}}}(3\sigma )\big ) \end{aligned} \end{aligned}$$
(26)

Similar to constraints, the robustness measure is obtained by:

$$\begin{aligned} \begin{aligned}&\Big (\mu _r({\mathbf {x}})+\frac{U(\varvec{\mu }_{\mathrm {r}},\varvec{\sigma }_{\mathrm {r}}, \varvec{\omega };w)+L(\varvec{\mu }_{\mathrm {r}},\varvec{\sigma }_{\mathrm {r}}, \varvec{\omega };w)}{2}\sigma _r({\mathbf {x}})-C_r\Big )^2\\&\quad +\frac{U(\varvec{\mu }_{\mathrm {r}},\varvec{\sigma }_{\mathrm {r}}, \varvec{\omega };w)-L(\varvec{\mu }_{\mathrm {r}},\varvec{\sigma }_{\mathrm {r}}, \varvec{\omega };w)}{2}\sigma _r^2({\mathbf {x}}) \end{aligned} \end{aligned}$$
(27)

During the robust optimization procedure, after the modes of response are evaluated by uncertainty propagation, U and L are implicitly calculated to evaluate the robustness and to assess the satisfaction of constraints.

As mentioned earlier, this method can handle the special cases in which the modes of response are overlapping. For example, for two modes of response that have the same mean and standard deviation and consequently a normally-distributed response is expected, \(L=-n\) and \(U=+n\). In this case the robustness measure reduces to (22).

4 Results and discussion

4.1 Basketball free throw in windy weather condition

PCA is performed on the dataset presented in section 2.1 to describe the noise using independent PCs. To decide whether a PC should be considered a unimodal normal distribution or a bimodal normal distribution, the 5 parameters of the bimodal distribution in Eq. (17) are fitted to the cumulative probability distribution of each PC. Then a unimodality test is performed according to Eisenberger (1964) by evaluating the separation of the two components of the bimodal distribution:

$$\begin{aligned} \begin{aligned} \varDelta _q&=\mu _q^{[2]}-\mu _q^{[1]}\\ v_q&=\frac{27(\sigma ^{[1]}_q)^2(\sigma _q^{[2]})^2}{4\Big ((\sigma _q^{[1]})^2+(\sigma _q^{[2]})^2\Big )}\\&\quad \mathrm {if }\, \varDelta _q^2<v_q \rightarrow p(z) \sim {\mathcal {N}}(z) \end{aligned} \end{aligned}$$
(28)

where, \(\varDelta _q\) is the difference between means of two modes of the distribution and \(v_q\) is a function of standard deviations of two modes of the distribution. This equation is used to decide whether the input is unimodal or bimodal. The values of \(\varDelta ^2\) and v and the fitted curves to the projection of experiments on the PCs are shown in Fig. 11.

Fig. 11
figure 11

Variation of input after PCA

A DOE is then generated using Latin hypercube sampling (LHS). Kriging method is used to build the metamodel using simulation results on the DOE points. A simulation of basketball free throw gives the final coordinates of the center of the ball in xy plane at the height of hoop (\(x_{f},y_{f};z=3.05\)) as well as the angle of approach towards the hoop, \(\beta _{xy}\). As shown in Fig. 12 the angle of approach towards the hoop and the distance from center of hoop are two main parameters that determine the success of the throw (Brancazio 1981). A higher throw results in a higher angle of approach and therefore a wider range of success. On the other hand, a higher throw increases the effect of the wind and therefore results in more scatter of the final coordinates. It is expected that a strategy can be found that maximizes the number of successful throws.

Fig. 12
figure 12

Side and top views of the hoop and the area of successful approach

The simulation results are combined in one output parameter, p, that measures the success of the throw:

$$\begin{aligned} \begin{aligned} p=\frac{(x_f-3)^2}{d_x^2}+\frac{(y_f)^2}{d_y^2} \end{aligned} \end{aligned}$$
(29)

where:

$$\begin{aligned} \begin{aligned} d_x&=D-\frac{2r}{sin(\beta _{xz})}\\ d_y&=D-\frac{2r}{sin(\beta _{xy})} \end{aligned} \end{aligned}$$
(30)

In these equations \(d_x\) and \(d_y\) determine the range of successful approaches to the hoop and they depend on the angle of approach towards the hoop as shown in Fig. 12 (Brancazio 1981). The lower the value for p the more the success of the throw. The goal of optimization is to obtain a setting of design parameters in which p has minimum value and variation in the windy situation to guarantee the highest rate of success.

Two cases are investigated. In the first case, a normal distribution is fitted to the noise input and Eq. (22) is used for robust optimziation. In the second case, a bimodal normal distribution is fitted to the noise input and Eq. (27) is used as robustness criterion. The results of optimization leads to a design setting in which the variation of response to the noise originated by wind is minimized. Table 3 shows the comparison between the results obtained in the two cases. This table shows that there is a small change in the values of optimum design parameters. To show the effectiveness of the new method and validate the results, direct MC using the raw wind data is performed at each robust design setting and the final coordinates of the basketball landing at the height of the hoop are plotted in Fig. 13. To compare the success rate at the obtained optimum in two cases, the number of center points within the hoop and the number of center points within the \(d_x\) and \(d_y\) region (dotted ellipsoid) are counted and divided to the total number of throws to obtain \(s_1\) and \(s_2\). These numbers which indicate the success rate are shown in Fig. 13. It is shown that using the new method improves the success rate more than \(3\%\). Although this improvement in this example seems marginal, in industrial cases smallest improvement helps to achieve zero-defect manufacturing, to cut costs, and to increase efficiency.

Table 3 Optimum design parameters in basketball free throw
Fig. 13
figure 13

Final coordinates of the center of basketball (top view of the hoop) a for case 1 and b for case 2

It must be noted that the accuracy of the metamodel built using sampling techniques has influence on the accuracy of the obtained optimum. To consider the metamodel uncertainty, an objective switching criterion can be used to investigate whether the robust status of the optimum can be changed because of the interpolation uncertainties or not (Zhou et al. 2018a). In order to further improve the accuracy of metamodel in finding the robust optimum, iterative improvement of the metamodel can be performed by adding new infill points adaptively (Jurecka et al. 2007) and the framework presented in this paper can be combined with any iterative robust optimization techniques.

4.2 Forming of a B-pillar automotive component

PCA is performed on the correlated data presented in Sect. 2.2.1 and the distributions of projection of experiments on the PCs are shown in Fig. 14. It can be seen that some of the distributions deviate significantly from a normal distribution. As the number of measurements is not very large in this example, each distribution is described using either a unimodal or a bimodal normal distribution. If one is willing to use more than two modes of the normal distribution to describe the noise, the method described in Sect. 3.2.2 is still applicable. The probability distributions are shown in Fig. 14.

Fig. 14
figure 14

Variation of material parameters after PCA

A screening step is performed for dimension reduction. A screening DOE is made on the PC space using a fractional factorial design and the results of FE simulations on those points are evaluated. Figure 15 shows the Pareto charts of the responses. As can be seen in this figure, most PCs have a major contribution to at least one of the responses. PC1, PC4, and PC5, are among the least important ones. Considering the influence of PCs on both parameters, PC4 is eliminated because of having a total contribution of less than \(10\%\) for both responses, thus over \(90\%\) control on the process is retained. Eliminating one more PC (e.g. PC1 or PC 5) is not recommended as the total control over the responses will fall below \(80\%\) (Wiebenga et al. 2014).

Fig. 15
figure 15

Pareto charts for final profile angle and thinning

The distributions of the projection of the experiments on the PCs, shown in Fig. 14, are used after eliminating PC4. The remaining 7 PCs that represent the noise space are used for generating the metamodel of the process. Table 4 shows the parameters of the noise distribution according to Fig. 14. PC1, PC3 and PC6 are considered as bimodal distributions and the other PCs are assumed to follow a normal distribution. The PCs that are described using a normal distribution have a zero mean due to subtracting their mean values; and they have the same standard deviation due to normalization. For comparison purpose, the results of the robust optimization assuming that all noise variables are normally distributed is also presented.

Table 4 The variation of principal components used in robust optimization

For robust optimization, the Latin hypercube sampling (LHS) technique is used to create a DOE in combined design-PC space. Maximin method is used to maximize the minimum distance between the points to ensure an optimum sampling in the space. The responses, \(\theta\) and \(\delta\), are calculated using FE simulations which are considered as a black-box function.

The FE model of this process is built in AutoForm R7 software. The sheet is discretized using Elasto-Plastic Shell elements that have 11 integration points in the thickness direction (EPS-11). This model was experimentally validated in a previous work (Abspoel et al. 2016). Then, a Kriging metamodel is built using the response values obtained from FE simulations to account for non-linear relationships between input and response (Bonte et al. 2008). The metamodel of this process is used for robust optimization including non-normal input. Sequential quadratic programming (SQP) is used to search the robust design. A multi-start method is implemented to ensure convergence towards a global optimum. The Kriging metamodel is built on the 9-dimensional space comprising two design variables and 7 PCs that represent the noise space. The analytical method presented in Sects. 3.2.1 and 3.2.2 is employed to propagate the independent noise variables and to find the response modes. The goal of the optimization problem is to minimize the variation of \(\theta\), \(\sigma _{\uptheta}\), while setting the mean, \(\mu _{\theta }\), on a given target value, \(C_{\uptheta}\). Moreover, there is a constraint on \(\delta\). The first optimization case is defined by:

$$\begin{aligned} \begin{aligned}&\underset{{\mathbf {x}}}{\mathrm {minimize}}\qquad \qquad \Bigg ((\mu _{\uptheta}({\mathbf {x}})-C_{\uptheta}\big )^2 +w\sigma _{\uptheta}^2({\mathbf {x}})\Bigg )\\&\begin{aligned} \mathrm {subject\ to} \qquad \qquad&\mu _{\updelta}({\mathbf {x}})+n\sigma _{\updelta}({\mathbf {x}})-C_{\updelta} \le 0\\&{\mathbf {lb}\le {\mathbf {x}} \le \mathbf {ub}}&\\&{\mathbf {z}}\sim {{\mathcal {N}}}({\mathbf {z}})\\&C_{\uptheta}=22^\circ \\&C_{\updelta}=15\%\\ \end{aligned} \end{aligned} \end{aligned}$$
(31)

In the second case:

$$\begin{aligned} \begin{aligned}&\underset{{\mathbf {x}}}{\mathrm {minimize}}\qquad \qquad \Bigg (\Big (\mu _{\uptheta}({\mathbf {x}})+\frac{U(\varvec{\mu }_{\uptheta}, \varvec{\sigma }_{\uptheta},\varvec{\omega };w)+L(\varvec{\mu }_{\uptheta}, \varvec{\sigma }_{\uptheta},\varvec{\omega };w)}{2} \sigma _\theta ({\mathbf {x}})-C_{\uptheta}\Big )^2\\&\qquad \qquad \qquad \qquad \qquad \qquad +\frac{U(\varvec{\mu }_{\uptheta},\varvec{\sigma }_{\uptheta}, \varvec{\omega };w)-L(\varvec{\mu }_{\uptheta},\varvec{\sigma }_{\uptheta}, \varvec{\omega };w)}{2}\sigma _{\uptheta}^2({\mathbf {x}})\Bigg )\\ {}&\begin{aligned} \mathrm {subject\ to} \qquad \qquad&\mu _{\updelta} ({\mathbf {x}})+U(\varvec{\mu }_{\updelta},\varvec{\sigma }_{\updelta}, \varvec{\omega };w)\sigma _{\updelta}({\mathbf {x}})-C_{\updelta} \le 0\\&{\mathbf {lb}\le {\mathbf {x}} \le \mathbf {ub}}&\\&{\mathbf {z}}\sim {{\mathcal {B}}}({\mathbf {z}}) , {{\mathcal {N}}}({\mathbf {z}})\\&C_{\uptheta}=22^\circ \\&C_{\updelta}=15\%\\ \end{aligned} \end{aligned} \end{aligned}$$
(32)

While calculating objective function value and constraints, L and U are evaluated implicitly during each function evaluation.

According to Table 4, three input variables are considered to have bimodal distribution and therefore, 8 modes in the responses are expected as explained in Sect. 3.2.2. Figure 16 shows the modes of each response multiplied by its respective weight as a result of considering PC1, PC3 and PC6 as bimodal distributions. It is apparent from this figure that each mode of both responses has a different contribution to the objective of the optimization. Some modes have bigger variations while others have larger deviation from the target values. The magnitude of the contribution of each mode is also different due to different weights. Figure 17 shows the overall PDF and CDF of \(\theta\) and \(\delta\) considering the bimodal input expressed in Table 4. It must be noted, that the resultant distribution for bimodal case, which is plotted in Fig. 17, is the superposition of the 8 modes of the response. Although the superposition of those 8 modes resembles a normal distribution due to interfering modes, the resulting distribution has clear deviations from a normal distribution, specifically at the tails of the distribution. This highlights the importance of accounting for the bimodal input to increase the accuracy of constrained robust optimization.

Fig. 16
figure 16

The modes of each response

Fig. 17
figure 17

The comparison of response probability and cumulative distribution considering normally-distributed noise and conventional objective function and constraints (Case 1) and considering bimodality of noise with reformulated objective function and constraints (Case 2)

As mentioned earlier, considering the effect of non-normal input and the resulting output, can significantly affect the objective of the robust optimization and the evaluation of constraints. It is apparent from Fig. 17, that considering all modes of input has a significant influence on prediction of the mean and variation of main response which are used to evaluate robustness. Additionally, the tail of the distribution of \(\delta\), which affects the satisfaction of constraint, changes and this leads to a different robust optimum design. Consequently, errors in the search for robust optimum design is expected. Table 5 compares the robust optimum obtained for bimodal and the equivalent normal inputs. The robust design obtained using a bimodal description of input has \(~2.6\%\) deviation in R and \(~1.6\%\) in \(F_{bh}\) compared to normal distribution case. This slight adjustment of the design variables is a result of a more realistic material description and a modified definition of robustness criterion and reliability constraints.

Table 5 Robust optimum design for different cases

It is notable that instead of analyzing the bimodality of the response, the equivalent moments of the multimodal response could be calculated using the method described in Cohen (1967) and then used to evaluate the robustness and evaluation of constraints as discussed in Nejadseyfi et al. (2019b). In that case, the higher order moments of each response can be used to analyze the robustness and to evaluate the constraints. Nevertheless, the choice for a bimodal input in this example has a physical reason as the materials obtained from each production line follow a specific probabilistic characteristic.

5 Conclusions

In this work a framework is presented to perform robust optimization using real noise data. Principal component analysis is employed to describe the variation of input parameters using linearly uncorrelated principal components. It is shown that some of the principal components deviate significantly from a normal probability distribution. In that case, a bimodal normal probability distribution is considered and propagated using an analytical method through the metamodel of the process. Consequently, a multimodal response is obtained. Robustness criterion and the reliability constraints are altered to achieve an accurate description of the reliability of constraint response and the main response variability. The results are validated and compared using two examples, a basketball free throw problem and an industrial forming process. It is shown that using a non-normal description of scatter and reformulating the variability and reliability constraints, helps to further improve the robust optimization procedure.