Evaluation and assessment of non-normal output during robust optimization

A robustness criterion that employs the skewness of the output is presented for metamodel-based robust optimization. The propagation of a normally distributed noise variable through a nonlinear function leads to a non-normal output distribution. To account for the non-normality of the output, a skew-normal distribution is used. The mean, standard deviation, and skewness of the output are calculated using an analytical approach. To show the applicability of the proposed method, a metal forming process is optimized. The optimization is defined by an objective and a constraint, both of which are nonlinear. A Kriging metamodel is used as a nonlinear model of the forming process. It is shown that the new robustness criterion is effective at reducing the output variability. Additionally, the results demonstrate that taking the skewness of the output into account helps to satisfy the constraints at the desired level accurately.


Introduction
Optimization in the presence of uncertainty of input parameters has become a common exercise in various fields of study (Picheny et al. 2017; Zhou et al. 2018; Marton et al. 2015; Yazdi 2017). When considering variability in input parameters, one makes sure that the output meets requirements even under the influence of scatter in the input. Robust optimization is one of the methods used to minimize the variation of the output originating from the scatter of input parameters (Bertsimas et al. 2011). In this approach, the input parameters are categorized as design parameters (x) and noise parameters (z). Design parameters can be adjusted during the process, whereas noise parameters are those that are difficult or impossible to control. The goal is to find a design point at which the output is least sensitive to the variation of the noise variables and the output mean is closest to the target value. Moreover, the constraints must be handled properly because of the input uncertainties (Park et al. 2006).

The main steps of a metamodel-based robust optimization are shown schematically in Fig. 1. The first step is to design a set of experiments (DOE) considering both design and noise input variables. Several methods exist, for example factorial design, central composite, space-filling, and Latin hypercube designs (Cavazzuti 2013). The expensive black-box function is then evaluated at each sample point (step 1). In the next step, a metamodel is built using the discrete responses. Then, the output mean and standard deviation are evaluated using that metamodel (Vu et al. 2017). One can choose from several methods of metamodeling, such as regression models, Kriging models, or radial basis functions (Luo and Lu 2014). Evaluation of noise propagation through a metamodel is more efficient than through direct evaluations of an expensive black-box function.
To evaluate noise propagation through a metamodel, various techniques can be used such as Monte Carlo (MC) sampling, Taylor series approximation, or analytical integration (Heijungs and Lenzen 2014).
The objective function value is then minimized while satisfying the constraints using an optimization algorithm (step 3). The robustness measure is expressed as a function of the output statistics:

f(x) = F(μ_r(x), σ_r(x)),    (1)

while the nonlinear constraints are formulated as:

g(x) = G(μ_c(x), σ_c(x)),    (2)

where μ_r and μ_c are the means of the response of the objective and constraints, and σ_r and σ_c are their standard deviations. The robustness measure quantifies how sensitive the process is to the noise parameters, z, at every design point, x, while the constraints are used to assess the statistics of feasibility of a design. The optimization problem is to minimize f(x) while g(x) < C_g. For this purpose, sequential quadratic programming or another constrained nonlinear optimization algorithm can be employed (Baginski et al. 2005).
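Steps 2 and 3 can be illustrated with a small Monte Carlo sketch: Gaussian noise is propagated through an inexpensive surrogate at a fixed design point, and a mean-on-target robustness measure plus an n-sigma constraint check are evaluated. The `toy_response` function and all numerical values here are hypothetical stand-ins for a fitted metamodel, not the process studied in this article:

```python
import numpy as np

def mc_robustness(response, x, z_mean, z_std, C_r, C_g, w=1.0, n_sigma=3,
                  n_samples=100_000, rng=None):
    """Monte Carlo estimate of a robustness measure and a sigma-level
    constraint at a fixed design point x.

    `response` maps (x, z-samples) -> (objective output, constraint output);
    it stands in for an (expensive) metamodel evaluation.
    """
    rng = np.random.default_rng(rng)
    z = rng.normal(z_mean, z_std, size=(n_samples, len(z_mean)))
    r, c = response(x, z)
    mu_r, sigma_r = r.mean(), r.std()
    mu_c, sigma_c = c.mean(), c.std()
    # Mean-on-target measure plus weighted spread (one common form).
    f = abs(mu_r - C_r) + w * sigma_r
    # Upper-bound sigma-level constraint, assuming a normal output.
    feasible = mu_c + n_sigma * sigma_c <= C_g
    return f, feasible

# Hypothetical nonlinear "process": quadratic in the noise variables.
def toy_response(x, z):
    r = x[0] + x[1] * z[:, 0] + 0.5 * z[:, 1] ** 2
    c = 0.1 * x[1] + 0.05 * z[:, 0] ** 2
    return r, c

f, feasible = mc_robustness(toy_response, x=[1.0, 2.0],
                            z_mean=np.zeros(2), z_std=np.ones(2),
                            C_r=1.5, C_g=1.0, rng=0)
```

Note that the quadratic noise term already produces a skewed output here, which is exactly the situation the article addresses.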
In the literature, the propagation of uncertainty is mainly performed using MC sampling of a Gaussian distribution (Zhou et al. 2018; Putko et al. 2002; Martinelli and Duvigneau 2010). The response probability distribution after noise propagation is also often taken to be normal (Havinga et al. 2017; Zhao et al. 2015; Cui et al. 2014; Du et al. 2008). This assumption only holds if the relation between input and output is linear, as shown in Fig. 2a. A nonlinear relation between input and output leads to a non-normal distribution. For instance, concavity or convexity of the function may lead to an asymmetrical distribution of the output. This effect is shown in Fig. 2b. Additionally, a highly nonlinear response gives rise to higher-order moments of the output (Mekid and Vaja 2008), as shown schematically in Fig. 2c.
Both the constraints and the robustness measure are usually defined using the mean and standard deviation of the output. Considering only the mean and standard deviation induces large errors for a non-normal output distribution. There have been attempts to capture the non-normality of the output with various methods and to include it in optimization under uncertainty. For instance, Anderson and Mattson (2012) used first-order and second-order Taylor series expansions to approximate the propagation of skewness and kurtosis through engineering models. Zhang (2017) developed a method to include high-order moments of the output distribution for reliability analysis in the method of moments, and reported a higher accuracy in predicting the reliability index when higher-order moments are used. Mekid and Vaja (2008) performed higher-order uncertainty propagation using Taylor expansions and compared their approach with other methods, including MC sampling from a Gaussian input distribution. They concluded that the necessity of considering higher-order moments depends on the nonlinearity of the response and other input factors.
In this article, we introduce an approach for robust optimization that takes into account the skewness of the output. The skewness is used in the evaluation of the robustness measure and the constraints. An existing analytical method (Nejadseyfi et al. 2017, 2018) is extended to calculate the mean, standard deviation, and skewness of the output by propagating Gaussian input noise through a Kriging metamodel. The analytical method is used instead of MC sampling from a Gaussian distribution: it is faster and more accurate, and it improves the efficiency of the optimization. This article is organized as follows: Section 2 describes the analytical method to find the mean, standard deviation, and skewness of the output, based on the propagation of a normally distributed input through a nonlinear metamodel. Section 3 introduces the new robustness measure and the method to apply the constraints at a desired level of reliability. Sections 4.1 and 4.2 discuss the finite element (FE) simulation of a stretch-bending process as a case study. In Section 4.3, the optimization problem is formulated on the basis of the new approach. The results are presented and discussed in Section 5.

Analytical method for uncertainty propagation
Assume a process with an M-dimensional input space and one output. The M inputs consist of m design variables, x, and n noise parameters, z (M = m + n). A response, r, is defined as r = r(v) = r(x, z). Then, the mean, variance, and skewness of the response are expressed by:

μ_r(x) = ∫ r(x, z) p(z) dz,    (3)

σ_r²(x) = ∫ (r(x, z) − μ_r(x))² p(z) dz,    (4)

γ_r(x) = (1/σ_r³(x)) ∫ (r(x, z) − μ_r(x))³ p(z) dz.    (5)

In these equations, p(z) is the probability distribution function of the noise parameters. It is assumed that the noise variables are statistically independent. It has been shown that if r(v) can be expressed as a sum of tensor-product basis functions, the results of univariate integrals can be combined to evaluate multivariate integrals and thereby obtain the output moments (Chen et al. 2005). Multivariate tensor-product basis functions, B_i(v), can be expressed as a product of M univariate basis functions:

B_i(v) = ∏_{P=1}^{M} b_iP(v_P),    (6)

in which b_iP(v_P) is a univariate basis function. If a general function, r(v), can be defined using a linear expansion of these multivariate basis functions, it can also be re-written in terms of univariate basis functions as follows:

r(v) = Σ_{i=1}^{N} a_i B_i(v) = Σ_{i=1}^{N} a_i ∏_{P=1}^{M} b_iP(v_P),    (7)

where N is the number of multivariate basis functions. Many commonly used metamodels, e.g., polynomial regression, Kriging, and Gaussian radial basis functions (RBF), can be expressed using tensor-product basis functions. By substituting (7) into (3), (4), and (5), it is possible to analytically calculate the mean, standard deviation, and skewness, respectively:

μ_r = Σ_{i=1}^{N} a_i ∏_{q=1}^{n} C1_iq,    (8)

σ_r² = Σ_{i=1}^{N} Σ_{j=1}^{N} a_i a_j ∏_{q=1}^{n} C2_ijq − μ_r²,    (9)

γ_r = (Σ_{i=1}^{N} Σ_{j=1}^{N} Σ_{k=1}^{N} a_i a_j a_k ∏_{q=1}^{n} C3_ijkq − 3 μ_r σ_r² − μ_r³) / σ_r³,    (10)

where the univariate basis functions in the design dimensions are evaluated at the design point x and absorbed into the coefficients a_i, so that the products run over the n noise dimensions only. In these equations, C1_iq, C2_ijq, and C3_ijkq depend on the choice of metamodel (basis functions) and input probability distribution:

C1_iq = ∫ b_iq(z_q) p(z_q) dz_q,    (11)

C2_ijq = ∫ b_iq(z_q) b_jq(z_q) p(z_q) dz_q,    (12)

C3_ijkq = ∫ b_iq(z_q) b_jq(z_q) b_kq(z_q) p(z_q) dz_q.    (13)

Once C1_iq, C2_ijq, and C3_ijkq are known for the specific metamodel and probability distribution, the mean, standard deviation, and skewness of the output can be evaluated at each design point. A normal probability distribution is assumed for the noise input:

p(z_q) = (1/(σ_q √(2π))) exp(−(z_q − μ_q)² / (2σ_q²)),    (14)

where μ_q and σ_q are the mean and standard deviation of the input noise.
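The reduction of the moment integrals (3)-(5) to univariate integrals can be checked numerically. The sketch below evaluates the univariate integrals with Gauss-Hermite quadrature for a single Gaussian noise variable; the test function r(z) = z² is a hypothetical example, chosen because its output is chi-square distributed with known mean 1, standard deviation √2, and skewness √8:

```python
import numpy as np

def gh_output_moments(r, mu_z, sigma_z, n_nodes=20):
    """Mean, standard deviation, and skewness of r(z) for z ~ N(mu_z, sigma_z^2),
    evaluating the univariate integrals (3)-(5) by Gauss-Hermite quadrature."""
    nodes, weights = np.polynomial.hermite.hermgauss(n_nodes)
    z = mu_z + np.sqrt(2.0) * sigma_z * nodes   # change of variables for N(mu, sigma^2)
    w = weights / np.sqrt(np.pi)                # normalized quadrature weights
    rv = r(z)
    mu = np.sum(w * rv)                         # eq. (3)
    var = np.sum(w * (rv - mu) ** 2)            # eq. (4)
    skew = np.sum(w * (rv - mu) ** 3) / var ** 1.5   # eq. (5)
    return mu, np.sqrt(var), skew

# z ~ N(0, 1) pushed through r(z) = z^2 gives a chi-square(1) output.
mu, sigma, gamma = gh_output_moments(lambda z: z ** 2, 0.0, 1.0)
# -> mu = 1, sigma = sqrt(2), gamma = sqrt(8) (exact for polynomial r)
```

Because the quadrature is exact for polynomials of this degree, the recovered moments match the chi-square values to machine precision, mirroring the exactness of the analytical approach for tensor-product metamodels.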
For an ordinary Kriging metamodel:

ŷ(v) = α_0 + Σ_{i=1}^{N} κ_i r_i(v),  with  κ = R^{−1}(y_0 − α_0),    (15)

In this equation, v is an untried input point, r(v) is the vector of correlations between v and all observations, R is the correlation matrix, y_0 is the vector of responses at the DOE points, α_0 is a constant, and κ is the vector of weight factors. The basis functions are Gaussian correlation functions:

b_iP(v_P) = exp(−θ_P (v_P − v_iP)²),    (16)

in which θ_P are the parameters obtained from fitting the Kriging metamodel. Inserting (14) and (16) into (11) to (13) leads to closed-form expressions; for example:

C1_iq = (1/√(1 + 2θ_q σ_q²)) exp(−θ_q (μ_q − v_iq)² / (1 + 2θ_q σ_q²)),    (17)

with analogous closed forms for C2_ijq and C3_ijkq in terms of the pairwise and triple-wise centroids of the DOE coordinates. For Gaussian radial basis functions, similar expressions can be derived. Details of the derivation of C1_iq, C2_ijq, and C3_ijkq are given in the Appendix.
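The first of these closed forms, the integral of the Gaussian correlation kernel (16) against the normal noise density (14), can be written down and verified directly. The sketch below implements that integral and cross-checks it by brute-force quadrature; the parameter values in the check are arbitrary illustrations:

```python
import numpy as np

def C1(theta, v_i, mu, sigma):
    """Closed-form integral of exp(-theta*(v - v_i)^2) against N(mu, sigma^2),
    i.e., eq. (11) for the Gaussian kernel (16) and normal density (14)."""
    s = 1.0 + 2.0 * theta * sigma ** 2
    return np.exp(-theta * (mu - v_i) ** 2 / s) / np.sqrt(s)

def C1_numeric(theta, v_i, mu, sigma, n=200_001):
    """Brute-force quadrature of the same integral, for verification."""
    v = np.linspace(mu - 10 * sigma, mu + 10 * sigma, n)
    dv = v[1] - v[0]
    pdf = np.exp(-0.5 * ((v - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    return np.sum(np.exp(-theta * (v - v_i) ** 2) * pdf) * dv
```

The same product-of-Gaussians completion of squares yields C2_ijq and C3_ijkq, with 2θ and 3θ in place of θ and the pairwise or triple-wise centroid in place of v_i.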

Robust optimization including skewness measure
Once the mean, standard deviation, and skewness of the output have been calculated, they can be used to evaluate the robustness measure and the constraints. In this section, a new approach is proposed to include the skewness measure during robust optimization. Even though the preferred method of calculating the output moments in this article is by following an analytical approach, any other method for calculation of noise propagation (e.g., MC) can be used to evaluate those moments and include them during the robust optimization. In this section, first a distribution that can be formulated using the first three moments is introduced. Then, it will be shown how to use this distribution to implement the skewness during robust optimization.

A probability distribution with skewness measure
A skew-normal distribution (SND) is considered with location, scale, and shape parameters (ξ, ω, and λ) and is denoted by φ_D(y; ξ, ω, λ). The subscript D indicates the use of the direct parametrization of Azzalini (1985). The density function is:

φ_D(y; ξ, ω, λ) = (2/ω) φ((y − ξ)/ω) Φ(λ(y − ξ)/ω),    (20)

where φ is the probability density function (PDF) of a standardized normal distribution (ND) and Φ is the cumulative distribution function (CDF) of a standardized ND. This equation can be used to describe the probability distribution; however, the evaluation as given in Section 2 yields so-called centered parameters (μ, σ, γ) rather than location, scale, and shape parameters. Therefore, it is necessary to make a connection between these two sets of parameters. Azzalini (1985) calculated μ, σ, and γ for the above-mentioned distribution:

μ = ξ + ωδ√(2/π),  σ = ω√(1 − 2δ²/π),  γ = ((4 − π)/2) (δ√(2/π))³ / (1 − 2δ²/π)^(3/2),  with  δ = λ/√(1 + λ²).    (21)

By inverting these equations, once the moments of a given distribution are known, the location, scale, and shape parameters can be obtained and used for further calculation (Pérez-Rodríguez et al. 2017):

δ = sign(γ) √((π/2) |γ|^(2/3) / (|γ|^(2/3) + ((4 − π)/2)^(2/3))),  λ = δ/√(1 − δ²),  ω = σ/√(1 − 2δ²/π),  ξ = μ − ωδ√(2/π).    (22)

Equation (20) is used as a mathematical description of the output during the optimization. The skewness parameter of the density function, γ, is confined to the range −0.99527 ≤ γ ≤ 0.99527, whereas the skewness of the output, γ_1, based on (10) can vary in −∞ < γ_1 < ∞. Therefore, a connection is needed between these two parameters. Since both γ → 0.99527 in the mathematical description and γ_1 → ∞ in (10) lead to a half-normal distribution, a saturating mapping from γ_1 to γ is used that follows the identity for small skewness and approaches ±0.99527 as γ_1 → ±∞. Figure 3 shows SNDs for various γ values. For all these distributions, μ = 0 and σ = 1, while the left and right tails shift due to the change in γ. The impact of these tail shifts on satisfying constraints, the reliability of a process, and robustness is discussed in the following section.
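The conversion between the centered parameters (μ, σ, γ) delivered by Section 2 and the direct parameters (ξ, ω, λ) of the density can be sketched as a pair of functions, using Azzalini's moment formulas and their inversion; this is a minimal self-contained implementation, not the authors' code:

```python
import math

def direct_to_centered(xi, omega, lam):
    """Moments (mu, sigma, gamma) of SN(xi, omega, lam), Azzalini (1985)."""
    delta = lam / math.sqrt(1.0 + lam ** 2)
    b = math.sqrt(2.0 / math.pi)
    u = b * delta
    mu = xi + omega * u
    sigma = omega * math.sqrt(1.0 - u ** 2)
    gamma = (4.0 - math.pi) / 2.0 * u ** 3 / (1.0 - u ** 2) ** 1.5
    return mu, sigma, gamma

def centered_to_direct(mu, sigma, gamma):
    """Invert the moment equations; requires |gamma| < ~0.99527."""
    b = math.sqrt(2.0 / math.pi)
    R = (2.0 * abs(gamma) / (4.0 - math.pi)) ** (2.0 / 3.0)
    u = math.copysign(math.sqrt(R / (1.0 + R)), gamma)   # u = delta*sqrt(2/pi)
    delta = u / b
    lam = delta / math.sqrt(1.0 - delta ** 2)
    omega = sigma / math.sqrt(1.0 - u ** 2)
    xi = mu - omega * u
    return xi, omega, lam
```

A round trip through both functions recovers the original (μ, σ, γ), and for γ = 0 the conversion collapses to the normal case (ξ = μ, ω = σ, λ = 0), as expected.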

Including skewness measure in robustness measure and constraints
In a design for a six sigma or three sigma process, the assumption is that the output is normally distributed (Koch et al. 2004). Most moment-based reliability analyses are also based on a normal output distribution. In a robust optimization approach, the handling of constraints is based on requirements on the reliability of constraint satisfaction. For an n sigma process, the reliability is estimated via integration of the PDF of an ND from −nσ to nσ (i.e., Φ(n) − Φ(−n)). This calculation is only valid if the output follows a normal distribution. If the output is not normally distributed, the reliability changes due to the shift of the tails of the distribution, as shown in Fig. 3. Therefore, the reliability obtained is no longer valid. Based on the definitions in (2), the output constraints (sigma-level quality constraints or reliability constraints) are generally defined by (24) if there is an upper bound:

g(x) = μ_c(x) + n σ_c(x) ≤ C_g,    (24)

and in the case of a lower bound:

g(x) = μ_c(x) − n σ_c(x) ≥ C_g.    (25)

For a robustness measure that minimizes the variation of the output in addition to the difference between mean and target value, C_r, several expressions are proposed in the literature, such as (Koch et al. 2004):

f(x) = (μ_r(x) − C_r)² + w σ_r²(x),    (26)

or (Havinga et al. 2017; Wiebenga et al. 2012):

f(x) = |μ_r(x) − C_r| + w n σ_r(x).    (27)

In (24) and (25), the choice of n is related to the desired reliability level for a normal distribution, while in (26) and (27), w is a weighting factor to adjust the optimization objective between mean-on-target and output variation. As mentioned previously, the output might not follow a normal distribution, and therefore the constraints and the robustness measure must be redefined properly. We introduce parameters U and L to account for the upper-tail and lower-tail shifts, as shown in Fig. 4.
To account for skewness in the evaluation of upper-bound constraints, one can use:

g(x) = μ_c(x) + U(γ; n) σ_c(x) ≤ C_g,    (28)

and in the case of a lower bound:

g(x) = μ_c(x) + L(γ; n) σ_c(x) ≥ C_g.    (29)

In these equations, L and U are parameters that are a function of the skewness for a desired reliability level of n sigma. For a normal distribution, L = −n and U = +n, meaning that (28) and (29) reduce to (24) and (25). For the robustness measure, a similar approach is used to include the effect of the shift of the tails. By modifying (27):

f(x) = |μ_r(x) + ((U(γ; n) + L(γ; n))/2) σ_r(x) − C_r| + w ((U(γ; n) − L(γ; n))/2) σ_r(x).    (30)

It can be seen that for an ND, the robustness measure reduces to (27). As shown in Fig. 4, the new criterion sets μ_r(x) + ((U(γ; n) + L(γ; n))/2) σ_r(x) on target and minimizes ((U(γ; n) − L(γ; n))/2) σ_r(x). Now the aim is to obtain the L and U parameters for various skewed distributions. We introduce a method to obtain L and U in Lσ and Uσ while maintaining the same level of reliability as n sigma in the case of a normal distribution. To achieve this, the quantiles of the SND that lead to the same reliability as n sigma for an ND are approximated. When, for example, a reliability of 3 sigma is required, L is defined as the ∼0.0013 quantile and U as the ∼0.9987 quantile of the cumulative SND. It is self-evident that for γ = 0, L = −3 and U = +3. As γ increases, L and U change asymmetrically. To illustrate the approach to finding L and U in detail, Fig. 5 shows the CDFs of four SNDs for various γ parameters together with the CDF of an ND. A horizontal line on this graph represents a specific reliability level or a specific quantile. A dashed line at the 0.0013 quantile shown in Fig. 5 intersects the ND at l_1 = −3. The intersection of that line with the CDFs of the SNDs is at different points, namely l_2 to l_5. Similarly, for the 0.9987 quantile, the dashed line intersects the SND CDFs at u_2 to u_5. L and U are calculated for various γ values in the case of a 3 sigma reliability approach and the results are plotted in Fig. 6. The points l_1 to l_5 and u_1 to u_5 from Fig. 5 are also shown in Fig. 6. For other n sigma approaches, Fig. 7 shows L and U as a function of γ. For a negative γ, due to symmetry, one can use:

L(−γ; n) = −U(γ; n),  U(−γ; n) = −L(γ; n).    (31)

To make these curves applicable in a robust optimization routine, in which the robustness measure and constraints need to be evaluated many times, third-order polynomials were fit to the curves. Table 1 shows the fitting parameters for 3 sigma and 6 sigma reliability levels. During the robust optimization procedure, one can calculate γ using the analytical method or by MC, obtain L and U from the formulas listed in Table 1, and then evaluate the robustness measure and the satisfaction of constraints to find the robust optimum while considering the non-normality of the output.
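The quantile-matching construction of L and U can be sketched numerically: build the standardized SND for a given skewness, compute its CDF, and read off the quantiles that carry the same tail probability as ±n sigma for a normal distribution. This is an illustrative reimplementation by numerical integration, not the polynomial fits of Table 1:

```python
import math
import numpy as np

def standardized_snd_params(gamma):
    """Direct parameters (xi, omega, lam) of an SND with mean 0,
    standard deviation 1, and skewness gamma (|gamma| < 0.99527)."""
    R = (2.0 * abs(gamma) / (4.0 - math.pi)) ** (2.0 / 3.0)
    u = math.copysign(math.sqrt(R / (1.0 + R)), gamma)   # u = delta*sqrt(2/pi)
    delta = u * math.sqrt(math.pi / 2.0)
    lam = delta / math.sqrt(1.0 - delta ** 2)
    omega = 1.0 / math.sqrt(1.0 - u ** 2)
    return -omega * u, omega, lam

def tail_factors(gamma, n=3):
    """L and U such that [L, U] on the standardized SND carries the same
    probability mass as [-n, +n] does for a normal distribution."""
    xi, omega, lam = standardized_snd_params(gamma)
    y = np.linspace(xi - 12.0 * omega, xi + 12.0 * omega, 200_001)
    t = (y - xi) / omega
    phi = np.exp(-0.5 * t ** 2) / math.sqrt(2.0 * math.pi)
    big_phi = np.array([0.5 * (1.0 + math.erf(lam * ti / math.sqrt(2.0)))
                        for ti in t])
    pdf = (2.0 / omega) * phi * big_phi          # skew-normal density (20)
    cdf = np.cumsum(pdf) * (y[1] - y[0])         # numerical CDF
    p_lo = 0.5 * (1.0 + math.erf(-n / math.sqrt(2.0)))   # ~0.00135 for n = 3
    return float(np.interp(p_lo, cdf, y)), float(np.interp(1.0 - p_lo, cdf, y))
```

For γ = 0 this recovers L = −3 and U = +3; for a positive γ the lower tail contracts (L rises above −3) while the upper tail stretches (U exceeds +3), matching the asymmetry shown in Figs. 6 and 7.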

A case study: stretch-bending of dual-phase steel sheet
In this section, a demonstration case is presented to investigate the significance of accounting for the skewness of the output during robust optimization. The stretch-bending process of DP800 steel sheet is chosen. During this process, a steel sheet is simultaneously stretched and bent to a desired curvature. After bending of the sheet material, due to elastic spring-back, the final bend angle will differ from the applied angle. The amount of spring-back depends on the mechanical properties of the sheet material. By applying simultaneous stretching, the amount of spring-back may be diminished and the accuracy of the process, in terms of the deviation of the final bend angle from the desired one, can be improved. On the other hand, the applied stretching causes a thickness reduction of the sheet, which should be limited. To find the robust optimum, normally distributed noise variables are defined for the material morphology and the sheet thickness. The resulting material model and the stretch-bending process are nonlinear and give rise to non-normally distributed results.

The finite element model
A one-dimensional finite element model, as shown in Fig. 8, is used to simulate this process. In this model, each through-thickness element has two nodes in the Z direction and five degrees of freedom. The nodal displacement increments for one element are shown in Fig. 8b as du_1 and du_2; these are the two degrees of freedom that are used to model the through-thickness strain (dε_ZZ). Additionally, three other degrees of freedom are specified: the curvature (κ) and the membrane strains in the circumferential (ε_m) and transverse (ε_t) directions. The local strain increments are then determined from these generalized degrees of freedom.

A noisy material model
To model the material behavior, a micro/macro approach is implemented. In this approach, the rule of mixtures is applied to a representative volume element (RVE) and the overall properties are obtained by weighted averaging of the respective properties of the constituent phases in the material. This method is appropriate for composites or materials consisting of more than one phase, e.g., dual-phase (DP) steels. The method was developed by, among others, Hill (1965) and Hashin and Shtrikman (1963), and is based on the inclusion model proposed by Eshelby (1957); for an extensive treatment, see Mura (1987). To find the average properties, the Lielens interpolation method (Hori and Nemat-Nasser 1993; Perdahcıoġlu and Geijselaers 2011) is used. This method was proposed to improve the accuracy of the Mori-Tanaka method (Tanaka and Mori 1970) and is also known as the double inclusion model. Using mean-field homogenization, it is possible to consider the variations in the microstructural features of DP steel sheet. Several microstructural aspects contribute to the steel's mechanical properties. The volume fraction (VF) of the martensite distributed in the ferrite matrix plays a key role in controlling the mechanical properties. Additionally, it is known that DP steel often shows a banded martensite structure that is mostly concentrated at the center of the sheet. This banding depends on the cooling rate during hot rolling, the intercritical annealing temperature, and the soaking time (Caballero et al. 2006). Another microstructural feature is the shape of the martensite grains. Grains tend to elongate during the rolling process. It is assumed that, as a consequence, the martensite grains are not ideally spherical but have an elongated shape. These three microstructural features are combined with thickness variations of the sheet and are considered as noise parameters during the optimization. Figure 9 shows the variations in these parameters schematically.

The optimization problem
After developing the finite element model and defining the material model, the optimization problem can be set up by defining the objective function and the constraints. As described in Table 2, there are two design parameters in the stretch-bending process: x_1, the applied curvature, and x_2, the applied stretch. It should be noted that the final curvature differs from the applied curvature because of spring-back. In this process, there are four noise variables, which are stated in Table 3. The martensite VF and the oblateness factor (α) are taken into account by applying Eshelby's solution. To take into account the concentration of martensite in the middle layer of the material, a banding parameter (β) is introduced. This parameter is defined as the ratio of the martensite volume fraction between 0.4 and 0.6 of the thickness to the martensite volume fraction in the rest of the through-thickness region (0 to 0.4 and 0.6 to 1). A sensitivity analysis showed that these noise parameters have sufficient influence on the response to justify keeping them in the optimization procedure.
The output of the simulation is the final curvature and the percentage thickness reduction of the sheet. For robust optimization, we aim to obtain a target curvature and we apply a constraint on the thinning percentage. Design variables are set by defining upper and lower bounds, noise variables are taken into account through a normal probability distribution and the constraint is expressed using an inequality expression.
The objective function and the constraint follow (30) and (28), with the target curvature C_r = 50 m⁻¹ and the thinning limit C_g = 2.5%. After defining the objective function, inputs, outputs, and constraints, the optimization steps are followed based on Fig. 1. First, a DOE of 793 points is generated in the six-dimensional input space based on a combination of Latin hypercube sampling (LHS) and a full-factorial design (FFD). The responses (final curvature and thinning) are evaluated by FE analysis as explained in Section 4.1. Kriging metamodels are fit to both the final curvature and the final sheet thickness outputs. The mean, standard deviation, and skewness of the output are calculated using the analytical approach described in Section 2.
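The LHS part of such a DOE can be sketched in a few lines: each dimension is divided into equal-probability strata, one sample is drawn per stratum, and the strata are randomly paired across dimensions. The six-dimensional unit-cube bounds below are placeholders for the actual design and noise ranges of Tables 2 and 3:

```python
import numpy as np

def latin_hypercube(n_samples, bounds, rng=None):
    """Simple Latin hypercube design: one sample per equal-probability
    stratum in each dimension, randomly paired across dimensions."""
    rng = np.random.default_rng(rng)
    dim = len(bounds)
    # Per dimension: a random permutation of strata plus jitter within each stratum.
    strata = rng.permuted(np.tile(np.arange(n_samples), (dim, 1)), axis=1).T
    u = (strata + rng.random((n_samples, dim))) / n_samples
    lo, hi = np.array(bounds, dtype=float).T
    return lo + u * (hi - lo)

# Hypothetical six-dimensional DOE: two design and four noise variables.
doe = latin_hypercube(793, bounds=[(0.0, 1.0)] * 6, rng=42)
```

By construction, every column of the design contains exactly one point in each of the 793 strata, which gives the one-dimensional projection uniformity that motivates LHS over plain random sampling.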

Results and discussion
To observe the influence of skewness in the robust optimization procedure, we use the proposed criterion (30) and compare it with the formulation of (27).
The robust optimum design obtained using the 3 sigma approach and the new criterion is presented in Table 4. For detailed inspection, Fig. 10 shows the variation of the response at the robust optimum found by the conventional method (the bars in the background are the MC results). It is apparent that the output distribution does not follow an ND, and some results therefore end up outside the ±3 sigma range. The presence of multiple results outside the ±3 sigma range for the final curvature makes the robustness of this optimum questionable. The goal of reducing the deviation of the output by simply setting the mean on target is not achieved because of the skewness of the output. In addition, the overestimation of the right tail of the thinning distribution makes the constraint too conservative. If one switches to a 6 sigma reliability design to reduce the scrap rate (Koch et al. 2004), the influence of this conservative constraint becomes significant. As long as skewness exists, the expected n sigma reliability will not be accurate; the use of skewness in the robust optimum evaluation is therefore essential. Using the criterion proposed in (30), a different robust optimum design is obtained that has a lower objective function value. To analyze the differences in detail, the variation of the output around this optimum is evaluated and shown in Fig. 11. The SND that is fit to obtain the robust optimum shows an improved prediction of the tail shifts for each output and maintains the desired reliability for the constraints. The new method shifts μ_r + ((U(γ; n) + L(γ; n))/2) σ_r to the target curvature, 50 m⁻¹, and successfully reduces the deviation from that target value. This objective function does not guarantee that the largest data population is located on the specified target; however, it ensures that the variation of the data population around that target value is minimal.
Another important aspect of the new criterion is its treatment of the constraint, which is more accurate than that of the conventional method. If the conventional criterion is used for the constraint (shown by the dashed ND in Fig. 11), the upper bound of the thinning is overestimated, which leads to rejection of this design. Using an ND thus overestimates the right tail of the distribution and suggests that the thinning constraint is violated during robust optimization, although the constraint is not actually violated because of the negative skewness. By properly predicting the tails of the distribution, the new criterion can considerably improve the search for the robust optimum.
Compared to other methods that have been developed to include high-order moments of the output distribution for reliability analysis, the proposed approach has several advantages. First, Kriging is used, which handles nonlinear relations between input and output more effectively than first-order or second-order approximations; in other studies, the response is considered polynomial, e.g., linear in the first-order reliability method (FORM) and quadratic in the second-order reliability method (SORM) (Zhao and Ono 2001). The second advantage is that the mean, standard deviation, and skewness are obtained analytically in this article and are therefore accurate compared to moment calculations based on MC sampling from a Gaussian input (Padmanabhan et al. 2006). Another advantage is that the derivatives of the output mean, standard deviation, and skewness can also be obtained analytically, which improves the search for the robust optimum in a gradient-based algorithm. Nevertheless, the analytical calculation of moments has some limitations and is not feasible for certain combinations of metamodels and input probability distributions. During the analysis of the constraints and the evaluation of the robustness measure, the addition of the skewness parameter is a significant improvement over considering only the mean and standard deviation. However, for a highly nonlinear response that gives rise to higher-order moments (Mekid and Vaja 2008), one might consider moments beyond the third order during the search for a robust optimum. This is the topic of ongoing research by the authors.

Conclusions
When a normally distributed input parameter is subjected to a nonlinear process, the resulting output will be characterized by a skewed distribution. The mean, standard deviation, and skewness of the output distribution are calculated analytically. The skewness parameter is then used to find the shift of the tails of the distributions of the objective function and the constraints. The shift of the tails is embedded in the definition of the robustness measure as well as in the evaluation of constraints during robust optimization. Application of this approach for robustness criterion and evaluation of constraints shows that the variation of output is successfully reduced and the constraints are handled more accurately compared to the conventional method of robust optimum evaluation.
Acknowledgments This work was supported by the Foundation for Fundamental Research on Matter (FOM) (www.fom.nl), which is part of the Netherlands Organization for Scientific Research (www.nwo.nl).
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http:// creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.