1 Introduction

The wing shape in cruise differs considerably from its jig shape (the as-manufactured shape) because of static aeroelastic effects. To obtain an optimised wing jig shape while accounting for these effects, the relevant disciplines, such as aerodynamics and structures, must be included in the design problem, which places it in the field of multidisciplinary design optimisation (MDO). One way of obtaining the wing jig shape is wind tunnel testing, but this can be prohibitively expensive and requires both an optimised aerodynamic shape and a given structural model. To improve the efficiency of the design process, industry and academia tend to rely on computational analysis tools, namely computational fluid dynamics (CFD) and computational structural mechanics (CSM). There are two basic requirements for optimising the wing jig shape with computational analysis tools. Firstly, high-fidelity analysis tools should be used to capture as many physical details as required to obtain high-confidence results. Secondly, the optimisation should have a sufficient number of design variables to fully explore the design space. These requirements lead to a focus on formulating and solving high-fidelity, large-scale aero-structural design optimisation problems at an acceptable time cost.

Using high-fidelity models in analysis and optimisation requires significant computational resources. For example, one CFD evaluation of a 3-D wing with a Reynolds-Averaged Navier–Stokes (RANS) solver can take hours or even days. In addition, a wing shape parameterisation with a large number (typically, over 100) of design variables generally requires a large number of simulations during the optimisation, which makes the task even more challenging. With the growing need to reduce the computational cost, gradient-based optimisation using adjoint techniques has gained increasing attention (Kennedy and Martins 2014; Elham and van Tooren 2016; Wang and Kumar 2017; Yang et al. 2018). Compared with the finite difference method, the adjoint method computes the derivatives with respect to the design variables efficiently. However, gradient-based methods face challenges. Firstly, function values and their derivatives from computational simulations usually contain some level of numerical noise, which could degrade the convergence performance and may lead the optimisation process to a sub-optimal solution (Gilkeson et al. 2014; Viana et al. 2014; Wang et al. 2017). Secondly, problems may arise at points in the design space that the adopted solver cannot evaluate, typically referred to as failed evaluations. A common remedy in gradient-based methods is to reduce the search step and relocate to a new design point. However, this could slow down the optimisation process without guaranteeing trouble-free convergence. Finally, there is always doubt as to whether the result of a gradient-based optimisation is only a local optimum. Although designers can run several optimisations from different initial points to check this, doing so requires more computational resources (Lyu et al. 2015).

Considering the aforementioned issues, using direct simulation in large-scale optimisation may not be sufficiently reliable for practical applications. Therefore, the use of an approximation, also referred to as a metamodel, is a good choice. However, it is not feasible to build a good metamodel over the whole design space for complex nonlinear problems, especially for aero-structural analysis, and a large number of design variables worsens the situation further. Although more training points can be used to build a better metamodel, the resulting computational costs may become prohibitive, a situation referred to as the curse of dimensionality.

The multipoint approximation method (MAM), initially reported by Toropov (1989) and Toropov et al. (1993), has shown good efficiency with a reasonable number of training points, making it suitable for large-scale problems. The MAM is an iterative optimisation technique that builds approximations in a selected trust-region. The trust-region is a subspace of the entire design space, which is translated and scaled during the design process according to the defined trust-region strategy. From this perspective, the MAM can also be interpreted as a Mid-range Approximation Method. Compared with global approximations, which build metamodels over the entire design space, mid-range approximations, which build metamodels in a subspace of the design space, can use fewer training points to obtain a metamodel of good quality. Moreover, the MAM can use a set of simple mathematical functions to replace the responses from the original simulation models. Such functions give deterministic outputs for deterministic inputs, which means they are noise-free or, at least, have a low level of numerical noise that does not adversely affect the optimisation convergence. This alleviates the concern that “discretisation-dependency” issues, or rather “mesh-dependency” issues, may harm the convergence performance and lead to an irregular design, especially with a large number of design variables (Le et al. 2011; Kiendl et al. 2014; Wang et al. 2018). Furthermore, failed evaluations have a negligible influence on the search performance of the optimisation since the metamodel building process ignores the failed samples. This method has been successfully used in turbomachinery design (Caloni et al. 2018) and automotive structure design (Mortished et al. 2018).
Polynkin and Toropov (2012) introduced a metamodel assembly method using linear regression in the MAM and tested it on problems with up to 1000 design variables and constraints. Liu and Toropov (2016) added a discrete capability to the MAM to solve mixed integer-continuous optimisation problems. Recently, Toropov et al. (2018) used several risk measure methods in the MAM to handle large-scale design optimisation problems with uncertainty.

In this paper, an improved MAM with a gradient-assisted metamodel assembly technique, built into a trust-region optimisation framework, is first tested on a benchmark problem and then applied to aircraft wing jig shape optimisation. The gradient-assisted metamodel assembly technique uses five simple mathematical functions to improve the quality of approximations of complex functions in large-scale problems without increasing the number of function calls. The gradient information, which is computed by the adjoint technique from the aero-structural analysis, is used to reduce the required number of training points and improve the metamodel quality. The trust-region strategy of the MAM has been enhanced to include more optimisation states for different types of problems. Two optimisation cases, a gradient-based optimisation and a metamodel-based optimisation, were conducted to give an insight into the performance of the improved MAM.

2 Optimisation framework

This section illustrates the methods and numerical tools used in the trust-region optimisation framework within the MAM. The in-house programme MAM is mainly coded in Fortran and used as the main driver in the wing jig shape optimisation.

2.1 Multipoint approximation method

A general optimisation problem that the MAM solves can be defined as

$$\begin{array}{*{20}l} {{\text{minimise}}} \hfill & {F_{0} \left( \user2{x} \right)} \hfill \\ {{\text{subject\,to}}} \hfill & {F_{i} \left( \user2{x} \right) \ge 0,\;\;\;\;\;i = 1,\;2,\; \ldots ,\;m} \hfill \\ {} \hfill & {a_{j} \le x_{j} \le b_{j} ,\;j = 1,\;2,\; \ldots ,\;n} \hfill \\ {{\text{with\, respect\, to}}} \hfill & {\user2{x} = \left[ {x_{1} ,\;x_{2} ,\; \ldots ,\;x_{j} ,\; \ldots ,\;x_{n} } \right]} \hfill \\ \end{array} ,$$
(1)

where \({\user2{x}}\) is the design variable vector, \(F_{0}\) is the design objective and \(F_{i}\) is the ith design constraint. There are m design constraints and n design variables. For each design variable \(x_{j}\), the lower and upper bounds are \(a_{j}\) and \(b_{j}\), respectively. In the MAM, a series of metamodels is built in the selected trust-region to replace the original responses (the objective and constraint functions). In this way, the original optimisation problem is transformed into a sequence of approximate sub-optimisation problems as follows:

$$\begin{array}{*{20}l} {{\text{minimise}}} \hfill & {\widetilde{{F_{0}^{k} }} \left( \user2{x} \right)} \hfill \\ {{\text{subject\, to}}} \hfill & {\widetilde{{F_{i}^{k} }} \left( \user2{x} \right) \ge 0,\;\;\;\;\;\;i = 1,2, \ldots ,m} \hfill \\ {} \hfill & {\left. {\begin{array}{*{20}l} {a_{j}^{k} \le x_{j} \le b_{j}^{k} } \hfill \\ {a_{j}^{k} \ge a_{j} } \hfill \\ {b_{j}^{k} \le b_{j} } \hfill \\ \end{array} } \right\}\;j = 1,2, \ldots ,n} \hfill \\ {{\text{with\,respect\,to}}} \hfill & {\user2{x} = \left[ {x_{1} ,\;x_{2} ,\; \ldots ,\;x_{j} ,\; \ldots ,\;x_{n} } \right]} \hfill \\ \end{array} ,$$
(2)

where the superscript k is the current MAM iteration number, and \(\widetilde{{F_{0}^{k} }}\) and \(\widetilde{{F_{i}^{k} }}\) are the metamodels of the objective function \(F_{0}\) and the constraint function \(F_{i}\), respectively. Each trust-region \(\left[ {a_{j}^{k} ,\;b_{j}^{k} } \right]\) limits the change of \(x_{j}\) while the optimisation problem of the current iteration is solved, and is a subspace of the entire design space \(\left[ {a_{j} ,\;b_{j} } \right]\).

The complete design process is illustrated in Fig. 1. In each MAM iteration, several design points in the current trust-region are sampled by a design of experiment (DoE) method and evaluated by the original simulation model to be used as training points for metamodel building. Then the approximate sub-optimisation problem in Eq. (2) is solved to find an optimum point of the current iteration. Next, the original simulation model, such as the aero-structural analysis in this paper, evaluates the obtained optimal design and produces the information required by the trust-region strategy, such as metamodel quality and design feasibility. The strategy then checks the current optimisation state to decide what is to be done next. If the stopping criteria are satisfied, the MAM optimisation stops, and the obtained design is treated as the final solution to the considered optimisation problem. Otherwise, the location and size of the next trust-region are established for the next MAM iteration.

Fig. 1 MAM design process

In the new trust-region, there will be some design points that have been evaluated in previous iterations. The MAM could use these existing points to save computational costs. Therefore, the selected DoE method should have the capability to consider the already existing design points. For this purpose, a non-collapsible randomised DoE method (Korolev et al. 2015) has been used. The DoE points are generated efficiently one by one with a reasonably uniform spread while taking into account the existing ones.

A gradient-based optimiser, the sequential least-squares programming (SLSQP) algorithm (Kraft 1988), is used to solve the approximate sub-optimisation problem. To reduce the possibility of falling into a local optimum, multiple optimisations are conducted in one MAM iteration. These optimisations start from several randomly chosen initial points and generate a set of candidate designs. Based on the performance of these candidates according to the real simulation model, the trust-region strategy chooses the best point as the centre of the next trust-region. Since the objective and constraint functions in each iteration have all been replaced by metamodels, the additional computational cost of the multiple optimisations is small.
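The candidate selection step can be illustrated with a minimal Python sketch. The function name `select_best_candidate`, the tolerance `tol` and the rule used when no candidate is feasible (smallest maximum constraint violation) are assumptions made for illustration; the paper only states that the best candidate according to the real simulation model becomes the centre of the next trust-region.

```python
def select_best_candidate(candidates, objective, constraints, tol=0.0):
    """Pick the candidate on which the next trust-region is centred.

    candidates  : list of design points (lists of floats)
    objective   : callable F0(x) from the real model, to be minimised
    constraints : list of callables Fi(x), feasible when Fi(x) >= 0
    Ranking rule is an assumption: the feasible point with the lowest
    objective wins; if none is feasible, the least-violating point wins.
    """
    def violation(x):
        # Largest amount by which any constraint Fi(x) >= 0 is violated.
        return max([0.0] + [-g(x) for g in constraints])

    feasible = [x for x in candidates if violation(x) <= tol]
    if feasible:
        return min(feasible, key=objective)
    return min(candidates, key=violation)
```

Because the candidates are few (three per iteration in this work), this step adds only a handful of real-model evaluations per MAM iteration.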

2.2 Gradient-assisted metamodel assembly method

The gradient-assisted metamodel assembly method is an extension of the metamodel assembly method in which the gradient information is incorporated to improve the approximation quality. The metamodel assembly method could combine multiple metamodels into one single metamodel to improve prediction accuracy. Several studies (Viana and Haftka 2008; Han et al. 2013; Liem et al. 2015; Yin et al. 2018) have been conducted on the metamodel assembly method using the weighted sum method, and the available types of metamodels include polynomial regression (PR), radial basis function (RBF), Kriging (KRG), etc. Instead of the weighted sum method, this section uses the linear regression method to assemble different metamodels, which is inspired by the work of Polynkin and Toropov (2012).

The proposed procedure can be divided into two steps. The first step is to select and build several individual metamodels. Several classic gradient-assisted metamodels are available in the MAM programme, such as the gradient-assisted radial basis function (GARBF) and gradient-assisted Kriging (GAKRG). However, these methods involve many matrix operations, such as matrix multiplication and matrix inversion, whose computational costs depend mainly on the number of design variables and the number of DoE points. With a large number of design variables, building or evaluating a single metamodel can be time-consuming. Large-scale problems normally have a large number (typically, more than 100) of design responses. This means that, in every iteration, the MAM must build a large number of metamodels to replace the design responses and evaluate them many times. If building or evaluating a single metamodel takes minutes or hours, one MAM iteration could take hours or days, which might be unacceptable for designers. Given these issues, it is necessary to choose a metamodel that can be built easily and evaluated efficiently to reduce computational costs.

This work uses five simple mathematical functions as follows to build individual metamodels:

$$\begin{aligned} \varphi_{1} \left( {{\user2{x}},{\user2{a}}_{1} } \right) &= a_{1,\;0} + \sum\limits_{j = 1}^{n} {a_{1,\;j} \times x_{j} } \\ \varphi_{2} \left( {{\user2{x}},{\user2{a}}_{2} } \right) &= a_{2,\;0} + \sum\limits_{j = 1}^{n} {a_{2,\;j} \times \frac{1}{{x_{j} }}} \\ \varphi_{3} \left( {{\user2{x}},{\user2{a}}_{3} } \right) &= a_{3,\;0} + \sum\limits_{j = 1}^{n} {a_{3,\;j} \times x_{j}^{2} } \\ \varphi_{4} \left( {{\user2{x}},{\user2{a}}_{4} } \right) &= a_{4,\;0} + \sum\limits_{j = 1}^{n} {a_{4,\;j} \times \frac{1}{{x_{j}^{2} }}} \\ \varphi_{5} \left( {{\user2{x}},{\user2{a}}_{5} } \right) &= a_{5,\;0} \times \prod\limits_{j = 1}^{n} {x_{j}^{{a_{5,\;j} }} } \\ \end{aligned},$$
(3)

where \({\user2{a}}_{l}\) is the tuning coefficient vector of the lth selected metamodel, \({\user2{a}}_{l} = \left[ {a_{l,\;0} ,\;a_{l,\;1} ,\; \ldots ,\;a_{l,\;n} } \right]\) and l = 1, 2,\(\ldots\), 5. The values in \({\user2{a}}_{l}\) can be negative or positive. The first metamodel \(\varphi_{1}\) is a linear function, while the others are intrinsically linear functions (Box and Draper 1988). An intrinsically linear function is itself nonlinear; however, it can be transformed into a function that is linear with respect to the tuning coefficients. For example, the 5th metamodel \(\varphi_{5}\) becomes a linear function under the following transformation:

$$\ln \left( {\varphi_{5} \left( {{\user2{x}},{\user2{a}}_{{\mathbf{5}}} } \right)} \right) = \ln \left( {a_{5,\;0} } \right) + \sum\limits_{j = 1}^{n} {a_{5,\;j} \ln \left( {x_{j} } \right)}.$$
(4)
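The remaining intrinsically linear forms in Eq. (3) can be handled in the same spirit. As a brief sketch (the auxiliary variables \(z_{j}\) are introduced here for illustration only and are not part of the original notation), a change of variables makes each function linear in the transformed inputs:

$$\begin{aligned} \varphi_{2}: \quad z_{j} = \frac{1}{x_{j}} \quad &\Rightarrow \quad \varphi_{2} = a_{2,\;0} + \sum\limits_{j = 1}^{n} {a_{2,\;j} \, z_{j} } \\ \varphi_{3}: \quad z_{j} = x_{j}^{2} \quad &\Rightarrow \quad \varphi_{3} = a_{3,\;0} + \sum\limits_{j = 1}^{n} {a_{3,\;j} \, z_{j} } \\ \varphi_{4}: \quad z_{j} = \frac{1}{x_{j}^{2}} \quad &\Rightarrow \quad \varphi_{4} = a_{4,\;0} + \sum\limits_{j = 1}^{n} {a_{4,\;j} \, z_{j} } \\ \end{aligned}$$

These substitutions presume \(x_{j} \ne 0\) for \(\varphi_{2}\) and \(\varphi_{4}\), and the logarithmic form in Eq. (4) presumes \(x_{j} > 0\) and \(a_{5,\;0} > 0\).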

These linear and intrinsically linear functions have only n + 1 unknowns (tuning coefficients) to be determined, which is a relatively small number. If the gradient information is not provided, at least n + 1 training points are needed to build the metamodels by the least squares method. However, if the gradient information is provided, each training point contributes one function value and n derivatives, so the minimum number of training points required to solve the least squares problem is 1. This feature allows the MAM to use as few DoE points as possible to build a good metamodel in the selected trust-region. Moreover, compared with GARBF or GAKRG, these functions are simple enough to be built and evaluated efficiently in large-scale problems.
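The single-point case can be made concrete for the linear metamodel \(\varphi_{1}\). The Python sketch below is an illustration under stated assumptions, not the MAM implementation: the helper names `fit_linear_from_point` and `phi1` are hypothetical. With one function value and n derivatives, the n + 1 coefficients follow directly, because the slopes equal the supplied derivatives and the constant term is fixed by the function value.

```python
def fit_linear_from_point(x, f, grad):
    """Return [a0, a1, ..., an] of phi_1 matching value f and gradient grad at x."""
    slopes = list(grad)  # d(phi_1)/dx_j = a_j
    a0 = f - sum(aj * xj for aj, xj in zip(slopes, x))  # match the function value
    return [a0] + slopes


def phi1(x, a):
    """Linear metamodel phi_1(x, a) = a0 + sum_j a_j * x_j."""
    return a[0] + sum(aj * xj for aj, xj in zip(a[1:], x))
```

For example, for a response \(x_{1}^{2} + 3x_{2}\) sampled at (2, 1), the value 7 and gradient (4, 3) give the coefficients [-4, 4, 3], and the fitted metamodel reproduces the value 7 at the sampled point.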

To take advantage of the gradient information, this work uses the following formulation to solve the least squares problems and determine the tuning coefficient vector \({\user2{a}}_{l}\):

$$min\;\;\sum\limits_{p = 1}^{P} {w_{p} \times \left\{ {\left[ {F\left( {{\user2{x}}_{p} } \right) - \varphi_{l} \left( {{\user2{x}}_{p} ,{\user2{a}}_{l} } \right)} \right]^{2} + \sum\limits_{j = 1}^{n} {\gamma \times \left[ {\frac{{\partial F\left( {{\user2{x}}_{p} } \right)}}{{\partial x_{j} }} - \frac{{\partial \varphi_{l} \left( {{\user2{x}}_{p} ,{\user2{a}}_{l} } \right)}}{{\partial x_{j} }}} \right]^{2} } } \right\}},$$
(5)

where P is the number of training points, p is the index of the training point, F is the response function from the original simulation model, \(w_{p}\) is the weight coefficient of the corresponding point \({\user2{x}}_{p}\), and γ is a parameter that sets the importance of the derivatives relative to the responses, \(0 \le \gamma \le 1\). In this work, γ is set to 0.5.
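To make the structure of Eq. (5) explicit, the following Python sketch evaluates the weighted sum for a given coefficient vector. It only computes the quantity being minimised and does not solve the least squares problem; the function name `gradient_assisted_residual` and its argument layout are assumptions chosen for illustration.

```python
def gradient_assisted_residual(points, weights, F, dF, phi, dphi, a, gamma=0.5):
    """Evaluate the objective of Eq. (5) for one metamodel.

    points  : list of training points x_p
    weights : list of weight coefficients w_p
    F, dF   : response and its gradient from the simulation model
    phi     : metamodel phi(x, a); dphi(x, a) returns its gradient
    gamma   : importance of derivative mismatch (0.5 in this work)
    """
    total = 0.0
    for x, w in zip(points, weights):
        value_term = (F(x) - phi(x, a)) ** 2
        grad_term = sum(gamma * (gF - gP) ** 2
                        for gF, gP in zip(dF(x), dphi(x, a)))
        total += w * (value_term + grad_term)
    return total
```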

The weight coefficients determine the contribution of each training point and have a great influence on the metamodel quality. To reflect the unequal importance of the training points, two principles suggested by Toropov et al. (1993) are used to guide the setting of the weight coefficients. The first principle is that a point close to the boundary of the feasible region should have a larger weight. In realistic constrained optimisation, the optimum usually lies on the boundary of the feasible region, so a larger weight helps the metamodel to be more accurate there. The second principle considers the value of the objective function: a point with a better objective should be assigned a larger weight. Apart from these two principles, this work also considers the gradient information. Stationary points, at which the derivatives of the function are zero, are candidates for the optimum. To capture more information near stationary points, a design point with a smaller gradient magnitude is given a larger weight.

When all individual metamodels are built, we should combine them into one single metamodel using the following definition:

$$\widetilde{F}\left( {{\user2{x}},\;{\user2{b}}} \right) = \sum\limits_{l = 1}^{nf} {b_{l} \times \varphi_{l} } \left( {{\user2{x}},\;{\user2{a}}_{l} } \right),$$
(6)

where \(\widetilde{F}\) is the final approximation of the original response F, and \({\user2{b}} = \left[ {b_{1} ,b_{2} , \ldots ,b_{nf} } \right]\) is the tuning coefficient vector of the metamodel assembly method. The values in \({\user2{b}}\) can be negative or positive. nf is the number of individual metamodels; in this work, nf = 5.

Similarly, the tuning coefficient vector \({\user2{b}}\) is determined by the least squares method using the following formulation:

$$\text{min}\;\;\sum\limits_{p = 1}^{P} {w_{p} \times \left\{ {\left[ {F\left( {{\user2{x}}_{p} } \right) - \widetilde{F}\left( {{\user2{x}}_{p} ,{\user2{b}}} \right)} \right]^{2} + \sum\limits_{j = 1}^{n} {\gamma \times \left[ {\frac{{\partial F\left( {{\user2{x}}_{p} } \right)}}{{\partial x_{j} }} - \frac{{\partial \widetilde{F}\left( {{\user2{x}}_{p} ,{\user2{b}}} \right)}}{{\partial x_{j} }}} \right]^{2} } } \right\}}.$$
(7)

Through the tuning coefficient vectors \({\user2{a}}_{l}\) and \({\user2{b}}\), a wide range of complex non-monotonic functions can be represented in the selected trust-region using only 5 simple mathematical functions.

According to Eqs. (5) and (7), the gradient-assisted metamodel assembly method needs gradient information not only from the original simulation models but also from the metamodels. The former depends on the simulation model used in the optimisation. The latter can be computed easily and efficiently using the chain rule and analytic derivatives, since the metamodels used are all combinations of linear and intrinsically linear functions. This is another crucial advantage of the proposed method for large-scale problems.
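The analytic derivatives of the five functions in Eq. (3), and of their linear combination in Eq. (6), can be sketched in Python as follows. The function names `phi`, `grad_phi` and `grad_assembly` are hypothetical, and the code presumes nonzero (for \(\varphi_{5}\), positive) design variables; it is an illustration, not the MAM source code.

```python
import math


def phi(l, x, a):
    """Evaluate the l-th metamodel of Eq. (3); a = [a_l0, a_l1, ..., a_ln]."""
    c = a[1:]
    if l == 1:
        return a[0] + sum(cj * xj for cj, xj in zip(c, x))
    if l == 2:
        return a[0] + sum(cj / xj for cj, xj in zip(c, x))
    if l == 3:
        return a[0] + sum(cj * xj ** 2 for cj, xj in zip(c, x))
    if l == 4:
        return a[0] + sum(cj / xj ** 2 for cj, xj in zip(c, x))
    if l == 5:
        return a[0] * math.prod(xj ** cj for cj, xj in zip(c, x))
    raise ValueError("l must be in 1..5")


def grad_phi(l, x, a):
    """Analytic gradient of the l-th metamodel with respect to x."""
    c = a[1:]
    if l == 1:
        return [cj for cj in c]
    if l == 2:
        return [-cj / xj ** 2 for cj, xj in zip(c, x)]
    if l == 3:
        return [2.0 * cj * xj for cj, xj in zip(c, x)]
    if l == 4:
        return [-2.0 * cj / xj ** 3 for cj, xj in zip(c, x)]
    if l == 5:
        value = phi(5, x, a)  # d/dx_j of a0 * prod(x^c) is c_j * value / x_j
        return [cj * value / xj for cj, xj in zip(c, x)]
    raise ValueError("l must be in 1..5")


def grad_assembly(x, a_list, b):
    """Gradient of the assembled metamodel: sum over l of b_l * grad(phi_l)."""
    g = [0.0] * len(x)
    for l, (a, bl) in enumerate(zip(a_list, b), start=1):
        for j, gj in enumerate(grad_phi(l, x, a)):
            g[j] += bl * gj
    return g
```

Each gradient costs a single pass over the n design variables, which is consistent with the efficiency argument made above.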

Finally, parallel computing techniques have been implemented in the proposed gradient-assisted metamodel assembly method to improve the optimisation efficiency. The MAM can build or evaluate metamodels for different response functions simultaneously.

2.3 Trust-region strategy

The trust-region strategy is the decision-making part of the MAM that decides the location and the size of the new trust-region. The choice in the trust-region strategy plays an important role in the searching performance of the MAM and therefore affects the required computational costs to finish the whole optimisation. Compared with the traditional trust-region strategy (Keulen and Toropov 1997), this work adjusts the judging mechanism and contains more optimisation states, which makes the MAM more flexible and controllable to suit different types of problems.

There are 6 indicators used in the trust-region strategy to evaluate the current optimisation state. The first one is metamodel quality, which shows the discrepancy between the approximated function values and the original response values based on the selected set of candidates. Normally these candidates are the solved optimum points from multiple approximate sub-optimisations. The best one among these candidates will be the centre of the new trust-region in the next iteration. Sometimes if there is no local optimum, the candidates might be close to each other or even the same. In this situation, the trust-region strategy would sample extra DoE points to evaluate the metamodel quality. This indicator is calculated by using the largest Root Mean Squared Error (RMSE) among all response functions:

$$\varepsilon = \max \left( {\sqrt {\frac{1}{Q}\sum\limits_{q = 1}^{Q} {\left[ {\tilde{F}_{i} ({\user2{x}}_{q} ) - F_{i} ({\user2{x}}_{q} )} \right]^{2} } } ,\;i = 0,\;1,\;2,\; \cdots ,\;m} \right),$$
(8)

where \(\varepsilon\) is the metamodel quality, Q is the number of candidates and q is the index of the selected candidate. Q is equal to the number of multiple approximate sub-optimisations in one MAM iteration. In this work, Q = 3. Based on the user-defined criteria, the metamodel quality is categorised into 3 types: “bad”, “good” and “precise”.
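As an illustration of this indicator, the Python sketch below computes the RMSE of each response over the Q candidates and returns the largest one. The function name `metamodel_quality` and the nested-list data layout are assumptions; the thresholds that separate “bad”, “good” and “precise” are user-defined and are not reproduced here.

```python
import math


def metamodel_quality(approx, actual):
    """Largest RMSE over all responses.

    approx[i][q] : metamodel prediction of response i at candidate q
    actual[i][q] : simulation value of response i at candidate q
    """
    worst = 0.0
    for approx_i, actual_i in zip(approx, actual):
        Q = len(actual_i)  # number of candidates
        mse = sum((p - t) ** 2 for p, t in zip(approx_i, actual_i)) / Q
        worst = max(worst, math.sqrt(mse))
    return worst
```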

The second indicator is the relative size of the current trust-region compared to the whole design space, later referred to as the trust-region size in the paper. Unlike the classification in the traditional strategy, which includes “small” and “large”, this work uses 3 categories to distinguish the trust-region size: “too small”, “small” and “large”. If the trust-region size is “too small”, the responses at the points sampled from the current trust-region tend to be similar, which leads to ill-conditioned matrices when the metamodels are built by the linear regression method. To avoid this issue, the trust-region strategy imposes a lower limit on the trust-region size. If the trust-region size is “small”, it might be necessary to adjust its size slightly depending on the other indicators. On the other hand, if the size is “large”, larger adjustments to the trust-region size are preferable to improve the convergence speed. Unlike the metamodel quality, which takes the maximum over all responses, the trust-region size is calculated as the minimum relative size among all dimensions:

$$r^{k} = \min \left( {\frac{{b_{j}^{k} - a_{j}^{k} }}{{b_{j}^{{}} - a_{j}^{{}} }},\;j = 1,\;2,\; \ldots ,\;n} \right),$$
(9)

where \(r^{k}\) is the trust-region size of the kth iteration.
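Eq. (9) amounts to a one-line computation. The sketch below, with the hypothetical function name `trust_region_size`, returns the smallest ratio of the current interval width to the full interval width over all design variables.

```python
def trust_region_size(a_k, b_k, a, b):
    """Relative trust-region size r^k of Eq. (9).

    a_k, b_k : lower and upper bounds of the current trust-region
    a, b     : lower and upper bounds of the whole design space
    """
    return min((bk - ak) / (bj - aj)
               for ak, bk, aj, bj in zip(a_k, b_k, a, b))
```

For a design space [0, 10] × [0, 4] and a trust-region [2, 4.5] × [1, 3], the per-dimension ratios are 0.25 and 0.5, so the trust-region size is 0.25.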

The third indicator is the location of the optimum point in the current trust-region. As mentioned earlier in the description of the metamodel quality, this optimum point is the best one among all candidates and will be used as the centre of the next trust-region. Depending on whether it is in the current trust-region, the traditional strategy distinguishes optimisation states into 2 categories: “internal” and “external”. Considering that the metamodel outside the trust-region might not be as accurate as inside it, this work imposes the restriction that the optimum point can only be selected within the current trust-region. Therefore, with this indicator, the optimisation states are categorised into 3 types: “inside”, “near the boundary” and “at the boundary”. These states describe the distance between the optimum point and the boundary of the current trust-region. If the current optimum is “inside”, the final solution of the original optimisation problem might be in this region and more searching should be done here. Otherwise, it is necessary to move the trust-region and conduct the optimisation in the new area. The states “near the boundary” and “at the boundary” differ in the scale factors used to adjust the trust-region size.

The fourth indicator is the feasibility of the optimum point. One of the prerequisites for the MAM convergence is that the optimum point found satisfies all design constraints. The fifth and sixth indicators are the search direction and the oscillation level, respectively, which are judged from the movement history of the trust-region. In successive iterations, if the optimum point is always found “at the boundary” and the optimal design keeps moving in the same direction, it is necessary to enlarge the trust-region to include more of the design space and speed up the search. The oscillation level checks whether the trust-region is oscillating, which may cause endless iterations on similar solutions. This phenomenon may happen when the final solution of the original optimisation problem is nearby. In this case, the trust-region strategy should update the trust-region more gently to end the oscillation.

The settings for the indicator classification used in this work are shown in Table 1.

Table 1 Classification settings in the trust-region strategy

The detailed judging mechanism is shown in Fig. 2. This work defines 13 optimisation states, which include 4 stop states (S1–S4), 6 reduction states (R1–R6), 1 enlargement state (E1) and 2 keeping states (K1, K2). The programme terminates when a stop state is reached. S1, S2 and S3 represent abnormal terminations in which the metamodel quality is “bad”, “good” and “precise”, respectively. These different stop states give users the debugging information needed to correct and improve the MAM performance in subsequent optimisations. S4 means that the MAM optimisation stops with normal convergence, in which the trust-region size is “small”, the metamodel quality is “precise” and the optimum point is feasible. If all the MAM parameters are appropriately set, S4 should be the most common stop state.

Fig. 2 The judging mechanism of the trust-region strategy

In the reduction states, the strategy will reduce the trust-region size in the next iteration, while it will enlarge its trust-region size in the enlargement state. A set of scale factors as follows are used to scale the size of the new trust-region:

$$\begin{gathered} 0 < \beta_{S} < 1,\;\;S = 1,\;2,\; \ldots ,\;6 \hfill \\ \alpha > 1 \hfill \\ \end{gathered},$$
(10)

where \(\beta_{S}\) is the reduction coefficient of the reduction state RS and \(\alpha\) is the enlargement coefficient of the enlargement state. To handle the situations that may arise in complex problems, different reduction states should have different reduction coefficients, and for some special states the coefficient should be larger or smaller. For example, with a “large” trust-region and a “good” metamodel, if the location of the optimum point is “inside” (R3), a small reduction coefficient should be used to give a large reduction of the new trust-region and improve the convergence speed. Under the same conditions, if the optimum location is “near the boundary” (R4), which is a more rigorous state and indicates that the final solution of the original optimisation problem is near the boundary of the current trust-region, an even lower value should be used to reduce the size further. In contrast, with a “small” trust-region and a “good” metamodel, if the optimum point is not “at the boundary” (R5), the strategy should reduce the size only slightly by using a large reduction coefficient. Similarly, under the same conditions, if the metamodel quality is “precise” but the current optimum point is not feasible (R6), which means the MAM optimisation is close to convergence (S4), the trust-region should be adjusted more cautiously by using a larger reduction coefficient. The enlargement state (E1) occurs when the trust-region keeps moving in the same direction while the optimum point is “at the boundary” and the metamodel quality is “precise”. This means the number of training points is large enough to obtain a “precise” approximation in the current trust-region. The trust-region in the next iteration should be enlarged to make full use of the information from the sampled training points. With a larger trust-region, it is also easier to include the final solution of the original problem and to make the next optimum point of the approximate sub-optimisations fall “inside” the trust-region. Based on the above considerations and the settings of the traditional trust-region strategy, the following set of scale factors is used in this paper:

$$\begin{gathered} \beta_{1} = 0.8,\;\;\beta_{2} = 0.8,\;\;\beta_{3} = 0.75,\;\;\beta_{4} = 0.5 \hfill \\ \beta_{5} = 0.8,\;\;\beta_{6} = 0.9,\;\;\alpha \;\; = 1.25 \hfill \\ \end{gathered}.$$

Finally, the keeping states mean the trust-region size should remain unchanged. These states happen when the location of the current optimum is “at the boundary”, which means the final solution may not be in the current trust-region. In this situation, the strategy should keep the same size and move the trust-region to the new area. Unlike the K1 state, the predictions in the K2 state are “precise” even when the point is “at the boundary”. A “precise” metamodel can be considered equal to the original simulation model. To take full advantage of these “precise” approximations, the trust-region strategy would re-use the old metamodels in the next iteration to reduce the additional simulation costs.
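The resizing and relocation of the trust-region can be sketched in Python as follows. The scale factors are those listed above, with a factor of 1.0 assigned to the keeping states. Everything else is an assumption made for illustration: the function name `next_trust_region`, the uniform scaling of every dimension, the centring on the current optimum and the shifting of the interval back inside the global bounds. The lower limit on the trust-region size is not modelled.

```python
# Scale factors from the paper; keeping states leave the size unchanged.
SCALE = {"R1": 0.8, "R2": 0.8, "R3": 0.75, "R4": 0.5, "R5": 0.8, "R6": 0.9,
         "E1": 1.25, "K1": 1.0, "K2": 1.0}


def next_trust_region(centre, a_k, b_k, a, b, state):
    """Return (new_a, new_b), the bounds of the next trust-region.

    centre   : current optimum, used as the new centre
    a_k, b_k : bounds of the current trust-region
    a, b     : bounds of the whole design space
    state    : optimisation state label, e.g. "R4" or "E1"
    """
    factor = SCALE[state]
    new_a, new_b = [], []
    for c, ak, bk, aj, bj in zip(centre, a_k, b_k, a, b):
        half = 0.5 * factor * (bk - ak)
        lo, hi = c - half, c + half
        if lo < aj:  # shift right to stay inside the design space (assumption)
            hi += aj - lo
            lo = aj
        if hi > bj:  # shift left to stay inside the design space (assumption)
            lo -= hi - bj
            hi = bj
        new_a.append(max(lo, aj))
        new_b.append(min(hi, bj))
    return new_a, new_b
```

For instance, halving a trust-region [2, 6] in state R4 around an optimum at 6 gives [5, 7], while enlarging it in state E1 around an optimum at 9.5 within a design space [0, 10] gives [5, 10] after shifting.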

2.4 Benchmark test

A benchmark case known as the Vanderplaats scalable cantilevered beam (Vanderplaats 1984), a classical optimisation problem, is tested first to demonstrate the proposed method before the wing jig shape optimisation. The cantilevered beam shown in Fig. 3 consists of S segments with rectangular cross-sections. The number S can be chosen arbitrarily. Each segment has the same length and the total length L is 500 cm. The optimisation seeks the minimum beam volume V. The widths \(b_{i}\) and heights \(h_{i}\) of all segments are treated as design variables. The stress \(\sigma_{i}\) and aspect ratio \(h_{i}/b_{i}\) of every segment are constrained. The tip deflection \(y_{S}\) caused by the external load \(F = 5 \times 10^{4}\) N is treated as a global constraint. The optimisation model is given in Table 2. All design responses and their gradients can be computed by analytical functions (see Vanderplaats 1984).

Fig. 3 Vanderplaats scalable cantilevered beam

Table 2 Optimisation model of Vanderplaats scalable cantilevered beam

The beam case with 256 segments is considered in this work. The corresponding optimisation problem has 1 design objective, 512 design variables and 513 design constraints, which makes it a typical large-scale problem. Two optimisations have been conducted. The first is a gradient-based optimisation using the SLSQP algorithm, which serves as a reference. The second is a metamodel-based optimisation conducted by the proposed method. The relative size of the initial trust-region is set to 25.0%. The centre of the design space is selected as the initial point. In each MAM iteration, 6 training points are sampled to build metamodels and 3 candidates are collected from 3 approximate sub-optimisations to evaluate the metamodel quality. The desired convergence accuracy for the SLSQP optimiser in both cases is set to \(1.0 \times 10^{-6}\).
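The problem size follows directly from the number of segments: two design variables per segment, plus one stress and one aspect-ratio constraint per segment and a single tip-deflection constraint. The Python sketch below reproduces this count and the volume objective for equal-length segments; the function names are hypothetical, and the stress and deflection formulae of Vanderplaats (1984) are deliberately not reproduced.

```python
def beam_problem_size(segments):
    """Return (number of design variables, number of constraints)."""
    n_variables = 2 * segments        # width b_i and height h_i per segment
    n_constraints = 2 * segments + 1  # stress and aspect ratio per segment, plus tip deflection
    return n_variables, n_constraints


def beam_volume(widths, heights, total_length=500.0):
    """Beam volume in cm^3 for equal-length segments (lengths in cm)."""
    segment_length = total_length / len(widths)
    return sum(b * h * segment_length for b, h in zip(widths, heights))
```

With 256 segments, `beam_problem_size(256)` gives 512 design variables and 513 constraints, matching the figures quoted above.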

The optimisation results are shown in Table 3, and the objective convergence histories are plotted in Fig. 4. With all design constraints satisfied, both optimisations obtained similar solutions, with a relative error of less than 0.05%. The proposed method finished the optimisation with fewer function evaluations and gradient evaluations. To further understand the optimisation mechanism in the MAM, the convergence histories of the trust-region size and metamodel quality are plotted in Fig. 5. As the optimisation proceeds, the trust-region size decreases and the metamodel quality improves. In the final stage, the discrepancy is below \(10^{-4}\) and the optimised points from the approximate sub-optimisations are nearly identical to the actual optimum.

Table 3 Optimisation results of Vanderplaats scalable cantilevered beam

Fig. 4 Convergence history of Vanderplaats scalable cantilevered beam

Fig. 5 Convergence history of trust-region size and metamodel quality in the metamodel-based optimisation

Using the same test case, two sets of trials have been conducted to study the performance of the proposed method in the presence of numerical noise or failed evaluations. In the first set of trials, we suppose that every design response of every point might be affected by numerical noise and become Not a Number (NaN) during the function evaluation stage. This artificial numerical noise is triggered with a given noise probability. Seven noise probabilities have been tested: 0.001%, 0.005%, 0.010%, 0.050%, 0.100%, 0.500% and 1.000%. For each noise probability, the test case has been repeated 20 times. The average values of the optimised objective function and its relative error are then compared with the results in Table 3, which come from a normal solution without artificial numerical noise. Similarly, in the second set of trials, we suppose that every point might be a failed evaluation, depending on the failure probability. The considered failure probabilities are 0.050%, 0.100%, 0.500%, 1.000%, 5.000%, 10.000% and 50.000%.
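
The two kinds of artificial disturbance can be sketched as a thin wrapper around any response function. This is only an illustration of the trial set-up described above: the names `perturbed_evaluation` and `EvaluationFailed` are hypothetical, and the choice to signal a failed point with an exception is an assumption, not a description of the MAM codes.

```python
import numpy as np


class EvaluationFailed(RuntimeError):
    """Raised when a whole design point is treated as a failed evaluation."""


def perturbed_evaluation(evaluate, x, rng, noise_prob=0.0, failure_prob=0.0):
    """Evaluate `evaluate(x)` with artificial noise and failures.

    Probabilities are fractions, so 0.001% corresponds to noise_prob=1e-5.
    """
    # Second set of trials: the whole point fails with probability failure_prob.
    if rng.random() < failure_prob:
        raise EvaluationFailed("simulated solver failure")

    responses = np.array(evaluate(x), dtype=float)

    # First set of trials: each response independently becomes NaN.
    corrupt = rng.random(responses.shape) < noise_prob
    responses[corrupt] = np.nan
    return responses


if __name__ == "__main__":
    rng = np.random.default_rng(2024)
    toy = lambda x: [x[0] + x[1], x[0] * x[1], 1.0]  # stand-in for the beam responses
    print(perturbed_evaluation(toy, [2.0, 3.0], rng, noise_prob=0.5))
```

Because the noise is applied per response, a point with many responses is much more likely to contain at least one NaN than the per-response probability suggests.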

The detailed results of the performance comparison of the two methods in the first set of trials are given in Table 4. In these test cases, an optimisation is classified as failed if the relative error of its optimised design is greater than 1.000%. Figure 6 presents the same comparison graphically. In this set, the proposed method maintains a good performance when the noise probability is smaller than 0.500%, whereas the gradient-based method cannot obtain a good solution when the noise probability is larger than 0.005%. For the gradient-based method, the search direction mainly depends on the responses and gradient information of the current point. If some of these responses are NaN, SLSQP fails to obtain the right search direction to find a good or even feasible solution. The optimum search in the MAM is less likely to be affected by numerical noise because it is determined by solving approximate sub-optimisations. In addition, the metamodel building stage ignores the points with NaN responses, which further alleviates this issue. When the noise probability increases to 0.500% and 1.000%, almost every design point has NaN responses (with 514 responses per point and independent noise, the chance of at least one NaN already exceeds 90% at 0.500%) and neither method can find the optimum. Even in these situations, the proposed method still obtains a better solution than the selected gradient-based method.

Table 4 Results of performance comparison in cases of different levels of numerical noise

Fig. 6 Performance comparison in cases of different levels of numerical noise

In the second set of trials, we compared the optimisation performance in cases of possible failed evaluations. The results are shown in Table 5 and Fig. 7. The gradient-based method is less sensitive to failed evaluations than to numerical noise. As mentioned earlier, if there is a failed evaluation, the gradient-based method reduces the search step and relocates a new design point. SLSQP can always find a successful evaluation from successive tries in one iteration if the failure probability is small. When the failure probability is 50.000%, which is an extreme situation, the gradient-based optimisation may terminate prematurely because no successful evaluation is found in one iteration. It has 10 failed optimisations in 20 test cases. The metamodel-based method, however, still obtains a good solution in this extreme situation, with a relative error in the objective function of 0.0531%.
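
The step-reduction behaviour described above can be sketched as follows. This is a generic backoff loop written for illustration; the function name `step_with_backoff`, the halving factor and the cap on the number of tries are assumptions, and the actual line search inside SLSQP is more elaborate.

```python
import math


def step_with_backoff(evaluate, x, direction, step=1.0, shrink=0.5, max_tries=10):
    """Try x + step * direction, shrinking the step after each failed evaluation.

    `evaluate` returns a float, or None / NaN when the evaluation fails.
    Returns (trial_point, value, tries); the point is None if every try failed.
    """
    for attempt in range(max_tries):
        trial = [xi + step * di for xi, di in zip(x, direction)]
        value = evaluate(trial)
        if value is not None and not math.isnan(value):
            return trial, value, attempt + 1
        step *= shrink  # relocate closer to the current point and retry
    return None, None, max_tries


if __name__ == "__main__":
    # Hypothetical solver that fails whenever the first coordinate exceeds 1.
    solver = lambda p: None if p[0] > 1.0 else p[0] ** 2
    print(step_with_backoff(solver, [0.0], [4.0]))
```

If failures were independent with probability p, the chance that all n tries in one iteration fail would be p to the power n, which is negligible for small p but not for p = 50%. This is consistent with the premature terminations reported for that extreme case.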

Table 5 Results of performance comparison in cases of possible failed evaluations

Fig. 7 Performance comparison in cases of possible failed evaluations

It should be noted that the influences of numerical noise and failed evaluations on optimisation performance are certainly more complicated in practical engineering problems, and that other gradient-based methods such as nonlinear programming with non-monotone and distributed line search (NLPQLP) (Schittkowski 2011) and Sparse Nonlinear OPTimiser (SNOPT) (Gill et al. 2005) might perform more robustly. Nevertheless, based on the discussion above, the results of these trials do demonstrate the robustness of the proposed method to some extent.

2.5 Aero-structural analysis methodology

To conduct a high-fidelity aero-structural analysis as the original simulation model, this work couples a RANS-based CFD solver and a structural finite-element solver. Then the adjoint method is used to efficiently compute the derivatives of the response functions with respect to a large number of design variables for gradient-assisted metamodels.

An open-source package, ADflow (Mader et al. 2020), is used as the aerodynamic solver to compute the aerodynamic loads. ADflow solves the steady RANS equations on structured multi-block grids. The single-equation Spalart–Allmaras (SA) turbulence model is used in this work. ADflow is chosen mainly because it implements a discrete adjoint method via automatic differentiation (AD) to compute the gradient information, and it works efficiently in a parallel computational environment (Kenway et al. 2019).

For the structural solver, the toolkit for analysis of composite structures (TACS) (Kennedy and Martins 2014) is used to compute the structural displacements. TACS is an integrated parallel finite-element analysis (FEA) tool that also provides an efficient adjoint method for gradient-based optimisations. It is mainly developed for the design of thin-shell structures, especially in aerospace applications where strength, weight and stiffness are the primary concerns.

An in-house Python package, pyMAMAS, is used as the coordinator in the aero-structural analysis methodology. This package serves 3 purposes. Firstly, as shown in Fig. 8, pyMAMAS calls ADflow and TACS sequentially through a tight coupling scheme (Hurka and Ballmann 2001; Martins et al. 2005) to conduct an aero-structural analysis. In this coupling scheme, the aerodynamic solver is called first and partly converged. Then the aerodynamic forces are transferred to the structural solver, which performs a complete structural analysis. Next, the structural displacements are transferred back to update the aerodynamic model. This loop is repeated until the residuals in both disciplines are reduced to a certain level relative to the initial residuals. To ensure a robust and converged solution, this paper sets the relative tolerance that defines a partly converged aerodynamic solution to \(10^{-1}\) and requires the residuals in both disciplines to decrease by 6 orders of magnitude (a relative reduction of \(10^{-6}\)). With these settings, an aero-structural analysis normally takes 15 to 30 iterative loops to reach a converged solution. During the analysis, an RBF interpolation method is used to transfer aerodynamic forces to the structural solver and structural displacements to the aerodynamic surface while following the principle of virtual work. This re-uses the RBF part of the MAM codes. Finally, a mesh deformation algorithm using inverse distance weighting (Witteveen and Bijl 2009) spreads the surface changes to the CFD volume grid.
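
The structure of this coupling loop can be illustrated on a toy problem with one scalar load and one scalar displacement. The sketch below is not pyMAMAS: the function `coupled_analysis`, the linear "aerodynamic" and "structural" models and all parameter values are assumptions chosen only so that the loop is runnable. It keeps the two ingredients named above, a partly converged aerodynamic step (relative tolerance \(10^{-1}\)) followed by a complete structural solve, repeated until both residuals drop by a relative factor of \(10^{-6}\).

```python
def coupled_analysis(q=2.0, alpha=1.0, c=0.1, k=5.0,
                     aero_rel_tol=1e-1, rel_tol=1e-6, max_loops=100):
    """Toy tightly coupled analysis; returns (load, displacement, loops)."""
    load, disp = 0.0, 0.0

    def residuals(load, disp):
        aero = load - q * (alpha + c * disp)  # toy aerodynamic residual
        struct = k * disp - load              # toy structural residual
        return abs(aero), abs(struct)

    reference = max(residuals(load, disp))
    for loop in range(1, max_loops + 1):
        # Partly converge the aerodynamic state with damped fixed-point sweeps.
        start = residuals(load, disp)[0]
        while residuals(load, disp)[0] > aero_rel_tol * start:
            load += 0.5 * (q * (alpha + c * disp) - load)

        # Complete structural analysis under the transferred load.
        disp = load / k

        # The updated displacement changes the aerodynamic residual again.
        if max(residuals(load, disp)) <= rel_tol * reference:
            return load, disp, loop
    raise RuntimeError("coupling loop did not converge")


if __name__ == "__main__":
    print(coupled_analysis())
```

In the real analysis the transfer of loads and displacements goes through the RBF interpolation and the mesh deformation step, which the toy model omits.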

Fig. 8 The flow chart of aero-structural analysis

Secondly, based on the converged solution, pyMAMAS assembles the gradient information from the different solvers. Each solver, ADflow or TACS, can readily compute the gradient information within its own discipline, such as the derivatives of the aerodynamic responses with respect to the aerodynamic variables and the derivatives of the structural responses with respect to the structural variables. To compute the derivatives of the aerodynamic responses with respect to the structural variables, or vice versa, pyMAMAS takes into account the data exchange part of the RBF interpolation and assembles the derivatives from the different parts by using the chain rule.
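
How a cross-discipline derivative arises from single-discipline pieces can be shown on a two-state toy system. The sketch below uses a generic coupled-adjoint formulation, which is an assumption made for illustration and not necessarily how pyMAMAS assembles its derivatives; the load \(w\) stands for an aerodynamic response, the stiffness \(k\) for a structural variable, and the functions `solve_states` and `dload_dstiffness` are hypothetical.

```python
import numpy as np


def solve_states(k, q=2.0, alpha=1.0, c=0.1):
    """Solve R_a = w - q (alpha + c u) = 0 and R_s = k u - w = 0 for (w, u)."""
    jacobian = np.array([[1.0, -q * c], [-1.0, k]])
    rhs = np.array([q * alpha, 0.0])
    w, u = np.linalg.solve(jacobian, rhs)
    return w, u


def dload_dstiffness(k, q=2.0, alpha=1.0, c=0.1):
    """Total derivative of the load w with respect to the stiffness k."""
    w, u = solve_states(k, q, alpha, c)
    dR_dy = np.array([[1.0, -q * c], [-1.0, k]])  # rows: R_a, R_s; columns: w, u
    df_dy = np.array([1.0, 0.0])                  # response f = w
    psi = np.linalg.solve(dR_dy.T, df_dy)         # adjoint variables
    dR_dk = np.array([0.0, u])                    # only R_s depends on k explicitly
    return float(-psi @ dR_dk)                    # the explicit term df/dk is zero


if __name__ == "__main__":
    k = 5.0
    step = 1e-6
    finite_difference = (solve_states(k + step)[0] - solve_states(k - step)[0]) / (2 * step)
    print(dload_dstiffness(k), finite_difference)
```

The off-diagonal entries of `dR_dy` play the role of the data exchange terms: without them the load would not depend on the stiffness at all.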

Finally, since the MAM programme is written in Fortran while the user interfaces of both ADflow and TACS are written in Python, pyMAMAS provides an interface to exchange the design data, which includes transferring the variables to the aero-structural solver and collecting the responses for the MAM programme.

3 Problem definition

The details of the wing jig shape optimisation are presented in this section. To give a clear insight into the problem definition, we illustrate the optimisation in 3 parts from the view of disciplines: geometry model, aerodynamic model and structural model.

3.1 Geometry model

A single wing from the common research model (CRM) (Vassberg et al. 2008) is used to test the proposed method. The CRM is a transport aircraft configuration designed for transonic flow conditions. It is a representative and widely used case in research on aerodynamic design and aero-structural design. Figure 9 shows the CRM wing geometry, which is extracted from the original CRM wing-body configuration. The fuselage is removed, and the leading edge (LE) of the root section of the remaining wing is set as the origin of the coordinates. In this way, the semi-span of the wing is reduced to 26.327 m and the half reference area is \(167.198\,{\text{m}}^{2}\). The reference chord is 7.005 m and the moment reference point is in the same position as in the wing-body configuration, which is (8.460, 0.000, 0.054) m in the current coordinate system.

Fig. 9 CRM wing geometry

The free-form deformation (FFD) method based on the non-uniform rational B-spline (NURBS) curve (He et al. 2018) is used to parameterise the wing geometry. By embedding the initial geometry inside a control box, the wing surface can be changed by adjusting the control points of the box while keeping its geometric topology. Therefore, the locations of the control points are treated as the geometry design variables. Following the mathematical definition of the NURBS curve, the derivatives of the wing geometry with respect to the control points can be computed analytically.

Figure 10 shows the control box used in this work. The control points, shown as orange spheres, are distributed in 11 sections along the span-wise (Y) direction. Each section has 10 control points along the chord-wise (X) direction on each of the upper and lower surfaces. The geometry design variables include the displacements of all FFD control points in the vertical (Z) direction. In addition, every section except the one at the wing root has one twist variable to control the span-wise twist distribution. Twist is applied by rotating all control points of the selected section around the Y-axis at the leading edge of that section.

Fig. 10 FFD control box

To maintain the wing volume, 110 thickness constraints, shown as thin red lines in Fig. 10, are considered during the optimisation. These thickness constraints are evenly distributed along the chord-wise direction in 11 sections and require the thickness at each position to be no less than that of the initial geometry. Considering that the displacement of a single control point at the leading edge or the trailing edge (TE) can affect the twist angle, this work places one LE constraint, shown as the green line in Fig. 10, and one TE constraint, shown as the blue line in Fig. 10, in every section. The LE/TE constraints make the control points at these positions move the same distance in opposite directions along the Z-axis. As a result, the twist variables are the only means of changing the span-wise twist distribution of the wing.

3.2 Aerodynamic model

The CRM is designed to fly at a cruise Mach number of M = 0.85 with a nominal lift condition of CL = 0.50 and a chord Reynolds number of Rn = 40 million. According to the 1976 U.S. standard atmosphere model (National Geophysical Data Center 1992), the corresponding flight altitude for the design cruise condition is 11.765 km. To obtain the necessary thermodynamic parameters for the aerodynamic solver, the flight altitude is used in the CFD evaluation instead of the chord Reynolds number. The final flight design condition used in this work is as follows:

$$M = 0.85,\;\;H = 11.765\;{\text{km}},\;\;C_{L} = 0.50 .$$
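
The link between the chord Reynolds number and the altitude can be checked with the isothermal layer of the 1976 standard atmosphere. The sketch below is an independent check written for this purpose: the function `cruise_reynolds`, the use of Sutherland's law for viscosity, the reading of 11.765 km as geopotential altitude and the choice of the 7.005 m reference chord as the Reynolds length are assumptions.

```python
import math


def cruise_reynolds(altitude_m=11765.0, mach=0.85, chord_m=7.005):
    """Chord Reynolds number in the isothermal layer (11 to 20 km) of the 1976 model."""
    if not 11000.0 <= altitude_m <= 20000.0:
        raise ValueError("this sketch only covers the 11 to 20 km isothermal layer")

    g0 = 9.80665          # m/s^2
    gas_constant = 287.053  # J/(kg K), dry air
    gamma = 1.4
    temperature = 216.65  # K, constant in this layer
    p_11km = 22632.06     # Pa, pressure at the base of the layer

    pressure = p_11km * math.exp(
        -g0 * (altitude_m - 11000.0) / (gas_constant * temperature))
    density = pressure / (gas_constant * temperature)
    speed_of_sound = math.sqrt(gamma * gas_constant * temperature)

    # Sutherland's law for the dynamic viscosity of air.
    viscosity = 1.458e-6 * temperature**1.5 / (temperature + 110.4)

    return density * mach * speed_of_sound * chord_m / viscosity


if __name__ == "__main__":
    print(f"Re = {cruise_reynolds():.3e}")
```

Under these assumptions the result is close to the nominal 40 million, which supports using the altitude in place of the Reynolds number as the solver input.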

Compared with the structural evaluation, the RANS-based CFD evaluation is the main time-consuming part of the aero-structural analysis. To find a suitable grid resolution for the CFD solver that could reduce the influence of grid issues on the optimisation and balance the trade-off between computational cost and accuracy, this work has performed a grid convergence study based on a series of CFD grids. As shown in Fig. 11, we have 4 levels of grids from L3 to L0, whose grid sizes vary from 0.424 million to over 217 million. These grids are built and uniformly refined in the commercial software ANSYS ICEM CFD. The mesh near the wall has been reset in every grid to make the maximum dimensionless wall distance \(y^{+}\) less than or equal to one.

Fig. 11 CFD grids for the grid convergence study

Table 6 shows the exact grid sizes of different grids and their aerodynamic results. With the increase in grid resolution, the spatial discretisation error asymptotically approaches zero as the grid spacing h approaches zero, and the drag coefficient CD decreases accordingly. The zero-spacing drag coefficient has also been computed by using the Richardson extrapolation method (Vassberg et al. 2014). The difference in drag coefficient among the zero-spacing grid, L0 grid, L1 grid and L2 grid is less than 1 count, which indicates that the L0, L1 and L2 grids all give results of reasonable accuracy. However, the computational costs, including computer memory and computing time, grow dramatically as the grid size increases. To have a good compromise between fidelity and cost, the L2 grid is used in the wing jig shape optimisation. Notice that in this paper, all the drag coefficients are given in drag count, which is a dimensionless quantity and defined as \(C_{{D\,{\text{count}}}} = C_{D} \times 10^{4}\).

Table 6 Results of the grid convergence study
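
A zero-spacing estimate of this kind can be sketched with the textbook three-grid form of Richardson extrapolation. The function `richardson_extrapolate` below is a generic illustration and may differ in detail from the procedure of Vassberg et al. (2014); the refinement ratio of 2 in grid spacing is an assumption, although it is consistent with the quoted grid sizes, which grow by roughly a factor of 512 over three refinements. The sample values are synthetic and are not the entries of Table 6.

```python
import math


def richardson_extrapolate(f_coarse, f_medium, f_fine, ratio=2.0):
    """Observed order of accuracy and zero-spacing estimate from three grids.

    The grids are assumed to be refined by a constant `ratio` in grid spacing
    and to lie in the asymptotic range, so that f(h) ~ f0 + C * h**p.
    """
    order = math.log((f_coarse - f_medium) / (f_medium - f_fine)) / math.log(ratio)
    f_zero = f_fine + (f_fine - f_medium) / (ratio**order - 1.0)
    return order, f_zero


if __name__ == "__main__":
    # Synthetic drag counts following f(h) = 180 + 3 h^2 at h = 4, 2 and 1.
    order, cd_zero = richardson_extrapolate(228.0, 192.0, 183.0)
    print(order, cd_zero)
```

The formula only makes sense when the three values change monotonically; otherwise the logarithm is undefined and the grids are not yet in the asymptotic range.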

Finally, the design objective, constraints and variables in the aerodynamic model are described. From the view of the aerodynamic discipline, the optimisation minimises the drag coefficient subject to a lift constraint (CL = 0.5) and a pitching moment constraint (CM ≥ −0.19). The angle of attack (AoA) is treated as a design variable to satisfy the lift constraint. The pitching moment constraint limits the pitching moment coefficient to be greater than a given value, which ensures that the trim drag from the flight control surfaces does not greatly increase due to the change in the aerodynamic performance of the optimised wing. Based on the results shown in Table 6, a slightly strict bound of −0.19 is imposed on the pitching moment coefficient.

3.3 Structural model

The structural model used in this work is simplified from the full-scale semi-span wingbox structures on the NASA CRM official website (National Aeronautics and Space Administration 2015). As shown in Fig. 12, the simplified wingbox structures consist of 2 spars, 40 ribs, 1 upper skin and 1 lower skin. The LE spar is straight and located at the 10% local chord. The TE spar turns at the Yehudi break and is set at the 70% local chord in both the inner wing and the outer wing. The ribs in the root section and tip section are parallel to the flow direction, while the other 38 ribs are evenly distributed along the span-wise direction, perpendicular to the leading edge. The root rib is clamped as a fixed support for the rest of the wingbox structures.

Fig. 12 Layout of the wingbox structures

We use the same commercial mesh generator, ANSYS ICEM CFD, in the structural model to generate the surface mesh of the wingbox geometry. The left part of Fig. 13 gives an exploded view of the finite-element mesh. It is composed of 82,214 elements with a total of 480,492 degrees of freedom (DOF). To reduce the complexity of the problem, all finite elements are modelled with the same material properties listed in Table 7. These properties are derived from the 2024 aluminium alloy, a material that has been widely used in aircraft, especially in wing and fuselage structures, due to its high strength and good fatigue resistance.

Fig. 13 Exploded view of the wingbox structures

Table 7 Material properties of 2024 aluminium alloy

From the view of the structural discipline, the optimisation seeks the minimum weight of the wingbox structures subject to the yield strength constraints of all finite elements, using the thicknesses of all finite elements as design variables. Since there are 82,214 finite elements, it is impractical to pass the stress or thickness of every single finite element to the optimiser as a separate design constraint or variable. The yield strength constraints are aggregated into 5 groups by using the Kreisselmeier-Steinhauser (KS) function (Kennedy and Martins 2014) to represent the strength properties of the different components of the wingbox structures: one each for the LE and TE spars, one for all ribs and one each for the upper and lower skins. The structural stress of every element is computed from the 1 g cruise case. These stresses are uniformly multiplied by 2.5 in the formulation of the yield strength constraints to simulate the 2.5 g limit load case, which brings the wing structure sizing closer to real-life engineering design. A more detailed grouping, shown in the right part of Fig. 13, is used for the thickness variables. It treats the thickness of the finite elements in every rib as one design variable and the thickness of every wing skin or spar plate between two adjacent ribs as one design variable. In this way, the structural model has 192 design variables, consisting of 40 rib thickness variables, 39 LE spar thickness variables, 35 TE spar thickness variables, 39 upper skin thickness variables and 39 lower skin thickness variables.
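
The aggregation step can be sketched with the standard form of the KS function. The block below is illustrative only: the aggregation parameter `rho = 50`, the normalisation of stress by the yield stress and the function names are assumptions, and the settings actually used with TACS are not stated in the text above.

```python
import numpy as np


def ks_aggregate(values, rho=50.0):
    """Kreisselmeier-Steinhauser envelope: a smooth, conservative maximum."""
    values = np.asarray(values, dtype=float)
    vmax = values.max()
    # Shifting by the maximum keeps the exponentials from overflowing.
    return float(vmax + np.log(np.sum(np.exp(rho * (values - vmax)))) / rho)


def yield_constraint(stress_1g, yield_stress, load_factor=2.5, rho=50.0):
    """Aggregated yield constraint of one component group.

    The 1 g stresses are scaled by `load_factor` to mimic the 2.5 g limit
    load case; the group is feasible when the returned value is <= 1.
    """
    ratios = load_factor * np.asarray(stress_1g, dtype=float) / yield_stress
    return ks_aggregate(ratios, rho)


if __name__ == "__main__":
    rib_stress = np.array([40.0e6, 75.0e6, 110.0e6])  # hypothetical 1 g stresses, Pa
    print(yield_constraint(rib_stress, yield_stress=324.0e6))
```

The KS value always lies between the true maximum and the maximum plus \(\ln(n)/\rho\), where n is the number of aggregated elements, so the estimate becomes more conservative as a group contains more elements.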

3.4 Initial jig shape

Considering that the CRM only gives a good cruise shape of a transonic aircraft configuration, we need to find an appropriate starting point for the wing jig shape optimisation. In this paper, an inverse design method (Liu et al. 2013) is applied to extract an initial jig shape from the CRM wing geometry. This initial jig shape will be used as a starting point in the following optimisation cases. The detailed inverse design process is shown as follows:

1. Remove the aerodynamic loads from the target shape to get a raw jig shape.

2. Obtain the deformed shape by conducting the aero-structural analysis on the raw jig shape.

3. Compare the aerodynamic performance between deformed shape and target shape. If their performances are close enough, stop the inverse design process and use the current raw jig shape as the final solution. Otherwise, go to step 4.

4. Adjust the wing twist distribution of the raw jig shape according to the difference in wing twist distribution between deformed shape and target shape. Then go to step 2.

In step 1, the discipline models are rebuilt based on the raw jig shape, which means a new FFD box, a new CFD grid and a new finite-element mesh would be built. By using the twist variables in the new FFD box, the inverse design method could easily change the wing twist distribution of the raw jig shape to obtain a new jig shape for the next iteration. At the same time, the FFD method updates the CFD surface mesh and the finite-element mesh. Then the mesh deformation algorithm transfers the surface changes to the CFD volume grid. In this way, all the necessary inputs for the aero-structural analysis in step 2 have been updated. The design process repeats the operations from step 2 to step 4 until the aerodynamic performance between deformed shape and target shape is close enough or the same.
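
The loop through steps 2 to 4 can be sketched as a fixed-point iteration on the section twists. The block below is a toy illustration under several assumptions: `inverse_design` and the linear `deform` model are hypothetical, the starting jig simply copies the target instead of removing aerodynamic loads as in step 1, and the stopping test compares twist rather than aerodynamic performance.

```python
def inverse_design(target_twist, deform, tol=1e-6, max_iter=50):
    """Find jig twists whose deformed twists match the target twists.

    `deform(jig)` stands in for the aero-structural analysis and returns the
    deformed twist of every section for a given list of jig twists.
    """
    jig = list(target_twist)  # stand-in for the raw jig shape of step 1
    for iteration in range(1, max_iter + 1):
        deformed = deform(jig)                                   # step 2
        gap = [t - d for t, d in zip(target_twist, deformed)]
        if max(abs(g) for g in gap) < tol:                       # step 3
            return jig, iteration
        jig = [j + g for j, g in zip(jig, gap)]                  # step 4
    raise RuntimeError("inverse design did not converge")


if __name__ == "__main__":
    # Toy elastic response: the wing washes out more towards the tip.
    deform = lambda jig: [0.9 * j - 0.5 * i for i, j in enumerate(jig)]
    jig, iterations = inverse_design([2.0, 1.0, 0.0], deform)
    print(jig, iterations)
```

The update in step 4 converges quickly when the deformation depends only weakly on the jig twist, which is the situation the toy model mimics.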

In this work, the target shape is the CRM wing geometry. The deformed shape is obtained from the aero-structural solution of the jig shape. The aerodynamic loads in step 1 are extracted from the CFD evaluation of the target shape under the design condition described in Sect. 3.2. Under these aerodynamic loads, the structural sizing shown in Fig. 14 comes from a structural optimisation that uses the settings of the structural model described in Sect. 3.3. The inverse design method took 3 iterations to reach a converged solution. A comparison of the geometry among the target shape, the final jig shape and its deformed shape is shown in Fig. 15. The weight of the wingbox structures in the final jig solution is 16,851.66 kg.

Fig. 14 Structural sizing of the initial jig shape

Fig. 15 Comparison of aerodynamic performance between deformed shape and target shape

Figure 15 also shows the comparison of aerodynamic performance between deformed shape and target shape. We can see that their distributions of pressure coefficient CP show good consistency even if there is a 0.009 deg difference in AoA and a 0.36 count difference in the drag coefficient. Six typical airfoil sections have been selected along the span-wise direction to give a detailed comparison. Although the deformed shape shows a discrepancy in the spatial position at the outer wing, the CP distributions at these sections are nearly identical. Different from the target shape, the aerodynamic performance of the deformed shape is from an aero-structural analysis which needs cooperation among the aerodynamic solver, the structural solver and the data exchange interface. Numerical errors in the structural solver and the data exchange interface could be the main reason for the differences in aerodynamic performance. However, these differences are minor and it is acceptable to use this jig solution as an initial point in the wing jig shape optimisation.

3.5 Optimisation overview

The wing jig shape optimisation has taken all design objectives, design constraints and design variables from the geometry model, aerodynamic model and structural model into consideration. To suit the gradient-based optimiser SLSQP, a global equivalent drag coefficient is treated as the objective in the optimisation, which is a combined function of aerodynamic drag coefficient and structural weight. The global equivalent drag coefficient CDeq, count is defined as follows:

$$C_{D\;{\text{eq}},\;\text{count}} = C_{D\;A,\;\text{count}} + C_{D\;S,\;\text{count}}, \tag{11}$$

where CD A, count and CD S, count are aerodynamic equivalent drag coefficient and structural equivalent drag coefficient, respectively. These equivalent drag coefficients are all given in drag count. CD A, count is equal to the aerodynamic drag coefficient CD count. And CD S, count is converted from the structural weight using the following formulation:

$$C_{D\;S,\;\text{count}} = \frac{W}{{\omega_{S} }}, \tag{12}$$

where W is the wingbox structural weight in kilograms, \(\omega_{S}\) is the equivalent factor that determines how many kilograms of structural weight would be equivalent to 1 drag count. In each iteration, the proposed method will build metamodels for CD A, count and CD S, count firstly and then use the combination of their metamodels to replace CD eq, count in the process of solving approximate sub-optimisation problems.

The choice of the equivalent factor significantly influences the search direction and the optimised results. In this work, following the mission analysis in the reference (Kenway et al. 2012), 1 drag count is assumed to be equivalent to 790 kg of structural weight, which means \(\omega_{S} = 790\,{\text{kg}}\). This assumption derives from the Breguet range equation by using the fuel burn as the implicit objective based on the CRM wing-body-tail configuration. To some extent, the global equivalent drag coefficient CD eq, count indicates the fuel consumption of the designed configuration in the cruise condition.
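
Equations (11) and (12) amount to a one-line conversion, sketched below. The function names are hypothetical; the only data used are the equivalent factor of 790 kg per count and, in the usage example, the initial wingbox weight from Sect. 3.4.

```python
OMEGA_S_KG_PER_COUNT = 790.0  # equivalent factor assumed in the text


def drag_counts(cd):
    """Convert a drag coefficient to drag counts (1 count = 1e-4)."""
    return cd * 1.0e4


def equivalent_drag(cd_aero_count, wingbox_weight_kg, omega_s=OMEGA_S_KG_PER_COUNT):
    """Global equivalent drag coefficient in counts, Eqs. (11) and (12)."""
    cd_struct_count = wingbox_weight_kg / omega_s   # Eq. (12)
    return cd_aero_count + cd_struct_count          # Eq. (11)


if __name__ == "__main__":
    # Structural share of the initial jig shape (weight from Sect. 3.4).
    print(16851.66 / OMEGA_S_KG_PER_COUNT)
```

For the initial jig shape the structural term is about 21.33 counts, so a change of 0.1 counts in the objective corresponds to 79 kg of structure.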

Table 8 gives an overview of the problem definition for the wing jig shape optimisation. The considered problem consists of 1 design objective, 139 design constraints and 423 design variables, which is a suitable case to test the performance of the proposed method in large-scale aero-structural design optimisations.

Table 8 Overview of the wing jig shape optimisation
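
The totals quoted for the problem size can be cross-checked against Sects. 3.1 to 3.3. The breakdown below is inferred from those sections, since the contents of Table 8 are not reproduced here, and the dictionary names are purely illustrative.

```python
# Design variables, counted from the descriptions in Sects. 3.1 to 3.3.
variables = {
    "FFD vertical displacements": 11 * 10 * 2,  # 11 sections, 10 points, 2 surfaces
    "twist": 11 - 1,                            # every section except the root
    "angle of attack": 1,
    "structural thickness": 40 + 39 + 35 + 39 + 39,
}

# Design constraints, counted the same way.
constraints = {
    "thickness": 110,
    "leading edge": 11,
    "trailing edge": 11,
    "lift": 1,
    "pitching moment": 1,
    "KS yield strength": 5,
}

total_variables = sum(variables.values())
total_constraints = sum(constraints.values())

if __name__ == "__main__":
    print(total_variables, total_constraints)
```

Both sums agree with the 423 design variables and 139 design constraints stated above.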

4 Results and discussion

This section illustrates the results of the wing jig shape optimisation shown in Table 8. In this work, we have conducted 2 optimisation cases. Both cases start from the initial jig shape obtained in Sect. 3.4. The first one is a gradient-based optimisation, which is driven by the SLSQP optimiser in the MAM programme without the use of metamodels and trust-region strategy. This case is used to verify the coupling of the aero-structural analysis methodology illustrated in Sect. 2.5 and as a benchmark case to give an insight into the performance of the proposed method. The second case is a metamodel-based optimisation, which utilises the improved MAM to get the optimised wing. The MAM settings are the same as the ones in Sect. 2.4. The relative size of the initial trust-region is set to 25.0%. The initial jig shape obtained in Sect. 3.4 is used as the centre of the initial trust-region and the whole design space.

These optimisation jobs are performed on a heterogeneous high-performance computing (HPC) cluster with 504 cores. The MAM programme uses all 504 cores to build and evaluate metamodels in parallel. In the aero-structural analysis, to obtain a relatively balanced computing performance, the aerodynamic solver is allocated 480 cores and the structural solver 24 cores.

Table 9 compares the optimisation results of the gradient-based optimisation and the metamodel-based optimisation. The difference in the design objective CD eq, count is 0.20 counts, a relative error of about 0.10%. In the gradient-based optimisation, the global equivalent drag coefficient CD eq, count is improved by 4.84%: it is reduced by 9.35 counts, from 193.21 counts to 183.86 counts. In the metamodel-based optimisation, CD eq, count is reduced by 9.15 counts, from 193.21 counts to 184.06 counts. The values of the aerodynamic equivalent drag coefficient CD A, count in the two cases are close to each other, at 163.12 counts and 163.15 counts, respectively. The main difference comes from the structural discipline, where there is a 0.17 count difference in the equivalent drag coefficient, equal to 137.48 kg of structural weight. Since 1 drag count is assumed to be equivalent to 790 kg of structural weight in the formulation of the combined design objective, a slight difference in the structural equivalent drag coefficient corresponds to a large gap in the structural weight. This assumption makes the optimisations focus mainly on improving the aerodynamic drag coefficient. Moreover, the initial structural sizing has already been optimised in a separate structural optimisation, which could be another reason why the improvement in the structural equivalent drag coefficient is smaller than that in the aerodynamic one.

Table 9 Comparison of optimisation results and computation costs

Table 9 also shows the computational costs of the two optimisation cases. Considering the unbalanced computing performance among different cores and the different calculation times required for different designs, we provide the number of function evaluations, the number of gradient evaluations and the overall elapsed time as indicators to give a rough estimate of the computational cost. The results show that the metamodel-based optimisation requires fewer iterations but a higher computational cost. If one function evaluation is assumed to cost the same as one gradient evaluation, the overall cost of response evaluations in the metamodel-based optimisation is approximately 1.88 times that of the gradient-based optimisation. The overall elapsed time of the metamodel-based optimisation is approximately 1.98 times that of the gradient-based optimisation. This shows that the additional costs of the other parts of the MAM, such as metamodel building and evaluation, are a small part of the overall computational cost. In the gradient-based optimisation, there can be multiple function evaluations and one gradient evaluation in each iteration, since the SLSQP optimiser may perform multiple searches in one iteration to find a better design. For the metamodel-based optimisation, in contrast, both a function evaluation and a gradient evaluation are needed for every design point so that there is enough information to build gradient-assisted metamodels. This is why, in Table 9, the number of function evaluations is larger than the number of gradient evaluations in the gradient-based optimisation, while the two are equal in the metamodel-based optimisation.

To better understand the difference in the computational cost, the convergence history for equivalent drag coefficients in the gradient-based optimisation (green line) and the metamodel-based optimisation (blue line) is plotted in Fig. 16. From the view of the global equivalent drag coefficient, the convergence history in the gradient-based optimisation is smooth, indicating that there is little or no numerical noise in the objective and constraint function values. This is plausible since the CRM wing is a benchmark model and has been widely used for aerodynamic shape optimisation and wing jig shape optimisation (Kenway et al. 2012; Lyu et al. 2015; Keye et al. 2017; Bartels and Stanford 2018). Moreover, no failed evaluation has been detected in either optimisation case, which supports the effectiveness of the aero-structural analysis methodology used in this work. Under these circumstances, the gradient-based optimiser could be the most efficient method, although different gradient optimisers with different settings may require different computational costs. On the other hand, to reduce the possibility of local optimality, 3 approximate sub-optimisations have been conducted in every iteration of the metamodel-based optimisation. Every optimum from an approximate sub-optimisation is evaluated by the aero-structural solver, which brings extra computational costs for the metamodel-based optimisation. The gradient-based optimisation would need approximately 3 times its computational cost to do the same thing. If there are high levels of numerical noise or failed evaluations, the required computational costs might be larger. However, based on the results shown in Table 9, the metamodel-based optimisation using the improved MAM has solved the problem at less than twice the computational cost of the gradient-based optimisation.

Fig. 16 Convergence history for equivalent drag coefficients in different optimisations

The light blue dashed line in Fig. 16 shows the approximated function values of these equivalent drag coefficients. In the early stage, there is a large discrepancy between the approximated and the original values of the aerodynamic equivalent drag coefficient. As the optimisation proceeds, all the equivalent drag coefficients are predicted fairly accurately, since the MAM searches in every iteration for a trust-region in which the metamodels are of good quality. Note also that, in the metamodel-based optimisation, each point of the convergence history in Fig. 16 represents 3 complete approximate sub-optimisations: it is the best design point selected from the results of the approximate sub-optimisations in one iteration. If the metamodel quality is poor, extremely bad designs can occur, as in the early iterations. If the metamodel quality is good, the optimisation can finish in fewer iterations.

To give the details of the design process in the MAM, Fig. 17 shows the convergence histories of the trust-region size, the metamodel quality and the response with the maximum error in the metamodel-based optimisation. The trust-region size was reduced considerably at the beginning, until the metamodel quality became good. The design objective CD eq, count then improved smoothly while the trust-region size remained unchanged or was slightly reduced. In most iterations, the metamodel quality fluctuated around 1.0 × 10⁻³, which ensured that the optimisation moved in the right direction. In the final stage, the optimisation oscillated around the optimal point. To end this oscillation, the MAM kept reducing the trust-region size until the optimisation converged normally.

Fig. 17 Convergence history of trust-region size, metamodel quality and response with maximum error in the metamodel-based optimisation

The third panel in Fig. 17 records the history of the response with the maximum error. The Y-axis of this panel is the response index, which is assigned by the programme to list the responses used in the optimisation. The value of a response index mainly depends on the user settings and the discipline coupling mechanism. Each response index represents a response, which can be an objective or a constraint from any of the discipline models. The response whose metamodel has the maximum error can differ from iteration to iteration. In this panel, however, only two response indices are labelled with response names, the aerodynamic equivalent drag coefficient and the yield strength constraints, since the maximum error occurs only in these responses. It is difficult to build a good metamodel for the aerodynamic equivalent drag coefficient in the first few iterations, whereas the yield strength constraints are the main barrier to building good metamodels throughout the whole optimisation. This makes it difficult to find a feasible design with low structural weight in the metamodel-based optimisation, which is another reason why the major difference in the design objective between the two optimised designs comes from the structural discipline. As described in Sect. 3.3, the strength properties of all 82,214 finite elements in the structural model are aggregated into 5 yield strength constraints by using the KS function. Each yield strength constraint represents a conservative estimate of the maximum stress within a set of about one-fifth of the 82,214 finite elements. A change in the stress of a single finite element may substantially change the value of the corresponding constraint, and the more finite elements one constraint contains, the more difficult it is to build a metamodel of good quality. To address this issue, the MAM uses a small trust-region to simplify the response model, and therefore more iterations are required to reach the final optimal design. If the complexity of the yield strength constraint functions could be reduced, for example by applying a more specific classification method like the one used for the structural thickness variables, it would be easier to build good metamodels, which would reduce the number of iterations and the computational cost. However, this would increase the number of structural constraints and thus add computational cost, since more adjoint equations would need to be solved in the gradient evaluation.
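The conservative character of the KS aggregation, and its sensitivity to a single element, can be illustrated with a minimal sketch. The function name `ks_aggregate`, the aggregation parameter `rho = 50` and the four sample values are assumptions for illustration; they are not taken from the structural model used in this work.

```python
import math


def ks_aggregate(values, rho=50.0):
    """Kreisselmeier-Steinhauser (KS) aggregate of a list of values.

    The maximum is subtracted before exponentiation to avoid overflow.
    The result always satisfies
        max(values) <= KS <= max(values) + ln(n) / rho,
    so it is a smooth, conservative estimate of the maximum.
    """
    v_max = max(values)
    shifted_sum = sum(math.exp(rho * (v - v_max)) for v in values)
    return v_max + math.log(shifted_sum) / rho


# Hypothetical normalised stresses of four elements in one group.
stresses = [0.60, 0.75, 0.90, 0.95]
print("max =", max(stresses), " KS =", ks_aggregate(stresses))

# Raising the stress of a single element shifts the whole aggregate,
# which is why one element can strongly change the constraint value.
perturbed = [0.60, 0.75, 0.90, 1.05]
print("max =", max(perturbed), " KS =", ks_aggregate(perturbed))
```

The upper bound `ln(n) / rho` grows with the number of aggregated elements `n`, so for a fixed `rho` a constraint that covers more elements can be more conservative.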

A detailed comparison of the aerodynamic performance of the initial design (red), the gradient-based optimised design (green) and the metamodel-based optimised design (blue) is given in Fig. 18. The left part of the figure includes CP contours on the upper wing surface, visualised isosurfaces of the shock region (Haimes 1999) and front views of the cruise shape and the jig shape. The right part shows CP distributions and airfoil shapes at six span-wise sections. In both optimisations, the lift constraint is satisfied and the pitching moment reaches the constraint boundary. The shock region is dramatically reduced and the optimised designs have smooth CP distributions. Although there are slight differences in the CP contours near the wing root, the aerodynamic performance of the gradient-based and metamodel-based optimised designs is consistent. The difference in the drag coefficient is 0.03 counts, and the CP distributions and airfoil shapes are nearly identical. These results indicate that the metamodel-based optimisation using the improved MAM can achieve the same design goal in terms of aerodynamic performance.

Fig. 18 Comparison of aerodynamic performance among initial design (red), gradient-based optimised design (green) and metamodel-based optimised design (blue). (Color figure online)

Similarly, Fig. 19 compares the structural thickness distributions. The initial thickness distribution comes from a structural optimisation whose external loads are generated by an aerodynamic evaluation of the initial design. In the aero-structural analysis of the initial design, some yield strength constraints are still violated, and the maximum normalised value of the violated yield strength constraints is 1.01. During the optimisation, the thickness distribution is slightly adjusted to satisfy all structural constraints. Although the structural weights of the gradient-based and metamodel-based optimised designs differ by 137.48 kg, there is no significant difference in their thickness distributions.

Fig. 19 Comparison of structural thickness distribution among initial design (red), gradient-based optimised design (green) and metamodel-based optimised design (blue). (Color figure online)

5 Conclusion

This paper presents the latest developments in the multipoint approximation method (MAM), based on a gradient-assisted metamodel assembly technique within a trust-region optimisation framework, and its application to wing jig shape optimisation. It provides a robust and effective solution for high-fidelity large-scale aero-structural design optimisation problems. The MAM is an iterative optimisation technique that builds approximations in a selected trust-region. The trust-region is a subspace of the entire design space, which is translated and scaled according to the defined trust-region strategy during the design process. The gradient-assisted metamodel assembly method combines 5 simple mathematical functions, fitted by linear regression, to approximate arbitrary complex non-monotonic functions. The states of some indicators in the trust-region strategy have been subdivided and re-classified to make the MAM more flexible and controllable in different types of problems, and the judging mechanism of the trust-region strategy has been improved accordingly. To build a high-fidelity aero-structural solver, this work couples the RANS solver ADflow with the structural finite-element solver TACS. Both solvers are open-source packages, and an in-house package, pyMAMAS, is used as the coordinator to conduct the aero-structural analysis and assemble the gradient information.

In the gradient-assisted metamodel assembly method, several linear and intrinsically linear functions are chosen as individual metamodels. These metamodels can be built easily and evaluated at a low computational cost compared with classic metamodels such as radial basis functions and Kriging. Moreover, their gradient information can be computed efficiently for the gradient-based optimiser, which makes it possible to use the MAM in large-scale problems. The tuning coefficients in the metamodels are obtained by the least squares method and may be negative or positive. Inequalities derived from the responses and the gradient information are included to choose a suitable set of weight coefficients and thereby improve the metamodel quality. Parallel computing is implemented so that the MAM can build or evaluate metamodels for different response functions simultaneously.
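How gradient samples can enter a least squares fit is sketched below for a single linear metamodel in one variable. This is a simplified illustration, not the assembly procedure of this work: the function name `fit_linear_with_gradients`, the quadratic gradient penalty and its weight `gradient_weight` are assumptions, and the combination of 5 functions with weight coefficients is not reproduced.

```python
def fit_linear_with_gradients(x, f, g, gradient_weight=1.0):
    """Fit f(x) ~ a0 + a1 * x from values f and gradients g.

    Minimises  sum_k (a0 + a1*x_k - f_k)^2
             + gradient_weight * sum_k (a1 - g_k)^2
    by solving the 2 x 2 normal equations with Cramer's rule.
    """
    n = len(x)
    sx = sum(x)
    sxx = sum(xi * xi for xi in x)
    sf = sum(f)
    sxf = sum(xi * fi for xi, fi in zip(x, f))
    sg = sum(g)

    # Normal equations:  [a11 a12] [a0]   [b1]
    #                    [a21 a22] [a1] = [b2]
    a11, a12 = n, sx
    a21, a22 = sx, sxx + gradient_weight * n
    b1, b2 = sf, sxf + gradient_weight * sg

    det = a11 * a22 - a12 * a21
    a0 = (b1 * a22 - a12 * b2) / det
    a1 = (a11 * b2 - a21 * b1) / det
    return a0, a1


# Hypothetical samples of f(x) = 2 + 3x with noise in the values only.
x = [0.0, 1.0, 2.0]
f_noisy = [2.3, 4.6, 8.2]
g = [3.0, 3.0, 3.0]

print("values only   :", fit_linear_with_gradients(x, f_noisy, g, 0.0))
print("with gradients:", fit_linear_with_gradients(x, f_noisy, g, 1000.0))
```

In this sketch, a larger `gradient_weight` pulls the fitted slope towards the sampled gradients, which is one way gradient information can stabilise a fit when the function values are noisy.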

The trust-region strategy in the MAM determines the location and size of the trust-region in the next iteration according to the current optimisation state. The proposed method subdivides and re-classifies the states of some indicators to enhance the performance of the MAM. In total, this work defines 13 optimisation states based on 6 indicators: 4 stop states, 6 reduction states, 1 enlargement state and 2 keeping states. Different states lead to different adjustments of the trust-region in the next iteration.
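The translation and scaling of a trust-region can be sketched as follows. The function names, the `reduce`/`enlarge`/`keep` labels and the scaling factors are hypothetical; the 6 indicators and the logic that maps them to the 13 optimisation states are not reproduced here.

```python
def trust_region_bounds(centre, size, lower, upper):
    """Return (lo, hi) bounds of a trust-region inside the design space.

    centre : design point around which the region is placed
    size   : region width as a fraction of each variable's full range
    lower, upper : bounds of the entire design space
    The region is clipped so that it never leaves the design space.
    """
    lo, hi = [], []
    for c, l, u in zip(centre, lower, upper):
        half_width = 0.5 * size * (u - l)
        lo.append(max(l, c - half_width))
        hi.append(min(u, c + half_width))
    return lo, hi


def next_size(size, action, reduce_factor=0.5, enlarge_factor=2.0):
    """Scale the trust-region size according to a state category."""
    if action == "reduce":
        return size * reduce_factor
    if action == "enlarge":
        return min(1.0, size * enlarge_factor)  # never exceed full range
    if action == "keep":
        return size
    raise ValueError(f"unknown action: {action}")


# Hypothetical two-variable design space and current best point.
lower, upper = [0.0, 0.0], [1.0, 10.0]
centre, size = [0.5, 9.0], 0.2

print(trust_region_bounds(centre, size, lower, upper))
size = next_size(size, "reduce")
print(trust_region_bounds(centre, size, lower, upper))
```

A stop state would terminate the loop instead of calling `next_size`, so it is deliberately not handled by that function.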

In the aero-structural analysis methodology, the aerodynamic solver is ADflow, which solves the steady RANS equations with the one-equation Spalart–Allmaras turbulence model on structured multi-block grids. TACS, an integrated parallel finite-element analysis tool, is used as the structural solver. The gradient information in each discipline is computed by the adjoint method in the respective solver. The in-house Python package pyMAMAS is used as the coordinator to conduct the aero-structural analysis and assemble the gradient information from the different solvers.

A benchmark case known as the Vanderplaats scalable cantilevered beam is tested first to demonstrate the robustness of the proposed method under different levels of numerical noise and failed evaluations. The proposed method is then applied to wing jig shape optimisation. Two optimisation cases have been conducted, a gradient-based optimisation and a metamodel-based optimisation. The gradient-based optimisation is used to demonstrate the effectiveness of the aero-structural analysis methodology and then serves as a benchmark for the metamodel-based optimisation, which uses the improved MAM to obtain the optimised wing. Both optimisations start from an initial jig shape derived from the CRM wing geometry by an inverse design method. The results show that the proposed method can achieve the same design goal as the gradient-based method but with enhanced robustness and efficient performance. The difference between the two cases in the global equivalent drag coefficient, which is the design objective of the considered optimisation problem, is 0.20 counts, a relative difference of nearly 0.10%. The overall elapsed time of the metamodel-based optimisation is approximately 1.98 times that of the gradient-based optimisation. However, the proposed method is relatively insensitive to numerical noise in the objective and constraint function values and is less affected by failed evaluations during optimisation. Moreover, to reduce the possibility of converging to a local optimum, 3 approximate sub-optimisations are conducted in every iteration of the metamodel-based optimisation. The gradient-based optimisation may need approximately 3 times its original computational cost to do the same thing, and if there are high levels of numerical noise or failed evaluations, the required computational cost might be larger.