Rapid aerodynamic shape optimization under uncertainty using a stochastic gradient approach

A common approach in aerodynamic design is to optimize a performance function—provided some constraints—defined by a choice of an aerodynamic model at nominal operating conditions. Practical experience indicates that such a deterministic approach may result in considerably sub-optimal designs when the adopted aerodynamic model does not lead to accurate predictions, or when the actual operating conditions differ from those considered in the design. One approach to address this shortcoming is to consider an average or robust design, wherein the statistical moments of the performance function, given the uncertainty in the operating conditions and the aerodynamic model, is optimized. However, when the number of uncertain inputs is large or the performance function exhibits significant variability, an accurate evaluation of these moments may require a large number of function evaluations at each optimization iteration, rendering the problem significantly expensive. To tackle this difficulty, we consider a variant of the stochastic gradient descent method where in each iteration, a stochastic approximation of the objective, constraints, and their gradients is generated. This is done via a small number of forward/adjoint solutions corresponding to random selections of the uncertainties. The methodology is applied to the robust optimization of the NACA-0012 airfoil subject to operating condition and turbulence model uncertainty. With a cost that is only a small factor larger than that of the deterministic methodology, the stochastic gradient approach significantly improves the performance of the aerodynamic design for a wide range of operating conditions and turbulence models.


Introduction
The design process of an engineering structure or device requires an appropriate selection of values for the design parameters, such that the desired performance of the system is optimized given some constraints. In particular, the specification of one or more design operating conditions allows practitioners to use deterministic optimization approaches. For instance, in airfoil shape design, the target lift coefficient and structural properties are specified and the objective is to minimize drag under some constraints. However, the use of deterministic, single-point (DSP) formulations may lead to under-performing designs when operating away from the specific, often nominal, design conditions selected for the optimization (Huyse et al. 2002). Similarly, small manufacturing imperfections or fluctuations in the flight conditions may lead to considerable changes in the airfoil performance. In practice, the variability in the operating conditions, e.g., Reynolds or Mach number, cannot be completely eliminated. Tightening the manufacturing tolerances may prove prohibitively expensive or practically impossible to achieve; it is typically expensive to produce a precise design and impossible to maintain a pristine shape during accumulated routine flights. Beyond the aforementioned uncertainties, the optimal design depends on the choice of the aerodynamic model, e.g., a RANS model, and consequently an inaccurate model may lead to an overall sub-optimal design.
The sensitivity of the design performance to such uncertainties provides an incentive to pursue the so-called robust designs, wherein the optimization formulation incorporates the effects of system uncertainties. Such effects may be considered by adding constraints on the failure probability of a system. This approach is known as reliability-based optimization (RBO), and mostly uses first-or second-order Taylor series expansion to approximate the moments of the limit state function with respect to the uncertain parameters (Haldar and Mahadevan 2000). Other strategies to solve RBO problems have been proposed, including polynomial chaos expansion (Eldred and Elman 2011) and Karhunen-Loève expansion (Guest and Igusa 2008). On the other hand, robust optimization considers the impact of uncertainties directly on the objective and constraints through their second-order statistics (Padula et al. 2006), i.e., variance or standard deviation. In aerodynamic shape optimization (ASO) under uncertainty, the statistical moments and their gradients are typically estimated by means of Monte Carlo sampling (Vasile and Quagliarella 2020), genetic algorithms (Fusi et al. 2018), polynomial chaos expansion (Keshavarzzadeh et al. 2016;Keshavarzzadeh and Ghanem 2019;Mishra et al. 2020), and stochastic collocation (Petrone et al. 2011;Lazarov et al. 2012). On the one hand, Monte Carlo strategies are applicable to high-dimensional inputs as the convergence rate of the estimator is independent of the number of input variables, but are computationally expensive as the convergence decays with the number of samples as O(N −1∕2 ) ; relatively higher performances can be obtained with genetic algorithms at expenses of algorithmic/implementation complexities. On the other hand, polynomial chaos expansion and stochastic collocation strategies, while effective, are limited to relatively low-dimensional uncertain inputs (Doostan and Owhadi 2011).
Motivated by recent advances in machine learning (Bottou et al. 2018). De et al. (2020b) proposed stochastic gradient descent (SGD) methods (Bottou et al. 2018) to alleviate the computational burden for topology optimization of structural systems. The approach generates an unbiased, stochastic approximation of the gradients at each optimization iteration in a manner similar to the standard Monte Carlo method but with a small, e.g., O(1) , number of required evaluations of the gradients (and the objective/constraints). These estimates are performed statistically independently across the optimization iterations. Through their application to several problems involving high-dimensional uncertain inputs, e.g., (De et al. 2020a(De et al. , 2020b(De et al. , 2021, it was demonstrated that SGD methods produce robust designs that achieve optimal objectives similar to those obtained using Monte Carlo simulation with large sample size, but at a cost that is only a small multiple (e.g., 4× ) of that for solving the deterministic formulation of the optimization per iteration.
Building upon the work by De et al. (2020b), this paper aims to (i) extend the SGD approach to aerodynamic shape optimization under uncertainty, and (ii) demonstrate its efficacy in designing an airfoil subject to combined operating condition and model uncertainty. In particular, we consider both average and robust designs of a standard NACA-0012 airfoil in a low-Mach-number turbulent flow regime. The variability of the cruise conditions is accounted for by considering a range of Reynolds numbers. Further, to achieve a design that is less sensitive to the choice of the turbulence model, i.e., less biased by the closure assumptions/formulations, five different RANS models are considered via a discrete random variable that represents model uncertainty.
The numerical results suggest that the SGD approach is able to achieve designs that exhibit smaller variance in the presence of such uncertainties and with a cost that is only four times larger than that of a deterministic, single-point design. The rest of this manuscript is organized as follows. Section 2 formally introduces the problem of aerodynamic shape optimization and the motivation for the problem formulation, together with the corresponding sources of uncertainty. The description of the SGD strategy utilized in this work is discussed next in Sect. 3. In Sect. 4, the application of the SGD approach to robust optimization of the NACA-0012 airfoil is described and the corresponding results are discussed. Finally, in Sect. 5, the work is concluded and future directions are proposed.

Aerodynamic shape optimization
This section presents the deterministic airfoil shape optimization setup with the corresponding mathematical notation and provides a discussion on the sources of uncertainty impacting the flow predictions. Subsequently, to account for these uncertainties, the robust optimization formulation considered is introduced.

Deterministic airfoil optimization
The problem of optimizing the shape of an airfoil is generally formulated (Huyse et al. 2002) in terms of reducing the drag under the constraints of a minimum lift and a fixed total volume (or area in 2D) Ω . The drag and lift are customarily represented by their corresponding coefficients C D and C L . The minimum required lift is also expressed by means of a target lift coefficient denoted C ⋆ L . Typically, the set of variables ∈ used to optimize the performance of the airfoil are the angle of attack (AoA) , and the wall-normal positions of the control nodes defining the airfoil geometry. Then, defining the objective function f ( ) as C D , the deterministic optimization problem is formulated as where Ω 0 is the initial volume of the airfoil.

Sources of uncertainty
The aerodynamic optimization of airfoils may be affected by various sources of uncertainty. These include aleatoric uncertainty inherent to the variability in flight conditions, and model-form uncertainty resulting from the computational approaches required for rapid flow prediction in high-Reynolds-number regimes and/or complex geometrical configurations.
In the case of low-Mach-number turbulent flows, the variability in flight conditions, e.g., fluctuations in velocity and/ or fluid properties, can be grouped into the chord Reynolds number Re c = U ∞ c∕ via dimensional analysis (Jofre et al. 2020b), where U ∞ is the free-stream velocity, c is the airfoil chord length, and is the fluid's kinematic viscosity. While this uncertainty can be characterized based on experimental data obtained from flight tests and databases, it cannot be reduced as it depends on the properties of the ambient fluid in which the airfoil is operating; these are generally dictated by large-scale atmospheric phenomena and local weather conditions.
At present, flow prediction over aerodynamic airfoils under turbulent conditions utilizing scale-resolving computational approaches, such as direct numerical simulation (DNS) and large-eddy simulation (LES), is expensive. In addition to the forward flow solutions, gradient-based shape optimization techniques require information about the sensitivity of the objective function and constraints with respect to the design variables, e.g., backward/adjoint computations. Therefore, these high-fidelity computational approaches are not a practical option in optimization of computationally expensive applications. As such, design strategies often rely on numerically cheaper solutions based on Reynolds-averaged Navier-Stokes (RANS) (Wilcox 1998) and detachededdy simulation (DES) (Spalart 2009). The study and optimization of complex turbulent flows by means of RANS simulations have become routine in many scientific and engineering applications. The underlying time-averaging operation of the approach enables significant reduction of the computational requirements by resolving only the large flow scales dictated by the boundary conditions of the problem. However, the Reynolds stresses and their effects on the resolved flow field are not negligible, and therefore require additional modeling. Consequently, the assumptions made in the closure formulations become important sources of model-form uncertainty that impact the QoIs (Emory et al. Jofre et al. 2019). As a result, many RANS models tailored (initially) for specific flow regimes and configurations exist in the literature (Wilcox 1998). In the present study, therefore, this type of uncertainty is also introduced into the optimization problem by considering fundamentally different RANS models which are randomly selected based on a discrete random variable with equal probability masses.
It is important to note that, over the past years, multifidelity (MF) strategies to reduce the cost of the outer-loop problem by combining the accuracy of high-fidelity (HF) models, e.g., DNS and LES, with the speedup achieved by low-fidelity (LF) representations, e.g., RANS, DES and analytical functions/correlations, have been extensively developed (Giles 2008;Peherstorfer et al. 2018) and applied to propagate uncertainty in large-scale multiphysics turbulent flows (Jofre et al. 2020a) and optimization problems (Leary et al. 2003;Robinson et al. 2006;Forrester et al. 2007;Lam et al. 2015). As demonstrated by De et al. (2020b), the SGD approach of this study may be extended to incorporate MF aerodynamic models, a direction we leave for a future study.

Robust optimization
Developing optimization methods that result in more robust designs is of significant interest in engineering practice. The term robustness refers to a variety of goals: (i) identify designs that minimize the variability of a manufactured product in the presence of uncertainty, (ii) mitigate the detrimental effects of the worst-case performance, or (iii) obtain a uniform improvement of the performance over the entire range of operating conditions. Let ∈ ℝ n denote the vector of design variables and ∈ ℝ n the vector of random variables characterizing system uncertainties. Let f ( ; ) ∶ ℝ n × ℝ n → ℝ denote the performance function for an instance of and a sampled value of . Similarly, let g( ; ) ∶ ℝ n × ℝ n → ℝ n g be the vector of n g real-valued constraints. We say that ( ; ) satisfies the constraints if g( ; ) ≤ 0 , and refer to positive values of g( ; ) as constraint violations.
The robust optimization objective considered in this work is defined as a combination of its expected value and variance where [⋅] and [⋅] denote the mathematical expectation and variance of their arguments, respectively. Similarly, the constraint violation is expressed by where G j ( ; ) = {max 0, g j ( ; ) } 2 for j = 1, … , n g (Griva et al. 2009). Note that while f and g are random variables, respectively, R and C j are scalars as they are associated with moments taken with respect to the probability measure of . The parameters R , C,j ≥ 0 denote the importance of variations in f or g relative to their means. The effect of adding the variance to the objective and constraints is to aim for a design that shows relatively smaller variations in the two responses, f and g , even in the presence of uncertainty. Stated differently, larger values of R and C,j penalize more against designs that lead to large variations in f ( ; ) and G j ( ; ) , but may lead to sub-optimal designs with respect to [f ( ; )] optimality, as it can be seen in the results of Sect. 4.2. We denote this formulation as robust optimization formulation for R , C,j > 0 , and average optimization when R , C,j = 0 . Although one may use different values for R and C,j in Eqs. (2) and (3), we restrict the analysis to utilizing the same value in the application studied in Sect. 4; viz. these (more than 2 when j > 1 ) parameters are considered to be equal and denoted by . In practice, the choice of R and C,j is a design decision a practitioner has to make to achieve the desired level of robustness.
Finally, we are interested in solving the unconstrained optimization problem expressed as where is a user-specified vector of parameters that penalizes against constraint violation. Generally, an extensive search for is not computationally feasible. Hence, based on a few preliminary tests, the values were determined to j = 1 for the application studied in this paper.

Stochastic gradient descent approach
In standard Monte Carlo approaches, R( ) , C( ) , and their gradients are approximated utilizing N ≫ 1 , e.g., N = 10 3 , forward and adjoint solutions of the model for specific values of the design variables and N solutions of , as described in Appendix A. However, evaluating f ( , ) , g( , ) , and their gradients at each iteration N times may be computationally expensive for problems with many degrees of freedom. Instead, following De et al. (2020b), we use n s ∼ O(1) ≪ N , e.g., n s = 2, 4, 8 , random samples to estimate the expectations, variances, and gradients of the objective and constraints. Such stochastic approximations are performed independently across optimization iterations; that is, an independent set of n s samples of is used at each iteration.
The SGD method shown in Algorithm 1 uses a single realization of , i.e., i , to update the design at the kth iteration as where the gradient h k is defined as and is the step size, also known as learning rate. Hence, the computational cost of SGD is relatively low, specifically the same as the deterministic variant of the optimization at each iteration. However, the convergence of the standard SGD with n s = 1 can be slow since it may not follow the descent direction at every iteration due to the large variance of the stochastic gradients in Eq. (6).
A straightforward extension of the standard SGD method, known as mini-batch gradient descent (Bottou et al. 2018), is to use n s > 1 random samples at each iteration to estimate the gradient h k , as explained in Appendix A. In the past few years, several modifications to the SGD have been proposed to improve its convergence, e.g., the Adaptive Subgradient (AdaGrad) (Duchi et al. 2011), Adadelta (Zeiler 2012), Adaptive Moment Estimation (Adam) (Kingma and Ba 2014), Stochastic Average Gradient (SAG) (Roux et al. 2012), and Stochastic Variance Reduced Gradient (SVRG) (Johnson and Zhang 2013). The SGD algorithm employed in this work is the AdaGrad (Duchi et al. 2011), described in the next subsection and summarized in Algorithm 2, where historical information about the gradients is used to modify the updates of the design variables. For detailed discussions on the convergence of SGD methods using small mini-batches the interested reader is referred to (Spall 2005;Shamir and Zhang 2013;Défossez et al. 2020;Ward et al. 2019).

Adaptive subgradient method
AdaGrad dampens the movements along directions with historically large gradients, adapting in this way the learning rate and facilitating a faster convergence. In this algorithm, at iteration k, the following auxiliary variable is computed (subscript i refers to an independent sample of ) which is then used, for each design variable j, in the update rule where the vector multiplication is performed componentwise. To avoid division by zero, a small number = 10 −8 is incorporated in the denominator of the update as shown in Algorithm 2.

Optimization experiments
The performance of the SGD strategy introduced in Sect. 3, specifically AdaGrad, is discussed in the subsections below by considering the optimization under uncertainty of an airfoil in a low-Mach-number turbulent flow regime. The problem and uncertainty sources are described first in Sect. 4.1, and the results are presented and discussed next in Sects. 4.2 and 4.3.

Problem description
The optimization problem selected to analyze the performance of the strategy presented in this work considers the NACA-0012 airfoil with sharp trailing edge at an AoA of = 0 • as the baseline geometry. The objective is to minimize the drag coefficient C D at flight conditions under uncertainty in a low-Mach-number turbulent flow regime, while providing a lift coefficient C L above a specified target (i) C ⋆ L = 0.375 , close to the typical values found for commercial transport airliners (Bertin 2002), or (ii) C ⋆ L = 0.75 for comparison.
The Reynolds number of the problem is defined as Re c = U ∞ c∕ , where U ∞ is the free-stream velocity, c is the length of the airfoil's chord line, and is the kinematic viscosity of the fluid. The design variables in the optimization problem are given by the vertical positions of 10 × 2 × 2 = 40 equidistant free-form deformation (FFD) nodes (Sederberg and Parry 1986) and . Three FFD nodes are fixed. One determines the leading edge location and two identical nodes determine the trailing edge location. In addition, the area of the airfoil is fixed by imposing Ω = Ω 0 = 8.2 × 10 −2 m 2 as an entry of the vector of constraints C in (3).
The RANS equations for the flow are discretized and solved on a two-dimensional (2-D) O-type body-fitted mesh utilizing the OpenFOAM (OpenFOAM 2020) computational package. A grid refinement study based on the Spalart-Allmaras RANS model was performed to minimize the relative variation of C D and C L below 5% for the baseline airfoil geometry. The resulting final mesh utilized for the optimizations is composed of approximately 250000 grid points clustered around the airfoil with the first grid point in the wall-normal direction at n + ∼ 1 (wall units), and with the far-field boundaries located at 50 chord lengths. The sensitivities of both drag and lift coefficients with respect to the design parameters are efficiently calculated utilizing a discrete adjoint solver provided by the DAFoam (He et al. 2020) framework.
Two combined sources of uncertainty (i.e., random inputs) are considered. First, an aleatoric uncertainty resulting from the inherent variability of the flight conditions, e.g., changes in fluid density and viscosity and free-stream velocity, grouped within a Reynolds number uniformly distributed in the range Re c = [10 6 ∶ 10 7 ] . Second, uncertainty arising from the utilization of RANS turbulence models to close the systems of equations describing the flow field. In this regard, five different RANS models are selected for this study: (1) Spalart-Allmaras (Spalart and Allmaras 1992), (2) k-ε (Launder and Spalding 1974), (3) realizable k-ε (Shih et al. 1995), (4) shear stress transport (SST) k- (Menter 1993), and (5) Langtry-Menter SST k- (Langtry and Menter 2009); a discrete random variable with equal probability has been utilized for sampling over the RANS model space.

Optimization results for
The convergence behavior of the DSP and average optimization methodologies is studied first in Fig. 1 by depicting the mean drag (a) and lift (b) coefficients as a function of normalized cost (or equivalent number of iterations) defined as the iteration number multiplied by the number of samples per iteration. The DSP approach utilizes one single set of fixed values for the optimization process consisting of Re c = 5 × 10 6 and the Spalart-Allmaras RANS model (option 1), while the average methodology advances the optimization procedure by randomly sampling the stochastic inputs; viz. Reynolds numbers is uniformly distributed in the range Re c = [10 6 ∶ 10 7 ] , and RANS models from options 1, 2, 3, 4, and 5 are uniformly random. In particular, three cases are studied for the average design by utilizing n s ∈ {2, 4, 8} samples per optimization iteration with = 0 ; a fourth case (discussed in Table 1) with 64 samples per iteration has also been performed to corroborate the convergence/ robustness of the DSP approach. Four main observations can be inferred from the figure. First, the DSP optimization converges slightly faster than the average design due to their inherent differences: (i) DSP utilizes one sample per iteration, whereas average design requires more than one sample per iteration to collect statistical information; and (ii) the input values are kept constant for DSP during the optimization. Second, the average design curves present small oscillations, but the main trends are similar to the DSP approach. In fact, smoother curves are obtained for average design when increasing the number of samples per iteration n s , however, at larger computational costs. Third, the optimal AdaGrad performance for this problem is to utilize 4 samples per iteration: (i) 2 samples per iteration is the faster option, but it is not stable enough; and (ii) 8 samples per iteration provides slightly smoother results converging to similar values than with 4 samples, but it is approximately 2× more expensive than utilizing 4 samples per iteration. Fourth, as expected and corroborated by the parallel increase of C D with C L in the figures, there is a trade-off between reducing drag and providing a minimum lift value. Moreover, the computational cost scaling of the DSP and SGD optimization strategies with the number of uncertain parameters is explored by augmenting the number of stochastic inputs to 7. These correspond to the initial 2 uncertainties, i.e., Re c and RANS model, plus a main model-parameter of each of the RANS models with the following ranges (  Fig. 2 showing (i) an approximately linear increase of computational cost, i.e., number of iterations to reach a converged mean drag and lift, with respect to the number of input parameters as compared to the case of 2 input parameters, and (ii) a slightly larger convergence variability in the mean values of C D and C L as a result of the larger variability due to the increased uncertainty in the problem.
The shapes of the optimized airfoils obtained from utilizing the DSP, average, and robust designs are shown in Fig. 3, together with the resulting time-averaged velocity distributions normalized by the free-stream velocity. For the average and robust designs, 4 samples per iteration have been considered with three different values for = 0, 10, and 100 . The goal here is to assess the impact of on the performance of the optimized design by considering values at different orders of magnitude, rather than finding an optimal value that may be achieved using, for instance, a Pareto frontier (Ehrgott 2005). Following the notation in Fig. 3, the corresponding angles of attack are: (a) = 2.03 • , (b) = 1.50 • , (c) = 1.57 • , and (d) = 1.65 • . As depicted in the figure, the DSP strategy generates a significantly more asymmetric shape, with a finer leading edge, than the average-and robust-optimized airfoils. In particular, while the region that C L is largely sensitive to in comparison to the shape of the airfoil. Therefore, for all optimization cases the AoA increases until the target C ⋆ L value is obtained, the point at which the optimization focus is then shifted toward stabilizing C D . In terms of normalized time-averaged velocity fields, the DSP optimization generates a solution with a concentrated region of large velocities at the suction side of the leading edge and extending a substantial distance in the y-direction. Consequently, the lift force is highly concentrated at the front part of the airfoil. On the contrary, the average and robust strategies optimize the airfoil to generate a region of larger relative velocities that is more distributed along the entire suction side, especially for large values.
The performance based on the drag and lift coefficients of the optimized designs described above is assessed by sampling the input parameter space by computing the same 100 randomly generated samples for each airfoil. The resulting data are depicted in Figs. 4 and 5, and summarized in Table 1 in terms of mean [⋅] , variance [⋅] , and coefficient of variation ℂ [⋅] . The discussion of the results can be organized in six major points. First, the DSP approach provides the smallest C D = 9.6 × 10 −3 when optimizing for one set of Re c value and RANS model, but provides an [C D ] = 1.8 × 10 −2 for the parameter space study with ℂ [C D ] = 25% . In addition, [C L ] = 4.0 × 10 −1 is above the threshold value C ⋆ L = 0.375 , but not the entire sto- Second, the optimized airfoil obtained using the DSP approach with = 0 and 64 samples per optimization iteration (last row in Table 1) leads to a design with statistics similar to that of 4 samples. This result suggests the adequacy of a small number n s of forward/adjoint solutions for a successful SGD design. Third, the average design ( = 0 ) is able to significantly reduce the mean drag coefficient to [C D ] = 1.1 × 10 −2 , which is a reduction of 1.6× , with a larger coefficient of variation of ℂ [C D ] = 39% due to a reduction of the mean value with respect to the DSP approach. Moreover, this strategy is able to maintain L . Fifth, the robust design with = 100 keeps slightly increasing the mean drag coefficient value, but is able to significantly reduce the coefficient of variation to ℂ [C D ] = 8% by largely decreasing ℂ [C D ] , as clearly observed by the small variation of the data in Fig. 4d. However, as observed in Fig. 5, the average drag coefficient in this case is larger than that of the = 10 design, thus highlighting the robustness versus average optimality trade-off explained in Sect. 2.3.

Optimization results for
The SGD strategy proposed is further assessed by considering the previous optimization problem with a significantly larger target lift coefficient C ⋆ L = 0.75 . Based on the insight obtained in Sect. 4.2, the performance of the strategy is tested for = 10 and n s = 4 samples per iteration, and compared against the results provided by the DSP designed airfoil. In both cases, the presented statistics of the optimal airfoil are obtained using the same set of 100 samples.
The geometries of the optimized airfoils obtained from utilizing the DSP and robust approaches are shown in Fig. 6, colored by the time-averaged velocity distributions normalized by the free-stream velocity. Based on the notation in Fig. 6, the corresponding angles of attack are similar with values: (a) = 3.57 • , and (b) = 3.28 • . As shown in the figure, the DSP strategy generates a more asymmetric shape with respect to the chord line than the robust-optimized airfoil, presenting a finer leading edge with a pronounced convex suction side and a significantly flat pressure side. Similar to the case for C ⋆ L = 0.375 , the DSP optimization generates a solution with a more concentrated region of large velocities at the suction side of the leading edge, while the robust design extends this region of relatively large velocities further to the trailing edge. As a result, the lift force is more concentrated at the front part of the airfoil for the DSP solution relative to the robust design.
Similarly to the previous subsection, the performance of the optimized designs based on the mean, variance, and coefficient of variation of the drag and lift coefficients is reported in Table 2. The discussion of the results can be organized in three major points. First, the DSP approach leads to the smallest C D = 1.3 × 10 −2 when optimizing for one set of Re c value and RANS model, while attaining [C D ] = 2.7 × 10 −2 and ℂ [C D ] = 42% . Furthermore, [C L ] = 8.1 × 10 −1 is above the threshold value C ⋆ L = 0.75 , but not the entire stochastic distribution since [C L ] − [C L ] 1∕2 = 0.5 < C ⋆ L . Second, the robust design is able to notably reduce the mean drag coefficient to [C D ] = 1.6 × 10 −2 , which is a reduction of 1.7× , with a smaller coefficient of variation of ℂ [C D ] = 29% . Moreover, this strategy is able to maintain [C L ] − [C L ] 1∕2 = 0.76 > C ⋆ L by slightly increasing the mean lift coefficient value to [C L ] = 8.4 × 10 −1 and significantly reducing the variance to [C L ] = 7.1 × 10 −3 . Third, in comparison to the results obtained for C ⋆ L = 0.375 , the robust design approach has been able to achieve the new C L coefficient requirement without substantially increasing C D (viz. [C D ] from 1.2 × 10 −2 to 1.6 × 10 −2 ), whereas the DSP strategy has lead to slightly larger increase in [C D ] (from 1.8 × 10 −2 to 2.7 × 10 −2 ).

Summary and conclusions
Deterministic single-and multi-point optimization strategies generally suffer from significant performance decay when aerodynamic systems operate away from the design conditions selected. A robust solution is to perform the optimization considering the uncertainty sources by defining objectives and constraints with statistical moments of geometric and physical QoIs. However, the necessity to compute forward and backward flow solutions to accurately calculate statistical moments and their gradients pose the problem typically infeasible from a computational cost perspective. This work, therefore, explores the acceleration strategy of generating a stochastic approximation of the objective, constraints, and their gradients by computing a small number of solutions per optimization iteration. The strategy is combined with the AdaGrad SGD optimization method, which has gained attention over the past years for data-driven scientific and engineering applications.
The methodology presented has been assessed by performing the optimization under uncertainty of the NACA-0012 airfoil in a low-Mach-number turbulent flow regime. The results obtained indicate that, in comparison to a deterministic, single-point approach, the strategy analyzed is able to reduce the mean drag coefficient by approximately 2× for a relatively wide range of operating conditions, while satisfying the constraints of the problem in terms of target lift coefficient and fixed airfoil volume. In addition, the strategy presented enables to straightforwardly control, utilizing an external tunable parameter, the degree of robustness incorporated into the aerodynamic design as demonstrated by the reduction of variability in the response surface of the parameter space study. Consequently, the examples analyzed show that the use of stochastic gradients estimated with a small number of random samples at each iteration (4 samples in DSP 2.7 × 10 −2 1.3 × 10 −4 4.2 × 10 −1 8.1 × 10 −1 9.2 × 10 −2 3.7 × 10 −1 Robust design, = 10 1.6 × 10 −2 2.1 × 10 −5 2.9 × 10 −1 8.4 × 10 −1 7.1 × 10 −3 1.0 × 10 −1 the case studied) produces meaningful designs in a computationally affordable manner. Future work will encompass the application of the strategy studied to the robust optimization of aerodynamic problems involving other flow regimes, such as high-speed flows and multiphysics turbulence, utilizing RANS and/or scale-resolving computational approaches, and considering other sources of uncertainty, mesh resolutions and numerical errors. Additionally, the strategy can be expanded by considering different SGD methods and optimization algorithms. For instance, the Adadelta, Adam, SAG and SVRG SGD methods, and the Globally Convergent Method of Moving Asymptotes (GCMMA) optimization algorithm. Moreover, as demonstrated by De et al. (De et al. 2020b), an avenue to further accelerate the optimization of aerodynamic systems by means of stochastic gradient-based approaches is to leverage the speedup of MF strategies based on combining the accuracy of high-fidelity models with the rapid computation of equivalent low-fidelity representations. Finally, a formal comparison of the SGD method of this work with more standard UQ strategies, e.g., polynomial chaos and stochastic collocation, for ASO under high-dimensional uncertainty forms an interesting future work.

Appendix A
In the standard Monte Carlo approach, R( ) and ( ) , from (2) and (3), are approximated utilizing N forward solutions of the model for specific values of the design variables and N solutions of as follows The Monte Carlo estimates of the gradients of R( ) and ( ) with respect to can be generated using N forward and adjoint solutions by means of the following definitions where the normalization terms N∕(N − 1) lead to unbiased variance estimates. Notice that (15)-(22) are linear operators, and that the variance estimates (19) and (20) require both forward and adjoint solutions. Finally, the approximate gradient of the objective formulation (4), which combines gradients of the objective and the constraint functions, is given by The SGD approach of Sect. 3 uses the aforementioned procedure to compute a stochastic gradient ̂ ( ) with the key distinctions that (i) instead of using N random samples, e.g., N = 10 3 , only n s ∼ O(1) samples, e.g., n s = 2, 4, 8 , are used, and (ii) n s independent solutions of are utilized at each iteration of the optimization.