1 Introduction

Access to space is one of the greatest challenge in todays aerospace technologies. The prohibitive costs of current systems led to the rise of new approaches. Reusable systems, such as the American Space Shuttle, were designed to reduce the expenses of the access to Earth Orbit for satellite commissioning, science missions and ferry to the International Space Station.

In the present study a novel launch system is considered. This system’s mission is represented in Fig. 1. An air-launched suborbital winged spaceplane is mounted piggyback on a conventional airliner. The airliner brings the spaceplane up to its maximum altitude. Once the maximum altitude for the airliner is reached, the spaceplane is released and propelled by a rocket engine up to 80 kilometers at more than 2 km/s. After all the fuel has been burned, which is characterized by the main engine cut-off (MECO) point, the spaceplane enters a weightless ballistic flight phase. Five seconds after the MECO the Upper-Stage (US) is ejected from the spaceplane payload bay. The US rocket engine is ignited and brings the payload to its targeted orbit which is assumed to be a sun-synchronous orbit at 700 km altitude (SSO700) in this study.

Fig. 1
figure 1

Studied system mission. Courtesy of Swiss Space Systems

Such a system is intrinsically multidisciplinary as the design of the structure, the aerodynamics and the trajectory are highly coupled sub-problems. Finding the best solution requires to find the best compromise between the different disciplines. Multidisciplinary Design Optimization (MDO) addresses this issue by automating the design process and applying the optimization theory (Sobieszczanski-Sobieski and Haftka 1997) in order to find a better solution in less time than conventional methods. Applying MDO in the space industries can potentially lead to a large increase in profits. Examples of using MDO for spaceplane design and optimization can be found in the works of Yokoyama et al. (2007), Tsuchiya et al. (2007), Takeshi and Mori (2004), Rowell et al. (1999).

The MDO techniques were applied to the considered system. Three disciplines were considered in the optimization of the system. The trajectory, the aerodynamics and the structure. In the latter, the weight of the spaceplane was estimated using an empirical approach based on winged-space and hypersonic vehicles data developed by Harloff and Berkowitz (1988). Despite the low fidelity of this approach it encountered success in the similar study of Yokoyama et al. (2007). Higher fidelity analysis methods were employed to compute the aerodynamics. The cost of these computations is not compatible with the large number of aerodynamic evaluations required for the trajectory optimization. In order to make the computational time practical a surrogate model was employed to represent the aerodynamics of the vehicle. The surrogate model was developed based on Gradient Enhanced Gaussian process methodologies (Rasmussen 2006). A major drawback of using such techniques for fitting aerodynamic data over the Mach number range that is required for a space vehicle, is the difficulty of fitting properly both the sharply varying region in the transonic regime and the wide plateau of the hypersonic regime. The considered spaceplane in this study falls into this category with Mach number varying from zero to 10. This issue was addressed by Gramacy and Lee (2008) and by Ying et al (2007) by partitioning the design space and by implementing a mapping function for the covariance respectively. In the present study a new method is proposed using the non-stationary kernel developed by Paciorek and Schervish (2004), Plagemann et al. (2008) combined with a set of sub-level stationary Gaussian processes based on the gradients of the aerodynamic coefficients. This method shall ensure both the continuity of the generated models and limits the number of hyper-parameters to be optimized while being able to capture accurately the local and global trends of the aerodynamic characteristics of the spaceplane.

This paper will first present the optimization problem and describe the architecture chosen to address the MDO problem. The trajectory optimization procedure will be described later. This will show the necessity of having quick disciplinary evaluations because of the large number of required calculations. The weight estimation procedure will then be explained in more details. In a next section the aerodynamic analysis will be explained in details, for which an Euler formulation was selected with an adjoint formulation for the estimation of the gradients of the aerodynamic coefficients. These methods had a significant success in the field of aerospace optimization for which the work of Martins et al. (2004) and Reuther and Jameson (1995) are solid references. This will lead to the description of the new non-stationary gradient enhanced Gaussian process method which will show its successful use in solving the optimization problem in the last section where the geometrical modifications, optimal control problem results and performance improvements will be presented.

2 Problem definition

The objective of the MDO problem is to maximize the mass of the payload m payload brought to the SSO700 orbit. This section presents the mathematical definition of the problem as well as the assumptions that were use to solve the optimization problem.

The space system is composed of three distinct elements, the carrier, the spaceplane and the US. Several assumptions were made on each of these elements. The carrier aircraft, a conventional airliner available on the market, is fixed. It is assumed to have a maximum carrying capacity m max, carrier that imposed an upper limit on the possible spaceplane mass. The spaceplane and the US are the two other parts of the system to be optimized. The upper stage has been designed by a different company, so there is no freedom to modify and optimize it. Therefore the total mass of the US is assumed to be fixed for this study.

Before introducing the formulation of problem the following subsection will show how the US was treated in the optimization and how the complexity of the problem was reduced by the use a performance map.

2.1 Performance map of the US

The US total mass m US, tot includes the US dry mass m US, dry , the mass of the payload m payload and the mass of the US required fuel m US, fuel :

$$ m_{US,tot} = m_{US,dry} + m_{payload} + m_{US,fuel} $$
(1)

The US dry mass includes a constant basic mass m US,0 formed by the engine, the avionic, the reaction control system, etc... and a fuel dependent mass that accounts for the increase of tank and structure mass for an increased fuel volume. This dependency, modeled by a linear function with a slope c. The basic US mass depends on the design of the US and is assumed to be constant in this analysis. Therefore the US dry mass is expressed as follows:

$$ m_{US,dry} = m_{US,0}+ c m_{US,fuel} $$
(2)

After the separation from the spaceplane the US is assumed to completely burn its fuel before it reaches the required orbit to release the payload. Therefore the total mass that will reach the orbit is the dry mass of the US plus the payload mass. This mass is called the mass injected into orbit m injected :

$$ m_{injected} = m_{US,dry} + m_{payload} $$
(3)

Consequently the payload mass can be expressed as:

$$\begin{array}{@{}rcl@{}} m_{payload}=m_{injected}(1+c) - c m_{US,tot}-m_{US,0}(1+c) \end{array} $$
(4)

As the total US mass m US, tot is assumed to be fixed, maximizing the payload mass is equivalent to maximize to the mass injected into orbit. This assumption is used to decouple the optimization of the US segment from the optimization of the spaceplane. Indeed the trajectory of the US solely depends on the conditions at the separation point, namely the flight path angle (γ f ), the speed (v f ) and the altitude (h f ). Consequently it is possible to optimize independently the trajectory of the US according to the separation conditions. The optimization of the spaceplane can be performed separately because the final flight conditions of the spaceplane correspond to an optimal, pre-calculated, trajectory for the US leading to a given injected mass.

A large number of US trajectory optimization with different initial conditions, representing the flight conditions at the separation point, were pre-computed to obtain a 3D performance map. In these optimizations the mass injected into orbit is maximized according to the optimal control vector. An example of these maps is shown in Fig. 2. This approach allows for decoupling the trajectory optimization of the US from the optimization of the spaceplane. The injected mass into orbit being interpolated using the final flight conditions of the spaceplane. This reduces the size of the MDO problem, that makes it faster to be solved with better convergence properties.

Fig. 2
figure 2

US performance map depending on US segment initial velocity v f vs. US segment initial flight path angle γ f at an altitude of 80 km. The contours represent the ratio \(M^{\ast }_{injected}\) of the total injected mass with the baseline total injected mass

2.2 Spaceplane constraints and assumptions

Several assumption are made concerning the spaceplane. It is assumed to be a rigid body where only longitudinal dynamics are considered. It means that the roll rate, the roll angle and the sideslip angle are always zero. It is worthwhile to note that the yaw corresponds to the heading. The control surfaces were considered to be always at their zero position. The vehicle is controlled only with the use of thrust vectoring. During the initial glide phase the control surfaces are still assumed to be at their zero position despite the fact that the vehicle is unpowered. The errors associated to this initial phase actuation are negligible.

The spaceplane is also subject to several constraints. Its mass is limited by the capabilities of the carrier which can lift up to m max, carrier . Right after the spaceplane is separated from the carrier it must glide to a safe distance before igniting its engine. This glide phase is assumed to last from t = 0 to the time of the main engine ignition t MEI equal to five seconds. During this phase, the glide ratio of the spaceplane must be high enough to perform the separation maneuver. This poses the constraint that L/D>4 during the initial glide.

During the reentry phase of the spaceplane, after the US ejection, the spaceplane must be able to be naturally trimmed and statically stable at angles of attack ensuring a high drag. This drag is necessary to maximize the energy dissipation before the more dense of the atmosphere are encountered to limit the thermal and structural stress. This poses the constraint that the pitching moment coefficient C m must have a zero value lying between 35 and 45 degrees of angle of attack for Mach numbers between 5 and 10. Also, the slope of the pitching moment coefficient with respect to the angle of attack must be negative to ensure the longitudinal static stability.

Lastly the spaceplane is constrained to burn all its fuel when the MECO point, defined as the time t f , is reached.

2.3 Mathematical expression of the problem

The goal of the optimization is to maximize the payload injected into a SS0700 orbit. This injected mass is calculated by interpolating the US performance map as a function of the final state of the spaceplane at the separation point: m injected = m injected (r f , v f , γ f ). This final state, occurring at t f depends on the spaceplane geometrical variables as well as the optimal control solution for the spaceplane trajectory.

The geometry variables are selected to be the wing span b, the length of the spaceplane L ref , the double delta sweep angle at the nose λ nose and the thickness-to- chord ratio t/c of the wing. The design vector is therefore α, b, L ref nose , t/c, where \(\alpha =\left [\alpha _{1},...,\alpha _{N}\right ]\) is the optimal control vector solution where N is the number of collocation points.

The state vector is given by the flight parameters [r, ϕ, 𝜃, v,Ψ], respectively the distance to the earth center, the longitude, the latitude, the speed, and the heading. The longitude, latitude and heading are taken into account in order to add the effect of the rotation of the earth ω as well as Coriolis effect. Because SSO is slightly retrograde, this impact negatively the performances of the system with a magnitude depending on the initial coordinate. A North-American launch site was selected for this study corresponding to 100 m/s loss in velocity as an order of magnitude estimate.

The classical equations of motions of a spaceplane (Betts 2010) with the addition of the Coriolis effect terms were used in the wind frame. The atmosphere was modeled using the United State Standard Atmosphere of 1976. The problem is expressed as:

$$ \max\limits_{\alpha,b,L_{ref},\lambda_{nose},t/c}{m_{injected}(r_{f},v_{f},\gamma_{f})} $$
(5)

subject to:

$$\begin{array}{@{}rcl@{}} \dot r &=& v\sin\gamma \\ \dot \phi &=& \frac{v\sin{\Psi}\cos\gamma}{r\cos\theta} \\ \dot \theta &=& \frac{v\cos{\Psi}\cos\gamma}{r} \\ \dot v &=& -g_{0}\left( \frac{R_{e}}{r}\right)^{2} \sin\gamma + \frac{T\cos\alpha-D}{m} \\ &&+ \omega^{2}r\cos\theta(\sin\gamma\cos\theta-\cos\gamma\sin\theta\cos{\Psi}) \\ \dot \gamma &=& -g_{0}\left( \frac{R_{e}}{r}\right)^{2} \frac{\cos\gamma}{v} + \frac{T\sin\alpha + L}{mv} + \frac{v}{r}\cos\gamma\\ &&+ 2\omega\sin{\Psi}\cos\theta \\ &&+ \frac{\omega^{2}r\cos\theta}{v}(\cos\gamma\cos\theta+\sin\gamma\sin\theta\cos{\Psi}) \\ \dot {\Psi} &=& \frac{L\sin{\beta}+T\sin\alpha\sin\beta}{mv\cos\gamma} + \frac{v\cos\gamma\sin{\Psi}\sin\theta}{r\cos\theta} \\ &&+ 2\omega (\sin\theta-\cos\theta\cos{\Psi}\tan\gamma) + \omega^{2}\frac{r\sin\theta\cos\theta\sin{\Psi}}{v\cos\gamma} \\ \dot b &=& 0 \\ \dot L_{ref}& =& 0 \\ \dot \lambda_{nose}& =& 0 \\ \dot t/c &=& 0 \end{array} $$
(6)

Under the following constraints:

$$\begin{array}{@{}rcl@{}} &&\frac{L}{D}\left( \alpha_{0},...,\alpha_{t,MEI} \right) \geq 4 \end{array} $$
(7)
$$\begin{array}{@{}rcl@{}} &&C_{m}(\alpha = 35) \geq 0{for} M>4 \end{array} $$
(8)
$$\begin{array}{@{}rcl@{}} &&C_{m}(\alpha = 45) \leq 0 {for} M>4 \end{array} $$
(9)
$$\begin{array}{@{}rcl@{}} &&m_{total}(b,L_{ref},\lambda_{nose},t/c) \leq m_{max,carrier} \end{array} $$
(10)
$$\begin{array}{@{}rcl@{}} &&t_{MECO} - \frac{m_{fuel}(L_{ref})\times I_{sp}g_{0}}{T_{0}} = 0 \end{array} $$
(11)

The lift and drag are calculated from the lift and drag coefficient:

$$\begin{array}{@{}rcl@{}} L=0.5\rho v^{2} S_{ref} C_{L}(\alpha,M,b,L_{ref},\lambda_{nose},t/c) \end{array} $$
(12)
$$\begin{array}{@{}rcl@{}} D=0.5\rho v^{2} S_{ref} C_{D}(\alpha,M,b,L_{ref},\lambda_{nose},t/c) \end{array} $$
(13)

It is also assumed the fuel mass vary linearly with the spaceplane length because, since a longer vehicle means longer fuel tanks. The weight evolution with respect to the geometrical changes is captured by an empirical method for weight estimation. This method is fed with the design vector as well as a selected static load which will be explained further. The aerodynamics of the vehicle is captured by the use of a surrogate model.

2.4 Optimization architecture

In order to solve the problem the Non-Linear-Programming of the trajectory optimization characteristics were used. This takes advantage from the fact that the trajectory optimizer already solves thousands of constraints. By adding the geometrical design variables as state variables in the trajectory problem and by setting their derivatives to zero the optimizer was able to take into account all the design variables at once. This approach is simple, intuitive and particularly applicable to this problem. It gave good results with good convergence properties. One drawback is that it adds one constrain per discretization point per geometrical design variable (derivative equals to zero). This might make the problem unnecessary larger and this must be seriously considered for higher dimension problems. The procedure is shown in Fig. 3.

Fig. 3
figure 3

Optimization process

3 Optimal control problem

In order to solve the trajectory optimization problem the work of Fahroo and Ross (2002) is followed. They proposed a Chebyshev pseudo-spectral collocation method having the advantages of using Legendre-Gauss-Lobatto (LGL) points which are said to lead to a very high accuracy in the discretization and low computational costs. This method is based on a discretization of the control and/or state history which transforms the optimal control problem into a non-linear programing problem (NLP). An interpolation scheme is used to determine the time history of the control and the state variables (Betts 2010). This approach is very efficient and particularly applicable to the choice of architecture for the MDO problem. A Python code with Fortran wrapped routines was developed to solve this problem.

The optimal control problem can be stated as follows: find the control vector u(τ) minimizing

$$ J = M[x(\tau_{f}),\tau_{f}] + {\int}_{\tau_{0}}^{\tau_{f}}{L[x(\tau),u(\tau),\tau]}d\tau $$
(14)

The dynamic of the system is formulated as equality constraints:

$$ f[\dot x(\tau),x(\tau),u(\tau),\tau] = 0 $$
(15)

If \(\partial f / \partial \dot x\) is non-singular, this leads to the ordinary differential equation of the system:

$$ \dot x(\tau) = f[x(\tau),u(\tau),\tau] $$
(16)

The pseudo-spectral approach transforms this problem into a set of non-linear constraints, which leads to the following NLP: find the coefficients

$$ X = (x_{0},x_{1},...,x_{N}), \ U = (u_{0},u_{1},...,u_{N}) $$
(17)

and the final time τ f (if necessary) that minimize

$$ J^{N}(X,U,\tau_{f}) = M(x_{N},\tau_{f}) + \frac{\tau_{f}-\tau_{0}}{2}\sum\limits_{k=0}^{N}{L(x_{k},u_{k},\tau_{k})w_{k}} $$
(18)

subject to

$$\begin{array}{@{}rcl@{}} f[2/(\tau_{f}-\tau_{0})d_{k},x_{k},u_{k},\tau_{k}] = 0, \ k = 0,1,...,N \end{array} $$
(19)
$$\begin{array}{@{}rcl@{}} g_{l} \leq g(x_{k},u_{k},\tau_{k}) \leq g_{u}, \ k = 0,1,...,N \end{array} $$
(20)
$$\begin{array}{@{}rcl@{}} \psi_{l} \leq \psi[x_{0},x_{f},(\tau_{f}-\tau_{0})] \geq \psi_{u} \end{array} $$
(21)

The LGL points and the optimal weights w i are given by the Clenshaw-Curtis scheme (Clenshaw and Alan 1960). The LGL points are required to lie within the [−1,1] interval, therefore the following time transformation is required with t∈[−1,1]:

$$ \tau(t) = \frac{(\tau_{f}-\tau_{0})t + (\tau_{f} + \tau_{0})}{2} $$
(22)

This creates a large set of constraints that has to be fed to the optimizer.

3.1 Jacobian sparsity

It is possible to transform a large problem with a large number of collocation points into a set of smaller problems with less points. This leads to more precise and faster optimization. For this purpose a knotting method (Ross and Fahroo 2004) was implemented. Assuming a knot to be placed between two phases with \(\vec x^{1}\) and \(\vec x^{2}\) being the states and \({t_{i}^{1}}\) and \({t_{i}^{2}}\) the i-th being the collocation points (which are the time in most of the problem) of the phase one and the phase two respectively. The system has K state variables and the mass is the K-th plus one.

$$ g_{k=1,...,K-1} = x_{k=1,...,K-1}^{1}({t_{f}^{1}}) - x_{k=1,...,K-1}^{1}({t_{0}^{2}}) = 0 $$
(23)

In order to account for the discontinuity the following equation is required:

$$ g_{K} = {x_{K}^{1}}({t_{f}^{1}}) - {x_{K}^{1}}({t_{0}^{2}}) = 0 $$
(24)

Having two different phases is the same as having two separate smaller problems with supplementary constraints for the consistency of the problem. Under those circumstances the Jacobian of the constraints has more zero-element than the unsplit problem. When the optimizer is able to recognize the zero-terms of the Jacobian, the evaluation of this latter becomes much faster. Figure 4 shows a typical problem without and with a knot placed in the middle of the trajectory. From this figure one can observe that the sparsity of the Jacobian substantially increases by using a knot. The advantage lies in the fact that as the number of collocation points increases, the sparsity rises. The larger the number of points the more accurate the results. Because the sparsity increases, the problem, although much larger in terms of number of points, is not significantly more difficult to be solved by the optimizer.

Fig. 4
figure 4

Jacobian sparsity without (top) and with (bottom) a knot placed in the middle of the trajectory

3.2 Adaptive collocation points mesh

The previously demonstrated advantages of splitting the problem into several phases led to the implementation of an adaptive collocation points mesh in the code. The problem is solved at the first step on an initial mesh. Following an existing work (Darby et al. 2011), a residual matrix R of this initial solution is build up in the following way:

$$ R = \left| \bar D \bar X - \frac{t_{i}-t_{i-1}}{2} F(\bar X,\bar U, \tau)\right| $$
(25)

Where \(\bar D\) and \(\bar X\) are respectively the differentiation matrix and the state values at the collocation points. \(\bar t\) is defined as

$$ \bar t_{i} = \frac{t_{i}+t_{i+1}}{2} $$
(26)

With i = 1,..., N c −1, N c being the number of collocation points. The state values are interpolated on these points using a Lagrange interpolation method.

The column of the obtained matrix leads to a metric allowing for iteratively adapting the mesh:

  1. 1.

    If the metric values of q phase have the same magnitude, then the number of collocation points is increased in that phase.

  2. 2.

    If the metric values present a peak value, then a knot is placed at this peak separating the phase into two new phases.

This mesh adaptation proved itself to ease the optimization and to lead to smoother results for the optimal control.

The optimizer used to solve the NLP must be able to deal with a large (thousands) number of constraints. For this reason SNOPT (Gill et al. 2002) was chosen. This optimizer will be the driver of the MDO problem solution. Because the number of constraints to evaluate is high they must be quickly evaluated in order to keep the computational time practical. Consequently the estimation of the weight and the evaluation of the aerodynamic must be rapidly carried out.

4 Spaceplane weight estimation

The weight of the spaceplane has a very large effect on its performance and on the subsequent trajectory optimization procedure. The spaceplane follows a sub-orbital trajectory. Consequently the energy to dissipate is substantially smaller than an orbital reentry. Most of its energy is expressed in terms of potential energy instead of kinetic energy. The chance of skipping-back into space is zero. The problem due to heating is less severe but the load factor can be problematic.

The reentry aerodynamic loads are used as the sizing load case. Indeed a possible failure in the spaceplane mission is the mis-ejection of the US. In this case, jettisoning the US is not possible as it carries the most valuable item of the flight: the payload. In this case, the spaceplane reenters the atmosphere with its dry weight plus the weight of the US fully wet.

High fidelity methods such as FEM do not provide readily the overall weight of the splaceplane. An empirical method was selected for the weight estimation. Such a method had successful results in previous similar studies (Yokoyama et al. 2007; Rowell et al 1999).

The method selected for the weight estimation is based on the hypersonic aerospace sizing analysis (HASA) (Harloff and Berkowitz 1988). This method uses elements from Glatt (1974), a empirical weight estimation method for advanced transportation vehicle developed by NASA. HESA uses a series of empirical equations to estimate the weight of different weight groups of an spaceplane. These weight groups are the structure, the propulsive stack, the hydraulics and electrical systems, the avionics, the landing gear and the thermal protection system. For each of these groups a weight equation is derived from statistical regressions over hypersonic vehicle data. The method has been tested against real vehicle values (Space Shuttle) in Harloff and Berkowitz (1988) with very good results.

Because the studied vehicle has no horizontal stabilizer and a vertical stabilization assured by two winglets, the method is slightly modified by averaging the contribution of the vertical and horizontal stabilizers weighted respectively by the cosine and sine of the wingtip angles. This method proved itself to be quite efficient. The results it gave were very close to the real mass estimate of the currently studied vehicle that came from more classical design methods. The latter statement settled the decision to use it for the global multidisciplinary optimization. The iterative process only takes milli-seconds and was therefore carried out online during the optimization.

5 Aerodynamic analysis

The Stanford’s University Unstructured Code (SU2) (Palacios et al.) was used to carry out the aerodynamic analysis. It is a multi-physics code with a wide range of capabilities and parallel computation support. It possesses an optimization ready framework, which makes it very suitable for aircraft optimization.

In this study, an Euler formulation was used. Consequently the viscous terms of the Navier-Stokes equations were not considered. The choice for such a simulation can be justified by the following reasons:

  1. 1.

    Most of the flight regimes are at very high Mach numbers. At these speeds, the viscous effects are largely overcome by the wave drag. Consequently taking into account the viscous terms has less effect than at lower airspeeds.

  2. 2.

    The discretization of the Euler equations creates a second-derivative term from the Taylor expansion. This term creates an artificial dissipation that partially simulates the viscous drag.

  3. 3.

    The mesh deformation procedure for large geometry deformations has a destructive effect on the viscous layer of the mesh. Therefore a viscous mesh could not be employed and fluid analysis such as RANS could not be performed.

The CAD model of the spaceplane was used to generate an unstructured, non-viscous mesh using Pointwise. Figure 5 shows an example of the results obtained with SU2 for solving the Euler equations for a typical Mach number and angle of attack. In this figure the spaceplane flies at Mach 4 and 25 degrees of angle of attack.

Fig. 5
figure 5

SU2 calculation results at Mach 4 and 25 degrees of angle of attack

The gradients were obtained using the adjoint formulation (Jameson 1988; Pironneau 1994) implemented in SU2. Accurate gradients of the aerodynamic coefficients with respect to the Mach number, α and geometrical deformations were calculated using this method. The gradients are used to provide information for the Gaussian process gradient enhancement.

The geometry deformations were applied directly to the computational mesh using the free-form deformation (FFD)(Sederberg and Parry 1986) tool available in SU2. Samareh (Samareh 2004) demonstrated the advantages of these techniques: they avoid to place a CAD software in the mesh deformation loop which greatly eases the process and they also reduce drastically the number of design variables as the mesh nodes positions are parametrized by a limited number of control points. Instead of deforming directly the surface of the object, the FFD creates a deformation field over the entire domain. Therefore the FFD method deforms the object independently of its geometrical description.

During the mesh deformation using FFD some possible tangling of the mesh was observed, particularly when large deformations were applied. To overcome this problem, a mesh repairing technique was used (Escobar et al. 2005). An optimization procedure is set up using a quality measure of the mesh to repair any possible tangling of the mesh. This method was used after every geometry deformation in order to ensure that the mesh fed to the CFD solver does not have negative volume cells.

One aerodynamic evaluation, including the sensitivities calculation, takes approximately two minutes on a 700-core cluster to converge (the convergence of the solution was defined using a Cauchy criterion of 𝜖 = 10−5). Trajectory optimization using the pseudo-spectral method as explained earlier requires a large number of aerodynamic analysis (103 order of magnitude). This high number of evaluations is a consequence of the pseudo-spectral method. It requires the validation of thousands of constraints as the equation of motions are expressed as a set of equality constraints on every collocation point. These constraints and their Jacobian must be evaluated for every optimizer iteration. The computational burden can become significant if the evaluation method is computationally demanding. To overcome this problem a series of response surfaces of the aerodynamic coefficients are created using a Gaussian process methodology where a new approach for a non-stationary kernel is proposed.

6 Gaussian process regression for response surface creation

The need for a prompt evaluation of the aerodynamic coefficient is answered by the creation of response surfaces. This approach is commonly found in MDO problems and various techniques were investigated as in the work of Colonno (2007), Yokoyama et al. (2007) and Kaletta et al. (2004).

In this study a Gaussian process regression (GPR) approach is selected due to its ability to fit a dimensionally sparse dataset and to provide a covariance of the estimated enabling a reading of the fit quality. Due to the availability of the gradients of the aerodynamic coefficients, thanks to the adjoint formulations, the dataset information is enriched with the derivatives information. The readiness of the derivatives is also a motivation to take a novel approach to the creation of a non-stationary kernel.

6.1 Dataset sampling

In this study one of the most common methods for data sampling was chosen: the Latin Hyper Cube sampling (or LHC sampling), which was first proposed by McKay et al. (1979). It is a popular Design of Experiment (DoE) methods for GPRs. However in the present study the physics of the problem gives additional information. The aerodynamic coefficients of the vehicle displayed a highly varying behavior between Mach 0 and 2 and relatively constant behavior for higher Mach numbers. Six design variables are used to generate the samples: Mach number, angle of attack and four geometrical variables b, L ref nose and t/c. The design domain was split into two sub-domains, in which different LHC sampling with different densities were performed. This leads to a biased spatial uniformity of the DoE in the dimension of the Mach number with the advantage that the variations occurring during the transonic and low supersonic phases have better chances to be properly captured.

6.2 Gradient enhanced gaussian process

Using the obtained dataset x from the CFD calculations, the GPR, as described by Rasmussen (2006), allows for interpolating any desired point x . Describing the dataset as a multivariate Gaussian and the point of interest as another Gaussian, it is possible to use the Bayesian inference rule to obtain the following estimation distribution, with y the desired value, y the dataset and K ij are the covariance matrix between the points i and j.

$$ p(y^{*}/y) \equiv \mathcal{N}(K_{x^{*}x}K_{xx}^{-1}y , K_{xx} - K_{x^{*}x} K_{xx}^{-1} K_{x^{*}x}^{T}) $$
(27)

The expected value of y , \(\bar {y^{*}}\) and its variance squared \(\sigma _{y^{*}}^{2}\) are:

$$ \bar{y^{*}} = K_{x^{*}x}K_{xx}^{-1}y $$
(28)
$$ \sigma_{y^{*}}^{2} = K_{xx} - K_{x^{*}x} K_{xx}^{-1} K_{x^{*}x}^{T} $$
(29)

Where K xx and \(K_{x^{*}x}\) are the covariance matrices, or kernels, respectively between the dataset and itself and the interpolated point and the dataset.

GPR can be greatly improved, especially when the number of dimensions is large, by adding the gradient information to the Gaussian object. The additional information allows for reaching a satisfactory fit with less data points. This technique is sometimes referred to as co-Kriging or gradient enhanced Kriging as the work of Dwight and Han (2009). Because most of the disciplinary analysis in aircraft design can usually provide gradients, these methods are particularly relevant when the gradient information is added to the function value inside the regression process.

As proposed by Solak et al. (2003) the kernel is extended with the derivatives of the exponential covariance function:

$$\begin{array}{@{}rcl@{}} k(\frac{\partial p}{\partial x_{i}},q) =\left. \frac{\partial k(p,q)}{\partial x_{i}} \right\rvert_{q} \end{array} $$
(30)
$$\begin{array}{@{}rcl@{}} k(p,\frac{\partial q}{\partial x_{j}}) =\left. \frac{\partial k(p,q)}{\partial x_{j}} \right\rvert_{p} \end{array} $$
(31)
$$\begin{array}{@{}rcl@{}} k(\frac{\partial p}{\partial x_{i}},\frac{\partial q}{\partial x_{j}}) =\left. \frac{\partial}{\partial x_{i}}\left( \left.\frac{\partial k(p,q)}{\partial x_{j}} \right\rvert_{q} \right)\right\rvert_{p} \end{array} $$
(32)

In order to add the gradient informations to the function value it is assumed that the function and its gradients are correlated. The covariance of the initial distribution is then augmented with the covariance between the function and its gradients forming the following composite kernel:

$$ K = \left[\begin{array}{cc} k(p,q) & k(\frac{\partial p}{\partial x_{i}},q) \\ k(p,\frac{\partial q}{\partial x_{j}})& k(\frac{\partial p}{\partial x_{i}},\frac{\partial q}{\partial x_{j}}) \end{array}\right] $$
(33)

The dataset is also augmented with the gradients ∇y. As mentioned before, the gradients of the aerodynamic coefficients are obtained using an adjoint formulation. This method allows for a rapid evaluation of the derivatives of the aerodynamic functions with respect to the angle of attack, the Mach number and the geometrical parameters:

$$ Y = \left[\begin{array}{c} y \\ \nabla y \end{array}\right] $$
(34)

Adding the gradients information greatly helps to reduce the number of points needed for an acceptable fit. Here lies a strong advantage of using the adjoint for the gradient estimations. Because the derivatives in every dimensions can be obtained in only one calculation, the computational time required to obtain an accurate surrogate model can be greatly reduced as opposed to simply refining the data sample by adding training points. However as identified by Dwight and Han (2009) the gradient information can deteriorate the quality of the fit if the derivatives are not accurate enough. A very promising solution proposed by Lukaczyk et al. (2013) is to relax the ’shoot-angle’ of the gradient by adding noise. In other words the earlier assumption of an exact correlation between the gradient and the prior is relaxed. A new covariance matrix is used for the GP:

$$ \bar{K} = K + K_{\epsilon} $$
(35)

Where K 𝜖 is a noise component added to the initial covariance distribution. A useful noise model is an independent Gaussian noise with null mean (Lukaczyk et al. 2013). This results in:

$$ K_{\epsilon} = \left[\begin{array}{cc} \sigma_{n,y}^{2} & 0 \\ 0 & \sigma_{n,\nabla y}^{2} \end{array}\right] $$
(36)

where σ n, y , σ n,∇y are hyperparameters to be selected. The fact of having two different noises for the objective and the gradient relaxes the assumption on their correlation and greatly improves the fit quality in case of inaccurate gradients.

6.3 Stationary Kernel and Hyperparameters selection

A common choice to create the kernels in GPR is to use a squared-exponential function. The covariance between the point p and q is given by:

$$ k(p,q) = {\sigma_{f}^{2}} \exp\left[-\frac{1}{2\theta^{2}} \sum\limits_{k=1}^{N}(p_{k}-q_{k})^{2}\right] $$
(37)

where N is the spatial dimension, and p k and q k are the k-th components of the p and q vectors respectively. σ f and 𝜃 are the parameters of the covariance function and consequently the hyperparameters of the GPR.

The choice of the hyperparameters has a very large influence on the quality of the fit. To select the appropriate hyperparameters, it is proposed to minimize the following marginal likelihood function (Rasmussen 2006):

$$ J = -\frac{1}{2} Y^{T} [\bar K]^{-1}Y - \frac{1}{2}\log\|[\bar K]\| - \frac{n}{2}\log2\pi, $$
(38)

This results in an optimization problem:

$$ \min\limits_{\sigma_{f},\theta,\sigma_{n,y},\sigma_{n,\nabla y}} J , $$
(39)

This optimization problem is solved using the Covariance Matrix Adaptation Evolution Strategy or CMA-ES (Hansen 2006). This method is a global optimization, which is beneficial since it is difficult to give an initial solution for the hyperparameters before-hand.

The selection of the hyperparameters enlightens one of the main problems encountered when using GPR for aerospace applications: because the hyperparameters are chosen globally, they cannot be adapted for both a sharply varying region (the transonic region) and a flat region (the supersonic region). These type of results, which exhibit variable smoothness and non-linearities, are very frequent in the field of aerodynamics because of the transonic effects. In our case, the trajectory is significantly impacted by the sound barrier passage. Therefore it is important to capture the Mach one region properly. An extreme example is given in Fig. 6, which shows the evolution of the spaceplane pitching moment coefficient as a function of the Mach number for a 15 degree angle of attack. In this figure it is clear that the GPR cannot find the parameters that fit the sharply varying coefficient from Mach 0 to Mach 4 and the plateau for higher speeds. Therefore in order to properly capture the aerodynamic of the spaceplane a non-stationary kernel is required.

Fig. 6
figure 6

Pitching moment coefficient of the spaceplane estimation using GPR with a stationary Kernel as a function of the Mach number for AoA of 15 degree. The red dots corresponds to the data sample. The blue line is the GPR estimate

6.4 Non-stationary kernel

An important requirement for a non-stationary kernel is the continuity of the second derivative of the resulting GPR. The trajectory optimization can become ill-behaved if this condition is not respected and the whole MDO can fail (Betts 2010). Consequently splitting the domain and having locally adapted kernels for each sub-domain is not desirable (Gramacy and Lee 2008). Instead a smoothly varying 𝜃 is required. In this study a new technique is developed inspired by the work of Plagemann et al. (Lang et al. 2007), that takes advantages of the readily available gradients.

In order to develop a non-stationary kernel, the non-stationary covariance function described by Paciorek and Schervish (2004) is used as the starting point:

$$\begin{array}{@{}rcl@{}} k(p,q) &=& \sigma_{f} \rvert {\Sigma}_{p}|^{\frac{1}{4}} |{\Sigma}_{p}|^{\frac{1}{4}}| {\Sigma}_{q}|^{\frac{1}{4}} \left\rvert\frac{{\Sigma}_{p}+{\Sigma}_{q}}{2} \right\rvert^{-\frac{1}{2}} \\ &&\times\exp\left[ -(p-q)^{T} \left( \frac{{\Sigma}_{p}+{\Sigma}_{q}}{2} \right)^{-1} (p-q) \right] \end{array} $$
(40)

The Local Adaptation Kernels (LAKs) Σ p and Σ q transmit the local variation of the hyperparameters with respect to the spatial variations between the points p and q to the GP covariance matrix K.

In order to obtain the LAKs the benefit of having access to the gradient information is used. When the variation of the process is high, the equivalent 𝜃 must be small. When the variation of the process is small, the equivalent 𝜃 must be high. This intuition leads to the following definition of Σ:

$$ {\Sigma} = {\Sigma}\left( \left( \left\rvert\frac{\partial\mathbf{y}}{\partial \mathbf{x}}\right\rvert\right)^{-1}\right) $$
(41)

With \(\frac {\partial \mathbf {y}}{\partial \mathbf {x}}=\left [\frac {\partial y}{\partial x_{1}},...,\frac {\partial y}{\partial x_{N}}\right ]\), and N is the number of dimensions. A simple yet efficient formulation is defined:

$$\begin{array}{@{}rcl@{}} {\Sigma} = \left[\left( \left\rvert\left.\frac{\partial\mathbf{y}}{\partial \mathbf{x}}\right\rvert\right\rvert\right)^{-1}\right]\cdot \left[\left( \left\rvert\left.\frac{\partial\mathbf{y}}{\partial \mathbf{x}}\right\rvert\right\rvert\right)^{-1}\right]^{T} \end{array} $$
(42)

This leads to an N×N diagonal Σ matrix. Note that the inverse operator acts element-wise here and |⋅| is the element-wise absolute value operator.

The LAKs for the prior dataset are readily available because the derivatives exist at the data points. However when a new point must be interpolated the derivatives are not available. For this reason Latent Gaussian Processes (LGPs) are used. For every dimension, a GP is created based on the derivatives in this dimension on the global design space. This allows for interpolating the derivatives at any point in the design space and sending up this information to the LAKs at this point. LAKs are then available at any point. It is necessary to create a LAKs for every dimension because the local variation can occur in only one dimension. For example the drag coefficient presents such behavior only along the Mach number whereas along the angle of attack the behavior is smoothly quadratic.

The LAKs are defined at any point p of the design space as:

$$ {\Sigma}_{p} = \left[\xi_{p} \right]\cdot \left[\xi_{p} \right]^{T} $$
(43)

Where \(\xi _{p} = \left [\xi _{p,1},...,\xi _{p,N}\right ]\) is defined using a conventional GPR described by (28):

$$ \xi_{p,i} = K^{*}_{p,x}K^{*-1}_{x,x}\omega_{i} $$
(44)

K is the covariance matrix formed with the stationary exponential covariance function described in (37). ω i , i = 1,..., N is the metric constructed with the i-th derivative at the m-point x s of the dataset defined as:

$$ \omega_{i} = \left[ \left( \left\rvert\frac{\partial y_{1}}{\partial x_{1,i}}\right\rvert\right)^{-1}, \left( \left\rvert\frac{\partial y_{2}}{\partial x_{2,i}}\right\rvert\right)^{-1} , ... , \left( \left\rvert\frac{\partial y_{m}}{\partial x_{m,i}}\right\rvert\right)^{-1}\right] $$
(45)

In order to avoid ill-posed LGPs, a bandwidth factor is used to damp strong variation in the derivative dataset. This permits to avoid negative values in the LAKs and to reduce undesirable oscillations in the LGP. The k-th metric of the dataset in the i-th dimension ω k, i is damped with the factor a as follow:

$$ \omega_{k,i} = \min(\omega_{k,i},a \cdot \min(\omega_{i})) $$
(46)

The factor a is added to the hyperparameters to be optimized. The hyperparameters of the LGPs also have to be optimized. They are exponential covariance functions (3 hyperparameters) as well as a noise parameter. In this study, only one dispersion parameter and noise parameter were optimized for all the LGPs whereas one length scale per LGP was added to the set of the hyperparameters. This helps to reduce the exponentially increasing complexity of the optimization process with the number of dimensions. As pointed by Ying et al (2007) an increasing number of hyperparameters to optimize can lead to a difficult optimization problem. With this approach this number is kept low and the optimization of the hyperparameters can be solved in approximatively 5 minutes on a standard workstation. A new marginal likelihood function is defined, inspired by Plagemann et al. (2008):

$$\begin{array}{@{}rcl@{}} J_{NS} &=& -\frac{1}{2} Y^{T} [\bar K]^{-1}Y - \frac{1}{2}\log\|[\bar K]\| \\ &&- \sum\limits_{i=1}^{N}\left( \frac{1}{2}\log\|[ K^{*}_{i}]\| \right) - \frac{n}{2}\log2\pi \end{array} $$
(47)

Where \(K^{*}_{i}\) is the i-th LGP covariance matrix. This defines a new optimization problem:

$$ \min\limits_{\sigma_{f},\theta,\sigma_{n,y},\sigma_{n,\nabla y},\sigma_{l},\sigma_{n,l},\theta_{l,1},...,\theta_{l,N}} J_{NS} $$
(48)

This proposed method led to very appreciable results. For instance the previously described pitching moment coefficient was very difficult to be fitted properly with the stationary approach. Figure 7 shows the improvement on the fit using the Latent GPs non-stationary approach.

Fig. 7
figure 7

Pitch moment coefficient GPR and corresponding LGP as function of the Mach number for AoA 15 degree. Non-Stationary Kernel GPR

The accuracy of the new regression method is investigated using an ordered dataset of the axial force coefficient C X . The ordered dataset is compared to the values obtained using the GPR trained with 25 points sampled with a LHC method. The comparison as well as the covariance of the estimates is shown in Fig. 8 along with the regression of the same dataset using a stationary kernel in order to show the performances of the proposed method. None of the point compared in the figure was part of the training dataset of the GPR. Figure 8 shows clearly the better fit capability of the non-stationary GPR. This validation showed an average error of 1.19 % for the newly developed non-stationary GPR against 12.67 % for the stationary GPR.

Fig. 8
figure 8

Validation of the new GPR method and comparison with a stationary GPR using an axial force coefficient dataset

6.5 Aerodynamic response surfaces

The aerodynamic calculations were used to iteratively create the response surfaces using the process pictured in Fig. 9. This process is repeated until the fit quality converges.

Fig. 9
figure 9

Aerodynamic Response Surface generation flowchart

Figures 10 and 11 respectively show the drag and the lift response surfaces created using the results obtained with SU2. The variable smoothness and non-linearities of these phenomenon, a problem for the GPs as mentioned earlier, is clear on these figures. Such response surfaces are created for the aerodynamic coefficient with respect to the design vector, i.e. the angle of attack, the Mach number and the geometrical parameters. They are used to promptly obtained the aerodynamic coefficients during the MDO process.

Fig. 10
figure 10

Drag coefficient response surface obtained with SU2

Fig. 11
figure 11

Lift coefficient response surface obtained with SU2

7 Optimization results

With the help of the previously described methods the MDO problem was successfully solved. The optimal control solution is given in Fig. 12.

Fig. 12
figure 12

Optimal Control Problem solution

The optimized parameters are given in Table 1 in the form of the ratio between the optimal and the baseline. The optimized shape is displayed in Fig. 13 on top of the baseline shape. The red one is the optimized shape and the blue one the baseline.

Fig. 13
figure 13

Optimized and baseline suborbital spaceplane shape

Table 1 Optimized parameters and performance index gain

The pitching moment of the optimized vehicle is shown in Fig. 14 along with the trim line (C m = 0). It is clear on this figure that over Mach 4 the vehicle is trimmed between α = 35 and α = 45 and is longitudinally statically stable (C m, α < 0).

Fig. 14
figure 14

Pitch moment of the optimized vehicle above Mach 4 along with the trim line

The results of this optimization led to the trajectory presented in Figs. 1516 and 17.

Fig. 15
figure 15

Optimized trajectory radius and longitude as functions of the time

Fig. 16
figure 16

Optimized trajectory latitude and velocity as functions of the time

Fig. 17
figure 17

Optimized trajectory flight path angle and azimuth as functions of the time

Because of those constraints that are linked to the mission, mainly the separation from the carrier issue, the optimizer did not tend to change the spaceplane into a pure rocket. Despite the mass and drag gained by adding wings this can be compensated by taking advantage of the lift during the ascent. From the baseline, the optimal shape is longer. This gives a larger quantity of fuel and the spaceplane is consequently able to have engine thrust for a longer time. This comes with a drag and a mass penalty. However the drag becomes really small in a short time as the spaceplane ascent due to the exponentially decreasing air density. The span is broader, the main positive effect is a gain in L/D ratio but comes with a mass penalty. Having a higher glide ratio means that the spaceplane is able to have a better lift-assisted climb. It is therefore able to reach quickly a very low density atmosphere and consequently with very low drag. This might compensate the drag penalty obtained with the increase in length. The thinner wing, although heavier, also compensates for the drag penalty. It seems that the optimizer tended to increase as much as possible the aerodynamic efficiency in order to compensate the drag gained by lengthening the spaceplane. Perhaps this compensation stopped when the mass penalty because of increasing the aerodynamic efficiency became too high and jeopardized the process. The increase in the double delta has almost no effect on the aerodynamic efficiency and has very little effect on the mass. It seemed natural to say that its modification was due to the constrain on the longitudinal stability in the hypersonic regime and that having more double delta was necessary to fulfill this constrain.

8 Conclusion

This study demonstrated the optimization of a sub-orbital spaceplane using a coupling between a geometrical optimization, mainly based on aerodynamic and structural optimization, with an optimal control problem. The aerodynamics was calculated using an Euler code and used Free-Form Deformation in order to apply the geometrical changes directly to the mesh in a safe manner using a mesh untangling procedure. The sensitivities of the aerodynamic objectives, lift, drag and pitching moment were obtained using an adjoint formulation of the Euler equations for the aforementioned FFD transformations. The results were then used to feed a Gradient Enhanced Gaussian Process Regression algorithm for which a non-stationary Kernel method based on local variation estimates using latent processes was specifically developed and implemented in a Python-Fortran combination. This non-stationary approach has the advantage of being able to fit properly a variable smoothness prior such the aerodynamic of the vehicle over its whole Mach range. The weight estimation was based on empirical methods developed by a NASA contractor and the program Hypersonic Aerospace Sizing Analysis (HASA) was used. The trajectory optimal control problem was solved using a Chebyshev Pseudo-Spectral method with collocation points mesh refinement and adaptation. For the latter analysis a dedicated program was written as a combination, once more, of Python and Fortran. The sequential quadratic programming optimizer used was SNOPT which is very suited choices for solving NLPs.

The multidisciplinary optimization problem was defined as an integration of the geometry optimization and the optimal control formulation. The optimizer of the optimal control problem was used to optimize both the geometrical and control parameters at the same time. The obtained results were satisfactory. The geometrical parameters were modified less than a maximum of 15 %. The main reason for the small modification in the geometrical parameters is the fact the baseline shape originates from a pre-existing design, the european space shuttle project Hermes. The optimal trajectory coupled with the optimal shape led to a 7.49% increase in the injected mass.