1 Introduction

In a variety of applications, the subsonic flow around lifting surfaces separates at a defined location and is otherwise attached to the surface. These lifting surfaces are often formed by flexible bodies, for which the aerodynamic forces exerted are substantial enough to induce motional interactions with the structure. In these cases, the aeroelastic investigation of both structural and flow behavior is of particular interest. A diverse range of applications highlights the importance of investigating the aeroelastic behavior within these specific fluid flow contexts, for instance helicopter rotors [40], morphing wings [13], and notably, the rotor blades of wind turbines. Recent trends show an increase in the size of wind turbine rotor blades, leading to higher flexibility [35]. The high complexity of these systems precludes the use of closed analytical methods. Instead, a partitioned methodology is employed, involving an unsteady three-dimensional flow solver in conjunction with a nonlinear structural solver to ensure a comprehensive analysis of the structural system [39]. The choice of approach varies according to specific requirements. In the design phase, where numerous calculations are necessary, low-fidelity methods such as Blade Element Momentum Theory (BEMT) are often utilized, particularly in wind energy systems [30]. Conversely, for detailed investigations of fluid–structure interactions in a limited set of simulations, high-fidelity methods are preferred. These methods, which solve the three-dimensional Navier-Stokes equations using techniques like the Finite Volume Method, offer superior accuracy. However, they are associated with substantial computational demands, especially in scenarios involving variable fluid–structure boundaries [38].

The structure itself is usually modelled numerically using finite elements, e.g., finite beam elements in cases where the overall structural dynamics are of interest, or finite shell models in cases where a detailed investigation of the local structural behavior such as the stress field is carried out [39].

In many cases, mid-fidelity methods based on potential flow theory offer an attractive compromise between accurately predicting the unsteady three-dimensional flow behavior and feasible computational costs. One such method, which has been applied to investigate aeroelastic phenomena, is the unsteady vortex-lattice method (UVLM). Among others, Palacios et al. developed and verified a coupled UVLM-beam solver for the investigation of flexible aircraft [29]. Murua and colleagues similarly investigated flight dynamics and aeroelastic effects on aircraft using a coupled UVLM-beam approach with linearization in state-space [28]. Hang et al. also performed an analytical sensitivity analysis of UVLM to investigate aircraft but coupled this with both linear beams and shells [20]. An analytical sensitivity analysis of UVLM was also performed by Stanford and Beran to optimize flapping wings [36]. Muñoz-Simón et al. used UVLM to investigate the aerodynamic behavior of wind turbines and compared it with BEMT results, finding that UVLM better captures the unsteady effects occurring in real operating conditions [27]. Ritter and colleagues validate the UVLM, including linearization in state-space, against experiments, using a highly flexible wing called Pazy Wing [31]. Goizueta et al. compare different models of this wing to predict flutter as preparation for wind tunnel tests [19]. The Pazy Wing benchmark was used for validation in a comparison of several nonlinear aeroelastic solvers, especially with regard to flutter characteristics [32]. The main difference of the aeroelastic framework [23] used within our work compared to the ones describe above is that we directly solve the nonlinear aeroelastic equlibrium without using a linearized state-space. The linearization of the aerodynamic forces is based on the unsteady Bernoulli equation with respect to generalized structural coordinates and velocities. We apply an implicit time integration scheme based on discrete time derivatives. For details on the aeroelastic framework, its UVLM implementation and linearization we refer the reader to [23]. While Newton’s method is the standard nonlinear system solver in any implicit time integration, inexact Newton algorithms depend on the application-specific structure. To the best of our knowledge, our work is the first investigation of inexact Newton algorithms for aeroelastic simulations using the UVLM.

The coupled problem can either be solved in a weak form, where only the influence of the aerodynamic forces on the structure is considered at a certain time, or in a strong form, where both problems are evaluated at the same time instance and the coupled nonlinear governing equation of the aeroelastic system has to be solved [26]. For systems experiencing nonlinear behavior, e.g., due to large deformations, a strong coupling is preferable. To integrate the governing equation in time, implicit time integration has proven more robust [23]. Consequently, the coupled nonlinear governing equation needs to be solved in each time-step. This is typically done by applying the Newton algorithm, which requires the linearization of the equation, i.e., the evaluation of a derivative matrix of a nonlinear function.

The main drawback of the (exact) Newton algorithm is the potentially high cost for evaluating the derivative matrix and for solving the linear Newton system at each iteration. On the other hand, due to the excellent local convergence, it is often possible to save computational cost by solving the exact Newton system approximately or by solving a system with an approximated derivative marix. These two approaches are both promising in our context because the full derivative matrix is the sum of a structural derivative (which is cheap to compute and to invert) and an aerodynamic derivative (which has entries of smaller magnitude but is expensive to compute). Then the first possiblity is a quasi-Newton type algorithm wherein the approximate Newton step is computed by neglecting the expensive aerodynamic derivative entirely, i.e., solving the linear system with the structural inverse. This algorithm has already been proposed in [23]. The second possiblity is a so-called inexact Newton algorithm wherein the same structural inverse is used to solve the exact Newton system to a pre-specified accuracy by iterative refinement. This algorithm is novel for aeroelastic UVLM simulation. Both approaches have their pros and cons. In particular, the quasi-Newton type algorithm is faster per iteration but may converge slowly or even diverge whereas the inexact Newton algorithm is slower per iteration but more robust and can even guarantee quadratic local convergence; for details see Sect. 3. The precise computational cost of each algorithm depends on the individual costs of evaluating each of the two derivatives and of solving with the full inverse versus the approximate inverse. As it turns out, the overall situation is quite peculiar in our context.

The remainder of this paper is structured as follows. In Sect. 2 we give a brief overview of the structural and aerodynamic simulation models along with their coupling by the unsteady vortex-lattice method. Exact and approximate Newton algorithms and their specific application within our aeroelastic simulation are presented in Sect. 3. In Sect. 4 we conduct numerical experiments with the three Newton type algorithms on a flexible plate and on a wind turbine. Finally, we summarize our findings in Sect. 5.

2 Aeroelastic simulation

2.1 Continuous governing equation

This subsection provides a brief overview of the structural model employed in this article. For a more in-depth understanding and comprehensive explanations, we direct the reader to refer to the articles [16,17,18, 21, 22].

The structural model relies on a rotation-free multibody system formalism within the Finite Element Method (FEM) presented in [18], consisting of rigid bodies, geometrically exact beams and geometrically exact solid-degenerate shells [16]. The kinematics are described in the total Lagrangian formulation, in a primal-dual approach, including generalized coordinates and generalized velocities. The nonlinear mechanical governing equations are derived based on the variational principle leading to the constrained governing equation for a material body \(\mathcal {B}_0 \subseteq \mathbb {R}^3\),

$$\begin{aligned} \delta S&= \int _{\mathcal {B}_0} \bigl (\langle \delta \varvec{x},\dot{\varvec{l}}(\varvec{v},t) + \varvec{f}^{\textrm{int}}(\varvec{x},t) - \varvec{f}^{\textrm{ext}}(t) + \varvec{H}(\varvec{x},t)^{\textrm{T}} \varvec{\lambda }(t)\rangle \nonumber \\&\quad + \langle \delta \varvec{v}, \varvec{l}(\varvec{v},t) - \varvec{l}(\dot{\varvec{x}},t)\rangle + \langle \delta \varvec{\lambda }, \varvec{h}(\varvec{x},t)\rangle \bigr ) \, dV = 0. \end{aligned}$$
(1)

Here \(\langle \cdot ,\cdot \rangle \) denotes a scalar product, \(\varvec{x}(\theta _i,t)\) and \(\varvec{v}(\theta _i,t)\) are the positions and velocities depending on the generalized coordinates and generalized velocities of the canonical models, \(\varvec{\lambda }(t)\) is the vector of Lagrange multipliers of the holonomic constraint \(\varvec{h}(\varvec{x},t) = 0\) and \(\delta (\cdot )\) denotes their admissible variations. A material point is defined by the coordinates \(\theta _i\), with \(i = 1,2,3\), in the body’s reference coordinate system. The internal and external force densities are denoted by \(\varvec{f}^{\textrm{int}}(\varvec{x},t)\) and \(\varvec{f}^{\textrm{ext}}(t)\). \(\varvec{H}(\varvec{x},t)\) is the Jacobian of kinematical restrictions. Due to the primal-dual formulation, the continuous governing equation enforces the equality of the displacement-based and velocity-based momentum densities by incorporating \(\varvec{l}(\varvec{v},t) - \varvec{l}(\dot{\varvec{x}},t)\). Inertial forces are considered in the time derivative of the velocity-based momentum density \(\dot{\varvec{l}}(\varvec{v},t)\).

2.2 Spatial and temporal discretization

The discrete governing equations of the multibody system Finite Element approach are derived through the spatial approximation of the kinematics, specifically the continuous generalized coordinates and velocities of the flexible bodies, using finite elements and solving the integral (1). The temporal discretization is achieved by employing an implicit time integration scheme, which is based on the average vector field method. This approach ensures the preservation of linear and angular momentum as well as the total energy of the system. The methodology is elaborated in Armero et al. [1] and has been integrated into the current framework by Gebhardt et al. in [17]. Consequently, this leads to the weak form of the discrete governing equations:

$$\begin{aligned} \delta \tilde{S}_{n+1}&= \bigl \langle \delta \tilde{\varvec{q}}_{n+1/2}, \dot{\tilde{\varvec{l}}}(\tilde{\varvec{s}}_{n+1},\tilde{\varvec{s}}_n) \nonumber \\&\quad + \tilde{\varvec{f}}^{\textrm{int}}(\tilde{\varvec{q}}_{n+1},\tilde{\varvec{q}}_n,\tilde{\varvec{s}}_{n+1},\tilde{\varvec{s}}_n) \nonumber \\&\quad - \tilde{\varvec{f}}^{\textrm{ext}}(\tilde{\varvec{q}}_{n+1},\tilde{\varvec{q}}_n) \nonumber \\&\quad - \tilde{\varvec{f}}^{\textrm{ext},\textrm{nc}}(\tilde{\varvec{q}}_{n+1},\tilde{\varvec{q}}_n) \nonumber \\&\quad + \tilde{\varvec{H}}(\tilde{\varvec{q}}_{n+1},\tilde{\varvec{q}}_n)^{\textrm{T}} \tilde{\varvec{\lambda }}_{n+1/2} \bigr \rangle \nonumber \\&\quad + \langle \delta \tilde{\varvec{s}}_{n+1}, \tilde{\varvec{l}}(\tilde{\varvec{s}}_{n+1},\tilde{\varvec{s}}_n) - \tilde{\varvec{l}}(\tilde{\varvec{q}}_{n+1},\tilde{\varvec{q}}_n)\rangle \nonumber \\&\quad + \langle \delta \tilde{\varvec{\lambda }}_{n+1}, \tilde{\varvec{h}} (\tilde{\varvec{q}}_{n+1},\tilde{\varvec{q}}_n)\rangle = 0, \end{aligned}$$
(2)

in which n denotes the discrete time instant at \(t_n\). The spatially discrete quantities are indicated by a tilde accent. In (2), \(\tilde{\varvec{f}}^{\textrm{ext},\textrm{nc}}(\tilde{\varvec{q}}_{t_{n+1}},\tilde{\varvec{q}}_{t_n})\) specifies the discrete vector of non-conservative forces. In an aeroelastic framework, this vector consists of aerodynamic forces, which are further elucidated in subsequent sections.

2.3 Computation of aerodynamic loads using the UVLM

The aerodynamic loads are computed by the unsteady vortex-lattice method (UVLM) based on the work in [15, 23, 25, 33, 38]. We assume that the influence of viscosity is restricted to a very thin boundary layer. This boundary layer is dominated by vortices which can only be generated there. We further assume the fluid outside of the boundary layer to be incompressible and inviscid, also known as an ideal fluid [24]. For flows with a high Reynolds number and a low Mach number, this is a reasonable approximation, which highly simplifies the computation. The fluid behavior can then be described by the well-known Euler equation and the continuity equation for incompressible flows given by

$$\begin{aligned} \partial _t \varvec{v}(\varvec{r}, t) + (\varvec{v}(\varvec{r}, t) \cdot \nabla ) \varvec{v}(\varvec{r}, t)&= -\frac{1}{\frac{1}{\rho _F}} \nabla p(\varvec{r}, t), \end{aligned}$$
(3)
$$\begin{aligned} \nabla \cdot \varvec{v}(\varvec{r}, t)&= 0, \end{aligned}$$
(4)

where \(\varvec{v}\) is the fluid particle velocity at its spatial position \(\varvec{r}\) and time instant t, \(\rho _F\) the fluid density and p the pressure field. Additionally, the velocity field is characterized by a potential flow and can be expressed by Helmholtz’s decomposition as the superposition of a scalar potential \(\varphi (\varvec{r},t)\) and a vector potential \(\varvec{\Psi }(\varvec{r},t)\) [2] in the following form:

$$\begin{aligned} \varvec{v}(\varvec{r},t)&=\nabla \varphi (\varvec{r},t) +\nabla \times \varvec{\Psi }(\varvec{r},t) \nonumber \\&= \varvec{v}_\varphi (\varvec{r},t) + \varvec{v}_\psi (\varvec{r},t), \end{aligned}$$
(5)

where the scalar potential component of the velocity is irrotational, and the vector potential component captures any vorticity effect. Furthermore, the kinematical description of the flow can be described by the vorticity field \(\varvec{\Omega }\) defined by

$$\begin{aligned} \varvec{\Omega }(\varvec{r}, t) = \nabla \times \varvec{v}(\varvec{r}, t). \end{aligned}$$
(6)

Since flow conditions here are assumed to be dominated by vortices, the fluid velocity is

$$\begin{aligned} \varvec{v}(\varvec{r}, t) = \varvec{v}_\infty (\varvec{r}, t) + \varvec{v}_B(\varvec{r}, t) + \varvec{v}_W(\varvec{r}, t), \end{aligned}$$
(7)

with the external flow field velocity \(\varvec{v}_\infty \), the velocity induced by bounded vortex sheets \(\varvec{v}_B\) (e.g., boundary of the fluid defined by a structural surface), the velocity induced by free vortex sheets \(\varvec{v}_W\), e.g., the wake. Without compromising generality, we assume that \(\varvec{v}_B\) and the free-stream component are absorbed by \(\nabla \varphi \) while the field \(\varvec{v}_W\) is identified with \(\nabla \times \varvec{\Psi }\). Incorporating (5) into (4) and (6) facilitates the derivation of a Laplace equation that delineates the fluid’s boundary conditions, and a Poisson equation that correlates the potential flow with vorticities [23, 38]. Consequently, the velocities induced by vortices are characterized by the analytical solution provided by the Biot-Savart law. For a vortex segment, this law is defined as follows:

$$\begin{aligned} \varvec{v}_{\Gamma }(\varvec{r}, t)=\frac{\Gamma }{4 \pi } \int _{C(s, t)} \frac{\hat{\varvec{T}}(s, t) \times (\varvec{r}- \varvec{r}_0(s))}{\left\| {\varvec{r}- \varvec{r}_0(s)}\right\| _2^3} \, ds(t), \end{aligned}$$
(8)

where \(\hat{\varvec{T}}\) is the unit tangent vector to the curve C and \(\Gamma \) is the segment’s vorticity.

In the context of the UVLM, the infinitely thin boundary layer around lifting surfaces, where vortices and thus vorticity can be generated or destroyed, is idealized as vortex sheets which in turn are discretized as vortex-lattices with vortex rings consisting of vortex segments having finite circulations. Within each ring, we use vortex segments of equal and constant strength. The computation of the flow field essentially involves the application of the non-penetration boundary condition at the fluid’s boundary, specifically at the structural surface. This condition dictates that the normal component \(\varvec{n}(\varvec{r},t)\) of the relative velocity at the surface should be zero, i.e.,

$$\begin{aligned}&\varvec{v}(\varvec{r},t) \cdot \varvec{n}(\varvec{r},t) \nonumber \\&\quad = \bigl [ \varvec{v}_\infty + \varvec{v}_B(\varvec{r}, t) + \varvec{v}_W(\varvec{r}, t) - \varvec{v}_S(\varvec{r}, t) \bigr ] \cdot \varvec{n}(\varvec{r},t) \nonumber \\&\quad = 0, \end{aligned}$$
(9)

where \(\varvec{v}_S\) is the velocity of the moving surface. Equation (9) reduces the problem to solving for the unknown circulations of the bounded vortex rings. The aerodynamic loads on the surface are ascertained based on (3) solving numerically the unsteady Bernoulli equation, which links the pressure differences across the vortex elements to the prevailing velocity field.

2.4 Aeroelastic solution

An important aspect in combining structural and aerodynamic models numerically is the strategy adopted for the transfer of information between their respective meshes. We transfer the aerodynamic loads coming from the UVLM into the structural model, stating that for any time the virtual work done by the aerodynamic loads on the aerodynamic mesh is equal to the virtual work done by the generalized aerodynamic forces \(\tilde{\varvec{f}}^{\textrm{ext},\textrm{ae}}\) on the nodes of the structural mesh [15, 23]. The spatial coordinates of any point on the fluid domain are mapped into the configuration space of the structural model using weighted linear surjective vector-valued mapping functions comprising radial-based \(C^\infty \) bump functions with compact support [23].

Following (2), the discrete nonlinear governing equation in a strong fluid–structure interaction, wherein the structural behavior and aerodynamics are concurrently solved at an identical time instant in terms of the structural variables, can now be stated as follows:

$$\begin{aligned}&\varvec{f}(\tilde{\varvec{q}}, \tilde{\varvec{s}}, \tilde{\varvec{\lambda }}_{n+1}) \nonumber \\&\quad = \left[ \begin{array}{c} \tilde{\varvec{f}}^{\textrm{int}}(\tilde{\varvec{q}},\tilde{\varvec{s}}) - \tilde{\varvec{f}}^{\textrm{ext},\textrm{ae}}(\tilde{\varvec{q}},\tilde{\varvec{s}}) +\dot{\tilde{\varvec{l}}}(\tilde{\varvec{s}}) + \tilde{\varvec{H}}(\tilde{\varvec{q}})^{\textrm{T}} \tilde{\varvec{\lambda }} \\ \tilde{\varvec{l}}(\tilde{\varvec{s}}) - \tilde{\varvec{l}}(\tilde{\varvec{q}}) \\ \tilde{\varvec{h}}(\tilde{\varvec{q}}) \end{array}\right] _{n+1} \nonumber \\&\quad =\varvec{0}, \end{aligned}$$
(10)

where the subscript \(n+1\) indicates that the unknowns are solved at time step \(t_{n+1}\). We do not explicitly express (10) with the specific time dependencies arising from the employed time integration method, denoted as \((\tilde{\varvec{q}}_{n},\tilde{\varvec{q}}_{n+1}, \tilde{\varvec{s}}_{n},\tilde{\varvec{s}}_{n+1},\tilde{\varvec{\lambda }}_{n+1/2})\) and outlined in (2). However, it is important for the reader to bear in mind these temporal dependencies when considering the system’s dynamics. In (10), the first equation represents the discrete dynamic equilibrium, capturing the balance of forces acting on the system. The second equation is the momentum equivalence, and the third one represents the discrete constraints equations. We solve the nonlinear governing equation iteratively applying Newton’s method, deriving its linearized form with the Taylor expansion by neglecting the higher order terms to:

$$\begin{aligned} \varvec{f}(\tilde{\varvec{q}}, \tilde{\varvec{s}}, \tilde{\varvec{\lambda }})_{n+1}^{k+1} =\varvec{f}(\tilde{\varvec{q}}, \tilde{\varvec{s}}, \tilde{\varvec{\lambda }})_{n+1}^{k} +\Delta \varvec{f}(\tilde{\varvec{q}}, \tilde{\varvec{s}}, \tilde{\varvec{\lambda }})_{n+1}^{k} = \varvec{0}, \end{aligned}$$
(11)

where k stands for the iteration step within the Newton iteration process and \(\Delta \varvec{f}\) is the discrete increment of \(\varvec{f}\) obtained by calculating the partial derivatives with respect to the discrete generalized variables and the Lagrange multiplier, i.e.,

$$\begin{aligned} \Delta \varvec{f}(\tilde{\varvec{q}}, \tilde{\varvec{s}}, \tilde{\varvec{\lambda }}) _{n+1}^{k}&= \frac{\partial {\varvec{f}}}{\partial {\tilde{\varvec{q}}}} \Delta \tilde{\varvec{q}} +\frac{\partial {\varvec{f}}}{\partial {\tilde{\varvec{s}}}} \Delta \tilde{\varvec{s}} +\frac{\partial {\varvec{f}}}{\partial {\tilde{\varvec{\lambda }}}} \Delta \tilde{\varvec{\lambda }} \nonumber \\&= \varvec{K}(\tilde{\varvec{q}}, \tilde{\varvec{s}}. \tilde{\varvec{\lambda }})_{n+1}^{k} \Delta \tilde{\varvec{y}} . \end{aligned}$$
(12)

In (12), \(\varvec{K}\) is the derivative matrix (or Jacobian) of (10) and \(\Delta \tilde{\varvec{y}}=(\Delta \tilde{\varvec{q}}, \Delta \tilde{\varvec{s}}, \Delta \tilde{\varvec{\lambda }})\). We evaluate all partial derivatives in the linear system (12) analytically as described in [21] and [23]. We solve the system for all variables, including the Lagrange multiplier. Within the aeroelastic approach, the matrix consists of two parts: firstly, the Jacobian of the structural model, derived from the partial derivatives of the discrete structural model, the discrete equivalence of linear momentum, and the constraint equations; and secondly, the Jacobian of the discrete generalized aerodynamic forces, which involves the partial derivatives of the aerodynamic forces. For an exhaustive explanation and derivation of both matrices, readers are referred to [18, 21,22,23].

3 Solution algorithms

In this section we focus on solving the nonlinear equation system for an integration step of the strongly coupled aeroelastic system within an implicit integration scheme. We start with a brief recapitulation of the relevant Newton techniques as such before discussing their application in our aeroelastic simulation.

Given a smooth function \(\varvec{F}: \mathbb {R}^n \rightarrow \mathbb {R}^n\), we seek a solution \(\varvec{y}^*\) of the nonlinear equation system

$$\begin{aligned} \varvec{F}(\varvec{y}) = 0. \end{aligned}$$
(13)

The standard Newton iteration starts with an initial guess \(\varvec{y}_0\) and updates each iterate \(\varvec{y}_k\) with a step \(\varvec{p}_k^N\) that solves the linearization of (13) at \(\varvec{y}_k\), called the Newton system,

$$\begin{aligned} \varvec{y}_{k + 1} = \varvec{y}_k + \varvec{p}_k^N \end{aligned}$$

where

$$\begin{aligned} \varvec{F}'(\varvec{y}_k) \varvec{p}_k^N = -\varvec{F}(\varvec{y}_k). \end{aligned}$$
(14)

The key property of Newton’s method is its quadratic local convergence, which is guaranteed if the derivative matrix \(\varvec{F}'(\varvec{y}) \in \mathbb {R}^{n \times n}\) is invertible and \(\varvec{F}'\) is Lipschitz continuous in a convex domain around \(\varvec{y}^*\). Modern affine-invariant convergence theory even provides sharp estimates of the convergence speed and convergence radius in terms of the Lipschitz constant (which measures the maximal curvature of \(\varvec{F}\)) and of the initial step \(\varvec{p}_{0}^N\) [11]; see also [3] for the seamless extension to constrained nonlinear least-squares problems. Moreover, global convergence can be enforced with a steplength control.

$$\begin{aligned} \varvec{y}_{k + 1} = \varvec{y}_k + \alpha _k \varvec{p}_k^N, \quad \alpha _k \in (0, 1], \end{aligned}$$
(15)

where the steplength \(\alpha _k\) is determined by a standard backtracking line search based on the Armijo condition with a parameter \(c \in (0, 1)\),

$$\begin{aligned} \left\| {\varvec{F}(\varvec{y}_k + \alpha _k \varvec{p}_k^N)}\right\| _2^2 \le (1 - \tfrac{c}{2}) \left\| {\varvec{F}(\varvec{y}_k)}\right\| _2^2 . \end{aligned}$$
(16)

This damped Newton iteration converges to a stationary point \(\varvec{y}^*\) of \(\left\| {\varvec{F}(\varvec{y})}\right\| _2\) under mild assumptions. In the following we refer to the standard Newton method as exact Newton method and reserve the notation \(\varvec{p}_k^N\) for the exact Newton step.

Another essential property of Newton’s method is that it suffices to solve the linear system (14) approximately: small errors in \(\varvec{p}_k^N\) do not destroy the local convergence, although the convergence rate will usually be superlinear or merely linear instead of quadratic. An easy way to exploit this property consists in replacing each matrix \(\varvec{F}'(\varvec{y}_k)\) with an invertible approximation \(\varvec{B}_k\), either an ad hoc choice or a systematic choice, when the exact derivative is expensive to evaluate or to factorize. Quasi-Newton methods represent the most prominent systematic approach in this spirit. Originally proposed by Broyden [4] based on prior work of Davidon [5], the key idea is to construct \(\varvec{B}_{k+1}\) from \(\varvec{B}_k\) by some cheap update formula such that the secant condition holds,

$$\begin{aligned} \varvec{B}_{k+1} \varvec{p}_{k} = \varvec{F}(\varvec{y}_{k+1}) - \varvec{F}(\varvec{y}_k), \end{aligned}$$
(17)

where the quasi-Newton step \(\varvec{p}_{k} \approx \varvec{p}_k^N\) solves

$$\begin{aligned} \varvec{B}_k \varvec{p}_{k} = -\varvec{F}(\varvec{y}_k). \end{aligned}$$
(18)

The matrix \(\varvec{B}_{k+1}\) is not uniquely determined by (17), and different additional requirements lead to different quasi-Newton methods. Moreover, it is possible to work directly with the inverse \(\varvec{H}_k = \varvec{B}_k^{-1}\) to obtain the step \(\varvec{p}_{k} = -\varvec{H}_k \varvec{F}(\varvec{y}_k)\) without solving a linear system. These quasi-Newton methods are primarily employed for minimizing (or maximizing) a function \(\phi : \mathbb {R}^n \rightarrow \mathbb {R}\) by solving the stationarity condition \(\phi '(\varvec{y}) = 0\). Here we have \(\varvec{F}(\varvec{y}) = \phi '(\varvec{y})\), the Jacobian of \(\varvec{F}\) is the symmetric Hessian of \(\phi \), \(\varvec{F}'(\varvec{y}) = \phi ''(\varvec{y})\), and the convergence theory is even richer than for general equations [8, 9, 12]. In the sequel we will use the term quasi-Newton method for any Newton type iteration that replaces \(\varvec{F}'(\varvec{y}_k)\) in (14) with some approximation \(\varvec{B}_k\) to compute the step \(\varvec{p}_{k}\) from (18).

Of course, approximate solutions of (14) can also be obtained by iterative linear solvers, which leads to the class of inexact Newton methods. Originally proposed by Dembo, Eisenstat and Steihaug [6, 7], these methods have been further developed and analyzed by Deuflhard [10, 11] and others. They produce a finite sequence of increasingly accurate approximations \(\varvec{p}_{k}^1, \dots , \varvec{p}_{k}^i =:\varvec{p}_{k}\) to the current exact step \(\varvec{p}_k^N\), thus offering the advantage of controlling the residual error \(\varvec{r}_k\) of the inexact Newton step \(\varvec{p}_{k} \approx \varvec{p}_k^N\),

$$\begin{aligned} \varvec{r}_k = \varvec{F}'(\varvec{y}_k) \varvec{p}_{k} + \varvec{F}(\varvec{y}_k), \end{aligned}$$
(19)

such that it matches the progress of the Newton iteration,

$$\begin{aligned} \left\| {\varvec{r}_k}\right\| \le \eta _k \left\| {\varvec{F}(\varvec{y}_k)}\right\| , \quad \eta _k \in (0, 1). \end{aligned}$$
(20)

A proper adaptive choice of the forcing sequence \((\eta _k)_{k \in \mathbb {N}}\) then guarantees superlinear or even quadratic local convergence. Thus inexact Newton methods can reduce the effort for solving the linear system (14). However, unlike quasi-Newton methods, they require the evaluation of the derivative \(\varvec{F}'(\varvec{y}_k)\) or at least of the product \(\varvec{F}'(\varvec{y}_k) \varvec{p}_{k}\) at every iterate.

In our setting, the quasi-Newton and inexact Newton algorithms use the same approximate Jacobian, \(\varvec{B}_k \approx \varvec{F}'(\varvec{y}_k)\), and the trial steps \(\varvec{p}_{k}^j\) are generated by iterative refinement,

$$\begin{aligned} \varvec{p}_{k}^j = \varvec{p}_{k}^{j-1} - \varvec{B}_k^{-1} \varvec{r}_k^{j-1}, \quad \varvec{r}_k^j = \varvec{F}'(\varvec{y}_k) \varvec{p}_{k}^j + \varvec{F}(\varvec{y}_k), \end{aligned}$$

starting with \(\varvec{p}_{k}^0 = 0\) and hence \(\varvec{r}_k^0 = \varvec{F}(\varvec{y}_k)\). Local convergence of the inexact Newton approach is then guaranteed if the approximation quality of each \(\varvec{B}_k\) is sufficiently good in the rigorous sense that \(\varvec{B}_k\) gives a spectral radius

$$\begin{aligned} \rho _k := \rho (\varvec{B}_k^{-1} \varvec{F}'(y_k) - I) {\mathop {<}\limits ^{!}} 1, \end{aligned}$$

because this implies convergence of \(\varvec{p}_{k}^j\) to \(\varvec{p}_k^N\),

$$\begin{aligned} \left\| {\varvec{r}_k^j}\right\| \le \rho _k \left\| {\varvec{r}_k^{j-1}}\right\| \le \rho _k^j \left\| {\varvec{F}(\varvec{y}_k)}\right\| \rightarrow 0. \end{aligned}$$

The quasi-Newton approach, where \(\varvec{p}_{k} = \varvec{p}_{k}^1\), can still converge slowly in this situation because it does not control the reduction of the residual error, and it can even diverge if \(\left\| {\varvec{F}(\varvec{y}_k + \varvec{p}_{k}^1)}\right\| \ge \left\| {\varvec{F}(\varvec{y}_k)}\right\| \). The inexact Newton approach is therefore preferable from a theoretical perspective.

We are now ready to proceed with the nonlinear Newton system (10) of the implicit time step for the aeroelastic simulation. Omitting the tilde for simplicity, the (discrete) variables are \(\varvec{y}= (\varvec{q}, \varvec{s}, \varvec{\lambda })\). The derivative \(\varvec{F}'(\varvec{y})\) consists of two parts: one from the mechanical structure and one from the aerodynamics,

$$\begin{aligned} \varvec{F}'(\varvec{y})&= \varvec{K}^{\textrm{str}}(\varvec{y}) + \varvec{K}^{\textrm{aero}}(\varvec{y}) \nonumber \\&= \left[ \begin{array}{ccc} \varvec{K}_{\varvec{q}\varvec{q}}(\varvec{y}) &{} \varvec{K}_{\varvec{q}\varvec{s}}(\varvec{y}) &{} H_d(\varvec{q})^T \\ \varvec{K}_{\varvec{s}\varvec{q}}(\varvec{y}) &{} \varvec{K}_{\varvec{s}\varvec{s}}(\varvec{y}) &{} 0 \\ H_d(\varvec{q})^T &{} 0 &{} 0 \end{array}\right] +\left[ \begin{array}{ccc} \varvec{K}^{\textrm{aero}}_{\varvec{q}\varvec{q}}(\varvec{y}) &{} \varvec{K}^{\textrm{aero}}_{\varvec{q}\varvec{s}}(\varvec{y}) &{} 0 \\ 0 &{} 0 &{} 0 \\ 0 &{} 0 &{} 0\end{array}\right] . \end{aligned}$$
(21)
Fig. 1
figure 1

Illustration of the rigidly supported flexible plate

The structural derivative \(\varvec{K}^{\textrm{str}}(\varvec{y})\) possesses a near-symmetric saddle-point structure with sparse blocks. It is comparatively cheap to evaluate and to factorize. The aerodynamic derivative \(\varvec{K}^{\textrm{aero}}(\varvec{y})\) is non-symmetric with two nearly dense blocks, which are very expensive to evaluate and which have small entries as compared to \(\varvec{K}^{\textrm{str}}(\varvec{y})\). Moreover, the full derivative \(\varvec{F}'(\varvec{y})\) is considerably more expensive to factorize than just \(\varvec{K}^{\textrm{str}}(\varvec{y})\) because of the two dense blocks. The individual matrix blocks are defined as follows. The first row contains partial derivatives of the dynamic equilibrium, giving structural blocks

$$\begin{aligned} \varvec{K}_{\varvec{q}\varvec{q}}(\varvec{y}) = \frac{\partial {\varvec{f}^{\textrm{int}}(\varvec{q},\varvec{s})}}{\partial {\varvec{q}}}, \,\, \varvec{K}_{\varvec{q}\varvec{s}}(\varvec{y}) = \frac{\partial {[\varvec{f}^{\textrm{int}}(\varvec{q},\varvec{s}) +\dot{\varvec{l}}(\varvec{s})]}}{\partial {\varvec{s}}}, \end{aligned}$$

and aerodynamic blocks

$$\begin{aligned} \varvec{K}^{\textrm{aero}}_{\varvec{q}\varvec{q}}(\varvec{y}) = -\frac{\partial {\varvec{f}^{\textrm{ext},\textrm{ae}} (\varvec{q},\varvec{s})}}{\partial {\varvec{q}}}, \,\, \varvec{K}^{\textrm{aero}}_{\varvec{q}\varvec{s}}(\varvec{y}) = -\frac{\partial {\varvec{f}^{\textrm{ext},\textrm{ae}} (\varvec{q},\varvec{s})}}{\partial {\varvec{s}}}. \end{aligned}$$

The second row contains partial derivatives of the momentum equivalence, giving

$$\begin{aligned} \varvec{K}_{\varvec{s}\varvec{q}}(\varvec{y}) = -\varvec{l}'(\varvec{q}), \quad \varvec{K}_{\varvec{s}\varvec{s}}(\varvec{y}) = \varvec{l}'(\varvec{s}). \end{aligned}$$

Finally the Jacobian of the holonomic constraint, \(\varvec{H}_d(\varvec{q}) = \varvec{h}'(\varvec{q})\), appears in the lower left and (transposed) in the upper right of \(\varvec{K}^{\textrm{str}}(y)\).

We consider three solution approaches for the nonlinear equation system:

  1. 1.

    the exact Newton method wherein (14) is solved directly;

  2. 2.

    a quasi-Newton method wherein (18) is solved directly with \(\varvec{B}_k = \varvec{K}^{\textrm{str}}(\varvec{y}_k)\) [23];

  3. 3.

    an inexact Newton method wherein (14) is solved by iterative refinement using the structural inverse \({\varvec{K}^{\textrm{str}}(\varvec{y}_k)}^{-1}\) as preconditioner for the full inverse \({\varvec{F}'(\varvec{y}_k)}^{-1}\).

At each iteration, approach (1) requires one factorization of \(\varvec{F}'(\varvec{y}_k)\) and one backsolve, approach (2) requires one factorization of \(\varvec{K}^{\textrm{str}}(\varvec{y}_k)\) and one backsolve, and approach (3) requires one factorization of \(\varvec{K}^{\textrm{str}}(\varvec{y}_k)\) and several backsolves. In all cases we employ Pardiso [34] from the Intel Math Kernel Library (MKL) as direct sparse solver for the factorization and the backsolves.

The cost per iteration is very high for approach (1) since the entire partially dense matrix \(\varvec{F}'(\varvec{y}_k)\) needs to be evaluated and factorized. For approach (2) the cost per iteration is very low since the expensive dense blocks of \(\varvec{K}^{\text {aero}}(\varvec{y}_k)\) are never evaluated and only the sparse matrix \(\varvec{K}^{\textrm{str}}(\varvec{y}_k)\) needs to be factorized. For approach (3) the cost per iteration is in between since again only \(\varvec{K}^{\textrm{str}}(\varvec{y}_k)\) needs to be factorized but \(\varvec{K}^{\text {aero}}(\varvec{y}_k)\) needs to be evaluated for the refinement steps.

4 Computational results

We consider two computational examples: a thin, flexible plate in an external airflow and a wind turbine in operating conditions. All computations are performed on a single core of a desktop PC with eight AMD Ryzen 7 7700 cores and 64 GiB of RAM. The simulation code DeSiO is implemented in Fortran 90, the inexact Newton algorithm is implemented in C++, and the time measurements are performed using the C++ library boost::chrono [37].

The time steps in both examples are so small that each Newton iteration starts in the local convergence domain; therefore we perform full-step iterations without any steplength control. Moreover, the employed approximate inverse \({\varvec{K}^{\textrm{str}}(\varvec{y})}^{-1}\) is quite close to the full inverse \({\varvec{F}'(\varvec{y})}^{-1}\) throughout all simulations. This means that all three Newton algorithms converge safely and rapidly, which allows for a direct comparison of computation times.

Table 1 Geometry and material parameters for beam model of flexible plate
Table 2 Simulation parameters for aeroelastic analysis of flexible plate

4.1 Plate

The following analysis explores the dynamic aeroelastic response of a thin, flexible aluminum plate when subjected to an external airflow. The plate, measuring \(10.0\,\hbox {m}\) in length, \(1.0\,\hbox {m}\) in width, and \(8.0\,\hbox {mm}\) in thickness, adheres to standard isotropic linear elastic properties, featuring a Young’s modulus of \(7.0 \times 10^{10}\,\hbox {N}/\hbox {m}^{2}\), a shear modulus of \(2.63\times 10^{10}\,\hbox {N}/\hbox {m}^{2}\), and a material density of \(2.7\times 10^{3}\,\hbox {kg}/\hbox {m}^{3}\). The plate’s free ends are hinge-supported, and the torsional degree-of-freedom at the left end is constrained as well. The air inflow is characterized by a free-stream velocity, denoted as \(V_{\infty } = 45\,\hbox {m}/\hbox {s}\), and an angle of attack \(\alpha = 15^{\circ }\). The flow is applied impulsively and the air density is specified as \(\rho _F = 1.225\,\hbox {kg}/\hbox {m}^{3}\). A summary of the geometric and material input data for both the structural and aerodynamic models is given in Table 1. In our aeroelastic model, we discretized the infinitely thin boundary layer using a mesh of \(m_A \times n_A = 50 \times 4\) vortex-rings, where \(m_A\) denotes the span-wise discretization and \(n_A\) represents the chord-wise discretization. The structural model is characterized by \(m_S = 50\) geometrically exact beam elements. As mentioned above, the information transfer between both models is ensured by a radial-based function. For this, it is necessary to define a fixed transferring radius which serves as a distance threshold, limiting the range of interest in transferring information among the models. For this example, we use a search radius of \(\gamma _{\textrm{ref}} = 0.501\,\hbox {m}\) at each node.

The simulation is conducted until a steady-state solution is attained, necessitating a total simulation time of \(T_n = 3\,\hbox {s}\). The convection of the free wake is achieved considering a characteristic length \(\Delta L = 0.25\,\hbox {m}\) resulting in a time increment \(\Delta t = 5.56 \times 10^{3}\,\hbox {s}\). To mitigate the effects of singularities resulting from vortex-induced velocities, we introduce a cut-off value, denoted as \(\delta = 0.01\). The main simulation parameters are summarized in Table 2.

In Fig. 2, we show some physical results for validation purposes. On top, in Fig. 2a, the displacement of the node at the center of the beam over time is given. It can be seen that the aerodynamic forces first excite the structure before aerodynamic damping ensures that a steady state is approached. The displacement occurs mainly in the vertical direction, i.e., almost no rotation takes place. A similar behavior is observed with regard to the lift coefficient \(C_L\), shown at the bottom in Fig. 2b. After some initial vibrations, the system approaches a steady state. The resulting angle of attack is shown together with the analytical solution for a finite plate [25] for the resulting steady-state angle of attack, \(\alpha = 15.05^\circ \). It should be noted that this analytical solution is derived for elliptic airfoils and small angles of attack. Therefore, the small deviation of ca. 12.5% in the resulting lift coefficient is acceptable. It should also be mentioned that the settings for our simulation are chosen not to ensure a description as physically accurate as possible but to test the different solution algorithms. As expected, the results for all chosen solution algorithms are identical since the same system of equation is solved.

Fig. 2
figure 2

Results of the aeroelastic simulation of the flexible plate

Table 3 Flexible plate: runtimes in seconds and iteration counts
Fig. 3
figure 3

Structural (left) and aeroelastic (right) model of the IEA 15 MW reference wind turbine including wake in DeSiO (blade surfaces shown for visualisation only)

Table 4 Wind turbine: runtimes in seconds and in percent of exact Newton (%E), quasi-Newton (%Q), inexact Newton (%I), and iteration counts

The absolute tolerance for all three Newton algorithms is \(10^{-8}\), i.e., each iteration terminates when \(\left\| {\varvec{F}(\varvec{y}_k)}\right\| \le 10^{-8}\). For the inexact Newton algorithm we consider two different forcing sequences,

$$\begin{aligned} \eta _k^1&= \min \Bigl (\frac{1}{2}, \left\| {\varvec{F}(\varvec{y}_k)}\right\| \Bigr ), \quad \eta _k^2 = 10^{-5} \eta _k^1. \end{aligned}$$

Both choices guarantee quadratic convergence. The computational results are presented in Table 3, where “Inexact Newton i” refers to forcing sequence \((\eta _k^i)\).

We observe that in all cases at most half the total simulation time is spent on solving the nonlinear equation system (10) by the respective Newton algorithms. The remaining computing time (roughly 2100 s total or 3.9 s per time step) is almost entirely spent on UVLM evaluations during the time integration. Within the Newton algorithms the time is also predominantly spent on UVLM evaluations, i.e., on arodynamic forces and, except for quasi-Newton, on the associated derivative matrices. We focus on the computation time spent in the Newton algorithms. Herein, the actual solving (of linear systems) amounts to less than 10% (exact), 5% (quasi-Newton), and 1% (inexact), respectively.

The quasi-Newton algorithm is clearly the fastest nonlinear solver in this simulation. On the one hand, it generates only approximate Newton increments, which increases the number of required Newton steps by roughly 20% as compared to the exact Newton algorithm. On the other hand, it avoids the evaluation of aerodynamic derivatives \(\varvec{K}^{\textrm{aero}}(\varvec{y}_k)\) entirely and in addition it solves the modified linear system much faster, which ultimately reduces the total solution time to just 13%. There is only one drawback: the performance of the quasi-Newton method depends entirely on the quality of all encountered approximate inverses, which is not controlled or even measured; therefore the algorithm can become very slow or even diverge in unfortunate cases.

The inexact Newton algorithm avoids this drawback, and with the first forcing sequence it often produces the same Newton increments as the quasi-Newton algorithm, taking only one or two linear system solutions per Newton step. However, the additional control of the linear system residual requires the evaluation of aerodynamic derivatives \(\varvec{K}^{\textrm{aero}}(\varvec{y}_k)\). Thus, although the linear system solving is just moderately slower than with the quasi-Newton algorithm and still ten times as fast as with the exact Newton algorithm, the total solution time increases beyond that of the exact Newton algorithm because of the additional 20% Newton steps. This is a very peculiar situation: the time for linear system solving is virtually irrelevant since the time for evaluating UVLM data dominates everything.

The previous observations motivate the choice of the second forcing sequence. Here we increase the effort spent in the (cheap) linear solver by asking for five additional decimal digits of accuracy for every Newton increment. In effect, this yields essentially the same accuracy as in the exact Newton algorithm and precisely the same number of Newton steps: 2625 instead of 3038. Now the number of iterative refinement steps almost doubles as compared to the first forcing sequence, but the time for linear system solution increases by less than 5% while the total solution time decreases by 14%, or by 7% as compared to the exact Newton algorithm.

In summary, with a succinct choice of the forcing sequence, the inexact Newton algorithm outperforms the exact Newton algorithm while matching its accuracy and guaranteeing quadratic convergence (provided that the structural inverse \({\varvec{K}^{\textrm{str}}(\varvec{y}_k)}^{-1}\) is sufficiently accurate for convergence of the iterative refinement). The quasi-Newton algorithm outperforms both of them by far, although with no convergence guarantee at all.

4.2 Wind turbine

As a second example we consider the transient simulation of a wind turbine in operating conditions. We use the IEA 15 MW reference wind turbine [14] with a rotor diameter of 240 m in this example. Since we are only interested in the aeroelastic simulation, we do not consider any hydrodynamic forces or control input, instead considering the turbine fixed at the bottom of the tower. To avoid an initial impulse on the system, we start the aeroelastic simulation in a state based on the results of two structural pre-simulations. We first consider the deformation due to gravitity and, based on this, we then perform the structural computations for the rotor to ensure that the velocities transferred from the structural model to the aerodynamic model are correct in the first time step.

The wind speed is constant at \(V_\infty = 8\,\hbox {m}/\hbox {s}\), and starting from its initial velocity the rotor can rotate freely, i.e., we do not consider any generator torque. According to the report [14], we set the pitch angle to a fixed value of \(\varphi = 0^{\circ }\) and the initial rotation speed to 5.6 rpm. These conditions are between the cut-in wind speed and the rated wind speed of the turbine.

The wind turbine model consists of four geometrically exact beams for the tower and the three blades while the hub and nacelle are modelled as rigid bodies. The material and geometrical properties of all components are again based on the report [14]. Only the blades are considered in the computation of the aerodynamic forces; the influence of tower, hub, and nacelle is neglected. Each beam consists of 20 elements, the vortex sheet is discretized by 20 elements in span-wise and 4 elements in chord-wise direction. A simulation time of 10 s is considered with a time step \(\Delta t = 0.05\,\hbox {s}\). The resulting wake is illustrated in Fig. 3.

The absolute tolerance for all three Newton algorithms is now \(10^{-6}\), and for the inexact Newton algorithm we now consider just one forcing sequence (which again guarantees quadratic convergence): \(\eta _k = \min (\frac{1}{2}, \left\| {\varvec{F}(\varvec{y}_k)}\right\| )\). The computational results are presented in Table 4.

Here the time spent on solving the nonlinear system (10) dominates the computational effort in case of the exact and inexact Newton algorithms: with roughly 1000 s (5 s per time step) it is almost three times as large as the cost for UVLM evaluations with roughly 350 s (1.7 s per time step). Within the Newton algorithms, however, the time is still predominantly spent on UVLM evaluations, which take almost 90% or even more for all three algorithms. The solving of linear systems takes 7.8% (exact), 10.3% (quasi-Newton), and 1.2% (inexact), respectively.

As before, the quasi-Newton algorithm is the fastest nonlinear solver with an even better relative performance: it is almost 12 times as fast as the exact Newton algorithm, in part due to the fact that the number of Newton steps is not increased. In constrast, it is slightly decreased from 992 to 986. This indicates that all approximate inverses are of excellent quality in this particular simulation.

The inexact Newton algorithm now produces the same iterates as the quasi-Newton algorithm, with exactly one linear system solutions per Newton step (which confirms the excellent quality of all approximate inverses). Hence the additional cost is immediate from the table: it amounts to 869.46 s of 959.30 s (4.348 s of 4.729 s per time step) for UVLM evaluations and 2.80 s of 11.73 s (0.014 s of 0.0587 per time step) for checking the residual accuracy in the linear solver. This is still an extreme overhead between a quasi-Newton algorithm and the corresponding inexact Newton algorithm which is based on the quasi-Newton inverse. In comparison to the exact Newton algorithm, on the other hand, the cost for UVLM evaluations and linear solving are both reduced, with the linear solver being responsible for 89% of the total time savings. This is in line with the typical behavior of inexact Newton techniques.

In summary, the inexact Newton algorithm now outperforms the exact Newton algorithm by a moderate factor (and without an artificially increased accuracy) whereas the quasi-Newton algorithm outperforms both of them by a factor over ten.

5 Conclusion

In this article we have tested a quasi-Newton type algorithm and an inexact Newton algorithm to accelerate nonlinear system solving in implicit integration schemes for aeroelastic simulations using the unsteady vortex-lattice method. Both algorithms are based on the structural inverse as approximation of the full aeroelastic inverse. On the tested computational examples the structural inverse approximates the full inverse sufficiently well for both approximate Newton algorithms to converge. On these examples the quasi-Newton type algorithm yields a roughly ten-fold gain in performance over the exact Newton algorithm. In contrast, the inexact Newton algorithm is found to be moderately faster than the exact Newton algorithm on the wind turbine problem but much slower than the quasi-Newton algorithm although both algorithms generate identical iterates. On the flexible plate example, the inexact Newton algorithm is even slower than the exact Newton algorithm but moderately faster with increased linear termination accuracy. This strange and unexpected behavior is easily explained by the fact that the UVLM evaluations dominate the computational cost to an extent that makes the cost for linear system solving almost irrelevant. In conclusion, the situation remains unsatisfactory because of two facts. First, the inexact Newton algorithm is much slower than the quasi-Newton algorithm even if both of them generate exactly the same Newton iterates, just because the latter checks for sufficient accuracy while the former does not. Second, the performance of the quasi-Newton algorithm depends entirely on the approximation quality of the approximate inverses encountered; the algorithm can diverge if that quality happens to be very poor, even in cases where the inexact Newton algorithm still guarantees quadratic convergence. Our future work will therefore explore more efficient ways to compute the derivatives of aerodynamic forces that are required to check sufficient accuracy of the linear residuals.