Consistent treatment of incompletely converged iterative linear solvers in reverse-mode algorithmic differentiation

Algorithmic differentiation (AD) is a widely-used approach to compute derivatives of numerical models. Many numerical models include an iterative process to solve non-linear systems of equations. To improve efficiency and numerical stability, AD is typically not applied to the linear solvers. Instead, the differentiated linear solver call is replaced with hand-produced derivative code that exploits the linearity of the original call. In practice, the iterative linear solvers are often stopped prematurely to recompute the linearisation of the non-linear outer loop. We show that in the reverse-mode of AD, the derivatives obtained with partial convergence become inconsistent with the original and the tangent-linear models, resulting in inaccurate adjoints. We present a correction term that restores consistency between adjoint and tangent-linear gradients if linear systems are only partially converged. We prove the consistency of this correction term and show in numerical experiments that the accuracy of adjoint gradients of an incompressible flow solver applied to an industrial test case is restored when the correction term is used.


Introduction
The computation of gradients is required for numerous applications, such as shape and topology optimisation, error estimation, goal-based mesh adaptation and uncertainty quantification. Algorithmic differentiation (AD), which automatically produces accurate derivatives for numerical codes [13,23], is a commonly used technique [3,5,8,16,32]. In typical numerical models this involves the solution of large linear systems of the form $Ax = b$, which often represents the most expensive part of a computation. We assume here that $A \in \mathbb{R}^{n \times n}$ is a known non-singular matrix and $x, b \in \mathbb{R}^n$ are the unknown and right-hand side (RHS) vectors, respectively. Historically, linear solver methods have been categorised into two main groups, namely direct solvers and iterative solvers, even though this classification has become increasingly blurred by developments that combine solvers from either category. Direct solvers are typically robust and widely used in scientific computing packages, but scale poorly with the problem size. Because of this, applications that require the solution of large linear systems, such as CFD flow solvers, often use efficient iterative linear solvers [29]. Commonly used methods include CG, BiCG, GMRES, and algebraic or geometric multi-grid methods.
When AD is applied to an algorithm that uses a linear solver, the linear solver itself is typically differentiated by manually produced replacement derivative code rather than applying AD to the solver. This is often the only practical option, for example if the linear solver is part of an external library, or if an AD-differentiated solver would be computationally inefficient [6,9] or numerically unstable [21]. A manual differentiation can take into account high-level mathematical properties of a given function, which may not be exploited by an automated AD process.
The most common way of manually differentiating calls to linear solvers (direct or iterative) is presented in [9] and hereafter referred to as Differentiated Solver Replacement (DSR). The approach assumes that a linear solver call, x=solve(A,b), is equivalent to the expression $x = A^{-1} b$, which is valid if the solver computes the solution to machine precision. In this case, the derivative computation can be performed using another call to the same linear solver for a modified system, as shown in Sect. 2.
Often a numerical algorithm solves a non-linear problem and converges to a steady-state solution within a fixed-point iteration (FPI) loop. An example is the typical iterative approach to solving non-linear systems, consisting of a number of outer, non-linear iterations, each of which performs a linearisation and contains calls to linear system solves. In the early phase of convergence to the non-linear solution, it is not efficient to solve the linear system exactly for a linearisation based on a poor approximation. An example in Computational Fluid Dynamics (CFD) is the typical segregated approach to solve the incompressible Navier-Stokes equations through a sequence of linear problems for the momentum and pressure correction equations [7]. In a straightforward application of AD to such algorithms, the gradients are accumulated from a zero initial solution, hence are not in FPI form. Different techniques [31] have been presented to obtain an FPI discrete adjoint of such algorithms, for instance the reverse accumulation method [4]. More recently, some AD tools (e.g. Tapenade [15]) even offer this capability as an option for reverse differentiation. However, except for fully coupled systems, the implementation of fixed-point adjoints for algorithmically differentiated codes is complex, and accumulation of gradients is most often used.
Note that in the original problem the incomplete linear solves do not affect the accuracy of the final (primal) solution, provided enough outer iterations are conducted. Contrary to what one might expect, incomplete linear solves in an accumulated adjoint that uses DSR lead to inaccurate sensitivities, as the analysis and the numerical experiments in this paper show.
In this paper, we show that this is caused by neglecting the influence of the initial guess on the linear system solution, which can be significant if the system is not fully converged. The proposed C-DSR correction achieves consistency between primal and adjoint gradient computation by correctly modelling the adjoint derivatives of an algorithm that uses truncated iterative solvers with the same convergence threshold used for the primal linear systems.
A number of studies have investigated the correction of objective functionals using estimated errors and weighting with adjoint sensitivities, e.g. [12] and [33]. The approaches perform post-processing and consider error estimates derived from the converged steady-state flow solution and weight this with the converged adjoint field. This produces a correction to the objective functional computed from the converged primal. The algorithm proposed in this paper is different, in that it corrects the errors arising from incomplete linear solves in each accumulation step during the computation of the adjoint solution.
The structure of the paper is as follows. In Sect. 2, a brief introduction of AD and DSR is presented. The shortcomings of DSR in the context of reverse-mode AD of algorithms with incompletely converged linear solvers, as well as the proposed correction method C-DSR, are presented in Sect. 3. In Sect. 4 we show numerical experiments that demonstrate the effectiveness of C-DSR. Finally, a summary and conclusions are presented in Sect. 5.

Background
In this section, a brief background of AD is provided. Following this, the DSR in both forward and reverse-mode AD is presented.

Algorithmic differentiation
AD is a technique that evaluates the derivative of the output of a computer program with respect to its inputs. AD differentiates a given primal computer program by applying the chain rule of calculus to the program's sequence of elementary operations (e.g. additions, subtractions, transcendental functions) [13].
AD has two basic modes of operation, namely the forward-mode (resulting in a tangent-linear model of the primal), and the reverse-mode (producing an adjoint model of the primal). The tangent-linear model computes the product of the Jacobian matrix of the primal program with a given seed vector that has the same number of dimensions as the program input. In contrast, the adjoint model computes the product of the transpose Jacobian with a seed vector that has the size of the primal output.
In the application of AD to numerical codes, the derivative of a given scalar objective function with respect to a scalar primal input variable can be computed at almost equal cost in both tangent-linear and adjoint models. However, in many applications such as gradient-based shape optimisation with CFD, the number of design parameters is much larger than the number of objective functions that are to be computed. As a consequence, the use of adjoint models is essential to compute the gradients at a computational cost that is independent of the number of control variables [10,11,19,24].
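The difference between the two modes can be illustrated with a small sketch. The function `f`, its hand-written Jacobian and the names `tangent` and `adjoint` are hypothetical and stand in for AD-generated code; they are not taken from any of the tools cited here.

```python
import numpy as np

def f(x):
    """Hypothetical primal program mapping R^3 -> R^2."""
    return np.array([x[0] * x[1] + np.sin(x[2]), x[0] ** 2 - x[1] * x[2]])

def jacobian(x):
    """Hand-derived 2 x 3 Jacobian of f, used here in place of AD."""
    return np.array([
        [x[1], x[0], np.cos(x[2])],
        [2.0 * x[0], -x[2], -x[1]],
    ])

def tangent(x, x_dot):
    """Tangent-linear model: Jacobian times a seed of input size."""
    return jacobian(x) @ x_dot

def adjoint(x, y_bar):
    """Adjoint model: transposed Jacobian times a seed of output size."""
    return jacobian(x).T @ y_bar

x = np.array([0.5, -1.2, 2.0])
x_dot = np.array([0.3, -0.7, 0.2])
y_bar = np.array([1.0, 0.0])  # selects the first output as the objective

# Full gradient of the first output: one adjoint call ...
grad_adjoint = adjoint(x, y_bar)
# ... versus one tangent call per input variable.
grad_tangent = np.array([tangent(x, e)[0] for e in np.eye(3)])

print(grad_adjoint, grad_tangent)
```

Both routes give the same gradient, but the number of tangent calls grows with the number of inputs while the adjoint needs a single call per objective.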
A variety of AD tools have been developed in the past, which differ in the supported languages and the techniques used. Examples include Tapenade [15], ADIFOR [1], ADOL-C [14], dco/c++ [18], CoDiPack [30] and ADiMat [2]. The discussion in this study applies to all types of AD tools.

Model problem
Consider a non-linear system of the form with $\alpha$ as input and $x$ the solution to the system. The problem can be re-formulated as Applying a linearisation technique, the numerical solution to such a system can be obtained by an iterative algorithm where ℙ is the algorithm operator and the system is considered to be fully solved when m is almost zero. In each iteration of this algorithm a linear system needs to be solved: $A_m x_m = b_m$, which itself is often solved by an iterative linear solver.
In many numerical models, the objective functional J that is to be differentiated depends implicitly on the design variable $\alpha$ through the solution $x(\alpha)$ of a non-linear system of equations similar to (1). Assuming $J = J(x(\alpha), \alpha)$, the general form of such an algorithm is shown in Algorithm 1.
In Algorithm 1, the subscripts (m, M) denote non-linear (outer) iterations while the superscript N denotes the solution after N linear (inner) solver iterations. The arrows denote output. $A$ and $b$ are updated in the non-linear loop and $x_m^N$ is the approximate solution to the linear system after N inner iterations at outer iteration m. For each linear solve, solve, x is both an input (as the initial guess, $x_m^0$) and an output (as the solution, $x_m^N$). The objective functional J depends on the final solution of the algorithm, $x_M^N$. In the following, we consider the case that the number of inner iterations N is not sufficient to fully converge the linear, inner solver to machine accuracy, so that only $x_m^N \approx A_m^{-1} b_m$ holds. However, we assume that a sufficient number M of outer, non-linear iterations is conducted, each containing N inner, linear iterations. In this way, in the final outer iterations, the linear system is sufficiently close to the non-linear system, and the error in both non-linear and linear system solutions is close to machine precision.
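The structure described above can be sketched on a small model problem. Everything specific in this sketch is an assumption made for illustration: a steady 1D heat conduction problem with k(T) = 1 + T, the boundary values, the face averaging of k, and the choice of the first interior node in the objective. The names `assemble`, `jacobi` and `primal` are hypothetical.

```python
import numpy as np

T_L = 1.0  # assumed left boundary temperature

def assemble(T_int, T_R):
    """Linearise around the current state: build A and b for the next solve."""
    T = np.concatenate(([T_L], T_int, [T_R]))
    k_face = 1.0 + 0.5 * (T[:-1] + T[1:])    # assumed k(T) = 1 + T at cell faces
    kw, ke = k_face[:-1], k_face[1:]         # west/east face of each interior node
    A = np.diag(kw + ke) - np.diag(ke[:-1], 1) - np.diag(kw[1:], -1)
    b = np.zeros(T_int.size)
    b[0] = kw[0] * T_L                       # Dirichlet values enter the RHS
    b[-1] = ke[-1] * T_R
    return A, b

def jacobi(A, b, x0, n_iter):
    """Inner solver: a fixed number of Jacobi sweeps starting from x0."""
    d = np.diag(A)
    R = A - np.diag(d)
    x = x0.copy()
    for _ in range(n_iter):
        x = (b - R @ x) / d
    return x

def primal(T_R, n_outer, n_inner, n=10):
    """Outer non-linear loop with truncated inner solves."""
    T = np.linspace(T_L, T_R, n + 2)[1:-1]   # initial guess on interior nodes
    for _ in range(n_outer):
        A, b = assemble(T, T_R)              # A_m, b_m from the previous state
        T = jacobi(A, b, T, n_inner)         # x_m^0 is the previous solution
    J = 100.0 * T[0] ** 2                    # objective on one interior node
    return J, T

J_few, T_few = primal(2.0, n_outer=1500, n_inner=3)
J_many, T_many = primal(2.0, n_outer=300, n_inner=60)
print(J_few, J_many)
```

With enough outer iterations, the run with only three Jacobi sweeps per outer iteration reaches the same state as the run with nearly converged inner solves, which is the behaviour of the primal assumed in the text.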

Differentiated solver replacement: tangent-linear
The forward differentiation of the gradient of the objective functional J w.r.t. $\alpha$ requires the differentiation of the non-linear algorithm which at iteration 'm' reads Knowing that the number of primal outer iterations is enough to drive m to zero, the differentiated system can be simplified as This requires computing a solution to the differentiated linear system (2) in each differentiated outer iteration as $A_m \dot{x}_m = \dot{b}_m - \dot{A}_m x_m$.

The forward differentiation of Algorithm 1 is shown in Algorithm 2. A function appended with the suffix '_d' represents the tangent-linear derivative of that function.
Algorithm 2: Forward-mode appl. of AD to Algorithm 1.
The forward-differentiation in Algorithm 2 naturally inherits the fixed-point form of the primal, hence the resulting tangent-linear solution and the gradients computed with it are also impervious to incomplete inner solves, as long as the number of outer iterations is sufficient. If the problem to solve is steady, then linearisation around the converged solution to (2) is sufficient, making the entire problem linear which means that inner and outer iterations solve the same problem.
As mentioned in the Introduction, the differentiation of linear solvers is in practice often performed using an approach that we refer to as differentiated linear solver replacement or DSR. A sample pseudo code of DSR in forward-mode for the linear solver in Algorithm 1 is illustrated in Algorithm 3.
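A forward-mode DSR can be sketched as follows, under the DSR assumption that the solver returns the exact solution. Here `np.linalg.solve` stands in for a fully converged linear solver, and the names `solve_d`, `A_dot` and `b_dot` are hypothetical.

```python
import numpy as np

def solve_d(A, A_dot, b, b_dot):
    """Forward DSR: reuse the same solver for the tangent system.

    Assumes x = A^{-1} b holds exactly, so that
    A x_dot = b_dot - A_dot x.
    """
    x = np.linalg.solve(A, b)
    x_dot = np.linalg.solve(A, b_dot - A_dot @ x)
    return x, x_dot

rng = np.random.default_rng(1)
n = 5
A = rng.uniform(-1.0, 1.0, (n, n)) + n * np.eye(n)  # well-conditioned test matrix
b = rng.uniform(-1.0, 1.0, n)
A_dot = rng.uniform(-1.0, 1.0, (n, n))              # arbitrary input perturbations
b_dot = rng.uniform(-1.0, 1.0, n)

x, x_dot = solve_d(A, A_dot, b, b_dot)

# Central finite difference of the solve along the same perturbation
h = 1e-6
x_fd = (np.linalg.solve(A + h * A_dot, b + h * b_dot)
        - np.linalg.solve(A - h * A_dot, b - h * b_dot)) / (2 * h)
print(np.max(np.abs(x_dot - x_fd)))
```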

Differentiated solver replacement: adjoint
The objective J is assumed here to depend on the control $\alpha$ and the state $x$: $J = J(\alpha, x)$. Its derivative is hence In many applications, the function $J(\alpha)$ can be computed explicitly, without requiring linear solvers. We therefore focus in this work on the term $J_x \dot{x}$ which is implicit, that is, it involves a linear solve for $\dot{x}$. Since $x$ is a function of $A$ and $b$, the function J also depends on $A$ and $b$, or formally, $x = x(A, b)$ and $J = J(A, b)$, and we can expand and transpose as By definition, the adjoint of the reverse-differentiated variables is which simplifies (5) to where the column vectors are expressed as row matrices and ⟨ , ⟩ denotes an inner product between matrices. Substituting $\dot{x}$ with $A^{-1}(\dot{b} - \dot{A}x)$, one can rearrange (6) into $\langle \dot{x}, \bar{x} \rangle = \langle \dot{A}, \bar{A} \rangle + \langle \dot{b}, \bar{b} \rangle$. Then, recalling from linear algebra [20], the inner product between two matrices $M_1$ and $M_2$ reads $\langle M_1, M_2 \rangle = \mathrm{Tr}(M_1^T M_2)$. Here 'Tr' stands for the trace of a matrix, i.e., the sum of its diagonal elements. Now we can expand and rewrite the inner products in the RHS of (7), Finally, (8) and (9) can be substituted into (7) Therefore, $\bar{b}$ and $\bar{A}$ at iteration 'm' can be expressed as follows [9]: $\bar{b}_m = A_m^{-T} \bar{x}_m$ and $\bar{A}_m = -\bar{b}_m x_m^T$. In practice, the adjoints are incremented because they may already contain previously computed sensitivities from elsewhere in the program (see Algorithm 4). One can derive the adjoint of $A$ and $b$ and obtain the reverse-DSR as, The reverse-mode application of AD to Algorithm 1 and the hand-assembled reverse DSR are illustrated in Algorithms 4 and 5. A function appended with the suffix '_b' represents the reverse derivative of that function. For brevity, brackets are used to show the accumulation of sensitivities for matrices and vectors via reverse-differentiated functions.
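The reverse DSR relations from [9] can be checked with a dot-product test. This is a minimal sketch assuming exact solves, with `np.linalg.solve` standing in for a fully converged solver; the function name `solve_b` and the argument order are hypothetical.

```python
import numpy as np

def solve_b(A, x, x_bar, A_bar, b_bar):
    """Reverse DSR: increment the adjoints of A and b for x = A^{-1} b."""
    b_inc = np.linalg.solve(A.T, x_bar)   # adjoint system A^T b' = x_bar
    b_bar = b_bar + b_inc                 # adjoints are incremented, not overwritten
    A_bar = A_bar - np.outer(b_inc, x)    # outer product -b' x^T
    return A_bar, b_bar

rng = np.random.default_rng(2)
n = 5
A = rng.uniform(-1.0, 1.0, (n, n)) + n * np.eye(n)
b = rng.uniform(-1.0, 1.0, n)
x = np.linalg.solve(A, b)

# Arbitrary tangent seeds and the corresponding tangent of x
A_dot = rng.uniform(-1.0, 1.0, (n, n))
b_dot = rng.uniform(-1.0, 1.0, n)
x_dot = np.linalg.solve(A, b_dot - A_dot @ x)

# Arbitrary adjoint seed, adjoints start from zero here
x_bar = rng.uniform(-1.0, 1.0, n)
A_bar, b_bar = solve_b(A, x, x_bar, np.zeros((n, n)), np.zeros(n))

lhs = x_dot @ x_bar                            # <x_dot, x_bar>
rhs = np.sum(A_dot * A_bar) + b_dot @ b_bar    # <A_dot, A_bar> + <b_dot, b_bar>
print(lhs, rhs)
```

The two sides agree to round-off only because both the tangent and the adjoint systems are solved exactly in this sketch.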


Algorithm 4: Reverse-mode appl. of AD to Alg. 1 in which the adjoint sensitivities are accumulated.
As we will show in Sect. 3, in contrast to the primal and its forward differentiation, if the adjoint linear systems are not fully solved, $\bar{b}'_m \neq A_m^{-T} \bar{x}_m$, they introduce an error to the system that does not vanish even after a large number of outer iterations. In the next section, this error and its correction will be discussed.

Corrected differentiated solver replacement in reverse-mode
In this section, the forward and reverse differentiation of the Jacobi solver within an outer, non-linear iterative solver is discussed. Furthermore, we discuss the effect that the initial guess has on the solution, when the differentiated inner linear solver is only partially converged. The C-DSR correction method is then developed, which includes a correction term for this error. Finally, we demonstrate the benefit of this correction method. We choose the Jacobi solver because it is easy to prove properties of its differentiation and convergence. However, our C-DSR method also benefits other solvers, as we will show later in this paper.

Error correction for reverse differentiation of Jacobi solver
The system matrix $A_m$ can be decomposed as $A_m = D_m + R_m$, where $D_m$ and $R_m$ hold the diagonal and off-diagonal entries of $A_m$, respectively. The iterative relaxation scheme can be written as $x_m^{n+1} = D_m^{-1}(b_m - R_m x_m^n)$. The error due to incomplete Jacobi convergence can be expressed as $x_m^{n+1} - x_m = -D_m^{-1} R_m (x_m^n - x_m)$, where $x_m$ is the exact solution to the linear system at the $m$th outer iteration. Therefore, after N iterations starting from an initial guess $x_m^0$, the approximated solution obtained from the linear solver can be written explicitly as $x_m^N = x_m + (-D_m^{-1} R_m)^N (x_m^0 - x_m)$. The initial guess of the system is actually the solution to the linear system in the previous outer iteration, that is, $x_m^0 = x_{m-1}^N$.


Forward differentiation
The tangent-linear model has the same behaviour as its primal, that is, the initial guess for the differentiated solver replacement (DSR) is the solution to the system in the previous outer iteration (see Algorithm 2): $\dot{x}_m^0 = \dot{x}_{m-1}^N$. The tangent-linear model of (16) is given by where $\dot{x}_m$ is the exact solution to the tangent-linear problem, and the two error terms are due to the incomplete solve of the primal and tangent-linear problems, respectively. Even though the linear systems are not solved to machine precision in each outer iteration, the errors vanish when the outer loop is iterated sufficiently. Note that the number of outer iterations, M, is considered large enough such that $A_m = A_{m-1}$ and $b_m = b_{m-1}$. As a result, the initial guess and the final result of the linear solver are identical to machine precision in the final outer iteration, or formally, $x_M^0 = x_M^N$.

Reverse differentiation
The reverse-mode application of AD to the model is shown in Algorithm 4, where the sensitivities are accumulated over the reverse loop and, for clarity, the primal expressions are not depicted except in the DSR. The incomplete convergence of the adjoint linear system means with the residual of the system As a result, the computation of the terms $\bar{A}$ and $\bar{b}$ is affected in each DSR call such that It is not difficult to derive the derivative of J w.r.t. $\alpha$ in the reverse-mode from (4,5), which leads to an accumulated error given by The source of the error is the residual of the adjoint systems, and this error is accumulated over the outer iterations. It is important to realise that running more outer iterations does not remove the error, contrary to what might be extrapolated from the behaviour of the primal. Due to the accumulative nature of the adjoint differentiation, with standard DSR any incomplete convergence of the adjoint systems imparts an error on the gradients, which remains even if the number of outer iterations is enough for the primal algorithm to converge. To correct this error, the state of the art is to converge the inner adjoint system solves to machine precision, which makes the adjoint computation significantly more expensive than the primal. This paper proposes an alternative approach, namely an effective way to compute a correction for this error.

Reverse-DSR correction
Equation (18) can be rewritten as where the term is the influence of the initial guess on the approximated tangent-linear derivative in the $m$th outer iteration of the algorithm after N Jacobi steps (linear solver iterations), To derive the reverse differentiation of expression (25) we first rewrite it as the addition of three vectors: $\dot{x}_m^N = l_1 + l_2 + l_3$. (27)
As shown in Section 2.2.1 of [9], for such an equation the following expression holds in the reverse-mode: Moreover, from Section 2.2.2 of [9], the adjoint of a multiplication expression, gives From (28,29) the influence of the initial guess in the reverse-mode can be shown to be The vector $\bar{x}_m^0$ in (30) is one of the outputs of the differentiated solver (see Algorithm 4). On the other hand, DSR is based on the assumption that the linear systems are fully converged, meaning N is large enough such that However, incomplete convergence violates this assumption. If the adjoint linear system is not fully solved, the term $\bar{x}_m^0$ is not zero. Consequently, the sensitivity computation is inaccurate by the error shown in (24).
N matrix-vector products are required to compute (30), which is essentially as expensive as the primal linear solver. However, it can be computed much more cheaply as a by-product of a computation that is already part of the DSR.
In order to solve $A_m^T \bar{b}'_m = \bar{x}_m^N$ in DSR, one Jacobi iteration is performed as If the same number of iterations N is used for (32) as for the primal system, with an initial guess of zero, from (15) one obtains Using (33) it can be shown that computing the residual after N iterations can be done with a single matrix-vector product, which yields exactly the same result as (30): $\bar{x}_m^0 = \bar{x}_m^N - A_m^T \bar{b}'_m$.
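The identity behind this residual formula can be sketched as follows. The notation is an assumption made for this sketch and is not taken from the numbered equations: $D$ is the diagonal of $A$, $G = I - D^{-1}A$ is the Jacobi iteration matrix of the primal system, $H = I - D^{-1}A^T$ is that of the transposed system, and the outer index $m$ is dropped.

```latex
\begin{align*}
x^N &= G^N x^0 + (I - G^N) A^{-1} b
  && \text{($N$ primal Jacobi sweeps from } x^0\text{)} \\
\bar{x}^0 &= (G^N)^T \bar{x}^N = (I - A^T D^{-1})^N \bar{x}^N
  && \text{(exact adjoint of the initial guess)} \\
\bar{b}' &= (I - H^N) A^{-T} \bar{x}^N
  && \text{($N$ adjoint Jacobi sweeps from zero)} \\
\bar{x}^N - A^T \bar{b}' &= A^T H^N A^{-T} \bar{x}^N
  = (I - A^T D^{-1})^N \bar{x}^N = \bar{x}^0
  && \text{(since } A^T H = (I - A^T D^{-1}) A^T\text{)}
\end{align*}
```

The last line needs one product with $A^T$ and one vector subtraction, whereas evaluating the second line directly needs N matrix-vector products.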
Hence, if the adjoint system of DSR in reverse-mode is not fully solved, the output variable $\bar{x}_m^0$ can be defined as the residual of the system. We call this C-DSR, as in corrected DSR. The DSR and C-DSR approaches are compared in Algorithms 6, 7 and 8.
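The effect of the correction on an accumulated adjoint can be sketched with a small stand-in for the outer loop. This is not the solver used in this paper: the matrix `A` is fixed, the coupling `b = alpha * c + B @ x` between outer iterations is linear, and all names and data are hypothetical. The fixed matrix keeps the sketch short because no primal states need to be stored for the reverse sweep.

```python
import numpy as np

def jacobi(A, b, x0, n_iter):
    """A fixed number of Jacobi sweeps starting from x0."""
    d = np.diag(A)
    R = A - np.diag(d)
    x = x0.copy()
    for _ in range(n_iter):
        x = (b - R @ x) / d
    return x

def primal(alpha, A, B, c, g, n_outer, n_inner):
    """Outer loop with truncated inner solves; returns J = g . x."""
    x = np.zeros_like(c)
    for _ in range(n_outer):
        x = jacobi(A, alpha * c + B @ x, x, n_inner)
    return g @ x

def adjoint(A, B, c, g, n_outer, n_inner, corrected):
    """Accumulated adjoint dJ/dalpha using DSR or C-DSR in each step."""
    x_bar = g.copy()
    alpha_bar = 0.0
    for _ in range(n_outer):
        # Truncated adjoint solve of A^T b_bar = x_bar from a zero guess
        b_bar = jacobi(A.T, x_bar, np.zeros_like(x_bar), n_inner)
        if corrected:
            x0_bar = x_bar - A.T @ b_bar      # C-DSR: residual of adjoint system
        else:
            x0_bar = np.zeros_like(x_bar)     # DSR: initial guess is ignored
        alpha_bar += c @ b_bar                # b depends on alpha through c
        x_bar = B.T @ b_bar + x0_bar          # pass on to previous outer iteration
    return alpha_bar

rng = np.random.default_rng(0)
n = 6
A = rng.uniform(-1.0, 1.0, (n, n)) + 6.0 * np.eye(n)
B = 0.5 * rng.uniform(-1.0, 1.0, (n, n))
c = rng.uniform(0.0, 1.0, n)
g = rng.uniform(0.0, 1.0, n)

exact = g @ np.linalg.solve(A - B, c)     # derivative at the converged fixed point
tangent = primal(1.0, A, B, c, g, 60, 2)  # J is linear in alpha here
dsr = adjoint(A, B, c, g, 60, 2, corrected=False)
cdsr = adjoint(A, B, c, g, 60, 2, corrected=True)
print(exact, tangent, dsr, cdsr)
```

In this sketch the corrected adjoint reproduces the tangent-linear derivative for any number of outer iterations, while the uncorrected one keeps an error that more outer iterations do not remove.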

Application of C-DSR to other solvers
In the previous section we presented a correctness proof for C-DSR with Jacobi solvers. A similar proof can be established for any other linear solver that uses linear operators. Linear solvers with non-linear operators, such as GMRES or CG, do not yield to this type of analysis. However, the test cases shown in the remainder of this paper demonstrate that C-DSR also leads to improved consistency for other solvers, when incomplete convergence is set at levels typical for the primal algorithm.

Test cases
In this section we first demonstrate the effectiveness of C-DSR using a one-dimensional heat equation solver that uses Jacobi iterations to solve the linear systems. Then with a three-dimensional CFD solver we show that the application of C-DSR to Krylov-type linear solvers also improves the gradient accuracy.

One-dimensional (1D) non-linear heat equation
The first validation study is the finite-difference solution to a non-linear 1D steady-state heat conduction problem in a uniform rod lying on the x-axis from $x_L = 0$ to $x_R = 1$ where the heat conduction coefficient k is a simple linear function of temperature T, The domain (see Fig. 1) has 12 nodes and is discretised by central finite differences in space and backward Euler in time. Dirichlet boundary conditions are imposed on both ends. The temperature at the right boundary, $T_R$, is defined as the control variable and the objective function is evaluated as a function of the temperature at one of the internal nodes, $J = 100 \times (T_{(i=1)})^2$. The primal outer loop is iterated often enough that in the final outer iterations the error of the linear system is close to machine zero. The Tapenade source-transformation AD tool [15] is used to differentiate the code with checkpointing of all outer iterations in the reverse-mode.
It is worth noting that this is a steady-state problem that does not require time marching; hence the adjoint solution can be computed by linearising only around the final steady state solution, without checkpointing. We solve the primal and its adjoint in this way so that it can serve as a model problem that can be extended to more complex problems such as unsteady or segregated (decoupled) solvers later in the paper.
Two different settings are considered for the Jacobi solver (see Table 1). The results are compared in Table 2. They confirm that when the Jacobi solver is converged to machine precision, the sensitivity $dJ/dT_R$ obtained by DSR (in both forward-mode AD and adjoint) and the second-order finite difference computation are in good agreement. However, when the solver is not fully converged, the sensitivity computed with DSR in reverse-mode shows a relative error of 6%. C-DSR improves the accuracy of the gradient and reduces the error to machine precision.

Three-dimensional (3D) S-Bend Duct
The second validation study is an adjoint CFD computation of a VW Golf air climate duct [34], a benchmark case of the About Flow project [26] provided by Volkswagen AG. The flow is steady, laminar and incompressible with a Reynolds number of 300 at the inlet relative to the height of the duct. The domain is discretised with 40,000 hexahedral mesh cells.
The objective function is the mass-averaged pressure drop between inlet and outlet. To solve the flow, the in-house incompressible flow solver gpde [17] is used, which is based on the finite volume segregated SIMPLE pressure-correction method [25]. The arising linear systems for momentum and pressure correction are solved using bi-conjugate gradient stabilised (Bi-CGSTAB) and conjugate gradient (CG) linear solvers, respectively, from the SPARSKIT library [28]. The spring analogy method [27] is implemented in gpde to deform the volume mesh following a design change. The gpde solver is written in FORTRAN 90 and differentiated by the AD tool Tapenade [15] and without checkpointing all outer iterations. To compare sensitivities, the surface mesh coordinates of the middle S-section of the duct are perturbed by a cosine function defined by the bump centre and the surface normal. The perturbation is designed to create an inward bump in the duct (see Fig. 2) and the bump height is controlled by the design variable $\alpha$.
The differentiated code computes the derivative of the objective function at fully converged flow state w.r.t. the design variable, in this case the height of the perturbed bump.
In practice, the convergence criteria of linear solvers in non-linear numerical methods such as CFD solvers are determined from experience [7,22]. The solver settings for this duct flow using the gpde solver are shown in Table 3.
In addition, using several convergence criteria, different accuracies of the iterative linear solver are tested for DSR in reverse-mode to determine when the precision of the gradients, $\dot{J}$, computed with DSR matches that of C-DSR. The settings and the results are shown in Table 5.
The gradient computation comparison in Table 4 demonstrates the validity and significance of the correction for a practical application using Krylov solvers. Table 5 shows that tightening the convergence level improves the accuracy of gradients with DSR, but C-DSR still achieves a higher accuracy at a much smaller computational effort.

Summary and conclusions
The correct treatment of iterative linear solvers in forward and reverse-mode AD has been studied. The most commonly used previous method to differentiate linear solvers is based on the assumption that linear systems are fully converged, which in practice is often not the case. The analysis presented in our paper identifies the exact source of errors arising from incompletely converged linear systems used in inner iterations of the solution of non-linear unsteady or segregated problems. We show how this error is linked to the initial guess provided to the linear solver, and how the error accumulates to severely affect adjoint gradients of non-linear solvers. This is also demonstrated in two test cases.
The C-DSR correction term proposed in this paper is shown in our work to be exact for relaxation-type solvers such as Jacobi iterations and other iterative linear solvers. A test case with Jacobi solvers demonstrates the validity of the approach. The C-DSR correction is then applied to a test case from Computational Fluid Dynamics which uses Krylov-type solvers for the inner systems. Compared to DSR, the proposed correction shows a significant improvement in gradient accuracy at a much smaller computational cost.
Because the correction formula consists of only a single matrix-vector product and a vector subtraction, the computational cost of computing the correction is small, which makes our method affordable and beneficial for widespread practical application.