Runge–Kutta–Möbius methods

In the numerical integration of nonlinear autonomous initial value problems, the computational process depends on the step size scaled vector field hf as a distinct entity. This paper considers a parameterized transformation hf ↦ hf ∘ (I − γhf)⁻¹ and its role in the finite step size stability of singly diagonally implicit Runge–Kutta (SDIRK) methods. For a suitably chosen γ > 0, the transformed map is Lipschitz continuous with a reasonably small constant whenever hf is negative monotone. With this transformation, an SDIRK method is equivalent to an explicit Runge–Kutta (ERK) method applied to the transformed vector field. Through this mapping, the SDIRK methods' A-stability and linear order conditions are investigated.
The latter are closely related to approximations of the exponential function e^z that are polynomial in z when considering ERK methods, and polynomial in terms of the transformed variable z(1 − γz)⁻¹ in the case of SDIRK methods. Considering the second family of methods, and expanding the exponential function in terms of this transformed variable, the coefficients can be expressed in terms of Laguerre polynomials. Lastly, a family of methods is constructed using the transformed vector field, and its order conditions, A-stability, and B-stability are investigated.


Introduction
Devised by Dahlquist, the linear test equation ẋ = λx, with parameter λ ∈ C, is the standard problem for analyzing the numerical stability of time stepping methods for initial value problems of ordinary differential equations (ODEs). Although numerical stability depends on both the problem parameter λ and the method's step size h > 0, it does so only through their product z = hλ ∈ C. Thus, it can be analyzed in terms of a single parameter, the scaled vector field z. For example, integrating the test equation by a Runge–Kutta (RK) method, we obtain the recursion x_{n+1} = S(z) x_n, where the stability function S(z) is a polynomial P(z) if the method is explicit, and a rational function R(z) if the method is implicit. The method's stability region consists of the set of z ∈ C which are mapped by the stability function S into the unit disk, i.e., |R(z)| ≤ 1 in the implicit case, and |P(z)| ≤ 1 in the explicit case.
Since P(z) → ∞ as z → ∞, all explicit methods have bounded stability regions. Thus, explicit methods will do as long as |z| ≪ 1, corresponding to nonstiff problems, but only implicit methods can be stable for large values of z. To be useful for stiff differential equations, implicit methods are typically designed so that the stability region {z ∈ C : |R(z)| ≤ 1} covers a large portion, possibly all, of C⁻. This way numerical stability can be maintained without severe step size restrictions.
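The contrast between bounded and unbounded stability regions can be checked numerically. The sketch below (a minimal illustration, not from the paper) compares the stability polynomial of explicit Euler, P(z) = 1 + z, with the rational stability function of implicit Euler, R(z) = 1/(1 − z), at a stiff value of z:

```python
# Numerical illustration: explicit Euler has a bounded stability region
# (|P(z)| <= 1 with P(z) = 1 + z), while implicit Euler's stability region
# (|R(z)| <= 1 with R(z) = 1/(1 - z)) covers the whole left half-plane.

def explicit_euler_stable(z: complex) -> bool:
    """|P(z)| <= 1 with P(z) = 1 + z."""
    return abs(1 + z) <= 1

def implicit_euler_stable(z: complex) -> bool:
    """|R(z)| <= 1 with R(z) = 1/(1 - z)."""
    return abs(1 / (1 - z)) <= 1

# a stiff value z = h*lambda deep in the left half-plane
z = complex(-100.0, 0.0)
print(explicit_euler_stable(z))   # False: |1 + z| = 99 > 1
print(implicit_euler_stable(z))   # True: |1/(1 - z)| = 1/101 < 1
```

Both methods are stable at the nonstiff value z = −1, but only the implicit one remains stable when |z| is large.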
Although simple, the linear test equation has strong implications. In a broader context, Re(z) < 0 corresponds to uniform negative monotonicity and dissipation. Likewise, |z| ≫ 1 corresponds to problems with large scaled Lipschitz constants. The idea of this paper is to transform the scaled vector field into another vector field, which can be handled by an explicit method. We seek a map M : C → C such that Re(z) ≤ 0 implies that |M(z)| is bounded by a moderate constant.

The simplest choice is a Möbius transformation

w = M(z) = z/(1 − γz),

where γ > 0 is chosen so that the left half-plane Re(z) ≤ 0 is mapped into a disk of moderate radius, |w + 1/(2γ)| ≤ 1/(2γ). Here the imaginary axis in the z-plane is mapped to the boundary of the disk, and z = −1 to its inside (see Fig. 1). The motivation is that a polynomial P(w) is then equivalent to a rational function of z, namely P(z/(1 − γz)). Thus, applying an explicit Runge–Kutta (ERK) method (with a bounded stability region) to the modified vector field (which has a moderate scaled Lipschitz constant) is equivalent to applying a particular kind of implicit RK method with unbounded stability region to the original vector field, which is only assumed to be dissipative. We shall demonstrate that for a single parameter γ this procedure is equivalent to a singly diagonally implicit Runge–Kutta (SDIRK) method, while, if several different parameters γ are chosen, it is equivalent to a DIRK method. We then use this equivalence to explore the behaviour of SDIRK methods on linear problems. This leads us to an expansion of the exponential function in terms of modified Laguerre polynomials. We then explore how a similar transformation may be used to define a family of RK methods whose B-stability and consistency are easy to characterize. A useful review of general purpose DIRK-type methods is given by [6], with many examples of the different method properties and of the aspects to be considered in the choice of methods. A full treatment of explicit and implicit RK methods is given in [2, 4, 5]. This also includes the special topic of B-stability [1] and its relation to A-stability [3].

Fig. 1 Half-planes Re z < a ≤ 0 are mapped by the transformation z ↦ w = z/(1 − γz) into circles centered at −(1/(2γ))·(1 − 2aγ)/(1 − aγ) with radius (1/(2γ))·1/(1 − aγ). Depicted is the γ = 1 case, with color shades corresponding to different values of a. In particular, for a = 0, the left half plane Re z ≤ 0 is mapped into the disk |w + 1/2| ≤ 1/2, a subset of the unit disk.
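The disk-mapping property of the Möbius transformation can be checked numerically. The following sketch (an illustration written for this discussion) samples the closed left half-plane and verifies that every image lies in the disk of radius 1/(2γ) centered at −1/(2γ):

```python
# Numerical check: for gamma > 0, w = z/(1 - gamma*z) sends Re z <= 0
# into the disk |w + 1/(2*gamma)| <= 1/(2*gamma).
import random

def moebius(z: complex, gamma: float) -> complex:
    return z / (1 - gamma * z)

gamma = 1.0
center, radius = -1 / (2 * gamma), 1 / (2 * gamma)

random.seed(0)
ok = True
for _ in range(10_000):
    # sample the closed left half-plane: Re z <= 0, arbitrary imaginary part
    z = complex(-random.expovariate(0.1), random.gauss(0, 10))
    w = moebius(z, gamma)
    ok = ok and abs(w - center) <= radius + 1e-12
print(ok)  # True
```

Points on the imaginary axis, such as z = i, land exactly on the boundary of the disk, in line with the boundary statement above.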

From the test equation to systems of ODEs
The transformation applied to the linear test equation can be adapted to linear and nonlinear systems of ordinary differential equations.

The linear case
The linear scalar test equation provides a sufficient model for diagonalizable systems of ODEs. If A ∈ Rⁿˣⁿ represents the vector field and A = TΛT⁻¹ is its spectral decomposition, then, taking y(t) = T x(t), the systems ẏ = Ay and ẋ = Λx are equivalent. The latter system is merely a collection of scalar equations, whose solutions decrease monotonically if and only if the eigenvalues reside in the left complex half plane, i.e., α[A] ≤ 0, where α[A] is the spectral abscissa of A.
Integrating this system with an RK method yields the recursion x_{n+1} = S(hΛ)x_n, where S is the method's stability function and h = t_{n+1} − t_n is the time step, with x_n ≈ x(t_n). This recursion is equivalent to y_{n+1} = S(hA)y_n, with y_n ≈ y(t_n). Therefore, the condition for stability is |S(hλ)| ≤ 1 for every eigenvalue λ of A. In other words, if the stability function S maps the negative half plane Re z ≤ 0 into the unit disk (the A-stability condition), then the numerical solution is stable whenever the differential equation is. This result generalizes to systems of equations using the Euclidean logarithmic norm and matrix norm to replace the real part and absolute value, respectively. Thus, if S satisfies the A-stability condition above, any matrix having a nonpositive Euclidean logarithmic norm, M₂[hA] ≤ 0, will map to a contraction, ‖S(hA)‖₂ ≤ 1, and guarantee stability. Here, the logarithmic norm M₂[A] is the largest eigenvalue of the symmetric part (A + Aᵀ)/2. This pattern also generalizes to the nonlinear case, with certain restrictions on A- and B-stability. Following [8], for a vector field f : Rⁿ → Rⁿ we define its least upper bound (lub.) logarithmic Lipschitz constant

M₂[f] = sup_{u≠v} ⟨f(u) − f(v), u − v⟩ / ‖u − v‖₂²

and its Lipschitz constant

L₂[f] = sup_{u≠v} ‖f(u) − f(v)‖₂ / ‖u − v‖₂.

We remark that a greatest lower bound logarithmic Lipschitz constant can be defined similarly, with inf in place of sup in the former equation. We will not use this quantity directly, therefore we omit the lub. qualifier, i.e., by the logarithmic Lipschitz constant we will understand M₂. Given γ > 0, let us define M_γ : Rⁿˣⁿ → Rⁿˣⁿ, a mapping between matrix spaces, such that

M_γ(A) = A(I − γA)⁻¹.

Theorem 2.1 If h, γ > 0 and A ∈ Rⁿˣⁿ is a matrix, then the implication chain

M₂[hA] ≤ 0  ⟹  ‖M_γ(hA) + (2γ)⁻¹I‖₂ ≤ (2γ)⁻¹  ⟹  ‖M_γ(hA)‖₂ ≤ γ⁻¹

holds. In other words, the nonpositive definiteness of the symmetric part of hA implies a circle condition on M_γ(hA), which leads to the h-independent bound on the latter.
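The conclusion of Theorem 2.1 can be observed numerically. The sketch below (the matrix A is a randomly generated example, not from the paper) shifts a random matrix so that its logarithmic norm is negative, applies M_γ, and checks the spectral-norm bound:

```python
# Numerical illustration of Theorem 2.1: if the Euclidean logarithmic norm
# of A is nonpositive, then ||A (I - gamma*A)^{-1}||_2 <= 1/gamma.
import numpy as np

rng = np.random.default_rng(0)
gamma = 0.5

B = rng.standard_normal((6, 6))
# shift B so that its symmetric part is negative definite, i.e. M2[A] < 0
mu2 = np.linalg.eigvalsh((B + B.T) / 2).max()
A = B - (mu2 + 0.1) * np.eye(6)

M = A @ np.linalg.inv(np.eye(6) - gamma * A)   # Moebius transform M_gamma(A)
norm = np.linalg.norm(M, 2)                    # spectral norm
print(norm <= 1 / gamma)  # True
```

The bound is independent of how large ‖A‖₂ itself is, which is the point of the transformation.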
Instead of proving this theorem separately, we show how the same chain of implications holds in a general, nonlinear setting, where a scaled uniformly negative monotone vector field is transformed by the Möbius map to a vector field with small scaled Lipschitz constant.

The nonlinear case
Let us fix a γ > 0 and introduce the function spaces of uniformly negative monotone vector fields, {f : M₂[f] ≤ 0}, and of Lipschitz continuous vector fields with constant at most γ⁻¹, {g : L₂[g] ≤ γ⁻¹}. Using these we can extend the Möbius map to the nonlinear case as

M_γ(f) = f ∘ (I − γf)⁻¹.

The domain and range in this definition are justified by the following theorem.
Theorem 2.2 If M₂[f] ≤ 0, then g = M_γ(f) is well defined and satisfies the circle condition ‖g(u) − g(v) + (2γ)⁻¹(u − v)‖₂ ≤ (2γ)⁻¹‖u − v‖₂ for all u, v, and consequently L₂[g] ≤ γ⁻¹.

Proof The second implication follows from a reverse triangle inequality. To show the first, we start from the inequality defining negative monotonicity,

⟨f(u) − f(v), u − v⟩ ≤ 0,

which holds for all u, v in some suitably chosen domain. To further simplify notation, we will use capital letters to refer to these differences: F = f(u) − f(v) and U = u − v. Then this inequality (intended in the "for all possible pairs of n-vectors" sense) becomes ⟨F, U⟩ ≤ 0.
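A scalar instance makes the Lipschitz bound of the nonlinear theorem concrete. In the sketch below (the example field f(x) = −x³ is chosen for this illustration, not taken from the paper), the transformed map is evaluated by inverting y − γh f(y) = x with bisection, and a crude sampled Lipschitz estimate stays below γ⁻¹:

```python
# Scalar sketch: f(x) = -x**3 is negative monotone, and the transformed
# map G = hf o (I - gamma*hf)^{-1} has Lipschitz constant at most 1/gamma,
# no matter how large the step size h is.

def G(x: float, h: float = 10.0, gamma: float = 1.0) -> float:
    """Evaluate the transformed scaled field at x by inverting
    y - gamma*h*f(y) = x, i.e. y + gamma*h*y**3 = x, with bisection."""
    lo, hi = -max(1.0, abs(x)), max(1.0, abs(x))
    for _ in range(100):                 # y + gamma*h*y**3 is increasing in y
        mid = (lo + hi) / 2
        if mid + gamma * h * mid**3 < x:
            lo = mid
        else:
            hi = mid
    y = (lo + hi) / 2
    return (y - x) / gamma               # equals h*f(y) by the defining equation

# crude Lipschitz estimate over sample pairs; should not exceed 1/gamma = 1
xs = [i / 10 - 5 for i in range(101)]
Gs = {x: G(x) for x in xs}
L = max(abs(Gs[a] - Gs[b]) / abs(a - b) for a in xs for b in xs if a != b)
print(L <= 1.0)  # True
```

Note that hf itself has a huge Lipschitz constant on this interval (its derivative reaches −760), while the transformed map does not.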

SDIRK ⇔ ERK + M
Let us consider the Möbius transform of a step size scaled vector field hf,

hg = M_γ(hf) = hf ∘ (I − γhf)⁻¹.

Here, as we have seen, the modified vector field has an h-independently small scaled Lipschitz constant, in the sense that even if hf has a large Lipschitz constant, hg has a small one, L₂[hg] ≤ γ⁻¹. Therefore it is possible to solve the modified problem numerically using an explicit Runge–Kutta method. This leads us to our main equivalence result, stated in the following theorem.

Theorem 3.1 Applying an ERK method to the transformed vector field hg = M_γ(hf) yields the same numerical solution as applying the corresponding SDIRK method with diagonal coefficient γ to the original vector field hf.

Proof Let us take a single step of step size h from x₀ to x₁ using a general s-stage explicit Runge–Kutta method given by its Butcher tableau (a_ij, b_i). A step with the explicit method takes the form

X_i = x₀ + h Σ_{j<i} a_ij g(X_j)   (i = 1, …, s),   x₁ = x₀ + h Σ_{i=1}^s b_i g(X_i).

Introducing the variables Z_i = (I − γhf)⁻¹(X_i), so that g(X_j) = f(Z_j) and X_i = Z_i − γh f(Z_i), the step is equivalent to

Z_i = x₀ + h Σ_{j<i} a_ij f(Z_j) + γh f(Z_i)   (i = 1, …, s),   x₁ = x₀ + h Σ_{i=1}^s b_i f(Z_i).

Here we recognize the formula of a time step by an SDIRK method with Butcher tableau Ã = A + γI, b̃ = b. Let us remark that a similar argument works in the DIRK case, however the transformation is more complicated. When we are at step n and time t_n, we may define a stage-dependent transformed field hg_i = M_{γ_i}(hf) (i = 1, …, s). Then the above argument can be repeated with the appropriate γ_i in place of γ.
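On linear problems the equivalence can be verified directly through stability functions. The sketch below is an assumption-laden illustration: it uses the tableau relation Ã = A + γI, b̃ = b (our reading of the construction) and checks, for Heun's explicit method, that the ERK stability polynomial evaluated at w = z/(1 − γz) agrees with the SDIRK stability function at z:

```python
# Linear check of the SDIRK <=> ERK + M_gamma equivalence via the identity
# R_SDIRK(z) = R_ERK(z/(1 - gamma*z)), with A_sdirk = A + gamma*I.
import numpy as np

def rk_stability(A: np.ndarray, b: np.ndarray, z: complex) -> complex:
    """R(z) = 1 + z * b^T (I - z A)^{-1} * 1 for a Runge-Kutta tableau (A, b)."""
    s = len(b)
    e = np.ones(s)
    return 1 + z * (b @ np.linalg.solve(np.eye(s) - z * A, e))

gamma = 0.5
A = np.array([[0.0, 0.0], [1.0, 0.0]])   # Heun's explicit method
b = np.array([0.5, 0.5])
A_sdirk = A + gamma * np.eye(2)          # diagonal entries equal to gamma

for z in [-2.0 + 1.0j, -10.0 + 0j, -0.3 + 0.7j]:
    w = z / (1 - gamma * z)
    assert abs(rk_stability(A, b, w) - rk_stability(A_sdirk, b, z)) < 1e-12
print("equivalent")
```

The identity follows from (I − z(A + γI))⁻¹ = (1 − γz)⁻¹(I − wA)⁻¹ with w = z/(1 − γz), so the agreement is exact up to rounding.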

SDIRK methods through the Möbius transformation
In this section we investigate the behaviour of SDIRK methods on linear problems through the Möbius transformation.
The two fundamental topics of interest are stability and consistency. In the linear case, both of these are studied through R̂, the stability function of the method. The first is related to the magnitude of R̂, the second to the ability of R̂ to approximate the (complex) exponential map.

Stability
In Sect. 1 we have argued that if the ERK method's stability function is R, then the stability function of the method obtained by first transforming the vector field, then applying this ERK method to it is

R̂(z) = R(z/(1 − γz)).

A-stability then becomes the condition |R̂(z)| ≤ 1 for all Re z ≤ 0. Therefore it is enough to require that the image of the left half plane by the Möbius transformation is contained in the stability region of the explicit method. This image is the disk centered at −1/(2γ) with radius 1/(2γ), thus A-stability may be written as |R(w)| ≤ 1 for all |w + 1/(2γ)| ≤ 1/(2γ). Letting P(z) = R((−1 + z)/(2γ)), the condition becomes that P should map the closed unit disk into itself. Denoting the coefficients of P by c_k, so that P(z) = Σ_k c_k z^k, and forming a vector c of these coefficients, we have the following. Due to Parseval's theorem, a necessary condition is that ‖c‖₂ ≤ 1. On the other hand, one sufficient condition is that ‖c‖₁ ≤ 1, implying that ‖c‖₂ ≤ 1/√(deg P + 1) is enough.
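The gap between the necessary ℓ²-condition and the sufficient ℓ¹-condition can be seen on small examples. The sketch below (example polynomials chosen for this illustration) evaluates the sup of |P| on the unit circle, which by the maximum modulus principle equals the sup over the disk:

```python
# Coefficient conditions for P to map the unit disk into itself:
# ||c||_1 <= 1 is sufficient; ||c||_2 <= 1 alone is only necessary.
import cmath

def P(c, z):
    return sum(ck * z**k for k, ck in enumerate(c))

def sup_on_circle(c, n=3600):
    # maximum modulus principle: the sup over the disk is attained on |z| = 1
    return max(abs(P(c, cmath.exp(2j * cmath.pi * k / n))) for k in range(n))

c_good = [0.5, 0.25, 0.25]   # ||c||_1 = 1          -> maps disk into disk
c_bad = [0.7, 0.7]           # ||c||_2 < 1 but ||c||_1 = 1.4
print(sup_on_circle(c_good) <= 1 + 1e-9)   # True
print(sup_on_circle(c_bad) > 1)            # True: P(1) = 1.4 at z = 1
```

The second polynomial satisfies the Parseval bound yet escapes the unit disk at z = 1, showing that ‖c‖₂ ≤ 1 by itself does not imply A-stability.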

Consistency
As we have already mentioned, the order of consistency depends on how well the stability function approximates the exponential map. More precisely, the SDIRK method satisfying the linear order conditions up to order p can be expressed briefly as

R̂(z) = R(z/(1 − γz)) = e^z + O(z^{p+1}).

This implies that we are facing the problem of approximating e^z by R(z/(1 − γz)) for some polynomial R.

Möbius-Laguerre expansion of e z
Let us introduce the modified Laguerre polynomials

ℓ_n(x) = L_n(x) − 2L_{n−1}(x) + L_{n−2}(x),

with the convention L_{−1} = L_{−2} = 0, where L_n is the nth Laguerre polynomial [7, with α = 1]. Then the following theorem holds: wherever the series converges,

e^z = Σ_{n=0}^∞ (−γ)^n ℓ_n(1/γ) (z/(1 − γz))^n.
Proof We start from the generating function of the Laguerre polynomials with α = 1,

Σ_{n≥0} L_n(x) t^n = (1 − t)⁻² exp(−xt/(1 − t)).

Multiplying both sides by (1 − t)², this is equivalent to

exp(−xt/(1 − t)) = Σ_{n≥0} (L_n(x) − 2L_{n−1}(x) + L_{n−2}(x)) t^n = Σ_{n≥0} ℓ_n(x) t^n,

where the term −xt can be moved into the sum with n = 1, since ℓ₁(x) = L₁(x) − 2L₀(x) = −x. Substituting t ↦ t(t − 1)⁻¹ and using that t(t − 1)⁻¹ is an involution, so that −xt/(1 − t) turns into xt, we arrive at our result with x = 1/γ and t = γz. We remark that the relation between Laguerre polynomials and the stability function of the SDIRK methods has been explored previously [5], but not through the Möbius transformation perspective.
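Both the classical generating function and the resulting expansion can be checked numerically. The sketch below states its assumptions explicitly: L_n is the generalized Laguerre polynomial with α = 1 computed by the standard three-term recurrence, ℓ_n = L_n − 2L_{n−1} + L_{n−2}, and the expansion reads e^z = Σ_n (−γ)^n ℓ_n(1/γ) (z/(1 − γz))^n:

```python
# Numerical check of the alpha = 1 Laguerre generating function and of the
# Moebius-Laguerre expansion of e^z built from the modified polynomials.
import math

def laguerre1(n: int, x: float) -> float:
    """L_n^{(1)}(x) via the three-term recurrence
    (k+1) L_{k+1} = (2k+2-x) L_k - (k+1) L_{k-1}."""
    if n == 0:
        return 1.0
    prev, cur = 1.0, 2.0 - x
    for k in range(1, n):
        prev, cur = cur, ((2 * k + 2 - x) * cur - (k + 1) * prev) / (k + 1)
    return cur

def ell(n: int, x: float) -> float:
    """Modified Laguerre polynomial ell_n = L_n - 2 L_{n-1} + L_{n-2}."""
    val = laguerre1(n, x)
    if n >= 1:
        val -= 2 * laguerre1(n - 1, x)
    if n >= 2:
        val += laguerre1(n - 2, x)
    return val

# generating function: sum L_n^{(1)}(x) t^n = (1-t)^{-2} exp(-x t/(1-t))
x, t = 0.7, 0.3
series = sum(laguerre1(n, x) * t**n for n in range(80))
closed = math.exp(-x * t / (1 - t)) / (1 - t) ** 2
print(abs(series - closed) < 1e-10)  # True

# Moebius-Laguerre expansion of e^z at z = -0.4 with gamma = 0.5
gamma, z = 0.5, -0.4
w = z / (1 - gamma * z)
expansion = sum((-gamma) ** n * ell(n, 1 / gamma) * w**n for n in range(60))
print(abs(expansion - math.exp(z)) < 1e-9)  # True
```

The expansion converges quickly here because |γw| = 1/6 while ℓ_n(1/γ) grows only subexponentially.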

A remark on implementation
The mathematical equivalence outlined in this paper is well mirrored in code.
In a fairly standard imperative style implementation of an SDIRK method one has three main layers (loops) of computation. First there are the time steps. Inside each of these are the stage steps, which calculate the stage values and derivatives. Inside each of these calculations one has to solve a typically nonlinear equation of the form

Z_i = v + γh f(Z_i),

where v collects the already known terms. This is usually done using an iterative Newton-like method, which becomes our last layer of computation; below this lie the majority of the vector field evaluations.
If implemented in the Runge-Kutta-Möbius sense, the layers stay the same with the distinction that the iterative solver is moved down to the layer of vector field evaluations.
The reason for this is twofold. Firstly, in an explicit method there is no need for equation solving during the stage steps. Secondly, implementing an inversion such as (I − γhf)⁻¹ amounts to the same kind of iterative solve. Therefore the two viewpoints yield similar codes, with similar computational costs. However, when the evaluation of the inverse map is cheaper than the solution of the corresponding nonlinear equation, the Möbius style dominates.
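The two code organizations can be contrasted on a single stage. The sketch below uses hypothetical helper names and an example scalar field chosen for this illustration; in the classical layout the iterative solve lives in the stage loop, while in the Möbius layout it hides inside the transformed field g:

```python
# One SDIRK stage with diagonal coefficient gamma, written two ways.

def f(x: float) -> float:          # example vector field, not from the paper
    return -x - x**3

def solve_stage(v: float, gamma_h: float, iters: int = 100) -> float:
    """Fixed-point solve of X = v + gamma_h * f(X) (contractive for small gamma_h)."""
    X = v
    for _ in range(iters):
        X = v + gamma_h * f(X)
    return X

def stage_sdirk(v: float, gamma_h: float) -> float:
    """Classical view: solve for the stage value, then evaluate f."""
    X = solve_stage(v, gamma_h)
    return f(X)

def g(v: float, gamma_h: float) -> float:
    """Moebius view: g = f o (I - gamma*h*f)^{-1}; the solver sits inside g."""
    return f(solve_stage(v, gamma_h))

v, gh = 0.8, 0.05
print(abs(stage_sdirk(v, gh) - g(v, gh)) < 1e-14)  # True
```

The two functions compute the same value by construction; the difference is purely in which layer of the code owns the iterative solver.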

Construction
Assume a fixed γ > 0. Let us introduce the elementary Runge–Kutta–Möbius method N₁(α), identified with its step function

N₁(α) : x ↦ x + α hg(x),   hg = M_γ(hf).

This is a single stage implicit Runge–Kutta method, since with Z = (I − γhf)⁻¹(x) the step reads Z = x + γh f(Z), x₁ = x + αh f(Z), i.e., a method with Butcher tableau a₁₁ = γ, b₁ = α. We define the s-stage elementary Runge–Kutta–Möbius (RKM) method as a composition of these,

N_s(α₁, …, α_s) = N₁(α_s) ∘ ⋯ ∘ N₁(α₁).

We will use the following remark in showing that these are Runge–Kutta methods.

Remark Applying an ERK method to M_γ(hf) and applying the corresponding SDIRK method to hf define the same step map.

Proof This is the functional form of Theorem 3.1.

Proof
We proceed by induction. The one-stage case is clear.

Stability
The stability function of these methods takes the form

R(z) = Π_{i=1}^s (1 + α_i z/(1 − γz)),

clearly so, since this is the product of the stability functions of the components. Due to the construction, both A- and B-stability can be guaranteed by requiring the components to be A- and B-stable, respectively.
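The component factors can be probed numerically. The sketch below takes the single-stage stability function R₁(z) = 1 + αz/(1 − γz) as read from the construction (an assumption of this illustration) and samples the imaginary axis; by the maximum principle, a bound there extends to the left half-plane, and a product of bounded factors inherits the bound:

```python
# Check that R1(z) = 1 + alpha*z/(1 - gamma*z) maps the imaginary axis
# into the closed unit disk for the sampled values of alpha in [0, 2*gamma].

def R1(z: complex, alpha: float, gamma: float) -> complex:
    return 1 + alpha * z / (1 - gamma * z)

gamma = 1.0
ok = True
for alpha in (0.25, 1.0, 2.0):            # all within [0, 2*gamma]
    for k in range(-2000, 2001):
        z = 1j * (k / 10)                 # sample the imaginary axis
        ok = ok and abs(R1(z, alpha, gamma)) <= 1 + 1e-12
print(ok)  # True
```

For α = 2γ the factor has modulus exactly 1 on the imaginary axis (it is then the Cayley-type map (1 + z)/(1 − z) for γ = 1), which is why the tolerance is needed.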
Let us characterize the B-stability of the components.
From the assumptions we have that ⟨J, H⟩ ≤ 0 ≤ ⟨H, H⟩. Considering the signs of γ − α and γ + α, there are four cases. On the one hand, when γ ≥ α and γ ≥ −α, the inequality holds.

On the other hand, setting f = cI and dividing both sides by ⟨J, J⟩ > 0, we obtain a contradiction in the remaining cases. Thus, picking constants c < 0 appropriately, we see that the case where γ ≥ α and γ ≥ −α hold is the only possible one.
Proof Apply Theorem 5.3 to the elementary RKM method components.

Consistency
In Theorem 5.2, we have seen that these are SDIRK methods. Therefore, the weight Φ_s(t) of a rooted tree t can be expressed as a polynomial in γ whose coefficients do not depend on γ, and these coefficients can be expressed in terms of tree weights of the underlying ERK method [2]. We are going to concentrate on the latter. More precisely, we shall provide formulae for separating the last k of the b_i parameters from the rest in the order conditions. Firstly, one might separate b_s from the rest using a formula in which unroot maps a tree to a forest by removing its root node (and the corresponding edges). We will use t′ ∈ t to denote the members of this forest.
We introduce the formal expressions (t ( j) ).
We will use the shorter notation and write this expression as For example, These satisfy the recursion

Theorem 5.5 If k ≤ s, then
Proof The k = 0 case is clear. We proceed by induction.
Corollary 5.6 If k ≤ s, then, for any lanky tree t l , holds.
Proof Unrooting a lanky tree yields a forest that has a single member, a lanky tree of size one less. Thus, (k) can be removed from the formula, and we are left with the elementary symmetric polynomials.

Corollary 5.7
Given γ and the first s − k of the b i coefficients, it is possible to construct a polynomial such that choosing its roots as the last k of the b i coefficients, the method satisfies the first k linear order conditions.
Proof Apply the previous formula to the first k lanky trees one by one to recursively obtain equations, one for each i = 1, …, k, in the elementary symmetric polynomials of the last k of the b_i coefficients.
These are Viète-formulae that provide the coefficients of the polynomial.

Conclusion
In this paper we have considered the complex Möbius transformation z ↦ z/(1 − γz) for some γ > 0. This maps the left complex half plane into the disk of radius (2γ)⁻¹ centered at −(2γ)⁻¹, and in particular inside the circle of radius γ⁻¹.
Firstly, we have extended this transformation to linear systems. In Theorem 2.1, we have shown that this extension maps matrices with nonpositive spectral abscissa to matrices with 2-norm at most γ −1 .
Secondly, we have extended this transformation to the nonlinear system case via the formula M_γ(f) = f ∘ (I − γf)⁻¹. In Theorem 2.2, we have shown that this extension maps uniformly negative monotone, Lipschitz-continuous vector fields to ones with a Lipschitz constant at most γ⁻¹.
Thirdly, we have argued that a step size scaled vector field h f transformed this way will therefore have an h-independent, small bound on its Lipschitz constant. Therefore an ERK method may be applied to the transformed vector field. In Theorem 3.1, we have shown the equivalence SDIRK ⇔ ERK + M γ , which says that applying an ERK method and transforming the step size scaled vector field with M γ yields the same numerical solution as applying the corresponding SDIRK method.
Fourthly, we have used the Möbius transformation to view the stability function of SDIRK methods, and by consequence their linear order and stability conditions, in a new light. The transformation led us to prove a Möbius–Laguerre expansion of the exponential function. Lastly, we have used another Möbius transformation to define a new family of RKM methods. We have shown that these are SDIRK methods. In Theorem 5.3, we have extended the proof of Theorem 2.2 to characterize their B-stability, and finally, explored their order conditions.