Practical splitting methods for the adaptive integration of nonlinear evolution equations. Part I: Construction of optimized schemes and pairs of schemes

We present a number of new contributions to the topic of constructing efficient higher-order splitting methods for the numerical integration of evolution equations. Particular schemes are constructed via setup and solution of polynomial systems for the splitting coefficients. To this end we use and modify a recent approach for generating these systems for a large class of splittings. In particular, various types of pairs of schemes intended for use in adaptive integrators are constructed.


Introduction
Operator splitting techniques for the efficient numerical integration of evolution equations ∂ t u(t) = F(u(t)), t ≥ 0, u(0) given, (1.1) have become increasingly popular in recent years. Splitting the right-hand side F(u) into two or more components in an appropriate way enables efficient and accurate approximations. In particular, a number of higher-order schemes with real or complex coefficients have been constructed and analyzed. Relevant contributions to this fields can, e.g., be found in [7,8,10,12,13,15,[19][20][21]23]. Furthermore, application to particular problem classes have been studied in the literature where the vector field F has special properties, such that splitting methods can be tuned for such cases. In [9] and [18], for instance, perturbations of integrable systems have been considered, say F(u) = A(u) + ε B(u) where ε is s small perturbation parameter. Exploiting this perturbation structure allows the construction of more efficient (de facto) higher-order schemes compared to generic ones.

Overview
We present some new contributions to the topic of splitting methods; here we will concentrate on the generic case, i.e., no special properties of the vector field F are assumed. At first we review the approach from [1] for the automatic setup of order conditions represented by polynomial equations in the coefficients to be determined. Special cases involving symmetries or composition methods based on lower-order schemes can be treated as well. Splitting of the right-hand side of (1.1) into two or three components is considered. The goal is to identify good schemes of a desired order p. 'Good' refers to a compromise between efficiency (minimizing effort) as well as accuracy (minimizing a measure for the expected behavior of the local error). In particular, we focus on the constructions of pairs of schemes of orders ( p, p + 1), where a scheme of order p acts as a 'worker', while a related scheme of order p + 1 plays the role of a 'controller' for the purpose of practical local error estimation. The idea of using pairs of embedded schemes (an idea related to Runge-Kutta pairs) is due to [17]. Via more flexible embeddings, optimized variants can be constructed. Here, 'optimization' means searching for schemes where a reasonable measure for the behavior of the local error becomes minimal among a set of comparable schemes. It is well-known that this is a very relevant point, because such local error measures may vary over several orders of magnitude.We also consider alternative ways of choosing ( p, p + 1)-pairs, e.g., adjoint pairs.
Concerning the search for optimal solutions for a given set of order conditions (see Sect. 4), different techniques were applied, depending on the particular case at hand, including exact, symbolic solution representations using 1 Maple (for lower-order schemes), or numerical searches using optimization tools or straightforward Monte-Carlo techniques.
The ultimate purpose is adaptive integration of evolution equations based on a reliable local error control. This topic has been studied in detail, in particular in the context of Schrödinger equations, in [2,[4][5][6]. In these papers, an alternative method for local error estimation has been constructed and analyzed. It is based on a computable high order approximation of an integral representation of the local error in terms of the defect of the numerical solution. While this approach is rather universal and useful in several cases, the alternative of using optimized pairs of schemes, if applicable, will usually be more efficient.
In Part II of this work we will present a detailed study of adaptive integration, using both approaches for local error estimation, for different types of linear and nonlinear evolution equations.

Remark 1
Recently we became aware of the paper [9], where a method of deriving order conditions has been proposed which is similar to our approach. Both approaches are based on the notion of a Lyndon basis (also called Lyndon-Shirshov basis) in a free Lie algebra. In view of the similarities between our work and [9], we stress that we have implemented a fully automatic computational procedure for deriving order conditions which requires no extra analytical hand work. This is a versatile implementation, and it can easily be adapted to cover special cases like palindromic schemes, flexible embeddings, and also splitting into more than two operators (see Sects. 2, 3).
The procedure for setting up higher order conditions involves the generation of long weighted sums of power products of noncommuting variables representing the components of the split vector fields. These sums can easily be distributed in order to obtain a significant speed-up in a parallel environment, and we have realized such a version.

Problem setting and notation
For an evolution equation (1.1) where the right-hand side is split into two components, a single step of a multiplicative splitting scheme, starting from u and over a step of length h, is given by 2  and is also expected to be extended in the future, depending on further investigations on the topic at hand. We will refer to this webpage throughout as reference [3] to avoid listing coefficients in the present paper for the sake of brevity. Some remarks on the schemes collected in [3] are given in Sect. 5; for more detailed information about the properties of the various schemes we also refer to [3]. In Sect. 7 we present a numerical example.

Order conditions
Many authors have contributed to the topic of finding good methods. For an overview on the topic see [7,20]. Here we do not attempt to describe the relevant approaches and results in detail but mainly refer to work related to our present activity. For the relevant mathematical background we refer to [7,15,20]. Among many others, [8,10,12,13] are devoted to the construction of optimal higherorder methods with real or complex coefficients, either via composition or by solving a set of order conditions generated in different ways. Order conditions take the form of a polynomial system in the unknown coefficients or composition weights ω μ , see Sect. 2.2. In the following we recapitulate and illustrate by examples how order conditions can be set up according to [1]; as mentioned before, this is similar to one of the approaches taken in [9]. Later on we will also present optimized schemes and pairs of schemes obtained on the basis of this approach, where 'optimized' means that a measure for the local error is chosen as small as possible.

Setup of order conditions
There are different ways to generate a polynomial system representing the conditions on the splitting coefficients for a desired order p. An essential theoretical basis is the well-known Baker-Campbell-Hausdorff (BCH) formula, see for example [15].
The approach proposed in [1], which we follow here, also relies on the BCH formula, but order conditions are set up in a completely automatic way. Most of the schemes and pairs of schemes specified in [3] have been obtained on the basis of the algorithm from [1]. In the following we explain and illustrate this approach by means of examples. For the purpose of generating order conditions it is sufficient to consider the case of a linear operator split into two parts A and B. We denote Consider the Taylor expansion of the local error 3 of a one-step method starting at u, The method is of order p iff L (h) = O(h p+1 ); thus the conditions for order p are given by For the case of a splitting method we have (with k = (k 1 , . . . , k s ) ∈ N s 0 ) If the conditions (2.2) are satisfied up to a given order p, then the leading term of the local error is given by h p+1 dh p+1 L (0). This leading error term is a linear combination of higher-order commutators of the operators A and B. As explained in [1], a non-redundant set of order conditions can be built in a recursive way by generating the symbolic expressions (2.3) for q = 1, 2, 3, . . . in terms of formally linear but non-commuting operators A, B, and identifying coefficients associated with power products of A-and B-factors which uniquely identify commutators out of an appropriate basis of Lie-elements. For this purpose we use the so-called Lyndon basis, also called Lyndon-Shirshov basis, of the free Lie algebra generated by A and B. The elements of this basis are represented by the (associative) Lyndon words over the alphabet {A, B}, see Table 1.
Let us first illustrate the procedure by means of a simple example.
The basic consistency condition for order p = 1 is d dh L (0) = 0 which is equivalent to a 1 +a 2 = 1 and b 1 +b 2 = 1. Assuming these first-order conditions are satisfied, the second derivative d 2 dh 2 L (0), which now represents the leading error term, simplifies to the commutator expression giving the additional condition 2 a 2 b 1 = 1 for order p = 2. Assuming now that the conditions for p = 2 are satisfied, the third derivative d 3 dh 3 L (0), which will now represent the leading error term, is a linear combination of the commutators If a scheme of order 3 is desired, the system of equations is augmented by the further equations 3 a 2 2 b 1 = 1 and 3 a 2 b 2 1 = 1. (For the case s = 2 displayed here, the resulting system of equations has no solution; we need s ≥ 3.) In general, for arbitrary s and p, this procedure is continued up to the desired order, by 'implicit recursive elimination' as described in [1], automatically producing a generically non-redundant set of order conditions for a desired order p. This process is based on a special bijection between (associative) Lyndon words and bracketed, non-associative versions of these words which, in our context, are identified with higher-order commutators representing basis elements for the free Lie algebra generated by A and B. The expanded version of such a commutator is a Lie polynomial in terms of the non-commutative variables A and B. The essential point is that its leading monomial, with respect to (alphabetically increasing) lexicographical order, is precisely the monomial represented by the corresponding Lyndon word; see [11].
In the following, the relation '<' refers to lexicographical order of words over the alphabet {A, B}.
Example 2 Consider a scheme of order p = 4, i.e., assume that the conditions up to order p = 4 are satisfied. Then, d 5 dh 5 L (0) is a linear combination of commutators, or non-associative words, listed below and represented by the six Lyndon words of length 5 (see Table 1), The commutators are bracketed, non-associative versions of these words, 4 As mentioned above, the leading (lowest) monomials in the expanded commutators, in the sense of lexicographical order, correspond to the Lyndon words. Note that some of these monomials also occur in lower commutators ('lower' again in the sense of lexicographical ordering). Let us now denote these six commutators by K k , k = 1 . . . 6. We a priori know that d 5 dh 5 L (0) is of the form, with 5 = 6, where the scalars κ k are multivariate polynomials of degree 5 in the coefficients a j , b j of the underlying scheme of order p = 4. Therefore the additional conditions for order p = 5 are given by Extracting these coefficients κ k from the expression (2.3) for d 5 dh 5 L (0) is a combinatorial challenge, but we can do better: We simply extract the coefficients of the Lyndon monomials-let us denote them by λ k -which is a standard operation in computer algebra. Now, instead of (2.7a) we require (2.7b) In our example, for κ = (κ 1 , . . . , κ 6 ) T and λ = (λ 1 , . . . , λ 6 ) T we have where the lower diagonal entries correspond to the additional occurrence of the λ k in non-leading positions. Therefore the systems (2.7a) and (2.7b) are equivalent.
The situation displayed in this example occurs also in the general case. For any order p, the vectors κ and λ consisting of polynomials of degree p+1 satisfy λ = M κ where M is a lower triangular matrix with unit diagonal. In particular, a Lyndon monomial λ k never occurs in an expanded commutator K j for j > k because this would contradict the leading position [11] of the Lyndon monomial λ j > λ k in K j .

Special cases: symmetries
In the sequel, denotes the adjoint scheme associated with S .
The order conditions generated by the algorithm indicated in Sect. 2.1 are generically non-redundant. However, there exist special cases: -Symmetric (or: 'time-symmetric') one-step schemes are characterized by the property For symmetric splitting schemes we have either a 1 = 0 or b s = 0, and the remaining coefficient tupels (a j ) and (b j ) are both palindromic. Since symmetric schemes have an even order p (cf. [15,Chapter 3]), only odd-order conditions for an appropriately reduced number of free coefficients need to be imposed. The general algorithm described in Sect. 2.1 can easily be adapted to this case. -The following type of schemes seems not to have been considered earlier in the literature: 5 Palindromic schemes, or 'reflected schemes' in the terminology of [1], are characterized by b j = a s+1− j , j = 1 . . . s, i.e., a 1 , b 1 , a 2 , b 2 , . . . , b 2 , a 2 , b 1 , a 1 ).
Assume a scheme of order p is given, and consider a splitting step of the form (1.3).
Interchanging the roles of A and B, i.e., replacing (1.3) by For an ansatz with palindromic coefficients, exchanging the roles of A and B in the algorithm from Sect. 2.1 will lead to the identical set of order conditions. Therefore the order conditions associated with 'Lyndon twins' are pairwise identical. Here, we call a pair of Lyndon words a twin if one of them is obtained by exchanging the role of A and B and reading it from right to left, see Table 1. For instance, the 6 words of odd length 5 consist of three twins; the 9 words of even length 6 consist of three twins, the selfie AAABBB, and two solitary words. Due to this redundancy the number of order conditions is appropriately reduced.
-Higher order one-step schemes can be generated by m-fold composition of lower-order schemes with appropriately chosen sub-steps h μ = ω μ h satisfying ω 1 + · · · + ω m = 1 plus additional conditions guaranteeing that a certain order is obtained. 6 A popular class of composition methods are symmetric Strang compositions. Schemes of this type of orders 4, 6 and higher were first devised in [23]. Some of the composition coefficients have to be chosen negative, and the local error measures of these composition schemes are rather large. On the other hand, for higher orders, composition beats the generic lower limits on the number s of stages such that a given order p can be expected. For instance, the sevenfold 6-th order symmetric Strang composition [3, 'Y 8-6'] recombines into an 8-stage scheme, whereas the generic number of order conditions for a symmetric scheme of order p = 6 is 10, which would require s = 10 stages involving 11 free coefficients. Evidently, (symmetric) compositions are an attractive option for constructing higher-order schemes. Therefore we have included this class into our considerations concerning the search for optimal variants (see Sect. 4).

Complex coefficients
Our considerations are not restricted to schemes with real coefficients a j , b j . Complex schemes, with coefficients having positive real parts, are appropriate for the application of splitting methods to parabolic problems, since real schemes with positive coefficients do not exist for order p ≥ 3, see [7]. For this class of methods, in particular based on complex compositions, we refer to [8,13].

Splitting into more than two operators
We also consider evolution equations where the right-hand side splits into three parts, , t ≥ 0, u(0) given, (2.13) and according multiplicative splitting schemes, The methodology from [1] can be directly generalized to the case of splitting into more than two operators. For the practically relevant case of splitting into three operators A, B, C, as in (2.14), the representation (2.

3) generalizes as follows, with
On the basis of these identities, the algorithm from Sect. 2.1 generalizes in a straightforward way. The Lyndon basis representing independent commutators now corresponds to Lyndon words over the alphabet {A, B, C}, see Table 2.
Concerning symmetries, similar considerations as in Sect. 2.2 apply. For a general convergence theory of ABC-splitting for the linear case and some applications we refer to [6]. For example, splitting into three operators can be used to handle evolution equations where the right-hand side splits up into two nonautonomous parts. Introducing the independent variable t as an unknown variable satisfying t = 1, such a problem can be formally considered as an autonomous system split into three parts. In this case, splitting means that the variable t is frozen over several subintervals comprising an integration step. Since the ODE t = 1 is trivial, a large number of higher-order commutators vanishes in this case, and therefore the number of necessary order conditions is significantly reduced, a situation to be considered in further work.

Pairs of splitting schemes
For the purpose of efficient local error estimation as a basis for adaptive stepsize selection, using pairs of related schemes is a well-established idea. One of the schemes, of order p, acts as the worker, and the other, of order p +1, is the controller responsible for local error estimation. 7 Criteria for the selection of pairs of schemes are accuracy and computational efficiency.
Order conditions for pairs of schemes of the types listed below can be generated with minor modifications of the approach described in Sect. 2.
-Embedded pairs. In [17], pairs of splitting schemes of orders p and p + 1 are specified. The idea is to select a controllerS of order p + 1 and to construct a worker S of order p for which a maximal number of stages S j coincides with those of the controller. Let a j , b j andā j ,b j denote the coefficients of the worker and controller, respectively. The approach adopted in [17] may be called static, finding S andS such that a j =ā j and b j =b j for as many j = 1, 2, . . . as possible. In this sense the schemes are related to each other but, in general, the total number of order conditions, and thus the total number of necessary evaluations, is the same as for an arbitrary unrelated ( p, p + 1) pair.
Here we develop the idea of embedding further: again we fix a 'good' controller of order p + 1 and wish to adjoin to it a 'good' worker of order p. Since the number of stagess ofS will be higher than the number of stages s of S , we can select an optimal embedded worker S from a set of candidates obtained by flexible embedding, where the number of coinciding coefficients is not a priori fixed.

Example 3
In [17], an embedded (3, 4)-pair was constructed, where the controller is an optimized symmetric scheme of order p = 4 with s = 7 stages due to [10], with local error measure LEM = 0.01 ('LEM' in the sense of (4.2b) below). The worker specified in [17] is a scheme of order p = 3 with s = 6 stages, where the coefficients a 1 , a 2 , a 3 , a 4 and b 1 , b 2 , b 3 coincide with those of the controller. This amounts to 7 additional evaluations for the worker, and its local error measure is LEM = 0.2. For flexible embedding, in contrast, we consider all possible embedded workers, and we find that a scheme of order p = 3 with s = 4 stages is to be preferred, see [3, Emb 4/3 BM PRK/A], where a 1 , a 2 and b 1 coincide with those of the controller. This amounts to five additional evaluations for the worker, and it has LEM = 0.1.
-Milne pairs. In the context of multistep methods for ODEs, the so-called Milne device is a well-established technique for constructing pairs of schemes. In our context, one may aim for finding a pair (S ,S ) of schemes of the same type, with equal s and p, such that their local errors L ,L are related according to with γ = 1. Then, the additive schemē is a method of order p + 1, and provides an asymptotically correct local error estimate for S (h, u). -Adjoint pairs. Let S be a scheme of odd order p and and S * its adjoint, see Sect. 2.2. Due to [15,Theorem II.3.2] the leading error terms of S and its adjoint S * are identical up to the factor −1. Therefore, the averaged additive schemē is a method of order p + 1, and provides an asymptotically correct local error estimate for S (h, u). In this case the additional effort for computing the local error estimate is identical with the effort for the worker S but not higher as is the case for embedded pairs. An example are palindromic pairs, where S is palindromic (of odd order p), such that S * =Š , see Sect. 2.2.
For detailed comments on a number of new pairs listed in [3], see Sect. 5.
noncommuting symbols, and tables of Lyndon words generated using an algorithm devised in [14]. Since the number of terms in (2.3) resp. (2.15) rapidly increases with q we have implemented a parallel version relying on Maple's Grid package. In particular, the job of generating all the terms in the long sums (2.3) and (2.15) can be (equi-)distributed over several parallel threads. The resulting set of order conditions is a multivariate polynomial system which, for higher orders, requires numerical solution techniques. Once a scheme of order p has been found, its leading local error term is of the form (see Sect. 2) with p+1 commutators K p+1,k associated with Lyndon words of length p + 1. To compare schemes of equal order p one may consider as a reasonable measure for the accuracy of a scheme. However, we use the quantity instead. Using (4.2b) has the advantage that the coefficients λ k = λ p+1,k are exactly those which are generated in the course of the setup of the conditions for order p + 1, see Sect. 2.1, while the coefficients from (4.2a) are more difficult to compute (cf. the discussion in Sect. 2.1). Since different particular solutions to the order conditions typically result in leading local error terms varying over several orders of magnitudes, we consider (4.2b) equally reasonable as (4.2a). For finding and evaluating solutions and pairs of solutions we follow two different strategies.
-For the case where the number of equations equals the number of free coefficients we expect a set of isolated solutions. In this case we use the fsolve function in Maple combined with a Monte-Carlo strategy for generating different initial intervals. Higher precision is used to generate solutions with double precision accuracy. For each detected solution the LEM (4.2b) is computed. -Especially for the case where the number of equations is smaller than the number of free coefficients, the problem is to be considered as a constrained minimization problem: minimize the LEM representing the objective function, with the order conditions imposed as nonlinear equality constraints. To this end we employ stateof-the-art techniques which have also been applied for the construction of special classes of Runge-Kutta methods, see for instance [16]. In particular we have used the MATLAB 8 optimizer fmincon. Again a large number of initial guesses are generated randomly, since this optimization problem is nonconvex in general. The results cannot be guaranteed globally optimal, but results from an exhaustive search usually suggest that this is indeed the case. A post-processing, i.e., refining the solutions to full double precision, is again performed in Maple using higher precision sfloat arithmetic.
We have also re-checked a number of known methods, refined their coefficients to full double precision, and computed their LEMs.

Schemes from the collection [3]
This collection is not intended to be exhaustive. It includes some known and quite a number of new schemes, in particular pairs of schemes, up to order p = 6, with their essential properties. Some methods are included mainly for the sake of completeness or their historical significance.
In the following we comment on some of these methods; for complete information, consult [3]. 'Best' or 'optimal' means that it has minimal LEM (4.2b) among a certain class of methods with comparable effort for a given order p. In some simple cases such optimality properties can be established theoretically; for higher orders we have resorted to more or less exhaustive numerical search.
Methods whose label contains the letter 'A' are new, or taken again into consideration in the context of constructing pairs, or their LEM has been computed for the first time. 9 The list also includes some pairs of embedded schemes ('Emb .. More detailed information about all these methods can be found on the webpage [3].
-'Emb 4/3 AKS p' (palindromic controller with s = 5, p = 4). In particular, this scheme has essentially the same LEM as the fourth order scheme from [10] which has been used in [17], but it has only 5 stages instead of 7. Complex coefficients (with positive real parts).

Splitting into three operators ('ABC schemes')
Due to the rapidly increasing number of generic order conditions, finding general higher order schemes would be a very challenging task for this case. For p = 6, for instance, the generic number of order conditions is 196 for the general case and 59 for the symmetric case. For p = 6 we therefore only consider real or complex Strang compositions which are easier to construct and lead to more compact schemes. Generating the expression for the leading error term d 7 dh 7 L (0) for the purpose of computing the LEM for p = 6, involving 312 coefficients (see Table 2), is computationally expensive, but it can be done at reasonable effort, for the purpose of computing the LEM of a given composition and comparing different variants.
Real coefficients.
-'AK 7-4 c' (s = 7, p = 4) is the best symmetric Strang composition of order p = 4. It is the analog of the AB composition 'A 4-4-c', with the same composition weights. -'AK 15-6 c' (s = 15, p = 6) is the best symmetric Strang composition of order p = 6. It is the analog of the AB composition 'C 8-6-c', with the same composition weights.

Palindromic schemes: discussion and open questions
As indicated in Sect. 2.2, one motivation for considering palindromic schemes is the fact that they are easier to construct. Moreover, as already mentioned in Sect. 5, small error constants are usually observed in this case. Apparently, palindromic schemes tend to have minimal LEMs among a set of competitors, for instance the third-order scheme in the pair 'PP 3/4 A' (a theoretical explanation for this observation is missing). This is the reason why we have included some adjoint pairs of (optimized) palindromic type of orders ( p, p + 1), p odd, in our collection [3]. Palindromic schemes have a certain type of symmetry, but they are not timesymmetric. Investigating special properties of such schemes appears to be of interest in the context of geometric integration, for instance when they are applied to partitioned systems of the form with the natural splitting of the vector field ( f, g) into ( f, 0) and (0, g). In this case, splitting methods are equivalent to a subclass of Partitioned Runge-Kutta (PRK) methods, characterized by a pair ( A, α), (B, β) of Runge-Kutta schemes. For each AB splitting scheme, the associated PRK coefficients satisfy As a consequence, all splitting schemes preserve quadratic invariants when applied to a partitioned system (6.1), and they are symplectic when applied to again from [15] we know that under the additional condition α i = β i , i = 1 . . . s, the invariance properties mentioned above remain valid for (6.2). The latter condition is satisfied if the scheme is a composition of steps of symplectic Euler type, i.e., for a i = b i . In the palindromic case we have β i = α s+1−i , and this does not appear to be a useful property in view of invariance questions. (Cf., for instance, the proof of the general assertion of Theorem IV.2.4 in [15], which does not carry over.) Summarizing, we may say that palindromic schemes by now have not been completely understood, and this may deserve further investigations.

Numerical example
For a numerical illustration, in particular concerning the expected performance of palindromic schemes, we consider an example of a system of coupled nonlinear evolution equations of Schrödinger type (see [22]), with initial condition chosen such that the exact solution is a pair of solitons,  Left: Local error (first step) for scheme (i) starting with 'A' of order 3, and for the averaged scheme (see (3.2)) of order 4 Right: Global error for scheme (i) at t end = 5.0 Left: Local error (first step) for scheme (i) starting with 'A' of order 5, and for the averaged scheme (see (3.2)) of order 6 Right: Global error for scheme (i) at t end = 5.0 starting at t 0 is given by All computations were performed in standard double precision arithmetic. In Tables 3  and 4, 'err' refers to a canonically scaled discrete L 2 -norm, and 'ord' refers to the order observed. and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.