On the zero-stability of multistep methods on smooth nonuniform grids

In order to be convergent, linear multistep methods must be zero stable. While constant step size theory was established in the 1950's, zero stability on nonuniform grids is less well understood. Here we investigate zero stability on compact intervals and smooth nonuniform grids. In practical computations, step size control can be implemented using smooth (small) step size changes. The resulting grid $\{t_n\}_{n=0}^N$ can be modeled as the image of an equidistant grid under a smooth deformation map, i.e., $t_n = \Phi(\tau_n)$, where $\tau_n = n/N$ and the map $\Phi$ is monotonically increasing with $\Phi(0)=0$ and $\Phi(1)=1$. The model is justified for any fixed order method operating in its asymptotic regime when applied to smooth problems, since the step size is then determined by the (smooth) principal error function which determines $\Phi$, and a tolerance requirement which determines $N$. Given any strongly stable multistep method, there is an $N^*$ such that the method is zero stable for $N>N^*$, provided that $\Phi \in C^2[0,1]$. Thus zero stability holds on all nonuniform grids such that adjacent step sizes satisfy $h_n/h_{n-1} = 1 + \mathrm O(N^{-1})$ as $N\rightarrow\infty$. The results are exemplified for BDF-type methods.


Introduction
A linear multistep method, discretizing an initial value problemẏ = f (t, y), is represented by a difference equation of order k, 1 h k ∑ j=0 α j y n+ j = k ∑ j=0 β j f (t n+ j , y n+ j ). (1.1) Here the step size h = t n+k − t n+k−1 > 0 is assumed constant. We denote the forward shift operator by E and write the method h −1 ρ(E)y n = σ (E) f (t n , y n ), with generating polynomials These are arranged to have no common factors, and coefficients are normalized by σ (1) = 1. Zero stability is necessary for convergence, and requires that all roots of ρ(ζ ) = 0 lie inside or on the unit circle, with no multiple unimodular roots. Since consistent methods have ρ(ζ ) = (ζ − 1) · ρ R (ζ ) as indicated above, zero stability is a condition on the extraneous operator ρ R (ζ ). Its zeros are referred to as the extraneous roots. Strong zero stability requires that all extraneous roots are strictly inside the unit circle; this is a condition on the k coefficients {γ j } k−1 j=0 . Since the extraneous operator is void in Adams-Moulton and Adams-Basforth methods, these methods are trivially zero stable for variable steps, [9, p. 407]. The most important case having a nontrivial extraneous operator is the BDF methods, known to be zero stable for 1 ≤ k ≤ 6, cf. [5], [9, p. 381]. Some (nonstiff) method suites, such as the dcBDF and IDC methods [1], are based on the BDF ρ operator, and have the same zero stability properties for k ≥ 2. Other examples of nontrivial extraneous operators are the weakly stable explicit midpoint method (two-step method of order 2) and the lesser used weakly stable implicit Milne methods, [9, p. 363].
Adaptive computations are of particular importance for stiff problems, as widely varying time scales call for correspondingly large variations in step size. Of the methods mentioned above, only the BDF family has unbounded stability regions specifically designed for stiff problems. Thus the BDF methods must handle step size variations well, in spite of its extraneous operator, explaining why studies of variable step size zero stability mostly center on the BDF methods, [9, p. 402ff].
Although there are several ways to construct multistep methods on nonuniform grids, we shall only consider the grid-independent representation of multistep methods, [2]. This represents a multistep method on any nonuniform grid using a fixed parametrization, defining a computational process where the coefficients α j,n , β j,n vary along the solution and depend on k − 1 consecutive step size ratios. For simplicity, but without loss of generality, let us consider a quadrature problemẏ = f (t) on [0, 1] using variable steps. The multistep method (1.1) becomes where h n+k−1 = t n+k − t n+k−1 . Letting y ∈ C p+1 denote the exact solution, we obtain where the global error at t n is e n = y n − y(t n ). Here, the local error c n h p n+k−1 y (p+1) (ϑ ) goes to zero if h n+k−1 → 0 (consistency), but convergence (e n → 0) in addition requires that solutions to the homogeneous problem remain bounded. Thus zero stability on nonuniform grids is investigated in terms of the problemẏ = 0 and finding sufficient conditions on the grid partitioning of [0, 1], such that the numerical solution {y n } N 0 is uniformly bounded as N → ∞. This problem has been approached in several different ways, see e.g. [3], [4], [7], [8], usually with the aim of finding precise bounds on the step size ratios, such that the method remains convergent. Since the method coefficients change from step to step, most analyses become highly complicated. For example, the problem can be addressed by studying infinite products of companion matrices associated with the recursion (1.6), [9, p. 403], or by considering the nonuniform grid as a "perturbation" of an equidistant grid, by letting the step size vary smoothly, [6].
An overview is given in [9, p. 402ff], but the classical results focus on the existence of local step size ratio bounds that guarantee zero stability. By constrast, our focus is on grid smoothness. Using (near) Toeplitz operators, our aim is to develop a proof methodology for adaptive computation, aligned with the formal convergence analysis in the Lax-Stetter framework, cf. [15]. We let the grid points be given by a strictly increasing sequence {t n } N 0 and define the step sizes by h n = t n+1 − t n , requiring that h n → 0 for every n as N → ∞. If the grid is smooth enough, then any multistep method which is strongly zero stable on a uniform grid is also zero stable on the nonuniform grid for N large enough.
The main result has the following structure. Every multistep method is associated with two constants, C 0 and C k , where the former only depends on constant step size theory, and is bounded if the method is strongly zero stable on a uniform grid. The second constant depends on the first order variation of the method's coefficients for infinitesimal step size variations, and is computable using a suitable computer algebra system such as MAPLE or MATHEMATICA. Finally, grid smoothness will be characterized in terms of a differentiable grid deformation map, requiring a bound on a function of the form ϕ ′ /ϕ. Under these conditions, the method is zero stable on the non-uniform grid provided that max |ϕ ′ /ϕ| This separates method properties and grid properties, and only requires that the total number of steps N is large enough. The important issues are to generate a smooth step size sequence (which automatically manages step size ratios), and using a sufficiently small error tolerance, which determines N. Although such step size sequences can easily be constructed in adaptive computation, [12], most multistep codes still use comparatively large step size changes, violating the smoothness conditions required for zero stability. This has been demonstrated to be a likely cause of poor computational stability observed in practice, [13]. In production codes it is often thought that frequent, small step size changes are not "worthwhile," but the present paper and classical theory only support such step size changes. Our approach is intended as an analysis tool for deriving a rigorous convergence proof for adaptive multistep methods, redefining practical implementation principles. A full convergence analysis of the initial value problemẏ = f (t, y) requires further attention to detail, as it also involves the Lipschitz continuity of the vector field f with respect to y, as well as (for implicit methods) the solvability of equations of the form v = γ h f (t, v) + w. The solvability will depend on the magnitude of the Lipschitz constant L[γ h f ] or the logarithmic Lipshitz constant M[γ h f ], see e.g. [14]. Likewise, error bounds will depend on these quantities. Here, however, we only focus on zero stability, which can be fully characterized in the simpler setting of a quadrature problem. We shall return to the full convergence analysis on smooth nonuniform grids in a forthcoming study.

Smooth nonuniform grids
If an initial value problem has a smooth solution, then the step size sequence, keeping the local error (nearly) constant, is also smooth, [6], [11]. A smooth sequence is also known to be necessary in connection with e.g. Hamiltonian problems, [10], as well as in finite difference methods for boundary value problems. For these reasons, we shall model nonuniform grids by a smooth deformation of an equidistant grid. We only consider compact intervals.
Adaptive computation. The asymptotic behavior of the local error per unit step in a multistep method is of the form l n = ch p n y (p+1) . The most common step size control in adaptive computation aims to keep l n constant, equal to a given local error tolerance ε. Representing the step size in terms of a step size modulation function µ(t) allows the step size at time t to be expressed as h = µ(t)/N, so that the "ideal" step size sequence can be modeled by It follows that N ∼ ε −1/p . In other words, the local error tolerance determines N.
In real adaptive computations, a step size control of the form h n = r n−1 h n−1 is used, where the step ratio sequence is processed by a digital filter to generate a smooth step size sequence, [12]. This may e.g. take the form where l n−1 and l n−2 are local error estimates and (b 1 , b 2 , a 1 ) are the filter parameters. The controller keeps the local error close to the tolerance ε. As a consequence the step ratios will remain near 1. Further, reducing the tolerance ε increases N, reducing step sizes as well as step ratios. Thus it is justified to model a nonuniform grid by a smooth grid deformation, and such a grid can be generated using a proper filter to continually adjust the step size. It also corresponds well to the behavior observed in computational practice when such step size controllers are employed.
Modeling a smooth nonuniform grid. Let Φ : τ → t be a smooth, strictly increasing map in C 2 [0, 1], satisfying Φ(0) = 0 and Φ(1) = 1. Further, let its derivative Φ ′ = dΦ/dτ be denoted by ϕ and assume that ϕ ′ /ϕ ∈ L ∞ [0, 1]. Now, given N, let τ n = n/N and construct a smooth nonuniform grid {t n } N n=0 by Hence h n → 0 as N → ∞. This allows us to study zero stability on nonuniform grids in terms of the single-parameter limit N → ∞. This does not substantially restrict h max /h min during the overall integration, although adjacent step ratios will be small.
Step ratios. The coefficients of a multistep method on a nonuniform grid depend on the ratio of adjacent step sizes. By (2.2) the step ratios {r n } N−2 n=0 are given by Hence the step ratios approach 1 as N → ∞, i.e., locally the method behaves like a constant step size method for N large enough, since we assumed It is also of interest to represent the step size change as a relative step size increment, which, in view of (2.3), is defined by Thus v n−1 → 0 as N → ∞, and in practical computations the relative step size increment is invariably small.
The assumption ϕ ′ /ϕ ∈ L ∞ [0, 1] requires that log ϕ ∈ C 1 [0, 1]. By a stronger assumption, log ϕ ∈ C 2 [0, 1], we can also estimate the change in the step size ratios, where ϕ and its derivatives are evaluated at τ n . Thus the ratio of successive step ratios approach 1 even faster than the step ratios themselves. The interpretation is that step ratios change slowly, and there may be long strings of consecutive steps where the step size "ramps up" as the solution to the ODE gradually becomes smoother after a transient phase. This corresponds to a gradual stretching of the mesh width.
Step sizes and ratios as a function of t. Using t = Φ(τ) and dt = ϕ dτ, the step size modulation function µ(t) and the derivative ϕ(τ) satisfy the functional relation Differentiating (2.5) with respect to t and denoting time derivatives by a dot to distinguish them from derivatives with respect to τ, we obtainμ dt = ϕ ′ dτ. Hencė allowing us to express step sizes, step ratios, and relative step size increments along the solution of the differential equation, as functions of t, Obviously, the previous assumption Since µ(t) ∼ y (p+1) (t) −1/p , the assumptions on the deformation map Φ are realistic and reflect problem regularity.

Deflation and operator factorization
The variable step size difference equation can be rewritten in matrix-vector form as where the vector y contains all successive approximations {y n } N n=1 . The vector Y 0 is constructed from the initial conditions, y 0 , . . . y −k+1 . Further, A N (ϕ) is an N × N matrix containing the method coefficients, and is associated with a nonuniform grid characterized by the function ϕ. The step sizes are represented by a diagonal matrix H N =φ/N, We will investigate zero stability as a question of whether there exists a constant C ϕ , independent of N, and an N * , such that A −1 N (ϕ)H N ≤ C ϕ for all N > N * . As ϕ(τ) ≡ 1 corresponds to a uniform grid, A N (1) denotes the Toeplitz matrix of method coefficients for constant step size H N = I/N. Then zero stability is equivalent to A −1 N (1)/N ≤ C 1 for all N. Just as the principal root can be factored out of ρ to construct the extraneous operator ρ R (ζ ), satisfying ρ(ζ ) = (ζ − 1)ρ R (ζ ), a similar factorization holds for the (near) Toeplitz operators. Thus, due to preconsistency (ρ(1) = 0), the n th full row sum of the matrix even on a nonuniform grid. Denoting the n th row of nonzeros in A N (ϕ) by a k + 1 vector a T n (ϕ), and letting 1 k+1 = (1 1 . . . 1) T denote a k + 1 vector of unit elements, preconsistency can be written Hence a T n (ϕ) contains a difference operator. It can therefore be written as a convolution of a k-vector c T n (ϕ) and the backward difference operator ∇ = (−1 1), i.e., For example, for the constant step size BDF2 method, corresponding to α 0 = 1/2, α 1 = −2 and α 2 = 3/2, the convolution can be represented as implying that Gustaf Söderlind et al.   Thus the vector c T n (1) is the n th full row of nonzero elements of the extraneous operator, corresponding to the coefficients γ 0 = −1/2 and γ 1 = 3/2 of the deflated polynomial ρ R (ζ ). Table 3.1 lists the row elements a T n (1) and c T n (1), respectively, for all zero stable BDF methods of step numbers k ≥ 2.

Coefficients of
Unlike generating polynomials, the (near) Toeplitz operators have the advantage of applying also to nonuniform grids. The following factorization of H −1 N A N (ϕ) is then a matrix representation of the deflation operation described above.

7)
where R N (ϕ) is the extraneous operator, dependent on the nonuniform grid, and The simple integrator D −1 N is stable, and for all N ≥ 1 it holds that D −1 N ∞ = 1.
Proof We only need to prove the latter statement. By induction we see that the integrator is a cumulative summation operator, and it immediately follows that D −1 N ∞ = 1 for all N. To establish zero stability we need to show that (H −1 We shall use the uniform norm throughout. Since it formally holds that where φ ∞ is bounded for all smooth grids, the remaining difficulty is to show that R −1 N (ϕ) ∞ ≤ C ϕ for all N > N * , and how this depends on grid regularity. For a unform grid, zero stability is determined by the roots of the extraneous operator; this needs to be translated into norm conditions. A simple possibility is to use the fact that where m ∞ [R N (1)] is the lower logarithmic norm of R N (1), see [14]. The condition m ∞ [R N (1)] > 0 is equivalent to diagonal dominance. For example, by Table 3.1, the BDF2 matrix NA 2,N (1) associated with the ρ operator has the factorization where the nonzero coefficients correspond to the c T n (1) vector of Table 3.1. Since (3.11) it follows that R −1 2,N (1) ∞ ≤ 1/m ∞ [R 2,N (1)] = 1 and that the BDF2 method is zero stable. The same technique works for the BDF3 method, since However, it fails for the BDF4 method and higher, since the extraneous operator is then no longer diagonally dominant. By instead computing e.g. the Euclidean norm numerically, the above technique can be extended to BDF4 and BDF5, but it again fails for BDF6. For this reason, we need a general result, based on sharper estimates.

Theorem 3.2 For every strongly stable k-step method on a uniform grid, there is a constant C
Proof Let T 0 denote the lower triangular Toeplitz matrix R k,N (1) representing the extraneous operator. Then T −1 0 is lower triangular too, albeit full. More importantly, T −1 0 is also Toeplitz. By (1.2), ρ R (ζ ) = ∑ k−1 j=0 γ j ζ j . Noting that α k = γ k−1 , and illustrating the matrix T 0 for k = 3, we have where, in the general case, δ j = γ j /α k are the elements of the scaled matrixT 0 , with Toeplitz inverseT , the sequence u must be absolute summable as N → ∞. By construction, u satisfies the difference equation ρ R (E)u = 0, where E is the forward shift operator. By assumption ρ R (ζ ) satisfies the strict root condition. Therefore u is bounded, i.e., u ∈ l ∞ . Let ρ R (ζ ν ) = 0 and let max ν |ζ ν | ≤ q < 1, where equality applies whenever the maximum modulus root is simple. Then there is a constant K < ∞ such that |u n | ≤ K · q n for all n ≥ 1. Hence u ∈ l 1 , as Since T −1 0 ∞ = u 1 due to the Toeplitz structure ofT −1 0 , we have, for all N ≥ 1, and the proof is complete.
In terms of the (near) Toeplitz operators used above, the variable step size extraneous operator is given by The operator R −1 2,N (ϕ) is bounded whenever the lower logarithmic max norm, along the range of step ratios r. Diagonal dominance requires that 1 + 2r − r 2 > 0, which holds if 0 < r < 1 + √ 2, so the classical bound is obtained once more. As we assume a smooth grid in terms of (2.3), withμ = ϕ ′ /ϕ ∈ L ∞ [0, 1], the condition r n < 1 + √ 2 is fulfilled for In general, however, a method can be zero stable without diagonal dominance, requiring more elaborate techniques to establish zero stability. The variable step size discretization (3.1) ofẏ = 0 is factorized to obtain the difference equation corresponding to the extraneous operator only, where the coefficients γ j,n are multivariate rational functions of k − 1 consecutive step size ratios. If the sequence u is bounded (zero stability), then the original solution y of (3.1) is obtained by simple Euler integration, y n+1 = y n + u n /N, where h = 1/N is a constant step size and N → ∞. Since the latter integration is stable, we only need to bound the solutions u of (4.7). Using (2.4), we write the step ratios where, for smooth grids, Thus, the larger the value of N, the closer is |v n | to zero. Now, for v n ≡ 0 we obtain the classical constant step size method. The difference equation (4.7) can then be rearranged as a Toeplitz system T 0 u = U 0 , where T 0 = R 2,N (1) and u = {u n } N 1 denotes the entire solution. The vector U 0 contains initial data as needed. By Theorem 3.2, With variable steps, the system will depend on the step ratios, and the overall system matrix will no longer be Toeplitz. Nevertheless, for the BDF2 example used above, we have seen that the extraneous system matrix can be written Thus we can write where the T j are Toeplitz and V = diag(v j ) is a diagonal matrix. Since T −1 0 is uniformly bounded, a sufficient condition for R 2,N (ϕ) to be invertible is and we obtain the bound Here the T j T −1 0 ∞ are method dependent constants, and We can now determine a sufficient condition on V ∞ in general, and on N in particular, such that (4.9) is satisfied. Because w : if the grid is regular, there is always an N large enough to satisfy this condition. Considering the equation we find that we have to take N large enough to guarantee that The quantity on the right hand side depends only on the method coefficients, and the left hand side depends only on the total number of steps, and the regularity of the nonuniform grid, as measured by ϕ ′ /ϕ ∞ .

Zero stability on nonuniform grids -Higher order methods
In a k-step method using variable steps, the coefficients depend on k − 1 step ratios. This makes the problem significantly more difficult. Without loss of generality, we will only consider an approach linear in V below. Note that while V ∞ = O(N −1 ), it follows that higher powers of V are V k ∞ = O(N −k ), implying that they are significantly smaller than the first order term when N is large and the grid is smooth. For example, in (4.12) above, we have w = O(N −1 ) implying that the w 2 is negligible as N → ∞; it is therefore sufficient to consider terms of order O(N −1 ) only. This overcomes the added difficulty of considering k-step methods.
The procedure for a general k-step method follows the same pattern as the in the previous examples. Neglecting quadratic and higher order terms in V , the extraneous operator is The diagonal matrices V j only differ in the diagonal elements being successively shifted down the diagonal. Assume that log ϕ ∈ C 2 [0, 1]. By (2.4) and the mean value theorem, evaluating ϕ and its derivatives at τ n+1/2 . It follows that V j+1 = V j + O(N −2 ), and that all V j can be replaced by a single matrix, V , while only incurring O(N −2 ) perturbations. Further, (4.11) holds for all V j . Since T −1 0 ∞ ≤ C 0 , a sufficient condition for the extraneous operator R k,N (ϕ) to have a uniformly bounded inverse is This condition separates grid smoothness ϕ ′ /ϕ ∞ from method parameters, as represented by the Toeplitz matrices T j . Thus, in order to prove zero stability as N → ∞, we need T j ≤ C j for j ≥ 1. The latter condition is easily established, once the coefficients' dependence on the step ratios has been established. Hence we have the following general result.
where we have assumed that v > −1/4, allowing the removal of absolute values. Thus the operator T 0 + V · (T 1 + T 2 ) has a uniformly bounded inverse. The corresponding zero stability condition is For BDF3 [9, p. 406] cite Grigorieff's (1983) sufficient conditions for zero stability, Our BDF3 bounds for ramp-up provide the conditions The differences between these results depend on the methodology, and not least on the choice of norm. The deflation approach used here is similar to the technique used in [7], while smooth grid maps are akin to the assumptions used in [6].
It is important to note that we do not try to determine the greatest possible step size increase, but instead prove that every strongly stable method will be zero stable on smooth grids. We have also seen that the complexity of determining exact stability bounds quickly becomes overwhelming, which is why we argue that an alternative proof, revealing the dependence on smoothness and method parameters, is sufficient.

Conclusions
In this paper we have demonstrated that any linear multistep method which is strongly stable on a uniform grid is also zero stable on any smooth nonuniform grid. Grid smoothness is (in theory) determined by a grid map Φ : [0, 1] → [0, 1], satisfying Φ(0) = 0 and Φ(1) = 1, and having a strictly positive derivative ϕ = Φ ′ . The grid map transforms a uniform grid of N steps into a nonuniform grid, which is smooth if log ϕ is continuously differentiable.
In practice, this corresponds to a smooth step size variation, where the step size at time t ∈ [0, 1] can be represented by a continuous modulation function, so that h(t) = µ(t)/N. Hereμ(t) = ϕ ′ /ϕ, which must remain bounded. The modulation function µ(t) is determined by the solution of the differential equation, while N is determined by the accuracy requirement as specified by the tolerance ε.
The main result is that every k-step method is associated with k bounded Toeplitz operators T 0 , . . . T k−1 , where T 0 is associated with the constant step size method. If that method is strongly zero stable, then T 0 has a bounded inverse. Smooth step size variation is characterized locally by the function ϕ ′ /ϕ, the magnitude of which determines how many steps N that need to be taken in order to guarantee variable step size zero stability. Thus, if the numerical solution toẏ = 0 is stable. Examples are given for BDF methods. This result is also practically significant as it implies that time step adaptivity must be implemented using smooth step size changes, such that consecutive step ratios are r = 1 + O(h). This can easily be achieved, as there is a wide range of smooth controllers available for dedicated purposes, [12]. These are based on digital filter theory, and control log h in small increments, changing the step size on every step. Since h ∼ ϕ/N, such a controller keeps log ϕ smooth, in line with the assumptions of Theorem 5.1. The smoothness requirement is local, and does not imply any bound on h max /h min . It is therefore not a limitation in stiff computation, where overall step size variation necessarily is large.