Abstract
We present a unified analysis for a family of variational time discretization methods, including discontinuous Galerkin methods and continuous Galerkin–Petrov methods, applied to non-stiff initial value problems. Besides the well-definedness of the methods, the global error and superconvergence properties are analyzed under rather weak abstract assumptions which also allow the consideration of a wide variety of quadrature formulas. Numerical experiments illustrate and support the theoretical results.
1 Introduction
We study variational time discretization methods applied to non-stiff initial value problems of the form

$$ u^{\prime}(t) = f\big(t,u(t)\big) \quad \text{for } t \in I, \qquad u(t_{0}) = u_{0}, \qquad (1.1) $$

where f is supposed to be sufficiently smooth and to satisfy a Lipschitz condition with respect to the second variable. Here, an initial value u0 is given at t = t0. The system of ordinary differential equations (ODEs) is considered on a time interval I = (t0,t0 + T] with arbitrary but fixed length T > 0.
Probably the most popular variational discretization schemes for solving (1.1) numerically are discontinuous Galerkin (dG) methods, cf. [15], and continuous Galerkin–Petrov (cGP) methods, cf. [13]. Both types of methods have been studied in the literature for many decades. However, an important ingredient in the design of a fully computable discretization, namely the use of quadrature methods for approximate integration, is often only marginally considered, or the assumptions on the quadrature formulas are quite restrictive and, thus, allow only a small variability of quadrature formulas. For example, the number of quadrature points is supposed to equal the dimension of the local ansatz space of the discrete solution, cf. [6, 7], or the number of independent variational conditions, cf. [13, 14]. Often the investigations are also restricted to special Gauss, Gauss–Radau, or Gauss–Lobatto formulas. A quite general setting is considered in [9], where different ways to approximate the integrals over products of f with test functions are studied. Since in [9] the dynamical behavior of the schemes is analyzed, it is assumed that at least for Dahlquist’s test problem \(u^{\prime } = \lambda u\) all integrals are evaluated exactly, which again is quite restrictive.
In this paper, we try to keep the requirements on the approximation operators involved in the definition of the discrete method as low as possible. Our aim is to determine how the approximation properties of these operators affect the error behavior of the numerical solution. Thereby, our studies are not restricted to the dG or cGP method but cover the whole family of variational discretization methods recently introduced in [3, 4]. The assumptions used are quite general and abstract in order to enable the study of a wide variety of methods. In particular, we allow the right-hand side to be approximated by a composition of various interpolation operators (an interpolation cascade) or the quadrature formulas to also use derivative values. Therefore, all variational methods considered in [3] and their convergence results can now be verified by a rigorous and unified error analysis in the non-stiff case.
The paper is structured as follows. In Section 2, the variational time discretization methods are formulated. The well-definedness of the method formulation is studied in Section 3 where the existence of unique discrete solutions is proven under appropriate conditions. Section 4 is devoted to the error analysis. Here, at first, the global error is bounded by a sum of rather abstract approximation error terms. Then, in order to give a better insight into the error behavior, the convergence orders of these approximation errors are estimated in terms of basic quantities describing the approximation properties of the involved operators. Thereafter, superconvergence results in the time mesh nodes are derived in Section 5. Finally, in Section 6, we use numerical experiments to illustrate the convergence behavior of the variational time discretization methods and show that the proven estimates are sharp.
In order not to obscure the view on the main arguments, we have reduced the proofs to the essential ideas. For the technical details and a more detailed tracing of the constants, we refer the reader to the preprint version [2]. Specific references are given at the beginning of each proof.
Notation: In this paper, C denotes a generic constant independent of the time mesh interval length. To describe the vector-valued case (d > 1) in an easy way, let (⋅,⋅) be the standard inner product and ∥⋅∥ the Euclidean norm on \({\mathbb {R}}^{d}\), \(d \in {\mathbb {N}}\).
For an arbitrary interval J and positive integers q, the spaces of continuous and k times continuously differentiable \({\mathbb {R}}^{q}\)-valued functions on J are written as \(C(J,{\mathbb {R}}^{q})\) and \(C^{k}(J,{\mathbb {R}}^{q})\), respectively. For non-negative integers s, we write \(P_{s}(J, {\mathbb {R}}^{q})\) for the space of \({\mathbb {R}}^{q}\)-valued polynomials of degree less than or equal to s. Moreover, \(P_{-1}(J,{\mathbb {R}}^{q}) := \{0\}\). For q = 1, we sometimes omit \({\mathbb {R}}^{q}\). Further notation will be introduced later at the beginning of the sections where it is needed.
2 Formulation of the methods
The interval I is decomposed by

$$ t_{0} < t_{1} < {\cdots} < t_{N} := t_{0} + T $$

into N disjoint subintervals In := (tn− 1,tn], n = 1,…,N. Moreover, we set

$$ \tau_{n} := t_{n} - t_{n-1}, \quad n = 1,\ldots,N, \qquad \tau := \max_{1 \leq n \leq N} \tau_{n}. $$
For convenience and to simplify the notation in some proofs, we assume τn ≤ 1 which is not really a restriction since we are interested in the asymptotic error behavior for τ → 0. For any piecewise continuous function v, we define by

$$ v(t_{n}^{\pm}) := \lim_{t \to t_{n}^{\pm}} v(t), \qquad [v]_{n} := v(t_{n}^{+}) - v(t_{n}^{-}) $$

the one-sided limits and the jump of v at tn. Furthermore, standard notation \(\lfloor \cdot \rfloor\) for the floor function is used.
We now present a slight generalization of the variational time discretization methods \({\mathbf {VTD}_{k}^{r}}\) investigated in [3]. Let \(r, k \in {\mathbb {Z}}\), 0 ≤ k ≤ r. Then, the local version (In-problem) of the numerical method reads as follows:
Given \(U\left (t_{n-1}^{-}\right )\in \mathbb {R}^{d}\), find \(U\in P_{r}\left (I_{n},\mathbb {R}^{d}\right )\) such that
and
where \(U(t_{0}^{-}) = u_{0}\) and δi,j is the Kronecker symbol.
Here, \({{\mathscr{I}}_{n}}\) denotes an integrator that typically represents either the integral over In or the application of a quadrature formula for approximate integration. Details will be described later on. Moreover, \({\mathcal {I}}_{n}\) could be used to model some projection/interpolation of f or the usage of some special quadrature rules even if \({\mathscr{I}}_{n}\) is just the integral.
The problem class (2.1a) generalizes discontinuous Galerkin methods (corresponding to k = 0) and continuous Galerkin methods (corresponding to k = 1). The conditions (2.1b) and (2.1c) are higher-order collocation conditions at the endpoints of the subintervals that involve derivatives of the considered system of ordinary differential equations. For more details, we refer to [3, Remark 2.1].
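To make the lowest-order members of the family concrete: it is well known that the dG method with k = 0, r = 0 and right-endpoint quadrature reduces to the implicit Euler scheme, while the cGP method with k = 1, r = 1 and trapezoidal quadrature coincides with the Crank–Nicolson scheme. The following sketch illustrates this for Dahlquist's test problem; the fixed-point inner solver, step size, and tolerances are illustrative choices and not part of the methods' definition.

```python
import math

def dg0_step(f, t_prev, t_new, u_prev, tol=1e-12, max_iter=100):
    """One dG(0) step with right-endpoint quadrature; this coincides with
    implicit Euler: U_n = U_{n-1} + tau * f(t_n, U_n)."""
    tau = t_new - t_prev
    u = u_prev
    for _ in range(max_iter):
        u_new = u_prev + tau * f(t_new, u)
        if abs(u_new - u) < tol:
            break
        u = u_new
    return u_new

def cgp1_step(f, t_prev, t_new, u_prev, tol=1e-12, max_iter=100):
    """One cGP(1) step with trapezoidal quadrature; this coincides with
    Crank-Nicolson: U_n = U_{n-1} + tau/2 * (f(t_{n-1},U_{n-1}) + f(t_n,U_n))."""
    tau = t_new - t_prev
    u = u_prev
    for _ in range(max_iter):
        u_new = u_prev + 0.5 * tau * (f(t_prev, u_prev) + f(t_new, u))
        if abs(u_new - u) < tol:
            break
        u = u_new
    return u_new

# Dahlquist's test problem u' = lam * u, u(0) = 1, integrated on (0, 1]
lam = -1.0
f = lambda t, u: lam * u
results = {}
for step, name in [(dg0_step, "dG(0)"), (cgp1_step, "cGP(1)")]:
    u, t, tau = 1.0, 0.0, 0.01
    while t < 1.0 - 1e-12:
        u = step(f, t, t + tau, u)
        t += tau
    results[name] = u
    print(name, u, "exact:", math.exp(lam))
```

The inner fixed-point iterations converge here since τ|λ| < 1; for nonlinear f a Newton-type solver would be the usual choice.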
3 Existence and uniqueness
First of all, we agree that both \({{\mathscr{I}}_{n}}\) and \(\mathcal {I}_{n}\) are local versions (obtained by transformation) of appropriate linear operators \({\widehat {{\mathscr{I}}}}\) and \(\widehat {\mathcal {I}}\) given on the closure of the reference interval (− 1,1]. However, \({\mathscr{I}}_{n}\) is an approximation of the integral operator while \({\mathcal {I}}_{n}\) approximates the identity operator. Thus, the transformations scale quite differently. More precisely, let

$$ T_{n} : (-1,1] \to I_{n}, \qquad T_{n}(\hat{t}\,) := \frac{t_{n}+t_{n-1}}{2} + \frac{\tau_{n}}{2} \hat{t} \qquad (3.1) $$

denote the affine transformation that maps the reference interval (− 1,1] onto an arbitrary mesh interval In = (tn− 1,tn]. Furthermore, let \({k_{{\mathscr{I}}}}\) and \({k_{\mathcal {I}}}\) be the smallest non-negative integers such that \(\widehat {{\mathscr{I}}}\) and \({\widehat {\mathcal {I}}}\) are well-defined for functions in \(C^{{k_{{\mathscr{I}}}}}([-1,1])\) and \(C^{k_{\mathcal {I}}}([-1,1])\), respectively. Then, we have for all \(\varphi \in C^{{k_{{\mathscr{I}}}}}(\overline {I}_{n},{\mathbb {R}}^{d})\) and for all \(\psi \in C^{k_{\mathcal {I}}}(\overline {I}_{n},{\mathbb {R}}^{d})\) that
Of course, these operators act component-wise when applied to vector-valued functions. Moreover, we suppose for all non-negative integers l that \(\widehat {\mathcal {I}} \hat {v} \in C^{l}([-1,1])\) holds for all \(\hat {v} \in C^{\max \limits \{ k_{\mathcal {I}} , l \}}([-1,1])\), i.e., \(\widehat {\mathcal {I}} \hat {v}\) is at least as smooth as \(\hat {v}\).
The study of existence and uniqueness of solutions to (2.1) as well as the later error analysis is strongly connected with the following operator. Let \({\widehat {\mathcal {J}}^{{\mathscr{I}}, {\mathcal {I}}}} : C^{{k_{\mathcal {J}}}+1}([-1,1]) \to P_{r}([-1,1])\), where \(k_{\mathcal {J}} := \max \limits \left \{\left \lfloor \frac {k}{2} \right \rfloor -1,k_{{\mathscr{I}}}, k_{\mathcal {I}}\right \}\), be defined by
Assumption 1
As before let \(r,k \in \mathbb {Z}\), 0 ≤ k ≤ r, be the parameters of the method. The integrator \({{\widehat {{\mathscr{I}}}}}\) is such that \(\widehat {\psi }\in P_{r-\max \limits \{1,k\}}([-1,1])\) and
imply \(\widehat {\psi } \equiv 0\). Here, the absolute value in the exponent is needed only for k = 0.
Lemma 1
Let \(r,k \in \mathbb {Z}\), 0 ≤ k ≤ r. Suppose that Assumption 1 holds. Then, \(\widehat {\mathcal {J}}^{{\mathscr{I}}, {{\mathcal {I}}}}\) given by (3.2) is well-defined. If \({\widehat {\mathcal {I}}}\) preserves polynomials up to degree \(\tilde {r}\), then \({\widehat {\mathcal {J}}^{{\mathscr{I}}, {\mathcal {I}}}}\) preserves polynomials up to degree \(\min \limits \{\tilde {r}+1,r\}\).
Proof
(for more details see [2, Lemma 3.1]) In order to show that the operator is well-defined, we need to verify that the linear conditions of (3.2) uniquely define a polynomial in Pr([− 1,1]). In short, it remains to prove that \(\hat {v} \equiv 0\) implies \({\widehat {\mathcal {J}}^{{\mathscr{I}}, {\mathcal {I}}}} \hat {v} \equiv 0\).
There are two cases. For k = 0, choosing in (3.2d) with \(\hat {v} \equiv 0\) test functions of the form \(\widehat {\varphi } = (1+\hat {t} ) \widetilde {\varphi }\), \(\widetilde {\varphi } \in P_{r-1}([-1,1])\), we get by Assumption 1 that \(\left ({\widehat {\mathcal {J}}^{{\mathscr{I}}, {\mathcal {I}}}} \hat {v}\right )' \equiv 0\). From this and using (3.2d) for \(\hat {v} \equiv 0\) with \(\widehat {\varphi } \equiv 1\), we then conclude \({\widehat {\mathcal {J}}^{{\mathscr{I}}, {\mathcal {I}}}} \hat {v}(-1^{+}) = 0\) and, thus, \({\widehat {\mathcal {J}}^{{\mathscr{I}}, {\mathcal {I}}}} \hat {v} \equiv 0\).
For k ≥ 1, we see from (3.2a), (3.2b), both with \(\hat {v} \equiv 0\), that \(\left (\widehat {\mathcal {J}}^{{\mathscr{I}}, {{\mathcal {I}}}}\hat {v}\right )'(\hat {t} ) = \left (1-\hat {t}\right )^{\left \lfloor \frac {k}{2} \right \rfloor } \left (1+\hat {t}\right )^{\left \lfloor \frac {k-1}{2} \right \rfloor } \widehat {\psi }(\hat {t} )\) with \(\widehat {\psi } \in P_{r-k}([-1,1])\). So, combining Assumption 1 and (3.2d) for \(\hat {v} \equiv 0\), we obtain \(\widehat {\psi } \equiv 0\). Thus, \(\left (\widehat {\mathcal {J}}^{{\mathscr{I}}, {{\mathcal {I}}}} \hat {v}\right )' \equiv 0\). So, using (3.2a) with \(\hat {v} \equiv 0\) for i = 0, it again follows that \({\widehat {\mathcal {J}}^{{\mathscr{I}}, {\mathcal {I}}}} \hat {v} \equiv 0\).
Using the uniqueness of the approximation, the second statement can be easily verified. □
Lemma 2
Let \(r,k \in \mathbb {Z}\), 0 ≤ k ≤ r. Suppose that Assumption 1 holds. Then, there is a positive constant κr,k, independent of τn, such that
and
for all \(\varphi \in P_{r}\left (I_{n},\mathbb {R}^{d}\right )\).
Proof
(for more details see [2, Lemma 3.4]) The second statement follows easily using the fundamental theorem of calculus and the triangle inequality.
In order to show the first statement, we pull back to the reference interval (− 1,1]. To this end, we use that via the affine transformation Tn of (3.1) any \(\varphi \in P_{r-1}\left (I_{n},\mathbb {R}^{d}\right )\) is associated with a \(\widehat {\varphi }\in P_{r-1}\left ((-1,1],{\mathbb {R}}^{d}\right )\) by

$$ \widehat{\varphi}(\hat{t}\,) := \varphi\big(T_{n}(\hat{t}\,)\big), \qquad \hat{t} \in (-1,1]. $$
On the finite dimensional function space \(P_{r-1}\left ((-1,1],\mathbb {R}^{d}\right )\), however, we have by a norm equivalence that
for all \(\widehat {\psi } \in P_{r-1}\left ((-1,1],\mathbb {R}^{d}\right )\) where κr,k > 0 is independent of τn. To see that the right-hand side terms really define a norm, note that there appear precisely those r conditions of (3.2) that determine \(\left ({\widehat {\mathcal {J}}^{{\mathscr{I}}, \text {Id}} \hat {v}}\right )'\) uniquely, which shows the positive definiteness on \(P_{r-1}\left ((-1,1],{\mathbb {R}}^{d}\right )\). The desired statement then follows from scaling and transformation arguments. □
Lemma 3
Let \(r \in \mathbb {N}_{0}\). Suppose that Assumption 1 holds. Then, there is a positive constant κr, independent of τn, such that
for all \(\varphi \in P_{r}\big (I_{n},\mathbb {R}^{d}\big )\).
Proof
(for more details see [2, Lemma 3.5]) Similar to the proof of Lemma 2, the statement follows using a suitable norm equivalence on the finite dimensional space \(P_{r}\left ((-1,1],{\mathbb {R}}^{d}\right )\) along with transformations via Tn given in (3.1) between the reference interval (− 1,1] and the actual interval In. □
In the following, we assume that the inverse inequalities

$$ \sup_{t \in I_{n}} \big\| \varphi^{(l)}(t) \big\| \leq C_{\text{inv}} \left(\frac{\tau_{n}}{2}\right)^{-l} \sup_{t \in I_{n}} \left\| \varphi(t) \right\|, \qquad \varphi \in P_{r}\left(I_{n},\mathbb{R}^{d}\right), \quad l \in \mathbb{N}_{0}, \qquad (3.3) $$
hold true, where Cinv > 0 is independent of τn but may depend on l and r. The proof is standard and uses transformations to the reference interval together with norm equivalences on finite dimensional spaces.
For the proof of the existence and uniqueness of solutions of (2.1a), we need some more assumptions.
Assumption 2
The reference integrator \({{\widehat {{\mathscr{I}}}}}\) satisfies
where, as before, \(k_{{\mathscr{I}}} \geq 0\) is the smallest integer such that \({{\widehat {{\mathscr{I}}}}}\) is well-defined on \(C^{k_{{\mathscr{I}}}}([-1,1])\). This means by transformation that
holds for the local integrators \({{{\mathscr{I}}}}_{n}, 1 \leq n \leq N\).
Assumption 3
Let \(0 \leq l \leq k_{{\mathscr{I}}}\). The reference approximation operator \(\widehat {\mathcal {I}}\) satisfies
for \(\widehat {\varphi } \in C^{\max \limits \{ k_{\mathcal {I}} , l \}}\left ([-1,1],\mathbb {R}^{d}\right )\) where, as before, \(k_{\mathcal {I}} \geq 0\) is the smallest integer such that \({\widehat {\mathcal {I}}}\) is well-defined on \(C^{k_{\mathcal {I}}}([-1,1])\). This means by transformation that
for \(\varphi \in C^{\max \limits \{ k_{\mathcal {I}} , l \}}\left (\overline {I}_{n},\mathbb {R}^{d}\right )\) holds for the local operators \(\mathcal {I}_{n}\), 1 ≤ n ≤ N.
Assumption 4
For \(0 \leq i \leq k_{\mathcal {J}} = \max \limits \left \{\left \lfloor \tfrac {k}{2} \right \rfloor -1, k_{{\mathscr{I}}}, k_{\mathcal {I}} \right \}\), it holds for sufficiently smooth functions v,w that
for almost every \(s \in \overline {I} = [t_{0},t_{0}+T]\). Here \(\mathfrak {C}_2\) depends on \(k_{\mathcal {J}}\) and f.
Remark 4
Sufficient conditions for Assumption 4 would be

(i) for \(k_{\mathcal {J}} = 0\): f satisfies a Lipschitz condition with respect to the second variable with constant L > 0;

(ii) for \(k_{\mathcal {J}} \geq 1\): f is affine linear in u, i.e., f(t,u(t)) = A(t)u(t) + b(t), and \(\left \| A(\cdot ) \right \|_{C^{k_{\mathcal {J}}}} < \infty \). Then the inequality follows from Leibniz’ rule for the i th derivative;

(iii) in the literature, see [12, p. 74], there also appear conditions of the form

$$ \underset{t \in I, y \in \mathbb{R}^{d}}{\sup} \left\| \tfrac{\partial}{\partial y} f^{(i)}(t,y) \right\| < \infty, \qquad 0 \leq i \leq k_{\mathcal{J}}, $$

where f(i) denotes the i th total derivative of f with respect to t in the sense of [12, p. 65]. These conditions may be weaker in some cases.
Since in general the constant \(\mathfrak {C}_2\) is somewhat connected to the Lipschitz constant and, thus, to the stiffness of the ODE system, the dependence on this constant is studied very thoroughly in the analysis.
Now, we investigate the solvability of the local problem (2.1a).
Theorem 5 (Existence and uniqueness)
Let \(r,k \in \mathbb {Z}\), 0 ≤ k ≤ r. We suppose that Assumptions 1, 2, 3, and 4 hold. Then, there is a constant γr,k > 0 multiplicatively depending on \(\mathfrak {C}_2^{-1}\) but independent of n such that problem (2.1a) has a unique solution for all 1 ≤ n ≤ N when τn ≤ γr,k.
Proof
(for more details see [2, Theorem 3.7]) Since the single local problems can be solved one after another, it suffices to prove that (2.1a) determines a unique solution for a given initial value \(U_{n-1}^{-}\).
In order to show this, we use the auxiliary mapping \(g : P_{r}\left (I_{n},\mathbb {R}^{d}\right )\to P_{r}\left (I_{n},\mathbb {R}^{d}\right )\) given by
To verify that g is well-defined, just follow the lines of the proof of Lemma 1.
It can be easily seen that for given \(U_{n-1}^{-}\) every fixed point of g is a solution of the local problem (2.1a) and vice versa. To get the existence of a unique fixed point, Banach’s fixed point theorem shall be applied. Therefore, we prove that, for τn ≤ γr,k with γr,k > 0 still to be chosen, g is a contraction on \(G_{r}(U_{n-1}^{-}) := \left \{U \in P_{r}\big (I_{n},{\mathbb {R}}^{d}\big ) : U\big (t_{n-1}^{+}\big ) = U_{n-1}^{-} \text { if } k \geq 1\right \}\) with respect to the supremum norm.
So, let \(V,W \in G_{r}(U_{n-1}^{-})\). Then, due to (3.4a) and Lemma 2 (if k ≥ 1) or Lemma 3 (if k = 0), we have \((g(V)- g(W))\left (t_{n-1}^{+}\right ) = 0\) (for k ≥ 1) and
In order to bound the summands of (I), we combine (3.4c), Assumption 4, and the inverse inequalities (3.3) to get for \(1 \leq i \leq \left \lfloor {\frac {k-1}{2}}\right \rfloor \)
The arguments for the summands of (II) are analogous. Altogether, we obtain
We now consider the third sum of (3.5). Exploiting (3.4d) we obtain that
Because of Assumption 2 and Leibniz’ rule for the j th derivative, it follows
where we used for the last inequality that
Furthermore, involving Assumptions 3 and 4, and the inverse inequalities (3.3) we conclude
Now, combining (3.5), (3.6), and (3.8), we get
Hence, for sufficiently small τn ≤ γr,k, where γr,k multiplicatively depends on \({\mathfrak {C}_{2}}^{-1}\) but is independent of n, the mapping g is a contraction on \(G_{r}(U_{n-1}^{-})\) with respect to the supremum norm. By Banach’s fixed-point theorem, we have the existence of a unique fixed point, which is just the unique solution of the local problem (2.1a). □
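The smallness condition on τn can be made tangible in the simplest case k = 0, r = 0, where the fixed-point map is g(U) = U_{n−1} + τn f(tn, U) and the contraction factor is τn L, with L the Lipschitz constant of f. A minimal numerical sketch (the values of L, τn, and the sample points are illustrative):

```python
def contraction_factor(g, v, w):
    """Observed Lipschitz factor |g(v) - g(w)| / |v - w| of the map g."""
    return abs(g(v) - g(w)) / abs(v - w)

L = 50.0                      # Lipschitz constant of f in its second argument
f = lambda t, u: -L * u       # linear right-hand side with that constant
u_prev, t_n = 1.0, 0.1

def g(tau):
    # fixed-point map of one dG(0) / implicit-Euler step
    return lambda u: u_prev + tau * f(t_n, u)

# tau below the threshold 1/L: the map contracts, the iteration converges
q_small = contraction_factor(g(0.01), 0.3, 0.7)   # tau * L = 0.5
# tau above the threshold: the map no longer contracts
q_large = contraction_factor(g(0.04), 0.3, 0.7)   # tau * L = 2.0
print(q_small, q_large)
```

This mirrors the condition \(\widetilde {C} \tau _{n}/2 < 1\) appearing later in the error analysis: once the step length exceeds the reciprocal Lipschitz constant, the fixed-point argument breaks down.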
4 Error analysis
In order to prove an error estimate, we reuse the operator \(\widehat {\mathcal {J}}^{{\mathscr{I}}, {{{\mathcal {I}}}}}\) introduced in the previous section. By transformation, we get for n = 1,…,N local approximation operators \(\mathcal {J}_n^{{\mathscr{I}}, {{\mathcal {I}}}}:C^{k_{\mathcal {J}}+1}\left (\overline {I}_{n},\mathbb {R}^{d}\right ) \to P_{r}\left (I_{n},\mathbb {R}^{d}\right )\) satisfying
Of course, these operators are well-defined under Assumption 1.
For convenience, we define for \(v \in C^{k_{\mathcal {J}}+1}\big (\overline {I},\mathbb {R}^{d}\big )\) a global approximation operator \(\mathcal {J}^{{\mathscr{I}}, {{\mathcal {I}}}}\) by combining the local approximations on the mesh intervals, i.e., \(\big .\big (\mathcal {J}^{{\mathscr{I}}, {{\mathcal {I}}}}v\big ) \big |_{I_{n}} = \mathcal {J}_n^{{\mathscr{I}}, {{\mathcal {I}}}}(v|_{I_{n}}) \in P_{r}\big (I_{n},\mathbb {R}^{d}\big )\) for n = 1,…,N and setting \(\mathcal {J}^{{\mathscr{I}}, {{\mathcal {I}}}} v(t_{0}^{-}) = v(t_{0}^{-}) = v(t_{0})\). Note that even for k ≥ 1, in general \({\mathcal {J}^{{\mathscr{I}}, {{\mathcal {I}}}}} v\) is not globally continuous.
In order to prove the error estimate we need to strengthen the assumptions, since derivative values given in certain points can be handled, but not their suprema. Therefore, compared to Theorem 5, we replace Assumption 3 by Assumptions 5a or 5b. Furthermore, we need an auxiliary interpolation operator \({\mathcal {I}^{\text {app}}}\), see Definition 7, which amongst others is based on these assumptions.
For brevity, the Assumptions 5a and 5b below are stated directly for the local operators \({\mathscr{I}}_{n}\) and \({\mathcal {I}}_{n}\). However, appropriate properties of \(\widehat {{\mathscr{I}}}\) and \({\widehat {\mathcal {I}}}\) guarantee these assumptions by transformation, cf. Assumptions 2 and 3.
Assumption 5a
For 1 ≤ n ≤ N and \(0 \leq l \leq k_{{\mathscr{I}}}\) it holds \(\mathcal {I}_{n} \varphi \in C^{l}\left (\overline {I}_{n}, \mathbb {R}^{d}\right )\) and there are pairwise distinct points \(\hat {t}_{m}^{ {\mathcal {I}}}\), \(m = 0, \ldots ,{K^{\mathcal {I}}}\), in the reference interval [− 1,1] such that
for \(\varphi \in C^{k_{\mathcal {I}}}\left (\overline {I}_{n},\mathbb {R}^{d}\right )\) where \(t_{n,m}^{\mathcal {I}} := \frac {t_{n}+t_{n-1}}{2} + \frac {\tau _{n}}{2} \hat {t}_{m}^{ {\mathcal {I}}}\). Note that then typically \({k_{\mathcal {I}}} = \max \limits \big \{{\widetilde {K}^{\mathcal {I}}}_{m} : m = 0,\ldots ,{K^{\mathcal {I}}}\big \}\).
Assumption 5b
For 1 ≤ n ≤ N, there are pairwise distinct points \(\hat {t}_{m}^{ {\mathscr{I}}}\), \(m = 0, \ldots ,K^{{\mathscr{I}}}\), in the reference interval [− 1,1] such that
for \(\varphi \in C^{k_{{\mathscr{I}}}}\left (\overline {I}_{n},\mathbb {R}^{d}\right )\) where \(t_{n,m}^{{\mathscr{I}}} := \frac {t_{n}+t_{n-1}}{2} + \frac {\tau _{n}}{2} \hat {t}_{m}^{ {\mathscr{I}}}\). Note that then typically \({k_{{\mathscr{I}}}} = \max \limits \big \{{\widetilde {K}^{{\mathscr{I}}}}_{m} : m = 0,\ldots ,{K^{{\mathscr{I}}}}\big \}\).
Moreover, for 1 ≤ n ≤ N assume that there are pairwise distinct points \(\hat {t}_{m}^{ \mathcal {I}}\), \(m = 0, \ldots ,K^{\mathcal {I}}\), in the reference interval [− 1,1] such that
for \(\varphi \in C^{\max \limits \{k_{{\mathscr{I}}}, k_{\mathcal {I}}\}}\left (\overline {I}_{n},\mathbb {R}^{d}\right )\) where \(t_{n,m}^{\mathcal {I}} := \frac {t_{n}+t_{n-1}}{2} + \frac {\tau _{n}}{2} \hat {t}_{m}^{ {\mathcal {I}}}\).
Remark 6
Assumption 5a is satisfied if \(\mathcal {I}_{n}\) is a polynomial approximation operator whose defining degrees of freedom only use derivatives in certain points, as, for example, Hermite interpolation operators. Together with Assumption 2, the term \(\left \| {{{{\mathscr{I}}}}}_{n}\left [{ {\mathcal {I}}_{n} \varphi }\right ]\right \|\) can be estimated by the supremum of \(\left \| \varphi \right \|\) and certain point values of derivatives of φ.
However, Assumption 5a is not satisfied if \(\mathcal {I}_{n} = \text {Id}\) and \(k_{{\mathscr{I}}} > 0\). In order to enable a similar estimate for \(\left \|{\mathscr{I}}_{n}\left [ \mathcal {I}_{n} \varphi \right ] \right \|\) also in this case, Assumption 5b is formulated. Here, the requirements on the integrator \({{{\mathscr{I}}}_{n}}\) are increased. Of course, the defining degrees of freedom for the integrator now should use derivatives in certain points only. In return, the requirements for \({\mathcal {I}_{n}}\) can be weakened such that they are met, for example, also by \({\mathcal {I}}_{n} = {\text {Id}}\).
Definition 7 (Auxiliary interpolation operator)
For the error estimation, we introduce a special Hermite interpolation operator \(\mathcal {I}^{\text {app}}_{n}\). Concretely, the operator should satisfy the following conditions: \(\mathcal {I}^{\text {app}}_{n}\) preserves derivatives up to order \(\left \lfloor {\frac {k}{2}}\right \rfloor -1\) at \(t_{n}^{-}\) and up to order \(\big \lfloor {\frac {k-1}{2}}\big \rfloor -1\) at \(t_{n-1}^{+}\), i.e.,
Moreover, we demand that
with \(t_{n,m}^{\mathcal {I}} := \frac {t_{n}+t_{n-1}}{2} + \frac {\tau _{n}}{2} \hat {t} _{m}^{\mathcal {I}}\) where the points \(\hat {t} _{m}^{{\mathcal {I}}}\) are those of Assumptions 5a or 5b, respectively. If (4.2) and (4.3b) provide rapp independent interpolation conditions and rapp < r + 1, then we choose r + 1 − rapp further points \(\hat {t} _{m}^{\mathcal {I}} \in (-1,1) \setminus \{\hat {t} _{j}^{{\mathcal {I}}} : j = 0,\ldots , {K^{\mathcal {I}}}\}\), \(m = K^{\mathcal {I}} + 1, \ldots , K^{\mathcal {I}} + r +1- r^{\text {app}}\), and demand
where again \(t_{n,m}^{\mathcal {I}} := \frac {t_{n}+t_{n-1}}{2} + \frac {\tau _{n}}{2} \hat {t}_{m}^{ \mathcal {I}}\). We agree that \(\mathcal {I}^{\text {app}}_{n}\) is applied component-wise to vector-valued functions. Overall, conditions (4.2) and (4.3a) uniquely define a Hermite-type interpolation operator of ansatz order \(\max \limits \{r^{\text {app}}-1,r\}\).
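The operator of Definition 7 is a standard Hermite-type interpolation: a polynomial matching prescribed derivative values at the subinterval endpoints plus function values at further points. As a minimal sketch of the endpoint-matching part only, assuming SciPy is available (matching value and first derivative at both endpoints is an illustrative special case, not the paper's exact operator):

```python
import math
from scipy.interpolate import BPoly

def hermite_endpoint_interpolant(v, dv, a, b):
    """Cubic polynomial p matching value and first derivative of v at both
    endpoints a and b, mimicking the derivative-preserving conditions (4.2)."""
    return BPoly.from_derivatives([a, b], [[v(a), dv(a)], [v(b), dv(b)]])

a, b = 0.0, 0.5
p = hermite_endpoint_interpolant(math.sin, math.cos, a, b)

# the interpolation conditions hold exactly: values ...
print(float(p(a)), math.sin(a), float(p(b)), math.sin(b))
# ... and first derivatives (nu=1 evaluates p')
print(float(p(a, nu=1)), math.cos(a), float(p(b, nu=1)), math.cos(b))
```

The additional conditions (4.3a)/(4.3b) would fix further values at the points \(t_{n,m}^{\mathcal {I}}\); they are omitted in this sketch.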
Now, we are able to prove an abstract error estimate.
Theorem 8
Let \(r,k \in \mathbb {Z}\), 0 ≤ k ≤ r. We suppose that Assumptions 1, 2, and 4 hold. Moreover, let Assumptions 5a or 5b be satisfied. Denote by u and U the solutions of (1.1) and (2.1a), respectively. Then, we have for 1 ≤ n ≤ N, sufficiently small τ, and l = 0,1 that
where the constants C in general depend exponentially on the product of T and \(\mathfrak {C}_2\).
Proof
(for more details see [2, Theorem 4.5]) Using the approximation \(\mathcal {J}^{{\mathscr{I}}, {{\mathcal {I}}}}u\), for a definition see directly below (4.1), we split the error into two parts
Then, the triangle inequality yields
The first term on the right-hand side already is the approximation error of the operator \(\mathcal {J}^{{\mathscr{I}}, {{\mathcal {I}}}}\). So, it only remains to study the second term.
Using, among other things, the fundamental theorem of calculus we find
where \(\zeta (t_{0}^{-})=0\) follows due to \(U(t_{0}^{-}) = u_{0} = u(t_{0}^{-}) = \mathcal {J}^{{\mathscr{I}}, {{\mathcal {I}}}}u(t_{0}^{-})\), see (2.1a) and directly below (4.1).
We start analyzing the integral term of (4.5). For 1 ≤ n ≤ N and because \(\left . \zeta \right |_{I_{n}} \in P_{r}\big (I_{n},{\mathbb {R}}^{d}\big )\) we can apply Lemma 2 to get
The right-hand side terms are now studied separately.
The arguments for the sums (Ia) and (Ib) are quite analogous. We therefore only consider the latter in detail. From (4.1b), (2.1b), the (i − 1)th derivative of the ODE system (1.1), and Assumption 4 we obtain for \(1 \leq i \leq \left \lfloor \frac {k}{2} \right \rfloor \)
Exploiting the definition of \(\mathcal {I}^{\text {app}}_{n}\), especially (4.2a), as well as invoking the inverse inequality (3.3), we find for \(0 \leq l \leq i-1 \leq \left \lfloor \frac {k}{2} \right \rfloor -1\)
So, overall we conclude
We now consider (II) in (4.6). First of all, from (4.1c), (2.1d), (1.1), and the continuity of u it follows for all \(\varphi \in P_{r-k}\left (I_{n},\mathbb {R}^{d}\right )\)
Therefore, by a component-wise derivation we get
For the summands on the right-hand side, we consider two different cases. If Assumption 5a holds, we conclude that
where the first inequality follows, as in (3.7), by Assumption 2 and Leibniz’ rule for the j th derivative. Using quite similar arguments, under Assumption 5b we obtain that
So, either way applying Assumption 4 gives
Recalling the definition of \(\mathcal {I}^{\text {app}}_{n}\), especially (4.3b), the terms that include derivatives can be estimated similar to (4.7) which implies that
Hence, together with (4.10) we obtain
So, combining (4.6) with (4.8) and (4.12), we have already shown that
for all 1 ≤ n ≤ N.
Next, we analyze the jump term of (4.5). First, we take a closer look at [ζ]n− 1 for 1 ≤ n ≤ N. There are two cases. If k ≥ 1, the discrete solution U is globally continuous due to (2.1b). So, by (4.1a) and the continuity of u
Otherwise, if k = 0 we exploit (4.9) to rewrite the jump term as follows
Hence, using (4.11) with i = 0 to bound \(\|{\mathscr{I}}_{n}\left [ \mathcal {I}_{n} f(t,u(t)) - \mathcal {I}_{n} f(t,U(t)) \right ] \|\) and combining Assumption 2, the inverse inequality (3.3), and (4.13) to estimate \(\left \|{{{{\mathscr{I}}}}}_{n}\left [{\zeta ^{\prime }(t)}\right ]\right \|\), we have in both cases
Summarizing, we get from (4.4), (4.5), (4.13), and (4.14) for 1 ≤ n ≤ N
Note that \(\widetilde {C}\) is independent of T but in particular depends multiplicatively on the Lipschitz constant of f (hidden in \({\mathfrak {C}_{2}}\)). For τn sufficiently small (\(\widetilde {C} \tau _{n}/2 < 1\)), the En term of the right-hand side can be absorbed into the left-hand side.
Applying a variant of the discrete Gronwall lemma, see [17, Lemma 1.4.2, p. 14], and using that \(u(t_{0}^{-}) = u_{0} = {{\mathcal {J}}^{{\mathscr{I}}, {{\mathcal {I}}}}} u(t_{0}^{-})\), we easily obtain the desired statement for l = 0, where the arising error constant depends exponentially on T and the Lipschitz constant of f (hidden in \(\widetilde {C}\), \({\mathfrak {C}_{2}}\)).
Finally, we derive a bound for the first derivative of the error. Recalling the estimate for \(\zeta ^{\prime }\) in (4.13) it follows
Using the already known estimate for En, the statement for l = 1 follows. □
Remark 9
Based on Theorem 8 and using an inverse inequality we can also prove abstract estimates for higher order derivatives of the error. We find
However, since we only have a non-local error estimate for \(\sup _{t \in I_{n}} \| (u-U)(t) \|\), we cannot expect that \(\left (\frac {\tau _{n}}{2}\right )^{-l}\) can be compensated in general. So, usually we additionally need to assume that τν ≤ τν+ 1 for all ν or alternatively that the mesh is quasi-uniform (τ/τν ≤ C for all ν) to obtain a proper estimate.
Remark 10
In the proof of Theorem 8, stiffness of the problem would be critical at several points.
Indeed, for large Lipschitz constants (hidden in \(\mathfrak {C}_2\) and so in \(\widetilde {C}\)) the needed inequality \(\widetilde {C} \tau _{n}/2 < 1\) would force very small time step lengths. For semidiscretizations in space of time-space problems, where the Lipschitz constant is typically proportional to h− 2 with h denoting the spatial mesh parameter, this would cause upper bounds on the time step length with respect to h similar to CFL conditions.
Moreover, since the error constant C exponentially depends on \(\mathfrak {C}_2\), it would be excessively large for stiff problems. So, the estimate would be useless then.
Of course, Theorem 8 provides an abstract bound for the error of the variational time discretization method. However, the order of convergence is still not clear. Since \({\mathcal {I}^{\text {app}}}\) is a Hermite-type interpolation operator of ansatz order larger than or equal to r, its approximation order is at least r + 1. It remains to prove suitable bounds on the error of the approximation operator \({{\mathcal {J}}^{{\mathscr{I}}, {\mathcal {I}}}}\).
Definition 11
(Approximation orders of \({{{\mathscr{I}}}}_{n}\) and \(\mathcal {I}_{n}\)) Let \(r_{\text {ex}}^{{\mathscr{I}}}\), \(r_{\text {ex}}^{\mathcal {I}}\), \(r_{\mathcal {I}}\), and \(r_{\mathcal {I},{i}}^{{\mathscr{I}}} \in \mathbb {N}_{0} \cup \{-1,\infty \}\) denote the largest numbers such that
Here, P− 1(In) is interpreted as {0}, in which case the respective operator does not provide the corresponding approximation property. For convenience, set \(r_{\mathcal {I}}^{{\mathscr{I}}} := r_{\mathcal {I},{r-k}}^{{\mathscr{I}}}\). Note that \(r_{\text {ex}}^{\mathcal {I}} \geq r_{\mathcal {I},{i}}^{\int } \geq r_{\mathcal {I}}\) and \(r_{\mathcal {I},{i}}^{{\mathscr{I}}} \geq r_{\mathcal {I}}\) hold by definition.
Using standard techniques, the above quantities can be connected with certain approximation estimates. For example, let \(\check {r} \in {\mathbb {N}}_{0}\); then, together with Assumption 2, we find that
for arbitrary \(\varphi \in C^{\max \limits \{k_{{\mathscr{I}}},\min \limits \{\check {r},r_{\text {ex}}^{{\mathscr{I}}}+1\}\}}\left (\overline {I}_{n},\mathbb {R}^{d}\right )\).
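Such approximation orders can be checked numerically via the experimental order of convergence (EOC): an operator that preserves polynomials up to degree r typically shows errors decaying like \(\tau ^{r+1}\) for smooth functions. A sketch for plain Lagrange interpolation at r + 1 equidistant nodes (the node choice and test function are illustrative and unrelated to the paper's specific operators):

```python
import numpy as np

def interp_error(v, r, tau):
    """Sampled sup-norm error of degree-r interpolation of v at r+1
    equidistant nodes on an interval of length tau."""
    nodes = np.linspace(0.0, tau, r + 1)
    coeffs = np.polyfit(nodes, v(nodes), r)      # interpolating polynomial
    fine = np.linspace(0.0, tau, 201)
    return np.max(np.abs(np.polyval(coeffs, fine) - v(fine)))

def eoc(v, r, tau):
    """Experimental order of convergence from interval lengths tau and tau/2."""
    e1, e2 = interp_error(v, r, tau), interp_error(v, r, tau / 2.0)
    return np.log(e1 / e2) / np.log(2.0)

r = 2
order = eoc(np.exp, r, 0.1)   # expected to be close to r + 1 = 3
print(order)
```

Polynomials up to degree r are reproduced up to rounding, which corresponds to the exactness properties collected in Definition 11.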
Lemma 12
Let \(r,k \in \mathbb {Z}\), 0 ≤ k ≤ r, and suppose that Assumptions 1, 2, and 3 hold. Furthermore, let \(l, \check {r} \in \mathbb {N}_{0}\) and define
If \(v \in C^{j_{\max \limits ,\check {r}}}\left (\overline {I}_{n}, \mathbb {R}^{d}\right )\) then the error estimate
holds with a constant C independent of τn.
Proof
The estimate follows from standard approximation theory since under the given assumptions \(\mathcal {J}_n^{{\mathscr{I}}, {{\mathcal {I}}}}\) preserves polynomials up to degree \(\min \limits \{r,{r_{{\mathcal {I}}}^{{\mathscr{I}}}}+1\}\). Also note that a stability estimate for \(\mathcal {J}_n^{{\mathscr{I}}, {{\mathcal {I}}}}\), cf. [2, Lemma 4.2], which then motivates the upper summation bound \(j_{\max \limits ,\check {r}}\), can be easily proven by inverse inequalities, estimation on the reference interval, and transformation. □
In many cases, the approximation error of \(\mathcal {J}^{{\mathscr{I}}, {{\mathcal {I}}}}\) in the mesh points \(t_{n}^{-}\) behaves much better than the pointwise estimate of Lemma 12 suggests. However, to see this, we need some further knowledge of the approximation property connected with the quantity \({r_{{\mathcal {I}},{{i}}}^{{\mathscr{I}}}}\). Besides, the respective result presented in the following lemma will also be used later in the superconvergence analysis.
Lemma 13
Let \(r,k \in \mathbb {Z}\), 0 ≤ k ≤ r. Suppose that Assumptions 2 and 3 hold. Moreover, assume that \(\psi _{i} \in P_{i}(I_{n},\mathbb {R})\), \(i \in \mathbb {N}_{0}\), satisfies \(\sup _{t \in I_{n}} |\psi _{i}^{(j)}(t)| \leq C \tau _{n}^{-j}\) for all \(j\in {\mathbb {N}}_{0}\). Let \(\check {r} \in {\mathbb {N}}_{0}\) and define
Then, we have for \(\varphi \in C^{j_{\max \limits ,i,\check {r}}^{*}}\left (\overline {I}_{n},\mathbb {R}^{d}\right )\) the bound
with a constant C independent of τn.
Proof
(for more details see [2, Lemma 4.10]) Let \(\widetilde {\varphi } \in P_{\min \limits \{\check {r}-1,{r_{{\mathcal {I}},{{i}}}^{{\mathscr{I}}}}\}}\big (I_{n},{\mathbb {R}}^{d}\big )\). We start by rewriting the left-hand side of the desired inequality as follows
where the term in the middle vanishes by definition of \(r_{\mathcal {I},{i}}^{{\mathscr{I}}}\). Exploiting Assumption 2, the Leibniz rule for the j th derivative, and the given bound for \(\psi _{i}^{(j)}\) to estimate the two remaining terms, we obtain
Furthermore, using Assumption 3 to bound the latter summands, we conclude that
for all \(\widetilde {\varphi } \in P_{\min \limits \{\check {r}-1,r_{\mathcal {I},{i}}^{{\mathscr{I}}}\}}\left (I_{n},\mathbb {R}^{d}\right )\). So, choosing \(\widetilde {\varphi }\) as the Taylor polynomial of φ at \((t_{n-1} + t_{n})/2\) of degree \(\min \limits \left \{\check {r}-1,{r_{{\mathcal {I}},{{i}}}^{{\mathscr{I}}}}\right \}\), the desired statement follows by standard error estimates for the Taylor polynomial. □
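The final step uses the standard error estimate for the Taylor polynomial, i.e., for the degree-m expansion at the interval midpoint one has \(\sup _{t \in \overline {I}_{n}} |\varphi (t) - (T_{m}\varphi )(t)| \leq C \tau _{n}^{m+1}\). This rate is easy to check numerically; the following Python sketch is our own illustration (not part of the analysis), with the arbitrary test function sin and degree m = 3:

```python
import math

def taylor_sin(c, m, t):
    """Degree-m Taylor polynomial of sin centered at c, evaluated at t."""
    derivs = [math.sin(c), math.cos(c), -math.sin(c), -math.cos(c)]  # derivatives of sin cycle with period 4
    return sum(derivs[j % 4] * (t - c) ** j / math.factorial(j) for j in range(m + 1))

def max_err(tau, m, samples=200):
    """Max Taylor error over the interval (0.5, 0.5 + tau), expansion at the midpoint."""
    c = 0.5 + tau / 2
    pts = [0.5 + tau * i / samples for i in range(samples + 1)]
    return max(abs(math.sin(t) - taylor_sin(c, m, t)) for t in pts)

# Halving tau should reduce the error by roughly 2^{m+1} = 16 for m = 3,
# i.e., the observed rate should be close to m + 1 = 4:
rate = math.log(max_err(0.2, 3) / max_err(0.1, 3)) / math.log(2)
assert 3.5 < rate < 4.5
```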
Now, we can state and prove the improved estimate for the approximation error of \(\mathcal {J}^{{\mathscr{I}}, {{\mathcal {I}}}}\) in the mesh points \(t_{n}^{-}\).
Lemma 14
Let \(r,k \in \mathbb {Z}\), 0 ≤ k ≤ r. Suppose that Assumptions 1, 2, and 3 hold. Moreover, assume that \(\max \limits \left \{r_{\text {ex}}^{{\mathscr{I}}},r_{\mathcal {I}}^{{\mathscr{I}}}+1\right \} \geq r-1\). Let \(\check {r} \in {\mathbb {N}}_{0}\) and define
Then, provided \(v \in C^{j_{\max \limits ,\check {r}}^{\diamond }}\left (\overline {I}_{n},\mathbb {R}^{d}\right )\), the estimate
holds for 1 ≤ n ≤ N where the constant C is independent of τn.
Proof
(for more details see [2, Lemma 4.11]) From the fundamental theorem of calculus and the definition (4.1) of \(\mathcal {J}_n^{{\mathscr{I}}, {{\mathcal {I}}}}\), we obtain
The first and the second difference on the right-hand side can then be estimated by Lemma 13 with i = 0, ψ0 ≡ 1, and (4.15), respectively. The third difference vanishes if \({r_{\text {ex}}^{{\mathscr{I}}}} \geq r-1\) since then \(\big (\mathcal {J}_n^{{\mathscr{I}}, {{\mathcal {I}}}}v\big )' \in P_{r-1}\big (I_{n},\mathbb {R}^{d}\big )\) is integrated exactly. Otherwise, if \(r_{\mathcal {I}}^{{\mathscr{I}}} +1 \geq r-1\), i.e., \(\mathcal {J}_n^{{\mathscr{I}}, {{\mathcal {I}}}}\) preserves polynomials up to degree r − 1, it can be suitably bounded by combining (4.15) and (4.16).
Altogether, recalling (4.16), which in some cases may provide a better estimate, the desired statement follows. Here, also note that always \({r_{{\mathcal {I}},{{0}}}^{{\mathscr{I}}}} \geq {r_{{\mathcal {I}},{{r-k}}}^{{\mathscr{I}}}} = {r_{{\mathcal {I}}}^{{\mathscr{I}}}} \geq \min \limits \left \{r-1,{r_{{\mathcal {I}}}^{{\mathscr{I}}}}\right \}\). □
Finally, summarizing the above results, we list the proven convergence orders.
Corollary 15
Let \(r,k \in \mathbb {Z}\), 0 ≤ k ≤ r, and l ∈{0,1}. Suppose that Assumptions 1, 2, 3, and 4 hold. Moreover, let Assumption 5a or 5b be satisfied. Denoting by u and U the solutions of (1.1) and (2.1a), respectively, we have for 1 ≤ n ≤ N
with \(r_{\mathcal {I}}^{{\mathscr{I}}}\) as defined in Definition 11. If in addition \(\max \limits \left \{r_{\text {ex}}^{{\mathscr{I}}},r_{\mathcal {I}}^{{\mathscr{I}}}+1\right \} \geq r-1\), then we even have
as an improved \(L^{\infty }\) estimate.
If \(\max \limits \left \{r_{\text {ex}}^{{\mathscr{I}}},r_{\mathcal {I}}^{{\mathscr{I}}}+1\right \} \geq r-1\) is satisfied, we obtain formally
for the \(W^{1,\infty }\) seminorm. However, this again yields only (4.18).
Remark 16
Since the quantity \(r_{\mathcal {I}}^{{\mathscr{I}}} = r_{\mathcal {I},{r-k}}^{{\mathscr{I}}}\) used in the lemmas and the corollary above is quite abstract, we want to provide lower bounds for \(r_{\mathcal {I},{i}}^{{\mathscr{I}}}\) based on the more familiar quantities \(r_{\mathcal {I}}\), \(r_{\text {ex}}^{\mathcal {I}}\), and \(r_{\text {ex}}^{{\mathscr{I}}}\). For a proof of these bounds, we refer to [2, Lemma 4.13].
Let \(r,k \in \mathbb {Z}\), 0 ≤ k ≤ r, and \(i \in \mathbb {N}_{0}\). Then, \(r_{\mathcal {I},{i}}^{{\mathscr{I}}} \geq r_{\mathcal {I}}\). So, for \(r_{\mathcal {I}} = \infty \) the bound cannot be improved further. Otherwise, supposing that \(\mathcal {I}_{n}\) is a projection onto the space of polynomials of maximal degree \(r_{\mathcal {I}} < \infty \), i.e., \(\mathcal {I}_{n} : C^{k_{\mathcal {I}}}(\overline {I}_{n}) \to P_{r_{\mathcal {I}}}(\overline {I}_{n})\) and \(\mathcal {I}_{n} \varphi = \varphi \) for all \(\varphi \in P_{r_{\mathcal {I}}}(\overline {I}_{n})\), we even get
Of course, it holds that \(r_{\mathcal {I},{0}}^{\int } = r_{\text {ex}}^{\mathcal {I}}\). In order to simplify the term on the right-hand side for i ≥ 1, we additionally assume that \(\mathcal {I}_{n}\) satisfies
for all \(\varphi \in C^{k_{\mathcal {I}}}(\overline {I}_{n})\). Then, we simply have
since then \(r_{\mathcal {I},{i}}^{\int } \geq \max \limits \left \{r_{\mathcal {I}},r_{\text {ex}}^{\mathcal {I}}-i\right \}\).
Furthermore, under the weaker assumption that \(\mathcal {I} = \mathcal {I}^{1} \circ {\ldots } \circ \mathcal {I}^{l}\) is a composition of several projection operators \(\mathcal {I}^{j}\), 1 ≤ j ≤ l, that all satisfy (4.20), we still find
where \({\mathscr{M}}_{i} := \left \{ j \in \mathbb {N} | 1 \leq j \leq l-1, \max \limits \left \{r_{\mathcal {I}^{j}},r_{\text {ex}}^{\mathcal {I}^{j}}-i\right \} < \min \limits _{j+1\leq m \leq l}\{r_{\mathcal {I}^{m}}\}\right \}\).
5 Superconvergence analysis
In order to prove superconvergence in time mesh points, we exploit a special representation of the discrete problem (2.1a). To this end, we need one further approximation operator \(\widehat {\mathcal {P}}^{{\mathscr{I}}, {{\mathcal {I}}}}\), see [2, (5.1)], which is connected to \(\widehat {\mathcal {J}}^{{\mathscr{I}}, {{\mathcal {I}}}}\) in such a way that \(\widehat {\mathcal {P}}^{{\mathscr{I}}, {{\mathcal {I}}}}(\hat {v}^{\prime }) = (\widehat {\mathcal {J}}^{{\mathscr{I}}, {{\mathcal {I}}}}\hat {v})'\) holds true. For brevity, we directly consider its local version \(\mathcal {P}_n^{{\mathscr{I}}, {{\mathcal {I}}}}\). Then, analogously to the respective statements for \(\mathcal {J}^{{\mathscr{I}}, {{\mathcal {I}}}}_{n}\), we get the following.
Lemma 17 (Approximation operator)
Let \(r, k \in \mathbb {Z}\), 0 ≤ k ≤ r. Moreover, suppose that Assumption 1 holds. Then, the operator \(\mathcal {P}_n^{{\mathscr{I}}, {{\mathcal {I}}}}:C^{k_{\mathcal {J}}}\left (\overline {I}_{n},\mathbb {R}^{d}\right ) \to P_{r-1}\left (I_{n},\mathbb {R}^{d}\right )\), 1 ≤ n ≤ N, given by
is well-defined.
Furthermore, let Assumptions 2 and 3 hold. For \(l, \check {r} \in \mathbb {N}_{0}\), we define
Then, we have for \(v \in C^{j_{\max \limits ,l,\check {r}}^{\bullet }}\left (\overline {I}_{n},\mathbb {R}^{d}\right )\) the error estimates
with a constant C independent of τn.
Inspecting the definition of U in (2.1a) and (5.1), we find that
Therefore, since for k ≥ 1 we additionally have \(U\left (t_{n-1}^{+}\right ) = U_{n-1}^{-}\) with \(U_{n-1}^{-}\) given, the discretization method fits for these k into the unified framework of [1].
Theorem 18 (Superconvergence estimate)
Let \(r,k \in \mathbb {Z}\), 0 ≤ k ≤ r. Suppose that Assumptions 1, 2, and 3 hold. Moreover, denote by u and U the solutions of (1.1) and (2.1a), respectively. Suppose that (for τ sufficiently small) the global error \(\sup _{t \in I} \|(u-U)(t)\|\), as well as U and all of its derivatives, can be bounded independently of the mesh parameter. Then, we have
with
where \(r_{\text {var}}^{{\mathscr{I}}, \mathcal {I}} := \min \limits _{0 \leq i \leq r-k}\left \{r_{\mathcal {I},{i}}^{{\mathscr{I}}}+i\right \}\).
Proof
(for more details see [2, Theorem 5.3]) In order to prove the desired statement, we first derive an estimate for the error e(t) = (u − U)(t) at \(t_{n}^{-}\), provided that we already have a suitable bound at \(t_{n-1}^{-}\). To this end, we adapt some basic ideas known from the superconvergence proof for collocation methods, see [10, Theorem II.1.5, p. 28]. So, we consider the local discrete solution \(U|_{I_{n}}\) as the solution \(\widetilde {U}\) of the perturbed initial value problem
where \(\text {def}(t) := U^{\prime }(t) - f(t,U(t))\) denotes the defect. Since u solves (1.1), we find for all t ∈ In after linearization that
where f is interpreted as function of (t,v) and the remainder term is given by
The variation of constants formula, cf. [11, Theorem I.11.2, p. 66], yields
where R(t,s) is the resolvent of the homogeneous differential equation \(y^{\prime }(t)= \frac {\partial }{\partial v} f\left (t,u(t)\right ) y(t)\) for initial values given at s.
Using that u is continuous, as is U for k ≥ 1, we conclude
The right-hand side is now split and the individual terms are studied separately.
First, because of \(R(t_{n-1}^{+},t_{n-1}^{+}) = \text {Id}\), we obtain for the term including \(e\left (t_{n-1}^{-}\right )\)
The term including the remainder term rem(⋅) can be bounded as follows
where we also exploited that \(u(s)-\tilde {s}e(s)\) is in a bounded neighborhood of u(s) for all s ∈ In, \(\tilde {s} \in [0,1]\), since by assumption ∥e(s)∥≤ C.
Finally, the remaining terms are considered. A Taylor series expansion of \(R(t_{n}^{-},s)\) with respect to s at \(t_{n-1}^{+}\) motivates the following decomposition
We start with (II). Using (5.3) and (5.2) with v(⋅) = f(⋅,U(⋅)), we obtain
where \(G_{n}^{(\text {II})}\) depends on derivatives of R and f(⋅,U(⋅)).
The term (I) with 0 ≤ i ≤ r − k can be rewritten using (2.1d)
The first term on the right-hand side can be bounded by applying (4.15), Leibniz’ rule for the j th derivative, and, if beneficial, again (5.3) as well as (5.2) with v(⋅) = f(⋅,U(⋅)). We then find
with
where \(G_{n}^{(\text {Ia})}\) depends on (partial) derivatives of U and f(⋅,U(⋅)). Factoring out \({\tau _{n}^{i}}\), the second term on the right-hand side of (5.9) can be analyzed by Lemma 13 with φ(t) = f(t,U(t)) and \(\psi _{i}(t) = \left (\frac {t-t_{n-1}}{\tau _{n}}\right )^{i}\). Then we obtain
where \(G_{n}^{(\text {Ib})}\) depends on derivatives of f(⋅,U(⋅)). So, combining (5.8) with the estimates for the single parts of (I) and (II), we obtain for \(\bar {r}\) sufficiently large
with
where \(r_{\text {var}}^{{\mathscr{I}}, \mathcal {I}} = \min \limits _{0 \leq i \leq r-k}\left \{r_{\mathcal {I},{i}}^{{\mathscr{I}}}+i\right \}\) and Gn depends on (partial) derivatives of R, U, and f(⋅,U(⋅)). Here note that \(r_{\text {var}}^{{\mathscr{I}}, \mathcal {I}} \leq r_{\mathcal {I},{r-k}}^{{\mathscr{I}}}+r-k ={r_{{\mathcal {I}}}^{{\mathscr{I}}}} +r-k\) was used to simplify the exponent rsuper. Therefore, incorporating (5.6), (5.7), and (5.10) in (5.5) gives
with \(\lambda _{n} = \sup _{\tilde {s} \in I_{n}} \left \| \frac {\partial }{\partial t} R\left (\tilde {s},t_{n-1}^{+}\right )\right \|\). A variant of the discrete Gronwall lemma, see [8, Proposition 3.3], together with (1 + x) ≤ ex then yields
where we also used \(e(t_{0}^{-}) = 0\).
It remains, as a small technical detail, to verify that Gν can be bounded uniformly, independently of τν. The term depends on partial derivatives of f, on derivatives of R, and on derivatives of the discrete solution U and, thus, potentially also on the mesh parameter. However, U can be uniformly bounded by assumption. So, we are done.
□
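The discrete Gronwall step of the proof can be illustrated in isolation. The following Python sketch is our own illustration (the coefficients λ, g and all numerical values are arbitrary, not taken from the analysis): a sequence satisfying \(a_{n} \leq (1+\lambda \tau ) a_{n-1} + \tau g\) obeys an exponential bound obtained by unrolling the recursion and using (1 + x) ≤ e^x.

```python
import math

def recursive_seq(a0, lam, g, tau, n):
    """Sequence satisfying a_j = (1 + lam*tau) * a_{j-1} + tau * g with equality."""
    a = a0
    for _ in range(n):
        a = (1.0 + lam * tau) * a + tau * g
    return a

def gronwall_bound(a0, lam, g, tau, n):
    """Gronwall-type bound a_n <= e^{lam*n*tau} * a0 + (e^{lam*n*tau} - 1) * g / lam,
    obtained from the unrolled recursion via (1 + x) <= e^x."""
    t = n * tau
    return math.exp(lam * t) * a0 + math.expm1(lam * t) * g / lam

# The recursion never exceeds the bound since (1 + lam*tau)^j <= e^{lam*j*tau}:
assert recursive_seq(1.0, 2.0, 0.5, 0.01, 100) <= gronwall_bound(1.0, 2.0, 0.5, 0.01, 100)
```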
Remark 19
Using an alternative argument (inspired by the proof of [11, Theorem II.7.9, pp. 212/213]) that is based on the application of the non-linear variation-of-constants formula [11, Corollary I.14.6, p. 97], it can be shown that for 1 ≤ k ≤ r the term \(\sup _{t \in I} \|(u-U)(t)\|^{2}\) in (5.4a) can be dropped.
However, for k = 0, the alternative proof is much more complicated and in general only guarantees a worse superconvergence estimate than Theorem 18. Moreover, for all k, the notation gets more involved.
Lemma 20
Suppose that Assumption 1 holds along with an estimate similar to (5.2) that at least guarantees approximation order r − 1 for \({\mathcal {P}_n^{{\mathscr{I}}, {{\mathcal {I}}}}}\) (e.g. if \({r_{{\mathcal {I}}}^{{\mathscr{I}}}} \geq r-2\)). In addition, let the solutions u of (1.1) and U of (2.1a) satisfy \(\sup _{t \in I_{n}} \| (u - U)(t) \| \leq C\) for some constant C independent of the mesh parameter. Then, we have
for all 0 ≤ l ≤ r.
Proof
(for more details see [2, Lemma 5.5]) Firstly, by assumption we have
From this, the wanted estimates follow by induction. Indeed, by (5.3) it holds
The first term on the right-hand side only contains derivatives of U up to order at most l. For the second term, a similar upper bound (independent of τn) in terms of derivatives of U up to order at most l can be derived by suitably combining (5.2), a generalization of Faà di Bruno’s formula, see [16, Theorem 2.1], and inverse inequalities. Hence, by the induction hypothesis we obtain the upper bound C(f,u). □
Summarizing the above observations, our analysis guarantees the following estimates in the time mesh points.
Corollary 21
Let \(r,k \in \mathbb {Z}\), 0 ≤ k ≤ r. Suppose that Assumptions 1, 2, 3, and 4 hold. Moreover, let Assumption 5a or 5b be satisfied. Let u and U denote the solutions of (1.1) and (2.1a), respectively. Then, if \({r_{{\mathcal {I}}}^{{\mathscr{I}}}} \geq r-2\), we have for 1 ≤ n ≤ N
with rsuper from (5.4b) and \(r_{\mathcal {I}}^{{\mathscr{I}}} = r_{\mathcal {I},{r-k}}^{{\mathscr{I}}}\), \(r_{\mathcal {I},{i}}^{{\mathscr{I}}}\) as defined in Definition 11.
If \(r_{\mathcal {I}}^{{\mathscr{I}}} < r-2\), we in general cannot ensure the uniform boundedness of U and its derivatives since Lemma 20 does not hold. Then we only have
where we refer to Corollary 15 for bounds on the right-hand side term.
6 Numerical experiments
We consider the initial value problem
of a system of non-linear ordinary differential equations which has
as solution.
The non-linear systems within each time step were solved by Newton’s method, where we applied a Taylor expansion of the inherited data from the previous time interval to calculate an initial guess for all unknowns on the current interval. If higher-order derivatives were needed at the initial time t0 = 0, we applied
based on the ODE system (1.1) and its derivatives.
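The idea behind this derivative computation can be sketched for a hypothetical scalar example (our own Python illustration; the right-hand side f(t, u) = t + u² is not the test problem used in this section): differentiating the ODE repeatedly along the solution yields all higher derivatives of u at t0 from u0 alone.

```python
def derivatives_at_t0(t0, u0):
    """Return [u, u', u'', u'''] at t0 for u' = f(t, u) with f(t, u) = t + u^2,
    obtained by differentiating the ODE along the solution:
      u''  = f_t + f_u * u' = 1 + 2*u*u',
      u''' = 2*(u')**2 + 2*u*u''."""
    u = u0
    d1 = t0 + u * u
    d2 = 1.0 + 2.0 * u * d1
    d3 = 2.0 * d1 * d1 + 2.0 * u * d2
    return [u, d1, d2, d3]

# For u(0) = 0 the ODE gives u'(0) = 0, u''(0) = 1, u'''(0) = 0:
assert derivatives_at_t0(0.0, 0.0) == [0.0, 0.0, 1.0, 0.0]
```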
By considering different choices for \({{{\mathscr{I}}}}_{n}\) and \(\mathcal {I}_{n}\), we will show that our theory provides sharp bounds on the convergence order. Since \({\mathscr{I}}_{n}\) and \({\mathcal {I}}_{n}\) are obtained from \(\widehat {{\mathscr{I}}}\) and \({\widehat {\mathcal {I}}}\) via transformation, we only specify the reference operators.
Each integrator \({{\widehat {{\mathscr{I}}}}}\) used in our calculations is based on Lagrange interpolation with respect to a specific node set \(P_{\widehat {{\mathscr{I}}}}\). Hence, we have \({k_{{\mathscr{I}}}}=0\). The interpolation operator \({\widehat {\mathcal {I}}}\) is of Lagrange type and uses the node set \(P_{{\widehat {\mathcal {I}}}}\). This means that \({k_{\mathcal {I}}}=0\). Both node sets are given for each of our test cases. Since nodes of quadrature formulas are often used, we also write, for instance, “left Gauss–Radau(k)” to indicate that the nodes of the left-sided Gauss–Radau formula with k points have been used. All upcoming settings fulfill Assumption 1.
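Such Lagrange-type operators are straightforward to realize. The following generic Python sketch is our own illustration (the equidistant node set and the test polynomial are arbitrary choices, not one of the test cases); it builds the interpolant on a given node set and confirms that m nodes reproduce polynomials up to degree m − 1, i.e., \(r_{\mathcal {I}} \geq m-1\) in the notation of Definition 11.

```python
def lagrange_interpolant(nodes, values):
    """Return p(t) interpolating the (node, value) pairs in Lagrange form."""
    def p(t):
        s = 0.0
        for i, (ti, vi) in enumerate(zip(nodes, values)):
            li = 1.0  # i-th Lagrange basis polynomial evaluated at t
            for j, tj in enumerate(nodes):
                if j != i:
                    li *= (t - tj) / (ti - tj)
            s += vi * li
        return s
    return p

# With m = 4 nodes the interpolant reproduces polynomials up to degree 3:
nodes = [-1.0, -1/3, 1/3, 1.0]                 # 4 equidistant nodes on [-1, 1]
f = lambda t: 2*t**3 - t + 0.5                 # degree 3 <= m - 1
p = lagrange_interpolant(nodes, [f(t) for t in nodes])
assert abs(p(0.2) - f(0.2)) < 1e-12
```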
For all test cases, the method \({\mathbf {VTD}_{3}^{6}}\), which is cGP-C1(6), was applied as discretization. All calculations were carried out with the software Julia [5] using the floating point data type BigFloat with 512 bits. Errors were measured in the norms
where ∥⋅∥ denotes the Euclidean norm in \(\mathbb {R}^{d}\).
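In practice, such errors are evaluated discretely. A possible realization is sketched below in Python (our own discrete approximations of the norms, not the paper's exact formulas: the \(L^{\infty }\)-type error is approximated by sampling each subinterval, while the \(\ell ^{\infty }\)-type error uses only the mesh points; the helper names and the sampling density are assumptions).

```python
import math

def euclidean(v):
    """Euclidean norm in R^d."""
    return math.sqrt(sum(x * x for x in v))

def linf_error(u, U, mesh, samples=20):
    """Approximate sup_{t in I} ||u(t) - U(t)|| by sampling each subinterval."""
    err = 0.0
    for a, b in zip(mesh[:-1], mesh[1:]):
        for i in range(samples + 1):
            t = a + (b - a) * i / samples
            err = max(err, euclidean([x - y for x, y in zip(u(t), U(t))]))
    return err

def mesh_error(u, U, mesh):
    """max_{1 <= n <= N} ||u(t_n) - U(t_n)|| over the mesh points."""
    return max(euclidean([x - y for x, y in zip(u(t), U(t))]) for t in mesh[1:])
```

For example, for u(t) = (t,) and a perturbed U(t) = (0.99 t,) on the mesh {0, 0.5, 1}, both errors evaluate to 0.01 up to rounding.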
6.1 Case group 1
Case 1
Choosing integration and interpolation according to
leads to \(r_{\text {ex}}^{{\mathscr{I}}}=3\) and \(r_{\mathcal {I}}^{{\mathscr{I}}}=2\). Hence, the condition \(\max \limits \Big \{r_{\text {ex}}^{{\mathscr{I}}},r_{\mathcal {I}}^{{\mathscr{I}}}+1\Big \} \geq r-1\) in Corollary 15 is violated and the expected convergence orders for both the \(L^{\infty }\) norm and the \(W^{1,\infty }\) seminorm are given by \(\min \limits \Big \{r,r_{\mathcal {I}}^{{\mathscr{I}}}+1\Big \}=3\), see (4.18). It can be seen from Table 1 that the theoretical predictions are met by the numerical experiments. Moreover, in accordance with Corollary 21, the \(\ell ^{\infty }\) convergence order is also just 3. This means that the uniform boundedness of \(\sup _{t\in I}\|U^{(l)}(t)\|\) required by Theorem 18, which cannot be guaranteed because of \({r_{\mathcal {I}}}+1 < r-1\), is indeed violated, since otherwise (5.4a) would yield order 4.
The condition \(\max \limits \left \{r_{\text {ex}}^{{\mathscr{I}}},r_{\mathcal {I}}^{{\mathscr{I}}}+1\right \} \geq r-1\) of Corollary 15 will be fulfilled for all coming cases. Hence, the computations should show the convergence order given by (4.19) for the \(L^{\infty }\) norm.
6.2 Case group 2
This group of cases provides choices for \(P_{{{\widehat {{\mathscr{I}}}}}}\) and \(P_{\widehat {\mathcal {I}}}\) such that the \(L^{\infty }\) convergence order is limited by the maximum expression inside the outer minimum in (4.19). In addition, the presented cases will show that each of the three terms occurring in the maximum term can limit the convergence order. In the following, we indicate the limiting term in boldface.
Case 2a
The choices
provide the \(L^{\infty }\) convergence order \(\min \limits \{7,6,6,\max \limits \{4,\min \limits \{6,\boldsymbol {5}\}\}\}=5\) where the convergence order is limited by the second term inside the inner minimum. We see from Table 2 that the experimental order of convergence is 6, i.e., one order higher than expected. This behavior can be explained by a closer look at Lemma 14. The proof there guarantees in this case that \((v - \mathcal {J}_n^{{\mathscr{I}}, {{\mathcal {I}}}} v)(t_{n}^{-})=0\) for all v ∈ P5(In). However, for symmetry reasons, it holds that \({\int \limits }_{I_{n}} (v-\mathcal {J}_n^{{\mathscr{I}}, {{\mathcal {I}}}} v)'(t) {\mathrm {d}} t = 0\) for all v ∈ P6(In), which implies that \((v - \mathcal {J}_n^{{\mathscr{I}}, {{\mathcal {I}}}} v)(t_{n}^{-})=0\) even for v ∈ P6(In). Thus, the convergence order of the limiting term is actually better than predicted.
Taking the same setting for \(P_{{{\widehat {{\mathscr{I}}}}}}\) but using
the convergence order predicted by (4.19) is \(\min \limits \{7,6,6,\max \limits \{4,\min \limits \{6,\boldsymbol {5}\}\}\}=5\) again. The limitation here is also caused by the second argument of the inner minimum. Table 2 shows under Case 2a* that this convergence order is obtained in the numerical experiments. Note that here the interpolation points were chosen precisely such that still \(r_{\mathcal {I},{0}}^{{\mathscr{I}}}=5 > 4 = r_{\mathcal {I}}^{{\mathscr{I}}}\), i.e., especially \({{\widehat {{\mathscr{I}}}}}[(\hat {v}-\widehat {\mathcal {J}}^{{\mathscr{I}}, {{\mathcal {I}}}}\hat {v})'] ={{\widehat {{\mathscr{I}}}}}[\hat {v}^{\prime }- \widehat {\mathcal {I}}(\hat {v}^{\prime })] =0\) for \(\hat {v}(\hat {t} )=\hat {t} ^{6}\), but \({\int \limits }_{-1}^{1} (\hat {v}-\widehat {\mathcal {J}}^{{\mathscr{I}}, {{\mathcal {I}}}}\hat {v})'(\hat {t} ) {\mathrm {d}}\hat {t}\neq 0\) for \(\hat {v}(\hat {t} )=\hat {t} ^{6}\), which ensures that the estimate of Lemma 14 is sharp.
Case 2b
If we set
the expected convergence order is \(\min \limits \{7,\infty ,\infty , \max \limits \{4,\min \limits \{\boldsymbol {6},\infty \}\}\} = 6\). Here, the limitation comes from the first argument of the inner minimum. The numerical results given in Table 2 clearly show this convergence order.
Case 2c
Choosing
results in the convergence order \(\min \limits \{7,\infty ,\infty , \max \limits \{\boldsymbol {6},\min \limits \{\boldsymbol {6},\infty \}\}\}=6\). The first argument of the maximum acts as the limitation. The numerical results in Table 2 show that the expected convergence order is obtained. Note that it is not possible that \({r_{\text {ex}}^{{\mathscr{I}}}}+1\) is the only limiting term since the structure of (4.19) implies that \(\min \limits \left \{r+1, r_{\mathcal {I}}^{{\mathscr{I}}}+2\right \} \geq r_{\text {ex}}^{{\mathscr{I}}} +1 \geq \min \limits \left \{r,r_{\mathcal {I}}^{{\mathscr{I}}}+1\right \}\) if \(r_{\text {ex}}^{{\mathscr{I}}}+1\) is limiting. Hence, the integer \({r_{\text {ex}}^{{\mathscr{I}}}}+1\) coincides either with \(\min \limits \left \{r+1, {r_{{\mathcal {I}}}^{{\mathscr{I}}}}+2\right \}\) or with \(\min \limits \left \{r,{r_{{\mathcal {I}}}^{{\mathscr{I}}}}+1\right \}\).
6.3 Case group 3
This group of cases studies the convergence orders in the \(L^{\infty }\) norm and the \(W^{1,\infty }\) seminorm. The presented choices will show that each of the first three expressions in the outer minimum in (4.19) can bound the \(L^{\infty }\) convergence order. Moreover, the cases will demonstrate that the convergence order in the \(W^{1,\infty }\) seminorm can be limited by both occurring terms in (4.18). Again, the limiting numbers are given in boldface.
Case 3a
The choice
results in
Hence, the third argument in the outer minimum determines the convergence order for the \(L^{\infty }\) norm while the second argument of the minimum limits the convergence order of the \(W^{1,\infty }\) seminorm. The numerical results in Table 3 provide the predicted convergence orders.
Case 3b
Setting
gives
The convergence order in the \(W^{1,\infty }\) seminorm is again determined by the second argument of the corresponding minimum. The limitation of the convergence order of the \(L^{\infty }\) norm is caused by the second term. We clearly see from Table 3 that the expected convergence orders are obtained by the numerical simulations.
Case 3c
If we take
we get
The convergence orders of the \(L^{\infty }\) norm and the \(W^{1,\infty }\) seminorm are limited by the first argument in the corresponding minimum expressions. The numerical results in Table 4 indicate that the predicted orders are achieved. The additionally presented results in the \(\ell ^{\infty }\) norm also show the predicted behavior. Note that all three error expressions show the optimal convergence orders that would also be obtained if exact integration were used and \({\mathcal {I}}\) were the identity operator.
6.4 Case group 4
This group of cases studies the superconvergence. Hence, we restrict ourselves to cases where the convergence order in the \(\ell ^{\infty }\) norm suggested by (5.11) is strictly greater than the convergence order in the \(L^{\infty }\) norm given by (4.19). We will show for this situation that the first two arguments in the minimum in (5.11) as well as the first argument inside the maximum there can limit the \(\ell ^{\infty }\) convergence order. Recall that the limiting term is written in boldface.
Case 4a
The choice
leads to
see (4.19) and (5.11). The convergence order in \(\ell ^{\infty }\) is bounded by the second argument of the minimum expression. As can be seen in Table 5, the expected convergence orders are obtained.
Case 4b
Setting
the convergence orders
are expected by our theory. Hence, the convergence order in the \(\ell ^{\infty }\) norm is limited by the first term inside the minimum in (5.11). The numerical results coincide with our predictions.
Case 4c
Taking
provides
compare (4.19) and (5.11). The limitation of the \(\ell ^{\infty }\) convergence is caused by the first argument of the maximum in (5.11). The results in Table 5 clearly show the superconvergence since the convergence order in the \(\ell ^{\infty }\) norm is one higher than that in the \(L^{\infty }\) norm.
Case 4d
Choosing
the estimates (4.19) and (5.11) suggest
However, since the uniform boundedness of \(\sup _{t\in I}\|U^{(l)}(t)\|\) assumed in Theorem 18 cannot be ensured by Lemma 20 due to \(r_{\mathcal {I}}+1 = 3 < 5 = r-1\), we actually do not expect any superconvergence. These expectations are confirmed by the numerical results given in Table 5. They show that for both the \(L^{\infty }\) and the \(\ell ^{\infty }\) norm convergence order 4 is obtained.
6.5 Summary
The experimentally obtained and theoretically predicted convergence orders for all cases and all considered norms are collected in Table 6. The experimental orders of convergence were calculated using the results obtained for 256 and 512 time steps.
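The calculation behind these experimental orders follows from the ansatz err ≈ Cτ^p: comparing the errors for N and 2N time steps gives p ≈ log₂(err_N / err_2N). A minimal Python sketch (the sample error values below are illustrative only, not taken from the tables):

```python
import math

def eoc(err_coarse, err_fine, refinement=2.0):
    """Experimental order of convergence from two errors, assuming err ~ C * tau^p."""
    return math.log(err_coarse / err_fine) / math.log(refinement)

# Halving tau for a 6th-order method reduces the error by about 2^6 = 64:
assert abs(eoc(6.4e-5, 1.0e-6) - 6.0) < 1e-9
```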
References
Akrivis, G., Makridakis, C.H., Nochetto, R.H.: Galerkin and Runge-Kutta methods: unified formulation, a posteriori error estimates and nodal superconvergence. Numer. Math. 118, 429–456 (2011)
Becher, S., Matthies, G.: Unified analysis for variational time discretizations of higher order and higher regularity applied to non-stiff odes. arXiv:2105.06862v1 (2021)
Becher, S., Matthies, G.: Variational time discretizations of higher order and higher regularity. BIT Numer. Math. https://doi.org/10.1007/s10543-021-00851-6(2021)
Becher, S., Matthies, G., Wenzel, D.: Variational methods for stable time discretization of first-order differential equations. In: Georgiev, K., Todorov, M., Ivan, G. (eds.), Advanced Computing in Industrial Mathematics: BGSIAM 2017, pp. 63–75. Springer International Publishing, Cham (2019)
Bezanson, J., Edelman, A., Karpinski, S., Shah, V.B.: Julia: a fresh approach to numerical computing. SIAM Rev. 59(1), 65–98 (2017)
Delfour, M., Hager, W., Trochu, F.: Discontinuous Galerkin methods for ordinary differential equations. Math. Comp. 36(154), 455–473 (1981)
Delfour, M.C., Dubeau, F.: Discontinuous polynomial approximations in the theory of one-step, hybrid and multistep methods for nonlinear ordinary differential equations. Math. Comp. 47(175), 169–189 (1986)
Emmrich, E.: Discrete versions of Gronwall’s lemma and their application to the numerical analysis of parabolic problems. Preprint 637-1999, Preprint series of the Institute of Mathematics, Technische Universität Berlin (1999)
Estep, D., Stuart, A.: The dynamical behavior of the discontinuous Galerkin method and related difference schemes. Math. Comp. 71(239), 1075–1103 (2002)
Hairer, E., Lubich, Ch., Wanner, G.: Geometric numerical integration. Springer-Verlag, New York (2002). Corrected 2nd printing 2004
Hairer, E., Nørsett, S.P., Wanner, G.: Solving ordinary differential equations I, 2nd edn. Springer-Verlag, New York (1993). Corrected 3rd printing 2008
Henrici, P.: Discrete variable methods in ordinary differential equations. Wiley, New York (1962)
Hulme, B.L.: Discrete Galerkin related one-step methods for ordinary differential equations. Math. Comp. 26(120), 881–891 (1972)
Hulme, B.L.: One-step piecewise polynomial Galerkin methods for initial value problems. Math. Comp. 26(118), 415–426 (1972)
Lasaint, P., Raviart, P.-A.: On a finite element method for solving the neutron transport equation. In: Mathematical aspects of finite elements in partial differential equations. (Proc. Sympos., Math. Res. Center, Univ. Wisconsin, Madison, Wis., 1974), pp 89–123 (1974)
Mishkov, R.L.: Generalization of the formula of Faà di Bruno for a composite function with a vector argument. Int. J. Math. & Math. Sci. 24(7), 481–491 (2000)
Quarteroni, A., Valli, A.: Numerical approximation of partial differential equations. Springer Series in Computational Mathematics. Springer, Berlin (2008)
Funding
Open Access funding enabled and organized by Projekt DEAL.
Cite this article
Becher, S., Matthies, G. Unified analysis for variational time discretizations of higher order and higher regularity applied to non-stiff ODEs. Numer Algor 89, 1533–1565 (2022). https://doi.org/10.1007/s11075-021-01164-z
Keywords
- Discontinuous Galerkin
- Continuous Galerkin–Petrov
- Variational methods
- Quadrature
- Higher order
- Superconvergence