1 Introduction

A Poisson problem has the form

$$ \dot y = B(y)\nabla H(y) =: f(y), \quad t>0,\qquad y(0)=y_{0}\in\mathbb{R}^{m}, \qquad B(y)^{\top} = -B(y), $$
(1)

where H(y) is a scalar function, usually called the Hamiltonian. For the sake of simplicity, hereafter both H(y) and B(y) will be assumed to be suitably regular. One easily deduces that H(y) is a constant of motion since, along the solution of (1),

$$\frac{\mathrm{d}}{\mathrm{d} t} H(y) = \nabla H(y)^{\top} \dot y = \nabla H(y)^{\top} B(y)\nabla H(y) = 0,$$

due to the skew-symmetry of B(y). Possible additional invariants of (1) are its Casimirs, namely scalar functions C(y) for which

$$ \nabla C(y)^{\top} B(y) = (0,\dots,0)\in\mathbb{R}^{1\times m} $$
(2)

holds true for all y. When the matrix B(y) is constant, as is the case for Hamiltonian problems,

$$ \dot y = J\nabla H(y), \qquad t>0, \qquad y(0)=y_{0}, \qquad J^{\top}=-J, $$
(3)

energy conservation can be obtained by solving problem (3) via HBVMs, a class of energy-conserving Runge-Kutta methods for Hamiltonian problems (see, e.g., [7, 8, 11, 12] and the monograph [5]; see also the recent review paper [6]). Nevertheless, in the case where the problem is not Hamiltonian, HBVMs are no longer energy-conserving. This motivates the present paper, where an energy-conserving variant of HBVMs for Poisson problems is derived and analyzed.

The numerical solution of Poisson problems has been tackled by following many different approaches (see, e.g., [20, Chapter VII] and references therein). More recently, it has been considered in [19], where an extension of the AVF method [22] is proposed, and in [1, 3], where a line integral approach has been used instead. Functionally fitted methods have been proposed in [21, 24,25,26]. In this paper we further pursue the line integral approach to the problem, which will provide an energy-conserving variant of HBVMs for solving (1).

With this premise, the structure of the paper is as follows: in Section 2 we describe the new framework in which the methods will be derived; in Section 3 we provide the final shape of the method, while in Section 4 its actual implementation is studied; in Section 5 we present a few numerical tests confirming the theoretical findings; finally, in Section 6 we give some concluding remarks.

2 The new framework

As anticipated above, the framework that we shall use to derive and analyze the methods is that of the so-called line integral methods, namely methods whose conservation properties stem from the vanishing of a corresponding line integral [5, 6]. Such methods have been extensively investigated in the case of Hamiltonian problems, their major instance being Hamiltonian Boundary Value Methods (HBVMs). The analysis will strictly follow that in [11] and [17]. To begin with, let us consider problem (1) on the interval [0,h],

$$ \dot y(ch) = B(y(ch)) \nabla H(y(ch)),\qquad c\in[0,1], \qquad y(0) = y_{0}. $$
(4)

In fact, since we shall be dealing with a one-step method, it suffices to analyze its first application step, with h the time-step. Next, let us consider the orthonormal Legendre polynomial basis \(\{P_{j}\}_{j\ge0}\) on the interval [0,1],

$$ \deg P_{j} = j, \qquad {{\int}_{0}^{1}} P_{i}(x)P_{j}(x)\mathrm{d} x = \delta_{ij}, \qquad \forall i,j=0,1,\dots, $$
(5)

with \(\delta_{ij}\) the Kronecker symbol, and the following expansions for the functions at the right-hand side in (4):

$$ \begin{array}{@{}rcl@{}} \nabla H(y(ch)) = {\sum}_{j\ge0} P_{j}(c) \gamma_{j}(y), &&P_{j}(c)B(y(ch)) = {\sum}_{i\ge0} P_{i}(c) \rho_{ij}(y), \qquad~ c\in[0,1],\\ \\ \gamma_{j}(y) = {{\int}_{0}^{1}} P_{j}(\tau)\nabla H(y(\tau h))\mathrm{d}\tau, && \rho_{ij}(y) = {{\int}_{0}^{1}} P_{i}(\tau)P_{j}(\tau)B(y(\tau h))\mathrm{d}\tau, \quad i,j=0,1,\dots. \end{array} $$
(6)
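To make the objects in (5)–(6) concrete, the following minimal Python sketch (our own illustration: the path `sigma` and the gradient `grad_H` are hypothetical placeholders, not taken from the paper) builds the orthonormal shifted Legendre basis, checks the orthonormality conditions (5) numerically, and evaluates the Fourier coefficients \(\gamma_{j}\) by a high-order quadrature; their magnitudes decay like \(O(h^{j})\), as formalized in Corollary 1 below.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss, legval

def P(j, x):
    # Orthonormal shifted Legendre polynomial of degree j on [0,1], cf. (5):
    # P_j(x) = sqrt(2j+1) * L_j(2x-1), with L_j the classical Legendre polynomial.
    cf = np.zeros(j + 1); cf[j] = 1.0
    return np.sqrt(2*j + 1) * legval(2*np.asarray(x) - 1.0, cf)

# High-order Gauss-Legendre rule mapped to [0,1], to evaluate the integrals in (5)-(6).
x, w = leggauss(60)
tau, b = (x + 1) / 2, w / 2

# Numerical check of the orthonormality conditions (5).
G = np.array([[np.sum(b * P(i, tau) * P(j, tau)) for j in range(4)] for i in range(4)])
assert np.allclose(G, np.eye(4))

# Hypothetical smooth path sigma and gradient grad_H, for illustration only.
grad_H = lambda y: np.array([np.cos(y[0]), np.exp(-y[1])])
sigma = lambda t: np.array([1.0 + np.sin(t), 0.5 + t**2])

for h in (0.2, 0.1, 0.05):
    # gamma_j(sigma) = int_0^1 P_j(tau) grad_H(sigma(tau*h)) dtau, cf. (6).
    gamma = [sum(b[l] * P(j, tau[l]) * grad_H(sigma(tau[l] * h)) for l in range(60))
             for j in range(4)]
    print(h, ["%.1e" % np.linalg.norm(g) for g in gamma])  # |gamma_j| = O(h^j), cf. (7)
```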

The following properties hold true.

Lemma 1

Assume \(\psi :[0,h]\rightarrow V\), with V a vector space, admits a Taylor expansion at 0. Then, for all \(j=0,1,\dots \):

$${{\int}_{0}^{1}} P_{j}(c)c^{i}\psi(ch)\mathrm{d} c=O(h^{j-i}), \qquad i=0,\dots,j.$$

Proof

By the hypotheses on ψ, one has:

$$c^{i}\psi(ch)={\sum}_{r\ge0} \frac{\psi^{(r)}(0)}{r!} h^{r} c^{r+i}.$$

Consequently, for all \(i=0,\dots ,j\), by virtue of (5) it follows that:

$$ \begin{array}{@{}rcl@{}} {{\int}_{0}^{1}} P_{j}(c)c^{i}\psi(ch)\mathrm{d} c&=& {\sum}_{r\ge0} \frac{\psi^{(r)}(0)}{r!} h^{r} {{\int}_{0}^{1}}P_{j}(c)c^{r+i}\mathrm{d} c\\ &=&{\sum}_{r\ge j-i} \frac{\psi^{(r)}(0)}{r!} h^{r} {{\int}_{0}^{1}}P_{j}(c)c^{r+i}\mathrm{d} c = O(h^{j-i}). \end{array} $$

Corollary 1

With reference to (6), for any suitably regular path \(\sigma :[0,h]\rightarrow \mathbb {R}^{m}\) one has:

$$ \gamma_{j}(\sigma)=O(h^{j}), \qquad \rho_{ij}(\sigma)=O(h^{|i-j|}), \qquad \forall i,j=0,1,\dots. $$
(7)

Proof

Immediate from Lemma 1, by taking into account (6).□

We also state, without proof, the following straightforward property, deriving from the skew-symmetry of B.

Lemma 2

With reference to (6), for any path \(\sigma :[0,h]\rightarrow \mathbb {R}^{m}\) one has:

$$ \rho_{ij}(\sigma) = \rho_{ji}(\sigma) = -\rho_{ij}(\sigma)^{\top}, \qquad \forall i,j=0,1,\dots. $$
(8)

Taking into account (6), the right-hand side in (4) can be rewritten as:

$$ \dot y(ch) = B(y(ch))\nabla H(y(ch)) = {\sum}_{j\ge0} P_{j}(c) B(y(ch)) \gamma_{j}(y) = {\sum}_{i,j\ge 0} P_{i}(c)\rho_{ij}(y) \gamma_{j}(y), \qquad c\in[0,1], $$
(9)

from which one obtains that the solution of (4) can be formally written as:

$$ y(ch) = y_{0} + h{\sum}_{i,j\ge0} {{\int}_{0}^{c}}P_{i}(x)\mathrm{d} x\rho_{ij}(y) \gamma_{j}(y), \qquad c\in[0,1]. $$
(10)

In particular, by considering (5) and that \(P_{0}(c)\equiv 1\), from which \({\int}_{0}^{1}P_{i}(x)\mathrm{d} x=\delta_{i0}\), one has:

$$ y(h) = y_{0} + h{\sum}_{j\ge 0}\rho_{0j}(y) \gamma_{j}(y) \equiv y_{0} + h{\sum}_{j\ge0} {{\int}_{0}^{1}} P_{j}(c)B(y(ch))\mathrm{d} c{{\int}_{0}^{1}} P_{j}(c) \nabla H(y(ch))\mathrm{d} c. ~~ $$
(11)

In order to obtain a polynomial approximation of degree s to y, it suffices to truncate the two infinite series in (9) after s terms:

$$ \dot \sigma(ch) = {\sum}_{i,j=0}^{s-1} P_{i}(c)\rho_{ij}(\sigma) \gamma_{j}(\sigma), \qquad c\in[0,1], $$
(12)

with \(\rho_{ij}(\sigma)\) and \(\gamma_{j}(\sigma)\) defined according to (6) by formally replacing y with σ. Consequently, (10) becomes

$$ \sigma(ch) = y_{0} + h{\sum}_{i,j=0}^{s-1} {{\int}_{0}^{c}}P_{i}(x)\mathrm{d} x\rho_{ij}(\sigma) \gamma_{j}(\sigma), \qquad c\in[0,1], $$
(13)

providing the approximation

$$ y_{1}:=\sigma(h) = y_{0} + h{\sum}_{j=0}^{s-1}\rho_{0j}(\sigma) \gamma_{j}(\sigma) \equiv y_{0} + h{\sum}_{j=0}^{s-1} {{\int}_{0}^{1}} P_{j}(c)B(\sigma(ch))\mathrm{d} c{{\int}_{0}^{1}} P_{j}(c) \nabla H(\sigma(ch))\mathrm{d} c, $$
(14)

in place of (11).

2.1 Interpretation of σ

We now provide an interesting interpretation of the polynomial approximation σ. For this purpose, let us rewrite (9), by taking into account (6), as follows:

$$ \begin{array}{@{}rcl@{}} \dot y(ch) &=& {\sum}_{i,j\ge0} P_{i}(c) \rho_{ij}(y)\gamma_{j}(y)\\ &=&{\sum}_{i\ge0} P_{i}(c) {\sum}_{j\ge0} {{\int}_{0}^{1}} P_{i}(\tau) B(y(\tau h))P_{j}(\tau)\mathrm{d}\tau{{\int}_{0}^{1}} P_{j}(\tau_{1})\nabla H(y(\tau_{1}h))\mathrm{d}\tau_{1}\\ &=&{\sum}_{i\ge0} P_{i}(c) {{\int}_{0}^{1}} P_{i}(\tau) B(y(\tau h))\left( \underbrace{{\sum}_{j\ge0}P_{j}(\tau){{\int}_{0}^{1}} P_{j}(\tau_{1})\nabla H(y(\tau_{1}h))\mathrm{d}\tau_{1}}_{=\nabla H(y(\tau h))}\right)\mathrm{d}\tau\\ &\equiv& B(y(ch))\nabla H(y(ch)), \end{array} $$

as expected. In a similar way, we can rewrite (12) as:

$$ \begin{array}{@{}rcl@{}} \dot \sigma(ch) &=& {\sum}_{i,j=0}^{s-1} P_{i}(c) \rho_{ij}(\sigma)\gamma_{j}(\sigma)\\ &=&{\sum}_{i=0}^{s-1} P_{i}(c) {\sum}_{j=0}^{s-1} {{\int}_{0}^{1}} P_{i}(\tau) B(\sigma(\tau h))P_{j}(\tau)\mathrm{d}\tau{{\int}_{0}^{1}} P_{j}(\tau_{1})\nabla H(\sigma(\tau_{1}h))\mathrm{d}\tau_{1}\\ &=&{\sum}_{i=0}^{s-1} P_{i}(c) {{\int}_{0}^{1}} P_{i}(\tau) B(\sigma(\tau h))\left( {\sum}_{j=0}^{s-1}P_{j}(\tau){{\int}_{0}^{1}} P_{j}(\tau_{1})\nabla H(\sigma(\tau_{1}h))\mathrm{d}\tau_{1}\right)\mathrm{d}\tau\\ &=:& {\sum}_{i=0}^{s-1} P_{i}(c) {{\int}_{0}^{1}} P_{i}(\tau) B(\sigma(\tau h))\left[\nabla H(\sigma(\tau h))\right]_{s}\mathrm{d}\tau\\ &\equiv& \left[ B(\sigma(\tau h))\left[\nabla H(\sigma(\tau h))\right]_{s}\right]_{s}, \end{array} $$

having denoted by \([\cdot]_{s}\) the best approximation in \({\Pi}_{s-1}\) (i.e., \([\cdot]_{s}\) is the best polynomial approximation of degree s − 1) of its argument. This fact provides a noticeable interpretation of the polynomial approximation σ, which is the solution of the initial value problem

$$ \dot\sigma(ch) = \left[B(\sigma(ch))\left[\nabla H(\sigma(ch))\right]_{s}\right]_{s}, \qquad c\in[0,1], \qquad \sigma(0)=y_{0}, $$
(15)

equivalent to (12). Thus, the vector field of (15) is defined by a double projection procedure onto the finite dimensional vector space \({\Pi}_{s-1}\) which involves, in turn, the vector fields ∇H(σ(ch)) and \(B(\sigma(ch))\left[\nabla H(\sigma(ch))\right]_{s}\), respectively.
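The projection \([\cdot]_{s}\) is nothing but a truncated Legendre expansion, and can be coded in a few lines. The following self-contained Python sketch (an illustration under our own naming conventions) evaluates \([f]_{s}\) for a scalar function f:

```python
import numpy as np
from numpy.polynomial.legendre import leggauss, legval

def P(j, x):
    # Orthonormal shifted Legendre polynomial of degree j on [0,1], cf. (5).
    cf = np.zeros(j + 1); cf[j] = 1.0
    return np.sqrt(2*j + 1) * legval(2*np.asarray(x) - 1.0, cf)

x, w = leggauss(60)
tau, b = (x + 1) / 2, w / 2        # high-order quadrature on [0,1]

def proj(f, s, c):
    # [f]_s: truncated Legendre expansion of f, i.e., its best (L^2) approximation
    # in Pi_{s-1} on [0,1], evaluated at the points c.
    coef = [np.sum(b * P(j, tau) * f(tau)) for j in range(s)]
    return sum(coef[j] * P(j, c) for j in range(s))

# Example: the best quadratic (s = 3) approximation of exp on [0,1].
f = lambda c: np.exp(c)
c = np.linspace(0, 1, 5)
print(proj(f, 3, c) - f(c))        # small residual, of the size of the neglected modes
```

The double projection in (15) then amounts to applying `proj` componentwise to \(\nabla H(\sigma(\cdot\,h))\), and again to its product with \(B(\sigma(\cdot\,h))\).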

2.2 Analysis

We now analyze the method (12)–(14). The following result then holds true, stating that the method is energy-conserving.

Theorem 1

H(y1) = H(y0).

Proof

In fact, by virtue of (1), (6), and (12)–(14) one has, by using the standard line integral argument:

$$ \begin{array}{@{}rcl@{}} \lefteqn{H(y_{1})-H(y_{0})~=~ H(\sigma(h))-H(\sigma(0)) ~=~ {{\int}_{0}^{h}} \nabla H(\sigma(t))^{\top}\dot\sigma(t)\mathrm{d} t}\\ &=& h{{\int}_{0}^{1}} \nabla H(\sigma(ch))^{\top}\dot\sigma(ch)\mathrm{d} c~=~ h{{\int}_{0}^{1}} \nabla H(\sigma(ch))^{\top} {\sum}_{i,j=0}^{s-1} P_{i}(c)\rho_{ij}(\sigma) \gamma_{j}(\sigma)\mathrm{d} c\\ &=&h {\sum}_{i,j=0}^{s-1} \left[{\int}_{0}^{1} P_{i}(c)\nabla H(\sigma(ch))\mathrm{d} c\right]^{\top}\rho_{ij}(\sigma) \gamma_{j}(\sigma) ~=~h {\sum}_{i,j=0}^{s-1} \gamma_{i}(\sigma)^{\top}\rho_{ij}(\sigma) \gamma_{j}(\sigma) ~=~0, \end{array} $$

where the last equality follows from (8).□

Concerning the accuracy of the new approximation, the following result holds true.

Theorem 2

Let \(y_{1}\) be defined according to (12)–(14). Then, \(y_{1}-y(h) = O(h^{2s+1})\).

Proof

Let y(t) ≡ y(t,ξ,η) denote the solution of the initial value problem (see (1))

$$ \dot y = B(y) \nabla H(y) =:F(y), \qquad t\ge\xi, \qquad y(\xi)=\eta. $$
(16)

Moreover, let us denote

$${\Phi}(t,\xi,\eta) = \frac{\partial}{\partial \eta}y(t,\xi,\eta),\qquad t\ge \xi,$$

also recalling that

$$\qquad\frac{\partial}{\partial \xi}y(t,\xi,\eta) = - {\Phi}(t,\xi,\eta)F(\eta).$$

Then, by taking into account Lemma 1 and Corollary 1, and setting

$${\Psi}_{i}(\sigma) = {{\int}_{0}^{1}} P_{i}(c){\Phi}(h,ch,\sigma(ch))\mathrm{d} c =O(h^{i}), \qquad i=0,1,\dots,$$

one has:

$$ \begin{array}{@{}rcl@{}} \lefteqn{ y_{1}-y(h) ~=~\sigma(h)-y(h) = y(h,h,\sigma(h))-y(h,0,\sigma(0)) = {{\int}_{0}^{h}} \frac{\mathrm{d}}{\mathrm{d} t} y(h,t,\sigma(t))\mathrm{d} t}\\ &=&{{\int}_{0}^{h}} \left.\left[\frac{\partial}{\partial \xi} y(h,\xi,\sigma(t))\right|_{\xi=t} + \left.\frac{\partial}{\partial \eta}y(h,t,\eta)\right|_{\eta=\sigma(t)} \dot\sigma(t)\right] \mathrm{d} t\\ &=& {{\int}_{0}^{h}} \left[-{\Phi}(h,t,{ \sigma(t)})F(\sigma(t))+{\Phi}(h,t,{ \sigma(t)})\dot\sigma(t)\right]\mathrm{d} t \\ &=& -h{{\int}_{0}^{1}} {\Phi}(h,ch,{ \sigma(ch)})\left[ F(\sigma(ch))-\dot\sigma(ch)\right]\mathrm{d} c\\ &=&-h{{\int}_{0}^{1}} {\Phi}(h,ch,{ \sigma(ch)})\left[ B(\sigma(ch))\nabla H(\sigma(ch)) - {\sum}_{i,j=0}^{s-1} P_{i}(c)\rho_{ij}(\sigma)\gamma_{j}(\sigma)\right]\mathrm{d} c\\ &=&-h{{\int}_{0}^{1}} {\Phi}(h,ch,{ \sigma(ch)})\left[ {\sum}_{i,j\ge 0} P_{i}(c)\rho_{ij}(\sigma)\gamma_{j}(\sigma) - {\sum}_{i,j=0}^{s-1} P_{i}(c)\rho_{ij}(\sigma)\gamma_{j}(\sigma)\right]\mathrm{d} c\\ &=& -h\left[{\sum}_{i,j\ge0} {\Psi}_{i}(\sigma)\rho_{ij}(\sigma)\gamma_{j}(\sigma) -{\sum}_{i,j=0}^{s-1} {\Psi}_{i}(\sigma)\rho_{ij}(\sigma)\gamma_{j}(\sigma) \right]\\ &=&-h\left[ {\sum}_{i=0}^{s-1}{\sum}_{j\ge s} \underbrace{{\Psi}_{i}(\sigma)\rho_{ij}(\sigma)}_{=O(h^{j})}\gamma_{j}(\sigma) + {\sum}_{i\ge s}{\sum}_{j=0}^{s-1} {\Psi}_{i}(\sigma)\underbrace{\rho_{ij}(\sigma)\gamma_{j}(\sigma)}_{=O(h^{i})} + {\sum}_{i,j\ge s} {\Psi}_{i}(\sigma)\rho_{ij}(\sigma)\gamma_{j}(\sigma) \right]\\ &=&O(h^{2s+1}). \end{array} $$

Finally, we observe that the procedure (12)–(14) is equivalent to defining the path σ joining σ(0) = y0 to σ(h) = y1: consequently, the same procedure, when started at y0 and moving forward, provides y1 and, when started at y1 and moving backward, brings back y0. In other words, the following result holds true.

Theorem 3

The procedure (12)–(14) is symmetric.

Proof

This result comes as an easy consequence of Theorem 11, where the analogous property for the fully discretized method is shown. □

Remark 1

We conclude this section emphasizing that, when problem (1) is Hamiltonian, i.e., in the form (3), then the matrix B(y) ≡ J is constant and, therefore (see (6)), \(\rho_{ij}(\sigma) = \delta_{ij}J\). Consequently, (13) becomes

$$\sigma(ch) = y_{0}+h{\sum}_{j=0}^{s-1} {{\int}_{0}^{c}} P_{j}(x)\mathrm{d} x {{\int}_{0}^{1}} P_{j}(\tau)J\nabla H(\sigma(\tau h))\mathrm{d}\tau, \qquad c\in[0,1],$$

which (see [6, 8, 10]) is the so-called master functional equation defining the class of energy-conserving methods named Hamiltonian Boundary Value Methods (HBVMs). Consequently, when the problem is Hamiltonian, then the procedure (12)–(14) reduces to the HBVM\((\infty ,s)\) method in [8].

2.3 Conservation of Casimirs

In this section, we study the modifications required in order to conserve Casimirs, i.e., functions satisfying (2). For the sake of simplicity, we shall consider the case of a single Casimir; multiple Casimirs can be handled by slightly adapting the arguments, as sketched at the end of the section. To begin with, for the original problem (4), and its equivalent formulation (9), one has:

$$ \begin{array}{@{}rcl@{}} 0 &=& C(y(h))-C(y_{0}) = {{\int}_{0}^{h}} \nabla C(y(t))^{\top} \dot y(t)\mathrm{d} t = h{{\int}_{0}^{1}} \nabla C(y(ch))^{\top}\dot y(ch)\mathrm{d} c \\ &=& h{{\int}_{0}^{1}} \nabla C(y(ch))^{\top} B(y(ch))\nabla H(y(ch))\mathrm{d} c =h{{\int}_{0}^{1}} \nabla C(y(ch))^{\top}{\sum}_{i,j\ge0} P_{i}(c) \rho_{ij}(y)\gamma_{j}(y)\mathrm{d} c \\ &=& h{\sum}_{i,j\ge0}\left[ {{\int}_{0}^{1}} P_{i}(c)\nabla C(y(ch))\mathrm{d} c\right]^{\top} \rho_{ij}(y)\gamma_{j}(y) ~=:~ h{\sum}_{i,j\ge0} \pi_{i}(y)^{\top}\rho_{ij}(y)\gamma_{j}(y), \end{array} $$
(17)

having set

$$ \pi_{i}(y) = {{\int}_{0}^{1}} P_{i}(c)\nabla C(y(ch))\mathrm{d} c = O(h^{i}), $$
(18)

the i-th Fourier coefficient of the gradient of the Casimir, with the last equality following from Lemma 1. Clearly, again from (2), one derives, by taking into account (12):

$$ \begin{array}{@{}rcl@{}} \lefteqn{C(y_{1})-C(y_{0}) ~=~C(\sigma(h))-C(\sigma(0)) ~=~{{\int}_{0}^{h}} \nabla C(\sigma(t))^{\top} \dot \sigma(t)\mathrm{d} t ~=~ h{{\int}_{0}^{1}} \nabla C(\sigma(ch))^{\top}\dot \sigma(ch)\mathrm{d} c}\\ &=& h{{\int}_{0}^{1}} \nabla C(\sigma(ch))^{\top}\left[\dot \sigma(ch)-B(\sigma(ch))\nabla H(\sigma(ch))\right]\mathrm{d} c\\ &&+~\overbrace{h{{\int}_{0}^{1}} \nabla C(\sigma(ch))^{\top} B(\sigma(ch))\nabla H(\sigma(ch))\mathrm{d} c}^{=0}\\ &=&-h{{\int}_{0}^{1}} \nabla C(\sigma(ch))^{\top}\left[ {\sum}_{i,j\ge0} P_{i}(c) \rho_{ij}(\sigma)\gamma_{j}(\sigma) - {\sum}_{i,j=0}^{s-1} P_{i}(c) \rho_{ij}(\sigma)\gamma_{j}(\sigma)\right]\mathrm{d} c\\ &=& -h\left[ {\sum}_{i,j\ge s}\pi_{i}(\sigma)^{\top}\rho_{ij}(\sigma)\gamma_{j}(\sigma) + {\sum}_{i=0}^{s-1}{\sum}_{j\ge s}\underbrace{\pi_{i}(\sigma)^{\top}\rho_{ij}(\sigma)}_{=O(h^{j})}\gamma_{j}(\sigma)+ {\sum}_{j=0}^{s-1}{\sum}_{i\ge s}\pi_{i}(\sigma)^{\top}\underbrace{\rho_{ij}(\sigma)\gamma_{j}(\sigma)}_{=O(h^{i})}\right] \\&=&O(h^{2s+1}). \end{array} $$
(19)

In order to recover the conservation of Casimirs, we shall use a strategy akin to that used in [18] for HBVMs (see also [4]), i.e., suitably perturbing some of the method's coefficients. In more detail, let us consider the following modified polynomial in place of (12):

$$ \dot \sigma_{\alpha}(ch) = {\sum}_{i,j=0}^{s-1} P_{i}(c)\rho_{ij}(\sigma_{\alpha}) \gamma_{j}(\sigma_{\alpha}) - \alpha \tilde{B}\gamma_{0}(\sigma_{\alpha}), \qquad c\in[0,1], \qquad \sigma_{\alpha}(0)=y_{0}, $$
(20)

with \(\tilde{B}^{\top}=-\tilde{B}\ne O\) an arbitrary skew-symmetric matrix. As is usual, the new approximation will be \(y_{1}:=\sigma_{\alpha}(h)\). In other words, we have considered the following perturbed coefficient:

$$\rho_{00}(\sigma_{\alpha}) - \alpha \tilde{B},$$

in place of ρ00(σ) in (12). The following result holds true.

Theorem 4

Assume that \(\pi _{0}(\sigma _{\alpha })^{\top } \tilde {B}\gamma _{0}(\sigma _{\alpha })\ne 0\). Then the Casimir C(y) is conserved, provided that

$$ \alpha = \frac{{\sum}_{i,j=0}^{s-1} \pi_{i}(\sigma_{\alpha})^{\top} \rho_{ij}(\sigma_{\alpha})\gamma_{j}(\sigma_{\alpha})}{\pi_{0}(\sigma_{\alpha})^{\top} \tilde{B}\gamma_{0}(\sigma_{\alpha})}. $$
(21)

Moreover, \(\alpha = O(h^{2s})\).

Proof

In fact, by repeating similar steps as in (19), and replacing σ by σα, as defined in (20), one obtains:

$$ \begin{array}{@{}rcl@{}} \lefteqn{C(y_{1})-C(y_{0}) ~=~C(\sigma_{\alpha}(h))-C(\sigma_{\alpha}(0))~=~h{{\int}_{0}^{1}} \nabla C(\sigma_{\alpha}(ch))^{\top}\dot \sigma_{\alpha}(ch)\mathrm{d} c}\\ &=&h{{\int}_{0}^{1}} \nabla C(\sigma_{\alpha}(ch))^{\top}\left[{\sum}_{i,j=0}^{s-1} P_{i}(c) \rho_{ij}(\sigma_{\alpha})\gamma_{j}(\sigma_{\alpha})-\alpha \tilde{B}\gamma_{0}(\sigma_{\alpha})\right]\mathrm{d} c\\ &=& h\left[ {\sum}_{i,j=0}^{s-1} \pi_{i}(\sigma_{\alpha})^{\top}\rho_{ij}(\sigma_{\alpha})\gamma_{j}(\sigma_{\alpha})- \alpha \pi_{0}(\sigma_{\alpha})^{\top} \tilde{B}\gamma_{0}(\sigma_{\alpha})\right] ~=~0, \end{array} $$

provided that (21) holds true. The statement is completed by observing that the numerator is \(O(h^{2s})\), whereas the denominator is \(O(1)\). □

We now prove that the results of Theorems 1 and 2 continue to hold for the polynomial (20).

Theorem 5

For any α: \(H(\sigma_{\alpha}(h)) = H(\sigma_{\alpha}(0))\).

Proof

Following similar steps as in the proof of Theorem 1, one has:

$$ \begin{array}{@{}rcl@{}} \lefteqn{H(\sigma_{\alpha}(h))-H(\sigma_{\alpha}(0)) = {{\int}_{0}^{h}} \nabla H(\sigma_{\alpha}(t))^{\top}\dot\sigma_{\alpha}(t)\mathrm{d} t}\\ &=& h{{\int}_{0}^{1}} \nabla H(\sigma_{\alpha}(ch))^{\top}\dot\sigma_{\alpha}(ch)\mathrm{d} c~ = ~ h{{\int}_{0}^{1}} \nabla H(\sigma_{\alpha}(ch))^{\top} {\sum}_{i,j=0}^{s-1} P_{i}(c)\rho_{ij}(\sigma_{\alpha}) \gamma_{j}(\sigma_{\alpha})\mathrm{d} c\\ &&- h\alpha\left[{\int}_{0}^{1}\nabla H(\sigma_{\alpha}(ch)) \mathrm{d} c\right]^{\top} \tilde{B}\gamma_{0}(\sigma_{\alpha})\\ &=&h \underbrace{{\sum}_{i,j=0}^{s-1} \gamma_{i}(\sigma_{\alpha})^{\top}\rho_{ij}(\sigma_{\alpha}) \gamma_{j}(\sigma_{\alpha})}_{=0} - h\alpha \gamma_{0}(\sigma_{\alpha})^{\top} \tilde{B}\gamma_{0}(\sigma_{\alpha}) ~=~0, \end{array} $$

due to the fact that \(\tilde {B}\) is skew-symmetric, independently of the considered value of the parameter α.□

Theorem 6

Assume that the parameter α in (20) is chosen according to (21). Then,

$$\sigma_{\alpha}(h)-y(h)=O(h^{2s+1}).$$

Proof

Repeating similar steps as those in the proof of Theorem 2 (and using the same notation), and taking into account (20), one arrives at:

$$ \begin{array}{@{}rcl@{}} \lefteqn{\sigma_{\alpha}(h)-y(h)}\\ &=& -h\left[{\sum}_{i,j\ge0} {\Psi}_{i}(\sigma_{\alpha})\rho_{ij}(\sigma_{\alpha})\gamma_{j}(\sigma_{\alpha}) -{\sum}_{i,j=0}^{s-1} {\Psi}_{i}(\sigma_{\alpha})\rho_{ij}(\sigma_{\alpha})\gamma_{j}(\sigma_{\alpha}) +\alpha {\Psi}_{0}(\sigma_{\alpha})\tilde{B}\gamma_{0}(\sigma_{\alpha})\right]\\ &=&-h\left[ {\sum}_{i=0}^{s-1}{\sum}_{j\ge s} \underbrace{{\Psi}_{i}(\sigma_{\alpha})\rho_{ij}(\sigma_{\alpha})}_{=O(h^{j})}\gamma_{j}(\sigma_{\alpha}) + {\sum}_{i\ge s}{\sum}_{j=0}^{s-1} {\Psi}_{i}(\sigma_{\alpha})\underbrace{\rho_{ij}(\sigma_{\alpha})\gamma_{j}(\sigma_{\alpha})}_{=O(h^{i})}\right.\\ &&\left. + {\sum}_{i,j\ge s} {\Psi}_{i}(\sigma_{\alpha})\rho_{ij}(\sigma_{\alpha})\gamma_{j}(\sigma_{\alpha})+\underbrace{\alpha {\Psi}_{0}(\sigma_{\alpha})\tilde{B}\gamma_{0}(\sigma_{\alpha})}_{=O(h^{2s})} \right] ~=~O(h^{2s+1}). \end{array} $$

Remark 2

We observe that the modified polynomial \(\sigma_{\alpha}\) in (20) is the solution of the approximate perturbed ODE-IVP:

$$ \dot\sigma_{\alpha}(ch) = \left[\left( B(\sigma_{\alpha}(ch))+\alpha \tilde{B}\right)\left[\nabla H(\sigma_{\alpha}(ch))\right]_{s}\right]_{s}, \qquad c\in[0,1], \qquad \sigma_{\alpha}(0)=y_{0}, $$
(22)

where the parameter α is such that C(σα(h)) = C(σα(0)). Clearly, when α = 0, one recovers the problem (15) defining σ.

We end this section by sketching the case when we have r independent Casimirs, so that \(C:\mathbb {R}^{m}\rightarrow \mathbb {R}^{r}\). In such a case, the notation introduced above formally still holds true, with the following differences:

  • The Fourier coefficients (see (18)) now become matrices, \(\pi_{i}(\sigma_{\alpha})\in\mathbb{R}^{m\times r}\), \(i=0,\dots,s-1\);

  • The polynomial (20) now becomes

    $$ \dot \sigma_{\alpha}(ch) = {\sum}_{i,j=0}^{s-1} P_{i}(c)\rho_{ij}(\sigma_{\alpha}) \gamma_{j}(\sigma_{\alpha}) - {\sum}_{\ell=1}^{r}\alpha_{\ell} \tilde{B}_{\ell}\gamma_{0}(\sigma_{\alpha}), \qquad c\in[0,1], \qquad \sigma_{\alpha}(0)=y_{0}, $$
    (23)

    having set \(\alpha =\left (\alpha _{1}, \dots , \alpha _{r}\right )^{\top }\) and with \(\tilde {B}_{i}^{\top }=-\tilde {B}_{i}\), \(i=1,\dots ,r\), arbitrary skew-symmetric matrices such that

    $$ M:= \left[ \pi_{0}(\sigma_{\alpha})^{\top} \tilde{B}_{1}\gamma_{0}(\sigma_{\alpha}), \dots, \pi_{0}(\sigma_{\alpha})^{\top} \tilde{B}_{r}\gamma_{0}(\sigma_{\alpha})\right]\in\mathbb{R}^{r\times r} $$
    (24)

    is nonsingular;

  • The vector α, providing the conservation of all Casimirs, is given by (compare with (21))

    $$ \alpha = M^{-1} {\sum}_{i,j=0}^{s-1} \pi_{i}(\sigma_{\alpha})^{\top} \rho_{ij}(\sigma_{\alpha})\gamma_{j}(\sigma_{\alpha}). $$
    (25)

3 Discretization

The procedure (12)–(14) described in the previous section is not yet a ready-to-use numerical method. In fact, for this to happen, the integrals \(\gamma_{j}(\sigma),\rho_{ij}(\sigma)\), \(i,j=0,\dots,s-1\), defined in (6) need to be conveniently computed or approximated. For this purpose, as was done in the case of HBVMs [8], we shall use a Gauss-Legendre quadrature formula of order 2k, i.e., the interpolatory quadrature rule based on the zeros of \(P_{k}(c)\), with abscissae and weights \((c_{i},b_{i})\), for a convenient value \(k\ge s\). In so doing, we shall in general obtain a new polynomial approximation \(u\in{\Pi}_{s}\), in place of σ as defined in (12)–(13):

$$ \begin{array}{@{}rcl@{}} \dot u(ch) ={\sum}_{i,j=0}^{s-1}P_{i}(c)\hat\rho_{ij}(u)\hat\gamma_{j}(u),&& u(ch) = y_{0} + h{\sum}_{i,j=0}^{s-1}{{\int}_{0}^{c}} P_{i}(x)\mathrm{d} x\hat\rho_{ij}(u)\hat\gamma_{j}(u), \qquad c\in[0,1],\\ \\ \hat\gamma_{j}(u) = {\sum}_{\ell=1}^{k} b_{\ell} P_{j}(c_{\ell})\nabla H(u(c_{\ell} h)),&& \hat\rho_{ij}(u) = {\sum}_{\ell=1}^{k} b_{\ell} P_{i}(c_{\ell})P_{j}(c_{\ell})B(u(c_{\ell} h)),\quad i,j=0,\dots,s-1. \end{array} $$
(26)

Consequently, the new approximation to y(h) will be given by

$$ y_{1}:=u(h) = y_{0} + h{\sum}_{j=0}^{s-1}\hat\rho_{0j}(u) \hat\gamma_{j}(u) \equiv y_{0} + h{\sum}_{j=0}^{s-1} {\sum}_{\ell=1}^{k} b_{\ell} P_{j}(c_{\ell})B(u(c_{\ell} h)) {\sum}_{\ell=1}^{k} b_{\ell} P_{j}(c_{\ell}) \nabla H(u(c_{\ell} h)), $$
(27)

which is the discrete counterpart of (14).
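Before addressing the solution of the implicit relations (26), we note that the quadrature formulas themselves are straightforward to code. The following minimal Python sketch (our own illustration; the stage values `U` are taken as an input here, whereas in practice they solve the nonlinear system studied in Section 4) evaluates \(\hat\gamma_{j}(u)\) and \(\hat\rho_{ij}(u)\):

```python
import numpy as np
from numpy.polynomial.legendre import leggauss, legval

def P(j, x):
    # Orthonormal shifted Legendre polynomial of degree j on [0,1], cf. (5).
    cf = np.zeros(j + 1); cf[j] = 1.0
    return np.sqrt(2*j + 1) * legval(2*np.asarray(x) - 1.0, cf)

def hat_coefficients(B, gradH, U, c, b, s):
    # hat_gamma_j(u) and hat_rho_ij(u) in (26), given the values U[l] = u(c_l*h)
    # of the polynomial u at the Gauss-Legendre abscissae c with weights b.
    k = len(c)
    gH = [gradH(Ul) for Ul in U]
    BU = [B(Ul) for Ul in U]
    hat_gamma = [sum(b[l] * P(j, c[l]) * gH[l] for l in range(k))
                 for j in range(s)]
    hat_rho = [[sum(b[l] * P(i, c[l]) * P(j, c[l]) * BU[l] for l in range(k))
                for j in range(s)] for i in range(s)]
    # The new approximation (27) would then read:
    #   y1 = y0 + h * sum(hat_rho[0][j] @ hat_gamma[j] for j in range(s))
    return hat_gamma, hat_rho
```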

It is worth mentioning that, in a similar way as was done in Section 2.1 for the polynomial σ, for u one obtains, by virtue of (26):

$$ \begin{array}{@{}rcl@{}} \dot u(ch) &=& {\sum}_{i,j=0}^{s-1}P_{i}(c)\hat\rho_{ij}(u)\hat\gamma_{j}(u)\\ &=& {\sum}_{i=0}^{s-1} P_{i}(c) {\sum}_{j=0}^{s-1} {\sum}_{\ell=1}^{k} b_{\ell} P_{i}(c_{\ell}) B(u(c_{\ell} h))P_{j}(c_{\ell}) {\sum}_{\ell_{1}=1}^{k} b_{\ell_{1}}P_{j}(c_{\ell_{1}})\nabla H(u(c_{\ell_{1}}h))\\ &=& {\sum}_{i=0}^{s-1} P_{i}(c) {\sum}_{\ell=1}^{k} b_{\ell} P_{i}(c_{\ell}) B(u(c_{\ell} h)){\sum}_{j=0}^{s-1}P_{j}(c_{\ell}) {\sum}_{\ell_{1}=1}^{k} b_{\ell_{1}} P_{j}(c_{\ell_{1}})\nabla H(u(c_{\ell_{1}}h))\\ &=:& {\sum}_{i=0}^{s-1} P_{i}(c) {\sum}_{\ell=1}^{k} b_{\ell} P_{i}(c_{\ell}) B(u(c_{\ell} h)) [\nabla H(u(c_{\ell} h))]_{s}^{(2k)}\\ &\equiv&\left[ B(u(ch)) [\nabla H(u(ch))]_{s}^{(2k)}\right]_{s}^{(2k)}, \end{array} $$

having denoted by \([\cdot]_{s}^{(2k)}\) the approximate best approximation in \({\Pi}_{s-1}\) obtained by using a quadrature of order 2k to approximate the involved integrals. Consequently (compare with (15)), the polynomial u is the solution of the initial value problem:

$$ \dot u(ch) = \left[B(u(ch))\left[\nabla H(u(ch))\right]_{s}^{(2k)}\right]_{s}^{(2k)}, \qquad c\in[0,1], \qquad u(0)=y_{0}. $$
(28)

Remark 3

We observe that the polynomial approximation defined by the problem

$$\dot u(ch) = \left[B(u(ch))\left[\nabla H(u(ch))\right]_{s}\right]_{s}^{(2s)}, \qquad c\in[0,1], \qquad u(0)=y_{0},$$

corresponds to that provided by the methods in [19], when the Gauss-Legendre abscissae are used, and to the methods in [26, Definition 3.2], upon selecting the derivative space as \({\Pi}_{s-1}\). Similarly, some of the methods in [1] provide the approximation

$$\dot u(ch) = \left[B(u(ch))\left[\nabla H(u(ch))\right]_{s}^{(2k)}\right]_{s}^{(2s)}, \qquad c\in[0,1], \qquad u(0)=y_{0}.$$

Remark 4

When in (28) B(u(ch)) ≡ J, a constant skew-symmetric matrix, we recover the polynomial approximation provided by a HBVM(k,s) method:

$$ u(ch) = y_{0} + h{{\int}_{0}^{c}} \left[J\nabla H(u(\tau h))\right]_{s}^{(2k)}\mathrm{d}\tau, \qquad c\in[0,1]. $$
(29)

3.1 Analysis

As was done in Section 2.2 for the continuous procedure, let us now analyze the fully discrete method (26)–(27). To begin with, the following straightforward result holds true.

Theorem 7

If

$$ B\in{\Pi}_{\mu},\quad H\in{\Pi}_{\nu}, \qquad \text{with}\qquad \mu\le \frac{2k+1}s-2, \quad \nu\le \frac{2k}s, $$
(30)

then (see (26) and (6)),

$$ \hat\rho_{ij}(u) = \rho_{ij}(u), \qquad \hat\gamma_{j}(u)=\gamma_{j}(u), \qquad \forall i,j=0,\dots,s-1, $$
(31)

and, consequently, with reference to (26) and (13), one has \(u\equiv\sigma\).

Proof

In fact, if B is a polynomial of degree μ and H a polynomial of degree ν, the integrand defining \(\rho_{ij}(u)\) has degree at most \(\mu s+2s-2\), whereas that defining \(\gamma_{j}(u)\) has degree at most \(\nu s-1\). Consequently, these degrees do not exceed 2k − 1, when (30) holds true. As a result, the quadrature is exact, so that (31) is valid and, therefore, \(u\equiv\sigma\).□

Consequently, when (30) holds true, the method is energy-conserving and has order 2s, as stated by Theorems 1 and 2, respectively.
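As a worked instance, for the Lotka-Volterra problem (62) of Section 5 the entries of B(y) are \(\pm y_{1}y_{2}\), so that \(B\in{\Pi}_{2}\): the first condition in (30) then reads \(2\le(2k+1)/s-2\), i.e., \(k\ge(4s-1)/2\), which is satisfied by the pairs (k,s) = (4,1), (4,2), and (6,3) used in the numerical tests. On the other hand, H in (62) is not a polynomial (it contains logarithms), so that \(\hat\gamma_{j}(u)\ne\gamma_{j}(u)\) in general, and only the practical energy conservation granted by Theorem 9 below can be expected.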

Concerning energy conservation, the following additional result holds true, in the case where only H is a polynomial.

Theorem 8

If

$$ H\in{\Pi}_{\nu}, \qquad \text{with}\qquad \nu\le \frac{2k}s, $$
(32)

then H(y1) = H(y0).

Proof

In fact, in such a case \(\gamma_{j}(u)=\hat\gamma_{j}(u)\), \(j=0,\dots,s-1\), and the proof of Theorem 1 formally continues to hold, upon replacing σ with u, and \(\rho_{ij}\) with \(\hat\rho_{ij}\), due to the fact that (compare with (8))

$$ \hat\rho_{ij}(u) = \hat\rho_{ji}(u) = -\hat\rho_{ij}(u)^{\top}, \qquad \forall i,j=0,1,\dots,s-1. $$
(33)

When (31) does not hold true, there is a quadrature error that, under suitable regularity assumptions, can be easily seen to be given by (see (6)):

$$ \begin{array}{@{}rcl@{}} \hat\rho_{ij}(u) - \rho_{ij}(u) &=& \chi_{ij}(h) ~\equiv~ O(h^{2k-i-j}), \\ \hat\gamma_{j}(u)-\gamma_{j}(u) &=& {\Delta}_{j}(h)~\equiv~ O(h^{2k-j}), \qquad \forall i,j=0,\dots,s-1. \end{array} $$
(34)

Nonetheless, also in this case it is straightforward to verify that (compare with (7)),

$$ \forall k\ge s: \qquad \hat\gamma_{j}(u)=O(h^{j}), \qquad \hat\rho_{ij}(u)=O(h^{|i-j|}), \qquad \forall i,j=0,1,\dots,s-1. $$
(35)

Consequently, with reference to the approximation y1 defined in (27), the following result is easily obtained, when (32) is not valid.

Theorem 9

\(\forall k\ge s\): \(H(y_{1}) = H(y_{0}) + O(h^{2k+1})\).

Proof

In fact, using arguments similar to those used in the proof of Theorem 1, one has, by taking into account (33)–(35):

$$ \begin{array}{@{}rcl@{}} \lefteqn{H(y_{1})-H(y_{0})~=~ H(u(h))-H(u(0)) = {{\int}_{0}^{h}} \nabla H(u(t))^{\top}\dot u(t)\mathrm{d} t}\\ &=& h{{\int}_{0}^{1}} \nabla H(u(ch))^{\top}\dot u(ch)\mathrm{d} c~=~ h{{\int}_{0}^{1}} \nabla H(u(ch))^{\top} {\sum}_{i,j=0}^{s-1} P_{i}(c)\hat\rho_{ij}(u) \hat\gamma_{j}(u)\mathrm{d} c\\ &=&h {\sum}_{i,j=0}^{s-1} \underbrace{\left[{\int}_{0}^{1} P_{i}(c)\nabla H(u(ch))\mathrm{d} c\right]^{\top}}_{=\gamma_{i}(u)^{\top}}\hat\rho_{ij}(u) \left[\gamma_{j}(u)+{\Delta}_{j}(h)\right] \\ &=& h {\sum}_{i,j=0}^{s-1} \underbrace{\gamma_{i}(u)^{\top}\hat\rho_{ij}(u) \gamma_{j}(u)}_{=0}~+~h {\sum}_{i,j=0}^{s-1} \gamma_{i}(u)^{\top}\hat\rho_{ij}(u) {\Delta}_{j}(h)\\ &=&h \left[{\sum}_{j=0}^{s-1} {\sum}_{i=j}^{s-1} \underbrace{\gamma_{i}(u)^{\top}\hat\rho_{ij}(u)}_{=O(h^{2i-j})}{\Delta}_{j}(h) + {\sum}_{j=0}^{s-1} {\sum}_{i=0}^{j-1} \gamma_{i}(u)^{\top}\underbrace{\hat\rho_{ij}(u){\Delta}_{j}(h)}_{=O(h^{2k-i})}\right] ~=~O(h^{2k+1}). \end{array} $$

Concerning the accuracy of the approximation (27), the following result, stating that the convergence order of Theorem 2 is retained, holds true.

Theorem 10

\(\forall k\ge s\): \(y_{1}-y(h) = O(h^{2s+1})\).

Proof

By using arguments and notations similar to those used in the proof of Theorem 2, one has, by taking into account (34) and that \(k\ge s\):

$$ \begin{array}{@{}rcl@{}} \lefteqn{ y_{1}-y(h) ~=~u(h)-y(h) = y(h,h,u(h))-y(h,0,u(0)) = {{\int}_{0}^{h}} \frac{\mathrm{d}}{\mathrm{d} t} y(h,t,u(t))\mathrm{d} t}\\ &=&{{\int}_{0}^{h}} \left[\left.\frac{\partial}{\partial \xi} y(h,\xi,u(t))\right|_{\xi=t} + \left.\frac{\partial}{\partial \eta} y(h,t,\eta)\right|_{\eta=u(t)} \dot u(t)\right] \mathrm{d} t\\ &=& {{\int}_{0}^{h}}\left[ -{\Phi}(h,t,{ u(t)})F(u(t))+{\Phi}(h,t,{ u(t)})\dot u(t)\right]\mathrm{d} t \\ &=& -h{{\int}_{0}^{1}} {\Phi}(h,ch,{ u(ch)})\left[ F(u(ch))-\dot u(ch)\right]\mathrm{d} c\\ &=&-h{{\int}_{0}^{1}} {\Phi}(h,ch,{ u(ch)})\left[ B(u(ch))\nabla H(u(ch)) - {\sum}_{i,j=0}^{s-1} P_{i}(c)\hat\rho_{ij}(u)\hat\gamma_{j}(u)\right]\mathrm{d} c\\ \end{array} $$
$$ \begin{array}{@{}rcl@{}} &=&\underbrace{-h{{\int}_{0}^{1}} {\Phi}(h,ch,{ u(ch)})\left[ {\sum}_{i,j\ge0} P_{i}(c)\rho_{ij}(u)\gamma_{j}(u)- {\sum}_{i,j=0}^{s-1} P_{i}(c)\rho_{ij}(u)\gamma_{j}(u)\right]\mathrm{d} c}_{=O(h^{2s+1}), \text{\small ~from the proof of Theorem~2}}\\ &&-h{{\int}_{0}^{1}} {\Phi}(h,ch,{ u(ch)})\left[ {\sum}_{i,j=0}^{s-1} P_{i}(c)\rho_{ij}(u)\gamma_{j}(u)- {\sum}_{i,j=0}^{s-1} P_{i}(c)\hat\rho_{ij}(u)\hat\gamma_{j}(u)\right]\mathrm{d} c\\ &=& O(h^{2s+1}) -h{\sum}_{i,j=0}^{s-1}\underbrace{{{\int}_{0}^{1}} P_{i}(c) {\Phi}(h,ch,{ u(ch)})\mathrm{d} c}_{={\Psi}_{i}(u)}\left[ \rho_{ij}(u)\gamma_{j}(u)-\hat\rho_{ij}(u)\hat\gamma_{j}(u)\right]\\ &=& O(h^{2s+1}) -h{\sum}_{i,j=0}^{s-1} {\Psi}_{i}(u)\left[ \rho_{ij}(u)\gamma_{j}(u)-\left( \rho_{ij}(u)+\chi_{ij}(h)\right)\left( \gamma_{j}(u)+{\Delta}_{j}(h)\right)\right]\\ &=& O(h^{2s+1}) +h{\sum}_{i,j=0}^{s-1} {\Psi}_{i}(u)\left[ \rho_{ij}(u){\Delta}_{j}(h) +\underbrace{\chi_{ij}(h)\gamma_{j}(u)}_{=O(h^{2k-i})}+\underbrace{\chi_{ij}(h){\Delta}_{j}(h)}_{=O(h^{4k-2j-i})}\right]\\ &=&O(h^{2s+1}) + O(h^{2k+1}) + h{\sum}_{i,j=0}^{s-1} {\Psi}_{i}(u)\rho_{ij}(u){\Delta}_{j}(h) ~=~O(h^{2s+1}) + h{\sum}_{i,j=0}^{s-1} {\Psi}_{i}(u)\rho_{ij}(u){\Delta}_{j}(h). \end{array} $$

Concerning the latter sum, one has:

$$ \begin{array}{@{}rcl@{}} h{\sum}_{i,j=0}^{s-1} {\Psi}_{i}(u)\rho_{ij}(u){\Delta}_{j}(h) &=&h{\sum}_{i=0}^{s-1}{\sum}_{j=i}^{s-1} \underbrace{{\Psi}_{i}(u)\rho_{ij}(u)}_{=O(h^{j})}{\Delta}_{j}(h) ~+~h{\sum}_{i=0}^{s-1}{\sum}_{j=0}^{i-1} {\Psi}_{i}(u)\underbrace{\rho_{ij}(u){\Delta}_{j}(h)}_{=O(h^{2k-2j+i})}\\&=&O(h^{2k+1}) + O(h^{2k+2}) ~=~ O(h^{2k+1}), \end{array} $$

and, consequently, the statement follows.□

Remark 5

By taking into account the results of Theorems 7 and 9, it follows that, by choosing k large enough, one obtains either the exact conservation of the Hamiltonian function, in the polynomial case, or its practical conservation, in the non-polynomial case. In fact, as has also been observed for HBVMs [11], in the latter case it is enough to choose k large enough so that the Hamiltonian error falls within the round-off error level of the finite precision arithmetic used in the simulation.

It is worth mentioning that a result similar to that of Theorem 3 holds true for the fully discrete method.

Theorem 11

The method (26)–(27) is symmetric, provided that the abscissae of the quadrature satisfy

$$ c_{k-i+1} = 1-c_{i}, \qquad i=1,\dots,k. $$
(36)

Proof

Symmetry of a one-step method \(y_{1}={\Phi}_{h}(y_{0})\) applied to an initial value problem \(y^{\prime}=f(y)\), \(y(0)=y_{0}\), means that \({\Phi}_{h}^{-1}={\Phi}_{-h}\), that is, applying the method to the state vector \(y_{1}\), but with the direction of time reversed, yields the initial state vector \(y_{0}\), independently of the choice of \(y_{0}\). In our context, with reference to (26)–(27), \({\Phi}_{h}\) is defined by

$$ y_{1} = y_{0} + h{\sum}_{j=0}^{s-1}\hat\rho_{0j} \hat\gamma_{j}, $$
(37)

where \(\hat \rho _{ij}\) and \(\hat \gamma _{j}\) are the solutions of the nonlinear system

$$ \begin{array}{@{}rcl@{}} \hat\gamma_{j} &=& {\sum}_{\ell=1}^{k} b_{\ell} P_{j}(c_{\ell})\nabla H\left( y_{0} + h{\sum}_{\mu,\nu=0}^{s-1}{\int}_{0}^{c_{\ell}} P_{\mu}(x)\mathrm{d} x\hat\rho_{\mu\nu}\hat\gamma_{\nu} \right), \\ \hat\rho_{ij} &=& {\sum}_{\ell=1}^{k} b_{\ell} P_{i}(c_{\ell})P_{j}(c_{\ell})B\left( y_{0} + h{\sum}_{\mu,\nu=0}^{s-1}{\int}_{0}^{c_{\ell}} P_{\mu}(x)\mathrm{d} x\hat\rho_{\mu\nu}\hat\gamma_{\nu} \right),\\ &&i,j=0,\dots,s-1. \end{array} $$
(38)

We can obtain the explicit formulation of \(\bar y_{0}={\Phi }_{-h}(y_{1})\) by introducing in (37)–(38) the following substitutions: y1 in place of y0 and − h in place of h. In so doing, we arrive at the method defined as

$$ \bar y_{0} = y_{1} - h{\sum}_{j=0}^{s-1}\bar\rho_{0j} \bar\gamma_{j}, $$
(39)

where the unknown quantities \(\bar \rho _{ij}\) and \(\bar \gamma _{j}\) satisfy the following nonlinear system

$$ \begin{array}{@{}rcl@{}} \bar\gamma_{j} &=& {\sum}_{\ell=1}^{k} b_{\ell} P_{j}(c_{\ell})\nabla H\left( y_{1} - h{\sum}_{\mu,\nu=0}^{s-1}{\int}_{0}^{c_{\ell}} P_{\mu}(x)\mathrm{d} x\bar\rho_{\mu\nu}\bar\gamma_{\nu}\right), \\ \displaystyle \bar\rho_{ij} &=& {\sum}_{\ell=1}^{k} b_{\ell} P_{i}(c_{\ell})P_{j}(c_{\ell})B\left( y_{1} - h{\sum}_{\mu,\nu=0}^{s-1}{\int}_{0}^{c_{\ell}} P_{\mu}(x)\mathrm{d} x\bar\rho_{\mu\nu}\bar\gamma_{\nu}\right), \\ &&i,j=0,\dots,s-1, \end{array} $$
(40)

and we want to show that \(\bar y_{0}=y_{0}\). To this end, we introduce the following variables:

$$ \gamma_{j}^{\ast} := (-1)^{j} \hat \gamma_{j}, \quad \rho_{ij}^{\ast} := (-1)^{i+j} \hat \rho_{ij}, \quad i,j=0,\dots,s-1. $$

Exploiting the symmetry property of the Legendre polynomials, \((-1)^{j}P_{j}(c) = P_{j}(1-c)\), from the first equation in (38) we get

$$ \begin{array}{@{}rcl@{}} \gamma_{j}^{\ast} & = & \displaystyle {\sum}_{\ell=1}^{k} b_{\ell} (-1)^{j}P_{j}(c_{\ell})\nabla H\left( y_{0} + h{\sum}_{\mu,\nu=0}^{s-1}\left( {{\int}_{0}^{1}} P_{\mu}(x){ \mathrm{d} x} - {\int}_{c_{\ell}}^{1} P_{\mu}(x)\mathrm{d} x \right)\hat\rho_{\mu\nu}\hat\gamma_{\nu} \right) \\ &=& {\sum}_{\ell=1}^{k} b_{\ell} P_{j}(1-c_{\ell})\nabla H\left( y_{1} - h{\sum}_{\mu,\nu=0}^{s-1} {\int}_{c_{\ell}}^{1} P_{\mu}(x)\mathrm{d} x \hat\rho_{\mu\nu}\hat\gamma_{\nu} \right). \end{array} $$

Introducing the change of variable \(\tau = 1-x\) transforms the latter integral as

$$ {\int}_{c_{\ell}}^{1} P_{\mu}(x)\mathrm{d} x = -{\int}_{1-c_{\ell}}^{0} P_{\mu}(1-\tau)\mathrm{d} \tau = (-1)^{\mu} {\int}_{0}^{1-c_{\ell}} P_{\mu}(\tau)\mathrm{d} \tau. $$

Exploiting the symmetry assumption (36), which in turn implies symmetric weights, \(b_{k-i+1} = b_{i}\), \(i=1,\dots,k\), we finally get

$$ \begin{array}{@{}rcl@{}} \gamma_{j}^{\ast} &=& \displaystyle {\sum}_{\ell=1}^{k} b_{k-\ell+1} P_{j}(c_{k-\ell+1})\nabla H\left( y_{1} - h{\sum}_{\mu,\nu=0}^{s-1} {\int}_{0}^{c_{k-\ell+1}} P_{\mu}(x)\mathrm{d} x (-1)^{\mu} (-1)^{\nu} \hat\rho_{\mu\nu} (-1)^{\nu} \hat \gamma_{\nu} \right)\\ &=& {\sum}_{\ell=1}^{k} b_{\ell} P_{j}(c_{\ell})\nabla H\left( y_{1} - h{\sum}_{\mu,\nu=0}^{s-1} {\int}_{0}^{c_{\ell}} P_{\mu}(x)\mathrm{d} x \rho^{\ast}_{\mu\nu} \gamma^{\ast}_{\nu} \right). \end{array} $$

The same chain of computations may be applied to the second equation in (38) to see that

$$ \displaystyle \rho^{\ast}_{ij} = {\sum}_{\ell=1}^{k} b_{\ell} P_{i}(c_{\ell})P_{j}(c_{\ell})B\left( y_{1} - h{\sum}_{\mu,\nu=0}^{s-1}{\int}_{0}^{c_{\ell}} P_{\mu}(x)\mathrm{d} x\rho^{\ast}_{\mu\nu}\gamma^{\ast}_{\nu}\right). $$

We then realize that \(\gamma _{j}^{\ast }\) and \(\rho ^{\ast }_{ij}\) satisfy the very same nonlinear system (40) governing the quantities \(\bar \gamma _{j}\) and \(\bar \rho _{ij}\). Thus we may conclude that

$$ \bar \gamma_{j}= (-1)^{j} \hat \gamma_{j}, \quad \bar \rho_{ij}= (-1)^{i+j} \hat \rho_{ij}, \quad i,j=0,\dots,s-1, $$

and hence from (39),

$$ \bar y_{0} = y_{1} - h{\sum}_{j=0}^{s-1}(-1)^{j} \hat \rho_{0j} (-1)^{j} \hat \gamma_{j} = y_{1} - h{\sum}_{j=0}^{s-1} \hat \rho_{0j} \hat \gamma_{j} =y_{0}. $$

In the limit case \(k\rightarrow \infty\), since the Gauss-Legendre quadrature formulae are convergent, we are led back to the symmetry property of the original non-discretized procedure (12)–(14), anticipated in Theorem 3.□

We conclude this section by emphasizing that, when problem (1) is in the form (3), then (see (26)) \(\hat\rho_{ij}(u) = \delta_{ij}J\) and, consequently, the polynomial approximation u becomes

$$u(ch) = y_{0}+h{\sum}_{j=0}^{s-1} {{\int}_{0}^{c}} P_{j}(x)\mathrm{d} x {\sum}_{i=1}^{k} b_{i} P_{j}(c_{i})J\nabla H(u(c_{i} h)), \qquad c\in[0,1],$$

which is equivalent to (29). As anticipated in Remark 4, this equation (see, e.g., the monograph [5] or the review paper [6, Section 2.2]) defines a Hamiltonian Boundary Value Method with parameters k and s, in short HBVM(k,s). Moreover, Theorems 7–10 also exactly describe, in the case of problem (3), their conservation and accuracy properties. For this reason, the methods presented here can be regarded as a generalization of HBVMs for solving Poisson problems, and this fact motivates the following definition.

Definition 1

We shall refer to the method (26)–(27) as a PHBVM(k,s) method.

3.2 Conservation of Casimirs

This section is devoted to the conservation of Casimirs for the discrete PHBVM(k,s) method (26)–(27), following similar steps as those in Section 2.3. For this purpose, it is convenient to define the approximate Fourier coefficients, for \(k\ge s\), which we assume hereafter:

$$ \hat\pi_{i}(\sigma) = {\sum}_{\ell=1}^{k} b_{\ell} P_{i}(c_{\ell})\nabla C(\sigma(c_{\ell} h)) = O(h^{i}), \qquad i=0,\dots,s-1. $$
(41)

The following straightforward result is reported without proof.

Lemma 3

Assume that \(\sigma\in{\Pi}_{s}\). Then, with reference to (18), for all \(i=0,\dots,s-1\), one has:

$$ \begin{array}{@{}rcl@{}} \hat\pi_{i}(\sigma) &=& \pi_{i}(\sigma), \qquad \text{if} \qquad C\in{\Pi}_{\nu},\quad \nu\le 2k/s,\\ \hat\pi_{i}(\sigma) &=& \pi_{i}(\sigma) - \theta_{i}(h), \qquad \theta_{i}(h)=O(h^{2k-i}), \qquad \text{otherwise.} \end{array} $$

Next, let us consider the following perturbed polynomial, in place of the polynomial u in (26),

$$ \dot u_{\alpha}(ch) ={\sum}_{i,j=0}^{s-1}P_{i}(c)\hat\rho_{ij}(u_{\alpha})\hat\gamma_{j}(u_{\alpha}) -\alpha \tilde{B}\hat{\gamma}_{0}(u_{\alpha}), \qquad c\in[0,1], \qquad u_{\alpha}(0)=y_{0}, $$
(42)

with \(\tilde{B}^{\top}=-\tilde{B}\ne O\) an arbitrary skew-symmetric matrix, and (compare with (21))

$$ \alpha = \frac{{\sum}_{i,j=0}^{s-1} \hat\pi_{i}(u_{\alpha})^{\top} \hat\rho_{ij}(u_{\alpha})\hat\gamma_{j}(u_{\alpha})}{\hat\pi_{0}(u_{\alpha})^{\top} \tilde{B}\hat\gamma_{0}(u_{\alpha})} = O(h^{2s}). $$
(43)

Theorem 12

If

$$ B\in{\Pi}_{\mu},\quad C,H\in{\Pi}_{\nu}, \qquad \text{with}\qquad \mu\le \frac{2k+1}s-2, \quad \nu\le \frac{2k}s, $$
(44)

then (see (26), (6), (18), and (41))

$$ \hat\rho_{ij}(u_{\alpha}) = \rho_{ij}(u_{\alpha}), \qquad \hat\gamma_{j}(u_{\alpha})=\gamma_{j}(u_{\alpha}), \qquad \hat\pi_{i}(u_{\alpha})=\pi_{i}(u_{\alpha}),\qquad \forall i,j=0,\dots,s-1, $$
(45)

and, consequently, with reference to (42) and (20), one has \(u_{\alpha}\equiv\sigma_{\alpha}\).

Otherwise, setting \(y_{1}:=u_{\alpha}(h)\) as the new approximation, Theorems 8, 9, 10, and 11 formally continue to hold. Moreover, the following result holds true.

Theorem 13

With reference to (42)–(43), one has:

$$ C(y_{1})-C(y_{0})=\left\{\begin{array}{cc} 0, &\text{if}\quad C\in{\Pi}_{\nu}, \quad\text{with}\quad \nu\le 2k/s,\\ O(h^{2k+1}), &\text{otherwise}. \end{array}\right.$$

Proof

By taking into account Lemma 3, one obtains:

$$ \begin{array}{@{}rcl@{}} \lefteqn{C(y_{1})-C(y_{0}) ~=~ h{{\int}_{0}^{1}} \nabla C(u_{\alpha}(ch))^{\top} \dot u_{\alpha}(ch)\mathrm{d} c}\\ &=& h{{\int}_{0}^{1}} \nabla C(u_{\alpha}(ch))^{\top}\left[ {\sum}_{i,j=0}^{s-1} P_{i}(c)\hat\rho_{ij}(u_{\alpha})\hat\gamma_{j}(u_{\alpha}) - \alpha \tilde{B}\hat\gamma_{0}(u_{\alpha})\right]\mathrm{d} c\\ &=& h\left[{\sum}_{i,j=0}^{s-1} \left( {{\int}_{0}^{1}} \nabla C(u_{\alpha}(ch))P_{i}(c)\mathrm{d} c\right)^{\top} \hat\rho_{ij}(u_{\alpha})\hat\gamma_{j}(u_{\alpha}) - \alpha \left( {{\int}_{0}^{1}} \nabla C(u_{\alpha}(ch))P_{0}(c)\mathrm{d} c\right)^{\top} \tilde{B}\hat\gamma_{0}(u_{\alpha})\right]\\ &=& h\left[{\sum}_{i,j=0}^{s-1} \pi_{i}(u_{\alpha})^{\top} \hat\rho_{ij}(u_{\alpha})\hat\gamma_{j}(u_{\alpha}) - \alpha \pi_{0}(u_{\alpha})^{\top} \tilde{B}\hat\gamma_{0}(u_{\alpha})\right]~=:~(*). \end{array} $$

In case \(C\in{\Pi}_{\nu}\) with \(\nu\le 2k/s\), then, by virtue of Lemma 3, \(\pi_{i}(u_{\alpha})=\hat\pi_{i}(u_{\alpha})\) and, consequently, (∗) = 0 because of (43). Otherwise, again from Lemma 3, one has:

$$ \begin{array}{@{}rcl@{}} (*)&=& h\left[{\sum}_{i,j=0}^{s-1} \left[\hat\pi_{i}(u_{\alpha})+\theta_{i}(h)\right]^{\top} \hat\rho_{ij}(u_{\alpha})\hat\gamma_{j}(u_{\alpha}) - \alpha \left[\hat\pi_{0}(u_{\alpha})+\theta_{0}(h)\right]^{\top} \tilde{B}\hat\gamma_{0}(u_{\alpha})\right]\\ &=& h\left[\underbrace{{\sum}_{i,j=0}^{s-1} \theta_{i}(h)^{\top} \hat\rho_{ij}(u_{\alpha})\hat\gamma_{j}(u_{\alpha})}_{=O(h^{2k})} - \underbrace{\alpha \theta_{0}(h)^{\top} \tilde{B}\hat\gamma_{0}(u_{\alpha})}_{=O(h^{2k+2s})}\right] ~=~O(h^{2k+1}). \end{array} $$

Remark 6

From the arguments exposed here, one deduces that, when using finite precision arithmetic, an (at least) practical conservation of both the Hamiltonian and the Casimirs can be obtained by choosing k large enough.

Definition 2

Following [18], we name enhanced PHBVM(k,s), in short EPHBVM(k,s), the method defined by (42)–(43).

Following Remark 2, the polynomial uα defined by an EPHBVM(k,s) method is the solution of the ODE-IVP (compare with (22)):

$$ \dot u_{\alpha}(ch) = \left[\left( B(u_{\alpha}(ch))+\alpha \tilde{B}\right)\left[\nabla H(u_{\alpha}(ch))\right]_{s}^{(2k)}\right]_{s}^{(2k)}, \quad c\in[0,1], \quad u_{\alpha}(0)=y_{0}. $$

Clearly, when α = 0 one retrieves the problem (28) defining the polynomial approximation of the PHBVM(k,s) method.

Remark 7

From the results of Theorems 8, 9, and 13, one deduces the clear advantage of choosing values of k suitably larger than s, in order to obtain a suitable conservation of non-quadratic Hamiltonians and/or Casimirs. This fact will be duly confirmed in the numerical tests. Moreover, this is not a serious drawback from the point of view of the computational cost since, as we shall see in the next section, the discrete problem to be solved will always have (block) dimension s, independently of k.

4 The discrete problem

In this section we deal with the efficient solution of the discrete problem generated by the PHBVM(k,s) method (26). For this purpose, we observe that only the values \(Y_{\ell} := u(c_{\ell} h)\), \(\ell=1,\dots,k\), are actually needed. Consequently, (26) can be rewritten as:

$$ \begin{array}{@{}rcl@{}} Y_{\ell} &=& y_{0} + h{\sum}_{i,j=0}^{s-1}{\int}_{0}^{c_{\ell}} P_{i}(x)\mathrm{d} x\hat\rho_{ij}\hat\gamma_{j}, \qquad \ell = 1,\dots,k,\\ \hat\gamma_{j} &=& {\sum}_{\ell=1}^{k} b_{\ell} P_{j}(c_{\ell})\nabla H(Y_{\ell}),\qquad \hat\rho_{ij} ~=~ {\sum}_{\ell=1}^{k} b_{\ell} P_{i}(c_{\ell})P_{j}(c_{\ell})B(Y_{\ell}),\qquad i,j=0,\dots,s-1, \end{array} $$
(46)

where, for the sake of brevity, we have omitted the argument u in \(\hat\gamma_{j}\) and \(\hat\rho_{ij}\), as was already done in (37)–(38). Equation (46) can be cast in matrix form by defining the block vectors and matrices

$$ Y = \left( \begin{array}{c} Y_{1}\\ {\vdots} \\ Y_{k} \end{array}\right)\in\mathbb{R}^{k\cdot m}, \quad {\boldsymbol{\gamma}} = \left( \begin{array}{c}\hat\gamma_{0}\\ \vdots\\ \hat\gamma_{s-1} \end{array}\right)\in\mathbb{R}^{s\cdot m}, \quad {\Gamma} = \left( \begin{array}{ccc} \hat\rho_{00} & {\dots} &\hat\rho_{0,s-1}\\ {\vdots} & &\vdots\\ \hat\rho_{s-1,0} & {\dots} &\hat\rho_{s-1,s-1} \end{array}\right)\in\mathbb{R}^{s\cdot m \times s\cdot m}, $$
(47)

and

$$ {\boldsymbol{e}} = \left( \begin{array}{c} 1\\ \vdots\\1 \end{array}\right) \in \mathbb{R}^{k}, \quad {\Omega} = \left( \begin{array}{ccc} b_{1}\\ &\ddots\\ &&b_{k} \end{array}\right), \quad \mathcal{I}_{s}=\left( {\int}_{0}^{c_{i}} P_{j-1}(x)\mathrm{d} x\right), ~ \mathcal{P}_{s}=\left( P_{j-1}(c_{i})\right) \in\mathbb{R}^{k\times s}. $$
(48)

In fact, by also denoting hereafter by \(I_{r}\) the identity matrix of dimension r, we can rewrite (46) as:

$$ Y = {\boldsymbol{e}}\otimes y_{0} + h(\mathcal{I}_{s}\otimes I_{m}) {\Gamma}{\boldsymbol{\gamma}}, \quad {\boldsymbol{\gamma}} = (\mathcal{P}_{s}^{\top}{\Omega}\otimes I_{m})\nabla H(Y), \quad {\Gamma} = (\mathcal{P}_{s}^{\top}{\Omega}\otimes I_{m}) \mathcal{B}(Y) (\mathcal{P}_{s}\otimes I_{m}), $$
(49)

where

$$\nabla H(Y) = \left( \begin{array}{c} \nabla H(Y_{1})\\ \vdots\\ \nabla H(Y_{k}) \end{array}\right)\qquad\text{and}\qquad \mathcal{B}(Y) = \left( \begin{array}{ccc} B(Y_{1})\\ &\ddots\\ && B(Y_{k}) \end{array}\right).$$
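For concreteness, the matrices (47)–(48) can be assembled directly from the Gauss-Legendre nodes, as in the following minimal Python sketch (the helper `P` for the orthonormal shifted Legendre polynomials and the function name are our own assumptions); the final assertion numerically verifies the identity \(\mathcal{P}_{s}^{\top}{\Omega}\mathcal{P}_{s}=I_{s}\), which will be used in (54) below.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss, legval

def P(j, x):
    # Orthonormal shifted Legendre polynomial of degree j on [0,1], cf. (5).
    cf = np.zeros(j + 1); cf[j] = 1.0
    return np.sqrt(2*j + 1) * legval(2*np.asarray(x) - 1.0, cf)

def phbvm_matrices(k, s):
    x, w = leggauss(k)
    c, b = (x + 1) / 2, w / 2                  # abscissae c_i and weights b_i on [0,1]
    Ps = np.array([[P(j, ci) for j in range(s)] for ci in c])   # P_s in (48), k x s
    Om = np.diag(b)                                             # Omega in (48)
    Is = np.empty((k, s))                      # I_s in (48): int_0^{c_i} P_{j-1}(x) dx
    for i, ci in enumerate(c):
        xi, wi = (x + 1) / 2 * ci, w / 2 * ci  # Gauss rule mapped to [0,c_i] (exact here)
        for j in range(s):
            Is[i, j] = np.sum(wi * P(j, xi))
    return c, b, Ps, Is, Om

c, b, Ps, Is, Om = phbvm_matrices(k=6, s=3)
assert np.allclose(Ps.T @ Om @ Ps, np.eye(3))  # cf. (54)
```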

As is clear, the discrete problem (49) can be further reformulated in terms of the product of the Fourier coefficients \(\hat \rho _{ij}\) and \(\hat \gamma _{j}\). In fact, by setting

$$ {\boldsymbol{\phi}} \equiv \left( \begin{array}{c} \phi_{0}\\ {\vdots} \\ \phi_{s-1} \end{array}\right) :={\Gamma}{\boldsymbol{\gamma}} \qquad \Rightarrow\qquad \phi_{i}={\sum}_{j=0}^{s-1}\hat\rho_{ij}\hat\gamma_{j}, \quad i=0,\dots,s-1, $$
(50)

the first equation in (49) becomes

$$ Y = {\boldsymbol{e}}\otimes y_{0} + h(\mathcal{I}_{s}\otimes I_{m}){\boldsymbol{\phi}}, $$
(51)

whereas multiplying, side by side, the third equation by the second one gives

$$ {\boldsymbol{\phi}} = (\mathcal{P}_{s}^{\top}{\Omega}\otimes I_{m}) \mathcal{B}(Y) (\mathcal{P}_{s}\mathcal{P}_{s}^{\top}{\Omega}\otimes I_{m})\nabla H(Y). $$
(52)

Consequently, substituting the right-hand side of (51) into the right-hand side of (52) provides the new discrete problem:

$$ \begin{array}{@{}rcl@{}} \mathcal{F}({\boldsymbol{\phi}})&:=&{\boldsymbol{\phi}} - \mathcal{P}_{s}^{\top}{\Omega}\otimes I_{m} \mathcal{B}\left( {\boldsymbol{e}}\otimes y_{0} + h\mathcal{I}_{s}\otimes I_{m} {\boldsymbol{\phi}}\right) (\mathcal{P}_{s}\mathcal{P}_{s}^{\top}{\Omega}\otimes I_{m})\nabla H\left( \boldsymbol{e}\otimes y_{0} + h\mathcal{I}_{s}\otimes I_{m} \boldsymbol{\phi}\right)\\ &=& \boldsymbol{0}. \end{array} $$
(53)

Moreover, computing the vector ϕ in (50) allows us to obtain the new approximation (27) as:

$$y_{1} = y_{0} + h\phi_{0}.$$

Remark 8

In the case where problem (1) is in the form (3), one has that \({\mathscr{B}}(Y)=I_{k}\otimes J\). Consequently, considering that

$$ \mathcal{P}_{s}^{\top}{\Omega}\mathcal{P}_{s}=I_{s}, $$
(54)

the discrete problem (53) reduces to:

$$ {\boldsymbol{\phi}} - \mathcal{P}_{s}^{\top}{\Omega}\otimes J \nabla H\left( {\boldsymbol{e}}\otimes y_{0} + h\mathcal{I}_{s}\otimes J {\boldsymbol{\phi}}\right) = {\boldsymbol{0}}. $$

This latter problem is exactly that generated by a HBVM(k,s) method applied for solving (3) [9]. We observe, however, that while the original HBVM(k,s) method is actually a k-stage Runge-Kutta method with Butcher tableau (see (48))

$$ \begin{array}{c|c} {\boldsymbol{c}} & \mathcal{I}_{s}\mathcal{P}_{s}^{\top}{\Omega}\\ \hline & {\boldsymbol{b}}^{\top} \end{array} $$

with \({\boldsymbol{c}}=(c_{1},\dots,c_{k})^{\top}\) and \({\boldsymbol{b}}=(b_{1},\dots,b_{k})^{\top}\) the vectors of the abscissae and weights, this is no longer the case for the generalization defined by (49).

Remark 9

In the case k = s, one has that \(\mathcal {P}_{s}\mathcal {P}_{s}^{\top }{\Omega }=I_{s}\). Consequently, (53) becomes, by using the notation (16),

$${\boldsymbol{\phi}} = (\mathcal{P}_{s}^{\top}{\Omega}\otimes I_{m}) F({\boldsymbol{e}}\otimes y_{0} + h\mathcal{I}_{s}\otimes I_{m} {\boldsymbol{\phi}}).$$

This, in turn, is equivalent to the application of the s-stage Gauss method to the problem (1) (see, e.g., [9]).

Instead, in the case k > s, the discrete problem (53) is equivalent to the application of the HBVM(k,s) method to the problem (see (28)),

$$\dot y = B(y)[\nabla H(y)]_{s}^{(2k)}, \qquad t>0, \qquad y(0)=y_{0},$$

in place of (1). This application, in turn, provides the polynomial approximation (28).

We observe that the formulation (53) naturally induces a straightforward iterative procedure for solving the discrete problem,

$$ \begin{array}{@{}rcl@{}} \lefteqn{{\boldsymbol{\phi}}^{r+1} =}\\ && \mathcal{P}_{s}^{\top}{\Omega}\otimes I_{m} \mathcal{B}\left( {\boldsymbol{e}}\otimes y_{0} + h\mathcal{I}_{s}\otimes I_{m} {\boldsymbol{\phi}}^{r}\right) (\mathcal{P}_{s}\mathcal{P}_{s}^{\top}{\Omega}\otimes I_{m})\nabla H\left( {\boldsymbol{e}}\otimes y_{0} + h\mathcal{I}_{s}\otimes I_{m} {\boldsymbol{\phi}}^{r}\right),\\ && r=0,1,\dots, \end{array} $$
(55)

for which the initial approximation \({\boldsymbol{\phi}}^{0}={\boldsymbol{0}}\) can be conveniently used. It is also possible to use the simplified Newton iteration for solving (53), which, taking into account (48), (54), and the fact that (see, e.g., [5])

$$ \mathcal{P}_{s}^{\top}{\Omega}\mathcal{I}_{s} = X_{s} := \left( \begin{array}{cccc} \xi_{0} &-\xi_{1}\\ \xi_{1} &0 &\ddots\\ &{\ddots} &{\ddots} &-\xi_{s-1}\\ & &\xi_{s-1} &0 \end{array}\right), \qquad \xi_{i} = \left( 2\sqrt{|4i^{2}-1|}\right)^{-1},\quad i=0,\dots,s-1,~~ $$
(56)

takes the form:

$$ \text{solve:}~\left[ I_{s}\otimes I_{m} -hX_{s}\otimes F^{\prime}(y_{0})\right] {\boldsymbol{\delta}}^{r} = -\mathcal{F}({\boldsymbol{\phi}}^{r}), \qquad {\boldsymbol{\phi}}^{r+1}:={\boldsymbol{\phi}}^{r}+{\boldsymbol{\delta}}^{r}, \qquad r=0,1,\dots, $$
(57)

with \(F^{\prime}\) the Jacobian of F (see (16)). Nevertheless, the iteration (57) requires the factorization of a matrix whose size is s times larger than that of problem (1), which can be costly when s and/or m are large. Consequently, it is much more effective to resort to a blended iteration for solving (53) (see, e.g., [9]; we also refer to [13] for a more detailed analysis of blended methods). In the present case, this latter iteration, considering the matrix \(X_{s}\) defined in (56), denoting by \(\sigma(X_{s})\) its spectrum, and setting

$$ {\Lambda} := I_{m}-h\lambda_{s} F^{\prime}(y_{0})\in\mathbb{R}^{m\times m}, \qquad\text{with}\qquad \lambda_{s} = \min_{\lambda\in\sigma(X_{s})} |\lambda|, $$
(58)

assumes the form:

$$ \begin{array}{@{}rcl@{}} {\boldsymbol{\eta}}^{r}&:=&-\mathcal{F}({\boldsymbol{\phi}}^{r}),\qquad\quad {{\boldsymbol{\eta}}_{1}^{r}}~:=~ \lambda_{s}X_{s}^{-1}\otimes I_{m} {\boldsymbol{\eta}}^{r},\\ {\boldsymbol{\phi}}^{r+1}&:=&{\boldsymbol{\phi}}^{r}+I_{s}\otimes {\Lambda}^{-1}\left( {{\boldsymbol{\eta}}_{1}^{r}} +I_{s}\otimes {\Lambda}^{-1}\left( {\boldsymbol{\eta}}^{r}-{{\boldsymbol{\eta}}_{1}^{r}}\right)\right), \qquad r=0,1,\dots. \end{array} $$
(59)

Consequently, only the matrix Λ in (58), having the same size as the continuous problem (1), needs to be factored. This is a common feature of the many instances where the blended iteration can be used; for this reason, in such cases, it turns out to be extremely efficient (see, e.g., [2, 14,15,16, 23]).
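To fix ideas, the following self-contained Python sketch implements one PHBVM(k,s) step based on the plain fixed-point iteration (55) (rather than the blended iteration (59), whose efficient implementation is beyond the scope of a sketch), and applies it, for illustration, to the Lotka-Volterra problem (62) of the next section; function names, tolerances, and the step size are our own choices.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss, legval

def P(j, x):
    # Orthonormal shifted Legendre polynomial of degree j on [0,1], cf. (5).
    cf = np.zeros(j + 1); cf[j] = 1.0
    return np.sqrt(2*j + 1) * legval(2*np.asarray(x) - 1.0, cf)

def phbvm_step(B, gradH, y0, h, s, k, tol=1e-14, itmax=200):
    # One PHBVM(k,s) step, y0 -> y1, via the fixed-point iteration (55).
    x, w = leggauss(k)
    c, b = (x + 1) / 2, w / 2
    Ps = np.array([[P(j, ci) for j in range(s)] for ci in c])   # cf. (48)
    Om = np.diag(b)
    Is = np.empty((k, s))
    for i, ci in enumerate(c):
        xi, wi = (x + 1) / 2 * ci, w / 2 * ci
        for j in range(s):
            Is[i, j] = np.sum(wi * P(j, xi))    # int_0^{c_i} P_j(x) dx
    phi = np.zeros((s, len(y0)))                # initial approximation phi^0 = 0
    for _ in range(itmax):
        Y  = y0 + h * Is @ phi                  # stage values, cf. (51), one per row
        gH = np.array([gradH(Yl) for Yl in Y])
        W  = Ps @ (Ps.T @ Om @ gH)              # [grad H]_s^(2k) at the abscissae
        BW = np.array([B(Yl) @ Wl for Yl, Wl in zip(Y, W)])
        phi_new = Ps.T @ Om @ BW                # cf. (52)
        if np.max(np.abs(phi_new - phi)) <= tol:
            phi = phi_new
            break
        phi = phi_new
    return y0 + h * phi[0]                      # y1 = y0 + h*phi_0

# Illustration on the Lotka-Volterra problem (62): a = 1, b = 3, y1* = y2* = 1.
B     = lambda y: np.array([[0.0, y[0] * y[1]], [-y[0] * y[1], 0.0]])
H     = lambda y: (np.log(y[0]) - y[0]) + 3 * (np.log(y[1]) - y[1])
gradH = lambda y: np.array([1 / y[0] - 1.0, 3 * (1 / y[1] - 1.0)])

y0 = np.array([5.0, 1.0])
y1 = phbvm_step(B, gradH, y0, h=0.02, s=3, k=6)
print(H(y1) - H(y0))   # Hamiltonian error: near round-off, consistently with Theorem 9
```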

Remark 10

In the practical use of the methods, it is customary to choose the parameter k, related to the order of the quadrature, so that the discretization error falls within the round-off error level. Nevertheless, round-off errors are unavoidable, as are iteration errors in (55) or (59). This may cause a small numerical drift in the invariants, even in the case where the quadrature is exact. This phenomenon has been duly studied in [5, Chapter 4.3], where a simple correction procedure is given to avoid this problem. The same procedure can be conveniently used in this setting, too. The reader is referred to the above reference for full details.

4.1 Conservation of Casimirs

In this section we sketch the implementation of EPHBVM(k,s) methods described in Section 3.2. For this purpose, besides the vector ϕ defined in (50), we need to define the block vector

$$ \hat{\boldsymbol{\pi}} = \left( \begin{array}{c} \hat\pi_{0} \\ {\vdots} \\ \hat\pi_{s-1} \end{array}\right) $$
(60)

with the approximate Fourier coefficients (41) of the gradient of the Casimir. In so doing, the discrete problem generated by an EPHBVM(k,s) method becomes:

$$ \begin{array}{@{}rcl@{}} \mathcal{F}({\boldsymbol{\phi}},\alpha)&:=&\left( \begin{array}{c} {\boldsymbol{\phi}} - \mathcal{P}_{s}^{\top}{\Omega}\otimes I_{m} \mathcal{B}\left( Y\right) (\mathcal{P}_{s}\mathcal{P}_{s}^{\top}{\Omega}\otimes I_{m})\nabla H\left( Y\right)\\ \alpha - \frac{\hat{\boldsymbol{\pi}}^{\top}{\boldsymbol{\phi}}}{\hat\pi_{0}^{\top} \tilde{B}\hat\gamma_{0} } \end{array}\right)~=~ {\boldsymbol{0}},\\ \text{with}&&\\ Y&=& {\boldsymbol{e}}\otimes y_{0} + h\mathcal{I}_{s}\otimes I_{m} {\boldsymbol{\phi}} - \alpha h{\boldsymbol{c}}\otimes (\tilde{B}\hat\gamma_{0}),\\ \hat\gamma_{0} &=&{\boldsymbol{b}}^{\top}\otimes I_{m} \nabla H(Y),\\ \hat{\boldsymbol{\pi}} &=& \mathcal{P}_{s}^{\top}{\Omega}\otimes I_{m} \nabla C(Y), \end{array} $$
(61)

and the new approximation given by

$$y_{1} = y_{0} + h\left( \phi_{0}-\alpha \tilde{B}\hat\gamma_{0}\right).$$
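With respect to (53), the only addition in (61) is the scalar equation for α. Inside a nonlinear iteration for (61), its update reduces to one line, as in the following minimal Python sketch (a hypothetical helper under our own naming, assuming \(\hat{\boldsymbol{\pi}}\) and \({\boldsymbol{\phi}}\) stored as s × m arrays, \(\hat\gamma_{0}\) as a vector of length m, and \(\tilde{B}\) as an m × m skew-symmetric matrix):

```python
import numpy as np

def casimir_alpha(hat_pi, phi, Btil, gamma0):
    # alpha per (43): the numerator sum_{i,j} hat_pi_i^T hat_rho_ij hat_gamma_j
    # collapses to <hat_pi, phi>, since phi_i = sum_j hat_rho_ij hat_gamma_j, cf. (50).
    return np.sum(hat_pi * phi) / (hat_pi[0] @ (Btil @ gamma0))

# Hypothetical shapes, for illustration only: s = 2, m = 3.
rng = np.random.default_rng(0)
hat_pi, phi, gamma0 = rng.normal(size=(2, 3)), rng.normal(size=(2, 3)), rng.normal(size=3)
A = rng.normal(size=(3, 3)); Btil = A - A.T    # an arbitrary skew-symmetric matrix
print(casimir_alpha(hat_pi, phi, Btil, gamma0))
```

Within a fixed-point iteration for (61), this update is interleaved with those of \({\boldsymbol{\phi}}\), \(\hat\gamma_{0}\), and \(\hat{\boldsymbol{\pi}}\).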

We conclude this section by mentioning that, in the case of multiple Casimirs, the discrete problem (61) can be readily generalized by considering the discrete counterparts of (24)–(25).

5 Numerical tests

In this section we present a couple of numerical tests concerning the solution of Lotka-Volterra problems, the second of which possesses a Casimir. The numerical tests have been carried out on a 3 GHz Intel Xeon W 10-core computer with 64 GB of memory, running Matlab 2020a.

Example 1

We consider the following Lotka-Volterra problem:

$$ \begin{array}{@{}rcl@{}} \dot y &=& \left( \begin{array}{cc} 0 & y_{1}y_{2} \\ -y_{1}y_{2} &0 \end{array}\right) \nabla H(y),\\ H(y) &=& a\left( \ln y_{1} -\frac{y_{1}}{y_{1}^{*}}\right) + b\left( \ln y_{2} -\frac{y_{2}}{y_{2}^{*}}\right), \end{array} $$
(62)

with

$$a=1, \qquad b=3, \qquad y_{1}^{*}=y_{2}^{*}=1, \qquad y(0) = \left( 5, 1\right)^{\top},$$

whose solution, periodic with period T ≈ 4.633434168477889, is depicted in Fig. 1. At first, we solve the problem on one period with time-step h = T/n, by using the following methods:

  • The s-stage Gauss method, s = 1,2,3;

  • The PHBVM(4,s), s = 1,2, and PHBVM(6,3) methods, which soon become energy-conserving as the value of n is increased.

The obtained results are summarized in Table 1, where we have denoted by ey and eH the error in the solution and in the Hamiltonian after one period, respectively. Their numerical rate of convergence is also reported, along with the mean number of blended iterations (58)–(59) per time-step (it) needed to obtain convergence within full machine accuracy, and the execution time in seconds. From the listed results, one infers that:

  • As is strikingly clear, the higher-order methods are much more efficient than the lower-order ones, especially when high accuracy is required;

  • The numerical rate of convergence of both the solution and the Hamiltonian errors matches the theoretical one (for PHBVMs, until the Hamiltonian error falls within the round-off error level);

  • For a fixed time-step h, the numerical solutions provided by the Gauss methods and by the corresponding PHBVM method have comparable accuracy, despite the negligible Hamiltonian error of the latter methods;

  • The execution times of the PHBVM methods are about twice those of the corresponding Gauss methods, even though the mean number of blended iterations per time-step is practically the same (the latter decreases with the time-step h and slightly increases with s).

As a result, one might conclude that the conservation of the Hamiltonian brings no practical advantage. However, this conclusion is readily refuted by looking at the error growth in the Hamiltonian and in the solution. In fact, Fig. 2 shows the Hamiltonian error (left plot) and the solution error (right plot) obtained by using the 3-stage Gauss method and the PHBVM(6,3) method with time-step h = T/100 over 100 periods. As one may see, the 3-stage Gauss method exhibits a numerical drift in the energy, unlike PHBVM(6,3). As a result, the latter method exhibits a linear error growth, whereas the former one has a quadratic error growth.

Fig. 1: Solution of problem (62)

Table 1: Results for problem (62)

Fig. 2: Hamiltonian error (left plot) and solution error (right plot) when solving problem (62) with time-step h = T/100 over 100 periods

Example 2

The second example, taken from [20], is given by:

$$ \begin{array}{@{}rcl@{}} \dot y &=& \left( \begin{array}{ccc} 0 & y_{1}y_{2} & y_{1}y_{3}\\ -y_{1}y_{2} &0 & -y_{2}y_{3}\\ -y_{1}y_{3} & y_{2}y_{3} & 0 \end{array}\right) \nabla H(y),\\ H(y) &=& a\left( \ln y_{1} -\frac{y_{1}}{y_{1}^{*}}\right) +b\left( \ln y_{2} -\frac{y_{2}}{y_{2}^{*}}\right) + c\left( \ln y_{3} -\frac{y_{3}}{y_{3}^{*}}\right),\\ C(y) &=& -\ln y_{1} -\ln y_{2} +\ln y_{3}, \end{array} $$
(63)

with

$$ a=1, \qquad b=2, \qquad c=3, \qquad y_{1}^{*}=1, \qquad y_{2}^{*}=10,\qquad y_{3}^{*} = 50,\qquad y(0) = \left( 1, 1, 1\right)^{\top}, $$

whose solution, periodic with period T ≈ 2.143610709155912, is depicted in Fig. 3.

Fig. 3: Solution of problem (63)

At first, we compare the same methods used for the previous example, again with time-step h = T/n. The obtained results for the s-stage Gauss and PHBVM methods are listed in Table 2: the conclusions that one can derive from them are similar to those drawn from Table 1 for the previous example, with the additional remark that now the Casimir C(y) is not conserved. This fact, in turn, produces the results depicted in the two plots of Fig. 4, concerning the application of the 3-stage Gauss and PHBVM(6,3) methods for solving the problem with time-step h = T/100 over 100 periods. From the two plots, one infers that both methods exhibit a drift in the Casimir, whereas only the 3-stage Gauss method exhibits a drift in the Hamiltonian, too (left plot); moreover, both methods exhibit a quadratic error growth in the solution, despite the fact that PHBVM(6,3) conserves the Hamiltonian. For this reason, in Table 3 we list the results obtained by using the EPHBVM(4,1), EPHBVM(4,2), and EPHBVM(6,3) methods for solving problem (63), with the same time-steps considered for obtaining the results of Table 2. As one may see, now the conservation of the Casimir, besides that of the Hamiltonian, is soon obtained as the time-step is decreased, with a computational cost perfectly comparable to that of the corresponding PHBVM method. The conservation of both invariants, in turn, allows one to recover a linear error growth in the numerical solution, as shown in the plot of Fig. 5.

Table 2: Results for problem (63)

Fig. 4: Hamiltonian and Casimir errors (left plot) and solution error (right plot) when solving problem (63) with time-step h = T/100 over 100 periods with the 3-stage Gauss and PHBVM(6,3) methods

Table 3: Further results for problem (63)

Fig. 5: Hamiltonian, Casimir, and solution errors when solving problem (63) with time-step h = T/100 over 100 periods with the EPHBVM(6,3) method

6 Conclusions

In this paper we have presented a class of energy-conserving line integral methods for Poisson problems. In the case where the problem is Hamiltonian, these methods reduce to the class of Hamiltonian Boundary Value Methods (HBVMs), which are energy-conserving methods for such problems. Consequently, the new methods can be regarded as an extension of HBVMs to Poisson (non-Hamiltonian) problems, which we have called PHBVMs. Moreover, a further enhancement of such methods (EPHBVMs) allows one to obtain the conservation of Casimirs, too. A thorough analysis of the methods has been carried out and confirmed by a couple of numerical tests. As a further direction of investigation, we mention the application of the methods to highly oscillatory Poisson problems, similarly to what has been done with HBVMs in the Hamiltonian case [27,28,29].