1 Introduction

A Poisson problem has the form

$$ \dot y = B(y)\nabla H(y) =: f(y), \quad t>0,\qquad y(0)=y_{0}\in\mathbb{R}^{m}, \qquad B(y)^{\top} = -B(y), $$
(1)

where H(y) is a scalar function, usually called the Hamiltonian. For the sake of simplicity, hereafter both H(y) and B(y) will be assumed to be suitably regular. One easily deduces that H(y) is a constant of motion since, along the solution of (1),

$$\frac{\mathrm{d}}{\mathrm{d} t} H(y) = \nabla H(y)^{\top} \dot y = \nabla H(y)^{\top} B(y)\nabla H(y) = 0,$$

due to the skew-symmetry of B(y). Possible additional invariants of (1) are its Casimirs, namely scalar functions C(y) for which

$$ \nabla C(y)^{\top} B(y) = (0,\dots,0)\in\mathbb{R}^{1\times m} $$
(2)

holds true for all y. When the matrix B(y) is constant, as is the case for Hamiltonian problems,

$$ \dot y = J\nabla H(y), \qquad t>0, \qquad y(0)=y_{0}, \qquad J^{\top}=-J, $$
(3)

energy conservation can be obtained by solving problem (3) via HBVMs, a class of energy-conserving Runge-Kutta methods for Hamiltonian problems (see, e.g., [7, 8, 11, 12] and the monograph [5]; see also the recent review paper [6]). Nevertheless, in the case where the problem is not Hamiltonian, HBVMs are no longer energy-conserving. This motivates the present paper, where an energy-conserving variant of HBVMs for Poisson problems is derived and analyzed.

The numerical solution of Poisson problems has been tackled by following many different approaches (see, e.g., [20, Chapter VII] and references therein). More recently, it has been considered in [19], where an extension of the AVF method [22] is proposed, and in [1, 3], where a line integral approach has been used instead. Functionally fitted methods have been proposed in [21, 24,25,26]. In this paper we further pursue the line integral approach to the problem, which will provide an energy-conserving variant of HBVMs for solving (1).

With this premise, the structure of the paper is as follows: in Section 2 we describe the new framework in which the methods will be derived; in Section 3 we provide the final shape of the method, while in Section 4 its actual implementation is studied; in Section 5 we present a few numerical tests confirming the theoretical findings; finally, in Section 6 we give some concluding remarks.

2 The new framework

As anticipated above, the framework that we shall use to derive and analyze the methods is that of the so-called line integral methods, namely methods whose conservation properties stem from the vanishing of a corresponding line integral [5, 6]. Such methods have been extensively investigated in the case of Hamiltonian problems, their major instance being Hamiltonian Boundary Value Methods (HBVMs). The analysis will strictly follow that in [11] and [17]. To begin with, let us consider problem (1) on the interval [0,h],

$$ \dot y(ch) = B(y(ch)) \nabla H(y(ch)),\qquad c\in[0,1], \qquad y(0) = y_{0}. $$
(4)

In fact, since we shall be dealing with a one-step method, it suffices to analyze its first application step, with h the time-step. Next, let us consider the orthonormal Legendre polynomial basis \(\{P_{j}\}_{j\ge0}\) on the interval [0,1],

$$ \deg P_{j} = j, \qquad {{\int}_{0}^{1}} P_{i}(x)P_{j}(x)\mathrm{d} x = \delta_{ij}, \qquad \forall i,j=0,1,\dots, $$
(5)

with \(\delta_{ij}\) the Kronecker symbol, and the following expansions for the functions at the right-hand side in (4):

$$ \begin{array}{@{}rcl@{}} \nabla H(y(ch)) = {\sum}_{j\ge0} P_{j}(c) \gamma_{j}(y), &&P_{j}(c)B(y(ch)) = {\sum}_{i\ge0} P_{i}(c) \rho_{ij}(y), \qquad~ c\in[0,1],\\ \\ \gamma_{j}(y) = {{\int}_{0}^{1}} P_{j}(\tau)\nabla H(y(\tau h))\mathrm{d}\tau, && \rho_{ij}(y) = {{\int}_{0}^{1}} P_{i}(\tau)P_{j}(\tau)B(y(\tau h))\mathrm{d}\tau, \quad i,j=0,1,\dots. \end{array} $$
(6)
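To make the objects in (5)–(6) concrete, the following minimal Python sketch (our own illustration: the path `sigma` and the gradient `grad_H` are hypothetical placeholders, not taken from the paper) builds the orthonormal shifted Legendre basis, checks the orthonormality conditions (5) numerically, and evaluates the Fourier coefficients \(\gamma_{j}\) by a high-order quadrature; their magnitudes decay like \(O(h^{j})\), as formalized in Corollary 1 below.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss, legval

def P(j, x):
    # Orthonormal shifted Legendre polynomial of degree j on [0,1], cf. (5):
    # P_j(x) = sqrt(2j+1) * L_j(2x-1), with L_j the classical Legendre polynomial.
    cf = np.zeros(j + 1); cf[j] = 1.0
    return np.sqrt(2*j + 1) * legval(2*np.asarray(x) - 1.0, cf)

# High-order Gauss-Legendre rule mapped to [0,1], to evaluate the integrals in (5)-(6).
x, w = leggauss(60)
tau, b = (x + 1) / 2, w / 2

# Numerical check of the orthonormality conditions (5).
G = np.array([[np.sum(b * P(i, tau) * P(j, tau)) for j in range(4)] for i in range(4)])
assert np.allclose(G, np.eye(4))

# Hypothetical smooth path sigma and gradient grad_H, for illustration only.
grad_H = lambda y: np.array([np.cos(y[0]), np.exp(-y[1])])
sigma = lambda t: np.array([1.0 + np.sin(t), 0.5 + t**2])

for h in (0.2, 0.1, 0.05):
    # gamma_j(sigma) = int_0^1 P_j(tau) grad_H(sigma(tau*h)) dtau, cf. (6).
    gamma = [sum(b[l] * P(j, tau[l]) * grad_H(sigma(tau[l] * h)) for l in range(60))
             for j in range(4)]
    print(h, ["%.1e" % np.linalg.norm(g) for g in gamma])  # |gamma_j| = O(h^j), cf. (7)
```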

The following properties hold true.

Lemma 1

Assume \(\psi :[0,h]\rightarrow V\), with V a vector space, admits a Taylor expansion at 0. Then, for all \(j=0,1,\dots \):

$${{\int}_{0}^{1}} P_{j}(c)c^{i}\psi(ch)\mathrm{d} c=O(h^{j-i}), \qquad i=0,\dots,j.$$

Proof

By the hypotheses on ψ, one has:

$$c^{i}\psi(ch)={\sum}_{r\ge0} \frac{\psi^{(r)}(0)}{r!} h^{r} c^{r+i}.$$

Consequently, for all \(i=0,\dots ,j\), by virtue of (5) it follows that:

$$ \begin{array}{@{}rcl@{}} {{\int}_{0}^{1}} P_{j}(c)c^{i}\psi(ch)\mathrm{d} c&=& {\sum}_{r\ge0} \frac{\psi^{(r)}(0)}{r!} h^{r} {{\int}_{0}^{1}}P_{j}(c)c^{r+i}\mathrm{d} c\\ &=&{\sum}_{r\ge j-i} \frac{\psi^{(r)}(0)}{r!} h^{r} {{\int}_{0}^{1}}P_{j}(c)c^{r+i}\mathrm{d} c = O(h^{j-i}). \end{array} $$

Corollary 1

With reference to (6), for any suitably regular path \(\sigma :[0,h]\rightarrow \mathbb {R}^{m}\) one has:

$$ \gamma_{j}(\sigma)=O(h^{j}), \qquad \rho_{ij}(\sigma)=O(h^{|i-j|}), \qquad \forall i,j=0,1,\dots. $$
(7)

Proof

Immediate from Lemma 1, by taking into account (6).□

We also state, without proof, the following straightforward property, deriving from the skew-symmetry of B.

Lemma 2

With reference to (6), for any path \(\sigma :[0,h]\rightarrow \mathbb {R}^{m}\) one has:

$$ \rho_{ij}(\sigma) = \rho_{ji}(\sigma) = -\rho_{ij}(\sigma)^{\top}, \qquad \forall i,j=0,1,\dots. $$
(8)

Taking into account (6), the right-hand side in (4) can be rewritten as:

$$ \dot y(ch) = B(y(ch))\nabla H(y(ch)) = {\sum}_{j\ge0} P_{j}(c) B(y(ch)) \gamma_{j}(y) = {\sum}_{i,j\ge 0} P_{i}(c)\rho_{ij}(y) \gamma_{j}(y), \qquad c\in[0,1], $$
(9)

from which one obtains that the solution of (4) can be formally written as:

$$ y(ch) = y_{0} + h{\sum}_{i,j\ge0} {{\int}_{0}^{c}}P_{i}(x)\mathrm{d} x\rho_{ij}(y) \gamma_{j}(y), \qquad c\in[0,1]. $$
(10)

In particular, by considering (5) and that \(P_{0}(c)\equiv 1\), from which \({\int}_{0}^{1}P_{i}(x)\mathrm{d} x=\delta_{i0}\), one has:

$$ y(h) = y_{0} + h{\sum}_{j\ge 0}\rho_{0j}(y) \gamma_{j}(y) \equiv y_{0} + h{\sum}_{j\ge0} {{\int}_{0}^{1}} P_{j}(c)B(y(ch))\mathrm{d} c{{\int}_{0}^{1}} P_{j}(c) \nabla H(y(ch))\mathrm{d} c. ~~ $$
(11)

In order to obtain a polynomial approximation of degree s to y, it suffices to truncate the two infinite series in (9) after s terms:

$$ \dot \sigma(ch) = {\sum}_{i,j=0}^{s-1} P_{i}(c)\rho_{ij}(\sigma) \gamma_{j}(\sigma), \qquad c\in[0,1], $$
(12)

with \(\rho_{ij}(\sigma)\) and \(\gamma_{j}(\sigma)\) defined according to (6) by formally replacing y with σ. Consequently, (10) becomes

$$ \sigma(ch) = y_{0} + h{\sum}_{i,j=0}^{s-1} {{\int}_{0}^{c}}P_{i}(x)\mathrm{d} x\rho_{ij}(\sigma) \gamma_{j}(\sigma), \qquad c\in[0,1], $$
(13)

providing the approximation

$$ y_{1}:=\sigma(h) = y_{0} + h{\sum}_{j=0}^{s-1}\rho_{0j}(\sigma) \gamma_{j}(\sigma) \equiv y_{0} + h{\sum}_{j=0}^{s-1} {{\int}_{0}^{1}} P_{j}(c)B(\sigma(ch))\mathrm{d} c{{\int}_{0}^{1}} P_{j}(c) \nabla H(\sigma(ch))\mathrm{d} c, $$
(14)

in place of (11).

2.1 Interpretation of σ

We now provide an interesting interpretation of the polynomial approximation σ. For this purpose, let us rewrite (9), by taking into account (6), as follows:

$$ \begin{array}{@{}rcl@{}} \dot y(ch) &=& {\sum}_{i,j\ge0} P_{i}(c) \rho_{ij}(y)\gamma_{j}(y)\\ &=&{\sum}_{i\ge0} P_{i}(c) {\sum}_{j\ge0} {{\int}_{0}^{1}} P_{i}(\tau) B(y(\tau h))P_{j}(\tau)\mathrm{d}\tau{{\int}_{0}^{1}} P_{j}(\tau_{1})\nabla H(y(\tau_{1}h))\mathrm{d}\tau_{1}\\ &=&{\sum}_{i\ge0} P_{i}(c) {{\int}_{0}^{1}} P_{i}(\tau) B(y(\tau h))\left( \underbrace{{\sum}_{j\ge0}P_{j}(\tau){{\int}_{0}^{1}} P_{j}(\tau_{1})\nabla H(y(\tau_{1}h))\mathrm{d}\tau_{1}}_{=\nabla H(y(\tau h))}\right)\mathrm{d}\tau\\ &\equiv& B(y(ch))\nabla H(y(ch)), \end{array} $$

as expected. In a similar way, we can rewrite (12) as:

$$ \begin{array}{@{}rcl@{}} \dot \sigma(ch) &=& {\sum}_{i,j=0}^{s-1} P_{i}(c) \rho_{ij}(\sigma)\gamma_{j}(\sigma)\\ &=&{\sum}_{i=0}^{s-1} P_{i}(c) {\sum}_{j=0}^{s-1} {{\int}_{0}^{1}} P_{i}(\tau) B(\sigma(\tau h))P_{j}(\tau)\mathrm{d}\tau{{\int}_{0}^{1}} P_{j}(\tau_{1})\nabla H(\sigma(\tau_{1}h))\mathrm{d}\tau_{1}\\ &=&{\sum}_{i=0}^{s-1} P_{i}(c) {{\int}_{0}^{1}} P_{i}(\tau) B(\sigma(\tau h))\left( {\sum}_{j=0}^{s-1}P_{j}(\tau){{\int}_{0}^{1}} P_{j}(\tau_{1})\nabla H(\sigma(\tau_{1}h))\mathrm{d}\tau_{1}\right)\mathrm{d}\tau\\ &=:& {\sum}_{i=0}^{s-1} P_{i}(c) {{\int}_{0}^{1}} P_{i}(\tau) B(\sigma(\tau h))\left[\nabla H(\sigma(\tau h))\right]_{s}\mathrm{d}\tau\\ &\equiv& \left[ B(\sigma(\tau h))\left[\nabla H(\sigma(\tau h))\right]_{s}\right]_{s}, \end{array} $$

having denoted by \([\cdot]_{s}\) the best approximation in \({\Pi}_{s-1}\) (i.e., \([\cdot]_{s}\) is the best polynomial approximation of degree s − 1) of its argument. This fact provides a noticeable interpretation of the polynomial approximation σ, which is the solution of the initial value problem

$$ \dot\sigma(ch) = \left[B(\sigma(ch))\left[\nabla H(\sigma(ch))\right]_{s}\right]_{s}, \qquad c\in[0,1], \qquad \sigma(0)=y_{0}, $$
(15)

equivalent to (12). Thus, the vector field of (15) is defined by a double projection procedure onto the finite dimensional vector space \({\Pi}_{s-1}\) which involves, in turn, the vector fields ∇H(σ(ch)) and \(B(\sigma(ch))\left[\nabla H(\sigma(ch))\right]_{s}\), respectively.
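The projection \([\cdot]_{s}\) is nothing but a truncated Legendre expansion, and can be coded in a few lines. The following self-contained Python sketch (an illustration under our own naming conventions) evaluates \([f]_{s}\) for a scalar function f:

```python
import numpy as np
from numpy.polynomial.legendre import leggauss, legval

def P(j, x):
    # Orthonormal shifted Legendre polynomial of degree j on [0,1], cf. (5).
    cf = np.zeros(j + 1); cf[j] = 1.0
    return np.sqrt(2*j + 1) * legval(2*np.asarray(x) - 1.0, cf)

x, w = leggauss(60)
tau, b = (x + 1) / 2, w / 2        # high-order quadrature on [0,1]

def proj(f, s, c):
    # [f]_s: truncated Legendre expansion of f, i.e., its best (L^2) approximation
    # in Pi_{s-1} on [0,1], evaluated at the points c.
    coef = [np.sum(b * P(j, tau) * f(tau)) for j in range(s)]
    return sum(coef[j] * P(j, c) for j in range(s))

# Example: the best quadratic (s = 3) approximation of exp on [0,1].
f = lambda c: np.exp(c)
c = np.linspace(0, 1, 5)
print(proj(f, 3, c) - f(c))        # small residual, of the size of the neglected modes
```

The double projection in (15) then amounts to applying `proj` componentwise to \(\nabla H(\sigma(\cdot\,h))\), and again to its product with \(B(\sigma(\cdot\,h))\).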

2.2 Analysis

We now analyze the method (12)–(14). The following result then holds true, stating that the method is energy-conserving.

Theorem 1

H(y1) = H(y0).

Proof

In fact, by virtue of (1), (6), and (12)–(14) one has, by using the standard line integral argument:

$$ \begin{array}{@{}rcl@{}} \lefteqn{H(y_{1})-H(y_{0})~=~ H(\sigma(h))-H(\sigma(0)) ~=~ {{\int}_{0}^{h}} \nabla H(\sigma(t))^{\top}\dot\sigma(t)\mathrm{d} t}\\ &=& h{{\int}_{0}^{1}} \nabla H(\sigma(ch))^{\top}\dot\sigma(ch)\mathrm{d} c~=~ h{{\int}_{0}^{1}} \nabla H(\sigma(ch))^{\top} {\sum}_{i,j=0}^{s-1} P_{i}(c)\rho_{ij}(\sigma) \gamma_{j}(\sigma)\mathrm{d} c\\ &=&h {\sum}_{i,j=0}^{s-1} \left[{\int}_{0}^{1} P_{i}(c)\nabla H(\sigma(ch))\mathrm{d} c\right]^{\top}\rho_{ij}(\sigma) \gamma_{j}(\sigma) ~=~h {\sum}_{i,j=0}^{s-1} \gamma_{i}(\sigma)^{\top}\rho_{ij}(\sigma) \gamma_{j}(\sigma) ~=~0, \end{array} $$

where the last equality follows from (8).□

Concerning the accuracy of the new approximation, the following result holds true.

Theorem 2

Let \(y_{1}\) be defined according to (12)–(14). Then, \(y_{1}-y(h) = O(h^{2s+1})\).

Proof

Let y(t) ≡ y(t,ξ,η) denote the solution of the initial value problem (see (1))

$$ \dot y = B(y) \nabla H(y) =:F(y), \qquad t\ge\xi, \qquad y(\xi)=\eta. $$
(16)

Moreover, let us denote

$${\Phi}(t,\xi,\eta) = \frac{\partial}{\partial \eta}y(t,\xi,\eta),\qquad t\ge \xi,$$

also recalling that

$$\qquad\frac{\partial}{\partial \xi}y(t,\xi,\eta) = - {\Phi}(t,\xi,\eta)F(\eta).$$

Then, by taking into account Lemma 1 and Corollary 1, and setting

$${\Psi}_{i}(\sigma) = {{\int}_{0}^{1}} P_{i}(c){\Phi}(h,ch,\sigma(ch))\mathrm{d} c =O(h^{i}), \qquad i=0,1,\dots,$$

one has:

$$ \begin{array}{@{}rcl@{}} \lefteqn{ y_{1}-y(h) ~=~\sigma(h)-y(h) = y(h,h,\sigma(h))-y(h,0,\sigma(0)) = {{\int}_{0}^{h}} \frac{\mathrm{d}}{\mathrm{d} t} y(h,t,\sigma(t))\mathrm{d} t}\\ &=&{{\int}_{0}^{h}} \left.\left[\frac{\partial}{\partial \xi} y(h,\xi,\sigma(t))\right|_{\xi=t} + \left.\frac{\partial}{\partial \eta}y(h,t,\eta)\right|_{\eta=\sigma(t)} \dot\sigma(t)\right] \mathrm{d} t\\ &=& {{\int}_{0}^{h}} \left[-{\Phi}(h,t,{ \sigma(t)})F(\sigma(t))+{\Phi}(h,t,{ \sigma(t)})\dot\sigma(t)\right]\mathrm{d} t \\ &=& -h{{\int}_{0}^{1}} {\Phi}(h,ch,{ \sigma(ch)})\left[ F(\sigma(ch))-\dot\sigma(ch)\right]\mathrm{d} c\\ &=&-h{{\int}_{0}^{1}} {\Phi}(h,ch,{ \sigma(ch)})\left[ B(\sigma(ch))\nabla H(\sigma(ch)) - {\sum}_{i,j=0}^{s-1} P_{i}(c)\rho_{ij}(\sigma)\gamma_{j}(\sigma)\right]\mathrm{d} c\\ &=&-h{{\int}_{0}^{1}} {\Phi}(h,ch,{ \sigma(ch)})\left[ {\sum}_{i,j\ge 0} P_{i}(c)\rho_{ij}(\sigma)\gamma_{j}(\sigma) - {\sum}_{i,j=0}^{s-1} P_{i}(c)\rho_{ij}(\sigma)\gamma_{j}(\sigma)\right]\mathrm{d} c\\ &=& -h\left[{\sum}_{i,j\ge0} {\Psi}_{i}(\sigma)\rho_{ij}(\sigma)\gamma_{j}(\sigma) -{\sum}_{i,j=0}^{s-1} {\Psi}_{i}(\sigma)\rho_{ij}(\sigma)\gamma_{j}(\sigma) \right]\\ &=&-h\left[ {\sum}_{i=0}^{s-1}{\sum}_{j\ge s} \underbrace{{\Psi}_{i}(\sigma)\rho_{ij}(\sigma)}_{=O(h^{j})}\gamma_{j}(\sigma) + {\sum}_{i\ge s}{\sum}_{j=0}^{s-1} {\Psi}_{i}(\sigma)\underbrace{\rho_{ij}(\sigma)\gamma_{j}(\sigma)}_{=O(h^{i})} + {\sum}_{i,j\ge s} {\Psi}_{i}(\sigma)\rho_{ij}(\sigma)\gamma_{j}(\sigma) \right]\\ &=&O(h^{2s+1}). \end{array} $$

Finally, we observe that the procedure (12)–(14) is equivalent to defining the path σ joining σ(0) = y0 to σ(h) = y1: consequently, the same procedure, when started at y0 and moving forward, provides y1 and, when started at y1 and moving backward, brings back y0. In other words, the following result holds true.

Theorem 3

The procedure (12)–(14) is symmetric.

Proof

This result comes as an easy consequence of Theorem 11, where the analogous property for the fully discretized method is shown. □

Remark 1

We conclude this section emphasizing that, when problem (1) is Hamiltonian, i.e., in the form (3), then the matrix B(y) ≡ J is constant and, therefore (see (6)), \(\rho_{ij}(\sigma) = \delta_{ij}J\). Consequently, (13) becomes

$$\sigma(ch) = y_{0}+h{\sum}_{j=0}^{s-1} {{\int}_{0}^{c}} P_{j}(x)\mathrm{d} x {{\int}_{0}^{1}} P_{j}(\tau)J\nabla H(\sigma(\tau h))\mathrm{d}\tau, \qquad c\in[0,1],$$

which (see [6, 8, 10]) is the so-called master functional equation defining the class of energy-conserving methods named Hamiltonian Boundary Value Methods (HBVMs). Consequently, when the problem is Hamiltonian, then the procedure (12)–(14) reduces to the HBVM\((\infty ,s)\) method in [8].

2.3 Conservation of Casimirs

In this section, we study the modifications required in order to conserve Casimirs, i.e., functions satisfying (2). For the sake of simplicity, we shall consider the case of a single Casimir; multiple Casimirs can be handled by slightly adapting the arguments, as sketched at the end of the section. To begin with, for the original problem (4), and its equivalent formulation (9), one has:

$$ \begin{array}{@{}rcl@{}} 0 &=& C(y(h))-C(y_{0}) = {{\int}_{0}^{h}} \nabla C(y(t))^{\top} \dot y(t)\mathrm{d} t = h{{\int}_{0}^{1}} \nabla C(y(ch))^{\top}\dot y(ch)\mathrm{d} c \\ &=& h{{\int}_{0}^{1}} \nabla C(y(ch))^{\top} B(y(ch))\nabla H(y(ch))\mathrm{d} c =h{{\int}_{0}^{1}} \nabla C(y(ch))^{\top}{\sum}_{i,j\ge0} P_{i}(c) \rho_{ij}(y)\gamma_{j}(y)\mathrm{d} c \\ &=& h{\sum}_{i,j\ge0}\left[ {{\int}_{0}^{1}} P_{i}(c)\nabla C(y(ch))\mathrm{d} c\right]^{\top} \rho_{ij}(y)\gamma_{j}(y) ~=:~ h{\sum}_{i,j\ge0} \pi_{i}(y)^{\top}\rho_{ij}(y)\gamma_{j}(y), \end{array} $$
(17)

having set

$$ \pi_{i}(y) = {{\int}_{0}^{1}} P_{i}(c)\nabla C(y(ch))\mathrm{d} c = O(h^{i}), $$
(18)

the i-th Fourier coefficient of the gradient of the Casimir, with the last equality following from Lemma 1. Clearly, again from (2), one derives, by taking into account (12):

$$ \begin{array}{@{}rcl@{}} \lefteqn{C(y_{1})-C(y_{0}) ~=~C(\sigma(h))-C(\sigma(0)) ~=~{{\int}_{0}^{h}} \nabla C(\sigma(t))^{\top} \dot \sigma(t)\mathrm{d} t ~=~ h{{\int}_{0}^{1}} \nabla C(\sigma(ch))^{\top}\dot \sigma(ch)\mathrm{d} c}\\ &=& h{{\int}_{0}^{1}} \nabla C(\sigma(ch))^{\top}\left[\dot \sigma(ch)-B(\sigma(ch))\nabla H(\sigma(ch))\right]\mathrm{d} c\\ &&+~\overbrace{h{{\int}_{0}^{1}} \nabla C(\sigma(ch))^{\top} B(\sigma(ch))\nabla H(\sigma(ch))\mathrm{d} c}^{=0}\\ &=&-h{{\int}_{0}^{1}} \nabla C(\sigma(ch))^{\top}\left[ {\sum}_{i,j\ge0} P_{i}(c) \rho_{ij}(\sigma)\gamma_{j}(\sigma) - {\sum}_{i,j=0}^{s-1} P_{i}(c) \rho_{ij}(\sigma)\gamma_{j}(\sigma)\right]\mathrm{d} c\\ &=& -h\left[ {\sum}_{i,j\ge s}\pi_{i}(\sigma)^{\top}\rho_{ij}(\sigma)\gamma_{j}(\sigma) + {\sum}_{i=0}^{s-1}{\sum}_{j\ge s}\underbrace{\pi_{i}(\sigma)^{\top}\rho_{ij}(\sigma)}_{=O(h^{j})}\gamma_{j}(\sigma)+ {\sum}_{j=0}^{s-1}{\sum}_{i\ge s}\pi_{i}(\sigma)^{\top}\underbrace{\rho_{ij}(\sigma)\gamma_{j}(\sigma)}_{=O(h^{i})}\right] \\&=&O(h^{2s+1}). \end{array} $$
(19)

In order to recover the conservation of Casimirs, we shall use a strategy akin to that used in [18] for HBVMs (see also [4]), i.e., suitably perturbing some of the method's coefficients. In more detail, let us consider the following modified polynomial in place of (12):

$$ \dot \sigma_{\alpha}(ch) = {\sum}_{i,j=0}^{s-1} P_{i}(c)\rho_{ij}(\sigma_{\alpha}) \gamma_{j}(\sigma_{\alpha}) - \alpha \tilde{B}\gamma_{0}(\sigma_{\alpha}), \qquad c\in[0,1], \qquad \sigma_{\alpha}(0)=y_{0}, $$
(20)

with \(\tilde{B}^{\top}=-\tilde{B}\ne O\) an arbitrary skew-symmetric matrix. As is usual, the new approximation will be \(y_{1}:=\sigma_{\alpha}(h)\). In other words, we have considered the following perturbed coefficient:

$$\rho_{00}(\sigma_{\alpha}) - \alpha \tilde{B},$$

in place of ρ00(σ) in (12). The following result holds true.

Theorem 4

Assume that \(\pi _{0}(\sigma _{\alpha })^{\top } \tilde {B}\gamma _{0}(\sigma _{\alpha })\ne 0\). Then the Casimir C(y) is conserved, provided that

$$ \alpha = \frac{{\sum}_{i,j=0}^{s-1} \pi_{i}(\sigma_{\alpha})^{\top} \rho_{ij}(\sigma_{\alpha})\gamma_{j}(\sigma_{\alpha})}{\pi_{0}(\sigma_{\alpha})^{\top} \tilde{B}\gamma_{0}(\sigma_{\alpha})}. $$
(21)

Moreover, \(\alpha = O(h^{2s})\).

Proof

In fact, by repeating similar steps as in (19), and replacing σ by σα, as defined in (20), one obtains:

$$ \begin{array}{@{}rcl@{}} \lefteqn{C(y_{1})-C(y_{0}) ~=~C(\sigma_{\alpha}(h))-C(\sigma_{\alpha}(0))~=~h{{\int}_{0}^{1}} \nabla C(\sigma_{\alpha}(ch))^{\top}\dot \sigma_{\alpha}(ch)\mathrm{d} c}\\ &=&h{{\int}_{0}^{1}} \nabla C(\sigma_{\alpha}(ch))^{\top}\left[{\sum}_{i,j=0}^{s-1} P_{i}(c) \rho_{ij}(\sigma_{\alpha})\gamma_{j}(\sigma_{\alpha})-\alpha \tilde{B}\gamma_{0}(\sigma_{\alpha})\right]\mathrm{d} c\\ &=& h\left[ {\sum}_{i,j=0}^{s-1} \pi_{i}(\sigma_{\alpha})^{\top}\rho_{ij}(\sigma_{\alpha})\gamma_{j}(\sigma_{\alpha})- \alpha \pi_{0}(\sigma_{\alpha})^{\top} \tilde{B}\gamma_{0}(\sigma_{\alpha})\right] ~=~0, \end{array} $$

provided that (21) holds true. The statement is completed by observing that the numerator is \(O(h^{2s})\), whereas the denominator is \(O(1)\). □

We now prove that the results of Theorems 1 and 2 continue to hold for the polynomial (20).

Theorem 5

For any α: \(H(\sigma_{\alpha}(h)) = H(\sigma_{\alpha}(0))\).

Proof

Following similar steps as in the proof of Theorem 1, one has:

$$ \begin{array}{@{}rcl@{}} \lefteqn{H(\sigma_{\alpha}(h))-H(\sigma_{\alpha}(0)) = {{\int}_{0}^{h}} \nabla H(\sigma_{\alpha}(t))^{\top}\dot\sigma_{\alpha}(t)\mathrm{d} t}\\ &=& h{{\int}_{0}^{1}} \nabla H(\sigma_{\alpha}(ch))^{\top}\dot\sigma_{\alpha}(ch)\mathrm{d} c~ = ~ h{{\int}_{0}^{1}} \nabla H(\sigma_{\alpha}(ch))^{\top} {\sum}_{i,j=0}^{s-1} P_{i}(c)\rho_{ij}(\sigma_{\alpha}) \gamma_{j}(\sigma_{\alpha})\mathrm{d} c\\ &&- h\alpha\left[{\int}_{0}^{1}\nabla H(\sigma_{\alpha}(ch)) \mathrm{d} c\right]^{\top} \tilde{B}\gamma_{0}(\sigma_{\alpha})\\ &=&h \underbrace{{\sum}_{i,j=0}^{s-1} \gamma_{i}(\sigma_{\alpha})^{\top}\rho_{ij}(\sigma_{\alpha}) \gamma_{j}(\sigma_{\alpha})}_{=0} - h\alpha \gamma_{0}(\sigma_{\alpha})^{\top} \tilde{B}\gamma_{0}(\sigma_{\alpha}) ~=~0, \end{array} $$

due to the fact that \(\tilde {B}\) is skew-symmetric, independently of the considered value of the parameter α.□

Theorem 6

Assume that the parameter α in (20) is chosen according to (21). Then,

$$\sigma_{\alpha}(h)-y(h)=O(h^{2s+1}).$$

Proof

Repeating similar steps as those in the proof of Theorem 2 (and using the same notation), and taking into account (20), one arrives at:

$$ \begin{array}{@{}rcl@{}} \lefteqn{\sigma_{\alpha}(h)-y(h)}\\ &=& -h\left[{\sum}_{i,j\ge0} {\Psi}_{i}(\sigma_{\alpha})\rho_{ij}(\sigma_{\alpha})\gamma_{j}(\sigma_{\alpha}) -{\sum}_{i,j=0}^{s-1} {\Psi}_{i}(\sigma_{\alpha})\rho_{ij}(\sigma_{\alpha})\gamma_{j}(\sigma_{\alpha}) +\alpha {\Psi}_{0}(\sigma_{\alpha})\tilde{B}\gamma_{0}(\sigma_{\alpha})\right]\\ &=&-h\left[ {\sum}_{i=0}^{s-1}{\sum}_{j\ge s} \underbrace{{\Psi}_{i}(\sigma_{\alpha})\rho_{ij}(\sigma_{\alpha})}_{=O(h^{j})}\gamma_{j}(\sigma_{\alpha}) + {\sum}_{i\ge s}{\sum}_{j=0}^{s-1} {\Psi}_{i}(\sigma_{\alpha})\underbrace{\rho_{ij}(\sigma_{\alpha})\gamma_{j}(\sigma_{\alpha})}_{=O(h^{i})}\right.\\ &&\left. + {\sum}_{i,j\ge s} {\Psi}_{i}(\sigma_{\alpha})\rho_{ij}(\sigma_{\alpha})\gamma_{j}(\sigma_{\alpha})+\underbrace{\alpha {\Psi}_{0}(\sigma_{\alpha})\tilde{B}\gamma_{0}(\sigma_{\alpha})}_{=O(h^{2s})} \right] ~=~O(h^{2s+1}). \end{array} $$

Remark 2

We observe that the modified polynomial \(\sigma_{\alpha}\) in (20) is the solution of the approximate perturbed ODE-IVP:

$$ \dot\sigma_{\alpha}(ch) = \left[\left( B(\sigma_{\alpha}(ch))+\alpha \tilde{B}\right)\left[\nabla H(\sigma_{\alpha}(ch))\right]_{s}\right]_{s}, \qquad c\in[0,1], \qquad \sigma_{\alpha}(0)=y_{0}, $$
(22)

where the parameter α is such that C(σα(h)) = C(σα(0)). Clearly, when α = 0, one recovers the problem (15) defining σ.

We end this section by sketching the case when we have r independent Casimirs, so that \(C:\mathbb {R}^{m}\rightarrow \mathbb {R}^{r}\). In such a case, the notation introduced above formally still holds true, with the following differences:

  • The Fourier coefficients (see (18)) now become matrices, \(\pi_{i}(\sigma_{\alpha})\in\mathbb{R}^{m\times r}\), \(i=0,\dots,s-1\);

  • The polynomial (20) now becomes

    $$ \dot \sigma_{\alpha}(ch) = {\sum}_{i,j=0}^{s-1} P_{i}(c)\rho_{ij}(\sigma_{\alpha}) \gamma_{j}(\sigma_{\alpha}) - {\sum}_{\ell=1}^{r}\alpha_{\ell} \tilde{B}_{\ell}\gamma_{0}(\sigma_{\alpha}), \qquad c\in[0,1], \qquad \sigma_{\alpha}(0)=y_{0}, $$
    (23)

    having set \(\alpha =\left (\alpha _{1}, \dots , \alpha _{r}\right )^{\top }\) and with \(\tilde {B}_{i}^{\top }=-\tilde {B}_{i}\), \(i=1,\dots ,r\), arbitrary skew-symmetric matrices such that

    $$ M:= \left[ \pi_{0}(\sigma_{\alpha})^{\top} \tilde{B}_{1}\gamma_{0}(\sigma_{\alpha}), \dots, \pi_{0}(\sigma_{\alpha})^{\top} \tilde{B}_{r}\gamma_{0}(\sigma_{\alpha})\right]\in\mathbb{R}^{r\times r} $$
    (24)

    is nonsingular;

  • The vector α, providing the conservation of all Casimirs, is given by (compare with (21))

    $$ \alpha = M^{-1} {\sum}_{i,j=0}^{s-1} \pi_{i}(\sigma_{\alpha})^{\top} \rho_{ij}(\sigma_{\alpha})\gamma_{j}(\sigma_{\alpha}). $$
    (25)

3 Discretization

The procedure (12)–(14) described in the previous section is not yet a ready-to-use numerical method. In fact, for this to happen, the integrals \(\gamma_{j}(\sigma),\rho_{ij}(\sigma)\), \(i,j=0,\dots,s-1\), defined in (6) need to be conveniently computed or approximated. For this purpose, as was done in the case of HBVMs [8], we shall use a Gauss-Legendre quadrature formula of order 2k, i.e., the interpolatory quadrature rule based on the zeros of \(P_{k}(c)\), with abscissae and weights \((c_{i},b_{i})\), for a convenient value \(k\ge s\). In so doing, we shall in general obtain a new polynomial approximation \(u\in{\Pi}_{s}\), in place of σ as defined in (12)–(13):

$$ \begin{array}{@{}rcl@{}} \dot u(ch) ={\sum}_{i,j=0}^{s-1}P_{i}(c)\hat\rho_{ij}(u)\hat\gamma_{j}(u),&& u(ch) = y_{0} + h{\sum}_{i,j=0}^{s-1}{{\int}_{0}^{c}} P_{i}(x)\mathrm{d} x\hat\rho_{ij}(u)\hat\gamma_{j}(u), \qquad c\in[0,1],\\ \\ \hat\gamma_{j}(u) = {\sum}_{\ell=1}^{k} b_{\ell} P_{j}(c_{\ell})\nabla H(u(c_{\ell} h)),&& \hat\rho_{ij}(u) = {\sum}_{\ell=1}^{k} b_{\ell} P_{i}(c_{\ell})P_{j}(c_{\ell})B(u(c_{\ell} h)),\quad i,j=0,\dots,s-1. \end{array} $$
(26)

Consequently, the new approximation to y(h) will be given by

$$ y_{1}:=u(h) = y_{0} + h{\sum}_{j=0}^{s-1}\hat\rho_{0j}(u) \hat\gamma_{j}(u) \equiv y_{0} + h{\sum}_{j=0}^{s-1} {\sum}_{\ell=1}^{k} b_{\ell} P_{j}(c_{\ell})B(u(c_{\ell} h)) {\sum}_{\ell=1}^{k} b_{\ell} P_{j}(c_{\ell}) \nabla H(u(c_{\ell} h)), $$
(27)

which is the discrete counterpart of (14).
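Before addressing the solution of the implicit relations (26), we note that the quadrature formulas themselves are straightforward to code. The following minimal Python sketch (our own illustration; the stage values `U` are taken as an input here, whereas in practice they solve the nonlinear system studied in Section 4) evaluates \(\hat\gamma_{j}(u)\) and \(\hat\rho_{ij}(u)\):

```python
import numpy as np
from numpy.polynomial.legendre import leggauss, legval

def P(j, x):
    # Orthonormal shifted Legendre polynomial of degree j on [0,1], cf. (5).
    cf = np.zeros(j + 1); cf[j] = 1.0
    return np.sqrt(2*j + 1) * legval(2*np.asarray(x) - 1.0, cf)

def hat_coefficients(B, gradH, U, c, b, s):
    # hat_gamma_j(u) and hat_rho_ij(u) in (26), given the values U[l] = u(c_l*h)
    # of the polynomial u at the Gauss-Legendre abscissae c with weights b.
    k = len(c)
    gH = [gradH(Ul) for Ul in U]
    BU = [B(Ul) for Ul in U]
    hat_gamma = [sum(b[l] * P(j, c[l]) * gH[l] for l in range(k))
                 for j in range(s)]
    hat_rho = [[sum(b[l] * P(i, c[l]) * P(j, c[l]) * BU[l] for l in range(k))
                for j in range(s)] for i in range(s)]
    # The new approximation (27) would then read:
    #   y1 = y0 + h * sum(hat_rho[0][j] @ hat_gamma[j] for j in range(s))
    return hat_gamma, hat_rho
```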

It is worth mentioning that, in a similar way as was done in Section 2.1 for the polynomial σ, for u one obtains, by virtue of (26):

$$ \begin{array}{@{}rcl@{}} \dot u(ch) &=& {\sum}_{i,j=0}^{s-1}P_{i}(c)\hat\rho_{ij}(u)\hat\gamma_{j}(u)\\ &=& {\sum}_{i=0}^{s-1} P_{i}(c) {\sum}_{j=0}^{s-1} {\sum}_{\ell=1}^{k} b_{\ell} P_{i}(c_{\ell}) B(u(c_{\ell} h))P_{j}(c_{\ell}) {\sum}_{\ell_{1}=1}^{k} b_{\ell_{1}}P_{j}(c_{\ell_{1}})\nabla H(u(c_{\ell_{1}}h))\\ &=& {\sum}_{i=0}^{s-1} P_{i}(c) {\sum}_{\ell=1}^{k} b_{\ell} P_{i}(c_{\ell}) B(u(c_{\ell} h)){\sum}_{j=0}^{s-1}P_{j}(c_{\ell}) {\sum}_{\ell_{1}=1}^{k} b_{\ell_{1}} P_{j}(c_{\ell_{1}})\nabla H(u(c_{\ell_{1}}h))\\ &=:& {\sum}_{i=0}^{s-1} P_{i}(c) {\sum}_{\ell=1}^{k} b_{\ell} P_{i}(c_{\ell}) B(u(c_{\ell} h)) [\nabla H(u(c_{\ell} h))]_{s}^{(2k)}\\ &\equiv&\left[ B(u(ch)) [\nabla H(u(ch))]_{s}^{(2k)}\right]_{s}^{(2k)}, \end{array} $$

having denoted by \([\cdot]_{s}^{(2k)}\) the approximate best approximation in \({\Pi}_{s-1}\) obtained by using a quadrature of order 2k to approximate the involved integrals. Consequently (compare with (15)), the polynomial u is the solution of the initial value problem:

$$ \dot u(ch) = \left[B(u(ch))\left[\nabla H(u(ch))\right]_{s}^{(2k)}\right]_{s}^{(2k)}, \qquad c\in[0,1], \qquad u(0)=y_{0}. $$
(28)

Remark 3

We observe that the polynomial approximation defined by the problem

$$\dot u(ch) = \left[B(u(ch))\left[\nabla H(u(ch))\right]_{s}\right]_{s}^{(2s)}, \qquad c\in[0,1], \qquad u(0)=y_{0},$$

corresponds to that provided by the methods in [19], when the Gauss-Legendre abscissae are used, and to the methods in [26, Definition 3.2], upon selecting the derivative space as \({\Pi}_{s-1}\). Similarly, some of the methods in [1] provide the approximation

$$\dot u(ch) = \left[B(u(ch))\left[\nabla H(u(ch))\right]_{s}^{(2k)}\right]_{s}^{(2s)}, \qquad c\in[0,1], \qquad u(0)=y_{0}.$$

Remark 4

When in (28) B(u(ch)) ≡ J, a constant skew-symmetric matrix, we recover the polynomial approximation provided by a HBVM(k,s) method:

$$ u(ch) = y_{0} + h{{\int}_{0}^{c}} \left[J\nabla H(u(\tau h))\right]_{s}^{(2k)}\mathrm{d}\tau, \qquad c\in[0,1]. $$
(29)

3.1 Analysis

As was done in Section 2.2 for the continuous procedure, let us now analyze the fully discrete method (26)–(27). To begin with, the following straightforward result holds true.

Theorem 7

If

$$ B\in{\Pi}_{\mu},\quad H\in{\Pi}_{\nu}, \qquad \text{with}\qquad \mu\le \frac{2k+1}s-2, \quad \nu\le \frac{2k}s, $$
(30)

then (see (26) and (6)),

$$ \hat\rho_{ij}(u) = \rho_{ij}(u), \qquad \hat\gamma_{j}(u)=\gamma_{j}(u), \qquad \forall i,j=0,\dots,s-1, $$
(31)

and, consequently, with reference to (26) and (13), one has \(u\equiv\sigma\).

Proof

In fact, if B is a polynomial of degree μ and H a polynomial of degree ν, the integrand defining \(\rho_{ij}(u)\) has degree at most \(\mu s+2s-2\), whereas that defining \(\gamma_{j}(u)\) has degree at most \(\nu s-1\). Consequently, these degrees do not exceed 2k − 1, when (30) holds true. As a result, the quadrature is exact, so that (31) is valid and, therefore, \(u\equiv\sigma\).□

Consequently, when (30) holds true, the method is energy-conserving and has order 2s, as stated by Theorems 1 and 2, respectively.
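As a worked instance, for the Lotka-Volterra problem (62) of Section 5 the entries of B(y) are \(\pm y_{1}y_{2}\), so that \(B\in{\Pi}_{2}\): the first condition in (30) then reads \(2\le(2k+1)/s-2\), i.e., \(k\ge(4s-1)/2\), which is satisfied by the pairs (k,s) = (4,1), (4,2), and (6,3) used in the numerical tests. On the other hand, H in (62) is not a polynomial (it contains logarithms), so that \(\hat\gamma_{j}(u)\ne\gamma_{j}(u)\) in general, and only the practical energy conservation granted by Theorem 9 below can be expected.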

Concerning energy conservation, the following additional result holds true, in the case where only H is a polynomial.

Theorem 8

If

$$ H\in{\Pi}_{\nu}, \qquad \text{with}\qquad \nu\le \frac{2k}s, $$
(32)

then H(y1) = H(y0).

Proof

In fact, in such a case \(\gamma_{j}(u)=\hat\gamma_{j}(u)\), \(j=0,\dots,s-1\), and the proof of Theorem 1 formally continues to hold, upon replacing σ with u, and \(\rho_{ij}\) with \(\hat\rho_{ij}\), due to the fact that (compare with (8))

$$ \hat\rho_{ij}(u) = \hat\rho_{ji}(u) = -\hat\rho_{ij}(u)^{\top}, \qquad \forall i,j=0,1,\dots,s-1. $$
(33)

When (31) does not hold true, there is a quadrature error that, under suitable regularity assumptions, can be easily seen to be given by (see (6)):

$$ \begin{array}{@{}rcl@{}} \hat\rho_{ij}(u) - \rho_{ij}(u) &=& \chi_{ij}(h) ~\equiv~ O(h^{2k-i-j}), \\ \hat\gamma_{j}(u)-\gamma_{j}(u) &=& {\Delta}_{j}(h)~\equiv~ O(h^{2k-j}), \qquad \forall i,j=0,\dots,s-1. \end{array} $$
(34)

Nonetheless, also in this case it is straightforward to verify that (compare with (7)),

$$ \forall k\ge s: \qquad \hat\gamma_{j}(u)=O(h^{j}), \qquad \hat\rho_{ij}(u)=O(h^{|i-j|}), \qquad \forall i,j=0,1,\dots,s-1. $$
(35)

Consequently, with reference to the approximation y1 defined in (27), the following result is easily obtained, when (32) is not valid.

Theorem 9

\(\forall k\ge s\): \(H(y_{1}) = H(y_{0}) + O(h^{2k+1})\).

Proof

In fact, using arguments similar to those used in the proof of Theorem 1, one has, by taking into account (33)–(35):

$$ \begin{array}{@{}rcl@{}} \lefteqn{H(y_{1})-H(y_{0})~=~ H(u(h))-H(u(0)) = {{\int}_{0}^{h}} \nabla H(u(t))^{\top}\dot u(t)\mathrm{d} t}\\ &=& h{{\int}_{0}^{1}} \nabla H(u(ch))^{\top}\dot u(ch)\mathrm{d} c~=~ h{{\int}_{0}^{1}} \nabla H(u(ch))^{\top} {\sum}_{i,j=0}^{s-1} P_{i}(c)\hat\rho_{ij}(u) \hat\gamma_{j}(u)\mathrm{d} c\\ &=&h {\sum}_{i,j=0}^{s-1} \underbrace{\left[{\int}_{0}^{1} P_{i}(c)\nabla H(u(ch))\mathrm{d} c\right]^{\top}}_{=\gamma_{i}(u)^{\top}}\hat\rho_{ij}(u) \left[\gamma_{j}(u)+{\Delta}_{j}(h)\right] \\ &=& h {\sum}_{i,j=0}^{s-1} \underbrace{\gamma_{i}(u)^{\top}\hat\rho_{ij}(u) \gamma_{j}(u)}_{=0}~+~h {\sum}_{i,j=0}^{s-1} \gamma_{i}(u)^{\top}\hat\rho_{ij}(u) {\Delta}_{j}(h)\\ &=&h \left[{\sum}_{j=0}^{s-1} {\sum}_{i=j}^{s-1} \underbrace{\gamma_{i}(u)^{\top}\hat\rho_{ij}(u)}_{=O(h^{2i-j})}{\Delta}_{j}(h) + {\sum}_{j=0}^{s-1} {\sum}_{i=0}^{j-1} \gamma_{i}(u)^{\top}\underbrace{\hat\rho_{ij}(u){\Delta}_{j}(h)}_{=O(h^{2k-i})}\right] ~=~O(h^{2k+1}). \end{array} $$

Concerning the accuracy of the approximation (27), the following result, stating that the convergence order of Theorem 2 is retained, holds true.

Theorem 10

\(\forall k\ge s\): \(y_{1}-y(h) = O(h^{2s+1})\).

Proof

By using arguments and notations similar to those used in the proof of Theorem 2, one has, by taking into account (34) and that \(k\ge s\):

$$ \begin{array}{@{}rcl@{}} \lefteqn{ y_{1}-y(h) ~=~u(h)-y(h) = y(h,h,u(h))-y(h,0,u(0)) = {{\int}_{0}^{h}} \frac{\mathrm{d}}{\mathrm{d} t} y(h,t,u(t))\mathrm{d} t}\\ &=&{{\int}_{0}^{h}} \left[\left.\frac{\partial}{\partial \xi} y(h,\xi,u(t))\right|_{\xi=t} + \left.\frac{\partial}{\partial \eta} y(h,t,\eta)\right|_{\eta=u(t)} \dot u(t)\right] \mathrm{d} t\\ &=& {{\int}_{0}^{h}}\left[ -{\Phi}(h,t,{ u(t)})F(u(t))+{\Phi}(h,t,{ u(t)})\dot u(t)\right]\mathrm{d} t \\ &=& -h{{\int}_{0}^{1}} {\Phi}(h,ch,{ u(ch)})\left[ F(u(ch))-\dot u(ch)\right]\mathrm{d} c\\ &=&-h{{\int}_{0}^{1}} {\Phi}(h,ch,{ u(ch)})\left[ B(u(ch))\nabla H(u(ch)) - {\sum}_{i,j=0}^{s-1} P_{i}(c)\hat\rho_{ij}(u)\hat\gamma_{j}(u)\right]\mathrm{d} c\\ \end{array} $$
$$ \begin{array}{@{}rcl@{}} &=&\underbrace{-h{{\int}_{0}^{1}} {\Phi}(h,ch,{ u(ch)})\left[ {\sum}_{i,j\ge0} P_{i}(c)\rho_{ij}(u)\gamma_{j}(u)- {\sum}_{i,j=0}^{s-1} P_{i}(c)\rho_{ij}(u)\gamma_{j}(u)\right]\mathrm{d} c}_{=O(h^{2s+1}), \text{\small ~from the proof of Theorem~2}}\\ &&-h{{\int}_{0}^{1}} {\Phi}(h,ch,{ u(ch)})\left[ {\sum}_{i,j=0}^{s-1} P_{i}(c)\rho_{ij}(u)\gamma_{j}(u)- {\sum}_{i,j=0}^{s-1} P_{i}(c)\hat\rho_{ij}(u)\hat\gamma_{j}(u)\right]\mathrm{d} c\\ &=& O(h^{2s+1}) -h{\sum}_{i,j=0}^{s-1}\underbrace{{{\int}_{0}^{1}} P_{i}(c) {\Phi}(h,ch,{ u(ch)})\mathrm{d} c}_{={\Psi}_{i}(u)}\left[ \rho_{ij}(u)\gamma_{j}(u)-\hat\rho_{ij}(u)\hat\gamma_{j}(u)\right]\\ &=& O(h^{2s+1}) -h{\sum}_{i,j=0}^{s-1} {\Psi}_{i}(u)\left[ \rho_{ij}(u)\gamma_{j}(u)-\left( \rho_{ij}(u)+\chi_{ij}(h)\right)\left( \gamma_{j}(u)+{\Delta}_{j}(h)\right)\right]\\ &=& O(h^{2s+1}) +h{\sum}_{i,j=0}^{s-1} {\Psi}_{i}(u)\left[ \rho_{ij}(u){\Delta}_{j}(h) +\underbrace{\chi_{ij}(h)\gamma_{j}(u)}_{=O(h^{2k-i})}+\underbrace{\chi_{ij}(h){\Delta}_{j}(h)}_{=O(h^{4k-2j-i})}\right]\\ &=&O(h^{2s+1}) + O(h^{2k+1}) + h{\sum}_{i,j=0}^{s-1} {\Psi}_{i}(u)\rho_{ij}(u){\Delta}_{j}(h) ~=~O(h^{2s+1}) + h{\sum}_{i,j=0}^{s-1} {\Psi}_{i}(u)\rho_{ij}(u){\Delta}_{j}(h). \end{array} $$

Concerning the latter sum, one has:

$$ \begin{array}{@{}rcl@{}} h{\sum}_{i,j=0}^{s-1} {\Psi}_{i}(u)\rho_{ij}(u){\Delta}_{j}(h) &=&h{\sum}_{i=0}^{s-1}{\sum}_{j=i}^{s-1} \underbrace{{\Psi}_{i}(u)\rho_{ij}(u)}_{=O(h^{j})}{\Delta}_{j}(h) ~+~h{\sum}_{i=0}^{s-1}{\sum}_{j=0}^{i-1} {\Psi}_{i}(u)\underbrace{\rho_{ij}(u){\Delta}_{j}(h)}_{=O(h^{2k-2j+i})}\\&=&O(h^{2k+1}) + O(h^{2k+2}) ~=~ O(h^{2k+1}), \end{array} $$

and, consequently, the statement follows.□

Remark 5

By taking into account the results of Theorems 7 and 9, it follows that, by choosing k large enough, one obtains either the exact conservation of the Hamiltonian function, in the polynomial case, or its practical conservation, in the non-polynomial case. In fact, as has also been observed for HBVMs [11], in the latter case it is enough to choose k large enough so that the Hamiltonian error falls within the round-off error level of the finite precision arithmetic used in the simulation.

It is worth mentioning that a result similar to that of Theorem 3 holds true for the fully discrete method.

Theorem 11

The method (26)–(27) is symmetric, provided that the abscissae of the quadrature satisfy

$$ c_{k-i+1} = 1-c_{i}, \qquad i=1,\dots,k. $$
(36)

Proof

Symmetry of a one-step method \(y_{1}={\Phi}_{h}(y_{0})\) applied to an initial value problem \(y^{\prime}=f(y)\), \(y(0)=y_{0}\), means that \({\Phi}_{h}^{-1}={\Phi}_{-h}\), that is, applying the method to the state vector \(y_{1}\), but with the direction of time reversed, yields the initial state vector \(y_{0}\), independently of the choice of \(y_{0}\). In our context, with reference to (26)–(27), \({\Phi}_{h}\) is defined by

$$ y_{1} = y_{0} + h{\sum}_{j=0}^{s-1}\hat\rho_{0j} \hat\gamma_{j}, $$
(37)

where \(\hat \rho _{ij}\) and \(\hat \gamma _{j}\) are the solutions of the nonlinear system

$$ \begin{array}{@{}rcl@{}} \hat\gamma_{j} &=& {\sum}_{\ell=1}^{k} b_{\ell} P_{j}(c_{\ell})\nabla H\left( y_{0} + h{\sum}_{\mu,\nu=0}^{s-1}{\int}_{0}^{c_{\ell}} P_{\mu}(x)\mathrm{d} x\hat\rho_{\mu\nu}\hat\gamma_{\nu} \right), \\ \hat\rho_{ij} &=& {\sum}_{\ell=1}^{k} b_{\ell} P_{i}(c_{\ell})P_{j}(c_{\ell})B\left( y_{0} + h{\sum}_{\mu,\nu=0}^{s-1}{\int}_{0}^{c_{\ell}} P_{\mu}(x)\mathrm{d} x\hat\rho_{\mu\nu}\hat\gamma_{\nu} \right),\\ &&i,j=0,\dots,s-1. \end{array} $$
(38)

We can obtain the explicit formulation of \(\bar y_{0}={\Phi }_{-h}(y_{1})\) by introducing in (37)–(38) the following substitutions: y1 in place of y0 and − h in place of h. In so doing, we arrive at the method defined as

$$ \bar y_{0} = y_{1} - h{\sum}_{j=0}^{s-1}\bar\rho_{0j} \bar\gamma_{j}, $$
(39)

where the unknown quantities \(\bar \rho _{ij}\) and \(\bar \gamma _{j}\) satisfy the following nonlinear system

$$ \begin{array}{@{}rcl@{}} \bar\gamma_{j} &=& {\sum}_{\ell=1}^{k} b_{\ell} P_{j}(c_{\ell})\nabla H\left( y_{1} - h{\sum}_{\mu,\nu=0}^{s-1}{\int}_{0}^{c_{\ell}} P_{\mu}(x)\mathrm{d} x\bar\rho_{\mu\nu}\bar\gamma_{\nu}\right), \\ \displaystyle \bar\rho_{ij} &=& {\sum}_{\ell=1}^{k} b_{\ell} P_{i}(c_{\ell})P_{j}(c_{\ell})B\left( y_{1} - h{\sum}_{\mu,\nu=0}^{s-1}{\int}_{0}^{c_{\ell}} P_{\mu}(x)\mathrm{d} x\bar\rho_{\mu\nu}\bar\gamma_{\nu}\right), \\ &&i,j=0,\dots,s-1, \end{array} $$
(40)

and we want to show that \(\bar y_{0}=y_{0}\). To this end, we introduce the following variables:

$$ \gamma_{j}^{\ast} := (-1)^{j} \hat \gamma_{j}, \quad \rho_{ij}^{\ast} := (-1)^{i+j} \hat \rho_{ij}, \quad i,j=0,\dots,s-1. $$

Exploiting the symmetry property of the Legendre polynomials, \((-1)^{j}P_{j}(c) = P_{j}(1-c)\), from the first equation in (38) we get

$$ \begin{array}{@{}rcl@{}} \gamma_{j}^{\ast} & = & \displaystyle {\sum}_{\ell=1}^{k} b_{\ell} (-1)^{j}P_{j}(c_{\ell})\nabla H\left( y_{0} + h{\sum}_{\mu,\nu=0}^{s-1}\left( {{\int}_{0}^{1}} P_{\mu}(x){ \mathrm{d} x} - {\int}_{c_{\ell}}^{1} P_{\mu}(x)\mathrm{d} x \right)\hat\rho_{\mu\nu}\hat\gamma_{\nu} \right) \\ &=& {\sum}_{\ell=1}^{k} b_{\ell} P_{j}(1-c_{\ell})\nabla H\left( y_{1} - h{\sum}_{\mu,\nu=0}^{s-1} {\int}_{c_{\ell}}^{1} P_{\mu}(x)\mathrm{d} x \hat\rho_{\mu\nu}\hat\gamma_{\nu} \right). \end{array} $$

Introducing the change of variable \(\tau = 1-x\) transforms the latter integral as

$$ {\int}_{c_{\ell}}^{1} P_{\mu}(x)\mathrm{d} x = -{\int}_{1-c_{\ell}}^{0} P_{\mu}(1-\tau)\mathrm{d} \tau = (-1)^{\mu} {\int}_{0}^{1-c_{\ell}} P_{\mu}(\tau)\mathrm{d} \tau. $$

Exploiting the symmetry assumption (36), which in turn implies symmetric weights, \(b_{k-i+1} = b_{i}\), \(i=1,\dots,k\), we finally get

$$ \begin{array}{@{}rcl@{}} \gamma_{j}^{\ast} &=& \displaystyle {\sum}_{\ell=1}^{k} b_{k-\ell+1} P_{j}(c_{k-\ell+1})\nabla H\left( y_{1} - h{\sum}_{\mu,\nu=0}^{s-1} {\int}_{0}^{c_{k-\ell+1}} P_{\mu}(x)\mathrm{d} x (-1)^{\mu} (-1)^{\nu} \hat\rho_{\mu\nu} (-1)^{\nu} \hat \gamma_{\nu} \right)\\ &=& {\sum}_{\ell=1}^{k} b_{\ell} P_{j}(c_{\ell})\nabla H\left( y_{1} - h{\sum}_{\mu,\nu=0}^{s-1} {\int}_{0}^{c_{\ell}} P_{\mu}(x)\mathrm{d} x \rho^{\ast}_{\mu\nu} \gamma^{\ast}_{\nu} \right). \end{array} $$

The same chain of computations may be applied to the second equation in (38) to see that

$$ \displaystyle \rho^{\ast}_{ij} = {\sum}_{\ell=1}^{k} b_{\ell} P_{i}(c_{\ell})P_{j}(c_{\ell})B\left( y_{1} - h{\sum}_{\mu,\nu=0}^{s-1}{\int}_{0}^{c_{\ell}} P_{\mu}(x)\mathrm{d} x\rho^{\ast}_{\mu\nu}\gamma^{\ast}_{\nu}\right). $$

We then realize that \(\gamma _{j}^{\ast }\) and \(\rho ^{\ast }_{ij}\) satisfy the very same nonlinear system (40) governing the quantities \(\bar \gamma _{j}\) and \(\bar \rho _{ij}\). Thus we may conclude that

$$ \bar \gamma_{j}= (-1)^{j} \hat \gamma_{j}, \quad \bar \rho_{ij}= (-1)^{i+j} \hat \rho_{ij}, \quad i,j=0,\dots,s-1, $$

and hence from (39),

$$ \bar y_{0} = y_{1} - h{\sum}_{j=0}^{s-1}(-1)^{j} \hat \rho_{0j} (-1)^{j} \hat \gamma_{j} = y_{1} - h{\sum}_{j=0}^{s-1} \hat \rho_{0j} \hat \gamma_{j} =y_{0}. $$

In the limit case \(k\rightarrow \infty\), since the Gauss-Legendre quadrature formulae are convergent, we are led back to the symmetry property of the original non-discretized procedure (12)–(14), anticipated in Theorem 3.□

We conclude this section by emphasizing that, when problem (1) is in the form (3), then (see (26)) \(\hat\rho_{ij}(u) = \delta_{ij}J\) and, consequently, the polynomial approximation u becomes

$$u(ch) = y_{0}+h{\sum}_{j=0}^{s-1} {{\int}_{0}^{c}} P_{j}(x)\mathrm{d} x {\sum}_{i=1}^{k} b_{i} P_{j}(c_{i})J\nabla H(u(c_{i} h)), \qquad c\in[0,1],$$

which is equivalent to (29). As anticipated in Remark 4, this equation (see, e.g., the monograph [5] or the review paper [6, Section 2.2]) defines a Hamiltonian Boundary Value Method with parameters k and s, in short HBVM(k,s). Moreover, Theorems 7–10 also exactly describe, in the case of problem (3), their conservation and accuracy properties. For this reason, the methods presented here can be regarded as a generalization of HBVMs for solving Poisson problems, and this fact motivates the following definition.

Definition 1

We shall refer to the method (26)–(27) as a PHBVM(k,s) method.

3.2 Conservation of Casimirs

This section is devoted to the conservation of Casimirs for the discrete PHBVM(k,s) method (26)–(27), following similar steps as those in Section 2.3. For this purpose, it is convenient to define the approximate Fourier coefficients, for \(k\ge s\), which we assume hereafter:

$$ \hat\pi_{i}(\sigma) = {\sum}_{\ell=1}^{k} b_{\ell} P_{i}(c_{\ell})\nabla C(\sigma(c_{\ell} h)) = O(h^{i}), \qquad i=0,\dots,s-1. $$
(41)

The following straightforward result is reported without proof.

Lemma 3

Assume that \(\sigma\in{\Pi}_{s}\). Then, with reference to (18), for all \(i=0,\dots,s-1\), one has:

$$ \begin{array}{@{}rcl@{}} \hat\pi_{i}(\sigma) &=& \pi_{i}(\sigma), \qquad \text{if} \qquad C\in{\Pi}_{\nu},\quad \nu\le 2k/s,\\ \hat\pi_{i}(\sigma) &=& \pi_{i}(\sigma) - \theta_{i}(h), \qquad \theta_{i}(h)=O(h^{2k-i}), \qquad \text{otherwise.} \end{array} $$

Next, let us consider the following perturbed polynomial, in place of the polynomial u in (26),

$$ \dot u_{\alpha}(ch) ={\sum}_{i,j=0}^{s-1}P_{i}(c)\hat\rho_{ij}(u_{\alpha})\hat\gamma_{j}(u_{\alpha}) -\alpha \tilde{B}\hat{\gamma}_{0}(u_{\alpha}), \qquad c\in[0,1], \qquad u_{\alpha}(0)=y_{0}, $$
(42)

with \(\tilde{B}^{\top}=-\tilde{B}\ne O\) an arbitrary skew-symmetric matrix, and (compare with (21))

$$ \alpha = \frac{{\sum}_{i,j=0}^{s-1} \hat\pi_{i}(u_{\alpha})^{\top} \hat\rho_{ij}(u_{\alpha})\hat\gamma_{j}(u_{\alpha})}{\hat\pi_{0}(u_{\alpha})^{\top} \tilde{B}\hat\gamma_{0}(u_{\alpha})} = O(h^{2s}). $$
(43)

Theorem 12

If

$$ B\in{\Pi}_{\mu},\quad C,H\in{\Pi}_{\nu}, \qquad \text{with}\qquad \mu\le \frac{2k+1}s-2, \quad \nu\le \frac{2k}s, $$
(44)

then (see (26), (6), (18), and (41))

$$ \hat\rho_{ij}(u_{\alpha}) = \rho_{ij}(u_{\alpha}), \qquad \hat\gamma_{j}(u_{\alpha})=\gamma_{j}(u_{\alpha}), \qquad \hat\pi_{i}(u_{\alpha})=\pi_{i}(u_{\alpha}),\qquad \forall i,j=0,\dots,s-1, $$
(45)

and, consequently, with reference to (42) and (20), one has \(u_{\alpha}\equiv\sigma_{\alpha}\).

Otherwise, setting \(y_{1}:=u_{\alpha}(h)\) as the new approximation, Theorems 8, 9, 10, and 11 formally continue to hold. Moreover, the following result holds true.

Theorem 13

With reference to (42)–(43), one has:

$$ C(y_{1})-C(y_{0})=\left\{\begin{array}{cc} 0, &\text{if}\quad C\in{\Pi}_{\nu}, \quad\text{with}\quad \nu\le 2k/s,\\ O(h^{2k+1}), &\text{otherwise}. \end{array}\right.$$

Proof

By taking into account Lemma 3, one obtains:

$$ \begin{array}{@{}rcl@{}} \lefteqn{C(y_{1})-C(y_{0}) ~=~ h{{\int}_{0}^{1}} \nabla C(u_{\alpha}(ch))^{\top} \dot u_{\alpha}(ch)\mathrm{d} c}\\ &=& h{{\int}_{0}^{1}} \nabla C(u_{\alpha}(ch))^{\top}\left[ {\sum}_{i,j=0}^{s-1} P_{i}(c)\hat\rho_{ij}(u_{\alpha})\hat\gamma_{j}(u_{\alpha}) - \alpha \tilde{B}\hat\gamma_{0}(u_{\alpha})\right]\mathrm{d} c\\ &=& h\left[{\sum}_{i,j=0}^{s-1} \left( {{\int}_{0}^{1}} \nabla C(u_{\alpha}(ch))P_{i}(c)\mathrm{d} c\right)^{\top} \hat\rho_{ij}(u_{\alpha})\hat\gamma_{j}(u_{\alpha}) - \alpha \left( {{\int}_{0}^{1}} \nabla C(u_{\alpha}(ch))P_{0}(c)\mathrm{d} c\right)^{\top} \tilde{B}\hat\gamma_{0}(u_{\alpha})\right]\\ &=& h\left[{\sum}_{i,j=0}^{s-1} \pi_{i}(u_{\alpha})^{\top} \hat\rho_{ij}(u_{\alpha})\hat\gamma_{j}(u_{\alpha}) - \alpha \pi_{0}(u_{\alpha})^{\top} \tilde{B}\hat\gamma_{0}(u_{\alpha})\right]~=:~(*). \end{array} $$

In case \(C\in{\Pi}_{\nu}\) with \(\nu\le 2k/s\), then, by virtue of Lemma 3, \(\pi_{i}(u_{\alpha})=\hat\pi_{i}(u_{\alpha})\) and, consequently, (∗) = 0 because of (43). Otherwise, again from Lemma 3, one has:

$$ \begin{array}{@{}rcl@{}} (*)&=& h\left[{\sum}_{i,j=0}^{s-1} \left[\hat\pi_{i}(u_{\alpha})+\theta_{i}(h)\right]^{\top} \hat\rho_{ij}(u_{\alpha})\hat\gamma_{j}(u_{\alpha}) - \alpha \left[\hat\pi_{0}(u_{\alpha})+\theta_{0}(h)\right]^{\top} \tilde{B}\hat\gamma_{0}(u_{\alpha})\right]\\ &=& h\left[\underbrace{{\sum}_{i,j=0}^{s-1} \theta_{i}(h)^{\top} \hat\rho_{ij}(u_{\alpha})\hat\gamma_{j}(u_{\alpha})}_{=O(h^{2k})} - \underbrace{\alpha \theta_{0}(h)^{\top} \tilde{B}\hat\gamma_{0}(u_{\alpha})}_{=O(h^{2k+2s})}\right] ~=~O(h^{2k+1}). \end{array} $$

Remark 6

From the arguments exposed here, one deduces that, when using finite precision arithmetic, an (at least) practical conservation of both the Hamiltonian and the Casimirs can be obtained by choosing k large enough.

Definition 2

Following [18], we name enhanced PHBVM(k,s), in short EPHBVM(k,s), the method defined by (42)–(43).

Following Remark 2, the polynomial uα defined by an EPHBVM(k,s) method is the solution of the ODE-IVP (compare with (22)):

$$ \dot u_{\alpha}(ch) = \left[\left( B(u_{\alpha}(ch))+\alpha \tilde{B}\right)\left[\nabla H(u_{\alpha}(ch))\right]_{s}^{(2k)}\right]_{s}^{(2k)}, \quad c\in[0,1], \quad u_{\alpha}(0)=y_{0}. $$

Clearly, when α = 0 one retrieves the problem (28) defining the polynomial approximation of the PHBVM(k,s) method.

Remark 7

From the results of Theorems 8, 9, and 13, one deduces the clear advantage of choosing values of k suitably larger than s, in order to obtain a suitable conservation of non-quadratic Hamiltonians and/or Casimirs. This fact will be duly confirmed in the numerical tests. Moreover, this is not a serious drawback from the point of view of the computational cost since, as we shall see in the next section, the discrete problem to be solved will always have (block) dimension s, independently of k.

4 The discrete problem

In this section we deal with the efficient solution of the discrete problem generated by the PHBVM(k,s) method (26). For this purpose, we observe that only the values \(Y_{\ell} := u(c_{\ell} h)\), \(\ell=1,\dots,k\), are actually needed. Consequently, (26) can be rewritten as:

$$ \begin{array}{@{}rcl@{}} Y_{\ell} &=& y_{0} + h{\sum}_{i,j=0}^{s-1}{\int}_{0}^{c_{\ell}} P_{i}(x)\mathrm{d} x\hat\rho_{ij}\hat\gamma_{j}, \qquad \ell = 1,\dots,k,\\ \hat\gamma_{j} &=& {\sum}_{\ell=1}^{k} b_{\ell} P_{j}(c_{\ell})\nabla H(Y_{\ell}),\qquad \hat\rho_{ij} ~=~ {\sum}_{\ell=1}^{k} b_{\ell} P_{i}(c_{\ell})P_{j}(c_{\ell})B(Y_{\ell}),\qquad i,j=0,\dots,s-1, \end{array} $$
(46)

where, for the sake of brevity, we have omitted the argument u in \(\hat\gamma_{j}\) and \(\hat\rho_{ij}\), as was already done in (37)–(38). Equation (46) can be cast in matrix form by defining the block vectors and matrices

$$ Y = \left( \begin{array}{c} Y_{1}\\ {\vdots} \\ Y_{k} \end{array}\right)\in\mathbb{R}^{k\cdot m}, \quad {\boldsymbol{\gamma}} = \left( \begin{array}{c}\hat\gamma_{0}\\ \vdots\\ \hat\gamma_{s-1} \end{array}\right)\in\mathbb{R}^{s\cdot m}, \quad {\Gamma} = \left( \begin{array}{ccc} \hat\rho_{00} & {\dots} &\hat\rho_{0,s-1}\\ {\vdots} & &\vdots\\ \hat\rho_{s-1,0} & {\dots} &\hat\rho_{s-1,s-1} \end{array}\right)\in\mathbb{R}^{s\cdot m \times s\cdot m}, $$
(47)

and

$$ {\boldsymbol{e}} = \left( \begin{array}{c} 1\\ \vdots\\1 \end{array}\right) \in \mathbb{R}^{k}, \quad {\Omega} = \left( \begin{array}{ccc} b_{1}\\ &\ddots\\ &&b_{k} \end{array}\right), \quad \mathcal{I}_{s}=\left( {\int}_{0}^{c_{i}} P_{j-1}(x)\mathrm{d} x\right), ~ \mathcal{P}_{s}=\left( P_{j-1}(c_{i})\right) \in\mathbb{R}^{k\times s}. $$
(48)

In fact, by also denoting hereafter by \(I_{r}\) the identity matrix of dimension r, we can rewrite (46) as:

$$ Y = {\boldsymbol{e}}\otimes y_{0} + h(\mathcal{I}_{s}\otimes I_{m}) {\Gamma}{\boldsymbol{\gamma}}, \quad {\boldsymbol{\gamma}} = (\mathcal{P}_{s}^{\top}{\Omega}\otimes I_{m})\nabla H(Y), \quad {\Gamma} = (\mathcal{P}_{s}^{\top}{\Omega}\otimes I_{m}) \mathcal{B}(Y) (\mathcal{P}_{s}\otimes I_{m}), $$
(49)

where

$$\nabla H(Y) = \left( \begin{array}{c} \nabla H(Y_{1})\\ \vdots\\ \nabla H(Y_{k}) \end{array}\right)\qquad\text{and}\qquad \mathcal{B}(Y) = \left( \begin{array}{ccc} B(Y_{1})\\ &\ddots\\ && B(Y_{k}) \end{array}\right).$$
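For concreteness, the matrices (47)–(48) can be assembled directly from the Gauss-Legendre nodes, as in the following minimal Python sketch (the helper `P` for the orthonormal shifted Legendre polynomials and the function name are our own assumptions); the final assertion numerically verifies the identity \(\mathcal{P}_{s}^{\top}{\Omega}\mathcal{P}_{s}=I_{s}\), which will be used in (54) below.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss, legval

def P(j, x):
    # Orthonormal shifted Legendre polynomial of degree j on [0,1], cf. (5).
    cf = np.zeros(j + 1); cf[j] = 1.0
    return np.sqrt(2*j + 1) * legval(2*np.asarray(x) - 1.0, cf)

def phbvm_matrices(k, s):
    x, w = leggauss(k)
    c, b = (x + 1) / 2, w / 2                  # abscissae c_i and weights b_i on [0,1]
    Ps = np.array([[P(j, ci) for j in range(s)] for ci in c])   # P_s in (48), k x s
    Om = np.diag(b)                                             # Omega in (48)
    Is = np.empty((k, s))                      # I_s in (48): int_0^{c_i} P_{j-1}(x) dx
    for i, ci in enumerate(c):
        xi, wi = (x + 1) / 2 * ci, w / 2 * ci  # Gauss rule mapped to [0,c_i] (exact here)
        for j in range(s):
            Is[i, j] = np.sum(wi * P(j, xi))
    return c, b, Ps, Is, Om

c, b, Ps, Is, Om = phbvm_matrices(k=6, s=3)
assert np.allclose(Ps.T @ Om @ Ps, np.eye(3))  # cf. (54)
```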

As is clear, the discrete problem (49) can be further reformulated in terms of the product of the Fourier coefficients \(\hat \rho _{ij}\) and \(\hat \gamma _{j}\). In fact, by setting

$$ {\boldsymbol{\phi}} \equiv \left( \begin{array}{c} \phi_{0}\\ {\vdots} \\ \phi_{s-1} \end{array}\right) :={\Gamma}{\boldsymbol{\gamma}} \qquad \Rightarrow\qquad \phi_{i}={\sum}_{j=0}^{s-1}\hat\rho_{ij}\hat\gamma_{j}, \quad i=0,\dots,s-1, $$
(50)

the first equation in (49) becomes

$$ Y = {\boldsymbol{e}}\otimes y_{0} + h(\mathcal{I}_{s}\otimes I_{m}){\boldsymbol{\phi}}, $$
(51)

whereas multiplying, side by side, the third equation by the second one gives

$$ {\boldsymbol{\phi}} = (\mathcal{P}_{s}^{\top}{\Omega}\otimes I_{m}) \mathcal{B}(Y) (\mathcal{P}_{s}\mathcal{P}_{s}^{\top}{\Omega}\otimes I_{m})\nabla H(Y). $$
(52)

Consequently, substituting the right-hand side of (51) into the right-hand side of (52) provides the new discrete problem:

$$ \begin{array}{@{}rcl@{}} \mathcal{F}({\boldsymbol{\phi}})&:=&{\boldsymbol{\phi}} - \mathcal{P}_{s}^{\top}{\Omega}\otimes I_{m} \mathcal{B}\left( {\boldsymbol{e}}\otimes y_{0} + h\mathcal{I}_{s}\otimes I_{m} {\boldsymbol{\phi}}\right) (\mathcal{P}_{s}\mathcal{P}_{s}^{\top}{\Omega}\otimes I_{m})\nabla H\left( \boldsymbol{e}\otimes y_{0} + h\mathcal{I}_{s}\otimes I_{m} \boldsymbol{\phi}\right)\\ &=& \boldsymbol{0}. \end{array} $$
(53)

Moreover, computing the vector ϕ in (50) allows us to obtain the new approximation (27) as:

$$y_{1} = y_{0} + h\phi_{0}.$$

Remark 8

In the case where problem (1) is in the form (3), one has that \({\mathscr{B}}(Y)=I_{k}\otimes J\). Consequently, considering that

$$ \mathcal{P}_{s}^{\top}{\Omega}\mathcal{P}_{s}=I_{s}, $$
(54)

the discrete problem (53) reduces to:

$$ {\boldsymbol{\phi}} - \mathcal{P}_{s}^{\top}{\Omega}\otimes J \nabla H\left( {\boldsymbol{e}}\otimes y_{0} + h\mathcal{I}_{s}\otimes J {\boldsymbol{\phi}}\right) = {\boldsymbol{0}}. $$

This latter problem is exactly that generated by a HBVM(k,s) method applied for solving (3) [9]. We observe, however, that while the original HBVM(k,s) method is actually a k-stage Runge-Kutta method with Butcher tableau (see (48))

$$ \begin{array}{c|c} {\boldsymbol{c}} & \mathcal{I}_{s}\mathcal{P}_{s}^{\top}{\Omega}\\ \hline & {\boldsymbol{b}}^{\top} \end{array} $$

with \({\boldsymbol{c}}=(c_{1},\dots,c_{k})^{\top}\) and \({\boldsymbol{b}}=(b_{1},\dots,b_{k})^{\top}\) the vectors of the abscissae and weights, this is no longer the case for the generalization defined by (49).

Remark 9

In the case k = s, one has that \(\mathcal {P}_{s}\mathcal {P}_{s}^{\top }{\Omega }=I_{s}\). Consequently, (53) becomes, by using the notation (16),

$${\boldsymbol{\phi}} = (\mathcal{P}_{s}^{\top}{\Omega}\otimes I_{m}) F({\boldsymbol{e}}\otimes y_{0} + h\mathcal{I}_{s}\otimes I_{m} {\boldsymbol{\phi}}).$$

This, in turn, is equivalent to the application of the s-stage Gauss method to the problem (1) (see, e.g., [9]).

Instead, in the case k > s, the discrete problem (53) is equivalent to the application of the HBVM(k,s) method to the problem (see (28)),

$$\dot y = B(y)[\nabla H(y)]_{s}^{(2k)}, \qquad t>0, \qquad y(0)=y_{0},$$

in place of (1). This application, in turn, provides the polynomial approximation (28).

We observe that the formulation (53) naturally induces a straightforward iterative procedure for solving the discrete problem,

$$ \begin{array}{@{}rcl@{}} \lefteqn{{\boldsymbol{\phi}}^{r+1} =}\\ && \mathcal{P}_{s}^{\top}{\Omega}\otimes I_{m} \mathcal{B}\left( {\boldsymbol{e}}\otimes y_{0} + h\mathcal{I}_{s}\otimes I_{m} {\boldsymbol{\phi}}^{r}\right) (\mathcal{P}_{s}\mathcal{P}_{s}^{\top}{\Omega}\otimes I_{m})\nabla H\left( {\boldsymbol{e}}\otimes y_{0} + h\mathcal{I}_{s}\otimes I_{m} {\boldsymbol{\phi}}^{r}\right),\\ && r=0,1,\dots, \end{array} $$
(55)

for which the initial approximation \({\boldsymbol{\phi}}^{0}={\boldsymbol{0}}\) can be conveniently used. It is also possible to use the simplified Newton iteration for solving (53), which, taking into account (48), (54), and the fact that (see, e.g., [5])

$$ \mathcal{P}_{s}^{\top}{\Omega}\mathcal{I}_{s} = X_{s} := \left( \begin{array}{cccc} \xi_{0} &-\xi_{1}\\ \xi_{1} &0 &\ddots\\ &{\ddots} &{\ddots} &-\xi_{s-1}\\ & &\xi_{s-1} &0 \end{array}\right), \qquad \xi_{i} = \left( 2\sqrt{|4i^{2}-1|}\right)^{-1},\quad i=0,\dots,s-1,~~ $$
(56)

takes the form:

$$ \text{solve:}~\left[ I_{s}\otimes I_{m} -hX_{s}\otimes F^{\prime}(y_{0})\right] {\boldsymbol{\delta}}^{r} = -\mathcal{F}({\boldsymbol{\phi}}^{r}), \qquad {\boldsymbol{\phi}}^{r+1}:={\boldsymbol{\phi}}^{r}+{\boldsymbol{\delta}}^{r}, \qquad r=0,1,\dots, $$
(57)

with \(F^{\prime}\) the Jacobian of F (see (16)). Nevertheless, the iteration (57) requires the factorization of a matrix whose size is s times larger than that of problem (1), which can be costly when s and/or m are large. Consequently, it is much more effective to resort to a blended iteration for solving (53) (see, e.g., [9]; we also refer to [13] for a more detailed analysis of blended methods). In the present case, this latter iteration, considering the matrix \(X_{s}\) defined in (56), denoting by \(\sigma(X_{s})\) its spectrum, and setting

$$ {\Lambda} := I_{m}-h\lambda_{s} F^{\prime}(y_{0})\in\mathbb{R}^{m\times m}, \qquad\text{with}\qquad \lambda_{s} = \min_{\lambda\in\sigma(X_{s})} |\lambda|, $$
(58)

assumes the form:

$$ \begin{array}{@{}rcl@{}} {\boldsymbol{\eta}}^{r}&:=&-\mathcal{F}({\boldsymbol{\phi}}^{r}),\qquad\quad {{\boldsymbol{\eta}}_{1}^{r}}~:=~ \lambda_{s}X_{s}^{-1}\otimes I_{m} {\boldsymbol{\eta}}^{r},\\ {\boldsymbol{\phi}}^{r+1}&:=&{\boldsymbol{\phi}}^{r}+I_{s}\otimes {\Lambda}^{-1}\left( {{\boldsymbol{\eta}}_{1}^{r}} +I_{s}\otimes {\Lambda}^{-1}\left( {\boldsymbol{\eta}}^{r}-{{\boldsymbol{\eta}}_{1}^{r}}\right)\right), \qquad r=0,1,\dots. \end{array} $$
(59)

Consequently, only the matrix Λ in (58), having the same size as the continuous problem (1), needs to be factored. This is a common feature of the many instances where the blended iteration can be used; for this reason, in such cases, it turns out to be extremely efficient (see, e.g., [2, 14,15,16, 23]).
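To fix ideas, the following self-contained Python sketch implements one PHBVM(k,s) step based on the plain fixed-point iteration (55) (rather than the blended iteration (59), whose efficient implementation is beyond the scope of a sketch), and applies it, for illustration, to the Lotka-Volterra problem (62) of the next section; function names, tolerances, and the step size are our own choices.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss, legval

def P(j, x):
    # Orthonormal shifted Legendre polynomial of degree j on [0,1], cf. (5).
    cf = np.zeros(j + 1); cf[j] = 1.0
    return np.sqrt(2*j + 1) * legval(2*np.asarray(x) - 1.0, cf)

def phbvm_step(B, gradH, y0, h, s, k, tol=1e-14, itmax=200):
    # One PHBVM(k,s) step, y0 -> y1, via the fixed-point iteration (55).
    x, w = leggauss(k)
    c, b = (x + 1) / 2, w / 2
    Ps = np.array([[P(j, ci) for j in range(s)] for ci in c])   # cf. (48)
    Om = np.diag(b)
    Is = np.empty((k, s))
    for i, ci in enumerate(c):
        xi, wi = (x + 1) / 2 * ci, w / 2 * ci
        for j in range(s):
            Is[i, j] = np.sum(wi * P(j, xi))    # int_0^{c_i} P_j(x) dx
    phi = np.zeros((s, len(y0)))                # initial approximation phi^0 = 0
    for _ in range(itmax):
        Y  = y0 + h * Is @ phi                  # stage values, cf. (51), one per row
        gH = np.array([gradH(Yl) for Yl in Y])
        W  = Ps @ (Ps.T @ Om @ gH)              # [grad H]_s^(2k) at the abscissae
        BW = np.array([B(Yl) @ Wl for Yl, Wl in zip(Y, W)])
        phi_new = Ps.T @ Om @ BW                # cf. (52)
        if np.max(np.abs(phi_new - phi)) <= tol:
            phi = phi_new
            break
        phi = phi_new
    return y0 + h * phi[0]                      # y1 = y0 + h*phi_0

# Illustration on the Lotka-Volterra problem (62): a = 1, b = 3, y1* = y2* = 1.
B     = lambda y: np.array([[0.0, y[0] * y[1]], [-y[0] * y[1], 0.0]])
H     = lambda y: (np.log(y[0]) - y[0]) + 3 * (np.log(y[1]) - y[1])
gradH = lambda y: np.array([1 / y[0] - 1.0, 3 * (1 / y[1] - 1.0)])

y0 = np.array([5.0, 1.0])
y1 = phbvm_step(B, gradH, y0, h=0.02, s=3, k=6)
print(H(y1) - H(y0))   # Hamiltonian error: near round-off, consistently with Theorem 9
```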

Remark 10

In the practical use of the methods, it is customary to choose the parameter k, related to the order of the quadrature, so that the discretization error falls within the round-off error level. Nevertheless, round-off errors are unavoidable, as are iteration errors in (55) or (59). This may cause a small numerical drift in the invariants, even in the case where the quadrature is exact. This phenomenon has been duly studied in [5, Chapter 4.3], where a simple correction procedure is given to avoid this problem. The same procedure can be conveniently used in this setting, too. The reader is referred to the above reference for full details.

4.1 Conservation of Casimirs

In this section we sketch the implementation of EPHBVM(k,s) methods described in Section 3.2. For this purpose, besides the vector ϕ defined in (50), we need to define the block vector

$$ \hat{\boldsymbol{\pi}} = \left( \begin{array}{c} \hat\pi_{0} \\ {\vdots} \\ \hat\pi_{s-1} \end{array}\right) $$
(60)

with the approximate Fourier coefficients (41) of the gradient of the Casimir. In so doing, the discrete problem generated by an EPHBVM(k,s) method becomes:

$$ \begin{array}{@{}rcl@{}} \mathcal{F}({\boldsymbol{\phi}},\alpha)&:=&\left( \begin{array}{c} {\boldsymbol{\phi}} - \mathcal{P}_{s}^{\top}{\Omega}\otimes I_{m} \mathcal{B}\left( Y\right) (\mathcal{P}_{s}\mathcal{P}_{s}^{\top}{\Omega}\otimes I_{m})\nabla H\left( Y\right)\\ \alpha - \frac{\hat{\boldsymbol{\pi}}^{\top}{\boldsymbol{\phi}}}{\hat\pi_{0}^{\top} \tilde{B}\hat\gamma_{0} } \end{array}\right)~=~ {\boldsymbol{0}},\\ \text{with}&&\\ Y&=& {\boldsymbol{e}}\otimes y_{0} + h\mathcal{I}_{s}\otimes I_{m} {\boldsymbol{\phi}} - \alpha h{\boldsymbol{c}}\otimes (\tilde{B}\hat\gamma_{0}),\\ \hat\gamma_{0} &=&{\boldsymbol{b}}^{\top}\otimes I_{m} \nabla H(Y),\\ \hat{\boldsymbol{\pi}} &=& \mathcal{P}_{s}^{\top}{\Omega}\otimes I_{m} \nabla C(Y), \end{array} $$
(61)

and the new approximation given by

$$y_{1} = y_{0} + h\left( \phi_{0}-\alpha \tilde{B}\hat\gamma_{0}\right).$$
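With respect to (53), the only addition in (61) is the scalar equation for α. Inside a nonlinear iteration for (61), its update reduces to one line, as in the following minimal Python sketch (a hypothetical helper under our own naming, assuming \(\hat{\boldsymbol{\pi}}\) and \({\boldsymbol{\phi}}\) stored as s × m arrays, \(\hat\gamma_{0}\) as a vector of length m, and \(\tilde{B}\) as an m × m skew-symmetric matrix):

```python
import numpy as np

def casimir_alpha(hat_pi, phi, Btil, gamma0):
    # alpha per (43): the numerator sum_{i,j} hat_pi_i^T hat_rho_ij hat_gamma_j
    # collapses to <hat_pi, phi>, since phi_i = sum_j hat_rho_ij hat_gamma_j, cf. (50).
    return np.sum(hat_pi * phi) / (hat_pi[0] @ (Btil @ gamma0))

# Hypothetical shapes, for illustration only: s = 2, m = 3.
rng = np.random.default_rng(0)
hat_pi, phi, gamma0 = rng.normal(size=(2, 3)), rng.normal(size=(2, 3)), rng.normal(size=3)
A = rng.normal(size=(3, 3)); Btil = A - A.T    # an arbitrary skew-symmetric matrix
print(casimir_alpha(hat_pi, phi, Btil, gamma0))
```

Within a fixed-point iteration for (61), this update is interleaved with those of \({\boldsymbol{\phi}}\), \(\hat\gamma_{0}\), and \(\hat{\boldsymbol{\pi}}\).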

We conclude this section by mentioning that, in the case of multiple Casimirs, the discrete problem (61) can be readily generalized by considering the discrete counterparts of (24)–(25).

5 Numerical tests

In this section we present a couple of numerical tests concerning the solution of Lotka-Volterra problems, the second of which possesses a Casimir. The numerical tests have been carried out on a 3 GHz Intel Xeon W 10-core computer with 64 GB of memory, running Matlab 2020a.

Example 1

We consider the following Lotka-Volterra problem:

$$ \begin{array}{@{}rcl@{}} \dot y &=& \left( \begin{array}{cc} 0 & y_{1}y_{2} \\ -y_{1}y_{2} &0 \end{array}\right) \nabla H(y),\\ H(y) &=& a\left( \ln y_{1} -\frac{y_{1}}{y_{1}^{*}}\right) + b\left( \ln y_{2} -\frac{y_{2}}{y_{2}^{*}}\right), \end{array} $$
(62)

with

$$a=1, \qquad b=3, \qquad y_{1}^{*}=y_{2}^{*}=1, \qquad y(0) = \left( 5, 1\right)^{\top},$$

whose solution, periodic with period T ≈ 4.633434168477889, is depicted in Fig. 1. At first, we solve the problem on one period with time-step h = T/n, by using the following methods:

  • The s-stage Gauss method, s = 1,2,3;

  • The PHBVM(4,s), s = 1,2, and PHBVM(6,3) methods, which soon become energy-conserving as the value of n is increased.

The obtained results are summarized in Table 1, where we have denoted by ey and eH the error in the solution and in the Hamiltonian after one period, respectively. Their numerical rate of convergence is also reported, along with the mean number of blended iterations (58)–(59) per time-step (it) needed to obtain convergence within full machine accuracy, and the execution time in seconds. From the listed results, one infers that:

  • As is strikingly clear, the higher-order methods are much more efficient than the lower-order ones, especially when high accuracy is required;

  • The numerical rate of convergence of both the solution and the Hamiltonian errors matches the theoretical one (for PHBVMs, until the Hamiltonian error falls within the round-off error level);

  • For a fixed time-step h, the numerical solutions provided by the Gauss methods and by the corresponding PHBVM method have comparable accuracy, despite the negligible Hamiltonian error of the latter methods;

  • The execution times of the PHBVM methods are about twice those of the corresponding Gauss methods, even though the mean number of blended iterations per time-step is practically the same (the latter decreases with the time-step h and slightly increases with s).

As a result, one might conclude that the conservation of the Hamiltonian brings no practical advantage. However, this conclusion is readily refuted by looking at the error growth in the Hamiltonian and in the solution. In fact, Fig. 2 shows the Hamiltonian error (left plot) and the solution error (right plot) obtained by using the 3-stage Gauss method and the PHBVM(6,3) method with time-step h = T/100 over 100 periods. As one may see, the 3-stage Gauss method exhibits a numerical drift in the energy, unlike PHBVM(6,3). As a result, the latter method exhibits a linear error growth, whereas the former one has a quadratic error growth.

Fig. 1: Solution of problem (62)

Table 1: Results for problem (62)

Fig. 2: Hamiltonian error (left plot) and solution error (right plot) when solving problem (62) with time-step h = T/100 over 100 periods

Example 2

The second example, taken from [20], is given by:

$$ \begin{array}{@{}rcl@{}} \dot y &=& \left( \begin{array}{ccc} 0 & y_{1}y_{2} & y_{1}y_{3}\\ -y_{1}y_{2} &0 & -y_{2}y_{3}\\ -y_{1}y_{3} & y_{2}y_{3} & 0 \end{array}\right) \nabla H(y),\\ H(y) &=& a\left( \ln y_{1} -\frac{y_{1}}{y_{1}^{*}}\right) +b\left( \ln y_{2} -\frac{y_{2}}{y_{2}^{*}}\right) + c\left( \ln y_{3} -\frac{y_{3}}{y_{3}^{*}}\right),\\ C(y) &=& -\ln y_{1} -\ln y_{2} +\ln y_{3}, \end{array} $$
(63)

with

$$ a=1, \qquad b=2, \qquad c=3, \qquad y_{1}^{*}=1, \qquad y_{2}^{*}=10,\qquad y_{3}^{*} = 50,\qquad y(0) = \left( 1, 1, 1\right)^{\top}, $$

whose solution, periodic with period T ≈ 2.143610709155912, is depicted in Fig. 3.

Fig. 3: Solution of problem (63)

At first, we compare the same methods used for the previous example, again with time-step h = T/n. The obtained results for the s-stage Gauss and PHBVM methods are listed in Table 2: the conclusions that one can derive from them are similar to those drawn from Table 1 for the previous example, with the additional remark that now the Casimir C(y) is not conserved. This fact, in turn, produces the results depicted in the two plots of Fig. 4, concerning the application of the 3-stage Gauss and PHBVM(6,3) methods for solving the problem with time-step h = T/100 over 100 periods. From the two plots, one infers that both methods exhibit a drift in the Casimir, whereas only the 3-stage Gauss method exhibits a drift in the Hamiltonian, too (left plot); moreover, both methods exhibit a quadratic error growth in the solution, despite the fact that PHBVM(6,3) conserves the Hamiltonian. For this reason, in Table 3 we list the results obtained by using the EPHBVM(4,1), EPHBVM(4,2), and EPHBVM(6,3) methods for solving problem (63), with the same time-steps considered for obtaining the results of Table 2. As one may see, now the conservation of the Casimir, besides that of the Hamiltonian, is soon obtained as the time-step is decreased, with a computational cost perfectly comparable to that of the corresponding PHBVM method. The conservation of both invariants, in turn, allows one to recover a linear error growth in the numerical solution, as shown in the plot of Fig. 5.

Table 2: Results for problem (63)

Fig. 4: Hamiltonian and Casimir errors (left plot) and solution error (right plot) when solving problem (63) with time-step h = T/100 over 100 periods with the 3-stage Gauss and PHBVM(6,3) methods

Table 3: Further results for problem (63)

Fig. 5: Hamiltonian, Casimir, and solution errors when solving problem (63) with time-step h = T/100 over 100 periods with the EPHBVM(6,3) method

6 Conclusions

In this paper we have presented a class of energy-conserving line integral methods for Poisson problems. In the case where the problem is Hamiltonian, these methods reduce to the class of Hamiltonian Boundary Value Methods (HBVMs), which are energy-conserving methods for such problems. Consequently, the new methods can be regarded as an extension of HBVMs to Poisson (non-Hamiltonian) problems, which we have called PHBVMs. Moreover, a further enhancement of such methods (EPHBVMs) allows one to obtain the conservation of Casimirs, too. A thorough analysis of the methods has been carried out and confirmed by a couple of numerical tests. As a further direction of investigation, we mention the application of the methods to highly oscillatory Poisson problems, similarly to what has been done with HBVMs in the Hamiltonian case [27,28,29].