1 Introduction

The time integration of the equations of motion of charged particles is a crucial step in particle methods of plasma physics [1]. In the strong magnetic field regime, the charged particles exhibit very fast rotations of small radius around a guiding center. This often imposes a stringent restriction on the time step size of numerical integrators. Many works aim at designing large-stepsize integrators with good accuracy for charged-particle dynamics, such as [3, 4, 7, 11, 12, 14]. Among them, a Boris-type integrator with appropriate modifications shows striking numerical results [14], and a rigorous analysis is provided in [10]. It is proved there that the position and the parallel velocity are approximated with \(O(h^2)\) accuracy by the modified Boris algorithm with large step sizes \(h^2\sim \varepsilon \) over a fixed time interval \(T=O(1)\), where \(\varepsilon \ll 1\) is a small parameter whose inverse corresponds to the strength of the magnetic field.

In this paper, we are interested in analyzing the long-time behavior (over \(O(\varepsilon ^{-1})\)) of the modified Boris algorithm in a toroidal axi-symmetric geometry, with a magnetic field everywhere toroidal and an electric field everywhere orthogonal to the magnetic field. This geometry has already been proposed in [5], where a first-order description of the slow dynamics for the continuous case is derived. Here, we use a different technique, modulated Fourier expansions [9], which has recently been used for charged-particle dynamics in a strong magnetic field [6, 7, 8, 10, 13], to derive the guiding center drifts of the exact solution in such a toroidal geometry. Since this technique extends equally to numerical discretizations, the long-term analysis of the modified Boris algorithm is also performed.

In Sect. 2, we formulate the equations of motion in a strongly non-uniform strong magnetic field, describe the toroidal axi-symmetric geometry, and introduce the modified Boris scheme. In Sect. 3, we state the main results of this paper: Theorem 3.1 states the slow drift motion over \(O(\varepsilon ^{-1})\) in toroidal geometry for the continuous system, and Theorem 3.2 states the long-time accuracy of the modified Boris algorithm. Numerical experiments are presented to illustrate the theoretical results. In Sect. 4, we give the proofs of our main results.

2 Setting

2.1 Charged-particle dynamics in toroidal geometry

We consider the differential equation that describes the motion of a charged particle (with unit mass and charge) under a magnetic and electric field,

$$\begin{aligned} \ddot{x} = \dot{x} \times B(x) + E(x), \end{aligned}$$
(2.1)

where \(x(t)\in {\mathbb {R}}^3\) is the position at time t, \(v(t)=\dot{x}(t)\) is the velocity, B(x) is the magnetic field and E(x) is the electric field. B and E can be expressed via the vector potential \(A(x)\in {{\mathbb {R}}}^3\) and the scalar potential \(\phi (x)\in {{\mathbb {R}}}\) as \(B(x) = \nabla \times A(x)\) and \(E(x) = - \nabla \phi (x)\). Here we are interested in the situation of a strong magnetic field

$$\begin{aligned} B(x) = B_\varepsilon (x)= \frac{1}{\varepsilon }\, B_1(x), \quad \ 0<\varepsilon \ll 1, \end{aligned}$$
(2.2)

where \(B_1\) is smooth and independent of the small parameter \(\varepsilon \), with \(|B_1(x)|\ge 1\) for all x. The initial values \((x(0),\dot{x}(0))\) are bounded independently of \(\varepsilon \): for some constants \(M_0,M_1\),

$$\begin{aligned} |x(0)| \le M_0, \quad \ |\dot{x}(0)| \le M_1. \end{aligned}$$
(2.3)

In this paper, we consider the toroidal axi-symmetric geometry (Fig. 1) introduced in [5]. To be specific, we fix a unit vector \({\mathrm e}_z\), so that any vector \(x\in {\mathbb {R}}^3\) can be expressed as

$$\begin{aligned} x=r(x)\,{\mathrm e}_r(x)+z(x)\,{\mathrm e}_z \end{aligned}$$

with \(z(x)= {\mathrm e}_z^\top x\), \(r(x)=|{\mathrm e}_z\times x|\), and \({\mathrm e}_r(x)=(x-z(x)\,{\mathrm e}_z)/r(x)\). It is assumed that far from the axis \({\mathrm e}_z\) the magnetic field is stationary, toroidal, axi-symmetric and non-vanishing, that is, for some \(r_0>0\), the field direction \({\mathrm e}_\parallel (x)=B_1(x)/|B_1(x)|\) and the field strength satisfy, when \(r(x)\ge r_0\),

$$\begin{aligned} {\mathrm e}_\parallel (x)=\frac{{\mathrm e}_z\times x}{r(x)} \quad \text {and} \quad |B_1(x)|=b(r(x),z(x)) \end{aligned}$$
(2.4)

for some function b. The electric field satisfies \(E_\parallel (x)=0\) and E is axi-symmetric when \(r(x)\ge r_0\), that is,

$$\begin{aligned} E(x)=E_\perp (x)=E_r(r(x),z(x))\,{\mathrm e}_r(x)+E_z(r(x),z(x))\,{\mathrm e}_z. \end{aligned}$$
(2.5)

In our proofs, we assume that the functions b, \(E_r\), \(E_z\) and all their derivatives are bounded independently of \(\varepsilon \).

Fig. 1. Toroidal geometry with \({\mathrm e}_r(x),{\mathrm e}_\parallel (x),{\mathrm e}_z\) the local frame and the magnetic field along \({\mathrm e}_\parallel (x)\)

It is noted that \(({\mathrm e}_r(x), {\mathrm e}_\parallel (x), {\mathrm e}_z)\) forms an orthonormal basis and

$$\begin{aligned} \begin{aligned} {\mathrm e}_r(x)= & {} \left( \frac{x_1}{r}, \frac{x_2}{r}, 0 \right) ^\top \quad {\mathrm e}_\parallel (x)=\left( -\frac{x_2}{r}, \frac{x_1}{r}, 0\right) ^\top \quad {\mathrm e}_z=(0,0,1)^\top \end{aligned} \end{aligned}$$

with \(r=\sqrt{x_1^2+x_2^2}\). The following relations are useful in our proof

$$\begin{aligned} \begin{aligned}&{\mathrm e}'_r(x)=\frac{1}{r(x)}{\mathrm e}_\parallel (x) {\mathrm e}_\parallel (x)^\top , \quad {\mathrm e}'_\parallel (x)=-\frac{1}{r(x)} {\mathrm e}_r(x) {\mathrm e}_\parallel (x)^\top \\&\nabla _x r(x)= {\mathrm e}_r(x), \quad B_1'(x)={\mathrm e}_\parallel (\nabla _x b)^\top -\frac{b(r(x),z(x))}{r(x)}{\mathrm e}_r(x) {\mathrm e}_\parallel (x)^\top , \end{aligned} \end{aligned}$$
(2.6)

where \('\) denotes the Jacobian of the functions considered and \(\nabla _x\) is the gradient.
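The relations (2.6) for the local frame can be checked numerically. The following is a minimal sketch, assuming \({\mathrm e}_z=(0,0,1)^\top \) as above; the helper names (`toroidal_frame`, `jacobian_fd`, `outer`) and the sample point are illustrative choices, not part of the original text. It compares central-difference Jacobians of \({\mathrm e}_r\) and \({\mathrm e}_\parallel \) with the closed forms in (2.6).

```python
import math

def toroidal_frame(x):
    """Return (r, z, e_r, e_par, e_z) at a Cartesian point x, with e_z = (0, 0, 1)."""
    x1, x2, x3 = x
    r = math.hypot(x1, x2)
    if r == 0.0:
        raise ValueError("the local frame is undefined on the axis r = 0")
    e_r = (x1 / r, x2 / r, 0.0)
    e_par = (-x2 / r, x1 / r, 0.0)
    return r, x3, e_r, e_par, (0.0, 0.0, 1.0)

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def outer(a, b, scale=1.0):
    """Matrix scale * a b^T as nested lists."""
    return [[scale * ai * bj for bj in b] for ai in a]

def jacobian_fd(f, x, delta=1e-6):
    """Central-difference Jacobian J[i][j] = d f_i / d x_j of a map R^3 -> R^3."""
    J = [[0.0] * 3 for _ in range(3)]
    for j in range(3):
        xp, xm = list(x), list(x)
        xp[j] += delta
        xm[j] -= delta
        fp, fm = f(xp), f(xm)
        for i in range(3):
            J[i][j] = (fp[i] - fm[i]) / (2.0 * delta)
    return J

def max_abs_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(3) for j in range(3))

x = (1.0 / 3.0, 1.0 / 4.0, 1.0 / 2.0)   # sample point, chosen for illustration
r, z, e_r, e_par, e_z = toroidal_frame(x)

# e_r'(x) = (1/r) e_par e_par^T and e_par'(x) = -(1/r) e_r e_par^T, as in (2.6)
err_er = max_abs_diff(jacobian_fd(lambda p: toroidal_frame(p)[2], x),
                      outer(e_par, e_par, 1.0 / r))
err_epar = max_abs_diff(jacobian_fd(lambda p: toroidal_frame(p)[3], x),
                        outer(e_r, e_par, -1.0 / r))
print(err_er, err_epar)
```

Both printed differences should be at the level of the finite-difference error; the same pattern can be applied to \(B_1'(x)\) once a concrete function b is chosen.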

2.2 Modified Boris method

The modified Boris method proposed in [14] is used to solve the charged-particle dynamics under a strong magnetic field with large stepsizes. Recently, the analysis of accuracy order of such a method for a general non-uniform strong magnetic field with a stepsize of \(h^2\sim \varepsilon \) until \(T=O(1)\) was provided in [10].

This algorithm has the following two-step formulation

$$\begin{aligned} \frac{x^{n+1}-2x^n+x^{n-1}}{h^2}=v^n \times B(x^n) + E(x^n)- \mu ^0\, \nabla |B|(x^n) \end{aligned}$$
(2.7)

with the initial magnetic moment

$$\begin{aligned} \mu ^0=\mu (x(0),\dot{x}(0))=\frac{1}{2}\frac{|\dot{x}(0)\times B(x(0))|^2}{|B(x(0))|^3}. \end{aligned}$$

The velocity is computed as

$$\begin{aligned} v^n = \frac{x^{n+1}-x^{n-1}}{2h}. \end{aligned}$$
(2.8)

The modified Boris method starts from modified initial values

$$\begin{aligned} x^0 = x(0), \quad v^0 = P_\parallel (x^0) \,\dot{x}(0), \end{aligned}$$
(2.9)

where \(P_\parallel (x^0)\) is the orthogonal projection onto the span of \(B(x^0)\). This means the component of the initial velocity orthogonal to the magnetic field is filtered out, i.e., \(v^0_\perp =P_\perp (x^0)v^0=0\) with \(P_\perp (x^0)=I-P_\parallel (x^0)\).
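As a small illustration of the initial data, the sketch below evaluates the initial magnetic moment \(\mu ^0\) and the filtered velocity \(v^0=P_\parallel (x^0)\,\dot{x}(0)\) from a given field vector \(B(x^0)\). The function names and the sample values of \(B(x^0)\) and \(\dot{x}(0)\) are hypothetical and serve only to exercise the formulas.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def magnetic_moment(v0, B0):
    """mu^0 = |v0 x B0|^2 / (2 |B0|^3), where B0 = B(x(0))."""
    return 0.5 * norm(cross(v0, B0)) ** 2 / norm(B0) ** 3

def filter_initial_velocity(v0, B0):
    """Orthogonal projection of v0 onto span(B0), i.e. P_par(x^0) v0 in (2.9)."""
    c = dot(v0, B0) / dot(B0, B0)
    return tuple(c * Bi for Bi in B0)

# Hypothetical sample data (not taken from the experiments)
B0 = (0.0, 2.0e3, 0.0)
v0 = (0.4, 2.0 / 3.0, 1.0)

mu0 = magnetic_moment(v0, B0)
v0_filtered = filter_initial_velocity(v0, B0)
print(mu0, v0_filtered)
```

For this sample, \(\mu ^0\) equals \(|v_\perp |^2/(2|B|)=1.16/4000\), and the filtered velocity keeps only the component along \(B(x^0)\).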

We note that the modified Boris method is identical to the standard Boris integrator for the modified electric field \(E_\textrm{mod}(x) = E(x)- \mu ^0 \,\nabla |B|(x) =-\nabla (\phi + \mu ^0 |B|)(x)\). It can be implemented as the common one-step formulation of the Boris algorithm [2].
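A minimal sketch of such a one-step implementation is given below. It uses the standard Boris rotation with staggered velocities \(v^{n\pm 1/2}\), applied to \(E_\textrm{mod}\). The fields are the ones used in the experiments of Sect. 3.2; the starting value \(v^{-1/2}=v^0-\tfrac{h}{2}E_\textrm{mod}(x^0)\) is one common choice and is an assumption here (it yields \((v^{1/2}+v^{-1/2})/2=v^0\) because the filtered \(v^0\) is parallel to \(B(x^0)\)). All function names are illustrative.

```python
import math

EPS = 1e-3  # epsilon; value taken from the experiments in Sect. 3.2

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def B(x):
    """B(x) = (r + z^2)/eps * e_par(x)."""
    r = math.hypot(x[0], x[1])
    b = (r + x[2] ** 2) / EPS
    return (-b * x[1] / r, b * x[0] / r, 0.0)

def make_E_mod(mu0):
    """E_mod = E - mu0 grad|B| with E = 0.1 z e_r + 0.1 r e_z, |B| = (r + z^2)/eps."""
    def E_mod(x):
        r = math.hypot(x[0], x[1])
        radial = 0.1 * x[2] - mu0 / EPS              # coefficient of e_r
        axial = 0.1 * r - 2.0 * mu0 * x[2] / EPS     # coefficient of e_z
        return (radial * x[0] / r, radial * x[1] / r, axial)
    return E_mod

def boris_step(x, v_half, h, B, E_mod):
    """Map (x^n, v^{n-1/2}) to (x^{n+1}, v^{n+1/2}) with fields evaluated at x^n."""
    Bx, Ex = B(x), E_mod(x)
    v_minus = tuple(v + 0.5 * h * e for v, e in zip(v_half, Ex))   # half kick
    t = tuple(0.5 * h * b for b in Bx)
    t2 = sum(c * c for c in t)
    s = tuple(2.0 * c / (1.0 + t2) for c in t)
    v_prime = tuple(v + c for v, c in zip(v_minus, cross(v_minus, t)))
    v_plus = tuple(v + c for v, c in zip(v_minus, cross(v_prime, s)))  # rotation
    v_new = tuple(v + 0.5 * h * e for v, e in zip(v_plus, Ex))     # half kick
    x_new = tuple(xi + h * v for xi, v in zip(x, v_new))
    return x_new, v_new

def run(x0, xdot0, h, n_steps):
    B0 = B(x0)
    nB = math.sqrt(sum(c * c for c in B0))
    w = cross(xdot0, B0)
    mu0 = 0.5 * sum(c * c for c in w) / nB ** 3          # initial magnetic moment
    E_mod = make_E_mod(mu0)
    c = sum(v * b for v, b in zip(xdot0, B0)) / nB ** 2
    v0 = tuple(c * b for b in B0)                        # filtered velocity (2.9)
    v_half = tuple(v - 0.5 * h * e for v, e in zip(v0, E_mod(x0)))  # assumed start
    traj, x = [x0], x0
    for _ in range(n_steps):
        x, v_half = boris_step(x, v_half, h, B, E_mod)
        traj.append(x)
    return traj, E_mod

traj, E_mod = run((1.0 / 3.0, 0.25, 0.5), (0.4, 2.0 / 3.0, 1.0), h=0.04, n_steps=3)
print(traj[-1])
```

Eliminating the staggered velocities from `boris_step` recovers the two-step formulation (2.7) with \(v^n\) given by (2.8), which is the property worth checking when adapting the sketch to other fields.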

3 Main results and numerical experiments

3.1 Main results

With \(\tilde{r}(t), \tilde{z}(t), \tilde{v}(t)\) denoting the solutions of the following initial-value problem for the slow differential equations

$$\begin{aligned} \begin{aligned} \frac{{\mathrm d}{\tilde{r}}}{{\mathrm d}t}&=-\varepsilon \frac{E_z(\tilde{r},\tilde{z})}{b(\tilde{r},\tilde{z})}+\varepsilon \frac{\mu ^0}{b(\tilde{r},\tilde{z})}\partial _z b(\tilde{r},\tilde{z}), \quad \tilde{r}(0)=r(x(0))\\ \frac{{\mathrm d}{\tilde{z}}}{{\mathrm d}t}&=\varepsilon \frac{\tilde{v}^2}{\tilde{r} b(\tilde{r},\tilde{z})}+\varepsilon \frac{E_r(\tilde{r},\tilde{z})}{b(\tilde{r},\tilde{z})}-\varepsilon \frac{\mu ^0}{b(\tilde{r},\tilde{z})}\partial _r b(\tilde{r}, \tilde{z}), \quad \tilde{z}(0)=z(x(0))\\ \frac{{\mathrm d}{\tilde{v}}}{{\mathrm d}t}&=\varepsilon \frac{\tilde{v}}{\tilde{r}}\left( \frac{E_z(\tilde{r}, \tilde{z})}{b(\tilde{r}, \tilde{z})}-\frac{\mu ^0}{b(\tilde{r}, \tilde{z})}\partial _z b(\tilde{r}, \tilde{z})\right) , \quad \tilde{v}(0)={\mathrm e}_\parallel (x(0))^\top \dot{x}(0), \end{aligned} \end{aligned}$$
(3.1)

we then have the following results.

Theorem 3.1

(Drift motion of the exact solution) Let x(t) be a solution of (2.1)–(2.3) with (2.4) and (2.5), which stays in a compact set K for \(0\le t\le c\varepsilon ^{-1}\) (with K and c independent of \(\varepsilon \)). We express the exact solution in toroidal coordinates as \(x(t)=r(x(t))\,{\mathrm e}_r(x(t))+z(x(t))\,{\mathrm e}_z\) with \(v_\parallel (t)= {\mathrm e}_\parallel (x(t))^\top \dot{x}(t)\) being the parallel velocity. Denote \(r(t)=r(x(t))\) and \(z(t)=z(x(t))\), then we have

$$\begin{aligned} |r(t)-\tilde{r}(t)|\le C\varepsilon , \ |z(t)-\tilde{z}(t)|\le C\varepsilon , \ |v_\parallel (t)-\tilde{v}(t)|\le C\varepsilon , \quad 0\le t \le c/\varepsilon . \end{aligned}$$

Here, the constant C is independent of \(\varepsilon \) and t with \(0\le t\le c/\varepsilon \), but depends on c and on bounds of derivatives of \(B_1\) and E on the compact set K.
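The slow system (3.1) is cheap to integrate, since its right-hand side carries a factor \(\varepsilon \). The sketch below uses a classical Runge-Kutta step; the functions b, \(E_r\), \(E_z\) are those of the example in Sect. 3.2, while the value of \(\mu ^0\), the step size and all names are assumptions made for illustration. Note that (3.1) implies \(\frac{{\mathrm d}}{{\mathrm d}t}(\tilde{r}\tilde{v})=0\), which gives a simple consistency check.

```python
import math

def slow_rhs(state, eps, mu0, b, db_dr, db_dz, E_r, E_z):
    """Right-hand side of (3.1) for state = (r, z, v)."""
    r, z, v = state
    bb = b(r, z)
    drift_r = (-E_z(r, z) + mu0 * db_dz(r, z)) / bb
    dr = eps * drift_r
    dz = eps * (v * v / (r * bb) + (E_r(r, z) - mu0 * db_dr(r, z)) / bb)
    dv = -eps * (v / r) * drift_r
    return (dr, dz, dv)

def rk4_step(f, y, dt):
    """One classical fourth-order Runge-Kutta step for y' = f(y)."""
    k1 = f(y)
    k2 = f(tuple(yi + 0.5 * dt * ki for yi, ki in zip(y, k1)))
    k3 = f(tuple(yi + 0.5 * dt * ki for yi, ki in zip(y, k2)))
    k4 = f(tuple(yi + dt * ki for yi, ki in zip(y, k3)))
    return tuple(yi + dt * (a1 + 2.0 * a2 + 2.0 * a3 + a4) / 6.0
                 for yi, a1, a2, a3, a4 in zip(y, k1, k2, k3, k4))

# Example data: b = r + z^2, E_r = 0.1 z, E_z = 0.1 r (fields of Sect. 3.2)
eps = 1e-3
mu0 = 0.1   # hypothetical value, used only to exercise the formulas

def f(state):
    return slow_rhs(state, eps, mu0,
                    b=lambda r, z: r + z * z,
                    db_dr=lambda r, z: 1.0,
                    db_dz=lambda r, z: 2.0 * z,
                    E_r=lambda r, z: 0.1 * z,
                    E_z=lambda r, z: 0.1 * r)

# Initial values from x(0) = (1/3, 1/4, 1/2), xdot(0) = (2/5, 2/3, 1)
r0 = math.hypot(1.0 / 3.0, 0.25)
v0 = (-0.25 * 0.4 + (1.0 / 3.0) * (2.0 / 3.0)) / r0   # e_par(x(0))^T xdot(0)
state0 = (r0, 0.5, v0)

state = state0
for _ in range(1000):          # integrate up to t = 1/eps with dt = 1 (assumed)
    state = rk4_step(f, state, 1.0)
print(state, state[0] * state[2] - r0 * v0)
```

The second printed number measures the drift of \(\tilde{r}\tilde{v}\) and should be close to rounding level for this step size.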

Remark 3.1

A similar result is presented in Proposition 5.2 of [5]. Here, we offer an alternative proof based on modulated Fourier expansions. This proof can be extended to the analysis of numerical methods and enables us to obtain the following result.

For the numerical approximation, the nondegeneracy condition is needed as in [10]:

$$\begin{aligned}&\text {For }(x,v)\text { along the numerical trajectory, the linear maps} \nonumber \\&L_{x,v}:P_\perp (x){\mathbb {R}}^3 \rightarrow P_\perp (x){\mathbb {R}}^3, \quad z \mapsto z + \tfrac{1}{4} h^2\, P_\perp (x)\bigl (v \times B'(x)z\bigr ) \nonumber \\&\text {have an inverse that is bounded independently of }(x,v)\text { and of } \nonumber \\&h\text { and }\varepsilon \text { with }h^2/\varepsilon \le C_*. \end{aligned}$$
(3.2)

This determines an upper bound \(C_*\) on the ratio \(h^2/\varepsilon \).

Theorem 3.2

(Drift approximation by the numerical solution) Consider applying the modified Boris method to (2.1)–(2.3) with (2.4) and (2.5) and with modified initial values (2.9) using a step size h with \(h^2\sim \varepsilon \), i.e.,

$$\begin{aligned} c_* \varepsilon \le h^2 \le C_* \varepsilon \end{aligned}$$

for some positive constants \(c_*\) and \(C_*\). Under the nondegeneracy condition (3.2) and provided that the numerical solution \(x^n=r(x^n)\,{\mathrm e}_r(x^n)+z(x^n)\,{\mathrm e}_z\) stays in a compact set K for \(0\le nh\le c\varepsilon ^{-1}\) (with K and c independent of \(\varepsilon \) and h), we have the following estimates:

$$\begin{aligned} \begin{aligned} |r(x^n)-\tilde{r}(t_n)|&\le Ch^2, \\ |z(x^n)-\tilde{z}(t_n)|&\le Ch^2, \quad 0\le t_n=nh\le c/\varepsilon ,\\ |v_\parallel ^n-\tilde{v}(t_n)|&\le Ch^2, \end{aligned} \end{aligned}$$

where \(v_\parallel ^n={\mathrm e}_\parallel (x^n)^\top v^n\) denotes the parallel component of the numerical velocity \(v^n\). Here, the constant C is independent of \(\varepsilon \), h and n with \(0\le nh\le c/\varepsilon \), but depends on c, on bounds of derivatives of \(B_1\) and E on the compact set K, and on \(c_*\) and \(C_*\).
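The three quantities compared in Theorem 3.2 can be read off from three consecutive positions of the two-step scheme. The sketch below assumes \({\mathrm e}_z=(0,0,1)^\top \) and uses the velocity formula (2.8); the function name and the test points (lying on a circle around the axis) are illustrative only.

```python
import math

def toroidal_observables(x_prev, x_curr, x_next, h):
    """Return (r(x^n), z(x^n), v_par^n) with v^n = (x^{n+1} - x^{n-1}) / (2h)."""
    r = math.hypot(x_curr[0], x_curr[1])
    e_par = (-x_curr[1] / r, x_curr[0] / r, 0.0)
    v = tuple((b - a) / (2.0 * h) for a, b in zip(x_prev, x_next))
    v_par = sum(e * c for e, c in zip(e_par, v))
    return r, x_curr[2], v_par

# Hypothetical positions on the circle r = 2, z = 1, separated by an angle theta
theta, h = 0.01, 0.1
x_prev = (2.0 * math.cos(-theta), 2.0 * math.sin(-theta), 1.0)
x_curr = (2.0, 0.0, 1.0)
x_next = (2.0 * math.cos(theta), 2.0 * math.sin(theta), 1.0)

r_n, z_n, v_par_n = toroidal_observables(x_prev, x_curr, x_next, h)
print(r_n, z_n, v_par_n)
```

For these points the motion is purely toroidal, so the parallel velocity equals \(2\sin \theta /h\) while r and z stay fixed.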

Remark 3.2

This theorem shows that the modified Boris method reproduces the drift with an \(O(h^2)\) error over the time scale \(\varepsilon ^{-1}\), which is not an obvious result for large step sizes.

3.2 Numerical experiments

To illustrate the statements of the preceding subsection, we consider the following electromagnetic fields

$$\begin{aligned} \begin{aligned} E(x)&=0.1 z(x) \, {\mathrm e}_r(x)+0.1 r(x)\, {\mathrm e}_z=0.1\left( \frac{x_1 x_3}{r}, \, \frac{x_2 x_3}{r},\, r\right) ^\top , \\ B(x)&=\frac{r(x)+z^2(x)}{\varepsilon }{\mathrm e}_\parallel (x)=\frac{r+x^2_3}{\varepsilon }\left( -\frac{x_2}{r},\, \frac{x_1}{r}, \, 0\right) ^\top , \end{aligned} \end{aligned}$$

with \(r=\sqrt{x_1^2+x_2^2}\).
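These fields fit the setting of Sect. 2.1 with \(E_r=0.1z\), \(E_z=0.1r\) and \(b(r,z)=r+z^2\). The following sketch, with illustrative function names and an arbitrarily chosen rotation angle, evaluates the Cartesian expressions and checks that \(E_\parallel =0\) and that the cylindrical components are unchanged under a rotation about \({\mathrm e}_z\).

```python
import math

EPS = 1e-3  # epsilon, as in the reported experiments

def E_field(x):
    r = math.hypot(x[0], x[1])
    return (0.1 * x[0] * x[2] / r, 0.1 * x[1] * x[2] / r, 0.1 * r)

def B_field(x):
    r = math.hypot(x[0], x[1])
    b = (r + x[2] ** 2) / EPS
    return (-b * x[1] / r, b * x[0] / r, 0.0)

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def cylindrical_components(x):
    """Return (E_r, E_par, E_z, eps*|B|) at the Cartesian point x."""
    r = math.hypot(x[0], x[1])
    e_r = (x[0] / r, x[1] / r, 0.0)
    e_par = (-x[1] / r, x[0] / r, 0.0)
    E, B = E_field(x), B_field(x)
    return dot(E, e_r), dot(E, e_par), E[2], EPS * math.sqrt(dot(B, B))

def rotate_about_z(x, angle):
    c, s = math.cos(angle), math.sin(angle)
    return (c * x[0] - s * x[1], s * x[0] + c * x[1], x[2])

x = (1.0 / 3.0, 1.0 / 4.0, 1.0 / 2.0)
comps = cylindrical_components(x)
comps_rot = cylindrical_components(rotate_about_z(x, 0.7))
print(comps)
print(comps_rot)
```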

Fig. 2. Particle trajectories for \(t\le 1/\varepsilon \) with \(\varepsilon =10^{-3}\) as computed by the modified Boris with \(h=0.04\) (left) and by the Boris method with \(h=0.01\) (right)

Fig. 3. Particle trajectories for \(t\le 1/\varepsilon \) with \(\varepsilon =10^{-3}\) projected onto the \(r-z\) plane as computed by the modified Boris with \(h=0.04\) (left) and by the Boris method with \(h=0.01\) (right)

Fig. 4. The errors \(r(x^n)-r^{\text {ref}}\), \(z(x^n)-z^{\text {ref}}\) and \(v_\parallel ^n-v_\parallel ^{\text {ref}}\) as functions of time, along the numerical solution of the modified Boris algorithm with \(\varepsilon =10^{-3}\), \(T=5/\varepsilon \) and three different h

Figure 2 displays the trajectories computed by the standard and modified Boris methods with final time \(T=1/\varepsilon \), initial position \(x(0)=(1/3,1/4,1/2)^\top \), and initial velocity \(\dot{x}(0)=(2/5, 2/3,1)^\top \). The projections of the computed particle trajectories onto the \((r,z)\) plane are shown in Fig. 3. Notably, the modified Boris method accurately predicts the trajectory even with a large time step size of \(h=40\varepsilon \), while the standard Boris method yields incorrect drift motions with large time steps.

To verify the long-term error behavior of the modified Boris algorithm, we choose \(T=5/\varepsilon \) and initial values \(x(0)=(1,0,0)^\top \), \(\dot{x}(0)=(2/5, 2/3,1)^\top \). Figure 4 displays the errors of r, z, and \(v_\parallel \) along the numerical solution with \(\varepsilon =10^{-3}\) and three different time steps \(h=0.16, 0.32, 0.64\); the errors are of size \(O(h^2)\), in accordance with our theoretical results. Figure 5 presents analogous results for \(\varepsilon =10^{-4}\) with \(h=0.08, 0.16, 0.32\). All reference solutions were obtained using the standard Boris method with a small time step size \(h=0.05\varepsilon \).

4 Proof of main results

The theorems will primarily be proved based on the modulated Fourier expansions for the exact and numerical solutions provided in [10]. In this section, we will derive the guiding center equations in toroidal geometry and express all the \(O(\varepsilon )\) terms explicitly.

Following [6], we diagonalize the linear map \(v\mapsto v\times B(x)\) and denote the eigenvalues as \(\lambda _1=\textrm{i}|B(x)|\), \(\lambda _0=0\), and \(\lambda _{-1}=-\textrm{i}|B(x)|\). The corresponding normalized eigenvectors are denoted by \(\nu _1(x), \, \nu _0(x), \, \nu _{-1}(x)\) and the orthogonal projections onto the eigenspaces are denoted by \(P_j(x)=\nu _j(x)\nu _j(x)^*\). It is noted that \(P_\parallel (x)=P_0(x)\) and \(P_\perp (x)=I-P_\parallel (x)=P_1(x)+P_{-1}(x)\).
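In the toroidal geometry the eigenvectors can be written explicitly as \(\nu _{\pm 1}=({\mathrm e}_z \pm \textrm{i}\,{\mathrm e}_r)/\sqrt{2}\) and \(\nu _0={\mathrm e}_\parallel \), the form used later in the proof. The sketch below checks the eigenvalue relation and the resolution of the identity \(P_1+P_0+P_{-1}=I\) numerically; the sample point, the field strength taken from the example of Sect. 3.2 and the helper names are assumptions for illustration.

```python
import math

def cross(a, b):
    """Cross product; also valid for complex entries."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def projector(nu):
    """P = nu nu^* as a nested list."""
    return [[complex(na) * complex(nb).conjugate() for nb in nu] for na in nu]

x = (1.0 / 3.0, 1.0 / 4.0, 1.0 / 2.0)          # sample point (illustrative)
eps = 1e-3
r = math.hypot(x[0], x[1])
e_r = (x[0] / r, x[1] / r, 0.0)
e_par = (-x[1] / r, x[0] / r, 0.0)
e_z = (0.0, 0.0, 1.0)

B_norm = (r + x[2] ** 2) / eps                  # |B(x)| for the example field
B_vec = tuple(B_norm * c for c in e_par)

s = 1.0 / math.sqrt(2.0)
nu_plus = tuple(s * (ez + 1j * er) for ez, er in zip(e_z, e_r))
nu_minus = tuple(s * (ez - 1j * er) for ez, er in zip(e_z, e_r))

# Residuals of nu x B = lambda nu for lambda = +i|B| and -i|B|
res_plus = max(abs(c - 1j * B_norm * n)
               for c, n in zip(cross(nu_plus, B_vec), nu_plus))
res_minus = max(abs(c + 1j * B_norm * n)
                for c, n in zip(cross(nu_minus, B_vec), nu_minus))

P = [projector(nu_plus), projector(e_par), projector(nu_minus)]
identity_defect = max(abs(sum(Pk[i][j] for Pk in P) - (1.0 if i == j else 0.0))
                      for i in range(3) for j in range(3))
print(res_plus, res_minus, identity_defect)
```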

4.1 Proof of Theorem 3.1

The proof is structured into three parts (a)-(c).

Fig. 5. The errors \(r(x^n)-r^{\text {ref}}\), \(z(x^n)-z^{\text {ref}}\) and \(v_\parallel ^n-v_\parallel ^{\text {ref}}\) as functions of time, along the numerical solution of the modified Boris algorithm with \(\varepsilon =10^{-4}\), \(T=5/\varepsilon \) and three different h

(a) The equation of guiding center motion in Cartesian coordinates.

According to Theorem 4.1 of [6], it is known that the solution of (2.1)–(2.2) can be written as

$$\begin{aligned} x(t)=\sum _{|k|\le N-1}y^k(t)\,{\mathrm e}^{{\textrm{i}}k\varphi (t)/\varepsilon } + R_N(t), \qquad 0\le t \le c\varepsilon , \end{aligned}$$

where the phase function satisfies \(\dot{\varphi }(t)=|B_1(y^0(t))|\). The coefficient functions \(y^k(t)\) together with their derivatives (up to order N) are bounded as

$$\begin{aligned} y^k=O(\varepsilon ^{|k|}) \quad \forall |k|\le N-1 \end{aligned}$$

and further satisfy

$$\begin{aligned} \dot{y}^0 \times B_1(y^0)=O(\varepsilon ), \quad y^k_j =O(\varepsilon ^2) \quad \text {for} \quad |k|=1, j\ne k, \end{aligned}$$

where \(y^k(t)=y^k_1(t)+y^k_0(t)+y^k_{-1}(t)\) with \(y^k_j=P_j(y^0)y^k\). The remainder term and its derivative are bounded by

$$\begin{aligned} R_N(t)=O(\varepsilon ^N),\quad \dot{R}_N(t)=O(\varepsilon ^{N-1}). \end{aligned}$$

Similar to Theorem 4.1 of [10], we can divide the interval \([0, c/\varepsilon ]\) into small intervals of length \(O(\varepsilon )\). On each subinterval, we consider the above modulated Fourier expansion. This implies that x(t) can be expressed as the modulated Fourier expansion for longer time intervals

$$\begin{aligned} x(t)=\sum _{|k|\le N-1}y^k(t)\,{\mathrm e}^{{\textrm{i}}k\varphi (t)/\varepsilon } + R_N(t), \qquad 0\le t \le \frac{c}{\varepsilon }, \end{aligned}$$
(4.1)

where \(y^k(t)\) are piecewise continuous functions with jumps of size \(O(\varepsilon ^{N})\) at integral multiples of \(\varepsilon \), and they are smooth elsewhere. The bounds on the coefficients and on the remainder term are the same as before.

After inserting (4.1) into the continuous system and comparing the coefficients of \({\mathrm e}^{\textrm{i}k\varphi (t)/\varepsilon }\), we obtain the differential equations for \(y^k(t)\). Specifically, for \(k=0\) and \(k=\pm 1\), we have

$$\begin{aligned} \begin{aligned} \ddot{y}^0=\ {}&\dot{y}^0\times B(y^0)+E(y^0)+\underbrace{2\textrm{Re}\left( \textrm{i}|B|y^1\times B'(y^0)y^{-1}\right) }_{=:I}\\&+\underbrace{2\textrm{Re}\left( \dot{y}_1^1\times B'(y^0)y^{-1}_{-1}\right) }_{=:II}+O(\varepsilon ^2),\\ \end{aligned} \end{aligned}$$
(4.2)

and

$$\begin{aligned} \begin{aligned} \pm 2\textrm{i}\frac{\dot{\varphi }}{\varepsilon }\dot{y}^{\pm 1}+\left( \pm \textrm{i}\frac{\ddot{\varphi }}{\varepsilon }-\frac{\dot{\varphi }^2}{\varepsilon ^2}\right) y^{\pm 1}=&\left( \dot{y}^{\pm 1}\pm \textrm{i}\frac{\dot{\varphi }}{\varepsilon }y^{\pm 1}\right) \times B(y^0)\\&+\dot{y}^0\times B'(y^0)y^{\pm 1}_{\pm 1}+O(\varepsilon ). \end{aligned} \end{aligned}$$
(4.3)

From (4.2), we can straightforwardly derive several slow drifts for \(P_\perp \dot{y}^0\) (see Remark 4.3 of [10]). Then, the guiding center \(y^0(t)\) satisfies

$$\begin{aligned} \dot{y}^0=P_{\parallel }\dot{y}^0+\frac{1}{|B|}P_{\parallel }\dot{y}^0\times \frac{{\mathrm d}{\mathrm e}_\parallel }{{\mathrm d}t}+\frac{1}{|B|^2}\left( E-\mu ^0\,\nabla |B|\right) \times B+O(\varepsilon ^2), \end{aligned}$$
(4.4)

with \(B, \nabla |B|, P_\parallel ={\mathrm e}_\parallel {\mathrm e}_\parallel ^\top \) and E evaluated at the guiding center \(y^0\). The initial value of \(y^0\) is

$$\begin{aligned} y^0(0)= x(0)+ \frac{\dot{x}(0)\times B(x(0))}{|B(x(0))|^2}+O(\varepsilon ^2).\\ \end{aligned}$$

(b) The equations in toroidal geometry.

In the toroidal geometry, \(y^0(t)\) can be expressed as

$$\begin{aligned} y^0=r(y^0){\mathrm e}_r(y^0)+z(y^0){\mathrm e}_z=:r^0{\mathrm e}_r(y^0)+z^0 {\mathrm e}_z. \end{aligned}$$

— Multiplying (4.4) with \({\mathrm e}_r^\top ={\mathrm e}_r(y^0)^\top \) gives

$$\begin{aligned} {\mathrm e}_r^\top \dot{y}^0 =\frac{\varepsilon v^0_\parallel }{b} {\mathrm e}_r^\top \left( {\mathrm e}_\parallel \times \frac{{\mathrm d}{\mathrm e}_\parallel }{{\mathrm d}t}\right) +\frac{\varepsilon }{b} {\mathrm e}_r^\top \left( (E-\mu ^0\,\nabla b)\times {\mathrm e}_\parallel \right) +O(\varepsilon ^2), \end{aligned}$$
(4.5)

where \({\mathrm e}_r, {\mathrm e}_\parallel \) are evaluated at \(y^0\) and

$$\begin{aligned} v^0_\parallel := {\mathrm e}_\parallel ^\top \dot{y}^0. \end{aligned}$$
(4.6)

From (2.6), it is known that

$$\begin{aligned} \begin{aligned} \dot{{\mathrm e}}_r(y^0)&=\frac{v^0_\parallel }{r^0}{\mathrm e}_\parallel (y^0), \quad \dot{{\mathrm e}}_\parallel (y^0)=-\frac{v^0_\parallel }{r(y^0)} {\mathrm e}_r(y^0),\\ \ddot{{\mathrm e}}_\parallel (y^0)&=-\frac{{\mathrm d}}{{\mathrm d}t}\left( \frac{v^0_\parallel }{r^0}\right) {\mathrm e}_r(y^0)-\left( \frac{v^0_\parallel }{r^0}\right) ^2{\mathrm e}_\parallel (y^0). \end{aligned} \end{aligned}$$
(4.7)

Then the left-hand side of (4.5) can be expressed as

$$\begin{aligned} {\mathrm e}_r^\top \dot{y}^0 = \frac{{\mathrm d}}{{\mathrm d}t}( {\mathrm e}_r^\top y^0 ) - \dot{{\mathrm e}}_r^\top y^0=\frac{{\mathrm d}r^0}{{\mathrm d}t}, \end{aligned}$$

and the first term on the right-hand side of (4.5) vanishes since

$$\begin{aligned} {\mathrm e}_r^\top \left( {\mathrm e}_\parallel \times \frac{{\mathrm d}{\mathrm e}_\parallel }{{\mathrm d}t} \right) = -\frac{v^0_\parallel }{r^0} {\mathrm e}_r^\top ( {\mathrm e}_\parallel \times {\mathrm e}_r)=0. \end{aligned}$$

Using the fact that \({\mathrm e}_r\times {\mathrm e}_\parallel ={\mathrm e}_z, \, {\mathrm e}_z\times {\mathrm e}_\parallel =-{\mathrm e}_r\), \(E=E_r{\mathrm e}_r+E_z{\mathrm e}_z\) and \(\nabla b=\partial _r b \, {\mathrm e}_r+\partial _z b \, {\mathrm e}_z\), we obtain

$$\begin{aligned} {\mathrm e}_r^\top \left( (E-\mu ^0\,\nabla b(r^0, z^0))\times {\mathrm e}_\parallel \right) = -E_z+\mu ^0\partial _z b. \end{aligned}$$
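The intermediate step behind this identity, written out with the relations just listed (a worked expansion added for readability, with all functions evaluated at \((r^0,z^0)\)), is

$$\begin{aligned} \begin{aligned} (E-\mu ^0\,\nabla b)\times {\mathrm e}_\parallel&=(E_r-\mu ^0\partial _r b)\,{\mathrm e}_r\times {\mathrm e}_\parallel +(E_z-\mu ^0\partial _z b)\,{\mathrm e}_z\times {\mathrm e}_\parallel \\&=(E_r-\mu ^0\partial _r b)\,{\mathrm e}_z-(E_z-\mu ^0\partial _z b)\,{\mathrm e}_r, \end{aligned} \end{aligned}$$

so that the inner product with \({\mathrm e}_r\) retains only the second term.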

Thus (4.5) is equivalent to

$$\begin{aligned} \frac{{\mathrm d}r^0}{{\mathrm d}t}=\frac{\varepsilon }{b}( -E_z+\mu ^0\partial _z b)+O(\varepsilon ^2), \end{aligned}$$

where the functions \(E_z, b, \partial _z b\) are evaluated at \((r^0,z^0)\). The initial value of \(r^0\) can be expressed as

$$\begin{aligned} r^0(0)={\mathrm e}_r(y^0(0))^\top y^0(0)={\mathrm e}_r(x(0))^\top x(0) + O(\varepsilon )=r(x(0))+O(\varepsilon ). \end{aligned}$$

— Multiplying (4.4) with \({\mathrm e}_z^\top \) gives

$$\begin{aligned} \begin{aligned} {\mathrm e}_z^\top \dot{y}^0 =\ {}&\frac{\varepsilon v^0_\parallel }{b(r^0, z^0)} {\mathrm e}_z^\top \left( {\mathrm e}_\parallel \times \frac{{\mathrm d}{\mathrm e}_\parallel }{{\mathrm d}t}\right) \\&+\frac{\varepsilon }{b(r^0, z^0)} {\mathrm e}_z^\top \left( (E-\mu ^0\,\nabla b(r^0, z^0))\times {\mathrm e}_\parallel \right) +O(\varepsilon ^2). \end{aligned} \end{aligned}$$
(4.8)

Similarly, we have

$$\begin{aligned} {\mathrm e}_z^\top \dot{y}^0&= \frac{{\mathrm d}}{{\mathrm d}t}({\mathrm e}_z^\top y^0)=\frac{{\mathrm d}z^0}{{\mathrm d}t}, \\ {\mathrm e}_z^\top \left( {\mathrm e}_\parallel \times \frac{{\mathrm d}{\mathrm e}_\parallel }{{\mathrm d}t}\right)&= -\frac{v^0_\parallel }{r^0}{\mathrm e}_z^\top ( {\mathrm e}_\parallel \times {\mathrm e}_r)=\frac{v^0_\parallel }{r^0}, \end{aligned}$$

and

$$\begin{aligned} {\mathrm e}_z^\top \left( (E-\mu ^0\,\nabla b)\times {\mathrm e}_\parallel \right) = E_r-\mu ^0\partial _r b, \end{aligned}$$

then (4.8) can be expressed as

$$\begin{aligned} \frac{{\mathrm d}z^0}{{\mathrm d}t}=\varepsilon \frac{(v^0_\parallel )^2}{r^0 b}+\varepsilon \frac{E_r}{b}-\varepsilon \frac{\mu ^0}{b}\partial _r b+O(\varepsilon ^2), \end{aligned}$$

where the functions \(E_r, b, \partial _r b\) are evaluated at \((r^0,z^0)\). The initial value of \(z^0\) is

$$\begin{aligned} z^0(0)={\mathrm e}_z^\top y^0(0)={\mathrm e}_z^\top x(0) + O(\varepsilon )=z(x(0))+O(\varepsilon ). \end{aligned}$$

— By the definition of (4.6) we can derive the equation for \(v^0_\parallel \)

$$\begin{aligned} \frac{{\mathrm d}}{{\mathrm d}t}v^0_\parallel =\frac{{\mathrm d}}{{\mathrm d}t} ({\mathrm e}_\parallel ^\top \dot{y}^0)= \dot{{\mathrm e}}_\parallel ^\top \dot{y}^0+ {\mathrm e}_\parallel ^\top \ddot{y}^0. \end{aligned}$$
(4.9)

The first term on the right-hand side is

$$\begin{aligned} \dot{{\mathrm e}}_\parallel ^\top \dot{y}^0=-\frac{v^0_\parallel }{r^0}\frac{{\mathrm d}r^0}{{\mathrm d}t} \end{aligned}$$
(4.10)

using (4.7). In the following we will demonstrate that the second term \({\mathrm e}_\parallel ^\top \ddot{y}^0\) is of size \(O(\varepsilon ^2)\).

Considering the expression of \(B'\) given in (2.6), it is evident that \( B'(y^0)y^{k}_{\pm 1}\) is parallel to \({\mathrm e}_\parallel \). Consequently, \( {\mathrm e}_\parallel ^\top II=0\) and

$$\begin{aligned} {\mathrm e}_\parallel ^\top I= {\mathrm e}_\parallel ^\top \left( 2 \, \textrm{Re}\left( \textrm{i}|B|y_1^1\times B'(y^0)y_0^{-1}\right) \right) +O(\varepsilon ^2). \end{aligned}$$

The algebraic equation of \(y_0^{\pm 1}\) can be derived by applying \(P_\parallel (y^0)\) to Eq. (4.3)

$$\begin{aligned} \pm 2\textrm{i}\frac{\dot{\varphi }}{\varepsilon }P_\parallel \dot{y}^{\pm 1}+\left( \pm \textrm{i}\frac{\ddot{\varphi }}{\varepsilon }-\frac{\dot{\varphi }^2}{\varepsilon ^2}\right) y_0^{\pm 1}=P_\parallel \left( \dot{y}^0\times B'(y^0)y^{\pm 1}_{\pm 1}\right) +O(\varepsilon ). \end{aligned}$$

The dominant term is \(-\dot{\varphi }^2/\varepsilon ^2 y_0^{\pm 1}\). Using that \(B'(y^0)y^{k}_{\pm 1}\) is parallel to \({\mathrm e}_\parallel \), we have \(P_\parallel \left( \dot{y}^0\times B'(y^0)y^{\pm 1}_{\pm 1}\right) =0\). Hence we obtain the following relation for \(y_0^{\pm 1}\)

$$\begin{aligned} \begin{aligned} y_0^{\pm 1}&=\pm 2\textrm{i}\frac{\varepsilon }{\dot{\varphi }}P_\parallel \dot{y}^{\pm 1}+O(\varepsilon ^3)\\&=\pm 2\textrm{i}\frac{\varepsilon }{\dot{\varphi }}\dot{y}^{\pm 1}_0 \mp 2\textrm{i}\frac{\varepsilon }{\dot{\varphi }}\dot{P}_\parallel y^{\pm 1}_{\pm 1}+O(\varepsilon ^3). \end{aligned} \end{aligned}$$

By differentiation and substitution, the first term on the right-hand side of the above equation can be eliminated. Using (4.7), we obtain

$$\begin{aligned} \begin{aligned} y_0^{\pm 1}&=\mp 2\textrm{i}\frac{\varepsilon }{\dot{\varphi }}(\dot{{\mathrm e}}_\parallel {\mathrm e}^\top _\parallel +{\mathrm e}_\parallel \dot{{\mathrm e}}^\top _\parallel ) y^{\pm 1}_{\pm 1}+O(\varepsilon ^3)\\&=\pm 2\textrm{i}\frac{\varepsilon }{\dot{\varphi }}\frac{v^0_\parallel }{r^0}({{\mathrm e}}^\top _r y^{\pm 1}_{\pm 1}){\mathrm e}_\parallel +O(\varepsilon ^3).\\ \end{aligned} \end{aligned}$$

Denoting \(y^{1}_{1}=\zeta \,\nu _1\) and \(y^{-1}_{-1}=\bar{\zeta }\,\nu _{-1}\), with \(\nu _{\pm 1}=({\mathrm e}_z \pm \textrm{i}\,{\mathrm e}_r(y^0))/{\sqrt{2}}\), and substituting it into the above equation yields \(y^{1}_0=\eta \, {\mathrm e}_\parallel +O(\varepsilon ^3)\) and \(y^{-1}_0=\bar{\eta }\,{\mathrm e}_\parallel +O(\varepsilon ^3)\), where \(\eta =-\varepsilon (\sqrt{2}{ v^0_\parallel }/{\dot{\varphi }r^0}) \zeta \). Then we have

$$\begin{aligned} \begin{aligned} 2\,\textrm{Re}\left( \textrm{i}|B|y_1^1\times B'(y^0)y_0^{-1}\right) =&\sqrt{2}\,\textrm{Re}\left( \textrm{i}|B| \zeta \bar{\eta }({\mathrm e}_z\times B'(y^0){\mathrm e}_\parallel +\textrm{i}\,{\mathrm e}_r\times B'(y^0){\mathrm e}_\parallel )\right) \\&+O(\varepsilon ^2)\\ =&-\sqrt{2}\,|B| \zeta \bar{\eta } \ {\mathrm e}_r\times B'(y^0){\mathrm e}_\parallel +O(\varepsilon ^2). \end{aligned} \end{aligned}$$

From the expression of \(B'\) in (2.6), we know that \(B'(y^0){\mathrm e}_\parallel \) is the combination of \({\mathrm e}_\parallel \) and \({\mathrm e}_r\), and thus \({\mathrm e}_\parallel ^\top ( {\mathrm e}_r\times B'(y^0){\mathrm e}_\parallel )=0\). This means

$$\begin{aligned} {\mathrm e}_\parallel ^\top \ddot{y}^0= {\mathrm e}_\parallel ^\top I +O(\varepsilon ^2)=O(\varepsilon ^2), \end{aligned}$$

and (4.9) is equivalent to

$$\begin{aligned} \frac{{\mathrm d}}{{\mathrm d}t}v^0_\parallel =-\frac{v^0_\parallel }{r^0}\frac{{\mathrm d}r^0}{{\mathrm d}t}+ O(\varepsilon ^2). \end{aligned}$$

The initial value of \(v^0_\parallel \) is

$$\begin{aligned} v^0_\parallel (0)={\mathrm e}_\parallel (y^0(0))^\top \dot{y}^0(0)= {\mathrm e}_\parallel (x(0))^\top \dot{x}(0) + O(\varepsilon ). \end{aligned}$$

(c) From short to long time intervals

Denoting by \(y^{0,[n]}, r^{0,[n]}, z^{0,[n]}, v^{0,[n]}_\parallel \) the functions \(y^{0}, r^0, z^0, v^0_\parallel \) on time interval \(n\varepsilon \le t\le (n+1)\varepsilon \), from (b), it is known that these coefficients satisfy the following equations

$$\begin{aligned} \begin{aligned} \frac{{\mathrm d}r^{0,[n]}}{{\mathrm d}t}&=-\varepsilon \frac{E_z}{b}+\varepsilon \frac{\mu ^0}{b}\partial _z b+O(\varepsilon ^2), \\ \frac{{\mathrm d}z^{0,[n]}}{{\mathrm d}t}&=\varepsilon \frac{(v^{0,[n]}_\parallel )^2}{b r^{0,[n]}}+\varepsilon \frac{E_r}{b}-\varepsilon \frac{\mu ^0}{b}\partial _r b+O(\varepsilon ^2),\\ \frac{{\mathrm d}v^{0,[n]}_\parallel }{{\mathrm d}t}&=\varepsilon \frac{v^{0,[n]}_\parallel }{r^{0,[n]}}\left( \frac{E_z}{b}-\frac{\mu ^0}{b}\partial _z b\right) +O(\varepsilon ^2), \end{aligned} \end{aligned}$$
(4.11)

with \(E_r, E_z, b, \partial _r b, \partial _z b\) evaluated at \((r^{0,[n]},z^{0,[n]})\) and following initial values

$$\begin{aligned} \begin{aligned} r^{0,[n]}(n\varepsilon )&=r(x(n\varepsilon ))+O(\varepsilon )\\ z^{0,[n]}(n\varepsilon )&=z(x(n\varepsilon ))+O(\varepsilon )\\ v^{0,[n]}_\parallel (n\varepsilon )&=\langle \dot{x}(n\varepsilon ), {\mathrm e}_\parallel (x(n\varepsilon )) \rangle + O(\varepsilon ). \end{aligned} \end{aligned}$$

From Eq. (4.1), on every time interval, we have

$$\begin{aligned} x(t)=y^{0,[n]}(t)+ O(\varepsilon ), \ v_\parallel (t)=v_\parallel ^{0,[n]}(t)+ O(\varepsilon ), \ n\varepsilon \le t \le (n+1)\varepsilon , \end{aligned}$$

and thus

$$\begin{aligned} r(t)=r^{0,[n]} + O(\varepsilon ), \ z(t)=z^{0,[n]} + O(\varepsilon ), \ v_\parallel (t)=v_\parallel ^{0,[n]} + O(\varepsilon ), \ n\varepsilon \le t \le (n+1)\varepsilon . \end{aligned}$$

In view of the factor \(\varepsilon \) in front of the right hand side of the differential Eqs. (3.1) and (4.11), we have

$$\begin{aligned} r^{0,[0]}(t)-\tilde{r}(t)=O(\varepsilon ), \ z^{0,[0]}(t)-\tilde{z}(t)=O(\varepsilon ), \ v^{0,[0]}_\parallel (t)-\tilde{v}(t)=O(\varepsilon ), \ 0\le t \le \frac{c}{\varepsilon }. \end{aligned}$$

Since \(y^{0,[n-1]}(n\varepsilon )=y^{0,[n]}(n\varepsilon )+O(\varepsilon ^N), \dot{y}^{0,[n-1]}(n\varepsilon )=\dot{y}^{0,[n]}(n\varepsilon )+O(\varepsilon ^{N-1})\), we have

$$\begin{aligned} \begin{aligned} r^{0,[n-1]}(n\varepsilon )&=r^{0,[n]}(n\varepsilon )+O(\varepsilon ^N)\\ z^{0,[n-1]}(n\varepsilon )&=z^{0,[n]}(n\varepsilon )+O(\varepsilon ^N)\\ v_\parallel ^{0,[n-1]}(n\varepsilon )&=v_\parallel ^{0,[n]}(n\varepsilon )+O(\varepsilon ^{N-1}). \end{aligned} \end{aligned}$$

In view of the factor \(\varepsilon \) in front of the right-hand side of (4.11), we have

$$\begin{aligned} \begin{aligned} r^{0,[n]}(t)-r^{0,[n-1]}(t)&=O(\varepsilon ^N), \\ z^{0,[n]}(t)-z^{0,[n-1]}(t)&=O(\varepsilon ^N), \quad n\varepsilon \le t\le c/\varepsilon .\\ v_\parallel ^{0,[n]}(t)-v_\parallel ^{0,[n-1]}(t)&=O(\varepsilon ^{N-1}), \end{aligned} \end{aligned}$$

With the above estimates, we obtain, for \(n\varepsilon \le t\le (n+1)\varepsilon \le c/\varepsilon \)

$$\begin{aligned} \begin{aligned} r(t)-\tilde{r}(t)&=r(t)-r^{0,[n]}+\sum _{j=1}^{n} \left( r^{0,[j]}(t)-r^{0,[j-1]}(t)\right) + r^{0,[0]}(t)-\tilde{r}(t)\\&=O(\varepsilon )+O(n\varepsilon ^N)+O(\varepsilon )=O(\varepsilon ), \\ z(t)-\tilde{z}(t)&=z(t)-z^{0,[n]}+\sum _{j=1}^{n} \left( z^{0,[j]}(t)-z^{0,[j-1]}(t)\right) + z^{0,[0]}(t)-\tilde{z}(t)\\&=O(\varepsilon )+O(n\varepsilon ^N)+O(\varepsilon )=O(\varepsilon ), \\ v_\parallel (t)-\tilde{v}(t)&=v_\parallel (t)-v_\parallel ^{0,[n]}+\sum _{j=1}^{n} \left( v_\parallel ^{0,[j]}(t)-v_\parallel ^{0,[j-1]}(t)\right) + v_\parallel ^{0,[0]}(t)-\tilde{v}(t)\\&=O(\varepsilon )+O(n\varepsilon ^{N-1})+O(\varepsilon )=O(\varepsilon ), \\ \end{aligned} \end{aligned}$$

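where we used that \(n\le c/\varepsilon ^2\), so that \(n\varepsilon ^N=O(\varepsilon ^{N-2})\) and \(n\varepsilon ^{N-1}=O(\varepsilon ^{N-3})\) are \(O(\varepsilon )\) for \(N\ge 4\). This is the stated result of Theorem 3.1.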

4.2 Proof of Theorem 3.2

Similar to the proof of Theorem 3.1, we structure the proof into three parts.

(a) For a general strong magnetic field, the modulated Fourier expansion of the numerical solution is valid on time intervals of length O(h). Using the uniqueness of the modulated Fourier expansion, we can patch together many short-time expansions in the same manner as for the exact solution, thereby obtaining an expansion over the longer time scale \(O(1/\varepsilon )\).

From Theorem 4.2 of [10], it is known that the numerical solution \(x^n\) given by the modified Boris algorithm (2.7)–(2.9) with a step size h satisfying

$$\begin{aligned} c_*\varepsilon \le h^2 \le C_*\varepsilon \end{aligned}$$

can be written as

$$\begin{aligned} x^n=y^0(t_n) + (-1)^n y^1(t_n) +R_N(t_n), \qquad t_n=nh \le c/\varepsilon , \end{aligned}$$
(4.12)

where \(y^0=O(1)\) and \(y^1=O(h^2)\) are piecewise continuous, with jumps of size \(O(h^N)\) at integral multiples of h, and smooth elsewhere. They are unique up to \(O(h^N)\) and satisfy \(P_\perp (y^0)\dot{y}^0=O(h^2)\), \(P_0(y^0)y^1=O(h^4)\).

After inserting (4.12) into the numerical scheme (2.7), and separating the terms without \((-1)^n\), we obtain the equation for the guiding center \(y^0(t)\)

$$\begin{aligned} \begin{aligned} \ddot{y}^0+h^2\ddddot{y}^0+O(h^4)=\ {}&\bigl (\dot{y}^0+ h^2\dddot{y}^0+ O(h^4)\bigr )\times B(y^0)+E(y^0)\\&-\mu ^0\,\nabla |B|(y^0)-\underbrace{\dot{y}^1_\perp \times B'(y^0)y^1_\perp }_{=:III}+O(h^4), \end{aligned} \end{aligned}$$
(4.13)

where \(III=O(h^2)\) in our stepsize regime \(h^2\sim \varepsilon \). Taking the projection \(P_{\pm 1}=P_{\pm 1}(y^0)\) on both sides gives

$$\begin{aligned} \begin{aligned} P_{\pm 1}\ddot{y}^0+O(h^2)=&\pm \textrm{i}|B(y^0)|P_{\pm 1}(\dot{y}^0+ h^2\dddot{y}^0+ O(h^4))\\&+P_{\pm 1}\left( E(y^0)-\mu ^0\,\nabla |B|(y^0)\right) +O(h^2), \end{aligned} \end{aligned}$$

which means (recall that \(h^2\sim \varepsilon \))

$$\begin{aligned} P_{\pm 1}\dot{y}^0=-h^2P_{\pm 1}\dddot{y}^0\mp \frac{\textrm{i}}{|B|}P_{\pm 1}\ddot{y}^0\pm \frac{\textrm{i}}{|B|}P_{\pm 1}\left( E(y^0)-\mu ^0\,\nabla |B|(y^0)\right) +O(\varepsilon h^2). \end{aligned}$$

Denoting \(g_{\pm 1}=P_{\pm 1}\dot{y}^0\), we can express \(P_{\pm 1}\ddot{y}^0\) as \(P_{\pm 1}\ddot{y}^0=\dot{g}_{\pm 1}-\dot{P}_{\pm 1}\dot{y}^0\) and \(P_{\pm 1}\dddot{y}^0\) as \(P_{\pm 1}\dddot{y}^0=\ddot{g}_{\pm 1}-2\dot{P}_{\pm 1}\ddot{y}^0-{\ddot{P}}_{\pm 1}\dot{y}^0\). By differentiation and substitution, we can remove the derivatives of \(g_{\pm 1}\) resulting in

$$\begin{aligned} \begin{aligned} P_{\pm 1}\dot{y}^0=\ {}&h^2(2\dot{P}_{\pm 1}\ddot{y}^0+{\ddot{P}}_{\pm 1}\dot{y}^0)\pm \frac{\textrm{i}}{|B|}\dot{P}_{\pm 1}\dot{y}^0\\&\pm \frac{\textrm{i}}{|B|}P_{\pm 1}\left( E(y^0)-\mu ^0\,\nabla |B|(y^0)\right) +O(\varepsilon h^2). \end{aligned} \end{aligned}$$

Using the fact that \(\dot{P}_{1}+\dot{P}_{-1}+\dot{P}_{0}=0\), we obtain

$$\begin{aligned} \begin{aligned} \dot{y}^0 =&P_{\parallel }\dot{y}^0+P_{1}\dot{y}^0+P_{-1}\dot{y}^0\\ =&P_{\parallel }\dot{y}^0-h^2(2\dot{P}_\parallel \ddot{y}^0+{\ddot{P}}_\parallel \dot{y}^0)+\frac{1}{|B|}P_{\parallel }\dot{y}^0\times \frac{{\mathrm d}{\mathrm e}_\parallel }{{\mathrm d}t}\\&+\frac{1}{|B|^2}\left( E-\mu ^0\,\nabla |B|\right) \times B+O(\varepsilon h^2) \end{aligned} \end{aligned}$$
(4.14)

with \(B, \nabla |B|, P_\parallel ={\mathrm e}_\parallel {\mathrm e}_\parallel ^\top \) and E evaluated at the guiding center \(y^0\). Comparing (4.14) with the guiding center equation (4.4) of the exact solution, we observe the presence of additional \(O(h^2)\) terms.

(b) Next, we derive the guiding center equation in toroidal geometry, where \(y^0(t)\) can be expressed as

$$\begin{aligned} y^0=r(y^0){\mathrm e}_r(y^0)+z(y^0){\mathrm e}_z=:r^0{\mathrm e}_r(y^0)+z^0{\mathrm e}_z. \end{aligned}$$

— Multiplying (4.14) with \({\mathrm e}_r^\top ={\mathrm e}_r(y^0)^\top \) gives

$$\begin{aligned} \begin{aligned} {\mathrm e}_r^\top \dot{y}^0 =&-2h^2 {\mathrm e}_r^\top \dot{P}_\parallel \ddot{y}^0-h^2{\mathrm e}_r^\top {\ddot{P}}_\parallel \dot{y}^0+\frac{\varepsilon v^0_\parallel }{b(r,z)} {\mathrm e}_r^\top \left( {\mathrm e}_\parallel \times \frac{{\mathrm d}{\mathrm e}_\parallel }{{\mathrm d}t}\right) \\&+\frac{\varepsilon }{b(r,z)} {\mathrm e}_r^\top \left( (E-\mu ^0\,\nabla b(r,z))\times {\mathrm e}_\parallel \right) +O(\varepsilon h^2) \end{aligned} \end{aligned}$$
(4.15)

with \(v^0_\parallel :={\mathrm e}_\parallel ^\top \dot{y}^0\). Compared to (4.5), the only difference comes from the first two terms on the right-hand side, which we now compute.

Multiplying (4.13) with \({\mathrm e}_\parallel ^\top ={\mathrm e}_\parallel (y^0)^\top \) gives

$$\begin{aligned} {\mathrm e}_\parallel ^\top \ddot{y}^0=O(h^2), \end{aligned}$$

so the first term on the right-hand side of (4.15) is

$$\begin{aligned} -2h^2{\mathrm e}_r^\top \dot{P}_\parallel \ddot{y}^0=-2h^2{\mathrm e}_r^\top (\dot{{\mathrm e}}_\parallel {\mathrm e}_\parallel ^\top \ddot{y}^0)=O(h^4). \end{aligned}$$

Using (4.7), the second term on the right-hand side of (4.15) can be expressed as

$$\begin{aligned} \begin{aligned} {\mathrm e}_r^\top {\ddot{P}}_\parallel \dot{y}^0&={\mathrm e}_r^\top ( \ddot{{\mathrm e}}_\parallel {\mathrm e}_\parallel ^\top \dot{y}^0)+2{\mathrm e}_r^\top ( \dot{{\mathrm e}}_\parallel \dot{{\mathrm e}}_\parallel ^\top \dot{y}^0)\\&=-v^0_\parallel \frac{{\mathrm d}}{{\mathrm d}t}\left( \frac{v^0_\parallel }{r^0}\right) +2\left( \frac{v^0_\parallel }{r^0}\right) ^2\frac{{\mathrm d}r^0}{{\mathrm d}t}\\&=-\frac{v^0_\parallel }{r^0}\frac{{\mathrm d}v^0_\parallel }{{\mathrm d}t} + 3 \left( \frac{v^0_\parallel }{r^0}\right) ^2\frac{{\mathrm d}r^0}{{\mathrm d}t}. \end{aligned} \end{aligned}$$
(4.16)

Inserting

$$\begin{aligned} \frac{{\mathrm d}v^0_\parallel }{{\mathrm d}t}=\frac{{\mathrm d}}{{\mathrm d}t} ({\mathrm e}_\parallel ^\top \dot{y}^0)= \dot{{\mathrm e}}_\parallel ^\top \dot{y}^0+ {\mathrm e}_\parallel ^\top \ddot{y}^0=-\frac{v^0_\parallel }{r^0}\frac{{\mathrm d}r^0}{{\mathrm d}t}+ O(h^2) \end{aligned}$$

into (4.16) gives

$$\begin{aligned} {\mathrm e}_r^\top {\ddot{P}}_\parallel \dot{y}^0=4\left( \frac{v^0_\parallel }{r^0}\right) ^2\frac{{\mathrm d}r^0}{{\mathrm d}t}+O(h^2). \end{aligned}$$

Then (4.15) can be expressed as

$$\begin{aligned} \left( 1+4h^2\left( \frac{v^0_\parallel }{r^0}\right) ^2\right) \frac{{\mathrm d}r^0}{{\mathrm d}t}=\frac{\varepsilon }{b} {\mathrm e}_r^\top \left( (E-\mu ^0\,\nabla b)\times {\mathrm e}_\parallel \right) +O(\varepsilon h^2), \end{aligned}$$

which yields

$$\begin{aligned} \frac{{\mathrm d}r^0}{{\mathrm d}t}=-\varepsilon \frac{E_z}{b}+\varepsilon \frac{\mu ^0}{b}\partial _z b+O(\varepsilon h^2), \end{aligned}$$

where the functions \(E_z, b, \partial _z b\) are evaluated at \((r^0,z^0)\). The initial value of \(r^0\) can be expressed as

$$\begin{aligned} r^0(0)=\langle y^0(0),{\mathrm e}_r(y^0(0))\rangle =\langle x^0,{\mathrm e}_r(x^0)\rangle + O(h^2)=r(x^0)+O(h^2). \end{aligned}$$
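
The step from the implicit form to the explicit equation for \(r^0\) relies only on the boundedness of \(v^0_\parallel /r^0\) and on \(h^2\sim \varepsilon \). As a short sketch, with \(F_r\) a shorthand introduced here for illustration only,

$$\begin{aligned} \begin{aligned} \frac{{\mathrm d}r^0}{{\mathrm d}t}&=\frac{\varepsilon F_r+O(\varepsilon h^2)}{1+4h^2(v^0_\parallel /r^0)^2}=\bigl (1+O(h^2)\bigr )\bigl (\varepsilon F_r+O(\varepsilon h^2)\bigr )=\varepsilon F_r+O(\varepsilon h^2),\\ F_r&:=\frac{1}{b}\,{\mathrm e}_r^\top \left( (E-\mu ^0\,\nabla b)\times {\mathrm e}_\parallel \right) =-\frac{E_z}{b}+\frac{\mu ^0}{b}\partial _z b. \end{aligned} \end{aligned}$$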

— Multiplying (4.14) with \({\mathrm e}_z^\top \) gives

$$\begin{aligned} \begin{aligned} {\mathrm e}_z^\top \dot{y}^0 =&-2h^2 {\mathrm e}_z^\top \dot{P}_\parallel \ddot{y}^0 -h^2 {\mathrm e}_z^\top {\ddot{P}}_\parallel \dot{y}^0+\frac{\varepsilon v^0_\parallel }{b(r,z)} {\mathrm e}_z^\top \left( {\mathrm e}_\parallel \times \frac{{\mathrm d}{\mathrm e}_\parallel }{{\mathrm d}t}\right) \\&+\frac{\varepsilon }{b(r,z)} {\mathrm e}_z^\top \left( (E-\mu ^0\,\nabla b(r,z))\times {\mathrm e}_\parallel \right) +O(\varepsilon h^2), \end{aligned} \end{aligned}$$
(4.17)

where the first two terms on the right-hand side vanish by (4.7) and the orthogonality of \({\mathrm e}_z,{\mathrm e}_r,{\mathrm e}_\parallel \). Similar to the continuous case, we obtain

$$\begin{aligned} \frac{{\mathrm d}z^0}{{\mathrm d}t}=\varepsilon \frac{\left( v^0_\parallel \right) ^2}{r^0b}+\varepsilon \frac{E_r}{b}-\varepsilon \frac{\mu ^0}{b}\partial _r b+O(\varepsilon h^2), \end{aligned}$$

where the functions \(E_r, b, \partial _r b\) are evaluated at \((r^0,z^0)\). The initial value of \(z^0\) is

$$\begin{aligned} z^0(0)=\langle y^0(0),{\mathrm e}_z\rangle =\langle x^0,{\mathrm e}_z\rangle + O(h^2)=z(x^0)+O(h^2). \end{aligned}$$
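
The signs of the drift terms in the equations for \(r^0\) and \(z^0\) come from the orientation of the local frame. The following Python sketch checks the three relevant projections numerically. It assumes, consistently with the signs above but not stated explicitly in this part of the text, that \(({\mathrm e}_r,{\mathrm e}_\parallel ,{\mathrm e}_z)\) is the right-handed cylindrical frame with \({\mathrm e}_\parallel \) in the azimuthal direction. The angle `phi` and the components of `F` (standing for \(E-\mu ^0\,\nabla b\)) are arbitrary illustrative values.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Euclidean inner product of two 3-vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


# Assumed frame: right-handed cylindrical triad at azimuthal angle phi.
phi = 0.7                                        # arbitrary angle
e_r = (math.cos(phi), math.sin(phi), 0.0)
e_par = (-math.sin(phi), math.cos(phi), 0.0)     # azimuthal (toroidal) direction
e_z = (0.0, 0.0, 1.0)

# F stands for E - mu^0 * grad b; it has no parallel component.
F_r, F_z = 0.3, -1.2                             # arbitrary illustrative components
F = tuple(F_r * a + F_z * c for a, c in zip(e_r, e_z))

proj_r = dot(e_r, cross(F, e_par))   # equals -F_z: source of -E_z + mu^0 d_z b
proj_z = dot(e_z, cross(F, e_par))   # equals +F_r: source of +E_r - mu^0 d_r b
curv_z = dot(e_z, cross(e_par, e_r)) # equals -1: sign of the curvature term
print(proj_r, proj_z, curv_z)
```

With \(\dot{{\mathrm e}}_\parallel =-(v^0_\parallel /r^0){\mathrm e}_r\), the value \(-1\) of the last projection yields the curvature term \(+\varepsilon (v^0_\parallel )^2/(r^0b)\) in the equation for \(z^0\).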

— Finally, we need to derive the differential equation for \(v^0_\parallel \), which can be directly computed as in the continuous case

$$\begin{aligned} \frac{{\mathrm d}v^0_\parallel }{{\mathrm d}t}=\frac{{\mathrm d}}{{\mathrm d}t}\left( {\mathrm e}_\parallel ^\top \dot{y}^0\right) = \dot{{\mathrm e}}_\parallel ^\top \dot{y}^0+ {\mathrm e}_\parallel ^\top \ddot{y}^0. \end{aligned}$$
(4.18)

The first term on the right-hand side is the same as (4.10). Multiplying (4.13) with \({\mathrm e}_\parallel ^\top ={\mathrm e}_\parallel (y^0)^\top \) yields

$$\begin{aligned} \begin{aligned} {\mathrm e}_\parallel ^\top \ddot{y}^0&=-h^2 {\mathrm e}_\parallel ^\top \ddddot{y}^0+ O(h^4)\\&=-h^2 \frac{{\mathrm d}^2}{{\mathrm d}t^2}({\mathrm e}_\parallel ^\top \ddot{y}^0)+2h^2 (\dot{{\mathrm e}}_\parallel ^\top \dddot{y}^0)+h^2 \ddot{{\mathrm e}}_\parallel ^\top \ddot{y}^0+ O(h^4), \end{aligned} \end{aligned}$$
(4.19)

where we use \( {\mathrm e}_\parallel ^\top (E-\mu ^0\,\nabla |B|) =0\) and \({\mathrm e}_\parallel ^\top III =0\). Since the derivatives of \(r^0\) are \(O(\varepsilon )\) and using (4.7), we have

$$\begin{aligned} \begin{aligned} \dot{{\mathrm e}}_\parallel ^\top \dddot{y}^0&=-\frac{v^0_\parallel }{r^0} {\mathrm e}_r^\top \dddot{y}^0=-\frac{v^0_\parallel }{r^0}\left( \dddot{r^0} - 3 \dot{{\mathrm e}}_r^\top \ddot{y}^0 - 3 \ddot{{\mathrm e}}_r^\top \dot{y}^0- \dddot{{\mathrm e}}_r^\top y^0 \right) \\&=\frac{v^0_\parallel }{r^0}\left( \frac{3v^0_\parallel }{r^0} {\mathrm e}_\parallel ^\top \ddot{y}^0 +2v^0_\parallel \frac{{\mathrm d}}{{\mathrm d}t}\left( \frac{v^0_\parallel }{r^0}\right) - r^0\frac{{\mathrm d}}{{\mathrm d}t}\left( \left( \frac{v^0_\parallel }{r^0}\right) ^2\right) \right) +O(\varepsilon )\\&=3\left( \frac{v^0_\parallel }{r^0}\right) ^2 {\mathrm e}_\parallel ^\top \ddot{y}^0+O(\varepsilon ) \end{aligned} \end{aligned}$$

and

$$\begin{aligned} \begin{aligned} \ddot{{\mathrm e}}_\parallel ^\top \ddot{y}^0&=-{\mathrm e}_r^\top \ddot{y}^0\frac{{\mathrm d}}{{\mathrm d}t}\left( \frac{v^0_\parallel }{r^0}\right) -\left( \frac{v^0_\parallel }{r^0}\right) ^2 {\mathrm e}_\parallel ^\top \ddot{y}^0\\&=-\left( \frac{{\mathrm d}}{{\mathrm d}t}({\mathrm e}_r^\top \dot{y}^0)-\dot{{\mathrm e}}_r^\top \dot{y}^0\right) \frac{{\mathrm d}}{{\mathrm d}t}\left( \frac{v^0_\parallel }{r^0}\right) -\left( \frac{v^0_\parallel }{r^0}\right) ^2 {\mathrm e}_\parallel ^\top \ddot{y}^0\\&=\frac{\left( v^0_\parallel \right) ^2}{r^0}\frac{{\mathrm d}}{{\mathrm d}t}\left( \frac{v^0_\parallel }{r^0}\right) -\left( \frac{v^0_\parallel }{r^0}\right) ^2 {\mathrm e}_\parallel ^\top \ddot{y}^0+O(\varepsilon ), \end{aligned} \end{aligned}$$

so (4.19) can be written as

$$\begin{aligned} (1-5h^2(v^0_\parallel /r^0)^2) {\mathrm e}_\parallel ^\top \ddot{y}^0=h^2\left( \frac{v^0_\parallel }{r^0}\right) ^2\frac{{\mathrm d}v^0_\parallel }{{\mathrm d}t}+ O(h^4). \end{aligned}$$

This gives

$$\begin{aligned} {\mathrm e}_\parallel ^\top \ddot{y}^0=h^2\left( \frac{v^0_\parallel }{r^0}\right) ^2\frac{{\mathrm d}v^0_\parallel }{{\mathrm d}t}+ O(h^4). \end{aligned}$$

Equation (4.18) can now be written as

$$\begin{aligned} (1-h^2(v^0_\parallel /r^0)^2)\frac{{\mathrm d}v^0_\parallel }{{\mathrm d}t}=-\frac{v^0_\parallel }{r^0}\frac{{\mathrm d}r^0}{{\mathrm d}t}+ O(h^4), \end{aligned}$$

which gives

$$\begin{aligned} \frac{{\mathrm d}v^0_\parallel }{{\mathrm d}t}=-\frac{v^0_\parallel }{r^0}\frac{{\mathrm d}r^0}{{\mathrm d}t}+ O(h^4). \end{aligned}$$

The initial value of \(v^0_\parallel \) is

$$\begin{aligned} v^0_\parallel (0)=\langle \dot{y}^0(0), {\mathrm e}_\parallel (y^0(0))\rangle =\langle v^0,{\mathrm e}_\parallel (x^0) \rangle + O(h^2). \end{aligned}$$
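
The equation for \(v^0_\parallel \) has a simple consequence that can serve as a consistency check. As a sketch (the long-time conclusion ignores the \(O(h^N)\) jumps between patched intervals and assumes that \(r^0\) stays bounded), the product \(r^0v^0_\parallel \) is almost conserved:

$$\begin{aligned} \frac{{\mathrm d}}{{\mathrm d}t}\left( r^0v^0_\parallel \right) =v^0_\parallel \frac{{\mathrm d}r^0}{{\mathrm d}t}+r^0\frac{{\mathrm d}v^0_\parallel }{{\mathrm d}t}=O(h^4), \end{aligned}$$

so that, with \(h^2\sim \varepsilon \), its drift over times \(t\le c/\varepsilon \) is \(O(h^4/\varepsilon )=O(h^2)\).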

(c) Denote by \(y^{0,[n]}, r^{0,[n]}, z^{0,[n]}, v^{0,[n]}_\parallel \) the functions \(y^{0}, r^0, z^0, v^0_\parallel \) on the time interval \(nh\le t\le (n+1)h\). From (b), these functions satisfy the equations

$$\begin{aligned} \begin{aligned} \frac{{\mathrm d}r^{0,[n]}}{{\mathrm d}t}&=-\varepsilon \frac{E_z}{b}+\varepsilon \frac{\mu ^0}{b}\partial _z b+O(\varepsilon h^2),\\ \frac{{\mathrm d}z^{0,[n]}}{{\mathrm d}t}&=\varepsilon \frac{(v^{0,[n]}_\parallel )^2}{b r^{0,[n]}}+\varepsilon \frac{E_r}{b}-\varepsilon \frac{\mu ^0}{b}\partial _r b+O(\varepsilon h^2),\\ \frac{{\mathrm d}v^{0,[n]}_\parallel }{{\mathrm d}t}&=\varepsilon \frac{v^{0,[n]}_\parallel }{r^{0,[n]}}\left( \frac{E_z}{b}-\frac{\mu ^0}{b}\partial _z b\right) +O(\varepsilon h^2), \end{aligned} \end{aligned}$$

with \(E_r, E_z, b, \partial _r b, \partial _z b\) evaluated at \((r^{0,[n]},z^{0,[n]})\) and the following initial values

$$\begin{aligned} \begin{aligned} r^{0,[n]}(nh)&=r(x(nh))+O(h^2)\\ z^{0,[n]}(nh)&=z(x(nh))+O(h^2)\\ v^{0,[n]}_\parallel (nh)&=\langle \dot{x}(nh), {\mathrm e}_\parallel (x(nh)) \rangle + O(h^2). \end{aligned} \end{aligned}$$

By patching together the errors as was done for the continuous case, we prove that

$$\begin{aligned} r(x^n)-\tilde{r}(t_n){=}O(h^2), \ z(x^n)-\tilde{z}(t_n){=}O(h^2), \ v^n_\parallel -\tilde{v}(t_n){=}O(h^2), \quad 0\le t_n \le \frac{c}{\varepsilon }. \end{aligned}$$
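
To illustrate the slow system of part (c), the following Python sketch integrates its three equations with the \(O(\varepsilon h^2)\) remainders dropped, up to time \(1/\varepsilon \), and monitors the product \(r\,v_\parallel \), which these truncated equations conserve exactly. All model data are hypothetical choices made only for illustration and are not taken from the paper: the field strength \(b=1/r\), the electric field components, the values of \(\varepsilon \) and \(\mu ^0\), the initial values, and the use of the classical Runge–Kutta method with step `0.1`.

```python
EPS = 0.01   # hypothetical value of the small parameter epsilon
MU0 = 0.1    # hypothetical value of mu^0


# Hypothetical axisymmetric model data, chosen only for illustration.
def b(r, z):
    return 1.0 / r


def db_dr(r, z):
    return -1.0 / r**2


def db_dz(r, z):
    return 0.0


def E_r(r, z):
    return -(r - 1.0)


def E_z(r, z):
    return -z


def rhs(state):
    """Slow system for (r, z, v_par) with the O(eps*h^2) remainders dropped."""
    r, z, v = state
    bb = b(r, z)
    dr = -EPS * E_z(r, z) / bb + EPS * MU0 * db_dz(r, z) / bb
    dz = (EPS * v * v / (bb * r) + EPS * E_r(r, z) / bb
          - EPS * MU0 * db_dr(r, z) / bb)
    dv = EPS * (v / r) * (E_z(r, z) / bb - MU0 * db_dz(r, z) / bb)
    return (dr, dz, dv)


def axpy(s, k, a):
    """Return s + a*k componentwise."""
    return tuple(si + a * ki for si, ki in zip(s, k))


def rk4_step(state, dt):
    """One step of the classical fourth-order Runge-Kutta method."""
    k1 = rhs(state)
    k2 = rhs(axpy(state, k1, dt / 2))
    k3 = rhs(axpy(state, k2, dt / 2))
    k4 = rhs(axpy(state, k3, dt))
    return tuple(s + dt / 6 * (a + 2 * p + 2 * q + d)
                 for s, a, p, q, d in zip(state, k1, k2, k3, k4))


def integrate(state, t_end, dt):
    """Advance the state from t = 0 to t = t_end with fixed step dt."""
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(state, dt)
    return state


start = (1.0, 0.1, 0.5)                     # (r, z, v_par) at t = 0
final = integrate(start, 1.0 / EPS, 0.1)    # integrate up to t = 1/eps
drift = abs(final[0] * final[2] - start[0] * start[2])
print(final, drift)
```

Because the right-hand sides are \(O(\varepsilon )\), a step size of order one in \(t\) suffices for this reduced system; the restriction \(h^2\sim \varepsilon \) concerns the modified Boris algorithm applied to the full equations of motion, not this sketch.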