1 Introduction

Integrating the equations of motion of charged particles is a fundamental computational task in particle methods of plasma physics, e.g. [2]. The standard numerical integrator for these computations is the Boris algorithm [3], which has the charm of simplicity and remarkable conservation properties [8, 17]. In the practically important situation of a strong magnetic field and moderate velocities, which will be considered in this paper, the particle trajectories show fast gyrorotations of small radius around a guiding centre [15].

To approximate the guiding centre motion, one approach—not considered here—is to integrate numerically the known but structurally complicated differential equations for the approximate guiding centre by a suitable numerical method [6].

In a different and arguably more efficient approach, a Boris-type integrator with appropriate modifications is applied to the original equations of motion of the charged particle with large step sizes that do not resolve the high-frequency oscillations. As the standard Boris algorithm used with large step sizes is known to produce numerical solutions with unphysically large gyroradius [16, 18], modifications to it are necessary. In the case of a near-uniform strong magnetic field, it suffices to filter out the normal component of the initial velocity [10], but the mere modification of initial values is not sufficient in a strongly non-uniform magnetic field. Xiao and Qin [22] recently proposed to additionally modify the electric field in a non-obvious way when using the Boris algorithm with large step sizes and showed striking numerical results, but no error analysis was given. It was then found that a very similar numerical method, with the same extra force term, was already proposed by Vu and Brackbill [19] (Method III) in 1995, motivated by Parker and Birdsall [16] on the large-stepsize behaviour of the Boris method; see in particular formula (10) in [16], based on the equations of guiding-centre motion as given by Northrop [15], which are at the origin of the extra force term added to the schemes in [19] and [22]. A very interesting approach to understanding such methods in terms of slow manifolds was recently given by Burby and Klotz [5] and Burby and Hirvijoki [4], but these papers deal with the exact flow on and near the slow manifold and do not clarify the behaviour of the numerical method for large step sizes.

The objective of the present paper is to give a rigorous analysis of the modified Boris algorithm of Xiao and Qin [22] for approximating the guiding centre motion of a charged particle in a strong non-uniform magnetic field taking large time steps that cover many periods of gyrorotation.

In Sect. 2 we formulate the general setting of charged-particle motion in a strongly non-uniform strong magnetic field, describe the Boris algorithm and its modification, and state our main result, Theorem 2.1, which yields second-order error bounds for the position and velocity of the guiding centre when approximated with the modified Boris method with large step sizes whose square is of the order of the inverse of the strength of the magnetic field (\(h^2\sim \varepsilon \)). In Sect. 3 we present results of numerical experiments that illustrate and complement the theory. In Sect. 4 we give modulated Fourier expansions of both the exact and the numerical solution. Their comparison yields the proof of Theorem 2.1.

2 Large-stepsize modified Boris method and its error bound

2.1 Setting

The motion of a charged particle (of unit mass and charge) in a magnetic and electric field is governed by the differential equation

$$\begin{aligned} \ddot{x} = \dot{x} \times B(x) + E(x), \end{aligned}$$
(2.1)

where \(x(t)\in {\mathbb {R}}^3\) is the position at time t, \(v(t)=\dot{x}(t)\) is the velocity, B is the magnetic field and E is the electric field. Here, \(B(x) = \nabla \times A(x)\) with a vector potential \(A(x)\in {\mathbb {R}}^3\) and \(E(x) = - \nabla \phi (x)\) with a scalar potential \(\phi (x)\in {\mathbb {R}}\), which we assume to be bounded from below. We are interested in the case of a strong magnetic field

$$\begin{aligned} B(x) = B_\varepsilon (x)= \frac{1}{\varepsilon }\, B_1(x), \quad \ 0<\varepsilon \ll 1, \end{aligned}$$
(2.2)

where \(B_1\) and its first two derivatives are bounded independently of the small parameter \(\varepsilon \), and \(|B_1(x)|\ge 1\) for all x. The derivatives of E(x) are also assumed to be bounded independently of \(\varepsilon \). (Note that in contrast to, e.g., [10, 11], it is not assumed here that derivatives of the strong magnetic field B are bounded independently of \(\varepsilon \). Here, the derivatives of B are of size \(O(\varepsilon ^{-1})\) as is B itself.)

The motion (2.1) and its approximation are to be studied over time intervals \(t\in [0,T]\) with fixed T independent of \(\varepsilon \), for initial values \((x(0),\dot{x}(0))\) that are bounded independently of \(\varepsilon \): for some constants \(M_0,M_1\),

$$\begin{aligned} |x(0)| \le M_0, \quad \ |\dot{x}(0)| \le M_1. \end{aligned}$$
(2.3)

Under these conditions, it is known that the magnetic moment

$$\begin{aligned} \mu (x,v) = \frac{1}{2}\frac{|v\times B(x)|^2}{|B(x)|^3}, \end{aligned}$$

which is of size \(O(\varepsilon )\) under our assumptions, is an adiabatic invariant [14, 15]: \(\mu (x(t),\dot{x}(t))\) is conserved up to \(O(\varepsilon ^2)\) over very long times \(t\le \varepsilon ^{-N}\) with arbitrary \(N>1\) [1, 9]. Here we consider (2.1) only over fixed times T that are independent of \(\varepsilon \).
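As a small illustration of the formula for \(\mu \) and of its \(O(\varepsilon )\) size, the following sketch evaluates the magnetic moment for a field of the form (2.2). The helper names and the sample values of \(B_1\) and v are assumptions made for this example only.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def norm(a):
    """Euclidean norm of a 3-vector."""
    return math.sqrt(a[0]**2 + a[1]**2 + a[2]**2)

def magnetic_moment(v, B):
    """mu = |v x B|^2 / (2 |B|^3) for a velocity v and a field value B = B(x)."""
    return 0.5 * norm(cross(v, B))**2 / norm(B)**3

# Hypothetical sample data: B = B1/eps with |B1| = 1 and a velocity of size O(1).
B1 = (0.0, 0.0, 1.0)
v = (1.0, 0.0, 0.5)
for eps in (1e-2, 1e-3, 1e-4):
    B = tuple(b / eps for b in B1)
    # The ratio mu/eps stays fixed as eps decreases, i.e. mu = O(eps).
    print(eps, magnetic_moment(v, B) / eps)
```

For this sample data the perpendicular speed is 1, so that \(\mu =\varepsilon /2\).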

2.2 Modified Boris method of Xiao and Qin [22]

The integrator for charged-particle dynamics (2.1) proposed in [22] is a remarkably simple but nontrivial modification of the Boris algorithm, with the objective to approximate the guiding centre of the particle motion with large step sizes \(h\gg \varepsilon \) without resolving the gyrorotations. In its two-step formulation the method computes the new position \(x^{n+1}\) as an approximation at time \(t_{n+1}=(n+1)h\) via

$$\begin{aligned} \frac{x^{n+1}-2x^n+x^{n-1}}{h^2}=v^n \times B(x^n) + E(x^n)- \mu ^0\, \nabla |B|(x^n) \end{aligned}$$
(2.4)

with the initial magnetic moment \(\mu ^0=\mu (x(0),\dot{x}(0))\) and the symmetric finite difference velocity approximation

$$\begin{aligned} v^n = \frac{x^{n+1}-x^{n-1}}{2h}. \end{aligned}$$
(2.5)

This differs from the original Boris method only in the addition of the extra term \(- \mu ^0\, \nabla |B|(x^n)\), which also appears in the differential equations for the guiding centre; see [15] and, e.g., Theorem 4.1 below. Note that \(\mu ^0\, \nabla |B|(x)=O(1)\) under our assumptions, whereas it is \(O(\varepsilon )\) under the stronger assumption of maximal ordering in [10, 11] and could therefore be ignored in the numerical methods of those papers.

The modified Boris method starts from modified initial values

$$\begin{aligned} x^0 = x(0), \quad v^0 = P_\parallel (x^0) \,\dot{x}(0), \end{aligned}$$
(2.6)

where \(P_\parallel (x^0)\) is the orthogonal projection onto the span of \(B(x^0)\). With \(P_\perp (x^0)=I-P_\parallel (x^0)\), we note that

$$\begin{aligned} v^0_\perp = P_\perp (x^0) v^0 = 0, \end{aligned}$$

i.e., the perpendicular component of the initial velocity has been filtered out.
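The filtering of the initial velocity in (2.6) amounts to a single projection. The sketch below shows one way to compute it; the function name and the sample vectors are assumptions for illustration.

```python
def dot(a, b):
    """Euclidean inner product of two 3-vectors given as tuples."""
    return a[0]*b[0] + a[1]*b[1] + a[2]*b[2]

def project_parallel(v, B):
    """Return P_par v, the orthogonal projection of v onto span(B); requires B != 0."""
    c = dot(v, B) / dot(B, B)
    return (c*B[0], c*B[1], c*B[2])

# Hypothetical sample data: an initial velocity and the field value B(x^0).
xdot0 = (0.09, 0.55, 0.3)
B0 = (0.9, 0.1, 1.0)
v0 = project_parallel(xdot0, B0)                       # modified starting velocity
filtered = tuple(xdot0[i] - v0[i] for i in range(3))   # the part that is removed
```

Only the direction of \(B(x^0)\) enters, so the factor \(1/\varepsilon \) in (2.2) plays no role in this step.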

Note that the modified Boris method (2.4) is not consistent with (2.1) as \(h\rightarrow 0\) and \(\varepsilon \) is fixed. It is identical to the standard Boris integrator for the modified force field

$$\begin{aligned} E_\textrm{mod}(x) = E(x)- \mu ^0 \,\nabla |B|(x) =-\nabla (\phi + \mu ^0 |B|)(x). \end{aligned}$$

The actual implementation uses the common one-step formulation of the Boris algorithm [3].
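The one-step formulation is not written out in the text. A minimal sketch of how it could look is given below, assuming the usual Boris splitting (half kick, rotation, half kick) applied with \(E_\textrm{mod}\) in place of E. The function names, the argument list and the user-supplied callable for \(\nabla |B|\) are assumptions; the starting half-step velocity is chosen so that (2.4)–(2.5) hold at \(n=0\).

```python
def dot(a, b):
    return a[0]*b[0] + a[1]*b[1] + a[2]*b[2]

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def add(a, b):
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])

def scale(c, a):
    return (c*a[0], c*a[1], c*a[2])

def boris_rotate(v_minus, B, h):
    """Solve (v_plus - v_minus)/h = ((v_plus + v_minus)/2) x B explicitly for v_plus."""
    t = scale(0.5*h, B)
    s = scale(2.0 / (1.0 + dot(t, t)), t)
    v_prime = add(v_minus, cross(v_minus, t))
    return add(v_minus, cross(v_prime, s))

def modified_boris(x0, v0, h, n_steps, B, E, grad_abs_B, mu0):
    """Return the positions x^0, ..., x^{n_steps} (hypothetical interface).

    B, E, grad_abs_B are callables x -> 3-tuple; mu0 is the initial magnetic moment.
    Passing mu0 = 0 gives the standard Boris algorithm.
    """
    def E_mod(x):
        e, g = E(x), grad_abs_B(x)
        return (e[0] - mu0*g[0], e[1] - mu0*g[1], e[2] - mu0*g[2])

    # Starting step: v^{1/2} = v^0 + (h/2) (v^0 x B(x^0) + E_mod(x^0)).
    f0 = add(cross(v0, B(x0)), E_mod(x0))
    v_half = add(v0, scale(0.5*h, f0))
    x = add(x0, scale(h, v_half))
    xs = [x0, x]
    for _ in range(n_steps - 1):
        kick = scale(0.5*h, E_mod(x))
        v_plus = boris_rotate(add(v_half, kick), B(x), h)  # half kick, then rotation
        v_half = add(v_plus, kick)                         # second half kick
        x = add(x, scale(h, v_half))
        xs.append(x)
    return xs
```

In this sketch the initial velocity `v0` is expected to be the projected value from (2.6), and `mu0` is computed once from the unprojected data \((x(0),\dot{x}(0))\).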

2.3 Large-stepsize error bound

For the following theorem, which is the main result of this paper, we need a nondegeneracy condition:

$$\begin{aligned}&\text {For }(x,v)\text { along the numerical trajectory, the linear maps} \nonumber \\&L_{x,v}:P_\perp (x){\mathbb {R}}^3 \rightarrow P_\perp (x){\mathbb {R}}^3, \quad z \mapsto z + \tfrac{1}{4} h^2\, P_\perp (x)\bigl (v \times B'(x)z\bigr )\\&\text {have an inverse that is bounded independently of }(x,v)\text { and of } \nonumber \\&h\text { and }\varepsilon \text { with }h^2/\varepsilon \le C_*\text {.} \nonumber \end{aligned}$$
(2.7)

This determines an upper bound \(C_*\) on the ratio \(h^2/\varepsilon \). We have the following large-stepsize error bound for the modified Boris method.

Theorem 2.1

Consider applying the modified Boris method to (2.1)–(2.3) with modified initial values (2.6) over a time interval \(0\le t\le T\) (with T independent of \(\varepsilon \)) using a step size h with \(h^2\sim \varepsilon \), i.e.,

$$\begin{aligned} c_* \varepsilon \le h^2 \le C_* \varepsilon \end{aligned}$$

for some positive constants \(c_*\) and \(C_*\). Under the nondegeneracy condition (2.7), the errors in position x and parallel velocity \(v_\parallel =P_\parallel (x) v\) (where \(P_\parallel (x)\) denotes the orthogonal projection onto the span of B(x)) at time \(t_n=nh\le T\) are bounded by

$$\begin{aligned} |x^n-x(t_n)|\le Ch^2, \quad |v_\parallel ^n-v_\parallel (t_n)|\le Ch^2,\quad \ |v^n_\perp | \le Ch^2, \end{aligned}$$

where C is independent of \(\varepsilon \), h and n with \(nh\le T\) (but depends on T, on bounds of derivatives of \(B_1\) and E, and on \(c_*\) and \(C_*\)).

Since x(t) and \(v_\parallel (t)\) are \(O(\varepsilon )\) close to the guiding centre at time t and its velocity, respectively, and since \(\varepsilon \sim h^2\) by assumption, Theorem 2.1 yields that the modified Boris method approximates the guiding centre motion with \(O(h^2)\) accuracy for step sizes h that are much larger than the gyroperiod \(2\pi /|B(x)| \sim \varepsilon \).
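To make the scales concrete, consider the illustrative values \(\varepsilon =10^{-6}\) and \(h=10^{-3}\) with \(|B_1|=1\) (these numbers are chosen for this example only and are not taken from the experiments). Then

$$\begin{aligned} h^2 = 10^{-6} = \varepsilon , \qquad \frac{h}{2\pi \varepsilon } = \frac{10^{-3}}{2\pi \cdot 10^{-6}} \approx 159, \qquad Ch^2 = C\cdot 10^{-6}, \end{aligned}$$

so a single step spans roughly 159 gyroperiods, while the error bound of Theorem 2.1 is of the same order as the \(O(\varepsilon )\) distance between the particle and its guiding centre.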

The proof of this theorem will be given in Sect. 4.

Remark 2.1

It is of interest to understand how the error bound changes when \(\varepsilon \ll h^2\) or \(h \gg \varepsilon \gg h^2\). An \(O(h^2)\) error bound still holds true for the less restrictive stepsize condition

$$\begin{aligned} c_* \varepsilon \le h^2 \le C_* \varepsilon ^\alpha \qquad \text {for } 0\le \alpha <1 \end{aligned}$$

for less strongly varying magnetic fields

$$\begin{aligned} B(x)=\frac{1}{\varepsilon }\, B_1(\varepsilon ^{1-\alpha } x). \end{aligned}$$

This can still be obtained with essentially the same proof, but we will not work out the lengthy yet conceptually straightforward details.

In the situation of \(h^2 \le \varepsilon \ll h\) given by

$$\begin{aligned} c_* \varepsilon ^\beta \le h^2 \le C_* \varepsilon \qquad \text {for } 1<\beta <2, \end{aligned}$$

an \(O(\varepsilon )\) error bound can be shown without an extra assumption on derivatives of B, again with essentially the same proof.

3 Numerical experiments

We show numerical results of the modified Boris method for two examples.

3.1 Tokamak example from [22]

We consider the motion of a charged particle in a tokamak geometry without electric field [22]. In Cartesian coordinates, the magnetic field is given as

$$\begin{aligned} B(x)=\Bigl (-\frac{2x_2+x_1 x_3}{2R^2},\ \frac{2x_1-x_2x_3}{2R^2}, \ \frac{R-1}{2R} \Bigr )^\top \quad \text { with } \ R=\sqrt{x_1^2+x_2^2}. \end{aligned}$$

Starting with the initial position \(x(0)=(1.05,0,0)^\top \) and the initial velocity \({\dot{x}}(0)=(2.1\times 10^{-3}, 4.3\times 10^{-4},0)^\top \), the orbit projected onto the \((R,x_3)\) plane is a banana orbit. The final time considered is \(T=3.75\times 10^4\). We note that upon rescaling time as \(t \rightarrow \varepsilon t\) with \(\varepsilon =10^{-3}\), the problem has the scaling of Sect. 2.1.
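For readers who wish to reproduce the setting, the field above can be coded directly. The following sketch also compares the two step sizes used below with the local gyroperiod \(2\pi /|B(x(0))|\) in the unscaled time variable; the function and variable names are assumptions.

```python
import math

def tokamak_B(x):
    """Tokamak field of Sect. 3.1 in Cartesian coordinates; requires R > 0."""
    x1, x2, x3 = x
    R = math.sqrt(x1*x1 + x2*x2)   # distance from the x3-axis
    return (-(2.0*x2 + x1*x3) / (2.0*R*R),
            (2.0*x1 - x2*x3) / (2.0*R*R),
            (R - 1.0) / (2.0*R))

x0 = (1.05, 0.0, 0.0)
B0 = tokamak_B(x0)
abs_B0 = math.sqrt(sum(b*b for b in B0))
gyroperiod = 2.0*math.pi / abs_B0   # local gyroperiod at the initial position

for h in (0.2, 20.0):
    print(f"h = {h}: h / gyroperiod = {h / gyroperiod:.3g}")
```

Near \(x(0)\) the gyroperiod is roughly 6.6, so \(h=0.2\) resolves the gyration whereas \(h=20\) spans about three gyroperiods.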

Figure 1 shows the trajectories computed by the standard Boris algorithm, the standard Boris algorithm with projected initial velocity, and the modified Boris algorithm, each with the two step sizes \(h=0.2\) and \(h=20\). For \(h=0.2\) the standard Boris algorithm gives the correct result, but its gyroradius grows with the step size, and for \(h=20\) the numerical result is completely wrong. If we use the standard Boris algorithm with \(v^0_\perp =0\), the gyroradius is always small, but the trajectory is not correct. A similar behaviour (not shown here) is also observed for the filtered methods of [10, 11], which were designed for a scaling (maximal ordering) where \(\mu \nabla |B|\) is \(O(\varepsilon )\) small. After adding the \(\mu ^0 \nabla |B|\) term, which is O(1) for the problem considered here, the modified Boris method shows correct results even for the large step size \(h=20\).

Fig. 1

Banana orbits in the \((R,x_3)\) plane computed by different methods with \(T=3.75\times 10^4\), \(h=0.2\) and \(h=20\). In the left picture, the scattered dots correspond to \(h=20\), whereas the thick-peeled banana corresponds to \(h=0.2\). In the centre and right pictures, the results for \(h=0.2\) and \(h=20\) are indistinguishable

3.2 Order of accuracy

We test the numerical accuracy of the modified Boris algorithm with large time steps by applying the scheme to the example in [9]. We have the electric field

$$\begin{aligned} E(x)=-\nabla \phi (x) \quad \text {with the potential }\ \phi (x)=x_1^3-x_2^3+\frac{1}{5}x_1^4+x_2^4+x_3^4, \end{aligned}$$

the magnetic field

$$\begin{aligned} B(x)=\frac{1}{2\varepsilon } \begin{pmatrix} x_2-x_3\\ x_1+x_3\\ x_2-x_1 \end{pmatrix}, \end{aligned}$$

and we take initial values

$$\begin{aligned} x(0)=(0.0, 1.0, 0.1)^\top ,\quad {\dot{x}}(0)=(0.09,0.55,0.3)^\top . \end{aligned}$$
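The modified method needs \(\nabla |B|\) in addition to E and B. For this example all three can be written in closed form; the sketch below differentiates \(\phi \) and \(|B|\) by hand. The function names are assumptions, and `grad_abs_B` is undefined at points where \(B=0\).

```python
import math

def phi(x):
    """Scalar potential of Sect. 3.2."""
    x1, x2, x3 = x
    return x1**3 - x2**3 + x1**4 / 5.0 + x2**4 + x3**4

def E_field(x):
    """E = -grad(phi), differentiated by hand."""
    x1, x2, x3 = x
    return (-(3.0*x1**2 + 0.8*x1**3),
            -(-3.0*x2**2 + 4.0*x2**3),
            -4.0*x3**3)

def B_field(x, eps):
    """Magnetic field of Sect. 3.2."""
    x1, x2, x3 = x
    c = 0.5 / eps
    return (c*(x2 - x3), c*(x1 + x3), c*(x2 - x1))

def grad_abs_B(x, eps):
    """Gradient of |B|, differentiated by hand (requires B(x) != 0)."""
    x1, x2, x3 = x
    a, b, c = x2 - x3, x1 + x3, x2 - x1     # components of 2*eps*B
    s = math.sqrt(a*a + b*b + c*c)          # equals 2*eps*|B|
    k = 0.5 / (eps * s)
    return (k*(b - c), k*(a + c), k*(b - a))

x0 = (0.0, 1.0, 0.1)      # initial position
xdot0 = (0.09, 0.55, 0.3) # initial velocity
```

A finite-difference comparison of `grad_abs_B` with differences of \(|B|\) is a cheap sanity check before running the integrator.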

Figure 2 shows the absolute errors in x, \(v_\parallel \) and \(v_\perp \) at time \(t=1\) versus \(\varepsilon \) for various h. It is observed that the errors in x and \(v_\parallel \) tend to a constant error level proportional to \(h^2\), which is in accordance with our theoretical result in Theorem 2.1.

Fig. 2

Global error versus \(\varepsilon \) (\(\varepsilon =1/2^j\), \(j=13,\dots ,22\)) with different h for the modified Boris algorithm

4 Modulated Fourier expansions and proof of Theorem 2.1

Theorem 2.1 will be proved by comparing the modulated Fourier expansions of the exact and numerical solutions. Modulated Fourier expansions have previously been used in the analysis of numerical methods for oscillatory differential equations, see [7, 12] and numerous further papers, and lately also for charged-particle dynamics in a strong magnetic field [9, 10, 11, 20, 21]. Incidentally, modulated Fourier expansions (though not under this name) were used for studying the gyration of charged particles by Kruskal [14] as early as 1958.

The analysis given here builds on that of [9] for the exact solution in the situation of a strong non-uniform magnetic field and on that of [10] for the Boris method with large step sizes in the situation of a mildly non-uniform strong magnetic field.

In this section we give the modulated Fourier expansions of the exact solution (Theorem 4.1) and of the numerical solution (Theorem 4.2), including explicit expressions for the differential equations of the dominant modulation functions. The analysis identifies the additional term in the modified Boris algorithm as the additional term appearing for the motion of the guiding centre in the modulated Fourier expansion that is missed by the standard Boris algorithm. This explains the excellent behaviour of the algorithm. The proof of Theorem 2.1 is then obtained by a comparison of Theorems 4.1 and 4.2.

4.1 Modulated Fourier expansion of the exact solution

We write the solution of (2.1) as

$$\begin{aligned} x(t)\approx \sum _{k\in {\mathbb {Z}}}z^k(t)\,{\textrm{e}}^{{{\textrm{i}}}k\varphi (t)/\varepsilon }, \qquad 0\le t \le T, \end{aligned}$$

with piecewise smooth modulation functions \(z^k\) and phase function \(\varphi \) for which all time derivatives are bounded independently of \(\varepsilon \), except at the discontinuities of \(z^k\) and \(\dot{\varphi }\) at integral multiples \(t=n\varepsilon \) with jumps of size \(O(\varepsilon ^N)\), for an arbitrarily chosen integer \(N>1\). The phase satisfies \({\dot{\varphi }}(t)/\varepsilon =|B(z^0(t))|\) and \(z^0(t)\) is the guiding centre at time t (defined up to \(O(\varepsilon ^N)\)).

Following [9], we diagonalize the linear map \(v \mapsto v\times B(x)\), which has eigenvalues \(\lambda _1={\textrm{i}}|B(x)|\), \(\lambda _0=0\) and \(\lambda _{-1}=-{\textrm{i}}|B(x)|\). The corresponding normalized eigenvectors are denoted by \(\nu _1(x)\), \(\nu _0(x)\), and \(\nu _{-1}(x)\). We let \(P_j(x)=\nu _j(x)\nu _j(x)^*\) be the orthogonal projections onto the eigenspaces, where we note that \(P_\parallel (x)=P_0(x)\) and \(P_\perp (x)=I-P_\parallel (x)= P_1(x)+P_{-1}(x)\). We write the coefficient functions in the basis \(\nu _j(z^0(t))\),

$$\begin{aligned} z^k=\sum _{j=-1}^1 z^k_j, \qquad z^k_j(t)=P_j(z^0(t))z^k(t), \qquad k\ne 0, \end{aligned}$$

whereas for \(k=0\) we decompose

$$\begin{aligned} z^0 = c^0 + \sum _{j=-1}^1 z^0_j, \qquad z^0_j(t) = P_j(z^0(t))(z^0(t) -c^0(t)), \end{aligned}$$

where \(c^0(t)\) is a piecewise constant function with a finite number of jumps independent of \(\varepsilon \), chosen arbitrarily such that \(|x(t) - c^0(t)|\) remains distinctly smaller than the inverse of a bound of the derivative of \(P_0\) in a neighbourhood of the solution.

The following result is based on Theorem 4.1 of [9], where the existence of the modulated Fourier expansion of solutions of (2.1)–(2.2) was established together with bounds of the modulation functions and of the remainder term. However, the differential equations for the dominant modulation functions \(z^0_0\), \(z^0_{\pm 1}\) and \(z^{\pm 1}_{\pm 1}\) and their initial values were not stated explicitly. As these will be needed in the following and are also of independent interest, they are given here.

Theorem 4.1

Let x(t) be a solution of (2.1)–(2.2) with an initial velocity bounded independently of \(\varepsilon \) \((|\dot{x}(0)|\le M)\). For an arbitrary truncation index \(N\ge 1\) we then have an expansion

$$\begin{aligned} x(t)=\sum _{|k|\le N-1}z^k(t)\,\mathrm e^{\mathrm i k \varphi (t)/\varepsilon }+R_N(t), \qquad 0\le t \le T, \end{aligned}$$

where the phase function satisfies \({\dot{\varphi }}(t)=|B_1(z^0(t))|\) (recall that \(B(x)=B_1(x)/\varepsilon \) with \(B_1\) independent of \(\varepsilon \)), and we fix \(\varphi (0)=0\).

  1. (a)

    The coefficient functions \(z^k(t)\) are piecewise continuous with jumps of size \(O(\varepsilon ^{N})\) at integral multiples of \(\varepsilon \) and are smooth elsewhere. Together with their derivatives (up to order N) they are bounded as

    $$\begin{aligned} z^k=O(\varepsilon ^{|k|}) \quad \qquad \hbox { for all }\ |k|\le N-1 \end{aligned}$$

    and further satisfy \(z^k_j =O(\varepsilon ^2)\) for \( |k|=1, j\ne k\). Moreover, \(\dot{z}^0 \times B_1(z^0)=O(\varepsilon )\). The functions \(z^k\) are unique up to \(O(\varepsilon ^{N})\).

  2. (b)

    The remainder term and its derivative are bounded by

    $$\begin{aligned} R_N(t)=O(\varepsilon ^{N}),\quad {\dot{R}}_N(t)=O(\varepsilon ^{N-1}), \qquad 0\le t \le T. \end{aligned}$$
  3. (c)

    On each time interval \(n\varepsilon \le t < (n+1)\varepsilon \le T\) (for integers \(n\ge 0\)), the functions \(z_0^{0}\), \(z_{\pm 1}^{0}\), \(z_1^{1}\), \(z_{-1}^{-1}\) satisfy the following differential equations. Here, all functions B, E, \(P_j\) are evaluated at the guiding centre \(z^0(t)\), and we write \(\dot{P}_j = ({\textrm{d}}/{\textrm{d}}t) P_j(z^0(t))=P_j'(z^0(t)) \dot{z}^0(t)\) and analogously \({\ddot{P}}_j\). Moreover, \(\mu ^0=\mu (x(0),\dot{x}(0))\) is the magnetic moment. Omitting the ubiquitous argument t, we have

    $$\begin{aligned} \ddot{z}^{0}_0&=2{\dot{P}}_0 {\dot{z}}^0+{\ddot{P}}_0 (z^0-c^0)+ P_0\left( E -\mu ^0\,\nabla |B |\right) +O(\varepsilon ),\\ {\dot{z}}^0_{\pm 1}&={\dot{P}}_{\pm 1} (z^0-c^0)\pm \frac{{\textrm{i}}}{|B |}\,{\dot{P}}_{\pm 1} {\dot{z}}^0\ \pm \frac{{\textrm{i}}}{|B |}\,P_{\pm 1} \left( E -\mu ^0\,\nabla |B |\right) +O(\varepsilon ^2),\\ {\dot{z}}^{\pm 1}_{\pm 1}&={\dot{P}}_{\pm 1} z^{\pm 1}_{\pm 1}-\frac{({\textrm{d}}/{\textrm{d}}t)|B |}{|B |}z^{\pm 1}_{\pm 1}\ \mp \frac{{\textrm{i}}}{|B |}\,P_{\pm 1}\left( {\dot{z}}^0\times B'(z^0) z^{\pm 1}_{\pm 1}\right) +O(\varepsilon ^2). \end{aligned}$$

    All other modulation functions \(z_j^{k}\) are given by algebraic expressions depending on \(z^{0}\), \({\dot{z}}_0^{0}\), \(z_1^{1}\), \(z_{-1}^{-1}\).

  4. (d)

    Initial values for the differential equations of item (c) are given by

    $$\begin{aligned} \begin{aligned} z^{0}(0)=&\ x(0)+ \frac{{\dot{x}}(0)\times B(x(0))}{|B(x(0))|^2}+O(\varepsilon ^2),\\ {\dot{z}}_0^{0}(0)=&\ P_0{\dot{x}}(0) +{\dot{P}}_0(z^{0}(0)-c^0(0)) +O(\varepsilon ),\\ z_{\pm 1}^{\pm 1}(0)=&\ \frac{\mp {\mathrm i}}{|B|}\, P_{\pm 1}{\dot{x}}(0)+O(\varepsilon ^2), \end{aligned} \end{aligned}$$

    where \(B,B'\) and \(P_j\) are evaluated at the initial guiding centre \(z^0(0)\) (up to \(O(\varepsilon ^2)\)).

The constants symbolized by the O-notation are independent of \(\varepsilon \) and t with \(0\le t\le T\), but depend on N, on the velocity bound M, on bounds of derivatives of B and E in a neighbourhood of the trajectory \(\{ x(t):\, 0\le t \le T\}\), and on the final time T.

Remark 4.1

Since the energy \(H(x,v)=\tfrac{1}{2} |v|^2 + \phi (x)\) is conserved, it is bounded by \(\tfrac{1}{2} {\widetilde{M}}^2:= \tfrac{1}{2} M^2 + \phi (x(0))\) and we have \(\tfrac{1}{2} |\dot{x}(t)|^2 \le \tfrac{1}{2}{\widetilde{M}}^2 - \phi (x(t))\). As we assumed that the scalar potential \(\phi \) is bounded from below, this gives an a priori bound on the velocity. Hence, the solution stays in a ball with centre x(0) and radius depending only on x(0) and \(\dot{x}(0)\) in a fixed time interval \(0\le t \le T\).

Remark 4.2

The differential equations for \(z^0_0\) and \(z^0_{\pm 1}\) are implicit, because the term \({\ddot{P}}_0(z^0)(z^0-c^0)\) contains \(\ddot{z}^0_0\). By our choice of \(c^0\), which ensures that \(|z^0-c^0|\) is sufficiently small, the equation can be solved for \(\ddot{z}^0_0\) to yield an explicit second-order differential equation. Similarly, the first-order differential equations for \(z^0_{\pm 1}\), which contain the time derivative in the term \({\dot{P}}_{\pm 1}(z^0)(z^0-c^0)\), can be solved for \(\dot{z}^0_{\pm 1}\) to yield explicit first-order differential equations. As was noted in [9], the modulation functions \(z^k\) are independent of the choice of \(c^0\).

Remark 4.3

From the second equation of (c), it is straightforward to get (with \(P_\perp =P_1+P_{-1}\) and \(P_\parallel =P_0\))

$$\begin{aligned} P_{\perp }{\dot{z}}^0=\frac{1}{|B|}P_{\parallel }{\dot{z}}^0\times \frac{{\text {d}}(B/|B|)}{{\text {d}}t}+\frac{1}{|B|^2}\left( E-\mu ^0\,\nabla |B|\right) \times B+O(\varepsilon ^2), \end{aligned}$$

with \(B, \nabla |B|, P_\parallel ,P_\perp \) and E evaluated at the guiding centre \(z^0\). The right-hand side exhibits the slow drifts of the guiding centre motion, such as the curvature, \(E\times B\) and grad-\(B\) drifts, which are usually derived by averaging techniques in the physics literature.

Proof

It is sufficient to prove the theorem on time intervals of length \(\varepsilon \). At the end of an interval \([(n-1)\varepsilon ,n\varepsilon ]\), the construction of the modulated Fourier expansion is restarted from the exact solution values \(x(n\varepsilon ),\dot{x}(n\varepsilon )\), which in view of the uniqueness of the modulation functions up to \(O(\varepsilon ^N)\) stated in (a) and the bound of the remainder term stated in (b) leads to jump discontinuities of size \(O(\varepsilon ^N)\) in the modulation functions and the derivative of the phase function.

Statements (a) and (b) are given by Theorem 4.1 in [9]. Here, we just give the proof of (c) and (d).

(c): Inserting the modulated Fourier expansion into the differential equation (2.1) and comparing the coefficients of \({\textrm{e}}^{{\textrm{i}}k\varphi (t)/\varepsilon }\) yields

$$\begin{aligned} \ddot{z}^k+2{\textrm{i}}k\frac{{\dot{\varphi }}}{\varepsilon }{\dot{z}}^k+\left( {\textrm{i}}k\frac{\ddot{\varphi }}{\varepsilon }-k^2\frac{{\dot{\varphi }}^2}{\varepsilon ^2}\right) z^k=F^k, \end{aligned}$$

where the right-hand side \(F^k\) is obtained from a Taylor expansion of B and E at \(z^0\); see [9] for the general formula. For \(k=0\), we obtain the motion of the guiding centre \(z^0(t)\):

$$\begin{aligned} \ddot{z}^0={\dot{z}}^0\times B(z^0)+E(z^0)+\underbrace{2\,\textrm{Re}\Bigl (\frac{\textrm{i}{\dot{\varphi }}}{\varepsilon }\,z^{1}\times B'(z^{0})z^{-1}\Bigr )}_{=: I} +O(\varepsilon ). \end{aligned}$$
(4.1)

For \(k=\pm 1\), we have

$$\begin{aligned} \pm 2{\textrm{i}}\frac{{\dot{\varphi }}}{\varepsilon }{\dot{z}}^{\pm 1}+\left( \pm {\textrm{i}}\frac{\ddot{\varphi }}{\varepsilon }-\frac{{\dot{\varphi }}^2}{\varepsilon ^2}\right) z^{\pm 1}=\left( {\dot{z}}^{\pm 1}\pm {\textrm{i}}\frac{{\dot{\varphi }}}{\varepsilon }z^{\pm 1}\right) \times B(z^0)+{\dot{z}}^0\times B'(z^0)z^{\pm 1}+O(\varepsilon ).\nonumber \\ \end{aligned}$$
(4.2)

We first study the case \(k=0\), i.e. (4.1). Here we begin by giving an alternative expression for the term I, which is an O(1) term. We show that

$$\begin{aligned} I=-\mu ^0\,\nabla |B|(z^0)+O(\varepsilon ). \end{aligned}$$
(4.3)

With the normalized eigenvectors \(\nu _j\), we have \(z_1^1=\zeta \nu _1\) and \(z_{-1}^{-1}={\overline{\zeta }} \nu _{-1}\) with \(\nu _{-1}=\overline{\nu }_1\). We define the local orthonormal basis \(e_1, e_2, e_3\) of \({\mathbb {R}}^3\) by the eigenvectors \(\nu _j\) as \(\nu _0= B/|B|=e_1\) and \(\nu _{\pm 1}=\frac{1}{\sqrt{2}}(e_2\pm {\textrm{i}}e_3)\). Using that \(z^k_j=O(\varepsilon ^2)\) for \(|k|=1\) and \(j\ne k\) by part (a), the term I can then be written as

$$\begin{aligned} \nonumber I&=\frac{\textrm{i}{\dot{\varphi }}}{\varepsilon }\,z^{1}_1\times B'(z^{0})z^{-1}_{-1}-\frac{\textrm{i}{\dot{\varphi }}}{\varepsilon }\,z^{-1}_{-1}\times B'(z^{0})z^{1}_{1} + O(\varepsilon )\\&=|B(z^0)|\,|z^1_1|^2 \left( e_2\times B'(z^0)e_3-e_3\times B'(z^0)e_2\right) + O(\varepsilon ). \end{aligned}$$
(4.4)

Following equation (11) in [15], we find

$$\begin{aligned} e_2\times B'(z^0)e_3-e_3\times B'(z^0)e_2=-\nabla |B|(z^0). \end{aligned}$$
(4.5)
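Identity (4.5) can also be checked directly in the frame \(e_1,e_2,e_3\). The following computation is a sketch under assumptions that are not spelled out in the text: the frame is right-handed (\(e_1\times e_2=e_3\)) and frozen at the point \(z^0\), we write \(B_j=e_j\cdot B\) and \(\partial _j\) for the directional derivative along \(e_j\) (so that \(B'(z^0)e_j=\partial _j B\)), and \(\nabla \cdot B=0\) because \(B=\nabla \times A\). Expanding the cross products gives

$$\begin{aligned} e_2\times \partial _3 B-e_3\times \partial _2 B&=(\partial _2 B_2+\partial _3 B_3)\,e_1-\partial _2 B_1\,e_2-\partial _3 B_1\,e_3\\&=-\partial _1 B_1\,e_1-\partial _2 B_1\,e_2-\partial _3 B_1\,e_3, \end{aligned}$$

where the second line uses \(\partial _1B_1+\partial _2B_2+\partial _3B_3=0\). Since \(|B|^2=B\cdot B\) yields \(\nabla |B|=B'(z^0)^\top B/|B|=B'(z^0)^\top e_1\), the \(e_j\)-component of \(\nabla |B|\) is \(\partial _j B_1\), and the right-hand side equals \(-\nabla |B|(z^0)\).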

On the other hand,

$$\begin{aligned} x=z^0+O(\varepsilon ), \quad {\dot{x}}={\dot{z}}^0+{\textrm{i}}\frac{{\dot{\varphi }}}{\varepsilon }z^1_1{\textrm{e}}^{{\textrm{i}}\varphi /\varepsilon }-{\textrm{i}}\frac{{\dot{\varphi }}}{\varepsilon }z^{-1}_{-1}{\textrm{e}}^{-{\textrm{i}}\varphi /\varepsilon }+O(\varepsilon ) \end{aligned}$$

and thus

$$\begin{aligned} {\dot{x}}\times B(x)={\dot{z}}^0\times B(z^0)-|B(z^0)|^2\left( z^1_1{\textrm{e}}^{{\textrm{i}}\varphi /\varepsilon }+z_{-1}^{-1}{\textrm{e}}^{-{\textrm{i}}\varphi /\varepsilon }\right) +O(\varepsilon ^0). \end{aligned}$$

From the orthogonality of \(z_1^1\) and \(z_{-1}^{-1}\) it follows that

$$\begin{aligned} \mu (x,{\dot{x}})=\frac{1}{2}\frac{|{\dot{x}}\times B(x)|^2}{|B(x)|^3}=|B(z^0)|\,{|z_1^1|^2}+O(\varepsilon ^2). \end{aligned}$$
(4.6)

Inserting (4.5) and (4.6) into (4.4) gives

$$\begin{aligned} I=-\mu (x,{\dot{x}})\nabla |B|(z^0)+O(\varepsilon ). \end{aligned}$$

Using the adiabatic invariance [9, 15] \( \, \mu (x(t),\dot{x}(t)) = \mu ^0 + O(\varepsilon ^2), \) we obtain (4.3), and hence equation (4.1) can be equivalently written as

$$\begin{aligned} \ddot{z}^0={\dot{z}}^0\times B(z^0)+E(z^0)-\mu ^0\,\nabla |B|(z^0)+O(\varepsilon ). \end{aligned}$$
(4.7)

— Multiplying (4.7) with \(P_0(z^0)\) gives

$$\begin{aligned} P_0(z^{0})\ddot{z}^{0}=P_0(z^{0})\bigl (E(z^{0})-\mu ^0\,\nabla |B|(z^{0})\bigr )+O(\varepsilon ). \end{aligned}$$

Using the product rule

$$\begin{aligned} \ddot{z}^0_0 = \frac{{\textrm{d}}^2}{{\textrm{d}}t^2} \bigl ( P_0(z^0)(z^0-c^0) \bigr ) = P_0(z^{0})\ddot{z}^{0} + 2 \dot{P}_0(z^{0}){\dot{z}}^{0} + {\ddot{P}}_0(z^{0})({z}^{0}-c^0), \end{aligned}$$

this gives the first equation in (c).

— Multiplying (4.7) with \(P_{\pm 1}(z^0)\) gives

$$\begin{aligned} P_{\pm 1}(z^0)\ddot{z}^0=\pm {\textrm{i}}\frac{{\dot{\varphi }}}{\varepsilon }P_{\pm 1}(z^0){\dot{z}}^0+P_{\pm 1}(z^0)(E(z^0)-\mu ^0\,\nabla |B|(z^{0}))+O(\varepsilon ). \end{aligned}$$

Substituting \(P_{\pm 1}(z^0){\dot{z}}^0={\dot{z}}^0_{\pm 1}-{\dot{P}}_{\pm 1}(z^0)(z^0-c^0)\) yields

$$\begin{aligned} {\dot{z}}^0_{\pm 1}-{\dot{P}}_{\pm 1}(z^0)(z^0-c^0)&=\mp {\textrm{i}}\frac{\varepsilon }{{\dot{\varphi }}}P_{\pm 1}(z^0)\ddot{z}^0\pm {\textrm{i}}\frac{\varepsilon }{{\dot{\varphi }}}P_{\pm 1}(z^0)\left( E(z^0)-\mu ^0\,\nabla |B|(z^{0})\right) +O(\varepsilon ^2)\\&=\mp {\textrm{i}}\frac{\varepsilon }{{\dot{\varphi }}}\left( \ddot{z}^0_{\pm 1}-{\ddot{P}}_{\pm 1}(z^0)(z^0-c^0)-2{\dot{P}}_{\pm 1}(z^0){\dot{z}}^0\right) \\&\ \ \pm {\textrm{i}}\frac{\varepsilon }{{\dot{\varphi }}}P_{\pm 1}(z^0)\left( E(z^0)-\mu ^0\,\nabla |B|(z^{0})\right) +O(\varepsilon ^2). \end{aligned}$$

Denoting \(g_{\pm 1}={\dot{z}}^0_{\pm 1}-{\dot{P}}_{\pm 1}(z^0-c^0)\), we have \({\dot{g}}_{\pm 1}=\ddot{z}^0_{\pm 1}-{\ddot{P}}_{\pm 1}(z^0)(z^0-c^0)-{\dot{P}}_{\pm 1}(z^0){\dot{z}}^0\). The above equation can be expressed as

$$\begin{aligned} g_{\pm 1}=\mp {\textrm{i}}\frac{\varepsilon }{ {\dot{\varphi }}}{\dot{g}}_{\pm 1}\pm {\textrm{i}}\frac{\varepsilon }{{\dot{\varphi }}}{\dot{P}}_{\pm 1}(z^0){\dot{z}}^0\pm {\textrm{i}}\frac{\varepsilon }{{\dot{\varphi }}}P_{\pm 1}(z^0)\left( E(z^0)-\mu ^0\,\nabla |B|(z^{0})\right) +O(\varepsilon ^2). \end{aligned}$$

By differentiation and substitution, the first term on the right-hand side can be absorbed into the \(O(\varepsilon ^2)\) term, and so we get the second equation in (c).

Since the \(\varepsilon ^{-2}\)-terms cancel in (4.2) after projection with \(P_{\pm 1}(z^0)\), the \(\varepsilon ^{-1}\)-terms are dominant and we obtain the last equation in (c).
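In more detail, this last step can be sketched as follows, using only relations stated above: \(P_{\pm 1}(w\times B)=\pm {\textrm{i}}|B|\,P_{\pm 1}w\) for any vector w, and \({\dot{\varphi }}/\varepsilon =|B|\). Projecting (4.2) with \(P_{\pm 1}=P_{\pm 1}(z^0)\), the terms with \({\dot{\varphi }}^2/\varepsilon ^2\) cancel and one copy of \(\pm {\textrm{i}}({\dot{\varphi }}/\varepsilon )P_{\pm 1}{\dot{z}}^{\pm 1}\) remains on the left:

$$\begin{aligned} \pm {\textrm{i}}\frac{{\dot{\varphi }}}{\varepsilon }P_{\pm 1}{\dot{z}}^{\pm 1}&=\mp {\textrm{i}}\frac{\ddot{\varphi }}{\varepsilon }P_{\pm 1}z^{\pm 1}+P_{\pm 1}\left( {\dot{z}}^0\times B'(z^0)z^{\pm 1}\right) +O(\varepsilon ),\\ P_{\pm 1}{\dot{z}}^{\pm 1}&=-\frac{\ddot{\varphi }}{{\dot{\varphi }}}P_{\pm 1}z^{\pm 1}\mp \frac{{\textrm{i}}}{|B|}P_{\pm 1}\left( {\dot{z}}^0\times B'(z^0)z^{\pm 1}\right) +O(\varepsilon ^2). \end{aligned}$$

The second line follows from the first on multiplication by \(\mp {\textrm{i}}\varepsilon /{\dot{\varphi }}\). With \({\dot{z}}^{\pm 1}_{\pm 1}=P_{\pm 1}{\dot{z}}^{\pm 1}+{\dot{P}}_{\pm 1}z^{\pm 1}\), \(z^{\pm 1}=z^{\pm 1}_{\pm 1}+O(\varepsilon ^2)\) from part (a), and \(\ddot{\varphi }/{\dot{\varphi }}=(({\textrm{d}}/{\textrm{d}}t)|B|)/|B|\), this is the third equation in (c).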

(d): The initial values can be obtained by the same arguments as in the proof of Theorem 4.1 in [11]. \(\square \)

4.2 Modulated Fourier expansion of the numerical solution

The modulated Fourier expansion can be extended to the numerical solution of the modified Boris algorithm similarly to Theorem 4.2 in [10]. There are, however, additional terms and difficulties to be considered, since here we do not have a magnetic field in a near-constant direction as in [10].

Theorem 4.2

Let \(x^n\) be the numerical solution obtained by applying the modified Boris algorithm to (2.1)–(2.3) with a stepsize h satisfying

$$\begin{aligned} c_*\varepsilon \le h^2 \le C_*\varepsilon \end{aligned}$$
(4.8)

for some positive constants \(c_*\) and \(C_*\). We assume that the component orthogonal to \(B(x^0)\) of the starting velocity, \(v^0_\perp =P_\perp (x^0) v^0= v^0 - P_0(x^0)v^0\), is chosen to be small:

$$\begin{aligned} |v^0_\perp | \le c_1 \varepsilon . \end{aligned}$$
(4.9)

We further make the nondegeneracy assumption (2.7). For an arbitrary truncation index \(N\ge 2\), we then have a decomposition

$$\begin{aligned} x^n=y(t_n) + (-1)^n z(t_n) +R_N(t_n), \qquad t_n=nh \le T, \end{aligned}$$
(4.10)

with the following properties:

  1. (a)

    The functions y(t) and z(t), \(0\le t \le T\), are piecewise continuous with jumps of size \(O(h^N)\) at integral multiples of h and are smooth elsewhere. Together with their derivatives (up to order N) they are bounded as \(y=O(1)\), \(z=O(h^2)\). They are unique up to \(O(h^{N})\). Moreover, \(P_\perp (y)\dot{y} = O(\varepsilon )\) and \(P_0(y)z=O(h^4)\).

  2. (b)

    The remainder term is bounded by

    $$\begin{aligned} R_N(t)=O(h^N) \quad \text {for} \quad 0\le t\le T. \end{aligned}$$
  3. (c)

    We let \(c^0(t)\) be a piecewise constant function that is sufficiently close to y(t). The functions \(y_j=P_j(y) (y- c^0)\) \((j=0,\pm 1)\) and \(z_{\pm 1}=P_{\pm 1}z\) satisfy the following differential equations for \(0\le t \le T\) except at the jumps. Here, all functions B, E, \(P_j\) are evaluated at the numerical guiding centre y(t), and we write \(\dot{P}_j = ({\textrm{d}}/{\textrm{d}}t) P_j(y(t))=P_j'(y(t)) \dot{y}(t)\) and analogously \({\ddot{P}}_j\). Moreover, \(\mu ^0=\mu (x(0),\dot{x}(0))\) is the magnetic moment. Omitting the ubiquitous argument t, we have

    $$\begin{aligned} \ddot{y}_0&=2{\dot{P}}_0{\dot{y}}+{\ddot{P}}_0(y-c^0)+ P_0\left( E-\mu ^0\,\nabla |B|\right) +O(h^2)\\ {\dot{y}}_{\pm 1}&={\dot{P}}_{\pm 1}(y-c^0) \pm \frac{{\textrm{i}}}{|B|}{\dot{P}}_{\pm 1}{\dot{y}} \pm \frac{{\textrm{i}}}{|B|}P_{\pm 1}\left( E-\mu ^0\,\nabla |B|\right) +O(h^2)\\ {\dot{z}}_{\pm 1}&={\dot{P}}_{\pm 1}z\mp \frac{4{\textrm{i}}}{h^2|B|}z_{\pm 1}{\mp \,\frac{{\textrm{i}}}{|B|}P_{\pm 1}\left( {\dot{y}}\times B'(y)z\right) }+O(\varepsilon h^2). \end{aligned}$$

    The function \(z_0=P_0(y)z\) is given by an algebraic expression depending on y, \({\dot{y}}_0\) and \(z_{\pm 1}\).

  4. (d)

    Initial values for the differential equations of item (c) are given by

    $$\begin{aligned} \begin{aligned} y(0)&= x^0+O(h^2),\\ {\dot{y}}_0(0)&=P_0(x^0) v^0 +O(h^2), \\ z_{\pm 1}(0)&= O(h^2). \end{aligned} \end{aligned}$$

The constants symbolized by the O-notation are independent of \(\varepsilon \), h and n with \(0\le nh \le T\), but depend on the velocity bound, on bounds of derivatives of B and E in a neighbourhood of the numerical trajectory, and on the final time T.
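The structure of the decomposition (4.10) can be illustrated numerically. The following is a minimal sketch and not part of the analysis: it assumes a scalar sequence of the form \(x^n=y(t_n)+(-1)^n z(t_n)\) with smooth y and z, and separates the two parts by three-point averaging. The function name `split_smooth_oscillatory` and the test functions are hypothetical choices made for illustration. Each part is recovered only up to \(O(h^2)\), so for \(z=O(h^2)\) as in item (a) such a filter gives no more than the order of magnitude of z.

```python
import math

def split_smooth_oscillatory(x):
    """Estimate y(t_n) and z(t_n) from x^n = y(t_n) + (-1)^n z(t_n).

    Returns two lists for the interior indices n = 1, ..., len(x) - 2.
    Both estimates carry an O(h^2) error when y and z are smooth.
    """
    y_est, z_est = [], []
    for n in range(1, len(x) - 1):
        sign = 1.0 if n % 2 == 0 else -1.0
        # The weights (1, 2, 1)/4 cancel the (-1)^n part to leading order.
        y_est.append((x[n + 1] + 2.0 * x[n] + x[n - 1]) / 4.0)
        # The weights (-1, 2, -1)/4 cancel the smooth part to leading order.
        z_est.append(sign * (-x[n + 1] + 2.0 * x[n] - x[n - 1]) / 4.0)
    return y_est, z_est

# Synthetic data (hypothetical): smooth y and z sampled with step size h.
h = 0.01
t = [n * h for n in range(200)]
y = [math.sin(s) for s in t]
z = [0.1 * math.cos(2.0 * s) for s in t]
x = [y[n] + (-1) ** n * z[n] for n in range(len(t))]

y_est, z_est = split_smooth_oscillatory(x)
err_y = max(abs(y_est[i] - y[i + 1]) for i in range(len(y_est)))
err_z = max(abs(z_est[i] - z[i + 1]) for i in range(len(z_est)))
```

In this test the amplitude of z is taken larger than \(O(h^2)\) on purpose, so that the \(O(h^2)\) accuracy of both estimates is visible.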

Remark 4.4

The essential observation is that for the modified Boris method, the differential equations for the numerical guiding centre y(t) are the same, up to a defect of size \(O(h^2)\), as the differential equations for the guiding centre \(z^0(t)\) of the exact solution, and also the initial values agree up to \(O(h^2)\). In contrast, for the standard Boris method with parallel-projected initial velocity, the terms \(\mu ^0\,\nabla |B|\) are missing. This is the reason for the failure of the standard Boris method with modified starting values for large step sizes \(h^2\ge \varepsilon \) in the situation of strongly non-uniform strong magnetic fields.

Proof

This theorem is proved similarly to Theorem 4.2 in [10] (which gives an analogous decomposition for the standard Boris method in the case of a near-constant strong magnetic field) combined with the treatment of the strongly non-uniform magnetic field in Theorem 4.1 in [9]. Here, we do not repeat the arguments in the proofs of those papers for (a) and (b) (such as the recursive elimination of higher time derivatives, an idea that goes back as far as the Euler–Maclaurin summation formula [13]) but concentrate on parts (c) and (d), which are specific to the present situation.

Since a general strong magnetic field is considered, the modulated Fourier expansion is here valid only on time intervals of length O(h) instead of O(1). We therefore need to patch together many such short-time expansions, starting anew from each \(x^n\), in the same way as was done in Theorem 4.1 over intervals of length proportional to \(\varepsilon \).

Inserting the decomposition (4.10) into the numerical method (2.4) and separating the terms without and with the factor \((-1)^n\) gives

$$\begin{aligned}&\ddot{y}+O(h^2)=\bigl ({\dot{y}}+ O(h^2)\bigr )\times B(y)-{\dot{z}}\times B'(y)z+E(y)-\mu ^0\,\nabla |B|(y)+O(h^2) \end{aligned}$$
(4.11)
$$\begin{aligned}&\quad -\frac{4}{h^2}z-\ddot{z}+O(h^2)=-{\dot{z}}\times B(y)+{\dot{y}}\times B'(y)z+E'(y)z+O(h^2). \end{aligned}$$
(4.12)

Since \(z=O(h^2)\), \(\dot{z}=O(h^2)\) and \(B'(y)=O(\varepsilon ^{-1})\), the second term on the right-hand side of (4.11) is

$$\begin{aligned} {\dot{z}}\times B'(y)z= O(h^4/\varepsilon ) = O(h^2) \end{aligned}$$

in our stepsize regime \(h^2\sim \varepsilon \).
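For orientation, one step of a Boris-type two-step scheme can be sketched as follows. This is a hedged illustration: the method (2.4) is not restated in this section, and the sketch assumes that it has the form \(x^{n+1}-2x^n+x^{n-1}=h^2\bigl (\tfrac{1}{2h}(x^{n+1}-x^{n-1})\times B(x^n)+F(x^n)\bigr )\) with \(F=E-\mu ^0\,\nabla |B|\), which is the form suggested by (4.11) and (4.12). The names `boris_two_step`, `B_field` and `force` are hypothetical.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Euclidean inner product of two 3-vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def boris_two_step(x_prev, x_curr, h, B_field, force):
    """One step of the ASSUMED two-step scheme

        x_next - 2 x_curr + x_prev
            = h^2 * ((x_next - x_prev) / (2h) x B(x_curr) + F(x_curr)).

    With b = (h/2) B(x_curr) this is the linear system
        x_next + b x x_next = r,
    which is solved in closed form below.
    """
    b = tuple(0.5 * h * Bi for Bi in B_field(x_curr))
    F = force(x_curr)
    xb = cross(x_prev, b)
    r = tuple(2.0 * x_curr[i] - x_prev[i] - xb[i] + h * h * F[i]
              for i in range(3))
    # (I + [b]_x)^{-1} r = (r - b x r + b (b . r)) / (1 + |b|^2)
    bxr = cross(b, r)
    br = dot(b, r)
    denom = 1.0 + dot(b, b)
    return tuple((r[i] - bxr[i] + b[i] * br) / denom for i in range(3))

# Usage with a hypothetical uniform strong field and a constant force.
eps = 0.01
x_next = boris_two_step((0.0, 0.0, 0.0), (0.01, 0.02, 0.03), 0.1,
                        lambda x: (0.0, 0.0, 1.0 / eps),
                        lambda x: (0.3, -0.2, 0.5))
```

The closed-form solve avoids a general linear solver; it uses only that the matrix of the system is the identity plus a skew-symmetric cross-product matrix.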

Taking the projection \(P_0=P_0(y)\) on both sides of (4.11) yields the first equation in (c). Taking the projection \(P_{\pm 1}\) on both sides gives

$$\begin{aligned} P_{\pm 1}\ddot{y}+O(h^2)=\pm {\textrm{i}}|B(y)| P_{\pm 1}{\dot{y}}+P_{\pm 1}\left( E(y)-\mu ^0\,\nabla |B|(y)\right) +O(h^2 |B(y)|). \end{aligned}$$

As in Theorem 4.1 we thus have, with \(B=B(y)\),

$$\begin{aligned} P_{\pm 1}{\dot{y}}=\pm \frac{{\textrm{i}}}{|B|}{\dot{P}}_{\pm 1}{\dot{y}}\pm \frac{{\textrm{i}}}{|B|}P_{\pm 1}\left( E(y)-\mu ^0\,\nabla |B(y)|\right) +O( h^2). \end{aligned}$$

Taking the projection \(P_{\pm 1}=P_{\pm 1}(y)\) on both sides of (4.12) yields

$$\begin{aligned} -\frac{4}{h^2}z_{\pm 1}-P_{\pm 1}\ddot{z}+O(h^2)=\mp {\textrm{i}}|B|P_{\pm 1}{\dot{z}}+P_{\pm 1}\left( {\dot{y}}\times B'(y)z\right) +O(h^2), \end{aligned}$$

and so we find

$$\begin{aligned} P_{\pm 1}{\dot{z}}=\mp \frac{4{\textrm{i}}}{h^2|B|}z_{\pm 1}\mp \frac{{\textrm{i}}}{|B|}P_{\pm 1}\left( {\dot{y}}\times B'(y)z\right) +O(\varepsilon h^2). \end{aligned}$$

We thus have the differential equations of part (c). Taking \(P_0\) on both sides of (4.12) and multiplying with \(-h^2/4\) yields

$$\begin{aligned} z_0=-\tfrac{1}{4} h^2\,P_0({\dot{y}}\times B'(y)z)+O(h^4). \end{aligned}$$

Since \(P_\perp \dot{y}=O(\varepsilon )\), we have \(P_0({\dot{y}}\times B'(y)z)= P_0(P_\perp {\dot{y}}\times B'(y)z)=O(z)\). This gives us \(z_0=O(h^4)\) provided that \(z_{\pm 1}=O(h^2)\).

(d) The numerical approximation to the velocity is given by

$$\begin{aligned} v^n=\frac{x^{n+1}-x^{n-1}}{2h}={\dot{y}}(t_n)+\tfrac{1}{6}h^2\dddot{y}(t_n)+\cdots -(-1)^n\bigl ({\dot{z}}(t_n)+\tfrac{1}{6}h^2\dddot{z}(t_n)+\cdots \bigr ), \end{aligned}$$

and so we have

$$\begin{aligned} v^n_{\perp }=P_{\perp }{\dot{y}}(t_n)-(-1)^n P_{\perp }{\dot{z}}(t_n)+ O(h^2), \end{aligned}$$

which under the bounds of (a) yields \(v^n_{\perp }=O(h^2)\). We now consider this equation for \(n=0\). Since the above equation for \(P_{\pm 1}{\dot{z}}\) and the bound for \(z_0\) yield

$$\begin{aligned} P_\perp \dot{z}(0) = \frac{4}{h^2|B^0|}\, L_{x^0,v^0}(z_\perp (0)) \times \frac{B^0}{|B^0|} + O(h^4), \end{aligned}$$

the above equation for \(v^0_\perp \) yields

$$\begin{aligned} \frac{4}{h^2|B^0|}\, L_{x^0,v^0}(z_\perp (0)) \times \frac{B^0}{|B^0|}= P_{\perp }{\dot{y}}(0) - v^0_\perp + \tfrac{1}{6}h^2 P_{\perp }\dddot{y}(0) + O(h^2 z) +O(h^4), \end{aligned}$$

and with the nondegeneracy condition (2.7) we are now able to construct \(z_\perp (0)\) and hence \(z_{\pm 1}(0)\), which thanks to \(h^2\sim \varepsilon \) and \(v^0_\perp =O(\varepsilon )\) are indeed of size \(O(h^2)\). \(\square \)
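The expansion of the numerical velocity used in part (d) can be checked on synthetic data. The sketch below is illustrative only: it assumes a scalar sequence \(x^n=y(t_n)+(-1)^n z(t_n)\) with hypothetical smooth functions y and z, and compares the central difference \((x^{n+1}-x^{n-1})/(2h)\) with \({\dot{y}}(t_n)-(-1)^n{\dot{z}}(t_n)\). The names `central_difference_velocity` and `max_velocity_defect` are not from the source.

```python
import math

def central_difference_velocity(x, h):
    """Return v^n = (x^{n+1} - x^{n-1}) / (2h) for n = 1, ..., len(x) - 2."""
    return [(x[n + 1] - x[n - 1]) / (2.0 * h) for n in range(1, len(x) - 1)]

def max_velocity_defect(h, T=2.0):
    """Largest deviation of v^n from dy/dt(t_n) - (-1)^n dz/dt(t_n) on [0, T]."""
    steps = int(round(T / h)) + 1
    t = [n * h for n in range(steps)]
    # Hypothetical smooth test functions: y(t) = sin t, z(t) = 0.1 cos 2t.
    x = [math.sin(s) + (-1) ** n * 0.1 * math.cos(2.0 * s)
         for n, s in enumerate(t)]
    v = central_difference_velocity(x, h)
    defect = 0.0
    for n in range(1, steps - 1):
        y_dot = math.cos(t[n])
        z_dot = -0.2 * math.sin(2.0 * t[n])
        # v[n - 1] holds v^n because the list starts at n = 1.
        defect = max(defect, abs(v[n - 1] - (y_dot - (-1) ** n * z_dot)))
    return defect

d_coarse = max_velocity_defect(0.02)
d_fine = max_velocity_defect(0.01)
```

Halving h should reduce the defect by roughly a factor of four, which is the behaviour expected from a leading defect term proportional to \(h^2\).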

4.3 Proof of Theorem 2.1

Theorem 4.1 represents the exact solution as

$$\begin{aligned} x(t)= z^0(t) + O(\varepsilon ), \end{aligned}$$

and Theorem 4.2 represents the numerical solution of the modified Boris method with \(h^2\sim \varepsilon \) as

$$\begin{aligned} x^n = y(t_n) + O(h^2), \end{aligned}$$

where the guiding centre \(z^0(t)\) and the numerical guiding centre \(y(t_n)\) satisfy the same differential equations up to \(O(h^2)\) with the same initial values up to \(O(h^2)\), and the jumps of size \(O(\varepsilon ^N)\) or \(O(h^N)\) for arbitrary N contribute less than \(O(h^2)\) to the difference. (The piecewise constant function \(c^0(t)\) can be chosen the same in both cases.) Therefore, \(z^0(t)\) and y(t) differ by \(O(h^2)\) on a fixed time interval \(0\le t \le T\). This proves the \(O(h^2)\) error bound for the positions in Theorem 2.1.
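The step from equal differential equations and initial values up to \(O(h^2)\) to an \(O(h^2)\) difference of the solutions is a standard stability argument. As a schematic illustration only, assume that both systems have been written in first-order form \(\dot{u}=F(u)\) and \(\dot{\widetilde{u}}=F(\widetilde{u})+\delta (t)\), where F has a Lipschitz constant L that is bounded independently of \(\varepsilon \) and h. This first-order form, the symbols u, \(\widetilde{u}\), \(\delta \) and L, and the neglect of the jumps are assumptions of this sketch and are not taken from the theorems above. The Gronwall inequality then gives

$$\begin{aligned} |u(t)-\widetilde{u}(t)| \le {\textrm{e}}^{Lt}\,|u(0)-\widetilde{u}(0)| + \frac{{\textrm{e}}^{Lt}-1}{L}\,\max _{0\le s\le t}|\delta (s)|, \qquad 0\le t\le T, \end{aligned}$$

so that initial differences and defects of size \(O(h^2)\) lead to a difference of size \(O(h^2)\) on the fixed interval \(0\le t\le T\), with a constant depending on L and T.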

We now turn to the error bound for the velocity. We compare the velocity of the exact solution

$$\begin{aligned} v(t)={\dot{x}}(t)={\dot{z}}^0(t)+\frac{{\textrm{i}}{\dot{\varphi }}(t)}{\varepsilon }z^1_1(t){\textrm{e}}^{{\textrm{i}}\varphi (t)/\varepsilon }-\frac{{\textrm{i}}{\dot{\varphi }}(t)}{\varepsilon }z^{-1}_{-1}(t){\textrm{e}}^{-{\textrm{i}}\varphi (t)/\varepsilon }+O(\varepsilon ) \end{aligned}$$

and the numerical velocity

$$\begin{aligned} v^n=\frac{x^{n+1}-x^{n-1}}{2h}={\dot{y}}(t_n)-(-1)^n{\dot{z}}(t_n)+O(h^2). \end{aligned}$$

Since \(P_\parallel (z^0)z^{\pm 1}_{\pm 1}=0\) and \(P_\parallel (y)z=z_0=O(h^4)\), and since we already know that \(z^0(t)-y(t)=O(h^2)\) and \(z^0(t)-x(t)=O(\varepsilon )\) and \(y(t_n)-x^n=O(h^2)\), it follows that

$$\begin{aligned} v_{\parallel }^n-v_{\parallel }(t_n)= P_\parallel (x^n)v^n - P_\parallel (x(t_n))v(t_n) = O(h^2). \end{aligned}$$

Finally, the bound \(v^n_\perp =O(h^2)\) was already shown in part (d) of the proof of Theorem 4.2. This completes the proof of Theorem 2.1.