1 Introduction

Gyrokinetic theories and simulations are powerful means to investigate microinstabilities and turbulence in fusion and astrophysical plasmas (Krommes 2012; Garbet et al. 2010; Idomura et al. 2006; Dimits et al. 2000; Schekochihin et al. 2009). Linear and nonlinear gyrokinetic equations for particle distribution functions were originally derived by recursive techniques combined with the WKB representation (Hazeltine and Meiss 1992; Rutherford and Frieman 1968; Taylor and Hastie 1968; Antonsen and Lane 1980; Catto et al. 1981; Frieman and Chen 1982). A more modern derivation of the gyrokinetic equations based on the Lagrangian and/or Hamiltonian formulations (Cary and Littlejohn 1983; Brizard and Hahm 2007; Dubin et al. 1983; Lee 1983; Hahm et al. 1988; Hahm 1988; Brizard 1989) was presented to ensure conservation laws for the phase space volume and the magnetic moment from Liouville’s theorem and Noether’s theorem (Goldstein et al. 2002), respectively. Later, conservation of the total energy and momentum was obtained in the gyrokinetic field theory (Sugama 2000), where all governing equations for the distribution functions and the electromagnetic fields are derived from the Lagrangian which describes the whole system consisting of particles and fields. However, Lagrangian/Hamiltonian gyrokinetic formulations basically treat collisionless systems, so that Noether’s theorem and conservation laws are not directly applicable to collisional systems. This is a significant limitation because, in toroidal magnetic configurations, Coulomb collisions and turbulent fluctuations generally cause classical (Braginskii 1965), neoclassical (Helander and Sigmar 2002; Hirshman and Sigmar 1981; Hinton and Hazeltine 1976), and turbulent transport (Horton 2012) of plasma particles, heat, and momentum as multiscale phenomena. Therefore, several theoretical studies (Brizard 2004; Madsen 2013; Sugama et al. 2015; Burby et al. 2015) have been carried out to combine the modern gyrokinetic formulations with collision models which can keep the conservation laws and describe the collisional transport processes properly, as shown later in the present paper.

Recently, profiles of background \(\mathbf{E}\times \mathbf{B}\) and toroidal flows have come to be regarded as key factors which influence magnetic plasma confinement, although severe accuracy requirements for theoretically predicting those flow profiles are sometimes controversial among recent studies based on the low-flow ordering (Krommes 2012; Parra and Catto 2010; Scott and Smirnov 2010; Sugama et al. 2011; Calvo and Parra 2012) in which the background flow velocity \(V_0\) is assumed to be of \({\mathcal {O}}(\delta v_{Ti})\). Here, \(v_{Ti}\) is the ion thermal velocity and \(\delta \sim \rho _{Ti}/L \ll 1\) represents the ordering parameter defined by the ratio of the ion thermal gyroradius \(\rho _{Ti}\) to the background gradient scale length L. On the other hand, under the high-flow ordering \(V_0 = {\mathcal {O}}(v_{Ti})\) (Hinton and Wong 1985; Catto 1987; Sugama and Horton 1997a, b, 1998; Artun and Tang 1994; Brizard 1995; Hahm 1996; Miyato 2009; Abel et al. 2013), the toroidal momentum transport equation which determines the background radial electric field profile can be derived with the same-order accuracy as the particle and energy transport equations. In this paper, we present a novel formulation of collisional and turbulent transport in toroidal plasmas under the high-flow ordering (Sugama et al. 2017) by generalizing the previous study to derive governing equations for background and turbulent electromagnetic fields and gyrocenter distribution functions which satisfy conservation laws for particles, energy, and toroidal momentum. In future magnetic confinement fusion devices such as ITER (Horton and Benkadda 2015), the toroidal flow velocity \(V_0\) is expected to be lower than \(v_{Ti}\). In ITER, the Mach number is expected to be \(M\equiv V_0 / v_{Ti} \sim 0.05\), which is still much larger than the normalized gyroradius \(\delta \sim \rho _{Ti} / a \sim 2 \times 10^{-3}\) [see, for example, Table 1 in Abel et al. (2013)]. Thus, for ITER, the so-called low-flow ordering \(M \sim \delta\) is not considered to be sufficiently valid and the gyrokinetic model presented in this paper based on the high-flow ordering is still expected to be useful for investigating the effects of the toroidal flow and its shear on transport processes.

The rest of this paper is organized as follows. In Sect. 2, the guiding-center and gyrocenter equations for the charged particles under the strong magnetic field are presented based on the Lagrangian/Hamiltonian formulation. Here, following Brizard’s terminology (Brizard 1989), we refer to the single-particle phase-space variables defined from the equilibrium and perturbed fields as the guiding-center and gyrocenter coordinates, respectively. The guiding-center equations derived by Littlejohn in the high-flow ordering (Littlejohn 1981) are reviewed in Sect. 2.1 and they are modified to treat toroidally rotating plasmas in Sect. 2.2. Then, effects of turbulent electromagnetic perturbations are introduced to derive the gyrocenter equations in Sect. 2.3. In Sect. 3, the gyrokinetic field theory is presented for toroidally rotating plasmas. The Lagrangian for the whole system consisting of the charged particles and the electromagnetic fields is presented in Sect. 3.1. Then, applying the variational principle to the Lagrangian, the gyrokinetic Vlasov equation for the collisionless rotating plasma is derived in Sect. 3.1 while the gyrokinetic Poisson equations and Ampère’s laws to determine the background and perturbation parts of the electromagnetic fields are shown in Sect. 3.2. Noether’s theorem relating symmetry of the system to conservation laws is described in Sect. 3.3. Collisional effects are considered by including the collision term into the gyrokinetic equation for the distribution function in Sect. 4 where an external source term is also included. We need to represent the collision term in terms of the gyrocenter coordinates to use it in the gyrokinetic Boltzmann equation. It is shown in Sect. 4.1 how the collision operator represented in the particle coordinates is transformed to that in the gyrocenter coordinates. In Sect. 4.2, we follow Burby et al. (2015) to represent the collision operator in terms of Poisson brackets and show that the resultant operator in the gyrocenter coordinates retains preferable properties regarding conservation laws for particles, energy, and toroidal angular momentum. Equations for gyrocenter densities and polarization in collisional systems are derived from the gyrokinetic Boltzmann equation in Sect. 4.3. In Sect. 4.4, we find how the collisionless conservation laws derived from Noether’s theorem are modified by the collision and source terms added to the gyrokinetic equation. Then, in Sects. 4.5 and 4.6, the energy and toroidal angular momentum balance equations for the toroidally rotating plasma are derived from symmetries under the infinitesimal time translation and toroidal rotation, respectively. In Sect. 5, the gyrokinetic system of equations is separated into the ensemble-averaged and turbulent parts, which are shown to agree with the conventional drift kinetic and gyrokinetic equations for describing neoclassical and turbulent transport processes, respectively. The distribution function is divided into the ensemble-averaged and turbulent parts in Sect. 5.1 and the first-order components of these parts satisfy the drift kinetic and gyrokinetic equations which are derived in Sects. 5.2 and 5.3, respectively. Then, the ensemble-averaged particle, energy, and toroidal momentum balance equations are shown in Sects. 5.4, 5.5, and 5.6, respectively, and they are shown to contain the second-order classical, neoclassical, and turbulent transport fluxes which agree with those derived in the previous works. Finally, a summary is given in Sect. 6.

2 Guiding-center and gyrocenter equations

In this section, the guiding-center and gyrocenter equations are presented to describe motions of the charged particles under the strong background magnetic field without and with turbulent fluctuations, respectively. We begin by explaining the guiding-center equations derived by Littlejohn for the case without perturbations, in which large \(\mathbf{E}\times \mathbf{B}\) flows on the order of the ion thermal velocity are nevertheless assumed to exist.

2.1 Littlejohn’s guiding-center equations in the high-flow ordering

We here consider the particle with the mass m and the charge e moving in the electromagnetic fields. The magnetic field \(\mathbf{B}\) is assumed to be so large that \(\rho /L \sim 1/(\Omega \tau ) \ll 1\) is well satisfied. Here, \(\rho\) and \(\Omega\) are the particle’s gyroradius and gyrofrequency, respectively, while L and \(\tau\) are the characteristic length of variations in the ambient electromagnetic fields and the transit time scale \(\sim L/ v_T\), respectively. In the present paper, the background electromagnetic fields are assumed to slowly vary in the time scale \(\sim (\rho /L )^{-2} (L/ v_T)\). [Note that, in this subsection, we follow Littlejohn’s work (Littlejohn 1981) where the guiding-center motion equations are derived so as to be valid even for the background fields that rapidly vary with the transit time scale]. Then, \(\delta \sim \rho /L\) is used as an ordering parameter for the perturbation expansion to make the guiding center approximation. Here, the guiding center coordinates are denoted by \(\mathbf{Z} \equiv (\mathbf{X}, U, \mu , \xi )\) where \(\mathbf{X}\) represents the guiding center position vector, U the parallel velocity, \(\mu\) the magnetic moment, and \(\xi\) the gyrophase angle.

Governing equations for Lagrangian and Hamiltonian systems are generally derived from the variational principle (Goldstein et al. 2002),

$$\begin{aligned} \delta \int \limits _{t_1}^{t_2} L {\text{d}}t = 0. \end{aligned}$$
(1)

To apply the variational principle to the motion of the charged particle under the guiding center approximation, the Lagrangian L in Eq. (1) is represented as a function of \((\mathbf{Z}, \dot{\mathbf{Z}}, t)\), where \(\dot{} = d/dt\) denotes the derivative with respect to the time t. Here, following Littlejohn (Littlejohn 1981), the Lagrangian \(L (\mathbf{Z}, \dot{\mathbf{Z}}, t)\) is written as

$$\begin{aligned} L = \mathbf{P}^c \cdot \dot{\mathbf{X}} + \frac{m c}{e} \mu \dot{\xi } - H , \end{aligned}$$
(2)

where c is the speed of light in vacuum. The canonical momentum \(\mathbf{P}^c\) is given by

$$\begin{aligned} \mathbf{P}^c \equiv \frac{e}{c} \mathbf{A}^* \equiv \frac{e}{c} \mathbf{A} + m ( U \mathbf{b} + {\mathbf{V}}_E ) - \frac{mc}{e} \mu \mathbf{W} , \end{aligned}$$
(3)

where \(\mathbf{A}= \mathbf{A}(\mathbf{X}, t)\) is the vector potential for the magnetic field \(\mathbf{B} \equiv \nabla \times \mathbf{A}\), \(\nabla \equiv \partial / \partial \mathbf{X}\) represents the derivative with respect to the guiding center position vector \(\mathbf{X}\), \(\mathbf{b}\equiv \mathbf{B}/B\) is the unit vector parallel to \(\mathbf{B}\), \({\mathbf{V}}_E \equiv (c/B)\mathbf{E} \times \mathbf{b}\) is the \(\mathbf{E} \times \mathbf{B}\) drift velocity, and \(\mathbf{W}\) is defined by

$$\begin{aligned} \mathbf{W} \equiv ( \nabla \mathbf{e}_1 ) \cdot \mathbf{e}_2 + \frac{1}{2}{} \mathbf{b} \left[ \mathbf{b} \cdot ( \nabla \times \mathbf{b} ) \right] . \end{aligned}$$
(4)

Here, \((\mathbf{e}_1 , \mathbf{e}_2, \mathbf{b})\) are unit vectors which form a right-handed orthogonal triad and are regarded as functions of \((\mathbf{X}, t)\). The electric field is written as \(\mathbf{E} = - \nabla \Phi - c^{-1} \partial \mathbf{A} / \partial t\), where \(\Phi (\mathbf{X}, t)\) is the electrostatic potential. We see from Eq. (4) that \(\mathbf{W}\) contains the term \(( \nabla \mathbf{e}_1 ) \cdot \mathbf{e}_2\) which is dependent on the gyrogauge (Littlejohn 1981), namely the choice of the triad \((\mathbf{e}_1 , \mathbf{e}_2, \mathbf{b})\) necessary to define the gyrophase \(\xi\) [see Eq. (57)]. The last term on the right-hand side of Eq. (2) represents the Hamiltonian which is written as

$$\begin{aligned} H = e \Phi + \frac{1}{2} m ( U^2 + V_E^2) + \mu B + \frac{m c}{e} \mu \left[ \frac{1}{2} \mathbf{b} \cdot ( \nabla \times {\mathbf{V}}_E ) + \frac{\partial \mathbf{e}_1}{\partial t} \cdot \mathbf{e}_2 \right] . \end{aligned}$$
(5)

The fact that the Hamiltonian H given above, the canonical momentum \(\mathbf{P}^c\) defined in Eq. (3), and accordingly the Lagrangian in Eq. (2) are all independent of the gyrophase \(\xi\) is essential for the derivation of the invariance of the magnetic moment \(\mu\) as described later.

Note that, in Littlejohn (1981), units are chosen so that \(m = c = e = 1\) and the guiding center approximation is performed by replacing the electric charge e with the inverse of the dimensionless parameter \(\epsilon\) which corresponds to \(\delta\) described above. In this review article, Gaussian units are used to express physical variables although we still treat e as a quantity of \({\mathcal {O}}(\delta ^{-1})\) as in Littlejohn (1981). This treatment, \(e = {\mathcal {O}}(\delta ^{-1})\), is convenient for quickly judging the order of each physical variable. For example, the gyrofrequency \(\Omega = e B / (m c)\) is regarded as of \({\mathcal {O}}(\delta ^{-1})\) because of e in the numerator although the magnetic field B included in \(\Omega\) is considered to be of \({\mathcal {O}}(\delta ^0)\) as well as m and c. Then, on the right-hand side of Eq. (3), the first and last terms are of \({\mathcal {O}}(\delta ^{-1})\) and \({\mathcal {O}}(\delta )\), respectively, while the remaining terms are of \({\mathcal {O}}(\delta ^0)\). We also see from Eq. (5) that the dominant contribution to the Hamiltonian H is given by the potential energy \(e \Phi = {\mathcal {O}}(\delta ^{-1})\) and that the last group of terms proportional to \((mc/e)\mu\) on the right-hand side of Eq. (5) are of \({\mathcal {O}}(\delta )\). It is emphasized that we here assume \(\Phi = {\mathcal {O}}(\delta ^0)\) and accordingly \(\mathbf{E} = {\mathcal {O}}(\delta ^0)\). Therefore, we also have \({\mathbf{V}}_E \equiv (c/B)\mathbf{E} \times \mathbf{b} = {\mathcal {O}}(\delta ^0)\) corresponding to the so-called high-flow ordering which means that the magnitude of the \(\mathbf{E} \times \mathbf{B}\) drift velocity is assumed to be on the order of the thermal or characteristic particle velocity. On the other hand, the \(\mathbf{E} \times \mathbf{B}\) drift velocity is assumed to be on the same order as the diamagnetic drift velocity in the low-flow ordering, which can be characterized by using \(\mathbf{E} = {\mathcal {O}}(\delta )\) and \({\mathbf{V}}_E = {\mathcal {O}}(\delta )\).
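The order counting described above can be mimicked with a few lines of bookkeeping. The following is only an illustrative sketch: the function names and the dictionary of exponents are hypothetical, each quantity is assigned an integer n meaning \({\mathcal {O}}(\delta ^n)\), and spatial gradients are assumed to be of \({\mathcal {O}}(\delta ^0)\).

```python
# Hypothetical bookkeeping helper: an integer n stands for O(delta**n).

def base_orders(high_flow=True):
    """Exponents assumed in the text; E, Phi and V_E are O(delta) in the low-flow case."""
    flow = 0 if high_flow else 1
    return {"e": -1,                      # charge treated as O(delta^-1)
            "m": 0, "c": 0, "B": 0, "A": 0,
            "U": 0, "mu": 0, "W": 0,      # W ~ 1/L, gradients taken as O(delta^0)
            "Phi": flow, "V_E": flow}

def order(orders, numerator, denominator=()):
    """Exponent of a product of named quantities divided by another product."""
    return (sum(orders[q] for q in numerator)
            - sum(orders[q] for q in denominator))

def canonical_momentum_orders(orders):
    """Orders of the three groups of terms on the right-hand side of Eq. (3)."""
    return {
        "(e/c) A": order(orders, ["e", "A"], ["c"]),
        # a sum is dominated by its lowest-order member
        "m (U b + V_E)": min(order(orders, ["m", "U"]),
                             order(orders, ["m", "V_E"])),
        "(mc/e) mu W": order(orders, ["m", "c", "mu", "W"], ["e"]),
    }

def hamiltonian_orders(orders):
    """Orders of the terms in Eq. (5); curl(V_E) is taken to scale like V_E."""
    return {
        "e Phi": order(orders, ["e", "Phi"]),
        "m U^2 / 2": order(orders, ["m", "U", "U"]),
        "mu B": order(orders, ["mu", "B"]),
        "(mc/e) mu b.curl(V_E) / 2": order(orders, ["m", "c", "mu", "V_E"], ["e"]),
    }

high = base_orders(high_flow=True)
low = base_orders(high_flow=False)
gyrofrequency_order = order(high, ["e", "B"], ["m", "c"])
```

With the high-flow assignments, the sketch reproduces the orders quoted in the text for \(\Omega\), for the terms of Eq. (3), and for \(e\Phi\); switching to the low-flow assignments raises the order of every term containing \(\Phi\) or \({\mathbf{V}}_E\) by one.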

The variational principle in Eq. (1) leads to the Euler–Lagrange equation,

$$\begin{aligned} \frac{\text{ d }}{{\text{ d }}t} \left( \frac{\partial L}{\partial \dot{\mathbf{Z}}} \right) - \frac{\partial L}{\partial \mathbf{Z}} = 0 , \end{aligned}$$
(6)

which is rewritten using Eq. (2) as

$$\begin{aligned} \frac{{\text{ d }}{} \mathbf{Z}}{{\text{ d }}t} = \{ \mathbf{Z}, H \} + \{ \mathbf{Z}, \mathbf{X} \} \cdot \frac{e}{c} \frac{\partial \mathbf{A}^*}{\partial t} . \end{aligned}$$
(7)

Here, \(\{ \cdot , \cdot \}\) represents the Poisson bracket and nonvanishing components of Poisson brackets between the guiding center coordinates \(\mathbf{Z} = (\mathbf{X}, U, \mu , \xi )\) are given by

$$\begin{aligned}&\{ \mathbf{X}, \mathbf{X} \} = \frac{c}{e B_\parallel ^*} \mathbf{b} \times \mathbf{I},\quad \{ \mathbf{X}, U \} = \frac{\mathbf{B}^*}{m B_\parallel ^*},\quad \{ \mathbf{X}, \xi \} = \frac{c}{e B_\parallel ^*} \mathbf{b} \times \mathbf{W} , \nonumber \\&\{ U, \xi \} = - \frac{\mathbf{B}^* \cdot \mathbf{W}}{m B_\parallel ^*},\quad \{ \xi , \mu \} = \frac{e}{m c} , \end{aligned}$$
(8)

where

$$\begin{aligned} \mathbf{B}^* = \nabla \times \mathbf{A}^*,\quad B_\parallel ^* = \mathbf{B}^* \cdot \mathbf{b} . \end{aligned}$$
(9)

The guiding center motion equations are derived from Eqs. (7) and (8) as

$$\begin{aligned} \frac{{\text {d}} \mathbf{X}}{{\text {d}}t}&= U \frac{\mathbf{B}^*}{B_\parallel ^*} + \frac{c}{e B_\parallel ^*} \mathbf{b} \times \left( - e \mathbf{E}^* + \frac{1}{2} m \nabla ( V_E^2 ) \right. \nonumber \\& \qquad \left. + \mu \nabla \left[ B + \frac{mc}{e} \left\{ \frac{1}{2} \mathbf{b} \cdot ( \nabla \times {\mathbf{V}}_E ) + \frac{\partial \mathbf{e}_1}{\partial t} \cdot \mathbf{e}_2 \right\} \right] \right) , \nonumber \\ \frac{{\text {d}} U}{{\text {d}}t}&= \frac{\mathbf{B}^*}{m B_\parallel ^*} \cdot \left( e \mathbf{E}^* - \frac{1}{2} m \nabla ( V_E^2 ) \right. \nonumber \\& \qquad \left. - \, \mu \nabla \left[ B + \frac{mc}{e} \left\{ \frac{1}{2} \mathbf{b} \cdot ( \nabla \times {\mathbf{V}}_E ) + \frac{\partial \mathbf{e}_1}{\partial t} \cdot \mathbf{e}_2 \right\} \right] \right) , \nonumber \\ \frac{{\text {d}} \mu }{{\text {d}}t}&= 0 , \nonumber \\ \frac{{\text {d}} \xi }{{\text {d}}t}&= \Omega + \mathbf{W} \cdot \frac{{\text {d}} \mathbf{X}}{{\text {d}}t} + \frac{1}{2} \mathbf{b} \cdot ( \nabla \times {\mathbf{V}}_E ) + \frac{\partial \mathbf{e}_1}{\partial t} \cdot \mathbf{e}_2 , \end{aligned}$$
(10)

where \(\Omega \equiv e B / (m c)\) and

$$\begin{aligned} \mathbf{E}^* \equiv - \nabla \Phi - \frac{1}{c} \frac{\partial \mathbf{A}^*}{\partial t} \end{aligned}$$
(11)

are used. Since the Lagrangian L is independent of the gyrophase \(\xi\) and the magnetic moment \(\mu\) is the conjugate variable to \(\xi\) except for a constant coefficient, the invariance of the magnetic moment \({\text {d}} \mu / {\text {d}}t = 0\) shown in Eq. (10) is derived from Noether’s theorem. As seen from Eq. (10), the equations for time evolutions of the guiding center position vector \(\mathbf{X}\) and its parallel velocity U contain the magnetic moment \(\mu\) as a constant parameter and have no dependence on the gyrophase \(\xi\) so that they can describe the guiding center motion which is completely decoupled from the gyromotion. We also note that the guiding center motion equations for \((\mathbf{X}, U)\) are independent of the gyrogauge.
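A simple limit helps in reading Eq. (10). If the fields are uniform and static, and the triad \((\mathbf{e}_1 , \mathbf{e}_2, \mathbf{b})\) is chosen uniform as well, then \(\mathbf{B}^* = \mathbf{B}\), \(B_\parallel ^* = B\), \(\mathbf{E}^* = \mathbf{E}\), and all gradient terms vanish, so that Eq. (10) reduces to \({\text {d}} \mathbf{X}/{\text {d}}t = U \mathbf{b} + {\mathbf{V}}_E\) and \({\text {d}} U/{\text {d}}t = e E_\parallel / m\). The sketch below evaluates this special case only; the function names are hypothetical and Gaussian units are assumed.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def exb_drift(E, B_vec, c):
    """V_E = (c/B) E x b in Gaussian units."""
    B = math.sqrt(dot(B_vec, B_vec))
    b = tuple(x / B for x in B_vec)
    return tuple(c / B * x for x in cross(E, b))

def uniform_field_rhs(U, E, B_vec, e, m, c):
    """Eq. (10) restricted to uniform, static E and B (an assumed special case).

    Returns (dX/dt, dU/dt); d(mu)/dt = 0 holds identically and is not returned.
    """
    B = math.sqrt(dot(B_vec, B_vec))
    b = tuple(x / B for x in B_vec)
    minus_eE = tuple(-e * Ei for Ei in E)
    # (c / (e B)) b x (-e E): the only surviving perpendicular term
    perp = tuple(c / (e * B) * x for x in cross(b, minus_eE))
    dXdt = tuple(U * bi + vi for bi, vi in zip(b, perp))
    dUdt = e * dot(b, E) / m
    return dXdt, dUdt

# Example with arbitrary illustrative numbers
dXdt, dUdt = uniform_field_rhs(U=3.0, E=(1.0, 0.0, 0.5), B_vec=(0.0, 0.0, 2.0),
                               e=1.0, m=1.0, c=1.0)
```

In this limit the perpendicular part of \({\text {d}} \mathbf{X}/{\text {d}}t\) coincides with the \(\mathbf{E} \times \mathbf{B}\) drift computed by `exb_drift`, and neither equation involves \(\xi\).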

As explained by Littlejohn (1981), the components of the electric field in the directions parallel and perpendicular to the magnetic field are assumed to be of different orders in \(\delta\) such that \(E_\parallel \equiv \mathbf{E}\cdot \mathbf{b} = {\mathcal {O}}(\delta )\) and \(\mathbf{E}_\perp \equiv \mathbf{E} - E_\parallel \mathbf{b} = {\mathcal {O}}(\delta ^0)\). Combining this assumption with Eq. (10) leads to \({\text {d}}U/{\text {d}}t = {\mathcal {O}}(\delta ^0)\). If \(E_\parallel\) were of \({\mathcal {O}}(\delta ^0)\), we would have \({\text {d}}U/{\text {d}}t = {\mathcal {O}}(\delta ^{-1})\), which represents extreme particle acceleration along the field line; such an exceptional case is not considered in this article.

The relations of the guiding center variables \(\mathbf{Z} \equiv (\mathbf{X}, U, \mu , \xi )\) to the particle position and velocity vectors \((\mathbf{x}, {\mathbf{v}})\) are given by Littlejohn (1981) as

$$\begin{aligned} \mathbf{X}&= \mathbf{x} - {\varvec{\rho }} + \frac{v'_\perp }{\Omega ^2} \left[ \left\{ \mathbf{b} \cdot [ \nabla \times ( v_\parallel \mathbf{b} + {\mathbf{V}}_E ) ] - \frac{v'_\perp }{2 B} \mathbf{a} \cdot \nabla B \right\} \mathbf{a} \right. \nonumber \\& \qquad + \left\{ \mathbf{c} \cdot \left( \mathbf{b} \cdot \nabla {\mathbf{V}}_E + {\mathbf{V}}_E \cdot \nabla \mathbf{b} + \frac{\partial \mathbf{b}}{\partial t} + 2 v_\parallel \mathbf{b} \cdot \nabla \mathbf{b} \right) \right. \nonumber \\& \qquad \left. \left. +\frac{v'_\perp }{8} \left( \mathbf{c} \cdot \nabla \mathbf{b} \cdot \mathbf{c} - 5 \mathbf{a} \cdot \nabla \mathbf{b} \cdot \mathbf{a} \right) \right\} \mathbf{b} \right] + {\mathcal {O}}(\delta ^3) , \nonumber \\ U& = v_\parallel - \frac{v'_\perp }{\Omega } \left[ v_\parallel \mathbf{b} \cdot \nabla \mathbf{b} \cdot \mathbf{a} - \mathbf{c} \cdot ( \nabla \times {\mathbf{V}}_E ) +\frac{v'_\perp }{4} \left( 3 \mathbf{a} \cdot \nabla \mathbf{b} \cdot \mathbf{c} - \mathbf{c} \cdot \nabla \mathbf{b} \cdot \mathbf{a} \right) \right] \nonumber \\& \qquad + {\mathcal {O}}(\delta ^2) , \nonumber \\ \mu&= \frac{m (v'_\perp )^2}{2 B} + \frac{m (v'_\perp )^2}{B \Omega } \left[ \frac{\mathbf{a}}{v'_\perp } \cdot \left( \frac{\partial }{\partial t} + ( v_\parallel \mathbf{b} + {\mathbf{V}}_E ) \cdot \nabla \right) ( v_\parallel \mathbf{b} + {\mathbf{V}}_E ) \right. \nonumber \\& \qquad +\frac{1}{4} \left( 3 \mathbf{a} \cdot \nabla {\mathbf{V}}_E \cdot \mathbf{c} - \mathbf{c} \cdot \nabla {\mathbf{V}}_E \cdot \mathbf{a} \right) +\frac{v_\parallel }{4} \left( 3 \mathbf{a} \cdot \nabla \mathbf{b} \cdot \mathbf{c} - \mathbf{c} \cdot \nabla \mathbf{b} \cdot \mathbf{a} \right) \nonumber \\&\left. 
\qquad + \frac{v'_\perp }{2 B} \mathbf{a} \cdot \nabla B \right] + {\mathcal {O}}(\delta ^2) , \nonumber \\ \xi&= \xi _0 + \frac{1}{\Omega } \left[ \frac{\mathbf{c}}{v'_\perp } \cdot \left( \frac{\partial }{\partial t} + ( v_\parallel \mathbf{b} + {\mathbf{V}}_E ) \cdot \nabla \right) ( v_\parallel \mathbf{b} + {\mathbf{V}}_E ) \right. \nonumber \\& \qquad +\frac{1}{4} \left( \mathbf{c} \cdot \nabla {\mathbf{V}}_E \cdot \mathbf{c} - \mathbf{a} \cdot \nabla {\mathbf{V}}_E \cdot \mathbf{a} \right) +\frac{v_\parallel }{4} \left( \mathbf{c} \cdot \nabla \mathbf{b} \cdot \mathbf{c} - \mathbf{a} \cdot \nabla \mathbf{b} \cdot \mathbf{a} \right) \nonumber \\& \qquad \left. + \, v'_\perp \left( \mathbf{a} \cdot \frac{\nabla B}{B} - \mathbf{a} \cdot \nabla \mathbf{c} \cdot \mathbf{a} \right) \right] + {\mathcal {O}}(\delta ^2) , \end{aligned}$$
(12)

where \(v_\parallel \equiv {\mathbf{v}} \cdot \mathbf{b}\) represents the parallel component of the particle velocity vector and \({\mathbf{v}}'_\perp \equiv (\mathbf{b} \times {\mathbf{v}}' ) \times \mathbf{b}\) is the perpendicular component of the particle velocity \({\mathbf{v}}' \equiv {\mathbf{v}} - {\mathbf{V}}_E\) observed from the moving frame with the \(\mathbf{E} \times \mathbf{B}\) drift velocity \({\mathbf{V}}_E\). The unit vectors \(\mathbf{a}\) and \(\mathbf{c}\) are defined by \(\mathbf{a} \equiv \mathbf{b} \times \mathbf{c}\) and \(\mathbf{c} \equiv {\mathbf{v}}'_\perp / v'_\perp\), respectively. The leading-order gyroradius vector is denoted by \({\varvec{\rho }} \equiv \mathbf{b} \times {\mathbf{v}}' / \Omega\). It should be noted that the magnetic field \(\mathbf{B}\) must be regarded as a function of the particle position \(\mathbf{x}\) and the time t when using \(\mathbf{B}\) to evaluate \(v_\parallel\), \({\mathbf{v}}'_\perp\), \({\mathbf{V}}_E\), \({\varvec{\rho }}\), and other variables appearing on the right-hand side of Eq. (12). On the right-hand side of the equation for \(\xi\) in Eq. (12), \(\xi _0\) is defined from \({\mathbf{v}}'_\perp\) as

$$\begin{aligned} {\mathbf{v}}'_\perp = - v'_\perp [ \sin \xi _0 \, \mathbf{e}_1 + \cos \xi _0 \, \mathbf{e}_2 ] , \end{aligned}$$
(13)

where the unit vectors \(\mathbf{e}_1\) and \(\mathbf{e}_2\) perpendicular to the magnetic field \(\mathbf{B}\) are regarded as functions of \((\mathbf{x}, t)\).
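Equation (13) can be inverted for the zeroth-order gyrophase: taking the inner products with \(\mathbf{e}_1\) and \(\mathbf{e}_2\) gives \(\sin \xi _0 = - {\mathbf{v}}'_\perp \cdot \mathbf{e}_1 / v'_\perp\) and \(\cos \xi _0 = - {\mathbf{v}}'_\perp \cdot \mathbf{e}_2 / v'_\perp\). The following is a minimal sketch of this inversion and of the unit vectors \(\mathbf{a}\) and \(\mathbf{c}\); the function names are hypothetical, and an orthonormal right-handed triad is assumed to be supplied by the caller.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def perpendicular_part(v, b):
    """(b x v) x b: the component of v perpendicular to the unit vector b."""
    return cross(cross(b, v), b)

def gyrophase(v_perp, e1, e2):
    """xi_0 from Eq. (13): v_perp = -|v_perp| (sin(xi_0) e1 + cos(xi_0) e2)."""
    return math.atan2(-dot(v_perp, e1), -dot(v_perp, e2))

def gyro_unit_vectors(v_perp, b):
    """Return (a, c) with c = v_perp / |v_perp| and a = b x c."""
    speed = math.sqrt(dot(v_perp, v_perp))
    c_hat = tuple(x / speed for x in v_perp)
    a_hat = cross(b, c_hat)
    return a_hat, c_hat

# Round trip with an arbitrary illustrative gyrophase
e1, e2, b = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)
xi0 = 0.7
v_prime = (-2.0 * math.sin(xi0), -2.0 * math.cos(xi0), 5.0)  # v - V_E, assumed given
v_perp = perpendicular_part(v_prime, b)
xi0_recovered = gyrophase(v_perp, e1, e2)
a_hat, c_hat = gyro_unit_vectors(v_perp, b)
```

With this sign convention one finds \(\mathbf{a} = \cos \xi _0 \, \mathbf{e}_1 - \sin \xi _0 \, \mathbf{e}_2\), so that \((\mathbf{a}, \mathbf{b}, \mathbf{c})\) are mutually orthogonal.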

2.2 Guiding center equations for axisymmetric fields

We here treat the guiding center motion in the axisymmetric magnetic field,

$$\begin{aligned} \mathbf{B} = I \nabla \zeta + \nabla \zeta \times \nabla \chi . \end{aligned}$$
(14)

We consider toroidal nested surfaces formed by the magnetic field lines and use the flux-surface coordinates \((s, \theta , \zeta )\) where \(\theta\) and \(\zeta\) represent the poloidal and toroidal angles, respectively, and s is an arbitrary label for a flux surface. Then, I and \(\chi\) on the right-hand side of Eq. (14) are flux-surface functions, which are independent of \(\theta\) and \(\zeta\), and the poloidal flux within a flux surface labeled by s is given by \(2\pi \chi (s)\).

The electromagnetic fields are assumed to slowly evolve and we use \(\partial /\partial t = {\mathcal {O}}(\delta ^2)\) which corresponds to the transport time scale ordering (Helander and Sigmar 2002). Then, we obtain \(- c^{-1} \partial \mathbf{A}/ \partial t = {\mathcal {O}}(\delta ^2)\) and write the electric field as \(\mathbf{E} = - \nabla \Phi + {\mathcal {O}}(\delta ^2)\). Consequently, the \(\mathbf{E} \times \mathbf{B}\) drift velocity \({\mathbf{V}}_E \equiv (c/B) \mathbf{E} \times \mathbf{b}\) included in Eq. (3) can be replaced by \({\mathbf{V}}_E \equiv - (c/B) \nabla \Phi \times \mathbf{b}\) with no influence on the definition of the canonical momentum \(\mathbf{P}^c \equiv (e/c) \mathbf{A}^*\) up to \({\mathcal {O}}(\delta )\) and no need to change resultant equations shown in Sect. 2.1.

Recall that \(E_\parallel = {\mathcal {O}}(\delta )\) is assumed in Sect. 2.1 to avoid the extreme parallel acceleration of the particle. The electric field is also assumed to be axisymmetric. Thus, the electrostatic potential is written as \(\Phi = \Phi _0 (s, t) + \Phi _1(s, \theta , t)\) where the zeroth-order potential \(\Phi _0 (s, t)\) satisfies \(\mathbf{b}\cdot \nabla \Phi _0 =0\) and the first-order potential \(\Phi _1(s, \theta , t)\) makes a dominant contribution to the parallel component of the electric field as \(E_\parallel = - \mathbf{b} \cdot \nabla \Phi _1 +{\mathcal {O}}(\delta ^2)\), consistently with \(\mathbf{E} = - \nabla \Phi + {\mathcal {O}}(\delta ^2)\). However, \(\Phi _1\) is neglected in this subsection for simplicity and the \(\mathbf{E} \times \mathbf{B}\) drift velocity is written as

$$\begin{aligned} {\mathbf{V}}_E = - \frac{c}{B} \nabla \Phi _0 \times \mathbf{b} . \end{aligned}$$
(15)

Effects of \(\Phi _1\) are included together with those of turbulent electromagnetic fields from the next subsection.

Under the axisymmetric magnetic field \(\mathbf{B}\), the unit vectors \(\mathbf{e}_1\) and \(\mathbf{e}_2\) perpendicular to \(\mathbf{B}\) can also be chosen so as to be axisymmetric. Then, the covariant toroidal component of \(\mathbf{R} \equiv (\nabla \mathbf{e}_1) \cdot \mathbf{e}_2\) is written as

$$\begin{aligned} R_\zeta = \mathbf{e}_\zeta \cdot \mathbf{R} = (\mathbf{e}_1 \times \hat{\mathbf{z}}) \cdot \mathbf{e}_2 = - \mathbf{b} \cdot \hat{\mathbf{z}} = - \frac{\nabla R \cdot \nabla \chi }{RB} , \end{aligned}$$
(16)

where

$$\begin{aligned} \mathbf{e}_\zeta = \frac{\partial \mathbf{X}(R, z, \zeta )}{\partial \zeta } = R^2 \nabla \zeta \end{aligned}$$
(17)

is the covariant basis vector in the toroidal direction, \((R, z, \zeta )\) represent the right-handed cylindrical spatial coordinates, and

$$\begin{aligned} \hat{\mathbf{z}} = R \nabla \zeta \times \nabla R \end{aligned}$$
(18)

is the unit vector parallel to the z-axis. Using Eqs. (4), (14), and (16)–(18), the covariant toroidal component of \(\mathbf{W}\) is written as

$$\begin{aligned} W_\zeta = \mathbf{e}_\zeta \cdot \mathbf{W} = - \frac{\nabla R \cdot \nabla \chi }{RB} + \frac{I}{2B} \mathbf{b} \cdot ( \nabla \times \mathbf{b} ) . \end{aligned}$$
(19)
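The geometric relations in Eqs. (16)–(19) can be spot-checked numerically. The sketch below is illustrative only: it assumes the Cartesian embedding \(x = R \cos \zeta\), \(y = - R \sin \zeta\), which is one way to make \((R, z, \zeta )\) right-handed, and it takes \(\nabla \chi\) as an arbitrary vector in the poloidal plane; the function name and the sample numbers are hypothetical.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def toroidal_geometry(R, zeta, I, dchi_dR, dchi_dz):
    """Evaluate both sides of selected relations from Eqs. (14) and (16)-(19)."""
    grad_R = (math.cos(zeta), -math.sin(zeta), 0.0)
    z_hat = (0.0, 0.0, 1.0)
    # e_zeta = dX/dzeta for the assumed embedding; grad(zeta) = e_zeta / R^2
    e_zeta = (-R * math.sin(zeta), -R * math.cos(zeta), 0.0)
    grad_zeta = tuple(x / R**2 for x in e_zeta)
    grad_chi = tuple(dchi_dR * gr + dchi_dz * zh
                     for gr, zh in zip(grad_R, z_hat))
    # Eq. (14): B = I grad(zeta) + grad(zeta) x grad(chi)
    B_vec = tuple(I * gz + w
                  for gz, w in zip(grad_zeta, cross(grad_zeta, grad_chi)))
    B = math.sqrt(dot(B_vec, B_vec))
    b = tuple(x / B for x in B_vec)
    return {
        "z_hat_eq18": tuple(R * x for x in cross(grad_zeta, grad_R)),
        "b_dot_z": dot(b, z_hat),
        "eq16_rhs": dot(grad_R, grad_chi) / (R * B),
        "b_zeta": dot(e_zeta, b),      # covariant toroidal component of b
        "I_over_B": I / B,
        "handedness": dot(cross(grad_R, z_hat), grad_zeta),
    }

geom = toroidal_geometry(R=2.0, zeta=0.3, I=1.5, dchi_dR=0.4, dchi_dz=-0.7)
```

The check confirms that \(R \nabla \zeta \times \nabla R\) is the unit vector along z, that \(\mathbf{b} \cdot \hat{\mathbf{z}} = \nabla R \cdot \nabla \chi / (RB)\), and that \(\mathbf{e}_\zeta \cdot \mathbf{b} = I/B\), which is the factor multiplying \(\mathbf{b} \cdot ( \nabla \times \mathbf{b} )/2\) in Eq. (19).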

The gyrogauge-dependent term \(\mathbf{W}\) included in the canonical momentum \(\mathbf{P}^c \equiv (e/c) \mathbf{A}^*\) [see Eq. (3)] is often neglected in the literature on the guiding center drift motion because it is regarded as a small correction as well as a troublesome term to calculate practically under the nonuniform magnetic field. However, it is numerically verified by Belova et al. (2003) that, in axisymmetric geometry, the toroidal canonical momentum \(P^c_\zeta \equiv \mathbf{P}^c \cdot \mathbf{e}_\zeta\) including the term \(-(mc/e)\mu W_\zeta\) is conserved with high accuracy. We now carefully examine effects of the \(\mathbf{W}\) term on the guiding center motion equations in Eq. (10). The equation for the guiding center velocity \({\text {d}} \mathbf{X}/{\text {d}}t\) is not influenced up to \({\mathcal {O}}(\delta )\) by \(\mathbf{W}\). On the other hand, the equation for the parallel acceleration \({\text {d}} U/{\text {d}}t\) contains a \(\mathbf{W}\)-dependent term of \({\mathcal {O}}(\delta )\) which is derived from Eq. (10) with Eqs. (3) and (9) as

$$\begin{aligned} \mu \frac{c^2}{e B} (\nabla \times \mathbf{W} ) \cdot \nabla \Phi _0 = \mu \frac{c^2}{e B} \frac{\partial \Phi _0}{\partial \chi } \mathbf{B} \cdot \nabla W_\zeta . \end{aligned}$$
(20)

The magnitude of the term shown above is on the same order as another term proportional to \(\mathbf{b} \cdot (\nabla \times {\mathbf{V}}_E )\) in Eq. (10) for \({\text {d}}U/{\text {d}}t\) and the turbulent parallel acceleration term \(- e (\mathbf{B}^* / (m B_\parallel ^* )) \cdot \nabla \Psi\) which is added later in Eq. (55). Thus, in order to use the equation for \({\text {d}}U/{\text {d}}t\) which is correct up to \({\mathcal {O}} (\delta )\) consistently even in the case with turbulent fluctuations, the parallel acceleration due to the \(W_\zeta\) given in Eq. (20) should be retained. We now modify the definitions of the canonical momentum in Eq. (3) and the Hamiltonian in Eq. (5) by

$$\begin{aligned} \mathbf{P}^c \equiv \frac{e}{c} \mathbf{A}^* \equiv \frac{e}{c} \mathbf{A} + m ( U \mathbf{b} + {\mathbf{V}}_E ) , \end{aligned}$$
(21)

and

$$\begin{aligned} H = e \Phi _0 + \frac{1}{2} m ( U^2 + V_E^2) + \mu B + \frac{m c}{e} \mu \left[ \frac{1}{2} \mathbf{b} \cdot ( \nabla \times {\mathbf{V}}_E ) - c W_\zeta \frac{\partial \Phi _0}{\partial \chi } \right] . \end{aligned}$$
(22)

Note that \(\mathbf{W}\) disappears from \(\mathbf{P}^c \equiv (e/c) \mathbf{A}^*\) in Eq. (21) although the new additional term proportional to \(W_\zeta\) appears in H in Eq. (22) where \((\partial \mathbf{e}_1 / \partial t ) \cdot \mathbf{e}_2\) of Eq. (5) is neglected as a higher order small term because of the transport time scale ordering \(\partial /\partial t = {\mathcal {O}}(\delta ^2)\). Then, substituting Eqs. (21) and (22) into Eq. (2) defines a modified Lagrangian which is used to derive the guiding center motion equations from the variational principle. The resultant equations for \({\text {d}} \mathbf{X}/{\text {d}}t\) and \({\text {d}}U/{\text {d}}t\) are found to agree with those in Eq. (10) up to \({\mathcal {O}}(\delta )\) while \(\mathbf{W}\) included in the latter equations through \(\mathbf{B}^* \equiv \nabla \times \mathbf{A}^*\) is eliminated in the former. Because of simplicity owing to this elimination, we hereafter use the modified Lagrangian described above as a basis for treating the guiding center motion in the axisymmetric magnetic field \(\mathbf{B}\). Thus, the correction term due to \(W_\zeta\) for the definition of the toroidal canonical momentum \(P^c_\zeta\) conserved in axisymmetric geometry is neglected here although the \({\mathcal {O}}(\delta )\) effect of \(W_\zeta\) on the guiding center motion equations can still be kept in the present formulation.

We now define the parallel velocity \(V_\parallel\) by

$$\begin{aligned} V_\parallel \equiv - \frac{cI}{B} \frac{\partial \Phi _0}{\partial \chi } , \end{aligned}$$
(23)

and combine it with the \(\mathbf{E}\times \mathbf{B}\) drift velocity \({\mathbf{V}}_E\) in Eq. (15) to define the toroidal velocity,

$$\begin{aligned} {\mathbf{V}}_0 \equiv V_\parallel \mathbf{b} + {\mathbf{V}}_E = V^\zeta \mathbf{e}_\zeta , \end{aligned}$$
(24)

where the contravariant toroidal velocity component is represented by

$$\begin{aligned} V^\zeta \equiv {\mathbf{V}}_0 \cdot \nabla \zeta = - c \frac{\partial \Phi _0}{\partial \chi } . \end{aligned}$$
(25)
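The last equality in Eq. (25) can be checked directly from Eqs. (14), (15), (23), and (24). The following short derivation is a sketch added for convenience; it uses only \(\nabla \Phi _0 = (\partial \Phi _0 / \partial \chi ) \nabla \chi\), \(|\nabla \zeta | = 1/R\), and \(\nabla \zeta \cdot \nabla \chi = 0\), which hold for the axisymmetric field in Eq. (14):

$$\begin{aligned} \mathbf{b} \cdot \nabla \zeta&= \frac{I}{R^2 B} , \quad ( \nabla \chi \times \mathbf{b} ) \cdot \nabla \zeta = \frac{\mathbf{B} \cdot ( \nabla \zeta \times \nabla \chi )}{B} = \frac{|\nabla \chi |^2}{R^2 B} , \quad B^2 = \frac{I^2 + |\nabla \chi |^2}{R^2} , \nonumber \\ {\mathbf{V}}_0 \cdot \nabla \zeta&= - \frac{c I}{B} \frac{\partial \Phi _0}{\partial \chi } \frac{I}{R^2 B} - \frac{c}{B} \frac{\partial \Phi _0}{\partial \chi } \frac{|\nabla \chi |^2}{R^2 B} = - c \frac{\partial \Phi _0}{\partial \chi } \frac{I^2 + |\nabla \chi |^2}{R^2 B^2} = - c \frac{\partial \Phi _0}{\partial \chi } . \end{aligned}$$

The parallel velocity \(V_\parallel\) in Eq. (23) is thus exactly the amount needed to make the toroidal angular velocity \(V^\zeta\) a flux-surface function.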

Now, the last term added in Eq. (22) to obtain the correct parallel acceleration up to \({\mathcal {O}}(\delta )\) is written as \((m c/e) \mu W_\zeta V^\zeta\) which can be intuitively given by performing the partial replacement in the fundamental 1-form \(\gamma = L \; d t = \mathbf{P}^c \cdot d \mathbf{X} + (m c/e) \mu \; d\xi - H \; dt\) as

$$\begin{aligned} - \frac{mc}{e} \mu \mathbf{W} \cdot d \mathbf{X} \quad \longrightarrow \quad - \frac{mc}{e} \mu W_\zeta V^\zeta \; dt , \end{aligned}$$
(26)

where \(d \mathbf{X}\) is replaced by \(V^\zeta \mathbf{e}_\zeta \; dt\) to shift the effect of the first-order part \(- (m c/e) \mu W_\zeta\) from the canonical toroidal angular momentum \(P^c_\zeta\) to the first-order part of the Hamiltonian H. We also define a new guiding center variable \(U{^\prime }\) by

$$\begin{aligned} U{^\prime } \equiv U - V_\parallel , \end{aligned}$$
(27)

which represents the parallel velocity of the guiding center motion observed from the rotating frame with the velocity \({\mathbf{V}}_0\).

We hereafter use the guiding center coordinates \(\mathbf{Z}{^\prime } \equiv (\mathbf{X}, U{^\prime }, \mu , \xi )\) instead of \(\mathbf{Z} \equiv (\mathbf{X}, U, \mu , \xi )\) although \('\) is omitted for convenience. Now, the Lagrangian takes the same form as in Eq. (2),

$$\begin{aligned} L = \mathbf{P}^c \cdot \dot{\mathbf{X}} + \frac{m c}{e} \mu \dot{\xi } - H \end{aligned}$$
(28)

although the canonical momentum and the Hamiltonian in Eqs. (21) and (22) are now rewritten as

$$\begin{aligned} \mathbf{P}^c \equiv \frac{e}{c} \mathbf{A}^* \equiv \frac{e}{c} \mathbf{A} + m ( U \mathbf{b} + {\mathbf{V}}_0 ) , \end{aligned}$$
(29)

and

$$\begin{aligned} H \equiv e \Phi _0 + \frac{1}{2} m |U \mathbf{b} + {\mathbf{V}}_0|^2 + \mu B + H_1^V , \end{aligned}$$
(30)

respectively, where the \({\mathcal {O}}(\delta )\) term \(H_1^V\) is given by

$$\begin{aligned} H_1^V \equiv \frac{m c}{e} \mu \left[ \frac{1}{2} \mathbf{b} \cdot ( \nabla \times {\mathbf{V}}_E ) + W_\zeta V^\zeta \right] = - \frac{\mu }{\Omega } \left( \frac{2 V^\zeta }{R} \nabla R \cdot \nabla \chi + \frac{|\nabla \chi |^2}{2} \frac{\partial V^\zeta }{\partial \chi } \right) . \end{aligned}$$
(31)

When Eq. (29) is used for the Lagrangian in Eq. (28), the nonvanishing components of Poisson brackets between the guiding center coordinates are given by

$$\begin{aligned} \{ \mathbf{X}, \mathbf{X} \} = \frac{c}{e B_\parallel ^*} \mathbf{b} \times \mathbf{I} , \; \; \{ \mathbf{X}, U \} = \frac{\mathbf{B}^*}{m B_\parallel ^*} , \; \; \{ \xi , \mu \} = \frac{e}{m c} , \end{aligned}$$
(32)

where \(\mathbf{B}^* \equiv \nabla \times \mathbf{A}^*\) is evaluated by using \(\mathbf{A}^*\) defined in Eq. (29). We see that the number of the nonvanishing components of Poisson brackets in Eq. (32) is reduced from that in Eq. (8) because the \(\mathbf{W}\) term is removed from the canonical momentum. Then, the guiding center motion equations are derived from Eqs. (7), (30), and (32). Their detailed expressions are not shown here because the motion equations for the more general case including perturbed electromagnetic fields are given in the next subsection.

2.3 Gyrocenter equations for toroidally rotating plasmas

We here suppose that the electromagnetic perturbations are added into the system considered in the previous subsection. For the present case, the Lagrangian is written in terms of the particle phase-space coordinates \(\mathbf{z} \equiv (\mathbf{x}, {\mathbf{v}})\) as

$$\begin{aligned} L = \left[ m {\mathbf{v}} + \frac{e}{c} ( \mathbf{A}_0 + \mathbf{A}_1) \right] \cdot \dot{\mathbf{x}} - \left[ \frac{1}{2} m |{\mathbf{v}}|^2 + e ( \Phi _0 + \phi _1 ) \right] , \end{aligned}$$
(33)

where \(\phi _1\) and \(\mathbf{A}_1\) represent the perturbation fields added into the electrostatic potential and the vector potential, respectively. Defining the modified velocity vector \({\mathbf{v}}_c\) by

$$\begin{aligned} {\mathbf{v}}_c \equiv {\mathbf{v}} + \frac{e}{m c} \mathbf{A}_1 , \end{aligned}$$
(34)

the Lagrangian is rewritten in terms of the modified particle coordinates \(\mathbf{z}_c \equiv (\mathbf{x}, {\mathbf{v}}_c)\) as

$$\begin{aligned} L = \left( m {\mathbf{v}}_c + \frac{e}{c} \mathbf{A}_0 \right) \cdot \dot{\mathbf{x}} - \left[ \frac{1}{2} m |{\mathbf{v}}_c|^2 + e \Phi _0 + e \left( \phi _1 - \frac{{\mathbf{v}}_c}{c}\cdot \mathbf{A}_1 \right) + \frac{e^2 }{2m c^2} |\mathbf{A}_1|^2 \right] . \end{aligned}$$
(35)

Comparing Eq. (35) with Eq. (33) and identifying \({\mathbf{v}}_c\) with \({\mathbf{v}}\), the perturbations can formally be regarded as included only through the perturbation Hamiltonian,

$$\begin{aligned} h_1 \equiv e \left( \phi _1 - \frac{{\mathbf{v}}_c}{c}\cdot \mathbf{A}_1 \right) + \frac{e^2 }{2m c^2} |\mathbf{A}_1|^2 . \end{aligned}$$
(36)
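The equivalence of Eqs. (33) and (35) under the substitution in Eq. (34) is a short algebraic identity, and it can be confirmed with random numbers. The sketch below assumes arbitrary values for \(m\), \(e\), \(c\), the fields, and the velocities; the function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
m, e, c = 1.3, 0.7, 2.0                       # arbitrary sample constants
v, xdot, A0, A1 = rng.normal(size=(4, 3))     # random 3-vectors
Phi0, phi1 = rng.normal(size=2)               # random scalar potentials

def lagrangian_particle(v):
    """Eq. (33): Lagrangian in terms of the particle velocity v."""
    return (m * v + e / c * (A0 + A1)) @ xdot - (0.5 * m * v @ v + e * (Phi0 + phi1))

def lagrangian_modified(v_c):
    """Eq. (35): Lagrangian in terms of v_c, with h_1 taken from Eq. (36)."""
    h1 = e * (phi1 - v_c @ A1 / c) + e**2 / (2 * m * c**2) * (A1 @ A1)
    return (m * v_c + e / c * A0) @ xdot - (0.5 * m * v_c @ v_c + e * Phi0 + h1)

v_c = v + e / (m * c) * A1                    # Eq. (34)
L33 = lagrangian_particle(v)
L35 = lagrangian_modified(v_c)
print("difference:", L33 - L35)
```

The two values agree to rounding error, which reflects that the \(|\mathbf{A}_1|^2\) term in Eq. (36) exactly compensates the quadratic term generated by expanding \(|{\mathbf{v}}_c|^2\).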

The unperturbed magnetic field is axisymmetric, and as in Eq. (14), it is written as

$$\begin{aligned} \mathbf{B}_0 = \nabla \times \mathbf{A}_0 = I \nabla \zeta + \nabla \zeta \times \nabla \chi . \end{aligned}$$
(37)

We now use Eq. (12), in which \({\mathbf{v}}\) is replaced by \({\mathbf{v}}_c\), and combine it with Eq. (27) to define the guiding center coordinates \(\mathbf{Z} \equiv (\mathbf{X}, U, \mu , \xi )\), the nonvanishing Poisson brackets of which are given by Eq. (32). Then, the Lagrangian in Eq. (35) is written in the same form as in Eq. (28) where the canonical momentum \(\mathbf{P}^c\) also takes the same form as in Eq. (29) but the Hamiltonian H is given by Eq. (30) plus the perturbation Hamiltonian \(h_1\) in Eq. (36). The Hamiltonian H is no longer independent of the gyrophase \(\xi\) because the perturbation fields \(\phi _1\) and \(\mathbf{A}_1\) included in \(h_1\) are now regarded as functions of the particle position vector \(\mathbf{x} = \mathbf{X} + {\varvec{\rho }}\) where \({\varvec{\rho }}\) depends on \(\xi\). In the presence of the large \(\mathbf{E}\times \mathbf{B}\) drift \({\mathbf{V}}_E\) due to the zeroth-order fields \(\Phi _0\) and \(\mathbf{B}_0\),

$$\begin{aligned} {\mathbf{V}}_E \equiv - \frac{c}{B_0^2} (\nabla \Phi _0 \times \mathbf{B}_0) , \end{aligned}$$
(38)

we use the velocity

$$\begin{aligned} {\mathbf{v}}'_c \equiv {\mathbf{v}}_c - {\mathbf{V}}_E , \end{aligned}$$
(39)

to represent the leading order of the gyroradius vector by

$$\begin{aligned} {\varvec{\rho }} \equiv \frac{\mathbf{b} \times {\mathbf{v}}'_c }{\Omega _0} , \end{aligned}$$
(40)

where \(\mathbf{b} \equiv \mathbf{B}_0/B_0\) and \(\Omega _0 = e B_0 / (m c)\).
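As a small illustration of Eqs. (38)–(40), the leading-order gyroradius vector can be evaluated for sample inputs. This is a sketch only: the function name `gyroradius` and all numerical values are assumptions, and Cartesian components with \(\mathbf{B}_0\) along the third axis are used for convenience.

```python
import numpy as np

def gyroradius(v_c, V_E, B0_vec, e, m, c):
    """Leading-order gyroradius vector rho = b x (v_c - V_E) / Omega_0."""
    B0 = np.linalg.norm(B0_vec)
    b = B0_vec / B0
    Omega0 = e * B0 / (m * c)            # gyrofrequency
    v_rel = v_c - V_E                    # Eq. (39)
    return np.cross(b, v_rel) / Omega0   # Eq. (40)

B0_vec = np.array([0.0, 0.0, 2.0])
V_E = np.array([0.0, 0.2, 0.0])          # perpendicular to B0, as an E x B drift must be
v_c = np.array([1.0, 0.0, 0.5])
rho = gyroradius(v_c, V_E, B0_vec, e=1.0, m=1.0, c=1.0)
print("rho =", rho)
```

The result is perpendicular to \(\mathbf{b}\), and its length equals the perpendicular speed measured relative to \({\mathbf{V}}_E\) divided by \(\Omega _0\).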

In the same way as in Eq. (13), the gyrophase \(\xi\) is defined from \(({\mathbf{v}}'_c)_\perp \equiv ( \mathbf{b} \times {\mathbf{v}}'_c ) \times \mathbf{b}\) as

$$\begin{aligned} ({\mathbf{v}}'_c)_\perp = - [2 \mu B_0 (\mathbf{X}, t) / m]^{1/2} [ \sin \xi \, \mathbf{e}_1 (\mathbf{X}, t) + \cos \xi \, \mathbf{e}_2 (\mathbf{X}, t) ] . \end{aligned}$$
(41)

Here, the component of an arbitrary vector perpendicular to the equilibrium field \(\mathbf{B}_0\) is denoted by the subscript \(\perp\). Hereafter, the gyrophase-average and gyrophase-dependent parts of an arbitrary periodic gyrophase function \(Q(\xi )\) are denoted by

$$\begin{aligned} \langle Q \rangle _\xi \equiv \frac{1}{2\pi } \oint Q(\xi ) \, d\xi \quad \text{ and } \quad \widetilde{Q} \equiv Q - \langle Q \rangle _\xi , \end{aligned}$$
(42)

respectively, and we define the potential field variable \(\psi\) by

$$\begin{aligned} \psi \equiv \phi _1 ( \mathbf{X} + {\varvec{\rho }}, t) - \frac{1}{c} [ {\mathbf{V}}_0 + U \mathbf{b} +({\mathbf{v}}'_c)_\perp ] \cdot \mathbf{A}_1 (\mathbf{X} + {\varvec{\rho }}, t) . \end{aligned}$$
(43)
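The decomposition in Eq. (42) is straightforward to evaluate numerically. The following sketch assumes that \(Q\) is sampled on a uniform periodic gyrophase grid, for which the plain sample mean is a highly accurate quadrature rule; the function name `gyro_split` and the test function are hypothetical.

```python
import numpy as np

def gyro_split(Q, n=64):
    """Return the grid, the gyrophase average <Q>, and the oscillating part Q~."""
    xi = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    values = Q(xi)
    mean = values.mean()          # uniform periodic rule for (1/2pi) * closed integral
    return xi, mean, values - mean

xi, Q_avg, Q_tilde = gyro_split(lambda s: 3.0 + np.cos(s) + 0.5 * np.sin(2.0 * s))
print("average:", Q_avg, " mean of oscillating part:", Q_tilde.mean())
```

By construction, the oscillating part has zero gyrophase average, which is the property used when solving for the generating function.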

As shown in Eqs. (23)–(25), the velocity vector \({\mathbf{V}}_0\) is given by

$$\begin{aligned} {\mathbf{V}}_0 \equiv V_\parallel \mathbf{b} + {\mathbf{V}}_E = V^\zeta \mathbf{e}_\zeta , \end{aligned}$$
(44)

where \(V_\parallel \equiv - (c I/B_0) ( \partial \Phi _0 / \partial \chi )\), \(V^\zeta \equiv {\mathbf{V}}_0 \cdot \nabla \zeta = - c \partial \Phi _0 / \partial \chi\), and \(\mathbf{e}_\zeta \equiv R^2 \nabla \zeta\). Recall that the zeroth-order potential \(\Phi _0\) is a flux-surface function.

To remove the gyrophase dependence from the Hamiltonian, we make the canonical (or symplectic) transformation from the guiding center coordinates \(\mathbf{Z} \equiv (\mathbf{X}, U, \mu , \xi )\) to the new coordinates \(\overline{\mathbf{Z}} \equiv (\overline{\mathbf{X}}, \overline{U}, \overline{\mu }, \overline{\xi } )\) which are called the gyrocenter coordinates (Brizard 1989). The relations of \(\overline{\mathbf{Z}} \equiv (\overline{\mathbf{X}}, \overline{U}, \overline{\mu }, \overline{\xi } )\) to \(\mathbf{Z} \equiv (\mathbf{X}, U, \mu , \xi )\) are written using the Poisson bracket as

$$\begin{aligned} \overline{\mathbf{Z}} = \mathbf{Z} + \{ \widetilde{S}_1, \mathbf{Z} \} + {\mathcal {O}} (\delta ^2) , \end{aligned}$$
(45)

where the first-order generating function \(\widetilde{S}_1\) is determined by

$$\begin{aligned} \frac{\partial \widetilde{S}_1}{\partial t} + \{ \widetilde{S}_1, H_0 \} = e {\widetilde{\psi }} , \end{aligned}$$
(46)

where \(H_0\) represents the unperturbed Hamiltonian given by Eq. (30). Assuming \((\partial \widetilde{S}_1 / \partial t)/(\Omega _0 \widetilde{S}_1) = {\mathcal {O}}(\delta )\), the solution of Eq. (46) is written, up to the leading order in \(\delta\), as

$$\begin{aligned} \widetilde{S}_1 (\mathbf{Z}, t) = \frac{e}{\Omega _0} \int {\widetilde{\psi }} \, d \xi , \end{aligned}$$
(47)

where the integration constant is determined from the condition \(\langle \widetilde{S}_1 \rangle _\xi = 0\). In order for Eq. (45) to be a near-identity transformation, \(\{ \widetilde{S}_1, \mathbf{Z} \}\) needs to be of the first order in \(\delta\). Then, we assume that \(e {\widetilde{\psi }} = {\mathcal {O}}(\delta )\), namely, \({\widetilde{\psi }} = {\mathcal {O}} (\delta ^2)\), which is consistent with \(\{ \widetilde{S}_1, \mathbf{Z} \} = {\mathcal {O}}(\delta )\) as confirmed later from Eq. (52). We also assume \(e \mathbf{A}_1 = {\mathcal {O}}(\delta )\) so that the last term on the right-hand side of Eq. (36) is of \({\mathcal {O}}(\delta ^2)\).
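The leading-order solution in Eq. (47) amounts to integrating \({\widetilde{\psi }}\) over the gyrophase and discarding the constant, which is conveniently done harmonic by harmonic. The sketch below assumes \(\psi\) is sampled on a uniform gyrophase grid at fixed \((\mathbf{X}, U, \mu )\); the function name `generating_function` and the sample data are hypothetical.

```python
import numpy as np

def generating_function(psi_samples, e, Omega0):
    """S1~ = (e/Omega0) * integral of psi~ over xi, with zero gyrophase average."""
    n = psi_samples.size
    psi_hat = np.fft.fft(psi_samples)
    k = np.fft.fftfreq(n, d=1.0 / n)              # integer harmonic numbers
    S_hat = np.zeros_like(psi_hat)
    nonzero = k != 0                              # dropping k = 0 removes <psi> and fixes <S1~> = 0
    S_hat[nonzero] = psi_hat[nonzero] / (1j * k[nonzero])   # integrate exp(i k xi)
    return (e / Omega0) * np.fft.ifft(S_hat).real

xi = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
psi = 0.3 + np.cos(xi) + 0.5 * np.sin(2.0 * xi)   # sample psi with a nonzero average
S1 = generating_function(psi, e=1.0, Omega0=2.0)
print("max |S1|:", np.abs(S1).max())
```

For this sample, the result is \((e/\Omega _0) [ \sin \xi - \tfrac{1}{4} \cos 2\xi ]\), and the gyrophase-averaged part of \(\psi\) does not enter, consistent with the right-hand side of Eq. (46) containing only \({\widetilde{\psi }}\).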

We hereafter denote the gyrocenter coordinates by \(\mathbf{Z} \equiv (\mathbf{X}, U, \mu , \xi )\) instead of \(\overline{\mathbf{Z}} \equiv (\overline{\mathbf{X}}, \overline{U}, \overline{\mu }, \overline{\xi } )\) for simplicity. In terms of the gyrocenter coordinates \(\mathbf{Z} \equiv (\mathbf{X}, U, \mu , \xi )\), the Lagrangian is given by

$$\begin{aligned} L = \mathbf{P}^c \cdot \dot{\mathbf{X}} + \frac{m c}{e} \mu \dot{\xi } - H , \end{aligned}$$
(48)

where the canonical momentum \(\mathbf{P}^c\) is written in the same form as in Eq. (29),

$$\begin{aligned} \mathbf{P}^c \equiv \frac{e}{c} \mathbf{A}^* \equiv \frac{e}{c} \mathbf{A}_0 + m ( U \mathbf{b} + {\mathbf{V}}_0 ) , \end{aligned}$$
(49)

and the Hamiltonian H is given by

$$\begin{aligned} H \equiv e \Phi _0 + \frac{1}{2} m |U \mathbf{b} + {\mathbf{V}}_0|^2 + \mu B_0 - \frac{\mu }{\Omega _0} \left( \frac{2 V^\zeta }{R} \nabla R \cdot \nabla \chi + \frac{|\nabla \chi |^2}{2} \frac{\partial V^\zeta }{\partial \chi } \right) + e \Psi . \end{aligned}$$
(50)

The perturbation fields are included in the Hamiltonian H through the field variable \(\Psi\) defined by

$$\begin{aligned} \Psi \equiv \langle \psi \rangle _\xi + \frac{e}{2m c^2} \langle |\mathbf{A}_1|^2 \rangle _\xi - \frac{e}{2 B_0} \frac{\partial }{\partial \mu } \langle ( {\widetilde{\psi }} )^2 \rangle _{\xi } . \end{aligned}$$
(51)

As shown in Eq. (43), \(\psi\) and \(\mathbf{A}_1\) depend on the particle position \(\mathbf{x} = \mathbf{X} + {\varvec{\rho }}\) and accordingly on the gyrophase \(\xi\). The last two terms on the right-hand side of Eq. (51) give the perturbation to the Hamiltonian of the second order in \(\delta\). We retain these second-order perturbation terms here because they influence the gyrokinetic Poisson equation and/or Ampère’s law derived in Sect. 3.2 to the lowest order in \(\delta\). However, other second-order terms are neglected in the perturbation Hamiltonian. The second-order correction terms (Calvo and Parra 2012; Littlejohn 1981) to define the difference between the particle and gyrocenter positions are not considered here. To avoid a secular deviation of the particle position from the gyrocenter in a long time gyrokinetic simulation, Wang and Hahm (2010) considered the correction due to the fluctuating \(\mathbf{E}\times \mathbf{B}\) velocity in the definition of the gyrocenter position and included the polarization drift in the gyrocenter equations of motion, which are not retained in this work either. The gyrocenter Hamiltonian for describing turbulent transport in toroidally rotating plasmas first appears in Brizard (1995) where the same Poisson brackets as in Eq. (32) are used while neglecting the \(\mathbf{W}\) term in \(\mathbf{A}^*\). Compared with Eq. (50), the gyrocenter Hamiltonian given in Eq. (54) of Brizard (1995) contains some second-order fluctuation terms which are not included in \(e \Psi\) although it neglects the first-order terms corresponding to those in \(H_1^V\) [see Eq. (31)].
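To give a feeling for the gyrophase average \(\langle \psi \rangle _\xi\) entering Eq. (51), consider the simplest special case, which is an assumption made here for illustration and not a case treated in the text: an electrostatic plane wave \(\phi _1 \propto \exp (i \mathbf{k} \cdot \mathbf{x})\) in a locally uniform field, so that \({\varvec{\rho }}\) traces a circle of radius \(\rho\). The average then reduces to the standard Bessel factor \(J_0 (k_\perp \rho )\). The function names below are hypothetical.

```python
import math
import numpy as np

def bessel_j0_series(x, terms=40):
    """Power series of J_0(x), adequate for moderate |x|."""
    return sum((-1) ** k * (x / 2.0) ** (2 * k) / math.factorial(k) ** 2
               for k in range(terms))

def gyroaveraged_plane_wave(k_perp_rho, n=128):
    """<exp(i k . rho)>_xi for a circular gyro-orbit, by uniform quadrature."""
    xi = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    return np.exp(1j * k_perp_rho * np.cos(xi)).mean()

k_perp_rho = 1.5                                  # arbitrary sample value
average = gyroaveraged_plane_wave(k_perp_rho)
reference = bessel_j0_series(k_perp_rho)
print("quadrature:", average.real, " series:", reference)
```

The imaginary part vanishes by symmetry, and the reduction of the effective potential at \(k_\perp \rho \sim 1\) is the familiar finite-gyroradius effect contained in \(\langle \psi \rangle _\xi\).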

The nonvanishing components of Poisson brackets between the gyrocenter coordinates are also given by the same expressions as in Eq. (32) which are shown again as

$$\begin{aligned} \{ \mathbf{X}, \mathbf{X} \} = \frac{c}{e B_\parallel ^*} \mathbf{b} \times \mathbf{I}, \quad \{ \mathbf{X}, U \} = \frac{\mathbf{B}^*}{m B_\parallel ^*},\quad \{ \xi , \mu \} = \frac{e}{m c} . \end{aligned}$$
(52)

Here, \(\mathbf{A}^*\) given in Eq. (49) is used to define \(\mathbf{B}^* \equiv \nabla \times \mathbf{A}^*\) and \(B_\parallel ^* \equiv \mathbf{B}^* \cdot \mathbf{b}\). Using Eqs. (49) and (52), we can show

$$\begin{aligned} \{ \mathbf{X}, P^c_\zeta \} = \mathbf{e}_\zeta , \quad \{ U, P^c_\zeta \} = \{ \mu , P^c_\zeta \} = 0 , \end{aligned}$$
(53)

which imply that the covariant toroidal component \(P^c_\zeta \equiv \mathbf{P}^c \cdot \mathbf{e}_\zeta\) of the canonical momentum plays the role of the generator of toroidal rotation. The Euler–Lagrange equation takes the same form as in Eq. (7) and is written as

$$\begin{aligned} \frac{{\text {d}}{} \mathbf{Z}}{{\text {d}}t} = \{ \mathbf{Z}, H \} + \{ \mathbf{Z}, \mathbf{X} \} \cdot \frac{e}{c} \frac{\partial \mathbf{A}^*}{\partial t} . \end{aligned}$$
(54)

The gyrocenter motion equations are derived from combining Eqs. (50), (52), and (54) as

$$\begin{aligned} \frac{{\text {d}} \mathbf{X}}{{\text {d}}t}&= U \mathbf{b} + {\mathbf{V}}_0 + \frac{e}{m} \frac{\partial \Psi }{\partial U} \frac{\mathbf{B}^*}{B_\parallel ^*} + \frac{c}{e B_\parallel ^*} \mathbf{b} \times \left( e \nabla \Psi + \frac{e}{c} \frac{\partial \mathbf{A}^*}{\partial t} \right. \nonumber \\& \qquad + m ( U \mathbf{b} + {\mathbf{V}}_0 ) \cdot \nabla ( U \mathbf{b} + {\mathbf{V}}_0 ) \nonumber \\& \qquad \left. + \, \mu \nabla \left[ B_0 - \frac{1}{\Omega _0} \left( \frac{2 V^\zeta }{R} \nabla R \cdot \nabla \chi + \frac{|\nabla \chi |^2}{2} \frac{\partial V^\zeta }{\partial \chi } \right) \right] \right) , \nonumber \\ \frac{{\text {d}} U}{{\text {d}}t}&= - \frac{\mathbf{B}^*}{m B_\parallel ^*} \cdot \left( e \nabla \Psi + \frac{e}{c} \frac{\partial \mathbf{A}^*}{\partial t} + m ( U \mathbf{b} + {\mathbf{V}}_0 ) \cdot \nabla ( U \mathbf{b} + {\mathbf{V}}_0 ) \right. \nonumber \\& \qquad \left. + \, \mu \nabla \left[ B_0 - \frac{1}{\Omega _0} \left( \frac{2 V^\zeta }{R} \nabla R \cdot \nabla \chi + \frac{|\nabla \chi |^2}{2} \frac{\partial V^\zeta }{\partial \chi } \right) \right] \right) , \nonumber \\ \frac{{\text {d}} \mu }{{\text {d}}t}&= 0 , \nonumber \\ \frac{{\text {d}} \xi }{{\text {d}}t}&= \Omega _0 + \frac{e^2}{m c} \frac{\partial \Psi }{\partial \mu } - \frac{1}{B_0} \left( \frac{2 V^\zeta }{R} \nabla R \cdot \nabla \chi + \frac{|\nabla \chi |^2}{2} \frac{\partial V^\zeta }{\partial \chi } \right) . \end{aligned}$$
(55)
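As a worked check of the last two lines of Eq. (55), note that only the bracket \(\{ \xi , \mu \} = e/(mc)\) in Eq. (52) involves \(\xi\) or \(\mu\), so the second term on the right-hand side of Eq. (54) does not contribute to these components. Using the gyrophase independence of H in Eq. (50) and \(\Omega _0 = e B_0/(mc)\), one finds

$$\begin{aligned} \frac{{\text {d}} \mu }{{\text {d}}t}&= \{ \mu , H \} = - \frac{e}{m c} \frac{\partial H}{\partial \xi } = 0 , \nonumber \\ \frac{{\text {d}} \xi }{{\text {d}}t}&= \{ \xi , H \} = \frac{e}{m c} \frac{\partial H}{\partial \mu } = \frac{e}{m c} \left[ B_0 - \frac{1}{\Omega _0} \left( \frac{2 V^\zeta }{R} \nabla R \cdot \nabla \chi + \frac{|\nabla \chi |^2}{2} \frac{\partial V^\zeta }{\partial \chi } \right) + e \frac{\partial \Psi }{\partial \mu } \right] \nonumber \\&= \Omega _0 + \frac{e^2}{m c} \frac{\partial \Psi }{\partial \mu } - \frac{1}{B_0} \left( \frac{2 V^\zeta }{R} \nabla R \cdot \nabla \chi + \frac{|\nabla \chi |^2}{2} \frac{\partial V^\zeta }{\partial \chi } \right) , \end{aligned}$$

where \(e/(m c \, \Omega _0) = 1/B_0\) is used in the last step.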

At the end of this subsection, we write the relations of the gyrocenter coordinates \(\mathbf{Z} \equiv (\mathbf{X}, U, \mu , \xi )\) to the particle coordinates \(\mathbf{z} \equiv (\mathbf{x}, {\mathbf{v}})\) which are derived by combining Eq. (45) with the procedures described after Eq. (37) as

$$\begin{aligned}&\mathbf{X}=\mathbf{x} - {\varvec{\rho }} + \frac{v'_\perp }{\Omega ^2} \left[ \left\{ \mathbf{b} \cdot [ \nabla \times ( v'_\parallel \mathbf{b} + {\mathbf{V}}_0 ) ] - \frac{v'_\perp }{2 B} \mathbf{a} \cdot \nabla B \right\} \mathbf{a} \right. \nonumber \\& \qquad + \left\{ \mathbf{c} \cdot \left( \mathbf{b} \cdot \nabla {\mathbf{V}}_0 + {\mathbf{V}}_0 \cdot \nabla \mathbf{b} + \frac{\partial \mathbf{b}}{\partial t} + 2 v'_\parallel \mathbf{b} \cdot \nabla \mathbf{b} \right) \right. \nonumber \\& \qquad \left. \left. +\frac{v'_\perp }{8} \left( \mathbf{c} \cdot \nabla \mathbf{b} \cdot \mathbf{c} - 5 \mathbf{a} \cdot \nabla \mathbf{b} \cdot \mathbf{a} \right) \right\} \mathbf{b} \right] \nonumber \\& \qquad + \frac{1}{B_0} \left[ \left\{ \mathbf{A}_1 + c \nabla \left( \int \frac{d\xi _0}{\Omega _0} \, {\widetilde{\psi }} \right) \right\} \times \mathbf{b} + \mathbf{b} \int d\xi _0 \, \widetilde{A}_{1\parallel } \right] + {\mathcal {O}}(\delta ^3) , \nonumber \\&U = v'_\parallel - \frac{v'_\perp }{\Omega _0} \left[ v'_\parallel \mathbf{b} \cdot \nabla \mathbf{b} \cdot \mathbf{a} - \mathbf{c} \cdot ( \nabla \times {\mathbf{V}}_0 ) +\frac{v'_\perp }{4} \left( 3 \mathbf{a} \cdot \nabla \mathbf{b} \cdot \mathbf{c} - \mathbf{c} \cdot \nabla \mathbf{b} \cdot \mathbf{a} \right) \right] \nonumber \\&+ \frac{e}{mc}A_{1\parallel } + {\mathcal {O}}(\delta ^2) , \nonumber \\ \mu&= \frac{m (v'_\perp )^2}{2 B_0} + \frac{m (v'_\perp )^2}{B_0 \Omega _0} \left[ \frac{\mathbf{a}}{v'_\perp } \cdot \left( \frac{\partial }{\partial t} + ( v'_\parallel \mathbf{b} + {\mathbf{V}}_0 ) \cdot \nabla \right) ( v'_\parallel \mathbf{b} + {\mathbf{V}}_0 ) \right. 
\nonumber \\&+\frac{1}{4} \left( 3 \mathbf{a} \cdot \nabla {\mathbf{V}}_0 \cdot \mathbf{c} - \mathbf{c} \cdot \nabla {\mathbf{V}}_0 \cdot \mathbf{a} \right) +\frac{v'_\parallel }{4} \left( 3 \mathbf{a} \cdot \nabla \mathbf{b} \cdot \mathbf{c} - \mathbf{c} \cdot \nabla \mathbf{b} \cdot \mathbf{a} \right) \nonumber \\&\left. + \, \frac{v'_\perp }{2 B_0} \mathbf{a} \cdot \nabla B_0 \right] + \frac{e}{cB_0}{\mathbf{v}}'_\perp \cdot \mathbf{A}_{1\perp } + \frac{e}{B_0} {\widetilde{\psi }} + {\mathcal {O}}(\delta ^2) , \nonumber \\&\xi = \xi _0 + \frac{1}{\Omega _0} \left[ \frac{\mathbf{c}}{v'_\perp } \cdot \left( \frac{\partial }{\partial t} + ( v'_\parallel \mathbf{b} + {\mathbf{V}}_0 ) \cdot \nabla \right) ( v'_\parallel \mathbf{b} + {\mathbf{V}}_0 ) \right. \nonumber \\&+\frac{1}{4} \left( \mathbf{c} \cdot \nabla {\mathbf{V}}_0 \cdot \mathbf{c} - \mathbf{a} \cdot \nabla {\mathbf{V}}_0 \cdot \mathbf{a} \right) +\frac{v'_\parallel }{4} \left( \mathbf{c} \cdot \nabla \mathbf{b} \cdot \mathbf{c} - \mathbf{a} \cdot \nabla \mathbf{b} \cdot \mathbf{a} \right) \nonumber \\&\left. + \, v'_\perp \left( \mathbf{a} \cdot \frac{\nabla B_0}{B_0} - \mathbf{a} \cdot \nabla \mathbf{c} \cdot \mathbf{a} \right) \right] - \frac{e}{B_0} \int d \xi _0 \frac{\partial {\widetilde{\psi }}}{\partial \mu } + {\mathcal {O}}(\delta ^2) , \end{aligned}$$
(56)

where \(v'_\parallel \equiv {\mathbf{v}}' \cdot \mathbf{b}\) and \({\mathbf{v}}'_\perp \equiv (\mathbf{b} \times {\mathbf{v}}' ) \times \mathbf{b}\) represent the parallel and perpendicular components of the particle velocity \({\mathbf{v}}' \equiv {\mathbf{v}} - {\mathbf{V}}_0\) observed from the toroidally rotating frame with the velocity \({\mathbf{V}}_0\). The unit vectors \(\mathbf{a}\) and \(\mathbf{c}\) are defined by \(\mathbf{a} \equiv \mathbf{b} \times \mathbf{c}\) and \(\mathbf{c} \equiv {\mathbf{v}}'_\perp / v'_\perp\), respectively. The leading-order gyroradius vector is denoted by \({\varvec{\rho }} \equiv \mathbf{b} \times {\mathbf{v}}' / \Omega _0\). On the right-hand side of Eq. (56), all variables are regarded as functions of the particle coordinates \(\mathbf{z} \equiv (\mathbf{x}, {\mathbf{v}})\), and \(\xi _0\) is defined from \(({\mathbf{v}}'_c)_\perp \equiv {\mathbf{v}}'_\perp + (e/mc)\mathbf{A}_{1\perp }\) as

$$\begin{aligned} ({\mathbf{v}}'_c)_\perp = - (v'_c)_\perp [ \sin \xi _0 \, \mathbf{e}_1 + \cos \xi _0 \, \mathbf{e}_2 ] , \end{aligned}$$
(57)

where the orthogonal unit vectors \((\mathbf{e}_1 , \mathbf{e}_2 , \mathbf{b} )\) are evaluated at \((\mathbf{x}, t)\).

3 Gyrokinetic field theory for toroidally rotating plasmas

In this section, we present the gyrokinetic field theory for deriving all governing equations which describe not only the gyrocenter distribution functions but also the background and turbulent electromagnetic fields in toroidally rotating plasmas. For this purpose, the variational principle is employed using the action integral for the whole system consisting of charged particles and electromagnetic fields. In analytical treatments of continuum mechanical systems, it is convenient to use the functional derivative \(\delta \mathcal{F}/\delta f\) which is defined for the functional \(\mathcal{F}[f]\) of the function f(x) such that for an arbitrary function \(\varphi (x)\),

$$\begin{aligned} \int \frac{\delta \mathcal{F}[f ]}{\delta f} (x) \varphi (x) \, {\text {d}}x = \lim _{\varepsilon \rightarrow 0} \frac{ \mathcal{F}[f + \varepsilon \varphi ] - \mathcal{F}[f] }{\varepsilon } . \end{aligned}$$
(58)
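The definition in Eq. (58) can be illustrated with a simple functional whose derivative is known in closed form. The sketch below assumes the hypothetical example \(\mathcal{F}[f] = \int _0^1 f^2 \, {\text {d}}x\), for which \(\delta \mathcal{F}/\delta f = 2 f\), and compares both sides of Eq. (58) at a small but finite \(\varepsilon\).

```python
import numpy as np

x = np.linspace(0.0, 1.0, 2001)
dx = x[1] - x[0]

def integrate(y):
    """Trapezoidal rule on the fixed grid x."""
    return np.sum(0.5 * (y[1:] + y[:-1])) * dx

def functional(f):
    """Hypothetical example functional F[f] = integral of f^2."""
    return integrate(f**2)

f = np.sin(2.0 * np.pi * x)       # point at which the derivative is taken
phi = np.exp(-x)                  # arbitrary test function
eps = 1e-6

lhs = integrate(2.0 * f * phi)                             # integral of (dF/df) * phi
rhs = (functional(f + eps * phi) - functional(f)) / eps    # finite-epsilon quotient
print("lhs:", lhs, " rhs:", rhs)
```

The two sides differ only by a term of order \(\varepsilon\), which vanishes in the limit taken in Eq. (58).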

In the present gyrokinetic system, the action integral is regarded as a functional of the gyrocenter orbits and the electromagnetic fields.

3.1 Lagrangian for the whole system

Now, the Lagrangian for the whole system is given by

$$\begin{aligned} L&= \sum _a \int {\text {d}}^6 Z_0 \, D_a(\mathbf{Z}_0, t_0) F_a(\mathbf{Z}_0, t_0) \nonumber \\& \qquad \times L_a [ \mathbf{Z}_a (\mathbf{Z}_0, t_0 ; t) , \dot{\mathbf{Z}}_a (\mathbf{Z}_0, t_0 ; t) ; \{ \Phi _0 , \phi _1, \mathbf{A}_0 , \mathbf{A}_1 \} ] + \int {\text {d}}^3 x \, \mathcal{L}_f , \end{aligned}$$
(59)

where \(\int {\text {d}}^6 Z_0 \equiv \int d^3 X_0 \int _{-\infty }^{\infty } dU_0 \int _0^{\infty } {\text {d}}\mu _0 \int _0^{2\pi } d\xi _0\) represents the integral with respect to the initial gyrocenter coordinates, \(F_a(\mathbf{Z}_0, t_0)\) is the gyrocenter distribution function at the initial time \(t_0\), and the subscript a denotes the particle species. The Lagrangian \(L_a\) for the single particle is given using Eq. (48) for the species a and \(\mathbf{Z}_a = \mathbf{Z}_a (\mathbf{Z}_0, t_0 ; t)\) represents the gyrocenter coordinates of the particle which satisfy the initial condition,

$$\begin{aligned} \mathbf{Z}_a (\mathbf{Z}_0, t_0 ; t_0) = \mathbf{Z}_0 . \end{aligned}$$
(60)

The Jacobian is given by

$$\begin{aligned} D_a \equiv B_{a\parallel }^*/m_a , \end{aligned}$$
(61)

where \(B_{a\parallel }^* \equiv \mathbf{B}_a^* \cdot \mathbf{b}\) and \(\mathbf{B}_a^* \equiv \nabla \times \mathbf{A}_a^*\). Here, \(\mathbf{A}_a^*\) is given using Eq. (49) for the particle species a. The Lagrangian density \(\mathcal{L}_f\) associated with electromagnetic fields is defined by

$$\begin{aligned}&\mathcal{L}_f = \frac{1}{8\pi } \left[ |\nabla (\Phi _0 + \phi _1)|^2 - |\nabla \times ( \mathbf{A}_0 + \mathbf{A}_1 ) |^2 \right] + \frac{\alpha }{4 \pi c} \nabla \cdot \mathbf{A}_0 \nonumber \\&\qquad \qquad + \frac{\lambda }{4 \pi c} \nabla \cdot \mathbf{A}_1 + \frac{{\varvec{\Lambda }} }{4\pi } \cdot ( \mathbf{B}_0 - I \nabla \zeta - \nabla \zeta \times \nabla \chi ) , \end{aligned}$$
(62)

where \(\alpha\), \(\lambda\), and \({\varvec{\Lambda }}\) are introduced as the Lagrange undetermined multipliers to impose the Coulomb gauge conditions on the vector potentials \(\mathbf{A}_0\) and \(\mathbf{A}_1\) and to require the unperturbed magnetic field \(\mathbf{B}_0\) to be axisymmetric, as shown in Eqs. (70) and (71). Here, not only perturbed fields \((\phi _1, \mathbf{A}_1)\) but also unperturbed fields \((\Phi _0, \mathbf{A}_0)\) are treated as variational variables to describe physical processes in the transport time scale during which unperturbed fields vary.

The electric field part of the Lagrangian density \(\mathcal{L}_f\) is given by \(|\nabla (\Phi _0 + \phi _1)|^2/ (8\pi )\) instead of \(|\mathbf{E}|^2/ (8\pi ) = |\nabla (\Phi _0 + \phi _1)+ c^{-1} \partial (\mathbf{A}_0 + \mathbf{A}_1)/\partial t|^2/ (8\pi )\) to remove the transverse part of the Maxwell displacement current \(- c^{-1} \partial \mathbf{A}/\partial t\) from the resultant equations for the electromagnetic fields derived from the variational principle as seen later. This corresponds to the Darwin approximation (Kaufman and Rostler 1971), by which we can avoid treating the electromagnetic waves propagating at the light speed c. It should also be noted that, in spite of dropping the inductive part of the electric field in \(\mathcal{L}_f\) to remove the Maxwell displacement current, the effect of the inductive electric field \(- c^{-1}\partial \mathbf{A}_0 /\partial t\) due to the temporal variation in the unperturbed magnetic field is still retained in the gyrocenter motion equations because it plays an important role in the classical and neoclassical transport processes as a source term [see Eq. (268) in Sect. 5.2] which produces the Ohmic current and the Ware pinch (Helander and Sigmar 2002).

We here note that because of the above-mentioned term \(|\nabla (\Phi _0 + \phi _1)|^2/ (8\pi )\) associated with the Darwin approximation, the Lagrangian density \(\mathcal{L}_f\) is not necessarily invariant under an arbitrary gauge transformation for electrostatic and vector potentials written in the most general form, \((\phi , \mathbf{A}) \rightarrow (\phi - \partial \mathcal{S}/\partial t, \mathbf{A} + \nabla \mathcal{S} )\), where \(\mathcal{S}\) is an arbitrary function of \((t, \mathbf{x})\). It is also seen from Eq. (55) and the definition of \(\Psi\) in Eq. (51) that the perturbed potentials \(\phi _1\) and \(\mathbf{A}_1\) cannot be generally gauge-transformed without causing differences in the gyrocenter motion equations. However, we find that the gauge transformation of the unperturbed vector potential,

$$\begin{aligned} \mathbf{A}_0 \rightarrow \mathbf{A}_0 + \nabla \mathcal{S}_0 , \end{aligned}$$
(63)

with \(\partial \mathcal{S}_0 /\partial t =0\) causes no change in \(\mathcal{L}_f\) and merely transforms the gyrocenter Lagrangian \(L_a\) [see Eq. (48)] to \(L_a + {\text {d}} \mathcal{S}_0 / {\text {d}}t\) where \({\text {d}} \mathcal{S}_0 / {\text {d}}t = \dot{\mathbf{X}} \cdot \nabla \mathcal{S}_0\). Furthermore, the additional conditions \(\nabla ^2 \mathcal{S}_0 = 0\) and \(\partial \mathcal{S}_0 / \partial \zeta = 0\) result from the constraints on \(\mathbf{A}_0\) mentioned after Eq. (62), namely the Coulomb gauge and the axisymmetric unperturbed field. Thus, the gauge transformation defined in Eq. (63) with \(\partial \mathcal{S}_0 /\partial t = \nabla ^2 \mathcal{S}_0 = \partial \mathcal{S}_0 / \partial \zeta = 0\) should never change physical results derived from the Lagrangian L in Eq. (59) for the whole system.

All governing equations for the gyrokinetic system considered here are derived from the variational principle,

$$\begin{aligned} \delta \mathcal{I} \equiv \delta \int _{t_1}^{t_2} L d t = 0 , \end{aligned}$$
(64)

where L is given by Eq. (59) and the variations of all variables are assumed to vanish at the boundaries of the integral region. For example, \(\delta \mathcal{I}/\delta \mathbf{Z}_a = 0\) yields the gyrocenter motion equations for the species a,

$$\begin{aligned} \frac{{\text {d}}{} \mathbf{Z}_a}{{\text {d}}t} = \{ \mathbf{Z}_a, H_a \} + \{ \mathbf{Z}_a, \mathbf{X}_a \} \cdot \frac{e_a}{c} \frac{\partial \mathbf{A}_a^*}{\partial t} , \end{aligned}$$
(65)

from which the same equations as in Eq. (55) are derived with the help of the Poisson brackets in Eq. (52) and the gyrocenter Hamiltonian in Eq. (50). Then, the solution \(\mathbf{Z}_a (\mathbf{Z}_0, t_0 ; t)\) of the gyrocenter motion equations is used to write the distribution function \(F_a(\mathbf{Z}, t)\) for the time t as

$$\begin{aligned}&D_a(\mathbf{Z}, t) F_a(\mathbf{Z}, t) \nonumber \\&= \int {\text {d}}^6 Z_0 \; D_a(\mathbf{Z}_0, t_0) F_a(\mathbf{Z}_0, t_0) \delta ^6[\mathbf{Z}- \mathbf{Z}_a(\mathbf{Z}_0, t_0 ; t)] , \end{aligned}$$
(66)

where \(\delta ^6(\mathbf{Z}- \mathbf{Z}_a) = \delta ^3 (\mathbf{X} - \mathbf{X}_a) \delta (U - U_a) \delta (\mu - \mu _a) \delta [\xi - \xi _a \; ({\mathrm{mod}} \; 2\pi )]\).

Since there is no gyrophase dependence on the right-hand side of the gyrocenter motion equations given in Eq. (65) [or Eq. (55)], \(\mathbf{X}_a(\mathbf{Z}_0,t_0; t)\), \(U_a(\mathbf{Z}_0,t_0; t)\), and \(\mu _a(\mathbf{Z}_0,t_0; t)\) are all independent of the initial gyrophase \(\xi _0\). The Jacobian \(D_a\) is also gyrophase-independent. Then, we find from Eq. (66) that if \(F_a\) is initially gyrophase-independent, it is gyrophase-independent at any time. We here assume without loss of generality that \(F_a\) is gyrophase-independent, \(\partial F_a(\mathbf{Z}, t)/\partial {\xi } =0\).

The gyrocenter phase-space conservation law is represented by

$$\begin{aligned} \frac{\partial D_a(\mathbf{Z}, t)}{\partial t} + \frac{\partial }{\partial \mathbf{Z}} \cdot \left( D_a (\mathbf{Z}, t) \frac{{\text {d}} \mathbf{Z}_a}{{\text {d}}t}(\mathbf{Z}, t) \right) =0 , \end{aligned}$$
(67)

where \(({\text {d}} \mathbf{Z}_a/{\text {d}}t)(\mathbf{Z}, t)\) represents the value of the right-hand side of Eq. (65) evaluated at the gyrocenter coordinates \(\mathbf{Z}\) and the time t. From Eqs. (65) and (66), we have the gyrokinetic Vlasov equation in the conservation form,

$$\begin{aligned} \frac{\partial }{\partial t} \left( D_a F_a \right) + \frac{\partial }{\partial \mathbf{Z}} \cdot \left( D_a F_a \frac{{\text {d}} \mathbf{Z}_a}{{\text {d}}t} \right) =0 , \end{aligned}$$
(68)

which is rewritten with the help of Eq. (67) in the convection form,

$$\begin{aligned} \left( \frac{\partial }{\partial t} + \frac{{\text {d}} \mathbf{Z}_a}{{\text {d}}t} \cdot \frac{\partial }{\partial \mathbf{Z}} \right) F_a (\mathbf{Z}, t) =0 . \end{aligned}$$
(69)
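The relation among Eqs. (67)–(69) can be illustrated with a one-dimensional toy model. Everything in the sketch below is a hypothetical stand-in and not the gyrocenter system: the flow \({\text {d}}Z/{\text {d}}t = a Z\), the time-independent Jacobian \(D = 1/Z\) (chosen so that \(D \, {\text {d}}Z/{\text {d}}t\) is constant and Eq. (67) holds), and an initial distribution carried along the characteristics \(Z(t) = Z_0 e^{a t}\). Finite differences are used to evaluate the residuals of the three equations.

```python
import numpy as np

a = 0.8                                        # hypothetical rate of the toy flow

def Zdot(Z):
    return a * Z                               # stand-in for dZ/dt

def D(Z):
    return 1.0 / Z                             # stand-in Jacobian, time independent

def F(Z, t):
    Z0 = Z * np.exp(-a * t)                    # trace the characteristic back to t = 0
    return np.exp(-(Z0 - 2.0) ** 2)            # F is constant along characteristics

def flux(Z, t):
    return D(Z) * F(Z, t) * Zdot(Z)

Z, t, h = 1.5, 0.4, 1e-5
dF_dt = (F(Z, t + h) - F(Z, t - h)) / (2 * h)
dF_dZ = (F(Z + h, t) - F(Z - h, t)) / (2 * h)

liouville = (D(Z + h) * Zdot(Z + h) - D(Z - h) * Zdot(Z - h)) / (2 * h)   # Eq. (67)
conservative = D(Z) * dF_dt + (flux(Z + h, t) - flux(Z - h, t)) / (2 * h) # Eq. (68)
convective = dF_dt + Zdot(Z) * dF_dZ                                      # Eq. (69)
print(liouville, conservative, convective)
```

All three residuals vanish to discretization accuracy; in this toy model the conservative residual is exactly \(D\) times the convective one, which mirrors how Eq. (67) converts Eq. (68) into Eq. (69).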

If we neglect the electric field energy density \((8\pi )^{-1}|\nabla (\Phi _0 + \phi _1)|^2\) in \(\mathcal{L}_f\) given by Eq. (62), we can straightforwardly reduce the results shown in the succeeding sections to those for the case of using the charge neutrality condition instead of Poisson’s equation. The charge neutrality condition is valid for length scales larger than the Debye length, and it is widely acceptable for practical use although we here keep \((8\pi )^{-1}|\nabla (\Phi _0 + \phi _1)|^2\) in \(\mathcal{L}_f\) from the theoretical point of view to check how this term influences the governing equations and conservation laws derived in this article.

3.2 Governing equations for background and perturbation electromagnetic fields

On the right-hand side of Eq. (62), \(\alpha\), \(\lambda\), and \({\varvec{\Lambda }}\) are included as the Lagrange undetermined multipliers. The Coulomb gauge conditions for the vector potentials,

$$\begin{aligned} \nabla \cdot \mathbf{A}_0 = \nabla \cdot \mathbf{A}_1 = 0 , \end{aligned}$$
(70)

are derived from \(\delta \mathcal{I}/\delta \alpha = \delta \mathcal{I}/\delta \lambda = 0\) while \(\delta \mathcal{I}/\delta {\varvec{\Lambda }} = 0\) imposes the constraint that the background magnetic field is written in the axisymmetric form as

$$\begin{aligned} \mathbf{B}_0 = I \nabla \zeta + \nabla \zeta \times \nabla \chi , \end{aligned}$$
(71)

where I and \(\chi\) represent the covariant toroidal component of \(\mathbf{B}_0\) and the poloidal magnetic flux divided by \(2\pi\), respectively. Here, I and \(\chi\) are both independent of the toroidal angle coordinate \(\zeta\) and they are written as \(I= I(\chi , t)\) and \(\chi = \chi (R, z, t)\), where \((R, z, \zeta )\) represent the right-handed cylindrical coordinates. The background field \(\mathbf{B}_0\) is generally regarded as time-dependent. The equilibrium vector potential \(\mathbf{A}_0\), which satisfies the Coulomb gauge condition and Eq. (71) with \(\mathbf{B}_0 = \nabla \times \mathbf{A}_0\), is given by

$$\begin{aligned} \mathbf{A}_0 = -\chi \nabla \zeta + \mathbf{A}_{P0} , \end{aligned}$$
(72)

with

$$\begin{aligned} \mathbf{A}_{P0} = \nabla \zeta \times \nabla \eta , \end{aligned}$$
(73)

where \(\eta = \eta (R, z, t)\) is the solution of

$$\begin{aligned} \Delta _* \eta \equiv R^2 \nabla \cdot (R^{-2} \nabla \eta ) = I . \end{aligned}$$
(74)
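As a brief consistency check (a sketch added for illustration, not part of the original derivation), Eqs. (72)–(74) can be verified to reproduce Eq. (71) and the Coulomb gauge condition. For any axisymmetric function \(\eta (R, z, t)\), the identity \(\nabla \times (\nabla \zeta \times \nabla \eta ) = (\Delta _* \eta ) \nabla \zeta\) holds, and \(\nabla \chi \cdot \nabla \zeta = \nabla ^2 \zeta = 0\), so that

$$\begin{aligned} \nabla \times \mathbf{A}_0&= - \nabla \chi \times \nabla \zeta + \nabla \times ( \nabla \zeta \times \nabla \eta ) = \nabla \zeta \times \nabla \chi + ( \Delta _* \eta ) \nabla \zeta = \nabla \zeta \times \nabla \chi + I \nabla \zeta , \nonumber \\ \nabla \cdot \mathbf{A}_0&= - \nabla \chi \cdot \nabla \zeta - \chi \nabla ^2 \zeta + \nabla \eta \cdot ( \nabla \times \nabla \zeta ) - \nabla \zeta \cdot ( \nabla \times \nabla \eta ) = 0 . \end{aligned}$$

The first line recovers \(\mathbf{B}_0\) in Eq. (71) once Eq. (74) is imposed, and the second line confirms \(\nabla \cdot \mathbf{A}_0 = 0\) in Eq. (70).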

The conditions for \({\varvec{\Lambda }}\) are derived from \(\delta \mathcal{I} / \delta \chi = 0\) and \(\delta \mathcal{I} / \delta I = 0\) as

$$\begin{aligned} \overline{(\nabla \times {\varvec{\Lambda }})^\zeta }&= \frac{\partial I}{\partial \chi } \overline{\Lambda ^\zeta } - 4 \pi \sum _a \left( \int d^3 v^\mathrm{(gc)} \left[ \overline{F}_a \frac{e_a}{c} V^\zeta \right. \right. \nonumber \\& \qquad + \, \frac{e_a}{c} \frac{\partial V^\zeta }{\partial \chi } \overline{F_a \nabla \chi \cdot \left\{ \frac{\mathbf{b}}{\Omega _{a0}} \times \left( {\mathbf{v}}_a^\mathrm{(gc)} - \mathbf{V}_0 - \frac{e_a}{m_a} \frac{\partial \Psi _a}{\partial {\mathbf{V}}_0 } \right) \right\} } \nonumber \\& \qquad \left. + \, \overline{F_a} \frac{\mu }{\Omega _{a0}} \left( \frac{2 V^\zeta }{R} \nabla R \cdot \nabla \chi + \frac{|\nabla \chi |^2}{2} \frac{\partial V^\zeta }{\partial \chi } \right) \right] \nonumber \\& \qquad \left. - \, \nabla \cdot \left\{ \int d^3 v^\mathrm{(gc)} \overline{F}_a \frac{\mu }{\Omega _{a0}} \left( \frac{2 V^\zeta }{R} \nabla R + \frac{\partial V^\zeta }{\partial \chi } \nabla \chi \right) \right\} \right) , \end{aligned}$$
(75)

and

$$\begin{aligned} \langle \Lambda ^\zeta \rangle = 0 , \end{aligned}$$
(76)

respectively, where the integral over the gyrocenter velocity space is represented by

$$\begin{aligned} \int {\text {d}}^3 v^\mathrm{(gc)} \equiv \int _{-\infty }^{+\infty } {\text {d}} U \int _{0}^{+\infty } {\text {d}}\mu \oint d\xi \, D_a (\mathbf{Z}, t) , \end{aligned}$$
(77)

the gyrocenter drift velocity \({\mathbf{v}}_a^\mathrm{(gc)} = {\text {d}} \mathbf{X}_a/{\text {d}}t\) is given by evaluating the right-hand side of Eq. (55) at \((\mathbf{X}, U, \mu )\), and \(\Lambda ^\zeta \equiv {\varvec{\Lambda }} \cdot \nabla \zeta\). The toroidal-angle and flux-surface averages are defined by

$$\begin{aligned} \overline{\cdots } \equiv \oint \frac{d\zeta }{2\pi } \cdots , \end{aligned}$$
(78)

and

$$\begin{aligned} \langle \cdots \rangle \equiv \frac{\oint d\theta \oint d\zeta \sqrt{g} \cdots }{ \oint d\theta \oint d\zeta \sqrt{g} } , \end{aligned}$$
(79)

respectively, and the Jacobian for the flux coordinates \((s, \theta , \zeta )\) is given by

$$\begin{aligned} \sqrt{g} = [\nabla s \cdot (\nabla \theta \times \nabla \zeta )]^{-1} = \frac{R^2 q}{I} \frac{\partial \chi }{\partial s} , \end{aligned}$$
(80)

where \(\chi\) is regarded as a function of \((s, t)\),

$$\begin{aligned} q \equiv \frac{\mathbf{B}_0 \cdot \nabla \zeta }{\mathbf{B}_0 \cdot \nabla \theta } = \frac{\partial \psi }{\partial \chi }, \end{aligned}$$
(81)

represents the safety factor, and \(\psi (s, t)\) is related to the toroidal magnetic field by

$$\begin{aligned} I \nabla \zeta = \nabla \psi \times \nabla \theta . \end{aligned}$$
(82)
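The second equality in Eq. (80) can be checked directly from Eq. (71). The following short derivation is a sketch added for illustration; it assumes that the poloidal angle satisfies \(\nabla \theta \cdot \nabla \zeta = 0\), which is not stated explicitly above. Using \(|\nabla \zeta |^2 = R^{-2}\) and \(\nabla \chi = (\partial \chi /\partial s) \nabla s\), we have

$$\begin{aligned} \mathbf{B}_0 \cdot \nabla \zeta&= \frac{I}{R^2} , \nonumber \\ \mathbf{B}_0 \cdot \nabla \theta&= \frac{\partial \chi }{\partial s} ( \nabla \zeta \times \nabla s ) \cdot \nabla \theta = \frac{\partial \chi }{\partial s} \nabla s \cdot ( \nabla \theta \times \nabla \zeta ) = \frac{1}{\sqrt{g}} \frac{\partial \chi }{\partial s} , \nonumber \\ q&= \frac{\mathbf{B}_0 \cdot \nabla \zeta }{\mathbf{B}_0 \cdot \nabla \theta } = \frac{I \sqrt{g}}{R^2 \, (\partial \chi /\partial s)} , \end{aligned}$$

which rearranges to \(\sqrt{g} = (R^2 q/I) \, \partial \chi /\partial s\).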

We note that Eq. (75) for \(\overline{(\nabla \times {\varvec{\Lambda }})^\zeta }\) is used later in Eq. (116) to determine the poloidal flux \(\chi\) of the background field.

The gyrokinetic Poisson equation is obtained from \(\delta \mathcal{I} / \delta \phi _1 = 0\) as

$$\begin{aligned} \nabla ^2 \left[ \Phi _0 (\mathbf{x}, t) + \phi _1 (\mathbf{x}, t) \right] = -4\pi \sum _a e_a \int d^6 Z \; D_a (\mathbf{Z}, t) F_a^* (\mathbf{Z}, t) \delta ^3 ( \mathbf{X} + {\varvec{\rho }}_{a} -\mathbf{x} ) , \end{aligned}$$
(83)

where

$$\begin{aligned} F_a^* (\mathbf{Z}, t) \equiv F_a (\mathbf{Z}, t) + \frac{e_a {\widetilde{\psi }}_a}{B_0} \frac{\partial F_a }{\partial \mu } . \end{aligned}$$
(84)

We see from the Coulomb gauge conditions in Eq. (70) that the left-hand side of Eq. (83) naturally represents the opposite sign of the divergence of the electric field \(\mathbf{E} = - \nabla (\Phi _0 + \phi _1) - c^{-1} \partial (\mathbf{A}_0 + \mathbf{A}_1)/\partial t\). Thus, the Coulomb gauge conditions are regarded as a reasonable choice for deriving Poisson’s equation from the variational principle consistently with the Darwin approximation described after Eq. (62) (Sugama 2000; Kaufman and Rostler 1971; Sugama et al. 2013). The delta-function part appearing in Eq. (83) is rewritten as

$$\begin{aligned} \delta ^3 ( \mathbf{X} + {\varvec{\rho }}_a - \mathbf{x} )&= \sum _{n = 0}^\infty \frac{1}{n!} \sum _{i_1, \cdots , i_n} \rho _{ai_1} \cdots \rho _{ai_n} \frac{\partial ^n \delta ^3 (\mathbf{X} - \mathbf{x} )}{\partial X_{i_1} \cdots \partial X_{i_n} } \nonumber \\&= { \sum _{n = 0}^\infty \frac{(-1)^n}{n!} \sum _{i_1, \cdots , i_n} \rho _{ai_1} \cdots \rho _{ai_n} \frac{\partial ^n \delta ^3 (\mathbf{X} - \mathbf{x} )}{\partial x_{i_1} \cdots \partial x_{i_n} } , } \end{aligned}$$
(85)

where \(\rho _{a i}\) is the ith Cartesian component of the gyroradius vector,

$$\begin{aligned} {\varvec{\rho }}_a \equiv \frac{ \mathbf{b}(\mathbf{X}, t) \times {\mathbf{v}}'_c(\mathbf{Z}, t) }{ \Omega _{a0}(\mathbf{X}, t) } . \end{aligned}$$
(86)

Equation (85) is a useful formula to represent effects of finite gyroradii. Substituting Eq. (85) into Eq. (83) and rewriting \(\mathbf{x}\) as \(\mathbf{X}\), the gyrokinetic Poisson equation is rewritten as

$$\begin{aligned} \nabla \cdot ( \mathbf{E}_L + 4 \pi \mathbf{P}^\mathrm{(pol)} ) = 4 \pi \sum _a e_a n_a^\mathrm{(gc)} , \end{aligned}$$
(87)

where the longitudinal part of the electric field, \(\mathbf{E} \equiv - \nabla (\Phi _0 + \phi _1) - c^{-1} \partial ( \mathbf{A}_0 + \mathbf{A}_1 ) / \partial t\), is denoted by

$$\begin{aligned} \mathbf{E}_L \equiv - \nabla (\Phi _0 + \phi _1) \equiv \mathbf{E}_0 + \mathbf{E}_{L1} , \end{aligned}$$
(88)

and the subscript L represents the longitudinal part of the vector. The gyrocenter density is given by

$$\begin{aligned} n_a^\mathrm{(gc)} (\mathbf{X}, t) \equiv \int d^3 v^\mathrm{(gc)} \, F_a (\mathbf{Z}, t) , \end{aligned}$$
(89)

and the polarization density is written as (Sugama et al. 2014)

$$\begin{aligned} \mathbf{P}^\mathrm{(pol)} \equiv \sum _a e_a \sum _{n=0}^\infty \frac{(-1)^n}{(n+1)!} \sum _{i_1, \cdots , i_n} \int dU \int d\mu \int d\xi \; \frac{\partial ^n ( D_a F_a^* {\varvec{\rho }}_a \rho _{a i_1} \cdots \rho _{a i_n} ) }{\partial X_{i_1} \cdots \partial X_{i_n}} . \end{aligned}$$
(90)
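To make the step from Eq. (83) to Eqs. (87) and (90) explicit, the following sketch substitutes the second line of Eq. (85) into the right-hand side of Eq. (83). It assumes, as is usual for gyrocenter variables but not restated here, that \(D_a\) and \(F_a\) are independent of the gyrophase \(\xi\) and that \(\oint d\xi \, {\widetilde{\psi }}_a = 0\):

$$\begin{aligned} \sum _a e_a \int d^6 Z \; D_a F_a^* \, \delta ^3 ( \mathbf{X} + {\varvec{\rho }}_a - \mathbf{x} )&= \sum _a e_a \sum _{n=0}^\infty \frac{(-1)^n}{n!} \sum _{i_1, \cdots , i_n} \frac{\partial ^n }{\partial x_{i_1} \cdots \partial x_{i_n}} \left[ \int dU \int d\mu \oint d\xi \; D_a F_a^* \rho _{a i_1} \cdots \rho _{a i_n} \right] _{\mathbf{X} = \mathbf{x}} \nonumber \\&= \sum _a e_a n_a^\mathrm{(gc)} - \nabla \cdot \mathbf{P}^\mathrm{(pol)} . \end{aligned}$$

The \(n = 0\) term reduces to the gyrocenter charge density because the \({\widetilde{\psi }}_a\) part of \(F_a^*\) vanishes under the gyrophase integral, and the terms with \(n \ge 1\) combine, after relabeling \(n \rightarrow n + 1\), into the divergence of the series in Eq. (90). Together with \(\nabla ^2 (\Phi _0 + \phi _1) = - \nabla \cdot \mathbf{E}_L\), this gives Eq. (87).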

The polarization density \(\mathbf{P}^\mathrm{(pol)}\) defined in Eq. (90) can be divided into two parts as

$$\begin{aligned} \mathbf{P}^\mathrm{(pol)} = \mathbf{P}_g + \mathbf{P}_\psi , \end{aligned}$$
(91)

where

$$\begin{aligned} \mathbf{P}_g = - \sum _a e_a \sum _{l=1}^\infty \frac{1}{(2l)!} \sum _{i_1, \ldots , i_{2l-1}} \int dU \int d\mu \int d\xi \frac{\partial ^{2l-1} ( D_a F_a {\varvec{\rho }}_a \rho _{a i_1} \cdots \rho _{a i_{2l-1}} ) }{\partial X_{i_1} \cdots \partial X_{i_{2l-1}}} \end{aligned}$$
(92)

and

$$\begin{aligned} \mathbf{P}_\psi = \sum _a \frac{e_a^2}{B_0} \sum _{n=1}^\infty \frac{(-1)^{n-1}}{n!} \sum _{i_1, \ldots , i_{n-1}} \int dU \int d\mu \int {\text {d}}\xi \frac{\partial ^{n-1} [ D_a {\widetilde{\psi }}_a (\partial F_a/\partial \mu ) {\varvec{\rho }}_a \rho _{a i_1} \cdots \rho _{a i_{n-1}} ] }{\partial X_{i_1} \cdots \partial X_{i_{n-1}}} . \end{aligned}$$
(93)

We see that \(\mathbf{P}_\psi\) represents the polarization caused by the field \(\psi\) and that the charge density (at the position \(\mathbf{X}\)) for the case of \(\psi = 0\) is given by

$$\begin{aligned} \sum _a e_a n_a^\mathrm{(gc)} - \nabla \cdot \mathbf{P}_g = \sum _a e_a \int {\text {d}}^6 Z' \; D_a (\mathbf{Z}', t) F_a (\mathbf{Z}', t) \delta ^3 (\mathbf{X}' + {\varvec{\rho }}_a - \mathbf{X}) , \end{aligned}$$
(94)

which shows that the particle charge density should be evaluated from the gyrocenter charge density while retaining the corrections due to finite gyroradii.

To the lowest order in the gyroradius expansion, \(\mathbf{P}_\psi\) defined in Eq. (93) is approximated as

$$\begin{aligned} \mathbf{P}_\psi \simeq - \frac{ c^2 \sum _a m_a n_a^\mathrm{(gc)} }{B_0^2} \nabla _\perp \phi _1 (\mathbf{X}) , \end{aligned}$$
(95)

where \(\nabla _\perp \equiv \nabla - \mathbf{b} \mathbf{b} \cdot \nabla\). In deriving Eq. (95), only the electrostatic fluctuation part of \({\widetilde{\psi }}_a\) is retained and the long-wavelength approximation \({\widetilde{\psi }}_a = \widetilde{\phi }_1(\mathbf{X} + {\varvec{\rho }}_a) \simeq {\varvec{\rho }}_a \cdot \nabla \phi _1(\mathbf{X})\) is used. In fact, the contribution of the magnetic fluctuation part to the polarization can be neglected in the lowest order when \(F_a\) is approximated by the Maxwellian equilibrium function as shown in Sect. 5.1. The expression in Eq. (95) coincides with the well-known polarization vector due to the electrostatic field, and its time derivative yields the polarization current that is represented in terms of the charge density and the polarization drift (Chen 2015). Brizard and Tronko (2011) showed, using the push-forward transformation with a higher order correction term in the definition of the gyroradius vector, that the grad B and curvature gyrocenter drifts also contribute to the polarization; these contributions are not included in the present work. It is also shown by keeping only the lowest order part in the gyroradius expansion that \(\mathbf{P}_g\) defined in Eq. (92) is written as

$$\begin{aligned} \mathbf{P}_g \simeq - \sum _a \frac{e_a}{2} \nabla \cdot \left( \int d^3 v^\mathrm{(gc)} F_a {\varvec{\rho }}_a {\varvec{\rho }}_a \right) , \end{aligned}$$
(96)

which is also found in Brizard and Tronko (2011). This polarization term is purely due to the effect of finite gyroradii and it occurs even without fluctuation fields. As shown in Sect. 5.3, the turbulent part of Eq. (87) coincides with Poisson’s equation used in the conventional recursive formulation of the gyrokinetic turbulence theory for toroidally rotating plasmas (Sugama and Horton 1998; Abel et al. 2013).
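The lowest-order form of \(\mathbf{P}_\psi\) in Eq. (95) can be checked by the following sketch, which keeps only the \(n = 1\) term of Eq. (93). It assumes the lowest-order relations \(\mu = m_a |({\mathbf{v}}'_c)_\perp |^2/(2 B_0)\) and \(\Omega _{a0} = e_a B_0/(m_a c)\), that \(D_a\) is independent of \(\mu\) to this order, and that \(F_a\) vanishes as \(\mu \rightarrow \infty\); none of these is restated in this section, and \(\mathbf{I}\) denotes the unit tensor.

$$\begin{aligned} \oint \frac{d\xi }{2\pi } \, {\varvec{\rho }}_a {\varvec{\rho }}_a&= \frac{|{\varvec{\rho }}_a|^2}{2} ( \mathbf{I} - \mathbf{b} \mathbf{b} ) = \frac{m_a c^2 \mu }{e_a^2 B_0} ( \mathbf{I} - \mathbf{b} \mathbf{b} ) , \nonumber \\ \mathbf{P}_\psi&\simeq \sum _a \frac{e_a^2}{B_0} \int dU \int d\mu \oint d\xi \; D_a \frac{\partial F_a}{\partial \mu } {\varvec{\rho }}_a {\varvec{\rho }}_a \cdot \nabla \phi _1 = \sum _a \frac{m_a c^2}{B_0^2} \left( \int dU \int d\mu \oint d\xi \; D_a \, \mu \frac{\partial F_a}{\partial \mu } \right) \nabla _\perp \phi _1 \nonumber \\&= - \frac{c^2 \sum _a m_a n_a^\mathrm{(gc)}}{B_0^2} \nabla _\perp \phi _1 . \end{aligned}$$

The last step integrates by parts in \(\mu\), which turns \(\int d\mu \, \mu \, \partial F_a/\partial \mu\) into \(- \int d\mu \, F_a\) and hence into \(- n_a^\mathrm{(gc)}\) through Eqs. (77) and (89).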

From \(\delta \mathcal{I}/\delta \mathbf{A}_1 =0\), we obtain

$$\begin{aligned} \nabla ^2 ( \mathbf{A}_0 + \mathbf{A}_1 ) - \frac{1}{c} \nabla \lambda = - \frac{4\pi }{c} \mathbf{j}_G , \end{aligned}$$
(97)

where the gyrokinetic current density is defined by

$$\begin{aligned} \mathbf{j}_G&\equiv \sum _a e_a \int d^6 Z \,D_a (\mathbf{Z}) \delta ^3 ( \mathbf{X} + {\varvec{\rho }}_{a} -\mathbf{x} ) \nonumber \\&\left( F_a (\mathbf{Z}, t) \left[ {\mathbf{V}}_0 + U \mathbf{b} + ({\mathbf{v}}'_c)_\perp - \frac{e_a}{m_a c} \mathbf{A}_1 (\mathbf{X} + {\varvec{\rho }}_{a} , t) \right] \right. \nonumber \\&\left. +\, \frac{e_a {\widetilde{\psi }}_a}{B_0} \frac{\partial F_a}{\partial \mu } \left[ {\mathbf{V}}_0 + U \mathbf{b} + ({\mathbf{v}}'_c)_\perp \right] \right) . \end{aligned}$$
(98)

Note that any vector field \(\mathbf{a}\) can be expressed as \(\mathbf{a} = \mathbf{a}_L + \mathbf{a}_T\), where \(\mathbf{a}_L \equiv - (4\pi )^{-1} \nabla \int d^3 x' (\nabla '\cdot \mathbf{a})/|\mathbf{x}-\mathbf{x}'|\) and \(\mathbf{a}_T \equiv (4\pi )^{-1} \nabla \times ( \nabla \times \int {\text {d}}^3 x' \; \mathbf{a}/|\mathbf{x}-\mathbf{x}'|)\) represent the longitudinal (or irrotational) and transverse (or solenoidal) parts, respectively (Jackson 1998). Then, the longitudinal and transverse parts of Eq. (97) are written as

$$\begin{aligned} \nabla \lambda = 4\pi (\mathbf{j}_G)_L \end{aligned}$$
(99)

and

$$\begin{aligned} \nabla ^2 ( \mathbf{A}_0 + \mathbf{A}_1 ) = - \frac{4\pi }{c} (\mathbf{j}_G)_T , \end{aligned}$$
(100)

respectively. Equation (100) represents the gyrokinetic Ampère’s law. As shown in Sect. 5.3, the turbulent part of Eq. (100) can successfully reproduce the conventional Ampère’s law used in the recursive formulation of the electromagnetic gyrokinetic theory.
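The splitting of Eq. (97) into Eqs. (99) and (100) can be sketched as follows, assuming that the fields decay sufficiently fast at large distances so that the longitudinal and transverse parts are uniquely defined. Under the Coulomb gauge conditions in Eq. (70), \(\nabla ^2 ( \mathbf{A}_0 + \mathbf{A}_1 )\) is divergence-free and therefore purely transverse, whereas \(\nabla \lambda\) is curl-free and therefore purely longitudinal. Taking the divergence of Eq. (97) then gives

$$\begin{aligned} \nabla \cdot \nabla ^2 ( \mathbf{A}_0 + \mathbf{A}_1 )&= \nabla ^2 \left[ \nabla \cdot ( \mathbf{A}_0 + \mathbf{A}_1 ) \right] = 0 , \nonumber \\ \nabla ^2 \lambda&= 4 \pi \nabla \cdot \mathbf{j}_G = 4 \pi \nabla \cdot (\mathbf{j}_G)_L , \end{aligned}$$

which is the divergence of Eq. (99). Subtracting the longitudinal part from Eq. (97) leaves Eq. (100).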

From \(\delta \mathcal{I} / \delta \Phi _0 = 0\), we obtain the surface-averaged gyrokinetic Poisson equation,

$$\begin{aligned}&\frac{ \langle \nabla \cdot \mathbf{E}_L \rangle }{4 \pi } = \sum _a e_a \left\langle n_a^\mathrm{(gc)} - \nabla \cdot \left[ \int d^3 v^\mathrm{(gc)} F_a \left\{ \frac{\mathbf{b}}{\Omega _a} \times \left( {\mathbf{v}}_a^\mathrm{(gc)} - \mathbf{V}_0 - \frac{e_a}{m_a} \frac{\partial \Psi _a}{\partial {\mathbf{V}}_0 } \right) \right. \right. \right. \nonumber \\&\left. \left. \left. + \frac{2 c \mu }{e_a \Omega _a R} \nabla R \right\} - \frac{\nabla \chi }{|\nabla \chi |^2} \nabla \cdot \left( \int d^3 v^\mathrm{(gc)} \frac{c \mu F_a}{2 e_a \Omega _a} \nabla \chi \right) \right] \right\rangle , \end{aligned}$$
(101)

where \(\langle \cdots \rangle\) represents the flux-surface average defined in Eq. (79). Equation (101) imposes the constraint to determine \(\Phi _0(s)\) independently of Poisson’s equation in Eq. (87) derived from \(\delta \mathcal{I}/ \delta \phi _1 = 0\). Thus, Eqs. (87) and (101) give two conditions to determine \(\phi _1\) and \(\Phi _0\). On the right-hand side of Eq. (101), we can clearly see the polarization effect in the form of \(- \nabla \cdot [ \cdots ]\), where the well-known polarization term [Eq. (95)] and other correction terms are included, as confirmed using the gyrocenter motion equation for \({\mathbf{v}}_a^\mathrm{(gc)} \equiv d\mathbf{X}_a/dt\) given in Eq. (55). It is interesting to note that the gyroradius-expansion formula in Eq. (85), which is used to define the polarization vector \(\mathbf{P}^\mathrm{(pol)} = \mathbf{P}_g +\mathbf{P}_\psi\) in Eq. (90), is not necessary to derive the polarization term in Eq. (101). It is also seen later in Sect. 3.3 that Eq. (101) and all other governing equations derived from the variational principle are necessary to apply Noether’s theorem, and accordingly obtain the toroidal momentum balance equation which describes radial transport of toroidal momentum in toroidally rotating plasmas. That toroidal momentum balance equation given in Eq. (228) can also be used instead of Eq. (101) as the constraint to determine \(\Phi _0(s)\).

We also find that \(\delta \mathcal{I} / \delta \mathbf{A}_0 = 0\) yields

$$\begin{aligned} \nabla ^2 (\mathbf{A}_0 + \mathbf{A}_1) + \nabla \times {\varvec{\Lambda }} - \frac{1}{c}\nabla \alpha + \frac{4\pi }{c}( \mathbf{j}^\mathrm{(gc)} + \nabla \times \mathbf{M} ) = 0 , \end{aligned}$$
(102)

where the gyrocenter current is written as

$$\begin{aligned} \mathbf{j}^\mathrm{(gc)} \equiv \sum _a e_a n_a^\mathrm{(gc)} \mathbf{u}_a^\mathrm{(gc)} . \end{aligned}$$
(103)

Here, the gyrocenter fluid velocity \(\mathbf{u}_a^\mathrm{(gc)}\) is defined by

$$\begin{aligned} n_a^\mathrm{(gc)} \mathbf{u}_a^\mathrm{(gc)} \equiv \int d^3 v^\mathrm{(gc)} F_a {\mathbf{v}}_a^\mathrm{(gc)} , \end{aligned}$$
(104)

where the gyrocenter drift velocity \({\mathbf{v}}_a^\mathrm{(gc)} \equiv d \mathbf{X}_a/dt\) is given by evaluating the right-hand side of Eq. (55) at \((\mathbf{X}, U, \mu )\). The last term on the left-hand side of Eq. (102) represents the magnetization current. The magnetization is defined by

$$\begin{aligned} \mathbf{M} \equiv \sum _a \mathbf{M}_a , \end{aligned}$$
(105)

with

$$\begin{aligned}&\mathbf{M}_a \equiv c \int d^3 v^\mathrm{(gc)} F_a \left[ -\mu \mathbf{b} \left\{ 1 + \frac{1}{B_0 \Omega _{a0}} \left( \frac{2 V^\zeta }{R} \nabla R \cdot \nabla \chi + \frac{|\nabla \chi |^2}{2} \frac{\partial V^\zeta }{\partial \chi } \right) \right\} \right. \nonumber \\&\quad \qquad \left. +\, \frac{m_a U}{B_0} ({\mathbf{v}}_a^\mathrm{(gc)} - U\mathbf{b} - {\mathbf{V}}_0 )_\perp - \mathbf{N}_a \right] , \end{aligned}$$
(106)

where

$$\begin{aligned}&\mathbf{N}_a \equiv e_a \langle \mathbf{D}_B \psi _a \rangle _\xi + \frac{e_a^2}{2 m_a c^2} \langle \mathbf{D}_B (|\mathbf{A}_1|^2 ) \rangle _\xi + \frac{e_a^2}{2 B_0^2} \mathbf{b} \frac{\partial }{\partial \mu } \langle (\widetilde{\psi _a})^2 \rangle _\xi - \frac{e_a^2}{B_0} \frac{\partial }{\partial \mu } \langle \widetilde{\psi _a} \mathbf{D}_B \psi _a \rangle _\xi , \nonumber \\&\mathbf{D}_B \psi _a \equiv - \frac{1}{B_0} \left[ \left( \frac{1}{2} \mathbf{b} {\varvec{\rho }}_{a} + {\varvec{\rho }}_{a} \mathbf{b} \right) \cdot \left( \nabla \phi _1 - \nabla \mathbf{A}_1 \cdot \frac{1}{c} ( {\mathbf{V}}_0 + U \mathbf{b} + ({\mathbf{v}}'_c)_\perp ) \right) \right. \nonumber \\&\qquad \qquad \left. +\, \frac{U}{c}{} \mathbf{A}_{1\perp } + \frac{1}{c} \left( \frac{1}{2} \mathbf{b} ({\mathbf{v}}'_c)_\perp - ({\mathbf{v}}'_c)_\perp \mathbf{b} \right) \cdot \mathbf{A}_1 \right] , \nonumber \\&\mathbf{D}_B (|\mathbf{A}_1|^2 ) \equiv - \frac{1}{B_0} \left( \frac{1}{2} \mathbf{b} {\rho }_{a} + {\rho }_{a} \mathbf{b} \right) \cdot \nabla (|\mathbf{A}_1|^2 ) . \end{aligned}$$
(107)

In the same way as in Eqs. (97)–(100), Eq. (102) is divided into the longitudinal part,

$$\begin{aligned} - \frac{1}{c}\nabla \alpha + \frac{4\pi }{c}( \mathbf{j}^\mathrm{(gc)})_L = 0 , \end{aligned}$$
(108)

and the transverse part,

$$\begin{aligned} \nabla ^2 (\mathbf{A}_0 + \mathbf{A}_1) + \nabla \times {\varvec{\Lambda }} + \frac{4\pi }{c}( (\mathbf{j}^\mathrm{(gc)})_T + \nabla \times \mathbf{M} ) = 0 . \end{aligned}$$
(109)

Equation (109) gives the gyrokinetic Ampère’s law in a different form from Eq. (100). Thus, Ampère’s law is expressed in two ways by Eqs. (100) and (109), which are derived from \(\delta \mathcal{I}/ \delta \mathbf{A}_1 = 0\) and \(\delta \mathcal{I}/ \delta \mathbf{A}_0 = 0\), respectively. Correspondingly, we find the two expressions \((\mathbf{j}_G)_T\) and \((\mathbf{j}^\mathrm{(gc)})_T + \nabla \times (\mathbf{M} + c {\varvec{\Lambda }}/4\pi )\) for the solenoidal currents in Eqs. (100) and (109), respectively. We can see from Eqs. (105)–(107) that the magnetization is approximated to the lowest order in \(\delta\) by

$$\begin{aligned} \mathbf{M} \simeq - c \sum _a \int d^3 v^\mathrm{(gc)} F_a \mu \mathbf{b} , \end{aligned}$$
(110)

which coincides with the well-known expression for the magnetization vector [see, for example, Eq. (4.151) in Hazeltine and Meiss (1992)]. Other higher-order effects on the magnetization are included in \((\mathbf{M} + c {\varvec{\Lambda }}/4\pi )\). The magnetization current \(\nabla \times (\mathbf{M} + c {\varvec{\Lambda }}/4\pi )\) explicitly appears in Eq. (109). On the other hand, the magnetization current included in Eq. (100) is identified by substituting the gyroradius-expansion formula, Eq. (85), into Eq. (98) for \(\mathbf{j}_G\), in the same way as it is used to define the polarization vector \(\mathbf{P}^\mathrm{(pol)}\). Then, we find that \(\mathbf{j}_G\) contains the magnetization current written as

$$\begin{aligned} - \sum _a e_a \nabla \cdot \left( \int d^3 v^\mathrm{(gc)} F_a {\varvec{\rho }}_a ({\mathbf{v}}'_c)_\perp \right) = \nabla \times \left( - c \sum _a \int d^3 v^\mathrm{(gc)} F_a \mu \mathbf{b} \right) , \end{aligned}$$
(111)

which is consistent with the result shown in Eq. (110). The magnetization current is obviously solenoidal, while as seen from Eq. (95), the polarization current \(\partial \mathbf{P}^\mathrm{(pol)}/\partial t\) has the longitudinal (or irrotational) part and it is a higher order small quantity which is not included in Eq. (98) to define \(\mathbf{j}_G\). The gyrocenter current \(\mathbf{j}^\mathrm{(gc)}\) defined in Eq. (103) does not contain the polarization current \(\partial \mathbf{P}^\mathrm{(pol)}/\partial t\) either. Instead, \(\partial \mathbf{P}^\mathrm{(pol)}/\partial t\) can be treated separately from \(\mathbf{j}^\mathrm{(gc)}\) by considering the charge conservation law as seen in Sect. 4.3.
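The equality in Eq. (111) can be checked component by component. The following sketch assumes the lowest-order relations \(\mu = m_a |({\mathbf{v}}'_c)_\perp |^2/(2 B_0)\) and \(\Omega _{a0} = e_a B_0/(m_a c)\), and introduces the Levi-Civita symbol \(\epsilon _{ijk}\) and the shorthand \((v'_{c\perp })_j\) for the jth Cartesian component of \(({\mathbf{v}}'_c)_\perp\), neither of which appears in the original text. With \({\varvec{\rho }}_a\) given by Eq. (86), the gyrophase average of the dyad and its divergence are

$$\begin{aligned} \oint \frac{d\xi }{2\pi } \, \rho _{a i} (v'_{c\perp })_j&= - \frac{|({\mathbf{v}}'_c)_\perp |^2}{2 \Omega _{a0}} \sum _k \epsilon _{ijk} b_k = - \frac{c \mu }{e_a} \sum _k \epsilon _{ijk} b_k , \nonumber \\ - e_a \sum _i \frac{\partial }{\partial X_i} \left( \int d^3 v^\mathrm{(gc)} F_a \rho _{a i} (v'_{c\perp })_j \right)&= \sum _{i,k} \epsilon _{jik} \frac{\partial }{\partial X_i} \left( - c \int d^3 v^\mathrm{(gc)} F_a \mu \, b_k \right) = \left[ \nabla \times \left( - c \int d^3 v^\mathrm{(gc)} F_a \mu \mathbf{b} \right) \right] _j . \end{aligned}$$

Summing over species reproduces Eq. (111). The antisymmetry of the averaged dyad is what turns the divergence into a curl, so that this part of the current is solenoidal.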

The two independent expressions of the gyrokinetic Ampère’s law given in Eqs. (100) and (109) are necessary to determine \(\mathbf{A}_1\) and \(\mathbf{A}_0\), separately. This is similar to the earlier situation in which \(\phi _1\) and \(\Phi _0\) are determined by the two independent gyrokinetic Poisson equations in Eqs. (87) and (101) derived from \(\delta \mathcal{I}/ \delta \phi _1 = 0\) and \(\delta \mathcal{I}/ \delta \Phi _0 = 0\). As shown in Eq. (276) in Sect. 5.2, the lowest order ensemble-averaged part of \((\mathbf{j}^\mathrm{(gc)})_T + \nabla \times (\mathbf{M} + c {\varvec{\Lambda }}/4\pi )\) properly represents the equilibrium current and accordingly Eq. (109) leads to the Grad–Shafranov-type equation below in Eq. (116) where the axisymmetric MHD equilibrium condition for the toroidally rotating system is consistently contained. Thus, turbulent and equilibrium parts of Ampère’s law obtained by the conventional scale-separation formulation are correctly included in Eqs. (100) and (109), respectively, while these two equations are independent of each other to impose the two conditions for determining \(\mathbf{A}_1\) and \(\mathbf{A}_0\). The additional variable \({\varvec{\Lambda }}\) can also be regarded as reconciling these two expressions of Ampère’s law without inconsistency.

We now define \(\mathbf{B}^\mathrm{(gc)}\) as the magnetic field produced by \((\mathbf{j}^\mathrm{(gc)})_T\),

$$\begin{aligned} \nabla \times \mathbf{B}^\mathrm{(gc)} = \frac{4\pi }{c} (\mathbf{j}^\mathrm{(gc)})_T , \end{aligned}$$
(112)

from which we obtain

$$\begin{aligned} \frac{1}{\sqrt{g}} \frac{\partial \overline{(B^\mathrm{(gc)})_\zeta }}{\partial \theta }&= \frac{4\pi }{c} \overline{(\mathbf{j}^\mathrm{(gc)})_T \cdot \nabla \chi } \nonumber \\ \frac{1}{\sqrt{g}} \frac{\partial \overline{(B^\mathrm{(gc)})_\zeta }}{\partial \chi }&= - \frac{4\pi }{c} \overline{(\mathbf{j}^\mathrm{(gc)})_T \cdot \nabla \theta } . \end{aligned}$$
(113)

Then, using Eqs. (71) and (109), we have

$$\begin{aligned} \overline{\Lambda _\zeta } = I - \frac{4\pi }{c} \overline{M_\zeta } - \overline{(B^\mathrm{(gc)})_\zeta } + \overline{B_{1\zeta }} , \end{aligned}$$
(114)

which is combined with Eqs. (76) and (80) to obtain

$$\begin{aligned} I = \oint \frac{{\text {d}}\theta }{2\pi } \left[ \frac{4\pi }{c} \overline{M_\zeta } + \overline{(B^\mathrm{(gc)})_\zeta } - \overline{B_{1\zeta }} \right] . \end{aligned}$$
(115)
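The steps leading to Eqs. (114) and (115) can be sketched as follows, assuming \(\mathbf{B}_1 \equiv \nabla \times \mathbf{A}_1\) and \(\Lambda ^\zeta = \Lambda _\zeta /R^2\) for the axisymmetric coordinates used here. Under the Coulomb gauge, \(\nabla ^2 ( \mathbf{A}_0 + \mathbf{A}_1 ) = - \nabla \times ( \mathbf{B}_0 + \mathbf{B}_1 )\), so that Eqs. (109) and (112) combine into

$$\begin{aligned} \nabla \times \left[ {\varvec{\Lambda }} - \mathbf{B}_0 - \mathbf{B}_1 + \mathbf{B}^\mathrm{(gc)} + \frac{4\pi }{c} \mathbf{M} \right] = 0 . \end{aligned}$$

The bracketed vector is therefore a gradient, whose covariant toroidal component vanishes under the \(\zeta\)-average. Since the covariant toroidal component of \(\mathbf{B}_0\) in Eq. (71) is I, this yields Eq. (114). Inserting \(\sqrt{g}\) from Eq. (80) into the flux-surface average in Eq. (79) gives

$$\begin{aligned} \langle \Lambda ^\zeta \rangle \propto \oint d\theta \oint d\zeta \, \frac{R^2 q}{I} \frac{\partial \chi }{\partial s} \frac{\Lambda _\zeta }{R^2} = 2\pi \frac{q}{I} \frac{\partial \chi }{\partial s} \oint d\theta \; \overline{\Lambda _\zeta } , \end{aligned}$$

so that Eq. (76) requires \(\oint d\theta \, \overline{\Lambda _\zeta } = 0\). Applying this to Eq. (114), with I independent of \(\theta\), gives Eq. (115).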

Using Eq. (71) and taking the \(\zeta\)-average of the toroidal component of Eq. (109), we obtain

$$\begin{aligned} \Delta _* \chi = \overline{ \left( \frac{4\pi }{c} \left[ (\mathbf{j}^\mathrm{(gc)})_T + \nabla \times \mathbf{M} \right] - \nabla \times \mathbf{B}_1 \right) \cdot R^2 \nabla \zeta } + R^2 \overline{(\nabla \times {\varvec{\Lambda }})^\zeta } , \end{aligned}$$
(116)

where the last term \(\overline{(\nabla \times {\varvec{\Lambda }})^\zeta }\) on the right-hand side is evaluated using Eq. (75). The time-dependent axisymmetric background field \(\mathbf{B}_0\) [see Eq. (71)] can be self-consistently determined using Eqs. (115) and (116), in which effects of the turbulent current and fields are included.
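The appearance of \(\Delta _* \chi\) on the left-hand side of Eq. (116) can be seen from the following sketch, which uses the identity \(\nabla \times (\nabla \zeta \times \nabla \chi ) = (\Delta _* \chi ) \nabla \zeta\) for axisymmetric \(\chi\) and assumes \(\mathbf{B}_1 \equiv \nabla \times \mathbf{A}_1\). From Eq. (71),

$$\begin{aligned} ( \nabla \times \mathbf{B}_0 ) \cdot R^2 \nabla \zeta = \left[ \nabla I \times \nabla \zeta + ( \Delta _* \chi ) \nabla \zeta \right] \cdot R^2 \nabla \zeta = \Delta _* \chi , \end{aligned}$$

because \(( \nabla I \times \nabla \zeta ) \cdot \nabla \zeta = 0\) and \(R^2 |\nabla \zeta |^2 = 1\). Writing Eq. (109) under the Coulomb gauge as \(\nabla \times ( \mathbf{B}_0 + \mathbf{B}_1 ) = \nabla \times {\varvec{\Lambda }} + (4\pi /c) [ (\mathbf{j}^\mathrm{(gc)})_T + \nabla \times \mathbf{M} ]\), taking the scalar product with \(R^2 \nabla \zeta\), and averaging over \(\zeta\) then reproduces each term of Eq. (116).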

In the present model for the collisionless gyrokinetic system, Eqs. (69), (83) [or (87)], (100), (101), (115), and (116) constitute the closed system of governing equations which determine \(F_a\), \(\phi _1\), \(\mathbf{A}_1\), \(\Phi _0\), I, and \(\chi\) (\(\mathbf{A}_0\) and \(\mathbf{B}_0\) are determined from I and \(\chi\)). Here, we note that \({\varvec{\Lambda }}\) can be removed from Eq. (116) using Eqs. (75) and (114). Then, the governing equations do not contain the field variables \(\lambda\), \(\alpha\), and \({\varvec{\Lambda }}\), which are included in the Lagrangian, Eq. (59), as the Lagrange undetermined multipliers associated with the constraint conditions for \(\mathbf{A}_1\), \(\mathbf{A}_0\), and \(\mathbf{B}_0\). It can also be found that all governing equations are invariant under the gauge transformation defined by Eq. (63) with \(\partial \mathcal{S}_0 /\partial t = \nabla ^2 \mathcal{S}_0 = \partial \mathcal{S}_0 / \partial \zeta = 0\). If we fix the background magnetic field, we can eliminate Eqs. (115) and (116), so that Eqs. (69), (83) [or (87)], (100), and (101) form the closed system of equations for \(F_a\), \(\phi _1\), \(\mathbf{A}_1\), and \(\Phi _0\). For the case of the electrostatic turbulence, \(\mathbf{A}_1\) is neglected, Eq. (100) is not used, and the reduced set of equations is given by Eqs. (69), (83) [or (87)], (101), (115), and (116) which determine \(F_a\), \(\phi _1\), \(\Phi _0\), I, and \(\chi\). These equations can be used to describe the collisionless gyrokinetic system, in which the time evolutions of equilibrium profiles are dominated by the electrostatic turbulent transport while there are slow variations of the background electric and magnetic fields to be consistent with the evolving profiles.

3.3 Conservation laws derived from Noether’s theorem

In this subsection, conservation laws for energy and toroidal angular momentum are derived from Noether’s theorem in a way similar to that in Sugama et al. (2013, 2014). First, we consider general infinitesimal transformations of the Eulerian field variables given as functions of \((\mathbf{x}, t)\),

$$\begin{aligned} t&'= t + \delta t_E (\mathbf{x}, t) , \nonumber \\ \mathbf{x}&'= \mathbf{x} + \delta \mathbf{x}_E (\mathbf{x}, t) , \nonumber \\ \phi _1' (\mathbf{x}', t')&= \phi _1(\mathbf{x}, t) + \delta \phi _1(\mathbf{x}, t) , \nonumber \\ \mathbf{A}'_1(\mathbf{x}', t')& = \mathbf{A} _1(\mathbf{x}, t) + \delta \mathbf{A}_1(\mathbf{x}, t) , \nonumber \\ \mathbf{A}'_0(\mathbf{x}', t')&= \mathbf{A} _0(\mathbf{x}, t) + \delta \mathbf{A}_0(\mathbf{x}, t) , \nonumber \\ \lambda '(\mathbf{x}', t')&= \lambda (\mathbf{x}, t) + \delta \lambda (\mathbf{x}, t) , \nonumber \\ \alpha '(\mathbf{x}', t')& = \alpha (\mathbf{x}, t) + \delta \alpha (\mathbf{x}, t) , \nonumber \\ {\varvec{\Lambda }}'(\mathbf{x}', t')& = {\varvec{\Lambda }}(\mathbf{x}, t) + \delta {\varvec{\Lambda }}(\mathbf{x}, t) . \end{aligned}$$
(117)

Here, \(\delta t_E\) and \(\delta \mathbf{x}_E\) are generally functions of \((\mathbf{x}, t)\) while \(\delta \phi _1\), \(\delta \mathbf{A}_1\), \(\delta \mathbf{A}_0\), \(\delta \lambda\), \(\delta \alpha\), and \(\delta {\varvec{\Lambda }}\) are produced by the variations in their functional forms and those in the variables \((\mathbf{x}, t)\) as

$$\begin{aligned} \delta \phi _1 (\mathbf{x}, t)&= \bar{\delta } \phi _1 (\mathbf{x}, t) + \delta t_E \; \partial _t \phi _1 + \delta \mathbf{x}_E \cdot \nabla \phi _1 , \nonumber \\ \delta \mathbf{A}_1 (\mathbf{x}, t)& = \bar{\delta } \mathbf{A}_1 (\mathbf{x}, t) + \delta t_E \; \partial _t \mathbf{A}_1 + \delta \mathbf{x}_E \cdot \nabla \mathbf{A}_1 , \nonumber \\ \delta \mathbf{A}_0 (\mathbf{x}, t)& = \bar{\delta } \mathbf{A}_0 (\mathbf{x}, t) + \delta t_E \; \partial _t \mathbf{A}_0 + \delta \mathbf{x}_E \cdot \nabla \mathbf{A}_0 , \nonumber \\ \delta \lambda (\mathbf{x}, t)& = \bar{\delta } \lambda (\mathbf{x}, t) + \delta t_E \; \partial _t \lambda + \delta \mathbf{x}_E \cdot \nabla \lambda , \nonumber \\ \delta \alpha (\mathbf{x}, t)& = \bar{\delta } \alpha (\mathbf{x}, t) + \delta t_E \; \partial _t \alpha + \delta \mathbf{x}_E \cdot \nabla \alpha , \nonumber \\ \delta {\varvec{\Lambda }} (\mathbf{x}, t)& = \bar{\delta } {\varvec{\Lambda }} (\mathbf{x}, t) + \delta t_E \; \partial _t {\varvec{\Lambda }} + \delta \mathbf{x}_E \cdot \nabla {\varvec{\Lambda }}, \end{aligned}$$
(118)

where

$$\begin{aligned} \bar{\delta }\phi _1 (\mathbf{x}, t)&= \phi '_1 (\mathbf{x}, t) - \phi _1 (\mathbf{x}, t) , \nonumber \\ \bar{\delta }{} \mathbf{A}_1 (\mathbf{x}, t)&= \mathbf{A}'_1 (\mathbf{x}, t) - \mathbf{A}_1 (\mathbf{x}, t) , \nonumber \\ \bar{\delta }{} \mathbf{A}_0 (\mathbf{x}, t)&= \mathbf{A}'_0 (\mathbf{x}, t) - \mathbf{A}_0 (\mathbf{x}, t) , \nonumber \\ \bar{\delta }\lambda (\mathbf{x}, t)&= \lambda ' (\mathbf{x}, t) - \lambda (\mathbf{x}, t) , \nonumber \\ \bar{\delta }\alpha (\mathbf{x}, t)&= \alpha ' (\mathbf{x}, t) - \alpha (\mathbf{x}, t) , \nonumber \\ \bar{\delta }{\varvec{\Lambda }} (\mathbf{x}, t)& = {\varvec{\Lambda }}' (\mathbf{x}, t) - {\varvec{\Lambda }} (\mathbf{x}, t) , \end{aligned}$$
(119)

and the second-order variation terms are neglected. Infinitesimal transformations of the axisymmetric functions \(\chi\) and I associated with the equilibrium magnetic field, and of the background potential \(\Phi _0\), are given by

$$\begin{aligned} \chi '(R', z', t')& = \chi (R, z, t) + \delta \chi (R, z, t) , \nonumber \\ I'(\chi '(R', z', t'), t')& = I(\chi (R, z, t), t) + \delta I (\chi (R, z, t), t) , \nonumber \\ \Phi '_0 (\chi '(R', z', t'), t')& = \Phi _0 (\chi (R, z, t), t) + \delta \Phi _0 (\chi (R, z, t), t) , \end{aligned}$$
(120)

where \(R' = R(\mathbf{x}')\) and \(z' = z(\mathbf{x}')\) represent the R and z coordinates of the position \(\mathbf{x}'\). Then, \(\delta \chi (R, z, t)\), \(\delta I (\chi , t)\), and \(\delta \Phi _0 (\chi , t)\) are written as

$$\begin{aligned} \delta \chi (R, z, t)& = \bar{\delta } \chi (R, z, t) + \delta t_E \; \partial _t \chi + \delta R \; \partial _R \chi + \delta z \; \partial _z \chi , \nonumber \\ \delta I (\chi , t)& = \bar{\delta } I (\chi , t) + \delta t_E \; \partial _t I (\chi , t) + \delta \chi \; \partial _\chi I (\chi , t) , \nonumber \\ \delta \Phi _0 (\chi , t)& = \bar{\delta } \Phi _0 (\chi , t) + \delta t_E \; \partial _t \Phi _0 (\chi , t) + \delta \chi \; \partial _\chi \Phi _0 (\chi , t), \end{aligned}$$
(121)

where

$$\begin{aligned} \bar{\delta } \chi (R, z, t)& = \chi ' (R, z, t) - \chi (R, z, t) , \nonumber \\ \bar{\delta } I (\chi , t)&= I' (\chi , t) - I (\chi , t) , \nonumber \\ \bar{\delta } \Phi _0 (\chi , t)& = \Phi '_0(\chi , t) - \Phi _0 (\chi , t) , \nonumber \\ \delta R& = R' - R , \nonumber \\ \delta z& = z' - z . \end{aligned}$$
(122)
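The relation between the total variation \(\delta\) and the functional-form variation \(\bar{\delta }\) in Eqs. (118) and (121) follows from a first-order Taylor expansion. As an illustrative sketch for \(\phi _1\), using Eqs. (117) and (119),

$$\begin{aligned} \phi '_1 (\mathbf{x}', t')&= \phi '_1 (\mathbf{x} + \delta \mathbf{x}_E , t + \delta t_E ) \simeq \phi '_1 (\mathbf{x}, t) + \delta t_E \; \partial _t \phi _1 + \delta \mathbf{x}_E \cdot \nabla \phi _1 , \nonumber \\ \delta \phi _1 (\mathbf{x}, t)&= \phi '_1 (\mathbf{x}', t') - \phi _1 (\mathbf{x}, t) \simeq \bar{\delta } \phi _1 (\mathbf{x}, t) + \delta t_E \; \partial _t \phi _1 + \delta \mathbf{x}_E \cdot \nabla \phi _1 , \end{aligned}$$

where replacing \(\phi '_1\) by \(\phi _1\) in the derivative terms changes the result only at second order. For example, for a uniform time translation \(\delta t_E = \epsilon\), \(\delta \mathbf{x}_E = 0\) under which the field values are unchanged (\(\delta \phi _1 = 0\)), this gives \(\bar{\delta } \phi _1 = - \epsilon \, \partial _t \phi _1\); here \(\epsilon\) is a hypothetical constant infinitesimal parameter introduced only for this example.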

We also consider the following infinitesimal transformations based on the Lagrangian description using \((\mathbf{Z}_0, t_0 ; t)\) as independent variables:

$$\begin{aligned} t'&= t + \delta t_a (\mathbf{Z}_0, t_0 ; t) , \nonumber \\ \mathbf{Z}'_a (\mathbf{Z}_0, t_0 ; t')& = \mathbf{Z}_a (\mathbf{Z}_0, t_0 ; t) + \delta \mathbf{Z}_a (\mathbf{Z}_0, t_0 ; t) , \end{aligned}$$
(123)

where the Lagrangian variations \(\delta t_a\) and \(\delta \mathbf{X}_a\) are related to the Eulerian variations \(\delta t_E\) and \(\delta \mathbf{x}_E\) by

$$\begin{aligned} \delta t_a (\mathbf{Z}_0, t_0 ; t)&= \delta t_E (\mathbf{X}_a (\mathbf{Z}_0, t_0 ; t), t) , \nonumber \\ \delta \mathbf{X}_a (\mathbf{Z}_0, t_0 ; t)&= \delta \mathbf{x}_E (\mathbf{X}_a (\mathbf{Z}_0, t_0 ; t), t) . \end{aligned}$$
(124)

Similar to Eqs. (118) and (121), \(\delta \mathbf{Z}_a\) is caused by the variation in its functional form and the variation \(\delta t_a\),

$$\begin{aligned} \delta \mathbf{Z}_a (\mathbf{Z}_0, t_0 ; t) = \bar{\delta } \mathbf{Z}_a (\mathbf{Z}_0, t_0 ; t) + \delta t_a \; \partial _t \mathbf{Z}_a (\mathbf{Z}_0, t_0 ; t) , \end{aligned}$$
(125)

where

$$\begin{aligned} \bar{\delta } \mathbf{Z}_a (\mathbf{Z}_0, t_0 ; t) = \mathbf{Z}'_a (\mathbf{Z}_0, t_0 ; t) - \mathbf{Z}_a (\mathbf{Z}_0, t_0 ; t) . \end{aligned}$$
(126)

Recall that, in Sects. 3.1 and 3.2, the governing equations for \(\mathbf{Z}_a\), \(\phi _1\), \(\mathbf{A}_1\), \(\Phi _0\), \(\mathbf{A}_0\), \(\lambda\), \(\alpha\), and \({\varvec{\Lambda }}\) are derived as Euler–Lagrange equations from \(\delta \mathcal{I} = 0\) by considering only the variations in the functional forms which are assumed to vanish on the integral boundaries. On the other hand, when the variations given by Eqs. (117)–(125) are taken about the solutions of the Euler–Lagrange equations, the variation \(\delta \mathcal{I}\) of the action integral does not generally vanish but it is written as

$$\begin{aligned} \delta \mathcal{I} = \delta \mathcal{I}_p + \delta \mathcal{I}_{pf} + \delta \mathcal{I}_f , \end{aligned}$$
(127)

where

$$\begin{aligned} \delta \mathcal{I}_p&= \sum _a \int {\text {d}}t \int {\text {d}}^6 Z_{a0} D_a(\mathbf{Z}_{a0}, t_0) F_a(\mathbf{Z}_{a0}, t_0) \frac{\partial }{\partial t} \left( L_a \delta t_a + \frac{\partial L_a }{\partial (\partial _t \mathbf{Z}_a)} \cdot \overline{\delta } \mathbf{Z}_a \right) \nonumber \\& = \sum _a \int {\text {d}}t \int {\text {d}}^6 Z_{a0} D_a(\mathbf{Z}_{a0}, t_0) F_a(\mathbf{Z}_{a0}, t_0) \nonumber \\& \qquad \times \frac{\partial }{\partial t} \left[ \left( L_a - \frac{\partial L_a }{\partial (\partial _t \mathbf{Z}_a)} \cdot \partial _t \mathbf{Z}_a \right) \delta t_a + \frac{\partial L_a }{\partial (\partial _t \mathbf{Z}_a)} \cdot \delta \mathbf{Z}_a \right] , \end{aligned}$$
(128)
$$\begin{aligned}&\delta \mathcal{I}_{pf} =\sum _a \int {\text {d}}t \int d^6 Z \; \nabla \cdot \left[ D_a(\mathbf{Z}, t) F_a(\mathbf{Z}, t) \left\{ \overline{\delta } \mathbf{A}_0 \times \left( -\mu \mathbf{b} \left[ 1+ \frac{1}{B_0 \Omega _{a0}} \left( \frac{2 V^\zeta }{R} \nabla R \cdot \nabla \chi \right. \right. \right. \right. \right. \nonumber \\&\qquad \qquad \left. \left. \left. + \frac{|\nabla \chi |^2}{2} \frac{\partial V^\zeta }{\partial \chi } \right) \right] + \frac{m_a U}{B_0} ({\mathbf{v}}_a^\mathrm{(gc)} - U\mathbf{b} - {\mathbf{V}}_0 )_\perp - \mathbf{N}_a \right) - \overline{\delta } \Phi _0 \frac{e_a \mathbf{b}}{\Omega _{a0}} \nonumber \\&\qquad \qquad \times \left( {\mathbf{v}}_a^\mathrm{(gc)} - {\mathbf{V}}_0 - \frac{e_a}{m_a} \frac{\partial \Psi _a}{\partial {\mathbf{V}}_0} \right) - \left( \overline{\delta } \Phi _0 + \frac{\partial \Phi _0}{\partial \chi } \overline{\delta } \chi \right) \frac{c \mu }{\Omega _{a0}} \frac{2}{R} \nabla R \nonumber \\&\qquad \qquad \left. \left. - \left( \frac{1}{2} \frac{\partial }{\partial \chi } \overline{\delta } \Phi _0 + \frac{\partial ^2 \Phi _0}{\partial \chi ^2} \overline{\delta } \chi \right) \frac{c \mu }{\Omega _{a0}} \nabla \chi \right\} + \overline{\delta } \Phi _0 \frac{\nabla \chi }{|\nabla \chi |^2} \nabla \cdot \left( D_a F_a \frac{c \mu }{2 \Omega _{a0}} \nabla \chi \right) - \delta \mathbf{R}_a \right] , \end{aligned}$$
(129)

and

$$\begin{aligned} \delta \mathcal{I}_f&= \int {\text {d}}t \int {\text {d}}^3 \mathbf{x} \; \left[ \frac{\partial }{\partial t} \left( \mathcal{L}_f \delta t_E \right) + \nabla \cdot \left\{ \mathcal{L}_f \delta \mathbf{x}_E + \frac{\partial \mathcal{L}_f}{\partial ( \nabla \Phi _0 )} \left( \overline{\delta } \Phi _0 + \frac{\partial \Phi _0}{\partial \chi } \overline{\delta } \chi \right) \right. \right. \nonumber \\&\qquad \left. \left. + \frac{\partial \mathcal{L}_f}{\partial ( \nabla \phi _1 )} \overline{\delta } \phi _1 + \sum _{n=0}^1 \sum _{k=1}^3 \frac{\partial \mathcal{L}_f}{\partial ( \nabla A_{n k} )} \overline{\delta } A_{n k} + \frac{\partial \mathcal{L}_f}{\partial ( \nabla \chi )} \overline{\delta } \chi \right\} \right] \nonumber \\&= \int {\text {d}}t \int {\text {d}}^3 \mathbf{x} \; \left[ \frac{\partial }{\partial t} \left( \mathcal{L}_f \delta t_E \right) + \nabla \cdot \left\{ - \left( \frac{\partial \mathcal{L}_f}{\partial ( \nabla \phi _1 )} \partial _t ( \Phi _0 + \phi _1 ) \right. \right. \right. \nonumber \\&\qquad \left. + \sum _{n=0}^1 \sum _{k=1}^3 \frac{\partial \mathcal{L}_f}{\partial ( \nabla A_{nk} )} \partial _t A_{nk} + \frac{\partial \mathcal{L}_f}{\partial ( \nabla \chi )} \partial _t \chi \right) \delta t_E + \mathcal{L}_f \delta \mathbf{x}_E \nonumber \\&\qquad - \left( \frac{\partial \mathcal{L}_f}{\partial ( \nabla \phi _1 )} \nabla ( \Phi _0 + \phi _1 ) + \sum _{n=0}^1 \sum _{k=1}^3 \frac{\partial \mathcal{L}_f}{\partial ( \nabla A_{nk} )} \nabla A_{nk} +\frac{\partial \mathcal{L}_f}{\partial ( \nabla \chi )} \nabla \chi \right) \cdot \delta \mathbf{x}_E \nonumber \\&\qquad \left. \left. + \frac{\partial \mathcal{L}_f}{\partial ( \nabla \phi _1 )} \delta ( \Phi _0 + \phi _1 ) + \sum _{n=0}^1 \sum _{k=1}^3 \frac{\partial \mathcal{L}_f}{\partial ( \nabla A_{n k} )} \delta A_{n k} + \frac{\partial \mathcal{L}_f}{\partial ( \nabla \chi )} \delta \chi \right\} \right] . \end{aligned}$$
(130)

On the right-hand side of Eq. (129), \(\delta \mathbf{R}_a\) is associated with effects of finite gyroradii [see Eq. (85)] on electromagnetic fields and defined by

$$\begin{aligned} \delta \mathbf{R}_a&= e_a \sum _{n=1}^\infty \frac{1}{n!} \sum _{i_1 = 1}^\infty \cdots \sum _{i_{n-1} = 1}^\infty \left[ D_a F_a^* {\varvec{\rho }}_a \rho _{ai_1} \cdots \rho _{ai_{n-1}} \frac{\partial ^{n-1} \overline{\delta } \psi _a}{\partial X_{i_1} \cdots \partial X_{i_{n-1}}} \right. \nonumber \\& \qquad - \frac{\partial ( D_a F_a^* {\varvec{\rho }}_a \rho _{ai_1} \cdots \rho _{ai_{n-1}} )}{\partial X_{i_1}} \frac{\partial ^{n-2} \overline{\delta } \psi _a}{\partial X_{i_2} \cdots \partial X_{i_{n-1}}} \nonumber \\& \qquad + \cdots + (-1)^{n-1} \frac{\partial ^{n-1} ( D_a F_a^* {\varvec{\rho }}_a \rho _{ai_1} \cdots \rho _{ai_{n-1}} )}{\partial X_{i_1} \cdots \partial X_{i_{n-1}}} \overline{\delta } \psi _a \nonumber \\& \qquad + \frac{e_a}{m_a c^2} \left( D_a F_a {\varvec{\rho }}_a \rho _{ai_1} \cdots \rho _{ai_{n-1}} \frac{\partial ^{n-1} (\mathbf{A}_1 \cdot \overline{\delta } \mathbf{A}_1)}{\partial X_{i_1} \cdots \partial X_{i_{n-1}}} \right. \nonumber \\& \qquad - \frac{\partial ( D_a F_a {\varvec{\rho }}_a \rho _{ai_1} \cdots \rho _{ai_{n-1}} )}{\partial X_{i_1}} \frac{\partial ^{n-2} (\mathbf{A}_1 \cdot \overline{\delta } \mathbf{A}_1)}{\partial X_{i_2} \cdots \partial X_{i_{n-1}}} + \cdots \nonumber \\& \qquad \left. \left. + \, (-1)^{n-1} \frac{\partial ^{n-1} ( D_a F_a {\varvec{\rho }}_a \rho _{ai_1} \cdots \rho _{ai_{n-1}} )}{\partial X_{i_1} \cdots \partial X_{i_{n-1}}} (\mathbf{A}_1 \cdot \overline{\delta } \mathbf{A}_1) \right) \right] , \end{aligned}$$
(131)

where \(\overline{\delta } \psi _a\) is given by

$$\begin{aligned} \overline{\delta } \psi _a = \overline{\delta } \phi _1(\mathbf{X}_a, t) - \frac{1}{c} [ {\mathbf{V}}_0 (\mathbf{X}_a, t) + U_a \mathbf{b} (\mathbf{X}_a, t) + ({\mathbf{v}}'_c)_\perp (\mathbf{X}_a, \xi _a, t)] \cdot \overline{\delta } \mathbf{A}_1 (\mathbf{X}_a, t) . \end{aligned}$$
(132)

We recall that the conservation of the magnetic moment \(\mu\) results from the invariance under the variation of the gyrophase \(\xi _a\). However, to prepare for deriving the conservation laws of energy and toroidal momentum in the following subsections, we hereafter consider the case in which \(\delta \xi _a = 0\) and accordingly \(\partial L_a / \partial (\partial _t \mathbf{Z}_a) \cdot \delta \mathbf{Z}_a = \partial L_a / \partial (\partial _t \mathbf{X}_a) \cdot \delta \mathbf{X}_a\) in Eq. (128). Then, using Eqs. (128)–(130), we can rewrite Eq. (127) as

$$\begin{aligned} \delta \mathcal{I} = - \int {\text {d}}t \int {\text {d}}^3 \mathbf{X} \; \left[ \frac{\partial }{\partial t}\delta G_0 (\mathbf{X}, t) + \nabla \cdot \delta \mathbf{G} (\mathbf{X}, t) \right] , \end{aligned}$$
(133)

with the functions \(\delta G_0\) and \(\delta \mathbf{G}\) defined by

$$\begin{aligned} \delta G_0 (\mathbf{X}, t)&= \mathcal{E}_c \; \delta t_E - \mathbf{P}_c \cdot \delta \mathbf{x}_E , \nonumber \\ \delta \mathbf{G} (\mathbf{X}, t)&= \mathbf{Q}_c \; \delta t_E - {\varvec{\Pi }}_c \cdot \delta \mathbf{x}_E + \mathbf{S}_{\phi _1} \; \delta \phi _1 + \mathbf{S}_{\Phi _0} \; \delta \Phi _0 + \mathbf{S}_{V\zeta } \; \delta V^\zeta \nonumber \\&- {\varvec{\Sigma }}_{A1} \cdot \delta \mathbf{A}_1 - {\varvec{\Sigma }}_{A0} \cdot \delta \mathbf{A}_0 + \mathbf{S}_\chi \delta \chi + \delta \mathbf{T}, \end{aligned}$$
(134)

where

$$\begin{aligned} \mathcal{E}_c&= \sum _a \int {\text {d}}^3 v^\mathrm{(gc)}F_a H_a + \frac{1}{8\pi } \left( - |\nabla ( \Phi _0 + \phi _1 ) |^2 + |\mathbf{B}_0 + \mathbf{B}_1 |^2 \right) , \\ \mathbf{P}_c&= \sum _a \int {\text {d}}^3 v^\mathrm{(gc)} F_a \left( \frac{e_a}{c} \mathbf{A}_0 + m_a ( U \mathbf{b} + {\mathbf{V}}_0 ) \right) , \end{aligned}$$
$$\begin{aligned} \mathbf{Q}_c&= \sum _a \int {\text {d}}^3 v^\mathrm{(gc)} F_a \left[ H_a {\mathbf{v}}_a^\mathrm{(gc)} + \frac{\partial \mathbf{A}_0}{\partial t} \times \left\{ -\mu \mathbf{b} \left( 1+ \frac{1}{B_0 \Omega _{a0}} \left[ \frac{2 V^\zeta }{R} \nabla R \cdot \nabla \chi \right. \right. \right. \right. \\& \qquad \left. \left. \left. \left. + \frac{|\nabla \chi |^2}{2} \frac{\partial V^\zeta }{\partial \chi } \right] \right) + \frac{m_a U}{B_0} ({\mathbf{v}}_a^\mathrm{(gc)} - U\mathbf{b} - {\mathbf{V}}_0 )_\perp - \mathbf{N}_a \right\} \right. \\& \qquad - \left( \frac{\partial \Phi _0}{\partial t} \right) _\chi \frac{e_a \mathbf{b}}{\Omega _{a0}} \times \left( {\mathbf{v}}_a^\mathrm{(gc)} - \mathbf{V}_0 - \frac{e_a}{m_a} \frac{\partial \Psi _a}{\partial {\mathbf{V}}_0 } \right) - \frac{\partial \Phi _0}{\partial t} \frac{c \mu }{\Omega _{a0}} \frac{2}{R} \nabla R \\& \qquad \left. + \frac{\mu }{\Omega _{a0}} \left( \frac{1}{2} \frac{\partial V^\zeta }{\partial t} + \frac{\partial \chi }{\partial t} \frac{\partial V^\zeta }{\partial \chi } \right) \nabla \chi \right] + \left( \frac{\partial \Phi _0}{\partial t} \right) _\chi \frac{\nabla \chi }{|\nabla \chi |^2} \nabla \cdot \left( \sum _a \int d^3 v^\mathrm{(gc)} F_a \frac{c \mu }{2 \Omega _{a0}} \nabla \chi \right) \\& \qquad + \frac{1}{4\pi } \frac{\partial ( \Phi _0 + \phi _1 )}{\partial t} \nabla ( \Phi _0 + \phi _1 ) - \frac{1}{4\pi } \frac{\partial (\mathbf{A}_0 + \mathbf{A}_1 )}{\partial t} \times ( \mathbf{B}_0 + \mathbf{B}_1 ) \\& \qquad + \frac{1}{4\pi c} \left( \lambda \frac{\partial \mathbf{A}_1}{\partial t} + \alpha \frac{\partial \mathbf{A}_0}{\partial t} \right) - \frac{1}{4\pi }{\varvec{\Lambda }} \times \left( \frac{\partial \mathbf{A}_0}{\partial t} + \frac{\partial \chi }{\partial t} \nabla \zeta \right) , \end{aligned}$$
$$\begin{aligned} {\varvec{\Pi }}_c & = \sum _a \int {\text {d}}^3 v^\mathrm{(gc)} F_a \left[ {\mathbf{v}}_a^\mathrm{(gc)} \left( m_a U \mathbf{b} + \frac{e_a}{c} \mathbf{A}_0 \right) + \left\{ -\mu \mathbf{b} \left( 1+ \frac{1}{B_0 \Omega _{a0}} \left[ \frac{2 V^\zeta }{R} \nabla R \cdot \nabla \chi \right. \right. \right. \right. \\& \qquad \left. \left. \left. + \frac{|\nabla \chi |^2}{2} \frac{\partial V^\zeta }{\partial \chi } \right] \right) + \frac{m_a U}{B_0} ({\mathbf{v}}_a^\mathrm{(gc)} - U\mathbf{b} - {\mathbf{V}}_0 )_\perp - \mathbf{N}_a \right\} \times (\nabla \mathbf{A}_0 )^T \\& \qquad \left. - \frac{\mu }{\Omega _{a0}} \left( \frac{2V^\zeta }{R}\nabla R + \frac{\partial V^\zeta }{\partial \chi } \nabla \chi \right) \nabla \chi \right] + \frac{1}{8\pi } \left( |\nabla (\Phi _0 + \phi _1)|^2 - | \mathbf{B}_0 + \mathbf{B}_1 |^2 \right) \mathbf{I} \\& \qquad + \frac{1}{4\pi } \left[ - \left( \nabla (\Phi _0 + \phi _1)\right) \left( \nabla (\Phi _0 + \phi _1) \right) + \left( \nabla (\mathbf{A}_0 + \mathbf{A}_1) \right. \right. \\& \qquad - \left( \nabla (\mathbf{A}_0 + \mathbf{A}_1))^T \right) \cdot \left( \nabla (\mathbf{A}_0 + \mathbf{A}_1)\right) ^T - \frac{\lambda }{c} \left( \nabla \mathbf{A}_1\right) ^T - \frac{\alpha }{c} \left( \nabla \mathbf{A}_0\right) ^T \\& \qquad \left. + \, {\varvec{\Lambda }} \times \left( (\nabla \mathbf{A}_0)^T + (\nabla \zeta ) (\nabla \chi ) \right) \right] , \end{aligned}$$
$$\begin{aligned} \mathbf{S}_{\phi _1}& = - \frac{1}{4\pi } \nabla (\Phi _0 + \phi _1) , \end{aligned}$$
$$\begin{aligned} \mathbf{S}_{\Phi _0}& = \sum _a \int d^3 v^\mathrm{(gc)} F_a \left[ \frac{e_a}{\Omega _{a0}} \mathbf{b} \times \left( {\mathbf{v}}_a^\mathrm{(gc)} - \mathbf{V}_0 - \frac{e_a}{m_a} \frac{\partial \Psi _a}{\partial {\mathbf{V}}_0 } \right) + \frac{c \mu }{\Omega _{a0}} \frac{2}{R} \nabla R \right] \\& \qquad - \frac{\nabla \chi }{|\nabla \chi |^2} \nabla \cdot \left( \sum _a \int d^3 v^\mathrm{(gc)} F_a \frac{c \mu }{2 \Omega _{a0}} \nabla \chi \right) - \frac{1}{4\pi } \nabla (\Phi _0 + \phi _1) , \end{aligned}$$
$$\begin{aligned} \mathbf{S}_{V\zeta }&= - \sum _a \int d^3 v^\mathrm{(gc)} F_a \frac{\mu }{2 \Omega _{a0}} \nabla \chi , \nonumber \\ {\varvec{\Sigma }}_{A1}&= \frac{1}{4\pi } \left( (\mathbf{B}_0 + \mathbf{B}_1) \times \mathbf{I} + \frac{\lambda }{c} \mathbf{I} \right) , \nonumber \\ {\varvec{\Sigma }}_{A0}&= \sum _a \int {\text {d}}^3 v^\mathrm{(gc)} F_a \left\{ \mu \mathbf{b} \left( 1+ \frac{1}{B_0 \Omega _{a0}} \left[ \frac{2 V^\zeta }{R} \nabla R \cdot \nabla \chi + \frac{|\nabla \chi |^2}{2} \frac{\partial V^\zeta }{\partial \chi } \right] \right) \right. \nonumber \\& \qquad \left. - \frac{m_a U}{B_0} ({\mathbf{v}}_a^\mathrm{(gc)})_\perp + \mathbf{N}_a \right\} \times \mathbf{I} + \frac{1}{4\pi } \left( ( \mathbf{B}_0 + \mathbf{B}_1 - {\varvec{\Lambda }} )\times \mathbf{I} + \frac{\alpha }{c} \mathbf{I} \right) , \nonumber \\ \mathbf{S}_\chi&= \sum _a \int {\text {d}}^3 v^\mathrm{(gc)} F_a \left[ V^\zeta \frac{m_a}{B_0} \mathbf{b} \times \left( {\mathbf{v}}_a^\mathrm{(gc)} - {\mathbf{V}}_0 - \frac{e_a}{m_a} \frac{\partial \Psi _a}{\partial {\mathbf{V}}_0 } \right) - \frac{\mu }{2\Omega _{a0}} \frac{\partial V^\zeta }{\partial \chi } \nabla \chi \right] \nonumber \\& \qquad - V^\zeta \frac{\nabla \chi }{|\nabla \chi |^2} \nabla \cdot \left( \sum _a \int {\text {d}}^3 v^\mathrm{(gc)} F_a \frac{c \mu }{2 \Omega _{a0}} \nabla \chi \right) + \frac{1}{4\pi } {\varvec{\Lambda }} \times \nabla \zeta , \nonumber \\ \delta \mathbf{T}&= \sum _a \int {\text {d}} U \int {\text {d}} \mu \int {\text {d}} \xi \; \delta \mathbf{R}_a . \end{aligned}$$
(135)

Here, the superscript T represents the transpose of the tensor, and \(\mathbf{I}\) denotes the unit tensor. Comparing the variation \(\delta \mathcal{I}\) of the gyrokinetic action integral shown in Eqs. (133)–(135) and the similar expression of \(\delta \mathcal{I}\) given in Sugama et al. (2013) for the Vlasov–Poisson–Ampère system, we find that more complicated terms appear in \(\delta G_0\) and \(\delta \mathbf{G}\) in the present system due to effects of the finite gyroradii and the new variational fields included for separately determining the turbulent and background electromagnetic fields.

It should be noted that Eq. (133) is derived using the Euler–Lagrange equations shown in Sects. 3.1 and 3.2, of which Eq. (76) requires the integral over the flux surface. Thus, given any spatial point in the integral domain of Eq. (133), the flux surface including the point should be wholly contained in the integral domain in order for Eq. (133) to be valid. If the variations in the variables are such that \(\delta \mathcal{I} = 0\) holds for an arbitrary spatiotemporal integral domain represented by \([t_1, t_2] \times [s_1, s_2]\), where \([s_1, s_2]\) represents the spatial volume region sandwiched between two flux surfaces labeled by \(s_1\) and \(s_2\), then the conservation law is derived as

$$\begin{aligned}&\left\langle \frac{\partial }{\partial t}\delta G_0 (\mathbf{X}, t) + \nabla \cdot \delta \mathbf{G} (\mathbf{X}, t) \right\rangle \nonumber \\&\quad = \left\langle \frac{\partial }{\partial t}\delta G_0 (\mathbf{X}, t) \right\rangle + \frac{1}{V'}\frac{\partial }{\partial s} \left( V' \left\langle \delta \mathbf{G} \cdot \nabla s \right\rangle \right) = 0. \end{aligned}$$
(136)

This is Noether’s theorem for the present gyrokinetic system. Here, we use flux coordinates \((s, \theta , \zeta )\), where s denotes an arbitrary radial coordinate to label flux surfaces so that \(\chi\) is written as a function \(\chi = \chi (s, t)\). The volume enclosed by the flux surface with the label s at the time t is denoted by \(V(s, t)\) and its radial derivative is represented by \(V' \equiv \partial V/\partial s\). Under the nonstationary background field \(\mathbf{B}_0\), flux surfaces may change their shapes, and the grid of the flux coordinates moves. Then, the grid velocity [see Eq. (2.36) in Hirshman and Sigmar (1981)] is given by

$$\begin{aligned} \mathbf{u}_s = \frac{\partial \mathbf{x} (s, \theta , \zeta , t)}{\partial t} , \end{aligned}$$
(137)

and we obtain the following formula:

$$\begin{aligned} \left\langle \frac{\partial }{\partial t}\delta G_0 \right\rangle = \frac{1}{V'} \left[ \frac{\partial }{\partial t} \left( V' \left\langle \delta G_0 \right\rangle \right) - \frac{\partial }{\partial s} \left( V' \left\langle \delta G_0 \mathbf{u}_s \cdot \nabla s \right\rangle \right) \right] , \end{aligned}$$
(138)

where on the right-hand side, the partial derivatives \(\partial /\partial t\) and \(\partial /\partial s\) act on functions of \((s, t)\) obtained after taking the flux-surface average, while on the left-hand side, the partial time derivative \(\partial /\partial t\) is taken with fixed \(\mathbf{X}\) before the flux-surface average [see Eq. (2.35) in Hirshman and Sigmar (1981)]. On the right-hand side of Eq. (138), \(\mathbf{u}_s \cdot \nabla s\) represents the radial velocity of the flux surface and the last term gives the correction due to the radial surface motion for evaluating the surface-averaged rate of change. Substituting Eq. (138) into Eq. (136), we obtain

$$\begin{aligned} \frac{\partial }{\partial t} \left( V' \left\langle \delta G_0 \right\rangle \right) + \frac{\partial }{\partial s} \left( V' \left\langle \left( \delta \mathbf{G} - \delta G_0 \mathbf{u}_s \right) \cdot \nabla s \right\rangle \right) = 0 . \end{aligned}$$
(139)
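
The substitution can be spelled out in one step. Multiplying the second line of Eq. (136) by \(V'\) and inserting Eq. (138) for the time-derivative term gives

$$\begin{aligned} 0&= V' \left\langle \frac{\partial }{\partial t}\delta G_0 \right\rangle + \frac{\partial }{\partial s} \left( V' \left\langle \delta \mathbf{G} \cdot \nabla s \right\rangle \right) \nonumber \\&= \frac{\partial }{\partial t} \left( V' \left\langle \delta G_0 \right\rangle \right) - \frac{\partial }{\partial s} \left( V' \left\langle \delta G_0 \mathbf{u}_s \cdot \nabla s \right\rangle \right) + \frac{\partial }{\partial s} \left( V' \left\langle \delta \mathbf{G} \cdot \nabla s \right\rangle \right) , \end{aligned}$$

and combining the two radial derivatives reproduces Eq. (139). The flux \(\delta \mathbf{G} - \delta G_0 \mathbf{u}_s\) is thus the flux measured relative to the moving flux surface.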

If the infinitesimal transformations of the variables \((\delta t_E, \delta \mathbf{x}_E, \ldots )\) in Eq. (134) are regarded as arbitrary functions of \((\mathbf{X}, t)\), the general expressions of \(\delta G_0\) and \(\delta \mathbf{G}\) given by Eqs. (134) and (135) do not seem to be invariant under the gauge transformation defined in Eq. (63) with \(\partial \mathcal{S}_0 /\partial t = \nabla ^2 \mathcal{S}_0 = \partial \mathcal{S}_0 / \partial \zeta = 0\). However, when \((\delta t_E, \delta \mathbf{x}_E, \ldots )\) represent symmetric infinitesimal transformations for which \(\delta G_0\) and \(\delta \mathbf{G}\) take the specific functional forms truly satisfying the conservation laws as shown in Sects. 4.5 and 4.6, they are gauge-invariant. For example, the canonical momentum \(\mathbf{P}_c\) per unit volume defined in Eq. (135) can obviously change its value under the gauge transformation, Eq. (63), whereas its toroidal component \(\mathbf{P}_c \cdot \mathbf{e}_\zeta\), which appears in the toroidal momentum conservation law, is invariant because \(\partial \mathcal{S}_0/\partial \zeta = 0\). In this way, the parts of the expressions in Eq. (135) which are not gauge-invariant do not contribute to the energy and toroidal momentum conservation laws, in which all expressions are gauge-invariant.

4 Collisional systems

We now consider the gyrokinetic Boltzmann equation for the distribution function \(F_a (\mathbf{Z}, t)\) for species a,

$$\begin{aligned} \left( \frac{\partial }{\partial t} + \frac{d \mathbf{Z}_a}{dt} \cdot \frac{\partial }{\partial \mathbf{Z}} \right) F_a (\mathbf{Z}, t) = \sum _b C_{ab}^g [F_a, F_b] (\mathbf{Z}, t) + \mathcal{S}_a (\mathbf{Z}, t) , \end{aligned}$$
(140)

where \(C_{ab}^g [F_a, F_b] (\mathbf{Z}, t)\) represents the rate of change in \(F_a(\mathbf{Z}, t)\) due to Coulomb collisions between particle species a and b, and \(\mathcal{S}_a (\mathbf{Z}, t)\) denotes other parts including external particle, momentum, and/or energy sources if any. Section 4.1 shows how the collision operator \(C_{ab}^g [F_a, F_b]\) for the gyrocenter distribution functions \(F_a\) and \(F_b\) is obtained from the collision operator \(C_{ab}^p [f_a, f_b]\) for the particle distribution functions \(f_a\) and \(f_b\).

In early works on the recursive derivation of the gyrokinetic equation for the perturbed distribution function (conventionally denoted by \(\delta f\)), the gyrokinetic collision operator \(C_{ab}^g\) is derived from the collision operator \(C_{ab}^p\) for the particle distribution function by including the effect of the finite gyroradius \(({\varvec{\rho }})\) to the lowest order in the WKB or perpendicular wavenumber \((\mathbf{k}_\perp )\) representation, where the factor \(\exp ( i \mathbf{k}_\perp \cdot {\varvec{\rho }} )\) appears as a difference between the perturbed gyrocenter and particle distribution functions (Antonsen and Lane 1980; Catto and Tsang 1977). We should note that conservation of particle number, momentum, and energy in two-body Coulomb collisions, as well as Boltzmann’s H-theorem (or positive definiteness of entropy production), is conventionally described using \(C_{ab}^p\) and velocity-space integrals at a fixed particle position \(\mathbf{x}\) as

$$\begin{aligned}&\int {\text {d}}^3 v \; C_{ab}^p [f_a, f_b] = 0, \nonumber \\&\int {\text {d}}^3 v \; C_{ab}^p [f_a, f_b] m_a {\mathbf{v}} + \int {\text {d}}^3 v \; C_{ba}^p [f_b, f_a] m_b {\mathbf{v}}= 0, \nonumber \\&\int {\text {d}}^3 v \; C_{ab}^p [f_a, f_b] \frac{1}{2} m_a v^2 + \int {\text {d}}^3 v \; C_{ba}^p [f_b, f_a] \frac{1}{2} m_b v^2 = 0, \nonumber \\&\int {\text {d}}^3 v \; C_{ab}^p [f_a, f_b] \log f_a + \int {\text {d}}^3 v \; C_{ba}^p [f_b, f_a] \log f_b \le 0. \end{aligned}$$
(141)

In the last line of Eq. (141), the equality is attained if and only if \(f_a\) and \(f_b\) have the Maxwellian forms with the same temperature and the same mean flow velocity. The expressions given in terms of the particle phase-space coordinates \((\mathbf{x}, {\mathbf{v}})\) in Eq. (141) can be formally rewritten in the gyrocenter coordinates \(\mathbf{Z} \equiv (\mathbf{X}, U, \mu , \xi )\) by replacing \(\int {\text {d}}^3 v \, C_{ab}^p [f_a, f_b] \cdots\) with \(\int {\text {d}}^3 X {\text {d}}^3 v^\mathrm{(gc)} \, \delta [ \mathbf{x} - \mathbf{x}(\mathbf{Z}) ] C_{ab}^g [F_a, F_b] \cdots\), where \(\mathbf{x}(\mathbf{Z}) = \mathbf{X} + {\varvec{\rho }} (\mathbf{Z})\) represents the particle position as a function of the gyrocenter coordinates \(\mathbf{Z}\). Thus, we see that the conservation laws and Boltzmann’s H-theorem, which are represented as local properties in the particle position space, are no longer local in the gyrocenter position space because of the finite-gyroradius effect.
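
The four properties in Eq. (141) can be checked numerically on a toy problem. The sketch below is only an illustration and does not use the Landau operator: it assumes a single species in one velocity dimension with unit mass and a BGK-type relaxation operator \(C[f] = \nu (f_M - f)\), where \(f_M\) is the Maxwellian sharing the density, mean velocity, and temperature of \(f\). The grid, the two-beam test distribution, and all names in the code are hypothetical choices made for this example.

```python
import math

# Uniform velocity grid, wide and fine enough that Gaussian tails and
# quadrature errors are negligible (assumed values, for illustration only).
DV = 0.05
V = [-12.0 + DV * i for i in range(481)]


def gaussian(v, n, u, T):
    """1D Maxwellian with density n, mean velocity u, temperature T (mass = 1)."""
    return n / math.sqrt(2.0 * math.pi * T) * math.exp(-(v - u) ** 2 / (2.0 * T))


def integrate(values):
    """Riemann sum over the velocity grid."""
    return DV * sum(values)


# A non-Maxwellian two-beam distribution used as the test function f.
f = [gaussian(v, 0.6, -1.0, 0.5) + gaussian(v, 0.4, 1.5, 0.3) for v in V]

# Density, mean velocity, and temperature of f.
n = integrate(f)
u = integrate([fi * v for fi, v in zip(f, V)]) / n
T = integrate([fi * (v - u) ** 2 for fi, v in zip(f, V)]) / n

# BGK model operator: relax f toward the Maxwellian with the same moments.
NU = 1.0
f_M = [gaussian(v, n, u, T) for v in V]
C = [NU * (fm - fi) for fm, fi in zip(f_M, f)]

# Velocity moments corresponding to the four lines of Eq. (141).
particle_moment = integrate(C)
momentum_moment = integrate([c * v for c, v in zip(C, V)])
energy_moment = integrate([c * 0.5 * v * v for c, v in zip(C, V)])
entropy_moment = integrate([c * math.log(fi) for c, fi in zip(C, f)])

print(particle_moment, momentum_moment, energy_moment, entropy_moment)
```

The first three moments vanish up to quadrature and round-off error, while the last one is negative because \(f\) is not Maxwellian. All four integrals are taken at a fixed position, which is exactly the locality that is lost when the same statements are rewritten in gyrocenter coordinates.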

A rigorous model for \(C_{ab}^p\) such as the Landau operator [Eq. (161)] satisfies the properties shown in Eq. (141). However, for practical applications to analytical and numerical calculations, several approximate collision operators simpler than the full nonlinear Landau operator are employed. For example, the collision operator is normally linearized in the gyrokinetic equation for \(\delta f\), where terms of \({\mathcal {O}} [(\delta f)^2]\) are neglected; the linearized collision operator still retains the properties associated with the conservation laws and the entropy production. Several approximate models have been proposed to construct the linearized gyrokinetic collision operators (Madsen 2013; Catto and Tsang 1977; Xu and Rosenbluth 1991; Lin et al. 1995; Wang et al. 1999; Abel et al. 2008; Sugama et al. 2009), which can be applied to neoclassical and turbulent transport simulations.

In the modern gyrokinetic theory using the Lie-transform technique, the rigorous nonlinear gyrokinetic collision operator for the full distribution function (often denoted by full-F) is derived in principle from the Landau particle collision operator by transforming the particle coordinates to the gyrocenter coordinates while keeping terms of all orders in the perturbative expansion. In Brizard (2004), the reduced gyrokinetic collision operator, which can correctly describe classical transport processes, is presented based on the Lie-transform perturbation formalism. However, conservation laws are not exactly satisfied when the perturbative expansion is truncated inappropriately. In Sugama et al. (2015), the second-order terms in the truncated perturbative expansion of the gyrokinetic collision operator are modified so as to keep energy and momentum conservation. Another prescription for deriving the collision operator with the conservation properties is shown in Burby et al. (2015), where the gyrokinetic collision operator is elegantly represented making full use of Poisson brackets. The original idea of writing the collision operator in terms of Poisson brackets is found in Brizard (2004). The Poisson bracket formulation of the collision operator is also employed in Sect. 4.2 to obtain the collisional gyrokinetic equation for toroidally rotating plasmas, from which conservation laws of particles, energy, and toroidal momentum as well as positive definiteness of entropy production are systematically derived.

Let us return to the gyrokinetic Boltzmann equation, Eq. (140), in which the deviation of each distribution function from the local Maxwellian is regarded as of \({\mathcal {O}}(\delta )\), and accordingly the collision term \(C_{ab}^g\) is considered to be of \({\mathcal {O}}(\delta )\). We assume that the source term \(\mathcal{S}_a\) is of \({\mathcal {O}}(\delta ^2)\) so that its effect appears only in the transport time scale. To prevent the source term from affecting the charge conservation laws [see Eq. (182)], we also assume that

$$\begin{aligned} \sum _a e_a \int {\text {d}}^3 v^\mathrm{(gc)} \mathcal{S}_a (\mathbf{Z}, t) = 0 . \end{aligned}$$
(142)

We find that the gyrophase-dependent part of the right-hand side of Eq. (140) appears from \(C_{ab}^g\) and it is of \({\mathcal {O}}(\delta )\). Using \(\Omega _a = {\mathcal {O}}(\delta ^{-1})\), the gyrophase-dependent part of the left-hand side of Eq. (140) is written as \(\Omega _a \partial \widetilde{F}_a/\partial \xi\) to the lowest order in \(\delta\). Then, it is concluded that \(\widetilde{F}_a = {\mathcal {O}} (\delta ^2)\). Taking the gyrophase average of Eq. (140), we obtain

$$\begin{aligned} \left( \frac{\partial }{\partial t} + \frac{d \mathbf{Z}_a}{dt} \cdot \frac{\partial }{\partial \mathbf{Z}} \right) F_a (\mathbf{Z}, t) = \mathcal{K}_a (\mathbf{Z}, t) , \end{aligned}$$
(143)

where \(\mathcal{K}_a\) is the gyrophase-independent function given by

$$\begin{aligned} \mathcal{K}_a (\mathbf{Z}, t) = \sum _b \langle C_{ab}^g [F_a, F_b] (\mathbf{Z}, t) \rangle _\xi + \mathcal{S}_a (\mathbf{Z}, t) . \end{aligned}$$
(144)

Here, \(F_a (\mathbf{Z}, t)\) and \(\mathcal{S}_a (\mathbf{Z}, t)\) are both regarded as independent of the gyrophase \(\xi\), and \(\langle \cdots \rangle _\xi\) is omitted on them for simplicity. It is seen from Eq. (153) that effects of \(\widetilde{F}_a = {\mathcal {O}} (\delta ^2)\) on \(\langle C_{ab}^g [F_a, F_b] \rangle _\xi\) on the right-hand side of Eq. (143) are estimated as of \({\mathcal {O}} (\delta ^3)\). Hereafter, we neglect \(\widetilde{F}_a = {\mathcal {O}} (\delta ^2)\) on both sides of the gyrokinetic Boltzmann equation given by Eq. (143). Even so, its moment equations can correctly include the collisional transport fluxes of particles, energy, and toroidal momentum up to the leading order, that is, \({\mathcal {O}}(\delta ^2)\), as confirmed later. Section 4.2 presents the approximate gyrokinetic collision operator, which has favorable conservation properties and correctly describes collisional transport of energy and toroidal angular momentum.
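
The ordering estimate for \(\widetilde{F}_a\) used above can be written out schematically. Here \(\widetilde{C}_{ab}^g\) is a notation introduced only for this sketch and denotes the gyrophase-dependent part of \(C_{ab}^g\). Balancing the lowest-order gyrophase-dependent parts of the two sides of Eq. (140) gives

$$\begin{aligned} \Omega _a \frac{\partial \widetilde{F}_a}{\partial \xi } \simeq \sum _b \widetilde{C}_{ab}^g = {\mathcal {O}}(\delta ) , \qquad \widetilde{F}_a \sim \Omega _a^{-1} \sum _b \widetilde{C}_{ab}^g = {\mathcal {O}}(\delta ) \times {\mathcal {O}}(\delta ) = {\mathcal {O}}(\delta ^2) , \end{aligned}$$

where \(\Omega _a^{-1} = {\mathcal {O}}(\delta )\) follows from \(\Omega _a = {\mathcal {O}}(\delta ^{-1})\).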

Using Eq. (67), the gyrokinetic Boltzmann equation in Eq. (143) can be rewritten as

$$\begin{aligned} \frac{\partial }{\partial t} \left( D_a F_a \right) + \frac{\partial }{\partial \mathbf{Z}} \cdot \left( D_a F_a \frac{{\text {d}} \mathbf{Z}_a}{{\text {d}}t} \right) = D_a \mathcal{K}_a . \end{aligned}$$
(145)
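
The step from Eq. (143) to Eq. (145) can be sketched as follows, assuming that Eq. (67), which is not reproduced in this section, is the phase-space volume conservation law \(\partial D_a/\partial t + (\partial /\partial \mathbf{Z}) \cdot ( D_a \, {\text {d}} \mathbf{Z}_a/{\text {d}}t ) = 0\). Under that assumption, the product rule gives

$$\begin{aligned} \frac{\partial }{\partial t} \left( D_a F_a \right) + \frac{\partial }{\partial \mathbf{Z}} \cdot \left( D_a F_a \frac{{\text {d}} \mathbf{Z}_a}{{\text {d}}t} \right)&= D_a \left( \frac{\partial }{\partial t} + \frac{{\text {d}} \mathbf{Z}_a}{{\text {d}}t} \cdot \frac{\partial }{\partial \mathbf{Z}} \right) F_a + F_a \left[ \frac{\partial D_a}{\partial t} + \frac{\partial }{\partial \mathbf{Z}} \cdot \left( D_a \frac{{\text {d}} \mathbf{Z}_a}{{\text {d}}t} \right) \right] \nonumber \\&= D_a \mathcal{K}_a , \end{aligned}$$

where the bracketed term vanishes by the assumed form of Eq. (67) and the remaining term is evaluated with Eq. (143).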

When \(\mathcal{K}_a=0\), Eq. (143) reduces to the gyrokinetic Vlasov equation for which Noether’s theorem can be applied to derive conservation laws of energy and toroidal momentum from symmetry properties (Sugama et al. 2014). However, even if \(\mathcal{K}_a \ne 0\), we see in Sect. 4.4 that the energy and toroidal momentum balance equations can be derived from Noether’s theorem modified using the correspondence relation between \(\partial F_a^V /\partial t\) and \(\partial F_a /\partial t - \mathcal{K}_a\), where \(F_a^V\) and \(F_a\) represent the solution of Eq. (143) for \(\mathcal{K}_a = 0\) and that for \(\mathcal{K}_a \ne 0\), respectively.

4.1 Collision operator in gyrocenter coordinates

In this subsection, we consider how the bilinear operator \(C_{ab} [F_a, F_b]\) representing the collision term for collisions between species a and b is transformed under the transformation of the phase-space coordinates. Here, we do not discuss the detailed functional form of the collision operator, which is treated in the next subsection.

The particle coordinates and the gyrocenter coordinates are denoted by \(\mathbf{z} \equiv (\mathbf{x}, v'_\parallel , \mu _0, \xi _0)\) and \(\mathbf{Z} \equiv (\mathbf{X}, U, \mu , \xi )\), respectively. Here, the particle coordinates \(\mathbf{x}\), \(v'_\parallel\), \(\mu _0\equiv m (v'_\perp )^2/ (2 B_0)\) and \(\xi _0\) are given by the first terms in the series expansions of the gyrocenter coordinates \(\mathbf{X}\), U, \(\mu\), and \(\xi\) as shown in Eq. (56). The coordinate transformation from \(\mathbf{z} \equiv (\mathbf{x}, v'_\parallel , \mu _0, \xi _0)\) to \(\mathbf{Z} \equiv (\mathbf{X}, U, \mu , \xi )\) defined in Eq. (56) is represented here by

$$\begin{aligned} \mathbf{Z} = \mathcal{T} (\mathbf{z} ) = \mathbf{z} + \Delta \mathbf{z} , \end{aligned}$$
(146)

where the detailed forms of \(\Delta \mathbf{z} \equiv (\Delta \mathbf{x}, \Delta v'_\parallel , \Delta \mu _0, \Delta \xi _0)\) are understood from Eq. (56). An arbitrary scalar field \(\mathcal{A}\) on the phase space can be expressed in terms of either the gyrocenter coordinates \(\mathbf{Z} = ( \mathbf{X}, U, \mu , \xi )\) or the particle coordinates \(\mathbf{z} = (\mathbf{x}, v'_\parallel , \mu _0, \xi _0)\) as

$$\begin{aligned} \mathcal{A}^g (\mathbf{Z}) = \mathcal{A}^p (\mathbf{z}) . \end{aligned}$$
(147)

Using Eqs. (146), (147), and the Taylor series expansion, we obtain

$$\begin{aligned}&\mathcal{A}^p (\mathbf{z}) = (\mathcal{T}^* \mathcal{A}^g)(\mathbf{z}) \equiv \mathcal{A}^g (\mathcal{T}(\mathbf{z})) = \mathcal{A}^g (\mathbf{z}+ \Delta \mathbf{z}) \nonumber \\&= \sum _{n=0}^\infty \frac{1}{n!} \sum _{i_1,\cdots ,i_n} \Delta z^{i_1} \cdots \Delta z^{i_n} \frac{\partial ^n \mathcal{A}^g ( \mathbf{z} )}{ \partial z^{i_1} \cdots \partial z^{i_n}} , \end{aligned}$$
(148)

where \(\mathcal{T}^*\mathcal{A}^g\) denotes the pullback transformation of \(\mathcal{A}^g\) by \(\mathcal{T}\). Using the inverse transformation \(\mathcal{T}^{-1}\), we also have

$$\begin{aligned} \mathcal{A}^g (\mathbf{Z}) = (\mathcal{T}^{-1*}\mathcal{A}^p)(\mathbf{Z}) \equiv \mathcal{A}^p (\mathcal{T}^{-1} (\mathbf{Z})) . \end{aligned}$$
(149)

The Jacobians \(D^p\) and \(D^g\) for the two coordinate systems \(\mathbf{z}\) and \(\mathbf{Z}\) are related to each other by

$$\begin{aligned} D^p (\mathbf{z}) = \det \left[ \frac{\partial (\mathbf{Z})}{\partial (\mathbf{z})} \right] D^g (\mathbf{Z}) , \end{aligned}$$
(150)

where \(\partial (\mathbf{Z})/\partial (\mathbf{z})\) denotes the Jacobian matrix. Then, we use the following formula:

$$\begin{aligned} \delta ^6 ( \mathbf{z} + \Delta \mathbf{z} - \mathbf{Z} ) = \sum _{n = 0}^\infty \frac{1}{n!} \sum _{i_1, \cdots , i_n} \Delta z^{i_1} \cdots \Delta z^{i_n} \frac{\partial ^n \delta ^6 (\mathbf{z} - \mathbf{Z} )}{\partial z^{i_1} \cdots \partial z^{i_n} } , \end{aligned}$$
(151)

and integration by parts to derive the relation between the expressions of the scalar density \(D \mathcal{A}\) in the gyrocenter and particle coordinate systems as

$$\begin{aligned} D^g (\mathbf{Z}) \mathcal{A}^g (\mathbf{Z})&= \int d^6 Z' \; \delta ^6 ( \mathbf{Z}' - \mathbf{Z} )D^g (\mathbf{Z}') \mathcal{A}^g (\mathbf{Z}') \nonumber \\&= \int d^6 z \; \delta ^6 ( \mathbf{z} + \Delta \mathbf{z} - \mathbf{Z} ) D^p (\mathbf{z}) \mathcal{A}^p (\mathbf{z}) \nonumber \\&= \sum _{n=0}^\infty \frac{(-1)^n}{n!} \sum _{i_1,\cdots ,i_n} \left[ \frac{\partial ^n \left[ \Delta z^{i_1} \cdots \Delta z^{i_n} D^p (\mathbf{z}) \mathcal{A}^p ( \mathbf{z} ) \right] }{ \partial z^{i_1} \cdots \partial z^{i_n}} \right] _{\mathbf{z} = \mathbf{Z}} , \end{aligned}$$
(152)

where the replacement of \(\mathbf{z}\) with \(\mathbf{Z}\) is represented by \([ \cdots ]_{\mathbf{z} = \mathbf{Z}} \equiv \int d^6 z \; \delta ^6 ( \mathbf{z} - \mathbf{Z} ) \cdots\).
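
A one-dimensional numerical sketch may help to see how the series in Eq. (152) works. Everything specific here is an assumption made for illustration: a single coordinate instead of six, the displacement \(\Delta z = \epsilon \sin z\), the choice \(D^g = 1\) (so that \(D^p = {\text {d}}Z/{\text {d}}z\) by Eq. (150)), the test scalar \(\mathcal{A}^p = e^z\), and central finite differences for the derivatives. The truncated right-hand side of Eq. (152) is compared with the exact value \(D^g \mathcal{A}^g (Z) = \mathcal{A}^p (\mathcal{T}^{-1}(Z))\).

```python
import math

EPS = 0.02  # small parameter controlling the coordinate shift (assumed)


def shift(z):
    """Hypothetical displacement Delta z, so that Z = z + Delta z."""
    return EPS * math.sin(z)


def jac_p(z):
    """D^p(z) = (dZ/dz) * D^g with D^g = 1, following Eq. (150)."""
    return 1.0 + EPS * math.cos(z)


def scalar_p(z):
    """Arbitrary test scalar A^p(z)."""
    return math.exp(z)


def inverse_map(Z):
    """Solve z + Delta z(z) = Z for z by Newton iteration."""
    z = Z
    for _ in range(50):
        z -= (z + shift(z) - Z) / jac_p(z)
    return z


def d1(func, x, h=1e-3):
    """Central-difference first derivative."""
    return (func(x + h) - func(x - h)) / (2.0 * h)


def d2(func, x, h=1e-3):
    """Central-difference second derivative."""
    return (func(x + h) - 2.0 * func(x) + func(x - h)) / (h * h)


def series(Z, order):
    """Right-hand side of Eq. (152) in 1D, truncated after n = order (<= 2)."""
    total = jac_p(Z) * scalar_p(Z)
    if order >= 1:
        total -= d1(lambda z: shift(z) * jac_p(z) * scalar_p(z), Z)
    if order >= 2:
        total += 0.5 * d2(lambda z: shift(z) ** 2 * jac_p(z) * scalar_p(z), Z)
    return total


Z0 = 0.7
exact = scalar_p(inverse_map(Z0))  # D^g A^g at Z0, with D^g = 1
errors = [abs(series(Z0, n) - exact) for n in range(3)]
print(errors)
```

Each additional term reduces the error by roughly one power of \(\epsilon\), which is the behavior expected when the perturbative expansion is truncated at a given order.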

We now note that the collision term \(C_{ab}\) can be regarded as a scalar field on the phase space. When using the particle coordinates, we represent the collision term by \(C_{ab}^p\). Then, the collision term \(C_{ab}^g\) represented in the gyrocenter coordinates is related to \(C_{ab}^p\) by

$$\begin{aligned} C_{ab}^g [F_a, F_b] = \mathcal{T}_a^{-1*} C_{ab}^p [\mathcal{T}_a^* F_a, \mathcal{T}_b^* F_b ] , \end{aligned}$$
(153)

where the distribution function for species a in the particle coordinates is written as the pullback \(f_a = \mathcal{T}_a^* F_a\) of that in the gyrocenter coordinates \(F_a\) by the coordinate transformation \(\mathcal{T}_a\) and \(\mathcal{T}_a^{-1*}\) transforms the collision term as a function of the particle coordinates into that of the gyrocenter coordinates.

To see collisional effects on conservation laws, it is convenient to represent the collision term in the gyrocenter coordinates using the transformation formula for the scalar density \(D_a C_{ab}\) rather than that for the scalar \(C_{ab}\) shown in Eq. (153). Using Eq. (152), we can derive

$$\begin{aligned}&D_a^g (\mathbf{Z}_a) C_{ab}^g [F_a, F_b](\mathbf{Z}_a) \mathcal{A}_a^g (\mathbf{Z}_a) \nonumber \\&= \sum _{n=0}^\infty \frac{(-1)^n}{n!} \sum _{i_1,\cdots ,i_n} \left[ \frac{\partial ^n \left[ \Delta z_a^{i_1} \cdots \Delta z_a^{i_n} D_a^p (\mathbf{z}_a) C_{ab}^p [f_a, f_b](\mathbf{z}_a) \mathcal{A}_a^p (\mathbf{z}_a) \right] }{ \partial z_a^{i_1} \cdots \partial z_a^{i_n}} \right] _{\mathbf{z}_a = \mathbf{Z}_a} , \end{aligned}$$
(154)

where \(\mathcal{A}_a\) is an arbitrary scalar field depending on particle species and \(f_a = \mathcal{T}_a^* F_a\) is rewritten using Eq. (148) as

$$\begin{aligned} f_a (\mathbf{z}_a) = \sum _{n=0}^\infty \frac{1}{n!} \sum _{i_1,\ldots ,i_n} \Delta z_a^{i_1} \ldots \Delta z_a^{i_n} \frac{\partial ^n F_a ( \mathbf{z}_a )}{ \partial z_a^{i_1} \cdots \partial z_a^{i_n}} . \end{aligned}$$
(155)

Then, the gyrocenter representation of the collision operator \(C_{ab}^g\) acting on \(F_a\) and \(F_b\) is obtained from Eq. (154) by setting \(\mathcal{A}_a^g = \mathcal{A}_a^p = 1\) and using Eq. (155) to express \(f_a\) and \(f_b\) in terms of \(F_a\) and \(F_b\), respectively. Integrating Eq. (154) with respect to \((U, \mu , \xi )\) and summing over species b yields

$$\begin{aligned} \int {\text {d}}U \int {\text {d}}\mu \int {\text {d}}\xi \; D_a^g (\mathbf{Z}) C_a^g (\mathbf{Z}) \mathcal{A}_a^g (\mathbf{Z}) = \left[ \int {\text {d}}^3 v \; C_a^p (\mathbf{z}) \mathcal{A}_a^p (\mathbf{z}) \right] _{\mathbf{z} = \mathbf{Z}} - \, \nabla \cdot \mathbf{J}_{Aa}^\mathrm{C}, \end{aligned}$$
(156)

where \(C_a^g = \sum _b C_{ab}^g\) and \(\nabla = \partial /\partial \mathbf{X}\) are used and \(\int {\text {d}}^3 v = \int {\text {d}}v_\parallel \int d\mu _0 \int {\text {d}}\xi _0 \, D_a^p (\mathbf{z})\) denotes the velocity-space integral using the particle coordinates. Here, the transport flux \(\mathbf{J}_{Aa}^\mathrm{C}\) of the quantity \(\mathcal{A}_a\) due to collisions and finite gyroradii of particles is defined by

$$\begin{aligned} \mathbf{J}_{Aa}^\mathrm{C} (\mathbf{X})&= \sum _{n=0}^\infty \frac{(-1)^n}{(n+1)!} \sum _{i_1,\cdots ,i_n} \frac{\partial ^n}{ \partial X^{i_1} \cdots \partial X^{i_n}} \left[ \int d^3 v \; \Delta \mathbf{x}_a \Delta x_a^{i_1} \cdots \Delta x_a^{i_n} C_a^p (\mathbf{z}) \mathcal{A}_a^p (\mathbf{z}) \right] _{\mathbf{x} = \mathbf{X}} \nonumber \\&= \left[ \int {\text {d}}^3 v \; \Delta \mathbf{x}_a C_a^p (\mathbf{z}) \mathcal{A}_a^p (\mathbf{z}) \right] _{\mathbf{x} = \mathbf{X}} + \cdots \end{aligned}$$
(157)

The integral of the collision term weighted by an arbitrary scalar field \(\mathcal{A}_a\) over the whole phase space can be written in either the gyrocenter or the particle coordinate system as

$$\begin{aligned} \int {\text {d}}^6 Z \; D^g_a (\mathbf{Z}) C^g_a (\mathbf{Z}) \mathcal{A}^g_a (\mathbf{Z}) = \int {\text {d}}^6 z \; D^p_a (\mathbf{z}) C^p_a (\mathbf{z}) \mathcal{A}^p_a (\mathbf{z}) . \end{aligned}$$
(158)

For the case of \(\mathcal{A}_a = 1\), Eqs. (156) and (157) reduce to

$$\begin{aligned} \int {\text {d}}U \int {\text {d}}\mu \int {\text {d}}\xi \; D_a^g (\mathbf{Z}) C_a^g (\mathbf{Z}) = - \nabla \cdot {\varvec{\Gamma }}_a^\mathrm{C} (\mathbf{X}), \end{aligned}$$
(159)

and

$$\begin{aligned} {\varvec{\Gamma }}_a^\mathrm{C} (\mathbf{X})&= \sum _{n=0}^\infty \frac{(-1)^n}{(n+1)!} \sum _{i_1,\cdots ,i_n} \frac{\partial ^n}{ \partial X^{i_1} \cdots \partial X^{i_n}} \left[ \int {\text {d}}^3 v \; \Delta \mathbf{x}_a \Delta x_a^{i_1} \cdots \Delta x_a^{i_n} C_a^p (\mathbf{z}) \right] _{\mathbf{x} = \mathbf{X}} \nonumber \\&= \left[ \int {\text {d}}^3 v \; \Delta \mathbf{x}_a C_a^p (\mathbf{z}) \right] _{\mathbf{x} = \mathbf{X}} + \cdots , \end{aligned}$$
(160)

respectively, where \(\int {\text {d}}^3 v \, C_a^p(\mathbf{z}) = 0\) is used. Here, \({\varvec{\Gamma }}_a^\mathrm{C}\) is regarded as the classical particle flux which occurs due to collisions and finite gyroradii. In fact, using \(\Delta \mathbf{x}_a \simeq -{\varvec{\rho }}_a\), we see that the primary term of \({\varvec{\Gamma }}_a^\mathrm{C}\) shown in the last line of Eq. (160) is identical to the conventional definition of the classical particle flux \({\varvec{\Gamma }}_a^\mathrm{cl} \equiv (c/e_a B_0) \mathbf{F}_{a1} \times \mathbf{b}\), where \(\mathbf{F}_{a1} \equiv \int d^3 v \, m_a {\mathbf{v}} \, C_a^p\) is the collisional friction force. Thus, we have \({\varvec{\Gamma }}_a^\mathrm{C} = {\varvec{\Gamma }}_a^\mathrm{cl} [1 + {\mathcal {O}}(\delta )]\).
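The identification of the leading term of Eq. (160) with \({\varvec{\Gamma }}_a^\mathrm{cl}\) rests on a pointwise vector identity, which the following sketch checks numerically. It assumes the standard gyroradius vector \({\varvec{\rho }}_a = \mathbf{b} \times {\mathbf{v}}/\Omega _a\) with \(\Omega _a = e_a B_0/(m_a c)\), a definition that is not restated in this section, so that \(-{\varvec{\rho }}_a = (c/e_a B_0)\, m_a {\mathbf{v}} \times \mathbf{b}\). The function names and the random sampling ranges are illustrative only.

```python
import random


def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def minus_gyroradius(v, b, m, e, B, c):
    """-rho = -(b x v)/Omega, assuming Omega = e*B/(m*c) (Gaussian units)."""
    omega = e * B / (m * c)
    return tuple(-x / omega for x in cross(b, v))


def classical_flux_integrand(v, b, m, e, B, c):
    """(c/(e*B)) (m v) x b, the weight multiplying C_a^p in Gamma_a^cl."""
    mv = tuple(m * x for x in v)
    return tuple(c / (e * B) * x for x in cross(mv, b))


def max_mismatch(n_samples=100, seed=0):
    """Largest componentwise difference between the two expressions."""
    rng = random.Random(seed)
    b = (0.0, 0.0, 1.0)  # unit vector along the magnetic field
    worst = 0.0
    for _ in range(n_samples):
        v = tuple(rng.uniform(-1.0, 1.0) for _ in range(3))
        m, e, B, c = (rng.uniform(0.5, 2.0) for _ in range(4))
        lhs = minus_gyroradius(v, b, m, e, B, c)
        rhs = classical_flux_integrand(v, b, m, e, B, c)
        worst = max(worst, max(abs(x - y) for x, y in zip(lhs, rhs)))
    return worst


if __name__ == "__main__":
    print(max_mismatch())  # expected to be at rounding-error level
```

Multiplying this identity by \(C_a^p\) and integrating over velocity reproduces \((c/e_a B_0) \mathbf{F}_{a1} \times \mathbf{b}\) with \(\mathbf{F}_{a1}\) as defined above.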

As shown in Eqs. (154) and (155), the collision term \(C_{ab}^g\) obtained from \(C_{ab}^p\) by the coordinate transformation contains an infinite series expansion in \(\Delta \mathbf{z}\). To use it in the gyrokinetic equation, appropriate approximations such as a truncation of the series expansion are preferable. In Sugama et al. (2015), an approximate gyrokinetic collision operator that keeps only finite-order expansion terms is presented such that it takes a conservative form similar to Eq. (154) and satisfies the energy and toroidal momentum conservation laws in the case of the low-flow ordering. In the present study, which treats the high-flow ordering, we employ another elegant method, originated by Burby et al. (2015), to represent the gyrokinetic collision operator using Poisson brackets, as shown in the next subsection.

4.2 Collision operator represented in terms of Poisson brackets

A well-established collision operator is known as the Landau operator [see, for example, Eq. (3.22) in Helander and Sigmar (2002)] which is written in terms of the particle coordinates \(\mathbf{z}\equiv (\mathbf{x}, {\mathbf{v}})\) as

$$\begin{aligned}&C_{ab} [F_a, F_b] = - \frac{\alpha _{ab}}{m_a} \frac{\partial }{\partial {\mathbf{v}}} \cdot \left[ \int d^3 v' \mathbf{U} (\mathbf{u} ) \cdot \left\{ \frac{F_a ({\mathbf{v}})}{m_b} \frac{\partial F_b ({\mathbf{v}}')}{\partial {\mathbf{v}}'} - \frac{F_b ({\mathbf{v}}')}{m_a} \frac{\partial F_a ({\mathbf{v}})}{\partial {\mathbf{v}}} \right\} \right] \nonumber \\&= - \frac{\alpha _{ab}}{m_a} \frac{\partial }{\partial {\mathbf{v}}} \cdot \left[ \int d^6 z' \delta ( \mathbf{x} - \mathbf{x}' ) \mathbf{U} (\mathbf{u} ) \cdot \left\{ \frac{F_a (\mathbf{z})}{m_b} \frac{\partial F_b (\mathbf{z}' )}{\partial {\mathbf{v}}' } - \frac{F_b (\mathbf{z}' )}{m_a} \frac{\partial F_a (\mathbf{z})}{\partial {\mathbf{v}}} \right\} \right] . \end{aligned}$$
(161)

Here,

$$\begin{aligned} \mathbf{u} \equiv {\mathbf{v}} - {\mathbf{v}}' , \quad \mathbf{U} (\mathbf{u} ) \equiv \frac{u^2 \mathbf{I} - \mathbf{u} \mathbf{u}}{u^3} , \end{aligned}$$
(162)

and

$$\begin{aligned} \alpha _{ab} \equiv 2\pi e_a^2 e_b^2 \ln \Lambda , \end{aligned}$$
(163)

where \(\ln \Lambda\) is the Coulomb logarithm.

In Sugama et al. (2015), a gyrokinetic collision operator is constructed under the low-flow ordering such that collisional terms in the particle, energy and momentum equations are represented by the divergences of the classical transport fluxes. To obtain similar representations for the high-flow case, we here follow Burby et al. (2015) and use Poisson brackets. We here note

$$\begin{aligned} \frac{1}{m_a} \frac{\partial }{\partial {\mathbf{v}}} \cdots = \{ \mathbf{x}, \cdots \} , \; \; \frac{1}{m_b} \frac{\partial }{\partial {\mathbf{v}}'} \cdots = \{ \mathbf{x}', \cdots \} \end{aligned}$$
(164)

and

$$\begin{aligned} {\mathbf{v}}&= \frac{{\text {d}}}{dt} \mathbf{x}(\mathbf{Z}, t) = \frac{{\text {d}}{} \mathbf{Z} }{{\text {d}}t} \cdot \frac{\partial }{\partial \mathbf{Z}} \mathbf{x} + \frac{\partial \mathbf{x}(\mathbf{Z}, t)}{\partial t} \nonumber \\&= \{ \mathbf{x}, H \} + \{ \mathbf{x}, \mathbf{X} \} \cdot \frac{e}{c} \frac{\partial \mathbf{A}^*}{\partial t} + \frac{\partial \mathbf{x}(\mathbf{Z}, t)}{\partial t} = \{ \mathbf{x}, H \} + {\mathcal {O}}(\delta ^2) \end{aligned}$$
(165)

to rewrite the collision operator in Eq. (161) as (Burby et al. 2015)

$$\begin{aligned} C_{ab} [F_a, F_b]&= -\alpha _{ab} \sum _{i = 1}^3 \{ x_{ai}, \gamma ^{ab}_i \} . \end{aligned}$$
(166)

Here, \(x_{ai}\) and \(\gamma ^{ab}_i\) are the ith Cartesian components of the particle position vector \(\mathbf{x}_a = \mathbf{X}_a + {\varvec{\rho }}_a\) and the vector \({\varvec{\gamma }}^{ab}\), respectively, the latter of which is defined by

$$\begin{aligned} {\varvec{\gamma }}^{ab}( \mathbf{Z}_a) \equiv \int d^6 Z_b \, D_b(\mathbf{Z}_b) \delta [ \mathbf{x}_a ( \mathbf{Z}_a ) - \mathbf{x}_b ( \mathbf{Z}_b ) ] \mathbf{U} (\mathbf{u}_{ab}) \cdot \mathbf{A}_{ab} , \end{aligned}$$
(167)

with

$$\begin{aligned} \mathbf{u}_{ab} \equiv \{ \mathbf{x}_a, H_a \} - \{ \mathbf{x}_b, H_b \} , \end{aligned}$$
(168)

and

$$\begin{aligned} \mathbf{A}_{ab} \equiv F_a ( \mathbf{Z}_a ) \{ \mathbf{x}_b, F_b (\mathbf{Z}_b) \} - F_b ( \mathbf{Z}_b ) \{ \mathbf{x}_a, F_a (\mathbf{Z}_a) \}. \end{aligned}$$
(169)
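A useful consistency check on Eqs. (167)–(169) is that the integrand \(\mathbf{U} (\mathbf{u}_{ab}) \cdot \mathbf{A}_{ab}\) vanishes for two Maxwellians with a common temperature and flow velocity, so that such a state is not modified by collisions. The sketch below evaluates this in the particle-coordinate limit, where Eq. (164) gives \(\{ \mathbf{x}_a, F_a \} = m_a^{-1} \partial F_a / \partial {\mathbf{v}}\) and \(\mathbf{u}_{ab} = {\mathbf{v}}_a - {\mathbf{v}}_b\). The Maxwellian form, the unit densities, and all numerical values are assumptions chosen for illustration.

```python
import math


def maxwellian(v, m, T, V):
    """Unit-density Maxwellian with mass m, temperature T and flow V."""
    w2 = sum((vi - Vi) ** 2 for vi, Vi in zip(v, V))
    return (m / (2.0 * math.pi * T)) ** 1.5 * math.exp(-m * w2 / (2.0 * T))


def bracket_x_F(v, m, T, V):
    """{x, F} = (1/m) dF/dv, which equals -(v - V)/T * F for a Maxwellian."""
    F = maxwellian(v, m, T, V)
    return tuple(-(vi - Vi) / T * F for vi, Vi in zip(v, V))


def U_dot(u, a):
    """U(u).a with U = (u^2 I - u u)/u^3 as in Eq. (162)."""
    u2 = sum(x * x for x in u)
    ua = sum(x * y for x, y in zip(u, a))
    return tuple((u2 * ai - ui * ua) / u2 ** 1.5 for ui, ai in zip(u, a))


def landau_integrand(va, vb, ma, mb, Ta, Tb, V):
    """U(u_ab).A_ab with A_ab from Eq. (169) in the particle-coordinate limit."""
    Fa, Fb = maxwellian(va, ma, Ta, V), maxwellian(vb, mb, Tb, V)
    dFa, dFb = bracket_x_F(va, ma, Ta, V), bracket_x_F(vb, mb, Tb, V)
    A = tuple(Fa * y - Fb * x for x, y in zip(dFa, dFb))
    u = tuple(x - y for x, y in zip(va, vb))
    return U_dot(u, A)


def norm(w):
    return math.sqrt(sum(x * x for x in w))


if __name__ == "__main__":
    va, vb, V = (0.3, -0.2, 0.5), (-0.1, 0.4, 0.2), (0.1, 0.0, 0.0)
    print(norm(landau_integrand(va, vb, 1.0, 3.0, 1.0, 1.0, V)))  # equal T
    print(norm(landau_integrand(va, vb, 1.0, 3.0, 1.0, 2.0, V)))  # unequal T
```

With equal temperatures \(\mathbf{A}_{ab}\) is proportional to \(\mathbf{u}_{ab}\), which lies in the null space of \(\mathbf{U}\). With unequal temperatures the integrand is generally nonzero, reflecting collisional temperature equilibration.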

Then, as seen in Eq. (159), we find again that the integral of the collision operator with respect to the gyrocenter velocity variables \((U_a, \mu _a, \xi _a)\) at the fixed gyrocenter position \(\mathbf{X}_a = \mathbf{X}\) does not vanish but it is given in the divergence form (Sugama et al. 2017),

$$\begin{aligned} \int {\text {d}}^3 v_a^\mathrm{(gc)} \, C_{ab} [F_a, F_b] = - \nabla \cdot {\varvec{\Gamma }}_{ab}^\mathrm{C} (\mathbf{X}) , \end{aligned}$$
(170)

where \(\nabla \equiv \partial / \partial \mathbf{X}\) and

$$\begin{aligned} {\varvec{\Gamma }}_{ab}^\mathrm{C} \equiv -\alpha _{ab} \int {\text {d}}^3 v_a^\mathrm{(gc)} \{ \mathbf{X}_a, \mathbf{x}_a \} \cdot {\varvec{\gamma }}^{ab} \end{aligned}$$
(171)

represents the classical particle flux due to finite gyroradii and collisions between the species a and b. In addition, Eq. (166) can be used to derive the integral formulas representing the divergences of energy, toroidal momentum, and entropy fluxes at the gyrocenter position \(\mathbf{X}_a = \mathbf{X}_b = \mathbf{X}\) as (Sugama et al. 2017)

$$\begin{aligned}&\int {\text {d}}^3 v_a^\mathrm{(gc)} \, C_{ab} H_a + \int {\text {d}}^3 v_b^\mathrm{(gc)} \, C_{ba} H_b = - \nabla \cdot ( \mathbf{Q}_{ab}^\mathrm{C} + \mathbf{Q}_{ba}^\mathrm{C} ) , \nonumber \\&\int {\text {d}}^3 v_a^\mathrm{(gc)} \, C_{ab} P^c_{a\zeta } + \int {\text {d}}^3 v_b^\mathrm{(gc)} \, C_{ba} P^c_{b\zeta } = - \nabla \cdot ( {\varvec{\Pi }}_{ab\zeta }^\mathrm{C} + {\varvec{\Pi }}_{ba\zeta }^\mathrm{C} ) , \nonumber \\&- \int {\text {d}}^3 v_a^\mathrm{(gc)} \, C_{ab} (\log F_a + 1) - \int {\text {d}}^3 v_b^\mathrm{(gc)} \, C_{ba} (\log F_b + 1) = \sigma _{ab}^\mathrm{C} - \nabla \cdot ( \mathbf{J}_{Sab} ^\mathrm{C} + \mathbf{J}_{Sba} ^\mathrm{C} ) . \end{aligned}$$
(172)

Here, the energy flux \(\mathbf{Q}_{ab}^\mathrm{C}\), the toroidal momentum flux \({\varvec{\Pi }}_{ab\zeta }^\mathrm{C}\) and the entropy flux \(\mathbf{J}_{Sab} ^\mathrm{C}\) are defined by

$$\begin{aligned} \mathbf{Q}_{ab}^\mathrm{C}\equiv & {} -\alpha _{ab} \left[ \int d^3 v_a^\mathrm{(gc)} H_a \{ \mathbf{X}_a, \mathbf{x}_a \} \cdot {\varvec{\gamma }}^{ab} \right. \nonumber \\&\left. + \sum _{n=0}^{\infty } \frac{(-1)^n}{(n+1)!} \sum _{i_1, \cdots , i_n} \frac{\partial ^n \left( \int d^3 v_a^\mathrm{(gc)} \rho _{a i_1} \cdots \rho _{a i_n} {\varvec{\rho }}_a {\varvec{\gamma }}^{ab} \cdot \{ \mathbf{x}_a, H_a \} \right) }{\partial X_{a i_1} \cdots \partial X_{a i_n}} \right] , \end{aligned}$$
(173)
$$\begin{aligned} {\varvec{\Pi }}_{ab\zeta }^\mathrm{C}\equiv & {} -\alpha _{ab} \left[ \int {\text {d}}^3 v_a^\mathrm{(gc)} P^c_{a\zeta } \{ \mathbf{X}_a, \mathbf{x}_a \} \cdot {\varvec{\gamma }}^{ab} \right. \nonumber \\&\left. + \sum _{n=0}^{\infty } \frac{(-1)^n}{(n+1)!} \sum _{i_1, \cdots , i_n} \frac{\partial ^n \left( \int d^3 v_a^\mathrm{(gc)} \rho _{a i_1} \cdots \rho _{a i_n} {\varvec{\rho }}_a {\varvec{\gamma }}^{ab} \cdot \{ \mathbf{x}_a, P^c_{a\zeta } \} \right) }{\partial X_{a i_1} \cdots \partial X_{a i_n}} \right] , \end{aligned}$$
(174)

and

$$\begin{aligned} \mathbf{J}_{Sab} ^\mathrm{C}\equiv \, & {} \alpha _{ab} \left[ \int d^3 v_a^\mathrm{(gc)} (\log F_a + 1) \{ \mathbf{X}_a, \mathbf{x}_a \} \cdot {\varvec{\gamma }}^{ab} \right. \nonumber \\&\left. + \sum _{n=0}^{\infty } \frac{(-1)^n}{(n+1)!} \sum _{i_1, \cdots , i_n} \frac{\partial ^n \left( \int d^3 v_a^\mathrm{(gc)} \rho _{a i_1} \cdots \rho _{a i_n} {\varvec{\rho }}_a {\varvec{\gamma }}^{ab} \cdot \{ \mathbf{x}_a, \log F_a \} \right) }{\partial X_{a i_1} \cdots \partial X_{a i_n}} \right] , \end{aligned}$$
(175)

respectively. The entropy production rate \(\sigma _{ab}^\mathrm{C}\) is given by

$$\begin{aligned} \sigma _{ab}^\mathrm{C} (\mathbf{X})\equiv \, & {} \alpha _{ab} \int d^6 Z_a \int d^6 Z_b D_a D_b \delta (\mathbf{x}_a - \mathbf{x}_b) \delta (\mathbf{x}_a - \mathbf{X}) \nonumber \\&\times (F_a F_b)^{-1} \mathbf{A}_{ab} \cdot \mathbf{U}(\mathbf{u}_{ab})\cdot \mathbf{A}_{ab} . \end{aligned}$$
(176)

With the formula \(\mathbf{a} \cdot \mathbf{U} (\mathbf{u}) \cdot \mathbf{a} = u^{-3} [a^2 u^2 - (\mathbf{a}\cdot \mathbf{u})^2 ] \ge 0\), Eq. (176) proves \(\sigma _{ab}^\mathrm{C} \ge 0\) which represents the second law of thermodynamics.
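The quadratic-form identity quoted above, and the resulting non-negativity that follows from the Cauchy–Schwarz inequality, can be spot-checked numerically. The sketch below builds \(\mathbf{U}\) from its definition in Eq. (162) and compares \(\mathbf{a} \cdot \mathbf{U} \cdot \mathbf{a}\) with the closed form for randomly sampled vectors. The sampling ranges and function names are arbitrary choices made for illustration.

```python
import random


def quad_form_tensor(a, u):
    """a.U(u).a evaluated from the tensor U = (u^2 I - u u)/u^3."""
    u2 = sum(x * x for x in u)
    u3 = u2 ** 1.5
    total = 0.0
    for i in range(3):
        for j in range(3):
            U_ij = ((u2 if i == j else 0.0) - u[i] * u[j]) / u3
            total += a[i] * U_ij * a[j]
    return total


def quad_form_closed(a, u):
    """Closed form u^{-3} [a^2 u^2 - (a.u)^2]."""
    a2 = sum(x * x for x in a)
    u2 = sum(x * x for x in u)
    au = sum(x * y for x, y in zip(a, u))
    return (a2 * u2 - au * au) / u2 ** 1.5


def scan(n_samples=1000, seed=1):
    """Return (largest |difference|, smallest value) over random samples."""
    rng = random.Random(seed)
    worst_diff, smallest = 0.0, float("inf")
    for _ in range(n_samples):
        a = [rng.uniform(-1.0, 1.0) for _ in range(3)]
        # keep |u| away from zero, where U is singular
        u = [rng.choice((-1.0, 1.0)) * rng.uniform(0.2, 1.0) for _ in range(3)]
        q = quad_form_tensor(a, u)
        worst_diff = max(worst_diff, abs(q - quad_form_closed(a, u)))
        smallest = min(smallest, q)
    return worst_diff, smallest


if __name__ == "__main__":
    print(scan())
```

The form vanishes only when \(\mathbf{a}\) is parallel to \(\mathbf{u}\), which is the case realized by \(\mathbf{A}_{ab}\) for Maxwellians with a common temperature and flow.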

4.3 Equations for gyrocenter densities and polarization in collisional systems

Integrating the gyrokinetic Boltzmann equation, Eq. (145), with respect to the gyrocenter velocity-space coordinates \((U, \mu , \xi )\) and using Eq. (170), we obtain the particle balance equation,

$$\begin{aligned} \frac{\partial n_a^\mathrm{(gc)}}{\partial t} + \nabla \cdot ( {\varvec{\Gamma }}_a^\mathrm{(gc)} + {\varvec{\Gamma }}_a^\mathrm{C} ) = \int d^3 v^\mathrm{(gc)} \, \mathcal{S}_a . \end{aligned}$$
(177)

Here, the gyrocenter density \(n_a^\mathrm{(gc)}\) and the gyrocenter flux \({\varvec{\Gamma }}_a^\mathrm{(gc)}\) are given by

$$\begin{aligned} n_a^\mathrm{(gc)} (\mathbf{X}, t) \equiv \int d^3 v^\mathrm{(gc)} F_a , \end{aligned}$$
(178)

and

$$\begin{aligned} {\varvec{\Gamma }}_a^\mathrm{(gc)} \equiv n_a^\mathrm{(gc)} \mathbf{u}_a^\mathrm{(gc)} \equiv \int d^3 v^\mathrm{(gc)} F_a {\mathbf{v}}_a^\mathrm{(gc)} , \end{aligned}$$
(179)

respectively, where \(\mathbf{u}_a^\mathrm{(gc)}\) represents the gyrocenter fluid velocity, and the gyrocenter drift velocity \({\mathbf{v}}_a^\mathrm{(gc)} \equiv d \mathbf{X}_a/dt\) is given by evaluating the right-hand side of Eq. (55) at \((\mathbf{X}, U, \mu )\). The classical particle flux \({\varvec{\Gamma }}_a^\mathrm{C}\) is defined using Eq. (171) and

$$\begin{aligned} {\varvec{\Gamma }}_a^\mathrm{C} \equiv \sum _b {\varvec{\Gamma }}_{ab}^\mathrm{C} . \end{aligned}$$
(180)

The right-hand side of Eq. (177) represents the particle source term.

Flux-surface-averaging Eq. (177) gives

$$\begin{aligned}&\frac{\partial }{\partial t} ( V' \langle n_a^\mathrm{(gc)}\rangle ) + \frac{\partial }{\partial s} ( V' \langle ( {\varvec{\Gamma }}_a^\mathrm{(gc)} + {\varvec{\Gamma }}_a^\mathrm{C} - n_a^\mathrm{(gc)} \mathbf{u}_s ) \cdot \nabla s \rangle )\nonumber \\&= V' \left\langle \int d^3 v^\mathrm{(gc)} \, \mathcal{S}_a \right\rangle , \end{aligned}$$
(181)

where s is an arbitrary label of a flux surface, \(V' \equiv \partial V(s,t) / \partial s\), \(V(s,t)\) is the volume enclosed by the flux surface, and \(\mathbf{u}_s\) is defined by Eq. (137). Using Eqs. (142) and (177), we obtain the charge conservation law,

$$\begin{aligned} \frac{\partial }{\partial t} \left( \sum _a e_a n_a^\mathrm{(gc)} \right) + \nabla \cdot ( \mathbf{j}^\mathrm{(gc)} + \mathbf{j}^\mathrm{C} ) = 0 \end{aligned}$$
(182)

where the current density due to the gyrocenter drift and that due to the collisional particle transport are given by

$$\begin{aligned} \mathbf{j}^\mathrm{(gc)} \equiv \sum _a e_a {\varvec{\Gamma }}_a^\mathrm{(gc)} \equiv \sum _a e_a n_a^\mathrm{(gc)} \mathbf{u}_a^\mathrm{(gc)} , \end{aligned}$$
(183)

and

$$\begin{aligned} \mathbf{j}^\mathrm{C} \equiv \sum _a e_a {\varvec{\Gamma }}_a^\mathrm{C} , \end{aligned}$$
(184)

respectively. Note that the magnetization current is solenoidal and accordingly it does not contribute to the charge conservation law in Eq. (182). Equation (87) is substituted into Eq. (182) to show

$$\begin{aligned} \mathbf{j}^\mathrm{(gc)}_L + \mathbf{j}^\mathrm{C}_L = - \frac{\partial }{\partial t} \left( \frac{\mathbf{E}_L}{4 \pi } + \mathbf{P}^\mathrm{(pol)}_L \right) , \end{aligned}$$
(185)

where the subscript L is used to represent the longitudinal part of the vector variable. Then, using Eqs. (87), (182) and (185), we find that the useful formula,

$$\begin{aligned}&\left\langle \frac{\partial }{\partial t} \left( \mathcal{A} \sum _a e_a n_a^\mathrm{(gc)} \right) \right\rangle + \left\langle \nabla \cdot \left( \mathcal{A} \mathbf{j}^\mathrm{(gc)}_L \right) \right\rangle \nonumber \\&= \left\langle \frac{\partial \mathcal{A} }{\partial t} \sum _a e_a n_a^\mathrm{(gc)} \right\rangle + \left\langle \mathbf{j}^\mathrm{(gc)}_L \cdot \nabla \mathcal{A} \right\rangle - \left\langle \mathcal{A} \left( \nabla \cdot \mathbf{j}^\mathrm{C}_L \right) \right\rangle \nonumber \\&= \left\langle \nabla \cdot \left[ \frac{\partial \mathcal{A} }{\partial t} \left( \frac{\mathbf{E}_L}{4 \pi } + \mathbf{P}^\mathrm{(pol)}_L \right) - \mathcal{A} \mathbf{j}^\mathrm{C}_L \right] \right\rangle - \left\langle \frac{\partial }{\partial t} \left[ \left( \frac{\mathbf{E}_L}{4 \pi } + \mathbf{P}^\mathrm{(pol)}_L \right) \cdot \nabla \mathcal{A} \right] \right\rangle , \end{aligned}$$
(186)

holds for any function \(\mathcal{A}(\mathbf{X}, t)\). The relation in Eq. (186) is used in Secs. 4.5 and 4.6 to derive the energy and toroidal momentum balance equations given by Eqs. (207) and (223), respectively.

4.4 Effects of the collision and source terms on conservation laws

We now investigate effects of the collision and source terms on conservation laws for the gyrokinetic Boltzmann–Poisson–Ampère system of equations shown in Eqs. (83), (100), (101), (115), (116), and (143). Here, we also consider the gyrocenter distribution function \(F_a^V\) which obeys the gyrokinetic Vlasov equation,

$$\begin{aligned} \left( \frac{\partial }{\partial t} + \frac{d \mathbf{Z}_a}{dt} \cdot \frac{\partial }{\partial \mathbf{Z}} \right) F_a^V =0 , \end{aligned}$$
(187)

where \(d \mathbf{Z}_a/dt\) is evaluated using the electromagnetic fields \((\phi _1, \mathbf{A}_1, \Phi _0, I, \chi )\) obtained from the solution of the gyrokinetic Boltzmann–Poisson–Ampère system of equations. Here, it should be noted that, if the distribution functions \(F_a\) and \(F_a^V\), which are given as the solutions of Eqs. (143) and (187), respectively, are initially gyrophase-independent, they are gyrophase-independent at any time. Besides, \(F_a^V\) is assumed to coincide instantaneously with \(F_a\) at a given time \(t_0\). Therefore, Eqs. (83), (100), (101), (115) and (116) are all satisfied at that moment even if \(F_a\) is replaced with \(F_a^V\) in these equations. Thus, the gyrokinetic Vlasov–Poisson–Ampère system of equations is instantaneously satisfied by \((F_a^V, \phi _1, \mathbf{A}_1, \Phi _0, I, \chi )\) at \(t=t_0\). The action integral \(\mathcal{I}\) is defined from the Lagrangian L in Eq. (59) to derive all the governing equations for the gyrokinetic Vlasov–Poisson–Ampère system based on the variational principle, and its variation \(\delta \mathcal{I}\) associated with the infinitesimal variable transformations is given in Sect. 3.3 to obtain conservation laws from Noether’s theorem. Here, the action integral \(\mathcal{I}\) can be expressed in terms of \((F_a^V, \phi _1, \mathbf{A}_1, \Phi _0, I, \chi )\) over a small time interval, \(t_0 - h / 2 \le t \le t_0 + h / 2\), during which the gyrokinetic Vlasov–Poisson–Ampère system of equations is approximately satisfied by them within errors of order h. Then, neglecting the errors of higher order in h, we can write the variation \(\delta \mathcal{I}\) in the same form as in Eq. (133),

$$\begin{aligned} \delta \mathcal{I} = - \int _{t_0 - h/2}^{t_0 + h/2} dt \int d^3 \mathbf{X} \; \left[ \frac{\partial }{\partial t}\delta G_0^V (\mathbf{X}, t) + \nabla \cdot \delta \mathbf{G}^V (\mathbf{X}, t) \right] , \end{aligned}$$
(188)

with the functions \(\delta G_0^V\) and \(\delta \mathbf{G}^V\) defined by

$$\begin{aligned} \delta G_0^V (\mathbf{X}, t)&= \mathcal{E}_c^V \; \delta t_E - \mathbf{P}_c^V \cdot \delta \mathbf{x}_E , \nonumber \\ \delta \mathbf{G}^V (\mathbf{X}, t)&= \mathbf{Q}_c^V \; \delta t_E - {\varvec{\Pi }}_c^V \cdot \delta \mathbf{x}_E + \mathbf{S}_{\phi _1} \; \delta \phi _1 + \mathbf{S}_{\Phi _0}^V \; \delta \Phi _0 + \mathbf{S}_{V\zeta }^V \; \delta V^\zeta \nonumber \\& \qquad - {\varvec{\Sigma }}_{A1} \cdot \delta \mathbf{A}_1 - {\varvec{\Sigma }}_{A0}^V \cdot \delta \mathbf{A}_0 + \mathbf{S}_\chi ^V \delta \chi + \delta \mathbf{T}^V , \end{aligned}$$
(189)

where \(\mathcal{E}_c^V\) and \(\mathbf{P}_c^V\) are defined by

$$\begin{aligned} \mathcal{E}_c^V&= \sum _a \int d^3 v^\mathrm{(gc)} F_a^V H_a + \frac{1}{8\pi } \left( - |\nabla ( \Phi _0 + \phi _1 ) |^2 + |\mathbf{B}_0 + \mathbf{B}_1 |^2 \right) ,\nonumber \\ \mathbf{P}_c^V& = \sum _a \int d^3 v^\mathrm{(gc)} F_a^V \left( \frac{e_a}{c} \mathbf{A}_0 + m_a ( U \mathbf{b} + {\mathbf{V}}_0 ) \right) , \end{aligned}$$
(190)

and the definitions of the other variables \(\mathbf{Q}_c^V\), \({\varvec{\Pi }}_c^V\), \(\mathbf{S}_{\phi _1}\), \({\varvec{\Sigma }}_{A1}\), \({\varvec{\Sigma }}_{A0}^V\), \(\mathbf{S}_\chi ^V\), and \(\delta \mathbf{T}^V\) are given in Eq. (135). The superscript V in the variables \((\mathcal{E}_c^V , \mathbf{P}_c^V, \cdots )\) indicates that they are defined using the distribution function \(F_a^V\) instead of \(F_a\).

As explained before Eq. (136), the integral domain of Eq. (188) is not an arbitrary local one in the \(\mathbf{X}\)-space but it can be local only in the radial direction in order for Eq. (133) to be valid. Then, if the variations \(\delta t_E, \delta \mathbf{x}_E, \cdots\) in Eq. (189) are such that \(\delta \mathcal{I} = 0\) holds for a spatiotemporal integral domain defined by \([t_0 - h / 2, t_0 + h / 2 ] \times [s_1, s_2]\), where \([s_1, s_2]\) represents an arbitrary spatial volume region sandwiched between two flux surfaces labeled by \(s_1\) and \(s_2\), the conservation law is derived as

$$\begin{aligned}&\left[ \left\langle \frac{\partial }{\partial t}\delta G_0^V (\mathbf{X}, t) + \nabla \cdot \delta \mathbf{G}^V (\mathbf{X}, t) \right\rangle \right] _{t=t_0} \nonumber \\& = \left[ \left\langle \frac{\partial }{\partial t}\delta G_0^V (\mathbf{X}, t) \right\rangle + \frac{1}{V'}\frac{\partial }{\partial s} \left( V' \left\langle \delta \mathbf{G}^V \cdot \nabla s \right\rangle \right) \right] _{t=t_0} = 0 . \end{aligned}$$
(191)

This is Noether’s theorem for the gyrokinetic Vlasov–Poisson–Ampère system as shown in Eq. (136). Using \(F_a^V (\mathbf{Z}, t_0) = F_a (\mathbf{Z}, t_0)\) and comparing Eq. (143) with Eq. (187), we find

$$\begin{aligned} \left[ \frac{\partial F_a^V (\mathbf{Z}, t)}{\partial t} \right] _{t = t_0} = \left[ \frac{\partial F_a (\mathbf{Z}, t)}{\partial t} \right] _{t = t_0} - \mathcal{K}_a (\mathbf{Z}, t_0) , \end{aligned}$$
(192)

where \(\mathcal{K}_a\) is defined by Eq. (144). Let us also define \(\delta G_0\) and \(\delta \mathbf{G}\) from \(\delta G_0^V\) and \(\delta \mathbf{G}^V\) by replacing \(F_a^V\) with \(F_a\). Then, we have \(\delta \mathbf{G}^V (\mathbf{X}, t_0) = \delta \mathbf{G} (\mathbf{X}, t_0)\) and

$$\begin{aligned} \left[ \frac{\partial \delta G_0^V (\mathbf{X}, t)}{\partial t} \right] _{t = t_0} = \left[ \frac{\partial \delta G_0 (\mathbf{X}, t)}{\partial t} \right] _{t = t_0} - \delta K_{G0} (\mathbf{X}, t_0) , \end{aligned}$$
(193)

where

$$\begin{aligned} \delta K_{G0}& = K_{\mathcal{E}c} \delta t_E - \mathbf{K}_{Pc} \cdot \delta \mathbf{x}_E , \nonumber \\ K_{\mathcal{E}c}& = \sum _a \int d^3 v^\mathrm{(gc)} \mathcal{K}_a H_a,\nonumber \\ \mathbf{K}_{Pc}&= \sum _a \int d^3 v^\mathrm{(gc)} \mathcal{K}_a \mathbf{P}^c_a . \end{aligned}$$
(194)

Here, \(\mathbf{P}^c_a\) denotes the canonical momentum for species a defined by

$$\begin{aligned} \mathbf{P}^c_a = \frac{e_a}{c} \mathbf{A}_a^* = \frac{e_a}{c} \mathbf{A}_0 + m_a ( U \mathbf{b} + {\mathbf{V}}_0 ) . \end{aligned}$$
(195)

Substituting Eq. (193) into Eq. (191) and rewriting the arbitrarily chosen time \(t_0\) as t, we obtain the conservation law for the gyrokinetic Boltzmann–Poisson–Ampère system,

$$\begin{aligned}&\left\langle \frac{\partial }{\partial t}\delta G_0 (\mathbf{X}, t) + \nabla \cdot \delta \mathbf{G} (\mathbf{X}, t) \right\rangle \nonumber \\& = \left\langle \frac{\partial }{\partial t}\delta G_0 (\mathbf{X}, t) \right\rangle + \frac{1}{V'}\frac{\partial }{\partial s} \left( V' \left\langle \delta \mathbf{G} \cdot \nabla s \right\rangle \right) = \left\langle \delta K_{G0} \right\rangle , \end{aligned}$$
(196)

where \(\left\langle \delta K_{G0} \right\rangle\) represents effects of the collision and source terms on the conservation law. Under the nonstationary background field \(\mathbf{B}_0\), flux surfaces may change their shapes and the grid of the flux coordinates moves. Then, Eq. (196) is rewritten as

$$\begin{aligned} \frac{\partial }{\partial t} \left( V' \left\langle \delta G_0 \right\rangle \right) + \frac{\partial }{\partial s} \left( V' \left\langle \left( \delta \mathbf{G} - \delta G_0 \mathbf{u}_s \right) \cdot \nabla s \right\rangle \right) = V' \left\langle \delta K_{G0} \right\rangle , \end{aligned}$$
(197)

where \(\mathbf{u}_s\) is defined by Eq. (137). In Secs. 4.5 and 4.6, gyrokinetic energy and toroidal angular momentum balance equations are derived from Eq. (197).
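The flux-surface-average identity \(\langle \nabla \cdot \delta \mathbf{G} \rangle = (1/V') \, \partial ( V' \langle \delta \mathbf{G} \cdot \nabla s \rangle ) / \partial s\) used in Eqs. (191) and (196) can be illustrated in a simple geometry with stationary surfaces. The sketch below assumes a straight cylinder of length \(L\) with circular flux surfaces labeled by \(s = r\), so that \(V = \pi r^2 L\), \(V' = 2 \pi r L\), and the average reduces to an average over the poloidal angle. The test vector field is a hypothetical polynomial chosen only so that the result, \(3r^2/2\), is known analytically.

```python
import math


def angle_average(func, r, n_theta=256):
    """Average of func(x, y, theta) over the circle of radius r."""
    total = 0.0
    for k in range(n_theta):
        th = 2.0 * math.pi * (k + 0.5) / n_theta
        total += func(r * math.cos(th), r * math.sin(th), th)
    return total / n_theta


def field(x, y):
    """Hypothetical test field G = (x^3 + y, x y^2), independent of z."""
    return (x ** 3 + y, x * y ** 2)


def div_field(x, y):
    """Analytic divergence of the test field."""
    return 3.0 * x ** 2 + 2.0 * x * y


def averaged_divergence(r):
    """Left-hand side: <div G> on the surface s = r."""
    return angle_average(lambda x, y, th: div_field(x, y), r)


def surface_flux_derivative(r, L=1.0, dr=1e-4):
    """Right-hand side: (1/V') d(V' <G . grad s>)/ds by central differences."""
    def vprime_times_flux(rr):
        radial = lambda x, y, th: (field(x, y)[0] * math.cos(th)
                                   + field(x, y)[1] * math.sin(th))
        return 2.0 * math.pi * rr * L * angle_average(radial, rr)
    vprime = 2.0 * math.pi * r * L
    return (vprime_times_flux(r + dr) - vprime_times_flux(r - dr)) / (2.0 * dr * vprime)


if __name__ == "__main__":
    print(averaged_divergence(0.7), surface_flux_derivative(0.7))
```

Equation (197) adds the \(- \delta G_0 \mathbf{u}_s\) contribution for moving surfaces, which is absent in this static example.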

4.5 Energy balance equation

The variation \(\delta \mathcal{I}\) of the action given in Eq. (188) vanishes under the infinitesimal time translation represented by

$$\begin{aligned} \delta t_E = \epsilon , \end{aligned}$$
(198)

where \(\epsilon\) is an infinitesimally small constant. Here, all other infinitesimal variations \(\delta \mathbf{x}_E\), \(\delta \phi _1\), \(\cdots\) are regarded as zero. Then, \(\delta G_0\) and \(\delta \mathbf{G}\) are determined by these conditions for the infinitesimal time translation and they satisfy Eq. (197) which leads to the energy balance equation,

$$\begin{aligned} \frac{\partial }{\partial t} \left( V' \left\langle \mathcal{E}_c \right\rangle \right) + \frac{\partial }{\partial s} \left( V' \left\langle \left( \mathbf{Q}_c + \mathbf{Q}_R - \mathcal{E}_c \mathbf{u}_s \right) \cdot \nabla s \right\rangle \right) = V' \left\langle K_{\mathcal{E}c} \right\rangle . \end{aligned}$$
(199)

Here, \(\mathbf{Q}_c\) is defined in Eq. (135) and \(\mathbf{Q}_R\) is given by

$$\begin{aligned} \mathbf{Q}_R& = \sum _a e_a \sum _{n=1}^\infty \frac{1}{n!} \sum _{i_1 = 1}^\infty \cdots \sum _{i_{n-1} = 1}^\infty \int dU \int d\mu \int d \xi \left[ - D_a F_a^* {\varvec{\rho }}_a \rho _{ai_1} \cdots \rho _{ai_{n-1}} \frac{\partial ^{n-1} \partial _t \psi _a}{\partial X_{i_1} \cdots \partial X_{i_{n-1}}} \right. \nonumber \\& \qquad + \frac{\partial ( D_a F_a^* {\varvec{\rho }}_a \rho _{ai_1} \cdots \rho _{ai_{n-1}} )}{\partial X_{i_1}} \frac{\partial ^{n-2} \partial _t \psi _a}{\partial X_{i_2} \cdots \partial X_{i_{n-1}}} \nonumber \\& \qquad + \cdots + (-1)^n \frac{\partial ^{n-1} ( D_a F_a^* {\varvec{\rho }}_a \rho _{ai_1} \cdots \rho _{ai_{n-1}} )}{\partial X_{i_1} \cdots \partial X_{i_{n-1}}} \partial _t \psi _a \nonumber \\& \qquad + \frac{e_a}{m_a c^2} \left\{ - D_a F_a {\varvec{\rho }}_a \rho _{ai_1} \cdots \rho _{ai_{n-1}} \frac{\partial ^{n-1} (\mathbf{A}_1 \cdot \partial _t \mathbf{A}_1)}{\partial X_{i_1} \cdots \partial X_{i_{n-1}}} \right. \nonumber \\& \qquad + \frac{\partial ( D_a F_a {\varvec{\rho }}_a \rho _{ai_1} \cdots \rho _{ai_{n-1}} )}{\partial X_{i_1}} \frac{\partial ^{n-2} (\mathbf{A}_1 \cdot \partial _t \mathbf{A}_1)}{\partial X_{i_2} \cdots \partial X_{i_{n-1}}} + \cdots \nonumber \\& \qquad \left. \left. + \, (-1)^n \frac{\partial ^{n-1} ( D_a F_a {\varvec{\rho }}_a \rho _{ai_1} \cdots \rho _{ai_{n-1}} )}{\partial X_{i_1} \cdots \partial X_{i_{n-1}}} (\mathbf{A}_1 \cdot \partial _t \mathbf{A}_1) \right\} \right] . \end{aligned}$$
(200)

The energy density \(\mathcal{E}_c\), which is defined in Eq. (135), can be rewritten as

$$\begin{aligned} \mathcal{E}_c& = \sum _a \int d^3 v^\mathrm{(gc)} \, F_a \left( \frac{m_a}{2} \left| {\mathbf{V}}_0 + U \mathbf{b} + ({\mathbf{v}}'_c)_\perp - \frac{e_a}{m_a c}{} \mathbf{A}_1 \right| ^2 + H_{a1}^V \right. \nonumber \\& \qquad \left. + \frac{e_a^2}{2 B_0} \frac{\partial }{\partial \mu } \left\langle {\widetilde{\psi }}_a \left( 2 \widetilde{\phi _1} - {\widetilde{\psi }}_a \right) \right\rangle _\xi \right) + \frac{1}{8\pi } \left( |\nabla (\Phi _0 + \phi _1)|^2 + |\mathbf{B}_0 + \mathbf{B}_1 |^2 \right) + \mathcal{E}_R , \end{aligned}$$
(201)

where

$$\begin{aligned} \mathcal{E}_R&= \Phi _0 \sum _a e_a n_a^\mathrm{(gc)} + \sum _a \int d^3 v^\mathrm{(gc)} F_a^* e_a \phi _1 (\mathbf{X} + {\varvec{\rho }}_a) - \frac{1}{4\pi } |\nabla (\Phi _0 + \phi _1)|^2 \nonumber \\& = \Phi _0 \sum _a e_a n_a^\mathrm{(gc)} - \frac{1}{4\pi } \nabla \Phi _0 \cdot \nabla (\Phi _0 + \phi _1) + \nabla \cdot \left( - \frac{1}{4\pi } \phi _1 \nabla (\Phi _0 + \phi _1) + {\varvec{\Phi }}_R \right) , \end{aligned}$$
(202)

and

$$\begin{aligned} {\varvec{\Phi }}_R&= \sum _a e_a \sum _{n=1}^\infty \frac{1}{n!} \sum _{i_1 = 1}^\infty \cdots \sum _{i_{n-1} = 1}^\infty \int dU \int d\mu \int d \xi \left[ D_a F_a^* {\varvec{\rho }}_a \rho _{ai_1} \cdots \rho _{ai_{n-1}} \frac{\partial ^{n-1} \phi _1 (\mathbf{X})}{\partial X_{i_1} \cdots \partial X_{i_{n-1}}} \right. \nonumber \\& \qquad - \frac{\partial ( D_a F_a^* {\varvec{\rho }}_a \rho _{ai_1} \cdots \rho _{ai_{n-1}} )}{\partial X_{i1}} \frac{\partial ^{n-2} \phi _1 (\mathbf{X})}{\partial X_{i_2} \cdots \partial X_{i_{n-1}}} \nonumber \\& \qquad \left. + \cdots + (-1)^{n-1} \frac{\partial ^{n-1} ( D_a F_a^* {\varvec{\rho }}_a \rho _{ai_1} \cdots \rho _{ai_{n-1}} )}{\partial X_{i_1} \cdots \partial X_{i_{n-1}}} \phi _1 (\mathbf{X}) \right] . \end{aligned}$$
(203)

The right-hand side of Eq. (199) is rewritten as

$$\begin{aligned} V' \left\langle K_{\mathcal{E}c} \right\rangle&= - \frac{\partial }{\partial s} \left( V' \left\langle \mathbf{Q}^\mathrm{C} \cdot \nabla s \right\rangle \right) + V' \sum _a \left\langle \int d^3 v^\mathrm{(gc)} \mathcal{S}_a H_a \right\rangle , \end{aligned}$$
(204)

where the energy flux \(\mathbf{Q}^\mathrm{C}\) due to collisions and finite gyroradii is given using Eq. (173) and

$$\begin{aligned} \mathbf{Q}^\mathrm{C} \equiv \sum _a \mathbf{Q}_a^\mathrm{C} \equiv \sum _{a,b} \mathbf{Q}_{ab}^\mathrm{C} . \end{aligned}$$
(205)

The last term on the right-hand side of Eq. (204) represents the external energy source. We also note that the formula in Eq. (186) with \(\mathcal{A}=\Phi _0\) yields

$$\begin{aligned}&\left\langle \frac{\partial }{\partial t} \left( \Phi _0 \sum _a e_a n_a^\mathrm{(gc)} \right) \right\rangle + \left\langle \nabla \cdot \left( \Phi _0 \mathbf{j}^\mathrm{(gc)}_L \right) \right\rangle \nonumber \\& = \left\langle \nabla \cdot \left[ \frac{\partial \Phi _0 }{\partial t} \left( \frac{\mathbf{E}_L}{4 \pi } + \mathbf{P}^\mathrm{(pol)}_L \right) - \Phi _0 \mathbf{j}^\mathrm{C}_L \right] \right\rangle - \left\langle \frac{\partial }{\partial t} \left[ \left( \frac{\mathbf{E}_L}{4 \pi } + \mathbf{P}^\mathrm{(pol)}_L \right) \cdot \nabla \Phi _0 \right] \right\rangle . \end{aligned}$$
(206)

Now, using Eqs. (201)–(206), the energy balance equation in Eq. (199) is rewritten as

$$\begin{aligned} \frac{\partial }{\partial t} \left( V' \langle \mathcal{E}^*\rangle \right) + \frac{\partial }{\partial s} \left( V' \left\langle ( \mathbf{Q} - \mathcal{E}^* \mathbf{u}_s ) \cdot \nabla s \right\rangle \right) = V' \sum _a \left\langle \int d^3 v^\mathrm{(gc)} \, \mathcal{S}_a ( H_a - e_a \Phi _0 ) \right\rangle , \end{aligned}$$
(207)

where the energy density \(\mathcal{E}^*\) and the energy flux \(\mathbf{Q}\) are defined by

$$\begin{aligned} \mathcal{E}^*\equiv & {} \sum _a \int d^3 v^\mathrm{(gc)} \, F_a \left( \frac{m_a}{2} \left| {\mathbf{V}}_0 + U \mathbf{b} + ({\mathbf{v}}'_c)_\perp - \frac{e_a}{m_a c}{} \mathbf{A}_1 \right| ^2 + H_{a1}^V \right. \nonumber \\&\left. + \frac{e_a^2}{2 B_0} \frac{\partial }{\partial \mu } \left\langle {\widetilde{\psi }}_a \left( 2 \widetilde{\phi _1} - {\widetilde{\psi }}_a \right) \right\rangle _\xi \right) - \mathbf{P}^\mathrm{(pol)} \cdot \nabla \Phi _0 \nonumber \\&+ \frac{1}{8\pi } \left( |\nabla (\Phi _0 + \phi _1)|^2 + |\mathbf{B}_0 + \mathbf{B}_1 |^2 \right) , \end{aligned}$$
(208)

and

$$\begin{aligned} \mathbf{Q} = \mathbf{Q}_c^* + \mathbf{Q}_R^* + { \mathbf{Q}^\mathrm{C*} } , \end{aligned}$$
(209)

respectively. Here, the energy fluxes \(\mathbf{Q}_c^*\), \(\mathbf{Q}_R^*\), and \(\mathbf{Q}^\mathrm{C*}\) are given by

$$\begin{aligned} \mathbf{Q}_c^*&= \sum _a \int d^3 v^\mathrm{(gc)} F_a \left[ ( H_a - e_a \Phi _0 ) {\mathbf{v}}_a^\mathrm{(gc)} + \frac{\partial \mathbf{A}_0}{\partial t} \times \left\{ -\mu \mathbf{b} \left( 1+ \frac{1}{B_0 \Omega _{a0}} \left[ \frac{2 V^\zeta }{R} \nabla R \cdot \nabla \chi \right. \right. \right. \right. \nonumber \\& \qquad \left. \left. \left. \left. + \frac{|\nabla \chi |^2}{2} \frac{\partial V^\zeta }{\partial \chi } \right] \right) + \frac{m_a U}{B_0} ({\mathbf{v}}_a^\mathrm{(gc)} - U\mathbf{b} - {\mathbf{V}}_0 )_\perp - \mathbf{N}_a \right\} \right. \nonumber \\& \qquad - \left( \frac{\partial \Phi _0}{\partial t} \right) _\chi \frac{e_a \mathbf{b}}{\Omega _{a0}} \times \left( {\mathbf{v}}_a^\mathrm{(gc)} - \mathbf{V}_0 - \frac{e_a}{m_a} \frac{\partial \Psi _a}{\partial {\mathbf{V}}_0 } \right) - \frac{\partial \Phi _0}{\partial t} \frac{c \mu }{\Omega _{a0}} \frac{2}{R} \nabla R \nonumber \\& \qquad \left. + \frac{\mu }{\Omega _{a0}} \left( \frac{1}{2} \frac{\partial V^\zeta }{\partial t} + \frac{\partial \chi }{\partial t} \frac{\partial V^\zeta }{\partial \chi } \right) \nabla \chi \right] + \left( \frac{\partial \Phi _0}{\partial t} \right) _\chi \frac{\nabla \chi }{|\nabla \chi |^2} \nabla \cdot \left( \sum _a \int d^3 v^\mathrm{(gc)} F_a \frac{c \mu }{2 \Omega _{a0}} \nabla \chi \right) \nonumber \\& \qquad + \frac{1}{4\pi } \frac{\partial \phi _1}{\partial t} \nabla ( \Phi _0 + \phi _1 ) - \frac{1}{4\pi } \frac{\partial (\mathbf{A}_0 + \mathbf{A}_1 )}{\partial t} \times ( \mathbf{B}_0 + \mathbf{B}_1 ) \nonumber \\& \qquad + \frac{1}{4\pi c} \left( \lambda \frac{\partial \mathbf{A}_1}{\partial t} + \alpha \frac{\partial \mathbf{A}_0}{\partial t} \right) - \frac{1}{4\pi }{\varvec{\Lambda }} \times \left( \frac{\partial \mathbf{A}_0}{\partial t} + \frac{\partial \chi }{\partial t} \nabla \zeta \right) , \end{aligned}$$
(210)
$$\begin{aligned} \mathbf{Q}_R^*&= \mathbf{Q}_R + \frac{\partial {\varvec{\Phi }}_R}{\partial t} , \end{aligned}$$
(211)

and

$$\begin{aligned} \mathbf{Q}^\mathrm{C*} \equiv \sum _a ( \mathbf{Q}_a^\mathrm{C} - e_a \Phi _0 {\varvec{\Gamma }}_a^\mathrm{C} ) , \end{aligned}$$
(212)

respectively, where \(\mathbf{N}_a\), \(\mathbf{Q}_R\), and \({\varvec{\Phi }}_R\) are defined in Eqs. (107), (200), and (203). It is confirmed later in Sect. 5.5 that the ensemble average of \(\langle \mathbf{Q} \cdot \nabla s \rangle\) contains all of the classical, neoclassical, and turbulent energy fluxes that were derived separately in previous works.

4.6 Toroidal angular momentum balance equation

The toroidal angular momentum balance equation is derived from the fact that \(\delta \mathcal{I} = 0\) under the infinitesimal toroidal rotation represented by

$$\begin{aligned} \delta \mathbf{x}_E = \epsilon \mathbf{e}_\zeta (\mathbf{X}) . \end{aligned}$$
(213)

Here, \(\epsilon\) is again an infinitesimally small constant, and \(\mathbf{e}_\zeta (\mathbf{X})\) is defined by

$$\begin{aligned} \mathbf{e}_\zeta (\mathbf{X}) = \partial \mathbf{X}(R, z, \zeta )/\partial \zeta = R^2 \nabla \zeta , \end{aligned}$$
(214)

where the right-handed cylindrical spatial coordinates \((R, z, \zeta )\) are used. We also define \(\hat{\mathbf{z}}\) by

$$\begin{aligned} \hat{\mathbf{z}} = R\nabla \zeta \times \nabla R , \end{aligned}$$
(215)

which represents the unit vector in the z-direction. Then, placing the origin of the position vector \(\mathbf{X}\) at \((R, z) = (0, 0)\), we have

$$\begin{aligned} \mathbf{e}_\zeta (\mathbf{X}) = \mathbf{X} \times \hat{\mathbf{z}} . \end{aligned}$$
(216)
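Equations (214)–(216) can be checked numerically. The sketch below assumes one particular Cartesian embedding of the right-handed coordinates \((R, z, \zeta )\), namely \(x = R\cos \zeta\), \(y = -R\sin \zeta\); this embedding, the helper names, and the sample point are illustrative assumptions and are not part of the original formulation.

```python
import math

def position(R, z, zeta):
    """Cartesian position for the ASSUMED embedding x = R cos(zeta), y = -R sin(zeta)."""
    return (R * math.cos(zeta), -R * math.sin(zeta), z)

def cross(a, b):
    """Cartesian cross product a x b."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def e_zeta_finite_difference(R, z, zeta, h=1e-6):
    """e_zeta = dX/dzeta at fixed (R, z), Eq. (214), by a central difference."""
    xp = position(R, z, zeta + h)
    xm = position(R, z, zeta - h)
    return tuple((p - m) / (2.0 * h) for p, m in zip(xp, xm))

def e_zeta_cross_product(R, z, zeta):
    """e_zeta = X x z_hat, Eq. (216), with the origin at (R, z) = (0, 0)."""
    return cross(position(R, z, zeta), (0.0, 0.0, 1.0))

def z_hat_from_gradients(zeta):
    """z_hat = R grad(zeta) x grad(R), Eq. (215), for the assumed embedding."""
    grad_R = (math.cos(zeta), -math.sin(zeta), 0.0)
    # R grad(zeta) = e_zeta / R is the unit vector in the zeta direction.
    R_grad_zeta = (-math.sin(zeta), -math.cos(zeta), 0.0)
    return cross(R_grad_zeta, grad_R)

if __name__ == "__main__":
    R, z, zeta = 1.7, 0.3, 0.9  # arbitrary sample point
    print(e_zeta_finite_difference(R, z, zeta))
    print(e_zeta_cross_product(R, z, zeta))
    print(z_hat_from_gradients(zeta))
```

With this embedding, both expressions for \(\mathbf{e}_\zeta\) agree and have magnitude R, which is consistent with \(\mathbf{e}_\zeta = R^2 \nabla \zeta\).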

Under the infinitesimal toroidal rotation, the variations of the vector variables are given as

$$\begin{aligned} \delta \mathbf{A}_1 = \epsilon \mathbf{A}_1\times \hat{\mathbf{z}} , \quad \delta \mathbf{A}_0 = \epsilon \mathbf{A}_0 \times \hat{\mathbf{z}} , \end{aligned}$$
(217)

while the other variations \(\delta t_E\), \(\delta \phi _1\), \(\cdots\) are all regarded as zero. Then, using these variations of the variables associated with the infinitesimal toroidal rotation, the canonical momentum balance equation is derived from Eq. (197) as

$$\begin{aligned}&\left\langle \frac{\partial ( \mathbf{P}_c \cdot \mathbf{e}_\zeta ) }{\partial t} \right\rangle + \frac{1}{V'} \frac{\partial }{\partial s} \left[ V' \left\langle \nabla s \cdot \left( {\varvec{\Pi }}_c \cdot \mathbf{e}_\zeta + \left( {\varvec{\Sigma }}_{A1} \times \mathbf{A}_1 + {\varvec{\Sigma }}_{A0} \times \mathbf{A}_0 \right) \cdot \hat{\mathbf{z}} + \mathbf{P}_{R\zeta } \right) \right\rangle \right] \nonumber \\&= \left\langle K_{Pc\zeta } \right\rangle . \end{aligned}$$
(218)

Here, the density of the canonical toroidal angular momentum is defined by

$$\begin{aligned} \mathbf{P}_c \cdot \mathbf{e}_\zeta = \sum _a \int d^3 v^\mathrm{(gc)} F_a (P_a^c)_\zeta , \end{aligned}$$
(219)

with the toroidal component of the canonical momentum denoted by

$$\begin{aligned} (P_a^c)_\zeta = \frac{e_a}{c} A_{a \zeta }^* = \frac{e_a}{c} A_{0 \zeta } + m_a ( U b_\zeta + V_\zeta ) , \end{aligned}$$
(220)

where \(b_\zeta \equiv I/B_0\) and \(V_\zeta \equiv {\mathbf{V}}_0 \cdot \mathbf{e}_\zeta\) represent the covariant toroidal components of \(\mathbf{b} \equiv \mathbf{B}_0 / B_0\) and \({\mathbf{V}}_0\), respectively. Definitions of \({\varvec{\Sigma }}_{A1}\) and \({\varvec{\Sigma }}_{A0}\) on the left-hand side of Eq. (218) are given in Eqs. (135) and \(\mathbf{P}_{R\zeta }\) is defined by

$$\begin{aligned}&\mathbf{P}_{R\zeta } = \sum _a e_a \sum _{n=1}^\infty \frac{1}{n!} \sum _{i_1 = 1}^\infty \cdots \sum _{i_{n-1} = 1}^\infty \int dU \int d\mu \int d\xi \nonumber \\&\times \left[ D_a F_a^* {\varvec{\rho }}_a \rho _{ai_1} \cdots \rho _{ai_{n-1}} \frac{\partial ^{n-1} \partial _\zeta \psi _a}{\partial X_{i_1} \cdots \partial X_{i_{n-1}}} \right. \nonumber \\&- \frac{\partial ( D_a F_a^* {\varvec{\rho }}_a \rho _{ai_1} \cdots \rho _{ai_{n-1}} )}{\partial X_{i_1}} \frac{\partial ^{n-2} \partial _\zeta \psi _a}{\partial X_{i_2} \cdots \partial X_{i_{n-1}}} \nonumber \\&+ \cdots + (-1)^{n-1} \frac{\partial ^{n-1} ( D_a F_a^* {\varvec{\rho }}_a \rho _{ai_1} \cdots \rho _{ai_{n-1}} )}{\partial X_{i_1} \cdots \partial X_{i_{n-1}}} \partial _\zeta \psi _a \nonumber \\&+ \frac{e_a}{m_a c^2} \left( D_a F_a^* {\varvec{\rho }}_a \rho _{ai_1} \cdots \rho _{ai_{n-1}} \frac{\partial ^{n-1} (\mathbf{A}_1 \cdot \partial _\zeta \mathbf{A}_1)}{\partial X_{i_1} \cdots \partial X_{i_{n-1}}} \right. \nonumber \\&- \frac{\partial ( D_a F_a^* {\varvec{\rho }}_a \rho _{ai_1} \cdots \rho _{ai_{n-1}} )}{\partial X_{i_1}} \frac{\partial ^{n-2} (\mathbf{A}_1 \cdot \partial _\zeta \mathbf{A}_1)}{\partial X_{i_2} \cdots \partial X_{i_{n-1}}} + \cdots \nonumber \\&\left. \left. + \, (-1)^{n-1} \frac{\partial ^{n-1} ( D_a F_a^* {\varvec{\rho }}_a \rho _{ai_1} \cdots \rho _{ai_{n-1}} )}{\partial X_{i_1} \cdots \partial X_{i_{n-1}}} (\mathbf{A}_1 \cdot \partial _\zeta \mathbf{A}_1) \right) \right] . \end{aligned}$$
(221)

On the right-hand side of Eq. (218), the variation of the canonical toroidal angular momentum due to collisions and external sources is given by

$$\begin{aligned}&K_{Pc\zeta } = \mathbf{K}_{Pc} \cdot \mathbf{e}_\zeta = \sum _a \int d^3 v^\mathrm{(gc)} \mathcal{K}_a (P_a^c)_\zeta . \end{aligned}$$
(222)

We now use Eqs. (218)–(222) and Eq. (186) with \(\mathcal{A} = A_{0\zeta } = - \chi\) to write the toroidal angular momentum balance equation as

$$\begin{aligned}&\left\langle \frac{\partial }{\partial t} \left[ P_{\parallel V \zeta } - \frac{1}{c} \left( \mathbf{P}^\mathrm{(pol)}_L + \frac{\mathbf{E}_L}{4\pi } \right) \cdot \nabla A_{0\zeta } \right] \right\rangle \nonumber \\&+ \frac{1}{V'}\frac{\partial }{\partial s} \left[ V' \left\{ \Pi _{\parallel V \zeta }^s + \Pi _{R\zeta }^s - \frac{1}{4\pi } \left\langle A_{1\zeta } (\nabla \times \mathbf{B}_1) \cdot \nabla s \right\rangle \right. \right. \nonumber \\&- \frac{1}{4\pi } \left\langle E_{L\zeta } E_L^s + B_{1\zeta }B_1^s \right\rangle + \frac{1}{4\pi c} \left\langle \frac{\partial \lambda }{\partial \zeta } A_1^s \right\rangle \left. \left. + \frac{1}{c} \left\langle \frac{\partial A_{0\zeta }}{\partial t} \left( \mathbf{P}^\mathrm{(pol)}_L + \frac{\mathbf{E}_L}{4\pi } \right) \cdot \nabla s \right\rangle \right\} \right] \nonumber \\&= \left\langle K_{Pc\zeta } \right\rangle + \frac{1}{c} \left\langle \nabla \cdot \left( A_{0\zeta } \mathbf{j}_L^\mathrm{C} \right) \right\rangle , \end{aligned}$$
(223)

where

$$\begin{aligned}&P_{\parallel V \zeta } = \sum _a \int d^3 v^\mathrm{(gc)} F_a m_a ( U b_\zeta + V_\zeta ) , \nonumber \\&\Pi _{\parallel V \zeta }^s = \sum _a \int d^3 v^\mathrm{(gc)} F_a m_a ( U b_\zeta + V_\zeta ) {\mathbf{v}}_a^\mathrm{(gc)} \cdot \nabla s , \nonumber \\&\Pi _{R\zeta }^s = \mathbf{P}_{R\zeta } \cdot \nabla s . \end{aligned}$$
(224)

Using Eqs. (142), (144), (171), (172), (180), (184), and (222), the right-hand side of Eq. (223) is rewritten as

$$\begin{aligned} \left\langle K_{Pc\zeta } \right\rangle + \frac{1}{c} \left\langle \nabla \cdot \left( A_{0\zeta } \mathbf{j}_L^\mathrm{C} \right) \right\rangle = - \frac{1}{V'} \frac{\partial }{\partial s} \left[ V' (\Pi ^\mathrm{C*})^s \right] + \sum _a \left\langle \int d^3 v^\mathrm{(gc)} \mathcal{S}_a m_a ( U b_\zeta + V_\zeta ) \right\rangle . \end{aligned}$$
(225)

The last term on the right-hand side of Eq. (225) represents the external source of the toroidal angular momentum. The radial flux of the toroidal angular momentum due to collisions and finite gyroradii is defined by

$$\begin{aligned} (\Pi ^\mathrm{C*})^s \equiv \sum _a \left\langle \left( {\varvec{\Pi }}_{a\zeta }^\mathrm{C} + \frac{e_a}{c } \chi {\varvec{\Gamma }}_a^\mathrm{C} \right) \cdot \nabla s \right\rangle , \end{aligned}$$
(226)

where \({\varvec{\Pi }}_{a\zeta }^\mathrm{C}\) is given by

$$\begin{aligned} {\varvec{\Pi }}_{a\zeta }^\mathrm{C} \equiv \sum _b {\varvec{\Pi }}_{ab\zeta }^\mathrm{C} , \end{aligned}$$
(227)

and \({\varvec{\Pi }}_{ab\zeta }^\mathrm{C}\) is defined by Eq. (174).

Substituting Eq. (225) into Eq. (223), the toroidal angular momentum balance equation is rewritten as

$$\begin{aligned}&\frac{\partial }{\partial t} \left( V' \left\langle P_{\parallel V \zeta } + \frac{1}{c} \left( \mathbf{P}^\mathrm{(pol)}_L + \frac{\mathbf{E}_L}{4\pi } \right) \cdot \nabla \chi \right\rangle \right) \nonumber \\&+ \frac{\partial }{\partial s} \left[ V' \left\{ \Pi _{\parallel V \zeta }^s + \Pi _{R\zeta }^s + (\Pi ^\mathrm{C*})^s - \frac{1}{4\pi } \left\langle A_{1\zeta } (\nabla \times \mathbf{B}_1) \cdot \nabla s \right\rangle \right. \right. \nonumber \\&- \frac{1}{4\pi } \left\langle E_{L1\zeta } E_{L1}^s + B_{1\zeta }B_1^s \right\rangle + \frac{1}{4\pi c} \left\langle \frac{\partial \lambda }{\partial \zeta } A_1^s \right\rangle \nonumber \\&\left. \left. - \frac{1}{c} \frac{\partial \chi (s,t)}{\partial t} \left\langle \left( \mathbf{P}^\mathrm{(pol)}_L + \frac{\mathbf{E}_L}{4\pi } \right) \cdot \nabla s \right\rangle - \left\langle P_{\parallel V \zeta } \mathbf{u}_s \cdot \nabla s \right\rangle \right\} \right] \nonumber \\&= V' \sum _a \left\langle \int d^3 v^\mathrm{(gc)} \, \mathcal{S}_a m_a ( U b_\zeta + V_\zeta ) \right\rangle . \end{aligned}$$
(228)

In Sect. 5.6, we derive the ensemble-averaged toroidal angular momentum balance equation from Eq. (228) to confirm that it is consistent with the conventional result up to the second order in \(\delta\).
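Equations (207) and (228) share the conservative form \(\partial (V' \langle \cdot \rangle )/\partial t + \partial (V' \cdot \text{flux})/\partial s = V' \cdot \text{source}\). The following minimal sketch illustrates why this form is convenient numerically: with a flux-difference update, the radially integrated quantity changes only through boundary fluxes and sources. The explicit Euler step, the uniform grid in s, and all names and sample values are illustrative assumptions; the sketch is not a discretization used in the present paper.

```python
def step_conservative(y, flux_faces, source, ds, dt):
    """One explicit Euler step of dy/dt + dF/ds = S on a uniform grid in s.

    y          : cell values of V' <density> (length n)
    flux_faces : values of V' * (radial flux) at cell faces (length n + 1)
    source     : cell values of V' * (source density) (length n)
    """
    n = len(y)
    if len(flux_faces) != n + 1 or len(source) != n:
        raise ValueError("need n + 1 face fluxes and n source values")
    return [y[i] - dt * (flux_faces[i + 1] - flux_faces[i]) / ds + dt * source[i]
            for i in range(n)]

def total(y, ds):
    """Radial integral of y, i.e. the volume-integrated conserved quantity."""
    return sum(y) * ds

if __name__ == "__main__":
    y0 = [1.0, 2.0, 3.0, 2.0]
    faces = [0.0, 0.4, -0.1, 0.3, 0.0]  # zero flux through both boundaries
    y1 = step_conservative(y0, faces, [0.0] * 4, ds=0.25, dt=0.01)
    print(total(y0, 0.25), total(y1, 0.25))
```

Because the interior face fluxes cancel in pairs, the total is unchanged when the boundary fluxes and the source vanish, mirroring the global conservation expressed by the balance equations.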

5 Separation into ensemble-averaged and turbulent parts

Kinetic theories of collisional and turbulent transport processes in magnetically confined plasmas are often developed as separate subjects because it is difficult to analyze both processes simultaneously: they are caused by different physical mechanisms involving very distinct spatiotemporal scales. In the kinetic theory of classical and neoclassical collisional transport (Braginskii 1965; Helander and Sigmar 2002; Hirshman and Sigmar 1981; Hinton and Hazeltine 1976), particle distribution functions and electromagnetic fields are generally assumed to be in a quasi-steady state and to have characteristic scale lengths which are comparable with the system size. In contrast, the gyrokinetic theory of turbulent transport driven by microinstabilities (Horton 2012) assumes that fluctuations in the distribution functions and electromagnetic fields vary on the transit time scale, defined by the ratio of the equilibrium scale length to the thermal velocity, and that their characteristic wavelengths in the directions perpendicular to the background magnetic field are given by the gyroradius of thermal particles. Historically, using the assumptions mentioned above and classical perturbation methods such as recursive techniques, drift kinetic and gyrokinetic equations for perturbed distribution functions (\(\delta f\)) in the case of the low-flow ordering were individually derived as governing equations for neoclassical and turbulent transport processes, respectively (Hazeltine and Meiss 1992; Rutherford and Frieman 1968; Taylor and Hastie 1968; Antonsen and Lane 1980; Catto et al. 1981; Frieman and Chen 1982). 
In the same way, the \(\delta f\) drift kinetic and gyrokinetic equations in the high-flow ordering were derived for toroidally rotating plasmas (Hinton and Wong 1985; Catto 1987; Sugama and Horton 1997b; Artun and Tang 1994; Sugama and Horton 1998), and the derivations of these equations based on the classical methods are comprehensively reviewed by Abel et al. (2013). On the other hand, the modern gyrokinetic equations derived from the Lie-transform techniques (Brizard and Hahm 2007; Hahm et al. 1988; Hahm 1988; Brizard 1989, 1995; Hahm 1996) govern the behavior of the full distribution function (full-F), and if the collision term is included, they should, in principle, simultaneously describe collisional and turbulent processes. Indeed, in Sects. 5.2 and 5.3, it is confirmed that the conventional \(\delta f\) drift kinetic and gyrokinetic equations for toroidally rotating plasmas can be derived from the full-F gyrokinetic equation, Eq. (140), with the collision term given in Eq. (166). In addition, consistency between the conventional and present formulations of transport theories is clarified further by showing that classical, neoclassical, and turbulent transport fluxes of particles, heat, and toroidal momentum separately defined in the previous works (Braginskii 1965; Hinton and Wong 1985; Catto 1987; Sugama and Horton 1997b; Artun and Tang 1994; Sugama and Horton 1998; Abel et al. 2013) are all included in the particle, energy, and toroidal momentum balance equations which are derived in Sects. 4.3, 4.5, and 4.6, respectively. This helps to clarify the physical meaning of the primary terms appearing in the complicated expressions of these balance equations.

To compare the results shown in the previous sections with those in the conventional recursive formulations of the collisional and turbulent transport (Hinton and Wong 1985; Catto 1987; Sugama and Horton 1997b; Artun and Tang 1994; Sugama and Horton 1998; Abel et al. 2013), we divide an arbitrary physical variable \({\mathcal {Q}}\) into the average and turbulent parts as

$$\begin{aligned} {\mathcal {Q}} = \langle {\mathcal {Q}} \rangle _\mathrm{ens} + \hat{\mathcal {Q}} , \end{aligned}$$
(229)

where \(\langle \cdots \rangle _\mathrm{ens}\) represents the ensemble average, and we immediately find \(\langle \hat{\mathcal {Q}} \rangle _\mathrm{ens} = 0\). We identify the zeroth-order fields \(\mathbf{A}_0\) and \(\mathbf{B}_0\) with the ensemble-averaged parts to write

$$\begin{aligned} \mathbf{A}_0 = \langle \mathbf{A} \rangle _\mathrm{ens}, \quad \mathbf{A}_1 = {\hat{\mathbf{A}}} , \quad \mathbf{B}_0 = \langle \mathbf{B} \rangle _\mathrm{ens} , \quad \mathbf{B}_1 = {\hat{\mathbf{B}}} . \end{aligned}$$
(230)
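The decomposition in Eq. (229) can be illustrated with a finite set of realizations. The sketch below is only a toy model: the ensemble is approximated by an arithmetic mean over a few hypothetical realizations sampled at the same points, and the function names and sample values are assumptions introduced for illustration.

```python
def ensemble_mean(realizations):
    """Pointwise arithmetic mean over realizations (a stand-in for <Q>_ens)."""
    n = len(realizations)
    return [sum(values) / n for values in zip(*realizations)]

def split_ensemble(realizations):
    """Return (<Q>_ens, list of turbulent parts Q_hat), following Eq. (229)."""
    mean = ensemble_mean(realizations)
    fluctuations = [[q - m for q, m in zip(r, mean)] for r in realizations]
    return mean, fluctuations

if __name__ == "__main__":
    # Three hypothetical realizations of Q sampled at three points.
    realizations = [[1.0, 2.0, 4.0], [3.0, 2.0, 0.0], [2.0, 5.0, 2.0]]
    mean, fluctuations = split_ensemble(realizations)
    print(mean)
    print(ensemble_mean(fluctuations))  # the turbulent parts average to zero
```

By construction, the turbulent parts obtained in this way have a vanishing ensemble mean, which is the property \(\langle \hat{\mathcal {Q}} \rangle _\mathrm{ens} = 0\) used throughout this section.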

We also assume that the zeroth-order electrostatic potential \(\Phi _0\) has no turbulent part,

$$\begin{aligned} \Phi _0 = \langle \Phi _0 \rangle _\mathrm{ens} , \end{aligned}$$
(231)

while the perturbation electrostatic potential \(\phi _1\) is written as

$$\begin{aligned} \phi _1 = \langle \phi _1 \rangle _\mathrm{ens} + {\hat{\phi }}_1 \equiv \Phi _1 + {\hat{\phi }} , \end{aligned}$$
(232)

where \(\Phi _1 \equiv \langle \phi _1 \rangle _\mathrm{ens}\) and \({\hat{\phi }} \equiv {\hat{\phi }}_1\) represent the ensemble-averaged and turbulent parts of the first-order electrostatic potential, respectively. When \(\Phi _0 = 0\), \(\Phi _1 \equiv \langle \phi _1 \rangle _\mathrm{ens} \ne 0\) gives the background \(\mathbf{E} \times \mathbf{B}\) flow of \({\mathcal {O}}(\delta v_T)\), which corresponds to the so-called low-flow ordering. Then, using Eq. (43), we have

$$\begin{aligned} \langle \psi \rangle _\mathrm{ens} = \Phi _1 , \quad {\hat{\psi }} = {\hat{\phi }} - \frac{1}{c} ({\mathbf{V}}_0 + {\mathbf{v}}' ) \cdot {\hat{\mathbf{A}}} , \end{aligned}$$
(233)

where

$$\begin{aligned} {\mathbf{v}}' = U \mathbf{b} +({\mathbf{v}}'_c)_\perp \end{aligned}$$
(234)

represents the particle velocity observed from the toroidally rotating frame.

We assume that the ensemble average \(\langle {\mathcal {Q}} \rangle _\mathrm{ens}\) of any variable \({\mathcal {Q}}\) considered here has a slow temporal variation subject to the so-called transport ordering,

$$\begin{aligned} \partial \ln \langle {\mathcal {Q}} \rangle _\mathrm{ens} / \partial t = {\mathcal {O}} (\delta ^2 v_T / L) , \end{aligned}$$
(235)

and that it has a gradient scale length L,

$$\begin{aligned} |\nabla \ln \langle {\mathcal {Q}} \rangle _\mathrm{ens} | = {\mathcal {O}} (1 / L) , \end{aligned}$$
(236)

where L is on the same order as gradient scale lengths of the equilibrium field and pressure profiles. We also impose the constraint of axisymmetry on \(\langle {\mathcal {Q}} \rangle _\mathrm{ens}\) that is written as

$$\begin{aligned} \partial \langle {\mathcal {Q}} \rangle _\mathrm{ens} / \partial \zeta = 0 . \end{aligned}$$
(237)

On the other hand, the turbulent part \(\hat{\mathcal {Q}}\) of \({\mathcal {Q}}\) is assumed to have gradient scale lengths L and \(\rho\) in the directions parallel and perpendicular to the equilibrium magnetic field \(\mathbf{B}_0\), respectively,

$$\begin{aligned} \mathbf{b} \cdot \nabla \ln \hat{\mathcal {Q}} = {\mathcal {O}} (1 / L) , \quad | \nabla _\perp \ln \hat{\mathcal {Q}} | = {\mathcal {O}} (1 / \rho ) , \end{aligned}$$
(238)

where \(\nabla _\perp = \nabla - \mathbf{b} \mathbf{b} \cdot \nabla\). In addition, the temporal variation of \(\hat{\mathcal {Q}}\) observed from the rotating frame with the toroidal velocity \({\mathbf{V}}_0 = V^\zeta \mathbf{e}_\zeta\) is assumed to have a characteristic frequency on the order of the transit frequency \(\sim v_T/L\),

$$\begin{aligned} \left( \frac{\partial }{\partial t} + V^\zeta \frac{\partial }{\partial \zeta } \right) \ln \hat{\mathcal {Q}} = {\mathcal {O}} \left( \frac{v_T}{L} \right) . \end{aligned}$$
(239)

Then, the temporal variation of \(\hat{\mathcal {Q}}\) observed from the laboratory frame is given by

$$\begin{aligned} \frac{\partial }{\partial t} \ln \hat{\mathcal {Q}} = - V^\zeta \frac{\partial }{\partial \zeta } \ln \hat{\mathcal {Q}} + {\mathcal {O}} \left( \frac{v_T}{L} \right) = {\mathcal {O}} \left( \Omega _0 \right) , \end{aligned}$$
(240)

where \(V^\zeta = {\mathcal {O}}(v_T/L)\) and \(\partial \ln \hat{\mathcal {Q}} / \partial \zeta = {\mathcal {O}}(\delta ^{-1})\) are used. The orderings described in Eqs. (239) and (240) are in accordance with those originally introduced by Cooper (1988) to represent ballooning instabilities in tokamaks with sheared toroidal flows.
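The last equality in Eq. (240) can be made explicit with a one-line estimate. It assumes the standard relations \(\delta \sim \rho /L\) and \(\Omega _0 \sim v_T/\rho\), which are not restated in this section:

$$\begin{aligned} V^\zeta \frac{\partial }{\partial \zeta } \ln \hat{\mathcal {Q}} = {\mathcal {O}} \left( \frac{v_T}{L} \right) \times {\mathcal {O}} (\delta ^{-1}) = {\mathcal {O}} \left( \frac{v_T}{L} \frac{L}{\rho } \right) = {\mathcal {O}} \left( \frac{v_T}{\rho } \right) = {\mathcal {O}} (\Omega _0) . \end{aligned}$$

Under these assumptions, the Doppler shift due to the toroidal rotation exceeds the rotating-frame frequency \({\mathcal {O}}(v_T/L)\) by a factor of \({\mathcal {O}}(\delta ^{-1})\) and therefore dominates the laboratory-frame time derivative.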

5.1 Ensemble-averaged and turbulent parts of the distribution function

In this subsection, we consider the local equilibrium distribution function which is the lowest order solution of the gyrokinetic equation in Eq. (140). The energy or Hamiltonian of the particle is a constant of motion under stationary electromagnetic fields, and it is normally used to represent the lowest order solution of the full-F kinetic equation by the local Maxwellian equilibrium distribution function for which the collision term vanishes. However, care is needed in using the Hamiltonian H defined in Eq. (50) for the local Maxwellian distribution. The Hamiltonian H represents the energy observed in the laboratory frame for the system with the large mean flow velocity \({\mathbf{V}}_0\). Therefore, the Maxwellian distribution \(\propto \exp (-H/T)\) is maximized when the particle velocity \({\mathbf{v}}\) in the laboratory frame vanishes, although it is natural for the maximum of the distribution to occur at the mean flow velocity \({\mathbf{v}} = {\mathbf{V}}_0\). We also note that the dominant component of H is given by the potential energy \(e \Phi _0 = {\mathcal {O}}(\delta ^{-1})\) rather than the kinetic energy, and accordingly the exponent of \(\exp (-H/T)\) is regarded as a quantity of \({\mathcal {O}}(\delta ^{-1})\), which is not formally appropriate to use for the lowest order distribution.

To obtain the suitable lowest order solution of the gyrokinetic Boltzmann equation in Eq. (140), we here define

$$\begin{aligned} H^* = H - e [\Phi _0 (\chi (\mathbf{X} + {\varvec{\rho }})) + \langle \Phi _1 \rangle (\chi (\mathbf{X} + {\varvec{\rho }})) ] - V^\zeta (\chi (\mathbf{X} + {\varvec{\rho }})) \left( P^c_\zeta + \frac{e}{c} \chi (\mathbf{X} + {\varvec{\rho }}) \right) , \end{aligned}$$
(241)

where the flux-surface functions \(\Phi _0\), \(\langle \Phi _1 \rangle\), and \(\chi\) are all evaluated not at the gyrocenter position \(\mathbf{X}\) but at the particle position \(\mathbf{x} \equiv \mathbf{X} + {\varvec{\rho }}\). Then, we find

$$\begin{aligned} \{ \mathbf{x} , H^* \} = \{ \mathbf{x} , H \} - {\mathbf{V}}_0 (\mathbf{x}) + {\mathcal {O}}(\delta ^2) , \end{aligned}$$
(242)

where \({\mathbf{V}}_0 = V^\zeta \mathbf{e}_\zeta\), \(\{ \mathbf{x} , P^c_\zeta \} = \mathbf{e}_\zeta\), and \(\{ \mathbf{x} , \mathbf{x} \} = {\mathcal {O}}(\delta ^2)\) are used. The gyrophase average of Eq. (241) is shown to be written as

$$\begin{aligned} \langle H^* \rangle _\xi&= H - e [\Phi _0 (\chi (\mathbf{X})) + \langle \Phi _1 \rangle (\chi (\mathbf{X} )) ] - V^\zeta (\chi (\mathbf{X} )) \left( P^c_\zeta + \frac{e}{c} \chi (\mathbf{X} ) \right) \nonumber \\& \qquad - \frac{\mu |\nabla \chi |^2}{2 \Omega _0} \frac{\partial V^\zeta }{\partial \chi } + {\mathcal {O}}(\delta ^2) \nonumber \\& = \epsilon ^* + e \langle {\hat{\psi }} \rangle _\xi + {\mathcal {O}}(\delta ^2) , \end{aligned}$$
(243)

where

$$\begin{aligned} \epsilon ^* \equiv \epsilon + \epsilon _1 , \end{aligned}$$
(244)
$$\begin{aligned} \epsilon \equiv \frac{1}{2} m U^2 + \mu B_0 + e {\widetilde{\Phi }}_1 - \frac{1}{2} m V_0^2 , \end{aligned}$$
(245)

and

$$\begin{aligned} \epsilon _1 \equiv - \frac{\mu }{\Omega _0} \left( \frac{2 V^\zeta }{R} \nabla R \cdot \nabla \chi + |\nabla \chi |^2 \frac{\partial V^\zeta }{\partial \chi } \right) . \end{aligned}$$
(246)

Here,

$$\begin{aligned} {\widetilde{\Phi }}_1 \equiv \Phi _1 - \langle \Phi _1 \rangle \end{aligned}$$
(247)

represents the poloidal-angle-dependent part of the first-order ensemble-averaged electrostatic potential. Note that the spatial functions on the right-hand side of Eqs. (244)–(247) are evaluated not at the particle position \(\mathbf{x} \equiv \mathbf{X} + {\varvec{\rho }}\) but at the gyrocenter position \(\mathbf{X}\). We also see

$$\begin{aligned} H^* = \epsilon + {\mathcal {O}}(\delta ) . \end{aligned}$$
(248)

We can see from Eqs. (242), (245), and (248) that \(H^*\) has no \({\mathcal {O}}(\delta ^{-1})\) term and its zeroth-order part \(\epsilon\) represents the energy observed from the frame rotating with the velocity \({\mathbf{V}}_0\). On the right-hand side of Eq. (245), the last term represents the centrifugal potential energy (Landau and Lifshitz 1976) while the sum of the first and second terms gives the kinetic energy in the rotating frame where the zeroth-order electric field \(\mathbf{E}'_0 = \mathbf{E}_0 + c^{-1} {\mathbf{V}}_0 \times \mathbf{B}_0\) vanishes and accordingly the zeroth-order electrostatic potential does not appear. In fact, \(\epsilon\) defined in Eq. (245) is used to describe the local Maxwellian equilibrium distribution function in the recursive formulations of the drift kinetic and gyrokinetic equations for toroidally rotating plasmas (Hinton and Wong 1985; Catto 1987; Sugama and Horton 1997b, 1998; Artun and Tang 1994; Abel et al. 2013).
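The origin of the centrifugal term can be traced with a short calculation. The sketch below uses Eq. (220) with \(A_{0\zeta } = -\chi\) and, as an assumption not restated in this section, takes the parallel and rotational kinetic part of H to be \(\frac{1}{2} m |{\mathbf{V}}_0 + U \mathbf{b}|^2\), as suggested by the form of Eq. (208). Species indices are dropped, and the differences between \(\chi (\mathbf{X} + {\varvec{\rho }})\) and \(\chi (\mathbf{X})\) are not tracked here:

$$\begin{aligned} P^c_\zeta + \frac{e}{c} \chi&= m ( U b_\zeta + V_\zeta ) , \nonumber \\ \frac{1}{2} m | {\mathbf{V}}_0 + U \mathbf{b} |^2 - m V^\zeta ( U b_\zeta + V_\zeta )&= \frac{1}{2} m U^2 + m U V^\zeta b_\zeta + \frac{1}{2} m V_0^2 - m U V^\zeta b_\zeta - m V_0^2 \nonumber \\&= \frac{1}{2} m U^2 - \frac{1}{2} m V_0^2 , \end{aligned}$$

where \({\mathbf{V}}_0 \cdot \mathbf{b} = V^\zeta b_\zeta\) and \(V^\zeta V_\zeta = V_0^2\) follow from \({\mathbf{V}}_0 = V^\zeta \mathbf{e}_\zeta\). Under these assumptions, the factor e/c cancels from the last term of Eq. (241), and the result reproduces the first and last terms on the right-hand side of Eq. (245).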

We now use \(H^*\) to define the distribution function \(F^{(H*)}\) by

$$\begin{aligned} F^{(H*)} \equiv N (\chi (\mathbf{x})) \left( \frac{m}{2 \pi T (\chi (\mathbf{x}))} \right) ^{3/2} \exp \left( - \frac{H^*}{T (\chi (\mathbf{x}))} \right) , \end{aligned}$$
(249)

where the flux-surface functions N and T are evaluated at the particle position \(\mathbf{x} \equiv \mathbf{X} + {\varvec{\rho }}\). Then, we find that the collision operator defined by Eq. (166) satisfies

$$\begin{aligned} C_{ab} [ F^{(H*)}_a, F^{(H*)}_b ] = {\mathcal {O}} (\delta ^2) , \end{aligned}$$
(250)

which is gyrophase-averaged to give

$$\begin{aligned} C_{ab} [ \langle F^{(H*)}_a \rangle _{\xi _a} , \langle F^{(H*)}_b \rangle _{\xi _b} ] = {\mathcal {O}} (\delta ^2) , \end{aligned}$$
(251)

where \(F^{(H*)}_a\) and \(F^{(H*)}_b\) are defined using Eq. (249) for particle species a and b. In order for Eqs. (250) and (251) to be rigorously satisfied, \(T_a = T_b\) is required, although when \(m_a/m_b \ll 1\) or \(m_b/m_a \ll 1\), they are approximately valid even for \(T_a \ne T_b\). Taking the gyrophase average of Eq. (249), we obtain

$$\begin{aligned} \langle F^{(H*)} \rangle _\xi& = N (\chi (\mathbf{X})) \left( \frac{m}{2 \pi T (\chi (\mathbf{X}))} \right) ^{3/2} \exp \left( - \frac{\langle H^* \rangle _\xi }{T (\chi (\mathbf{X}))} \right) + {\mathcal {O}} (\delta ^2) \nonumber \\& = F_0 - \frac{F_0}{T} \left( \epsilon _1 + e \langle {\hat{\psi }} \rangle _\xi \right) + {\mathcal {O}} (\delta ^2), \end{aligned}$$
(252)

where the zeroth-order distribution function \(F_0\) is defined by

$$\begin{aligned} F_0 = N (\chi (\mathbf{X})) \left( \frac{m}{2 \pi T (\chi (\mathbf{X}))} \right) ^{3/2} \exp \left( - \frac{\epsilon }{T (\chi (\mathbf{X}))} \right) . \end{aligned}$$
(253)

This function \(F_0\) coincides with the local equilibrium distribution function used in the recursive formulations of the drift kinetic and gyrokinetic equations for toroidally rotating plasmas (Hinton and Wong 1985; Catto 1987; Sugama and Horton 1997b; Artun and Tang 1994; Sugama and Horton 1998; Abel et al. 2013).
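The second line of Eq. (252) follows from a first-order expansion of the exponential. As a brief check, assuming that \(\epsilon _1/T\) and \(e \langle {\hat{\psi }} \rangle _\xi /T\) are both of \({\mathcal {O}}(\delta )\), Eqs. (243) and (244) give

$$\begin{aligned} \exp \left( - \frac{\langle H^* \rangle _\xi }{T} \right)&= \exp \left( - \frac{\epsilon }{T} \right) \exp \left( - \frac{\epsilon _1 + e \langle {\hat{\psi }} \rangle _\xi }{T} \right) + {\mathcal {O}}(\delta ^2) \nonumber \\&= \exp \left( - \frac{\epsilon }{T} \right) \left[ 1 - \frac{\epsilon _1 + e \langle {\hat{\psi }} \rangle _\xi }{T} \right] + {\mathcal {O}}(\delta ^2) . \end{aligned}$$

Multiplying by \(N (m/2\pi T)^{3/2}\) and using the definition of \(F_0\) in Eq. (253) then yields the stated form.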

Taking the logarithm of Eq. (252) and its time derivative along the gyrocenter orbit, we have

$$\begin{aligned} \frac{\text {d}}{{\text {d}}t} \ln \langle F^{(H*)} \rangle _\xi = \frac{{\text {d}} \chi }{{\text {d}}t} \left\{ \frac{\partial \ln N }{\partial \chi } + \frac{\partial \ln T }{\partial \chi } \left( \frac{\langle H^* \rangle _\xi }{T} - \frac{3}{2} \right) \right\} - \frac{1}{T} \frac{{\text {d}} \langle H^* \rangle _\xi }{{\text {d}}t} + {\mathcal {O}}(\delta ^2) . \end{aligned}$$
(254)
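The steps leading to Eq. (254) can be written out as follows. The sketch assumes that N and T depend on the gyrocenter variables only through \(\chi (\mathbf{X})\) and that their explicit time dependence is negligible to the order retained, consistent with the transport ordering in Eq. (235):

$$\begin{aligned} \ln \langle F^{(H*)} \rangle _\xi&= \ln N - \frac{3}{2} \ln T - \frac{\langle H^* \rangle _\xi }{T} + \frac{3}{2} \ln \frac{m}{2\pi } + {\mathcal {O}}(\delta ^2) , \nonumber \\ \frac{\text {d}}{{\text {d}}t} \left( - \frac{\langle H^* \rangle _\xi }{T} \right)&= - \frac{1}{T} \frac{{\text {d}} \langle H^* \rangle _\xi }{{\text {d}}t} + \frac{\langle H^* \rangle _\xi }{T} \frac{{\text {d}} \chi }{{\text {d}}t} \frac{\partial \ln T}{\partial \chi } . \end{aligned}$$

Combining the second relation with \({\text {d}} \ln N/{\text {d}}t = ({\text {d}} \chi /{\text {d}}t) \, \partial \ln N/\partial \chi\) and \({\text {d}} \ln T/{\text {d}}t = ({\text {d}} \chi /{\text {d}}t) \, \partial \ln T/\partial \chi\) reproduces the factor \(\langle H^* \rangle _\xi /T - 3/2\) multiplying \(\partial \ln T/\partial \chi\).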

Then, \({\text {d}} \langle H^* \rangle _\xi /{\text {d}}t\) on the right-hand side of Eq. (254) is given by

$$\begin{aligned}&\frac{{\text {d}}}{{\text {d}}t} \langle H^* \rangle _\xi = \frac{{\text {d}}}{{\text {d}}t}\epsilon ^* + \frac{{\text {d}}}{{\text {d}}t} ( e \langle {\hat{\psi }} \rangle _\xi ) + {\mathcal {O}}(\delta ^2) \nonumber \\&= - \frac{e}{c} U \mathbf{b} \cdot \frac{\partial \mathbf{A}_0}{\partial t} - \frac{{\text {d}} \chi }{{\text {d}}t} \left( m ( U b_\zeta + V_\zeta ) \frac{\partial V^\zeta }{\partial \chi } + e \frac{\partial \langle \Phi _1 \rangle }{\partial \chi } \right) \nonumber \\&- \left[ U \mathbf{b} \cdot \nabla \left( \frac{\mu |\nabla \chi |^2}{2 \Omega _0} \right) \right] \frac{\partial V^\zeta }{\partial \chi } + e \left( \frac{\partial }{\partial t} + V^\zeta \frac{\partial }{\partial \zeta } \right) \langle {\hat{\psi }} \rangle _\xi + {\mathcal {O}}(\delta ^2) . \end{aligned}$$
(255)

We also use

$$\begin{aligned} \frac{{\text {d}}}{{\text {d}}t} P^c_\zeta = \frac{{\text {d}}}{{\text {d}}t} \left( - \frac{e}{c} \chi + m ( U b_\zeta + V_\zeta ) \right) = - e \frac{\partial \langle {\hat{\psi }} \rangle _\xi }{\partial \zeta } + {\mathcal {O}}(\delta ) \end{aligned}$$
(256)

to obtain

$$\begin{aligned} \frac{{\text {d}} \chi }{{\text {d}}t}&= \frac{c m}{e} \frac{{\text {d}}}{{\text {d}}t} ( U b_\zeta + V_\zeta ) + c \frac{\partial \langle {\hat{\psi }} \rangle _\xi }{\partial \zeta } + {\mathcal {O}}(\delta ^2) \nonumber \\&= \frac{c m}{e} U \mathbf{b} \cdot \nabla _\epsilon ( U b_\zeta + V_\zeta ) + c \frac{\partial \langle {\hat{\psi }} \rangle _\xi }{\partial \zeta } + {\mathcal {O}}(\delta ^2) , \end{aligned}$$
(257)

where \(\nabla _\epsilon \equiv (\partial /\partial \mathbf{X})_\epsilon\) represents the partial derivative with respect to \(\mathbf{X}\) using \((\epsilon , \mu , \xi )\) instead of \((U, \mu , \xi )\) for fixed variables. Then, using Eqs. (255) and (257), Eq. (254) is rewritten as

$$\begin{aligned} \frac{{\text {d}}}{{\text {d}}t} \langle F^{(H*)} \rangle _\xi&= F_0 \left[ \frac{{\text {d}} \chi }{{\text {d}}t} \left\{ \frac{\partial \ln N }{\partial \chi } + \frac{e}{T} \frac{\partial \langle \Phi _1 \rangle }{\partial \chi } + \frac{\partial \ln T }{\partial \chi } \left( \frac{\epsilon }{T} - \frac{3}{2} \right) \right. \right. \nonumber \\& \qquad \left. + \, \frac{m}{T} ( U b_\zeta + V_\zeta ) \frac{\partial V^\zeta }{\partial \chi } \right\} + \left\{ U \mathbf{b} \cdot \nabla \left( \frac{\mu |\nabla \chi |^2}{2 \Omega _0} \right) \right\} \frac{1}{T} \frac{\partial V^\zeta }{\partial \chi } \nonumber \\& \qquad \left. + \, \frac{e}{c T} U \mathbf{b} \cdot \frac{\partial \mathbf{A}_0}{\partial t} - \frac{e}{T} \left( \frac{\partial }{\partial t} + V^\zeta \frac{\partial }{\partial \zeta } \right) \langle {\hat{\psi }} \rangle _\xi \right] + {\mathcal {O}}(\delta ^2). \end{aligned}$$
(258)
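As a consistency check on the signs in Eq. (258), the contribution of the last term of Eq. (254) can be written out by multiplying Eq. (255) by \(-1/T\):

$$\begin{aligned} - \frac{1}{T} \frac{{\text {d}} \langle H^* \rangle _\xi }{{\text {d}}t}&= \frac{e}{c T} U \mathbf{b} \cdot \frac{\partial \mathbf{A}_0}{\partial t} + \frac{{\text {d}} \chi }{{\text {d}}t} \left( \frac{m}{T} ( U b_\zeta + V_\zeta ) \frac{\partial V^\zeta }{\partial \chi } + \frac{e}{T} \frac{\partial \langle \Phi _1 \rangle }{\partial \chi } \right) \nonumber \\& \qquad + \left[ U \mathbf{b} \cdot \nabla \left( \frac{\mu |\nabla \chi |^2}{2 \Omega _0} \right) \right] \frac{1}{T} \frac{\partial V^\zeta }{\partial \chi } - \frac{e}{T} \left( \frac{\partial }{\partial t} + V^\zeta \frac{\partial }{\partial \zeta } \right) \langle {\hat{\psi }} \rangle _\xi + {\mathcal {O}}(\delta ^2) . \end{aligned}$$

Because every term in the bracket of Eq. (258) is already small in \(\delta\), the prefactor \(\langle F^{(H*)} \rangle _\xi\) arising from \({\text {d}} \langle F^{(H*)} \rangle _\xi /{\text {d}}t = \langle F^{(H*)} \rangle _\xi \, {\text {d}} \ln \langle F^{(H*)} \rangle _\xi /{\text {d}}t\) and the ratio \(\langle H^* \rangle _\xi /T\) in Eq. (254) can be replaced by \(F_0\) and \(\epsilon /T\), respectively, to the order retained.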

We now write the solution F of the gyrokinetic Boltzmann equation in Eq. (140) as

$$\begin{aligned} F&= \langle F^{(H*)} \rangle _\xi + F_1 \nonumber \\&= F_0 \left[ 1 - \frac{1}{T} \left( \epsilon _1 + e \langle {\hat{\psi }} \rangle _\xi \right) \right] + F_1 + {\mathcal {O}} (\delta ^2), \end{aligned}$$
(259)

where Eq. (252) is used. We can also write F as the sum of the ensemble-averaged and turbulent parts,

$$\begin{aligned} F \equiv \langle F \rangle _\mathrm{ens} + \hat{F} . \end{aligned}$$
(260)

Separating the first-order distribution function \(F_1\) into the ensemble-averaged and turbulent parts as

$$\begin{aligned} F_1 \equiv \langle F_1 \rangle _\mathrm{ens} + \hat{F}_1 \equiv f_1 + \hat{h} , \end{aligned}$$
(261)

and using Eq. (259), we find that the first-order ensemble-averaged part \(f_1 \equiv \langle F_1 \rangle _\mathrm{ens}\) is related to \(\langle F \rangle _\mathrm{ens}\) as

$$\begin{aligned} \langle F \rangle _\mathrm{ens} = F_0 \left( 1 - \frac{\epsilon _1}{T} \right) + f_1 + {\mathcal {O}} (\delta ^2), \end{aligned}$$
(262)

and \(\hat{h} \equiv \hat{F}_1\) represents the nonadiabatic part of the turbulent distribution function which satisfies

$$\begin{aligned} \hat{F} = - F_0 \frac{e \langle {\hat{\psi }} \rangle _\xi }{T} + \hat{h} + {\mathcal {O}} (\delta ^2). \end{aligned}$$
(263)

As shown later, the neoclassical and turbulent transport fluxes of particles, heat, and toroidal momentum can be evaluated from the first-order distribution functions \(f_1\) and \(\hat{h}\) appearing in Eqs. (262) and (263), respectively. These functions \(f_1\) and \(\hat{h}\) are governed by the drift kinetic and gyrokinetic equations, respectively, which are derived in the following subsections.
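
As a brief consistency check, Eqs. (262) and (263) can be recovered directly from Eq. (259). The sketch below assumes, as the notation suggests, that \(F_0\), \(T\), and \(\epsilon _1\) have no turbulent component, that the gyrophase average commutes with the ensemble average, and that \(\langle {\hat{\psi }} \rangle _\mathrm{ens} = 0\):

$$\begin{aligned} \langle F \rangle _\mathrm{ens}&= F_0 \left( 1 - \frac{\epsilon _1}{T} \right) - F_0 \frac{e}{T} \left\langle \langle {\hat{\psi }} \rangle _\xi \right\rangle _\mathrm{ens} + \langle F_1 \rangle _\mathrm{ens} + {\mathcal {O}} (\delta ^2) = F_0 \left( 1 - \frac{\epsilon _1}{T} \right) + f_1 + {\mathcal {O}} (\delta ^2) , \nonumber \\ \hat{F}&= F - \langle F \rangle _\mathrm{ens} = - F_0 \frac{e \langle {\hat{\psi }} \rangle _\xi }{T} + ( F_1 - f_1 ) + {\mathcal {O}} (\delta ^2) = - F_0 \frac{e \langle {\hat{\psi }} \rangle _\xi }{T} + \hat{h} + {\mathcal {O}} (\delta ^2) . \end{aligned}$$

Under these assumptions, the adiabatic response \(- F_0 e \langle {\hat{\psi }} \rangle _\xi / T\) is carried entirely by the turbulent part \(\hat{F}\), while the ensemble-averaged part retains only the \(\epsilon _1\) correction.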

5.2 Derivation of the drift kinetic equation

Taking the ensemble average of Eq. (140) and using Eqs. (258) and (259), we obtain the equation for the first-order ensemble-averaged distribution function \(f_{a1}\) for the species a as

$$\begin{aligned}&U ( \mathbf{b} \cdot \nabla \theta ) \frac{\partial f_{a1}}{\partial \theta } - \sum _b \left( C_{ab} [ f_{a1}, F_{b0} ] + C_{ab} [ F_{a0}, f_{b1} ] \right) \nonumber \\&= -F_{a0} \left[ \frac{{\text {d}} \chi }{{\text {d}}t} \left\{ \frac{\partial \ln N_a }{\partial \chi } + \frac{e_a}{T_a} \frac{\partial \langle \Phi _1 \rangle }{\partial \chi } + \frac{\partial \ln T_a }{\partial \chi } \left( \frac{\epsilon }{T_a} - \frac{3}{2} \right) \right. \right. \nonumber \\&\quad \left. + \frac{m_a}{T_a} ( U b_\zeta + V_\zeta ) \frac{\partial V^\zeta }{\partial \chi } \right\} + \left\{ U \mathbf{b} \cdot \nabla \left( \frac{\mu |\nabla \chi |^2}{2 \Omega _0} \right) \right\} \frac{1}{T_a} \frac{\partial V^\zeta }{\partial \chi } \nonumber \\&\quad \left. + \, \frac{e_a}{c T_a} U \mathbf{b} \cdot \frac{\partial \mathbf{A}_0}{\partial t} - \frac{e}{T} \left( \frac{\partial }{\partial t} + V^\zeta \frac{\partial }{\partial \zeta } \right) \langle {\hat{\psi }}_a \rangle _\xi \right] , \end{aligned}$$
(264)

where the second-order terms are neglected and \(f_{a1}\) is regarded as a function of \(( \chi , \theta , \epsilon , \mu , \sigma \equiv U/|U| )\) so that the partial derivative \(\partial /\partial \theta\) is taken with the variables \(( \chi , \epsilon , \mu , \sigma )\) held constant. We now define \(g_a \equiv g_a ( \chi , \theta , \epsilon , \mu , \sigma )\) by

$$\begin{aligned} g_a \equiv f_{a1} - F_{a0} \frac{e_a}{T_a} \int ^\theta \frac{{\text {d}}\theta }{\mathbf{B}_0 \cdot \nabla \theta } \left( \mathbf{B}_0 \cdot \mathbf{E}^{(A)} - \frac{B_0^2}{\langle B_0^2 \rangle } \langle \mathbf{B}_0 \cdot \mathbf{E}^{(A)} \rangle \right) , \end{aligned}$$
(265)

where \(\mathbf{E}^{(A)}\) represents the inductive electric field given by

$$\begin{aligned} \mathbf{E}^{(A)} \equiv - \frac{1}{c} \frac{\partial \mathbf{A}_0}{\partial t} . \end{aligned}$$
(266)

Then, Eq. (264) is rewritten as

$$\begin{aligned}&U ( \mathbf{b} \cdot \nabla \theta ) \frac{\partial g_a}{\partial \theta } - \sum _b \left( C_{ab} [ g_a, F_{b0} ] + C_{ab} [ F_{a0}, g_b ] \right) \nonumber \\&\quad = \frac{1}{T_a} F_{a0} ( W_{a1} X_{a1} + W_{a2} X_{a2} + W_{aV} X_V + W_{aE} X_E ) , \end{aligned}$$
(267)

where the thermodynamic forces \((X_{a1} , X_{a2}, X_V, X_E )\) are defined by

$$\begin{aligned}&X_{a1} \equiv - \frac{1}{N_a} \frac{\partial (N_a T_a)}{\partial \chi } - e_a \frac{\partial \langle \Phi _1 \rangle }{\partial \chi } , \; \; X_{a2} \equiv - \frac{\partial T_a}{\partial \chi } , \nonumber \\&X_V \equiv - \frac{\partial V^\zeta }{\partial \chi } = c \frac{\partial ^2 \Phi _0}{\partial \chi ^2} , \; \; X_E \equiv \frac{\langle \mathbf{B}_0 \cdot \mathbf{E}^{(A)} \rangle }{ \langle B_0^2 \rangle ^{1/2}} , \end{aligned}$$
(268)

and the functions \((W_{a1}, W_{a2}, W_{aV}, W_{aE})\) are defined by

$$\begin{aligned}& W_{a1} \equiv \frac{m_a c}{e_a} U \mathbf{b} \cdot \nabla (U b_\zeta + V_\zeta ) , \; \; W_{a2} \equiv W_{a1} \left( \frac{\epsilon }{T_a} - \frac{5}{2} \right) , \nonumber \\& W_{aV} \equiv \frac{m_a c}{2e_a} U \mathbf{b} \cdot \nabla \left[ m_a (U b_\zeta + V_\zeta )^2 + \mu \frac{|\nabla \chi |^2}{B_0} \right] , \; \; W_{aE} \equiv \frac{e_a U B_0}{\langle B_0^2 \rangle ^{1/2}} . \end{aligned}$$
(269)

Here, it should be noted that the parallel derivative \(\mathbf{b} \cdot \nabla =(\mathbf{b} \cdot \nabla \theta ) \partial /\partial \theta\) for axisymmetric functions is taken with the variables \(( \chi , \epsilon , \mu , \sigma )\) held constant. Equation (267) agrees with the well-known linearized drift kinetic equation, on which the neoclassical transport theory for the toroidally rotating axisymmetric plasma is based (Hinton and Wong 1985; Catto 1987; Sugama and Horton 1997b).

Taking the velocity-space integral of Eq. (267) yields the first-order ensemble-averaged continuity equation,

$$\begin{aligned} \mathbf{B}_0 \cdot \nabla \left( \frac{1}{B_0} \int {\text {d}}^3 v \; g_a U \right) = - \nabla \cdot \left( n_{a0} \mathbf{u}_{\perp a1} \right) , \end{aligned}$$
(270)

where

$$\begin{aligned} \int {\text {d}}^3 v \equiv \sum _{\sigma = \pm 1} \frac{2 \pi B_0}{m_a} \int _{\Xi _a}^{\infty } d \epsilon \int _{0}^{\epsilon - \Xi _a} d \mu , \end{aligned}$$
(271)

and

$$\begin{aligned} \Xi _a \equiv e_a {\widetilde{\Phi }}_1 - \frac{1}{2} m_a V_0^2 . \end{aligned}$$
(272)

Here, the equilibrium density \(n_{a0}\) is defined by the velocity-space integral of the local Maxwellian distribution function \(F_{a0}\) in Eq. (253) as

$$\begin{aligned} n_{a0} = N_a \exp \left( - \frac{\Xi _a}{T_a} \right) , \end{aligned}$$
(273)

and the first-order ensemble-averaged perpendicular fluid velocity \(\mathbf{u}_{\perp a1}\) is given by

$$\begin{aligned} \mathbf{u}_{\perp a1}\equiv & {} \frac{c}{e_a B_0} \mathbf{b} \times \left[ \frac{\nabla (n_{a0} T_a)}{n_{a0}} + e_a \nabla \Phi _1 + m_a {\mathbf{V}}_0 \cdot \nabla {\mathbf{V}}_0 \right] \nonumber \\= & {} \frac{c}{e_a B_0} (\nabla \chi \times \mathbf{b}) \left( X_{a1} + \frac{\Xi _a}{T_a} X_{a2} + m_a R^2 V^\zeta X_V \right) . \end{aligned}$$
(274)

We can see that \(\mathbf{u}_{\perp a1}\) is driven by the pressure gradient, the electrostatic field, and the inertial force \(- m_a {\mathbf{V}}_0 \cdot \nabla {\mathbf{V}}_0\). We here take the ensemble average of Eq. (109) and retain only its lowest order part in \(\delta\) to give the equilibrium part of Ampère’s law,

$$\begin{aligned} \nabla ^2 \mathbf{A}_0 + \frac{4\pi }{c} \mathbf{j}_0 =0 , \end{aligned}$$
(275)

where the equilibrium current \(\mathbf{j}_0\) is given by

$$\begin{aligned} \mathbf{j}_0& = \langle (\mathbf{j}^{(\mathrm{gc})})_T \rangle _\mathrm{ens} + \nabla \times \left( \langle \mathbf{M} \rangle _\mathrm{ens} + \frac{c}{4\pi } \langle {\varvec{\Lambda }}\rangle _\mathrm{ens} \right) \nonumber \\&= \sum _a e_a \left( \int d^3 v \; g_a U \mathbf{b} + n_{a0} \mathbf{u}_{\perp a1} \right) . \end{aligned}$$
(276)

This agrees with the conventional representation of the equilibrium current in the neoclassical transport theory (Hinton and Wong 1985; Catto 1987; Sugama and Horton 1997b). In deriving Eq. (276), Eqs. (262) and (265) are used to express the ensemble-averaged part of the distribution function \(F_a\), by which the lowest order parts of Eqs. (75) and (103)–(106) are evaluated. It is confirmed from Eq. (270) that the expression of the equilibrium current \(\mathbf{j}_0\) given in the last line of Eq. (276) is consistent with the solenoidal condition, \(\nabla \cdot \mathbf{j}_0 =0\).
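
The solenoidal property can be sketched explicitly. Writing \(\mathbf{b} = \mathbf{B}_0 / B_0\) and using \(\nabla \cdot \mathbf{B}_0 = 0\), the divergence of the last line of Eq. (276) becomes

$$\begin{aligned} \nabla \cdot \mathbf{j}_0&= \sum _a e_a \left[ \nabla \cdot \left( \frac{\mathbf{B}_0}{B_0} \int d^3 v \; g_a U \right) + \nabla \cdot \left( n_{a0} \mathbf{u}_{\perp a1} \right) \right] \nonumber \\&= \sum _a e_a \left[ \mathbf{B}_0 \cdot \nabla \left( \frac{1}{B_0} \int d^3 v \; g_a U \right) + \nabla \cdot \left( n_{a0} \mathbf{u}_{\perp a1} \right) \right] = 0 , \end{aligned}$$

where the bracket vanishes for each species separately by Eq. (270). In this form, the parallel flow carried by \(g_a\) is seen to act as a return flow that compensates the divergence of the perpendicular flow \(n_{a0} \mathbf{u}_{\perp a1}\).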

5.3 Derivation of the gyrokinetic equation for the turbulent part of the distribution function

Taking the turbulent part of Eq. (140) and using Eqs. (257)–(259) and (261), we obtain the equation for the nonadiabatic part \(\hat{h}_a\) of the turbulent distribution function for the species a as

$$\begin{aligned}&\frac{{\text {d}} \hat{h}_a}{dt} = \left[ \frac{\partial }{\partial t} + \left( {\mathbf{V}}_0 + U \mathbf{b} + {\mathbf{v}}_{da} - \frac{c}{B_0} \nabla \langle {\hat{\psi }}_a \rangle _\xi \times \mathbf{b} \right) { \cdot } \nabla \right] \hat{h}_a \nonumber \\&= \frac{1}{T_{a0}} F_{a0} \left[ c \frac{\partial \langle {\hat{\psi }}_a \rangle _\xi }{\partial \zeta } \left\{ X_{a1} + X_{a2} \left( \frac{\epsilon }{T_a} - \frac{5}{2} \right) + X_V m_a ( U b_\zeta + V_\zeta ) \right\} \right. \nonumber \\& \qquad \left. + e_a \left( \frac{\partial }{\partial t} + V^\zeta \frac{\partial }{\partial \zeta } \right) \langle {\hat{\psi }}_a \rangle _\xi \right] + \sum _b \langle C_{ab} [ \hat{h}_a , F_{b0} ] + C_{ab} [ F_{a0} , \hat{h}_b ] \rangle _\xi , \end{aligned}$$
(277)

where \(\hat{h}_a\) is regarded as a function of \(( \chi , \theta , \epsilon , \mu , \sigma \equiv U/|U| )\) and \({\mathbf{v}}_{da}\) is the drift velocity defined by

$$\begin{aligned} {\mathbf{v}}_{da} = \frac{c}{e_a B_0} \mathbf{b} \times \left[ e_a \nabla \langle \Phi _1 \rangle + \mu \nabla B_0 + m_a ( U^2 \mathbf{b}\cdot \nabla \mathbf{b} + 2 U {\mathbf{V}}_0 \cdot \nabla \mathbf{b} + {\mathbf{V}}_0 \cdot \nabla {\mathbf{V}}_0 ) \right] . \end{aligned}$$
(278)

The turbulent part \({\hat{{\mathbf{v}}}}_a^\mathrm{(gc)}\) of the gyrocenter drift velocity \({\mathbf{v}}_a^\mathrm{(gc)} = d \mathbf{X}_a / dt\) is written as

$$\begin{aligned} {\hat{{\mathbf{v}}}}_a^\mathrm{(gc)} = \frac{c}{B_0} \mathbf{b} \times \nabla \langle {\hat{\psi }}_a (\mathbf{X} + {\varvec{\rho }}_a, t) \rangle _\xi + {\mathcal {O}}(\delta ^2) . \end{aligned}$$
(279)

Equation (277) is valid to the lowest order in \(\delta\) and agrees with the conventional gyrokinetic equation for toroidally rotating plasmas derived using the WKB representation (Sugama and Horton 1998; Abel et al. 2013).

The turbulent part of the current density given by Eq. (98) is written as the sum of longitudinal and transverse parts,

$$\begin{aligned} {\hat{\mathbf{j}}} \equiv {\hat{\mathbf{j}}}_G = {\hat{\mathbf{j}}}_L + {\hat{\mathbf{j}}}_T . \end{aligned}$$
(280)

Here, it is found from Eq. (97) that the longitudinal turbulent current density \({\hat{\mathbf{j}}}_L\) satisfies

$$\begin{aligned} {\hat{\mathbf{j}}}_L = \frac{1}{4 \pi } \nabla \hat{\lambda } . \end{aligned}$$
(281)

Then, using Eqs. (83), (98), (236), (238), and (281), we obtain

$$\begin{aligned} \nabla ^2 \hat{\lambda } = - \nabla \cdot [ ( \nabla ^2 {\hat{\phi }}) {\mathbf{V}}_0 ] + {\mathcal {O}} (\delta ^0) = - \nabla ^2 ( {\mathbf{V}}_0 \cdot \nabla {\hat{\phi }} ) + {\mathcal {O}} (\delta ^0) , \end{aligned}$$
(282)

which gives

$$\begin{aligned} \hat{\lambda } = - {\mathbf{V}}_0 \cdot \nabla {\hat{\phi }} + {\mathcal {O}} (\delta ^2) = \frac{\partial {\hat{\phi }} }{\partial t} + {\mathcal {O}} (\delta ^2) , \end{aligned}$$
(283)

where Eq. (240) is used. Substituting Eq. (283) into Eq. (281) yields

$$\begin{aligned} {\hat{\mathbf{j}}}_L = - \frac{1}{4\pi } \frac{\partial {\hat{\mathbf{E}}}_L }{\partial t} + {\mathcal {O}} (\delta ) , \end{aligned}$$
(284)

where

$$\begin{aligned} {\hat{\mathbf{E}}}_L \equiv - \nabla {\hat{\phi }} \end{aligned}$$
(285)

represents the longitudinal part of the turbulent electric field. The transverse turbulent current density \({\hat{\mathbf{j}}}_T\) is written as

$$\begin{aligned} {\hat{\mathbf{j}}}_T = - \left( [ \nabla _\perp ^2 {\hat{\phi }} (\mathbf{x}) ] {\mathbf{V}}_0 \right) _T + \sum _a e_a \int d^3 v \, \hat{h}_a ( \mathbf{x} - {\varvec{\rho }}_a ) {\mathbf{v}}' + {\mathcal {O}}(\delta ) , \end{aligned}$$
(286)

where Eqs. (83), (98), and (263) are used. Here, \(( [ \nabla _\perp ^2 {\hat{\phi }} (\mathbf{x}) ] {\mathbf{V}}_0 )_T\) denotes the transverse part of \([ \nabla _\perp ^2 {\hat{\phi }} (\mathbf{x}) ] {\mathbf{V}}_0\) and it is given in the WKB representation (Sugama and Horton 1998) by \(- {\hat{\phi }}_{\mathbf{k}_\perp } [ \mathbf{k}_\perp \times ( {\mathbf{V}}_0 \times \mathbf{k}_\perp )]\), where \(\mathbf{k}_\perp\) is the wavenumber vector perpendicular to the background magnetic field.
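
The WKB expression quoted above can be checked with a short calculation. As a sketch, we assume that \(\nabla _\perp\) is replaced by \(i \mathbf{k}_\perp\) and that the transverse projection is taken with respect to \(\mathbf{k}_\perp\), i.e., the parallel wavenumber is neglected in the projection. Then

$$\begin{aligned} \left( [ \nabla _\perp ^2 {\hat{\phi }} ] {\mathbf{V}}_0 \right) _T \; \rightarrow \; - k_\perp ^2 {\hat{\phi }}_{\mathbf{k}_\perp } \left( {\mathbf{V}}_0 - \frac{\mathbf{k}_\perp ( \mathbf{k}_\perp \cdot {\mathbf{V}}_0 )}{k_\perp ^2} \right) = - {\hat{\phi }}_{\mathbf{k}_\perp } \left[ k_\perp ^2 {\mathbf{V}}_0 - \mathbf{k}_\perp ( \mathbf{k}_\perp \cdot {\mathbf{V}}_0 ) \right] = - {\hat{\phi }}_{\mathbf{k}_\perp } [ \mathbf{k}_\perp \times ( {\mathbf{V}}_0 \times \mathbf{k}_\perp ) ] , \end{aligned}$$

where the last equality uses the vector identity \(\mathbf{a} \times ( \mathbf{b} \times \mathbf{c} ) = \mathbf{b} ( \mathbf{a} \cdot \mathbf{c} ) - \mathbf{c} ( \mathbf{a} \cdot \mathbf{b} )\) with \(\mathbf{a} = \mathbf{c} = \mathbf{k}_\perp\) and \(\mathbf{b} = {\mathbf{V}}_0\).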

Now, using Eqs. (238) and (263), the turbulent part of the gyrokinetic Poisson equation given in Eq. (83) is written to the lowest order in \(\delta\) as

$$\begin{aligned} - \nabla _\perp ^2 {\hat{\phi }} (\mathbf{x}) + \lambda _D^{-2} \left( {\hat{\phi }} (\mathbf{x}) - \frac{{\mathbf{V}}_0}{c} \cdot {\hat{\mathbf{A}}} (\mathbf{x}) \right) = 4 \pi \sum _a e_a \int d^3 v \, \hat{h}_a ( \mathbf{x} - {\varvec{\rho }}_a ) , \end{aligned}$$
(287)

where the Debye length \(\lambda _D\) appearing on the left-hand side of Eq. (287) is defined by

$$\begin{aligned} \lambda _D^{-2} \equiv 4\pi \sum _a \frac{n_a e_a^2}{T_a} . \end{aligned}$$
(288)

With the help of Eq. (286), the turbulent part of the gyrokinetic Ampère’s law given in Eq. (100) is written as

$$\begin{aligned} \frac{1}{c} \left( [ \nabla _\perp ^2 {\hat{\phi }} (\mathbf{x}) ] {\mathbf{V}}_0 \right) _T - \, \nabla _\perp ^2 {\hat{\mathbf{A}}}(\mathbf{x}) = \frac{4 \pi }{c} \sum _a e_a \int d^3 v \, \hat{h}_a ( \mathbf{x} - {\varvec{\rho }}_a ) {\mathbf{v}}' . \end{aligned}$$
(289)

Equations (287) and (289) agree with the gyrokinetic Poisson equation and the gyrokinetic Ampère’s law derived by conventional recursive formulations (Sugama and Horton 1998; Abel et al. 2013) for toroidally rotating plasmas. Note that, for wavelengths longer than the Debye length \(\lambda _D\), the first term \(- \nabla _\perp ^2 {\hat{\phi }} (\mathbf{x})\) on the left-hand side of Eq. (287) can be neglected and then Eq. (287) reduces to the quasineutrality condition which coincides with Eq. (A1) in Sugama and Horton (1998) and Eq. (146) in Abel et al. (2013). For this case, the term including \(\nabla _\perp ^2 {\hat{\phi }} (\mathbf{x})\) on the left-hand side of Eq. (289) is correspondingly neglected and Eq. (289) gives the same gyrokinetic Ampère’s law as given by Eqs. (A2) and (A3) in Sugama and Horton (1998) and Eq. (148) in Abel et al. (2013). We also recall that the Landau collision operator shown in Eq. (161) is valid for such wavelengths longer than \(\lambda _D\).
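
The long-wavelength limit can be made explicit with a rough estimate. Assuming a WKB-like replacement \(- \nabla _\perp ^2 {\hat{\phi }} \rightarrow k_\perp ^2 {\hat{\phi }}\), which is introduced here only for illustration, the ratio of the first term to the second term on the left-hand side of Eq. (287) scales as \(k_\perp ^2 \lambda _D^2\). For \(k_\perp \lambda _D \ll 1\), Eq. (287) then reduces to

$$\begin{aligned} \lambda _D^{-2} \left( {\hat{\phi }} (\mathbf{x}) - \frac{{\mathbf{V}}_0}{c} \cdot {\hat{\mathbf{A}}} (\mathbf{x}) \right) = 4 \pi \sum _a e_a \int d^3 v \, \hat{h}_a ( \mathbf{x} - {\varvec{\rho }}_a ) , \end{aligned}$$

which is the quasineutrality condition referred to above, and the corresponding limit of Eq. (289) is

$$\begin{aligned} - \, \nabla _\perp ^2 {\hat{\mathbf{A}}}(\mathbf{x}) = \frac{4 \pi }{c} \sum _a e_a \int d^3 v \, \hat{h}_a ( \mathbf{x} - {\varvec{\rho }}_a ) {\mathbf{v}}' . \end{aligned}$$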

5.4 Ensemble-averaged particle balance equation

In this subsection and the following subsections, we use the results from Secs. 5.1–5.3 to derive the ensemble-averaged particle, energy, and toroidal angular momentum balance equations, which are shown to contain all classical, neoclassical, and turbulent transport fluxes given in the previous works based on the recursive formulations (Hinton and Wong 1985; Catto 1987; Sugama and Horton 1997b, 1998; Abel et al. 2013).

Taking the ensemble average of Eq. (177) and subsequently its flux-surface average, we obtain

$$\begin{aligned} \left\langle \frac{\partial \langle n_a^\mathrm{(gc)} \rangle _\mathrm{ens}}{\partial t} \right\rangle + \frac{1}{V'} \frac{\partial }{\partial s} \left( V' \left\langle \left\langle \left( {\varvec{\Gamma }}_a^\mathrm{(gc)} + {\varvec{\Gamma }}_a^\mathrm{C} \right) \cdot \nabla s \right\rangle \right\rangle \right) = \left\langle \int d^3 v^\mathrm{(gc)} \mathcal{S}_a \right\rangle , \end{aligned}$$
(290)

where

$$\begin{aligned} \langle n_a^\mathrm{(gc)} \rangle _\mathrm{ens} = n_{a0} + {\mathcal {O}}(\delta ) , \end{aligned}$$
(291)

and

$$\begin{aligned} \langle \langle \cdots \rangle \rangle \end{aligned}$$
(292)

represents a double average over the flux surface and the ensemble. On the right-hand side of Eq. (290), the source term \(\mathcal{S}_a\) is regarded as being of \({\mathcal {O}} (\delta ^2)\), like all other terms in Eq. (290), and it is assumed to have no turbulent component so that \(\mathcal{S}_a = \langle \mathcal{S}_a \rangle _\mathrm{ens}\).

The gyrocenter particle flux \({\varvec{\Gamma }}_a^\mathrm{(gc)} \equiv n_a^\mathrm{(gc)} \mathbf{u}_a^\mathrm{(gc)}\) is defined in Eq. (179) and its radial component is double-averaged over the flux surface and the ensemble to give

$$\begin{aligned} (\Gamma _a^\mathrm{(gc)})^s \equiv \left\langle \left\langle {\varvec{\Gamma }}_a^\mathrm{(gc)} \cdot \nabla s \right\rangle \right\rangle = (\Gamma _a^\mathrm{NA})^s + (\Gamma _a^\mathrm{A})^s , \end{aligned}$$
(293)

where the nonturbulent part \((\Gamma _a^\mathrm{NA})^s\) and the turbulence-driven part \((\Gamma _a^\mathrm{A})^s\) are written using Eqs. (260)–(263) as

$$\begin{aligned} (\Gamma _a^\mathrm{NA})^s\equiv \, & {} \left\langle \int d^3 v^\mathrm{(gc)} \langle F_a \rangle _\mathrm{ens} \langle {\mathbf{v}}_a^\mathrm{(gc)} \rangle _\mathrm{ens} \cdot \nabla s \right\rangle \nonumber \\= & {} (\Gamma _a^\mathrm{ncl})^s + (\Gamma _a^H)^s + (\Gamma _a^{(E)})^s +{\mathcal {O}}(\delta ^3) , \end{aligned}$$
(294)

and

$$\begin{aligned} (\Gamma _a^\mathrm{A})^s\equiv \, & {} \left\langle \int d^3 v^\mathrm{(gc)} \langle \hat{F}_a \hat{\mathbf{v}}_a^\mathrm{(gc)} \rangle _\mathrm{ens} \cdot \nabla s \right\rangle \nonumber \\= & {} - \left\langle \left\langle \frac{c}{B_0} \int d^3 v^\mathrm{(gc)} \hat{h}_a ( \nabla {\hat{\psi }}_a \times \mathbf{b} ) \cdot \nabla s \right\rangle \right\rangle + {\mathcal {O}}(\delta ^3) , \end{aligned}$$
(295)

respectively. On the right-hand side of Eq. (294), the radial component of the ensemble-averaged gyrocenter drift velocity is given by

$$\begin{aligned} \langle {\mathbf{v}}_a^\mathrm{(gc)} \rangle _\mathrm{ens} \cdot \nabla \chi&= \frac{m_a c}{e_a} \frac{B_0}{B_{\parallel a}^*} U \mathbf{b}\cdot \nabla _{\epsilon ^*} ( U b_\zeta + V_\zeta ) - \frac{1}{B_0} \left( \frac{\partial \mathbf{A}_0}{\partial t} \times \mathbf{b} \right) \cdot \nabla \chi \nonumber \\&- \frac{m_a c^2}{e_a^2} b_\zeta \mathbf{b}\cdot \nabla \left( \frac{\mu |\nabla \chi |^2}{2 B_0} \frac{\partial V^\zeta }{\partial \chi } \right) + {\mathcal {O}}(\delta ^3) , \end{aligned}$$
(296)

and the definitions of the fluxes \((\Gamma _a^\mathrm{ncl})^s\), \((\Gamma _a^H)^s\), and \((\Gamma _a^{(E)})^s\) are written as

$$\begin{aligned} (\Gamma _a^\mathrm{ncl})^s\equiv & \frac{1}{\chi '} \left\langle \int d^3 v \, g_a W_{a1} \right\rangle , \nonumber \\ (\Gamma _a^H)^s\equiv & (L^H)_{1V}^a X_V , \nonumber \\ (\Gamma _a^{(E)})^s\equiv & c \left\langle n_a \frac{\mathbf{E}^{(A)} \times \mathbf{b} }{B_0} \cdot \nabla s\right\rangle - \frac{c I}{\chi '} \left\langle \frac{E_\parallel ^{(A)}}{B_0} \left( n_a - \langle n_a \rangle \frac{B_0^2}{\langle B_0^2 \rangle } \right) \right\rangle , \end{aligned}$$
(297)

where \(\chi ' \equiv \partial \chi (s,t)/\partial s\) and

$$\begin{aligned} (L^H)_{1V}^a \equiv \frac{m_a c^2 I T_a}{2 e_a^2 \chi '} \left\langle \frac{n_a}{B_0^2} \mathbf{b}\cdot \nabla \left( \frac{|\nabla \chi |^2}{B_0} \right) \right\rangle . \end{aligned}$$
(298)

We find that Eqs. (294) and (295) agree with the results from the neoclassical and gyrokinetic theories based on the recursive formulations in Sugama and Horton (1997a, b, 1998). As shown in Eqs. (297) and (295), the neoclassical and turbulent particle fluxes, denoted by \((\Gamma _a^\mathrm{ncl})^s\) and \((\Gamma _a^\mathrm{A})^s\), are evaluated from the solutions \(g_a\) and \(\hat{h}_a\) of the first-order drift kinetic and gyrokinetic equations shown in Eqs. (267) and (277), respectively. We see that the inductive electric field \(\mathbf{E}^{(A)} \equiv - c^{-1}\partial \mathbf{A}_0/\partial t\) produces the radial particle flux \((\Gamma _a^{(E)})^s\). We also note that, for axisymmetric systems with up–down symmetry, \((L^H)_{1V}^a = 0\) and accordingly \((\Gamma _a^H)^s = 0\) (Sugama and Horton 1997a). The radial classical particle flux which appears on the left-hand side of Eq. (290) is given by

$$\begin{aligned} (\Gamma _a^\mathrm{C})^s \equiv \left\langle \left\langle {\varvec{\Gamma }}_a^\mathrm{C} \cdot \nabla s \right\rangle \right\rangle = \left\langle \frac{c}{e_a B_0} ( \mathbf{F}_{a1} \times \mathbf{b} ) \cdot \nabla s \right\rangle + {\mathcal {O}}(\delta ^3) , \end{aligned}$$
(299)

where \({\varvec{\Gamma }}_a^\mathrm{C} \equiv \sum _b {\varvec{\Gamma }}_{ab}^\mathrm{C}\) is defined by Eq. (171) and \(\mathbf{F}_{a1} \equiv \int d^3 v \, m_a {\mathbf{v}} \, C_a\) is the collisional friction force.

In the same manner as in deriving Eq. (197) from Eq. (196), the ensemble-averaged particle transport equation can be obtained from Eq. (290) as

$$\begin{aligned} \frac{\partial }{\partial t} \left( V' n_{a0} \right) + \frac{\partial }{\partial s} \left( V' \left[ (\Gamma _a)^s - n_{a0} \langle \mathbf{u}_s \cdot \nabla s \rangle \right] \right) = V' \left\langle \int {\text {d}}^3 v^\mathrm{(gc)} \mathcal{S}_a \right\rangle , \end{aligned}$$
(300)

where the total radial particle flux \((\Gamma _a)^s\) is given by

$$\begin{aligned} (\Gamma _a)^s& = (\Gamma _a^\mathrm{(gc)})^s + (\Gamma _a^\mathrm{C})^s \nonumber \\&= (\Gamma _a^\mathrm{ncl})^s + (\Gamma _a^H)^s + (\Gamma _a^{(E)})^s + (\Gamma _a^\mathrm{A})^s + (\Gamma _a^\mathrm{C})^s . \end{aligned}$$
(301)

Equations (300) and (301) agree with those derived in the previous works using the recursive formulations [see Eqs. (55) and (58) in Sugama and Horton (1998) or Eqs. (166) and (170) in Abel et al. (2013)].

5.5 Ensemble-averaged energy balance equation

The energy density \(\mathcal{E}^*\) in Eq. (208) and the radial component of the energy flux \(\mathbf{Q}\) in Eq. (209) are double-averaged over the flux surface and the ensemble to give

$$\begin{aligned} \langle \langle \mathcal{E}^* \rangle \rangle = \left\langle \frac{1}{2}\sum _a n_{a0} m_a V_0^2 + \frac{3}{2}\sum _a n_{a0} T_{a0} + \frac{1}{8\pi } (|\nabla \Phi _0|^2 + B_0^2) \right\rangle + {\mathcal {O}}(\delta ) , \end{aligned}$$
(302)

and

$$\begin{aligned} \langle \langle \mathbf{Q} \cdot \nabla s \rangle \rangle&= \sum _a \left[ (q_a)^s + \frac{5}{2} T_{a0} (\Gamma _a)^s + V^\zeta ( \Pi _a )^s \right] + \left\langle \mathbf{S}^\mathrm{(Poynting)} \cdot \nabla s \right\rangle \nonumber \\& \qquad - \, \frac{V^\zeta }{4\pi } \left\langle \left\langle \nabla s \cdot \left( {\hat{\mathbf{E}}}_L {\hat{\mathbf{E}}}_L + {\hat{\mathbf{E}}}_L {\hat{\mathbf{E}}}_T + {\hat{\mathbf{E}}}_T {\hat{\mathbf{E}}}_L + {\hat{\mathbf{B}}} {\hat{\mathbf{B}}} \right. \right. \right. \nonumber \\& \qquad \left. \left. \left. + \, \frac{4\pi }{c} {\hat{\mathbf{j}}} {\hat{\mathbf{A}}} \right) \cdot \mathbf{e}_\zeta \right\rangle \right\rangle + {\mathcal {O}}(\delta ^3) , \end{aligned}$$
(303)

respectively, where \(\mathbf{S}^\mathrm{(Poynting)} \equiv (c/4\pi ) \langle \mathbf{E} \rangle _\mathrm{ens} \times \mathbf{B}_0\) represents the nonturbulent part of the Poynting vector, \({\hat{\mathbf{E}}}_L \equiv - \nabla {\hat{\phi }}\) (\({\hat{\mathbf{E}}}_T \equiv - c^{-1} \partial {\hat{\mathbf{A}}}/\partial t\)) is the longitudinal (transverse) part of the turbulent electric field, the radial particle flux \((\Gamma _a)^s\) is given by Eq. (301), the radial flux \(( \Pi _a )^s\) of the toroidal angular momentum is defined later in Eq. (316), and the radial heat flux \((q_a)^s\) is written as

$$\begin{aligned} (q_a)^s\equiv & {} (q_a^\mathrm{(gc)})^s + (q_a^\mathrm{C})^s \nonumber \\\equiv & {} (q_a^\mathrm{ncl})^s + (q_a^H)^s + (q_a^{(E)})^s + (q_a^\mathrm{A})^s + (q_a^\mathrm{C})^s . \end{aligned}$$
(304)

Here, \((q_a^C)^s\), \((q_a^\mathrm{ncl})^s\), \((q_a^H)^s\), \((q_a^{(E)})^s\), and \((q_a^\mathrm{A})^s\) are defined by

$$\begin{aligned} (q_a^C)^s\equiv \, & {} - \frac{m_a c}{e_a \chi '} \left\langle \int {\text {d}}^3 v \, C_a \left( \epsilon - \frac{5}{2} T_a \right) {\mathbf{v}}'_\perp \cdot \mathbf{e}_\zeta \right\rangle , \nonumber \\ (q_a^\mathrm{ncl})^s\equiv \, & {} \frac{T_a}{\chi '} \left\langle \int {\text {d}}^3 v \, g_a W_{a2} \right\rangle , \nonumber \\ (q_a^H)^s\equiv \, & {} T_a (L^H)_{2V}^a X_V , \nonumber \\ (q_a^{(E)})^s\equiv \, & {} c \left\langle n_a \Xi _a \frac{\mathbf{E}^{(A)} \times \mathbf{b} }{B_0} \cdot \nabla s \right\rangle - \frac{c I}{\chi '} \left\langle \frac{E_\parallel ^{(A)}}{B_0} \left( n_a \Xi _a - \langle n_a \Xi _a \rangle \frac{B_0^2}{\langle B_0^2 \rangle } \right) \right\rangle , \nonumber \\ (q_a^A)^s\equiv \, & {} - \left\langle \left\langle \frac{c}{B_0} \int d^3 v^\mathrm{(gc)} \hat{h}_a \left( \epsilon - \frac{5}{2} T_a \right) ( \nabla {\hat{\psi }}_a \times \mathbf{b} ) \cdot \nabla s \right\rangle \right\rangle , \end{aligned}$$
(305)

where

$$\begin{aligned} (L^H)_{2V}^a \equiv \frac{m_a c^2 I T_a}{2 e_a^2 \chi '} \left\langle \left( 1 + \frac{\Xi _a}{T_a} \right) \frac{n_a}{B_0^2} \mathbf{b}\cdot \nabla \left( \frac{|\nabla \chi |^2}{B_0} \right) \right\rangle . \end{aligned}$$
(306)

The energy source term on the right-hand side of Eq. (207) is written as

$$\begin{aligned} \int {\text {d}}^3 v^\mathrm{(gc)} \, \mathcal{S}_a ( H_a - e_a \Phi _0 ) = \int {\text {d}}^3 v^\mathrm{(gc)} \, \mathcal{S}_a \left( \frac{1}{2} m |U \mathbf{b} + {\mathbf{V}}_0|^2 + \mu B_0 \right) + {\mathcal {O}}(\delta ^3) . \end{aligned}$$
(307)

We see from Eq. (302) that the primary components of \(\mathcal{E}^*\) are the kinetic, thermal, and electromagnetic energies. The electric energy in \(\mathcal{E}^*\) is normally neglected because the ratio of \(|\nabla \Phi _0|^2 / ( 8\pi )\) to \(\frac{1}{2}\sum _a n_{a0} m_a V_0^2\) is given by \((v_{PA}/ c)^2\), where

$$\begin{aligned} v_{PA} \equiv \frac{B_P}{ (4 \pi \rho _m)^{1/2}} \end{aligned}$$
(308)

represents the poloidal Alfvén velocity, \(B_P \equiv |\nabla \chi | / R\) is the poloidal magnetic field, and \(\rho _m \equiv \sum _a n_{a0} m_a \equiv \sum _a m_a \int d^3 v \, f_{a0}\) is the equilibrium mass density. In deriving Eqs. (303)–(305), Eqs. (260)–(263) and (265) are used to evaluate the ensemble-averaged part of the energy fluxes defined in Eqs. (209)–(212). In Eq. (305), \(( q_a^C )^s\), \(( q_a^\mathrm{ncl} )^s\), and \(( q_a^A )^s\) represent the classical, neoclassical, and turbulent heat transport fluxes, respectively. The inductive electric field \(\mathbf{E}^{(A)} \equiv - c^{-1}\partial \mathbf{A}_0/\partial t\) produces the heat flux \(( q_a^{(E)} )^s\). For axisymmetric systems with up–down symmetry, \((L^H)_{2V}^a\) defined by Eq. (306) vanishes in the same way as \((L^H)_{1V}^a\) in Eq. (298) and accordingly \((q_a^H)^s = 0\) (Sugama and Horton 1997a).
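
The estimate of the electric energy can be sketched as follows. We assume here the standard axisymmetric relations \({\mathbf{V}}_0 = V^\zeta R^2 \nabla \zeta\) with \(|\nabla \zeta | = 1/R\), \(\Phi _0 = \Phi _0 (\chi )\), and \(V^\zeta = - c \, \partial \Phi _0 / \partial \chi\); the last relation is consistent with the definition of \(X_V\) in Eq. (268), but these forms are stated as assumptions because their defining equations lie outside this section. Then \(V_0^2 = c^2 R^2 ( \partial \Phi _0 / \partial \chi )^2\) and \(|\nabla \Phi _0|^2 = R^2 B_P^2 ( \partial \Phi _0 / \partial \chi )^2\), so that locally

$$\begin{aligned} \frac{|\nabla \Phi _0|^2 / ( 8\pi )}{\frac{1}{2} \rho _m V_0^2} = \frac{R^2 B_P^2 ( \partial \Phi _0 / \partial \chi )^2}{4 \pi \rho _m c^2 R^2 ( \partial \Phi _0 / \partial \chi )^2} = \frac{B_P^2}{4 \pi \rho _m c^2} = \left( \frac{v_{PA}}{c} \right) ^2 . \end{aligned}$$

The electric energy is therefore negligible whenever the poloidal Alfvén velocity is nonrelativistic.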

The radial flux of the Poynting vector, \(\langle \mathbf{S}^\mathrm{(Poynting)} \cdot \nabla s \rangle = - (4\pi )^{-1} \langle [ ( \partial \mathbf{A}_0 /\partial t ) \times \mathbf{B}_0 ] \cdot \nabla s \rangle\), which appears on the right-hand side of Eq. (303), is originally included in the second-to-last line of Eq. (210). Also, as shown below, the non-diagonal components of the Maxwell stresses \({\hat{\mathbf{E}}}_L {\hat{\mathbf{E}}}_L\), \({\hat{\mathbf{E}}}_L {\hat{\mathbf{E}}}_T\), \({\hat{\mathbf{E}}}_T {\hat{\mathbf{E}}}_L\), \({\hat{\mathbf{B}}} {\hat{\mathbf{B}}}\), and \({\hat{\mathbf{j}}} {\hat{\mathbf{A}}}\) in Eq. (303) originate from the terms \((4\pi )^{-1} (\partial \phi _1 / \partial t ) \nabla \phi _1\), \(- (4\pi )^{-1} (\partial \mathbf{A}_1 / \partial t ) \times \mathbf{B}_1\), and \((4\pi c)^{-1} \lambda \partial \mathbf{A}_1 / \partial t\) in the last two lines of Eq. (210). Using Eqs. (240), (283), and (284) and neglecting small terms of \({\mathcal {O}}(\delta ^3)\), we can write

$$\begin{aligned}&\frac{1}{4\pi } \left\langle \left\langle \frac{\partial \phi _1}{ \partial t } \nabla \phi _1 \cdot \nabla s \right\rangle \right\rangle \simeq - \frac{V^\zeta }{4\pi } \left\langle \left\langle \frac{\partial {\hat{\phi }}}{ \partial \zeta } \nabla {\hat{\phi }} \cdot \nabla s \right\rangle \right\rangle = - \frac{V^\zeta }{4\pi } \left\langle \left\langle \nabla s \cdot ( {\hat{\mathbf{E}}}_L {\hat{\mathbf{E}}}_L ) \cdot \mathbf{e}_\zeta \right\rangle \right\rangle , \nonumber \\&- \frac{1}{4\pi } \left\langle \left\langle \left( \frac{\partial \mathbf{A}_1}{ \partial t } \times \mathbf{B}_1 \right) \cdot \nabla s \right\rangle \right\rangle \simeq \frac{V^\zeta }{4\pi } \left\langle \left\langle \left( \frac{\partial {\hat{\mathbf{A}}}}{ \partial \zeta } \times {\hat{\mathbf{B}}} \right) \cdot \nabla s \right\rangle \right\rangle \simeq \frac{V^\zeta }{4\pi } \left\langle \left\langle \left( \frac{\partial {\hat{\mathbf{B}}}}{ \partial \zeta } \times {\hat{\mathbf{A}}} \right) \cdot \nabla s \right\rangle \right\rangle \nonumber \\&= \frac{V^\zeta }{4\pi } \left\langle \left\langle \mathbf{e}_\zeta \cdot \nabla {\hat{\mathbf{B}}} \cdot ( {\hat{\mathbf{A}}} \times \nabla s ) \right\rangle \right\rangle = \frac{V^\zeta }{4\pi } \left\langle \left\langle ( {\hat{\mathbf{A}}} \times \nabla s ) \cdot \nabla {\hat{\mathbf{B}}} \cdot \mathbf{e}_\zeta + [ \mathbf{e}_\zeta \times ( {\hat{\mathbf{A}}} \times \nabla s ) ] \cdot ( \nabla \times {\hat{\mathbf{B}}} ) \right\rangle \right\rangle \nonumber \\&\simeq \frac{V^\zeta }{4\pi } \left\langle \left\langle \nabla \cdot [ ({\hat{\mathbf{B}}} \cdot \mathbf{e}_\zeta ) ( {\hat{\mathbf{A}}} \times \nabla s )] - ({\hat{\mathbf{B}}} \cdot \mathbf{e}_\zeta ) \nabla \cdot ( {\hat{\mathbf{A}}} \times \nabla s ) - ({\hat{\mathbf{A}}} \cdot \mathbf{e}_\zeta ) \nabla s \cdot ( \nabla \times {\hat{\mathbf{B}}} ) \right\rangle \right\rangle \nonumber \\&\simeq - \frac{V^\zeta }{4\pi } \left\langle \left\langle \nabla s \cdot [ {\hat{\mathbf{B}}} {\hat{\mathbf{B}}} + ( \nabla \times {\hat{\mathbf{B}}}) {\hat{\mathbf{A}}} ] \cdot \mathbf{e}_\zeta \right\rangle \right\rangle = - \frac{V^\zeta }{4\pi } \left\langle \left\langle \nabla s \cdot \left( {\hat{\mathbf{B}}} {\hat{\mathbf{B}}} + \frac{4\pi }{c} {\hat{\mathbf{j}}}_T {\hat{\mathbf{A}}} \right) \cdot \mathbf{e}_\zeta \right\rangle \right\rangle , \nonumber \\&- \frac{V^\zeta }{c} \left\langle \left\langle \nabla s \cdot ( {\hat{\mathbf{j}}}_T {\hat{\mathbf{A}}} ) \cdot \mathbf{e}_\zeta \right\rangle \right\rangle = - \frac{V^\zeta }{c} \left\langle \left\langle \nabla s \cdot \left( {\hat{\mathbf{j}}} {\hat{\mathbf{A}}} + \frac{1}{4\pi } \frac{\partial {\hat{\mathbf{E}}}_L}{\partial t} {\hat{\mathbf{A}}} \right) \cdot \mathbf{e}_\zeta \right\rangle \right\rangle \nonumber \\&\simeq - \frac{V^\zeta }{c} \left\langle \left\langle \nabla s \cdot \left( {\hat{\mathbf{j}}} {\hat{\mathbf{A}}} - \frac{1}{4\pi } {\hat{\mathbf{E}}}_L \frac{\partial {\hat{\mathbf{A}}}}{\partial t} \right) \cdot \mathbf{e}_\zeta \right\rangle \right\rangle \simeq - V^\zeta \left\langle \left\langle \nabla s \cdot \left( \frac{1}{c} {\hat{\mathbf{j}}} {\hat{\mathbf{A}}} + \frac{1}{4\pi } {\hat{\mathbf{E}}}_L {\hat{\mathbf{E}}}_T \right) \cdot \mathbf{e}_\zeta \right\rangle \right\rangle , \nonumber \\&\frac{1}{4\pi c} \left\langle \left\langle \lambda \frac{\partial \mathbf{A}_1}{ \partial t } \cdot \nabla s \right\rangle \right\rangle \simeq - \frac{V^\zeta }{4\pi c} \left\langle \left\langle \frac{\partial {\hat{\phi }} }{\partial \zeta } \frac{\partial {\hat{\mathbf{A}}}}{ \partial t} \cdot \nabla s \right\rangle \right\rangle = - \frac{V^\zeta }{4\pi } \left\langle \left\langle \nabla s \cdot ( {\hat{\mathbf{E}}}_T {\hat{\mathbf{E}}}_L ) \cdot \mathbf{e}_\zeta \right\rangle \right\rangle . \end{aligned}$$
(309)

Here, \(c^{-1} {\hat{\mathbf{j}}} {\hat{\mathbf{A}}}\) is regarded as the advection of the electromagnetic momentum [see Eq. (189) in Abel et al. (2013)] and it can also be treated as the difference between two types of definitions of the turbulent momentum flux [see Eq. (60) of Sugama and Horton (1998)]. We see that the Maxwell stress \({\hat{\mathbf{E}}}_T {\hat{\mathbf{E}}}_T\) due to the transverse electric field does not appear in Eq. (303) because the Darwin approximation (Kaufman and Rostler 1971; Sugama et al. 2013) is employed here.

Now, Eq. (207) is rewritten as

$$\begin{aligned}&\frac{\partial }{\partial t} \left( V' \left\langle \frac{1}{2}\sum _a n_{a0} m_a V_0^2 + \frac{3}{2}\sum _a n_{a0} T_{a0} + \frac{1}{8\pi } (|\nabla \Phi _0|^2 + B_0^2) \right\rangle \right) \nonumber \\&+ \frac{\partial }{\partial s} \left( V' \left[ \sum _a \left\{ (q_a)^s + \frac{5}{2} T_{a0} (\Gamma _a)^s + V^\zeta (\Pi _a)^s \right\} + \left\langle \mathbf{S}^\mathrm{(Poynting)} \cdot \nabla s \right\rangle \right. \right. \nonumber \\&- \frac{V^\zeta }{4\pi } \left\langle \left\langle \nabla s \cdot \left( {\hat{\mathbf{E}}}_L {\hat{\mathbf{E}}}_L + {\hat{\mathbf{E}}}_L {\hat{\mathbf{E}}}_T + {\hat{\mathbf{E}}}_T {\hat{\mathbf{E}}}_L + {\hat{\mathbf{B}}} {\hat{\mathbf{B}}} + \frac{4\pi }{c} {\hat{\mathbf{j}}} {\hat{\mathbf{A}}} \right) \cdot \mathbf{e}_\zeta \right\rangle \right\rangle \nonumber \\&\left. \left. - \left\langle \left\{ \frac{1}{2}\sum _a n_{a0} m_a V_0^2 + \frac{3}{2}\sum _a n_{a0} T_{a0} + \frac{1}{8\pi } (|\nabla \Phi _0|^2 + B_0^2) \right\} \mathbf{u}_s \cdot \nabla s \right\rangle \right] \right) \nonumber \\&= V' \sum _a \left\langle \int d^3 v^\mathrm{(gc)} \mathcal{S}_a \left( \frac{1}{2} m_a |U \mathbf{b} + {\mathbf{V}}_0|^2 + \mu B_0 \right) \right\rangle , \end{aligned}$$
(310)

where all terms are of \({\mathcal {O}}(\delta ^2)\) and \({\mathcal {O}}(\delta ^3)\) terms are neglected. Using the relations \(\mathbf{j}_0 = (c/4\pi ) \nabla \times \mathbf{B}_0\) [see Eq. (275)],

$$\begin{aligned} \frac{1}{8\pi } \left\langle \frac{\partial (B_0^2 ) }{\partial t} \right\rangle = - \frac{1}{V'} \frac{\partial }{\partial s} \left( V' \langle \mathbf{S}^\mathrm{(Poynting)} \cdot \nabla s \rangle \right) - \langle \mathbf{j}_0 \cdot \mathbf{E}^{(A)} \rangle , \end{aligned}$$
(311)

and the toroidal momentum balance equation given later in Eq. (315), Eq. (310) is rewritten as

$$\begin{aligned}&\frac{\partial }{\partial t} \left( V' \frac{3}{2}\sum _a \left\langle n_{a0} T_{a0} \right\rangle \right) \nonumber \\&+ \frac{\partial }{\partial s} \left( V' \left[ \sum _a \left\{ (q_a)^s + \frac{5}{2} T_{a0} (\Gamma _a)^s \right\} - \frac{3}{2}\sum _a \left\langle n_{a0} T_{a0} \mathbf{u}_s \cdot \nabla s \right\rangle \right] \right) \nonumber \\& = V' \left\langle \left( \mathbf{j}_0 - \sum _a n_{a0} e_a {\mathbf{V}}_0 \right) \cdot \mathbf{E}^{(A)} \right\rangle - V' \frac{\partial V^\zeta }{\partial s} \left[ \sum _a (\Pi _a)^s - \left\langle \left\langle \nabla s \cdot \left\{ \frac{1}{c} {\hat{\mathbf{j}}} {\hat{\mathbf{A}}} \right. \right. \right. \right. \nonumber \\&\left. \left. \left. \left. + \frac{1}{4\pi } \left( {\hat{\mathbf{E}}}_L {\hat{\mathbf{E}}}_L + {\hat{\mathbf{E}}}_L {\hat{\mathbf{E}}}_T + {\hat{\mathbf{E}}}_T {\hat{\mathbf{E}}}_L + {\hat{\mathbf{B}}} {\hat{\mathbf{B}}} \right) \right\} \cdot \mathbf{e}_\zeta \right\rangle \right\rangle \right] \nonumber \\&+ \frac{1}{2} (V^\zeta )^2 \left[ \frac{\partial }{\partial t} \left( V' \left\langle R^2 \rho _m \right\rangle \right) - \frac{\partial }{\partial s} \left( V' \left\langle R^2 \rho _m \mathbf{u}_s \cdot \nabla s \right\rangle \right) \right] \nonumber \\&+ V' \sum _a \left\langle \int d^3 v^\mathrm{(gc)} \mathcal{S}_a \left( \frac{1}{2} m_a U^2 + \mu B_0 - \frac{1}{2} m_a V_0^2 \right) \right\rangle . \end{aligned}$$
(312)

Equations (310) and (312) represent the total and thermal energy balance equations, respectively, which are consistent with the previously obtained results in the literature (Sugama and Horton 1998; Abel et al. 2013). Note that the quasineutrality condition is used in Sugama and Horton (1998) and Abel et al. (2013), so that \({\hat{\mathbf{E}}}_L\) disappears from the Maxwell stress tensor there.
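
Equation (311) has the form of a flux-surface-averaged Poynting theorem for the background magnetic field. As a sketch only, suppose that \(\mathbf{S}^\mathrm{(Poynting)} \equiv (c/4\pi ) \mathbf{E}^{(A)} \times \mathbf{B}_0\) (an assumption made here, since the definition is not restated in this section) and use Faraday's law \(\partial \mathbf{B}_0 / \partial t = - c \nabla \times \mathbf{E}^{(A)}\), which follows from \(\mathbf{E}^{(A)} \equiv - c^{-1} \partial \mathbf{A}_0 / \partial t\). Then the local balance reads

$$\begin{aligned} \frac{1}{8\pi } \frac{\partial (B_0^2)}{\partial t}&= - \frac{c}{4\pi } \mathbf{B}_0 \cdot ( \nabla \times \mathbf{E}^{(A)} ) = - \frac{c}{4\pi } \left[ \nabla \cdot ( \mathbf{E}^{(A)} \times \mathbf{B}_0 ) + \mathbf{E}^{(A)} \cdot ( \nabla \times \mathbf{B}_0 ) \right] \nonumber \\&= - \nabla \cdot \mathbf{S}^\mathrm{(Poynting)} - \mathbf{j}_0 \cdot \mathbf{E}^{(A)} , \end{aligned}$$

where \(\mathbf{j}_0 = (c/4\pi ) \nabla \times \mathbf{B}_0\) is used in the last step. Taking the flux-surface average and applying the standard identity \(\langle \nabla \cdot \mathbf{F} \rangle = (1/V') \, \partial ( V' \langle \mathbf{F} \cdot \nabla s \rangle ) / \partial s\) for an arbitrary vector field \(\mathbf{F}\) reproduces the form of Eq. (311).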

5.6 Ensemble-averaged toroidal angular momentum balance equation

Taking the ensemble average of the three toroidal momentum flux terms in Eq. (228), we find that

$$\begin{aligned}&\langle \Pi _{\parallel V \zeta }^s + \Pi _{R\zeta }^s + (\Pi ^\mathrm{C*})^s \rangle _\mathrm{ens} \equiv \sum _a ( \Pi _a )^s \nonumber \\\equiv & {} \sum _a \left[ ( \Pi _a^\mathrm{cl} )^s + ( \Pi _a^\mathrm{ncl} )^s + ( \Pi _a^H )^s+ ( \Pi _a^{(E)} )^s + ( \Pi _a^A )^s \right] , \end{aligned}$$
(313)

where \({\mathcal {O}}(\delta ^3)\) terms are neglected and the \({\mathcal {O}}(\delta ^2)\) toroidal momentum fluxes \(( \Pi _a^\mathrm{cl} )^s\), \(( \Pi _a^\mathrm{ncl} )^s\), \(( \Pi _a^H )^s\), \(( \Pi _a^{(E)} )^s\), and \(( \Pi _a^A )^s\) are defined by

$$\begin{aligned} ( \Pi _a^\mathrm{cl} )^s&= - \frac{m_a^2 c}{2 e_a \chi '} \left\langle \int d^3 v \, C_a \widetilde{(v_\zeta )^2} \right\rangle ,\nonumber \\ ( \Pi _a^\mathrm{ncl} )^s&= \frac{1}{\chi '} \left\langle \int d^3 v \, g_a W_{aV} \right\rangle , \nonumber \\ ( \Pi _a^H )^s&= - (L^H)_{1V}^a X_{a1} - (L^H)_{2V}^a X_{a2} , \nonumber \\ ( \Pi _a^{(E)} )^s&= c m_a \left\langle n_a V_\zeta \frac{\mathbf{E}^{(A)} \times \mathbf{b} }{B_0} \cdot \nabla s \right\rangle - \frac{c m_a I}{\chi '} \left\langle \frac{E_\parallel ^{(A)}}{B_0} \left( n_a V_\zeta - \langle n_a V_\zeta \rangle \frac{B_0^2}{\langle B_0^2 \rangle } \right) \right\rangle ,\nonumber \\ ( \Pi _a^A )^s&= \frac{c m_a}{\chi '} \left\langle \left\langle \int d^3 v \, v_\zeta \hat{h}_a \frac{\partial {\hat{\psi }}_a}{\partial \zeta } \right\rangle \right\rangle , \end{aligned}$$
(314)

respectively. To derive Eqs. (314), Eqs. (260)–(263) and (265) are substituted into the definitions of the toroidal momentum fluxes given in Eq. (224). The classical, neoclassical, and turbulent transport fluxes of the toroidal momentum are denoted by \(( \Pi _a^\mathrm{cl} )^s\), \(( \Pi _a^\mathrm{ncl} )^s\), and \(( \Pi _a^A )^s\), respectively, while \(( \Pi _a^{(E)} )^s\) represents the toroidal momentum flux driven by the inductive electric field \(\mathbf{E}^{(A)} \equiv - c^{-1}\partial \mathbf{A}_0/\partial t\). For axisymmetric systems with up–down symmetry, \(( \Pi _a^H )^s\) vanishes just like \(( \Gamma _a^H )^s\) and \(( q_a^H )^s\) (Sugama and Horton 1997a). The toroidal momentum fluxes shown in Eqs. (314) are consistent with the results obtained in previous works based on the recursive formulations (Hinton and Wong 1985; Catto 1987; Sugama and Horton 1997b, 1998; Abel et al. 2013).

The ensemble-averaged toroidal angular momentum balance equation is now derived from Eq. (228) as

$$\begin{aligned}&\frac{\partial }{\partial t} \left( V' \left\langle \rho _m \left( 1 + \frac{v_{PA}^2}{c^2} \right) V_\zeta \right\rangle \right) \nonumber \\&+ \frac{\partial }{\partial s} \left( V' \left[ \sum _a ( \Pi _a )^s - \left\langle \rho _m \left( 1 + \frac{v_{PA}^2}{c^2} \right) V_\zeta ( \mathbf{u}_s \cdot \nabla s ) \right\rangle - \left\langle \left\langle \nabla s \cdot \left\{ \frac{1}{c} {\hat{\mathbf{j}}} {\hat{\mathbf{A}}} \right. \right. \right. \right. \right. \nonumber \\&\left. \left. \left. \left. \left. + \frac{1}{4\pi } \left( \langle \mathbf{E} \rangle _\mathrm{ens} \langle \mathbf{E}\rangle _\mathrm{ens} + {\hat{\mathbf{E}}}_L {\hat{\mathbf{E}}}_L + {\hat{\mathbf{E}}}_L {\hat{\mathbf{E}}}_T + {\hat{\mathbf{E}}}_T {\hat{\mathbf{E}}}_L + {\hat{\mathbf{B}}} {\hat{\mathbf{B}}} \right) \right\} \cdot \mathbf{e}_\zeta \right\rangle \right\rangle \right] \right) \nonumber \\&= V' \sum _a \left\langle \int d^3 v \; \mathcal{S}_a m_a ( U b_\zeta + V_\zeta ) \right\rangle , \end{aligned}$$
(315)

where \(v_{PA}\) is the poloidal Alfvén velocity defined in Eq. (308) and \(\rho _m\) is the mass density defined after Eq. (308). The transport orderings \(\partial / \partial t = {\mathcal {O}} (\delta ^2)\) and \(\mathcal{S}_a = {\mathcal {O}} (\delta ^2)\) are used in Eq. (315), where all terms are of \({\mathcal {O}}(\delta ^2)\) and higher order terms are neglected. Note that the stress term \({\hat{\mathbf{E}}}_T {\hat{\mathbf{E}}}_T\) does not appear in Eq. (315) because, as shown in Sugama et al. (2013), the transverse electric field is neglected using the Darwin approximation in the Lagrangian density \(\mathcal{L}_f\) given by Eq. (62). Furthermore, the stress terms \({\hat{\mathbf{E}}}_L {\hat{\mathbf{E}}}_T\) and \({\hat{\mathbf{E}}}_T {\hat{\mathbf{E}}}_L\) in Eq. (315) are found to be smaller than the other electromagnetic stress terms by the factor \(V_0 / c \ll 1\). Also, as mentioned in Sect. 5.5, \({\hat{\mathbf{E}}}_L\) disappears from the Maxwell stress when the quasineutrality condition is used.

The momentum flux \(( \Pi _a )^s\) including collisional and turbulent effects is written as

$$\begin{aligned} ( \Pi _a )^s = \frac{c m_a}{\chi '} \left\langle \left\langle - n_a V_\zeta E^{(A)}_\zeta - \frac{m_a}{2 e_a} \int d^3 v \, C_a v_\zeta ^2 + \int d^3 v \, v_\zeta \hat{h}_a \frac{\partial {\hat{\psi }}_a}{\partial \zeta } \right\rangle \right\rangle , \end{aligned}$$
(316)

where \(E^{(A)}_\zeta \equiv - c^{-1} (\partial \mathbf{A}_0 / \partial t ) \cdot \mathbf{e}_\zeta\), \(C_a \equiv \sum _b C_{ab}\), and \(v_\zeta \equiv \mathbf{e}_\zeta \cdot ({\mathbf{V}}_0 + {\mathbf{v}}' )\) is the covariant toroidal component of the particle velocity observed from the rest frame. The toroidal momentum balance given by Eqs. (315) and (316), which describes the evolution of the toroidal flow and, accordingly, of the background radial electric field profile, agrees with the result from the recursive method in Sugama and Horton (1998) except that, in Sugama and Horton (1998), the background field \(\mathbf{B}_0\) is assumed to be stationary and \(\mathbf{u}_s\) does not appear. Also, under the quasineutrality condition, Eq. (315) reduces to the toroidal momentum balance equation given in Abel et al. (2013), where the term proportional to \((v_{PA}/c)^2\) and the stress terms due to the electric field are neglected.
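
The correspondence between Eq. (316) and the decomposition in Eqs. (313) and (314) can be partly read off term by term. The following sketch assumes that the tilde in Eqs. (314) denotes the gyrophase-dependent part, so that \(v_\zeta ^2 = \overline{(v_\zeta )^2} + \widetilde{(v_\zeta )^2}\), where the overline is a notation introduced here for illustration only and denotes the gyrophase average:

$$\begin{aligned} \frac{c m_a}{\chi '} \left\langle \left\langle \int d^3 v \, v_\zeta \hat{h}_a \frac{\partial {\hat{\psi }}_a}{\partial \zeta } \right\rangle \right\rangle&= ( \Pi _a^A )^s , \nonumber \\ - \frac{m_a^2 c}{2 e_a \chi '} \left\langle \left\langle \int d^3 v \, C_a v_\zeta ^2 \right\rangle \right\rangle&\simeq ( \Pi _a^\mathrm{cl} )^s - \frac{m_a^2 c}{2 e_a \chi '} \left\langle \left\langle \int d^3 v \, C_a \overline{(v_\zeta )^2} \right\rangle \right\rangle . \end{aligned}$$

The first line is an exact match with the last of Eqs. (314). In the second line, the gyrophase-dependent part is identified with the classical flux to the order retained. Under this reading, the remaining gyrophase-averaged collisional piece and the \(E^{(A)}_\zeta \) term would be expected to account for \(( \Pi _a^\mathrm{ncl} )^s\), \(( \Pi _a^H )^s\), and \(( \Pi _a^{(E)} )^s\); that last identification is not spelled out in this section and is stated here only as an expectation.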

6 Summary

In this paper, the Lagrangian variational principle and the collision operator represented in terms of Poisson brackets are combined to present a new gyrokinetic formulation, from which governing equations of background and turbulent electromagnetic fields and gyrocenter distribution functions are derived for toroidally rotating plasmas. These equations satisfy the particle, energy, and toroidal momentum balance equations which, except for the external source terms, are written in conservative forms suitable for long-time global transport simulations (Wang et al. 2009; Sarazin et al. 2011; Idomura 2014) that follow the evolution of the background density, temperature, and flow profiles. The balance equations contain all classical, neoclassical, and turbulent transport fluxes which, in the scale-separation limit, coincide with those derived from conventional recursive formulations. In particular, in the present high-flow case, the background radial electric field can be determined from the toroidal momentum balance equation of the second order, in contrast with the low-flow axisymmetric case, where higher order accuracy is required to determine the radial electric field.