1 Introduction

In 1942, Hannes Alfvén discovered (Alfvén 1942) that if a perfectly conducting medium (e.g., a fully ionized gas; i.e., a plasma) is immersed in a finite background magnetic field \(\varvec{B}_0\), electromagnetic waves can propagate within it. The reason is that, while the free-electron mobility remains extremely large along \(\varvec{B}_0\), the mobility perpendicular to \(\varvec{B}_0\) is inhibited. That is, a magnetized plasma is an anisotropic conducting medium with an extremely large parallel (to \(\varvec{B}_0\)) conductivity and a finite perpendicular (to \(\varvec{B}_0\)) conductivity. Correspondingly, these propagating electromagnetic waves, called Alfvén waves, have nearly vanishing parallel (to \(\varvec{B}_0\)) and finite perpendicular (to \(\varvec{B}_0\)) electric fields.

There are two types of Alfvén waves: the compressional (fast) and the shear Alfvén waves (Jackson 1998). The compressional Alfvén wave (CAW) compresses the magnetic field as well as the plasma, and its group velocity is almost isotropic. The shear Alfvén wave (SAW), meanwhile, is nearly incompressible and, thus, more readily excitable by either external perturbations (e.g., solar wind, antenna) or intrinsic collective instabilities (Chen and Zonca 2016). This brief review focuses on the SAW or, more specifically, its kinetic extension, i.e., the kinetic Alfvén wave (KAW).

In a uniform plasma immersed in a uniform background magnetic field, \(\varvec{B}_0 = B_0 {\hat{\varvec{z}}}\), and adopting the ideal magnetohydrodynamic (MHD) fluid description, it is well known that the SAW satisfies the following linear dispersion relation:

$$\begin{aligned} \omega ^2 = k_\parallel ^2 v_{\text {A}}^2. \end{aligned}$$

Here, \(\omega \) and \(\varvec{k} = \varvec{k}_\perp + k_\parallel \varvec{b}_0\) are, respectively, the wave angular frequency and wave-vector, \(\varvec{b}_0 = \varvec{B}_0/B_0\), \(\varvec{k}_\perp \) is the perpendicular (to \(\varvec{B}_0\)) component of \(\varvec{k}\), \(k_\parallel = \varvec{k} \cdot \varvec{b}_0\), \(v_{\text {A}} = B_0/(4\pi \varrho _{\text {m}})^{1/2}\) is the Alfvén speed with \(\varrho _{\text {m}} \simeq n_0 m_i\) being the mass density, and \(m_i \gg m_e\). The corresponding wave polarization is:

$$\begin{aligned} \left| \frac{c \delta \varvec{E}_\perp }{\delta \varvec{B}_\perp } \right| = v_{\text {A}}, \end{aligned}$$

with \(\delta \varvec{E}_\perp \) and \(\delta \varvec{B}_\perp \) denoting, respectively, the fluctuating components of the electric and magnetic fields perpendicular to \(\varvec{B}_0\). Equation (1) indicates that the SAW is an anisotropic electromagnetic wave; i.e., while its phase velocity can point in any direction, its group velocity, \(\varvec{v}_{\text {g}} = v_{\text {A}} \varvec{b}_0\), is directed only along \(\varvec{B}_0\). This property has a direct bearing on Alfvén wave resonant absorption (Gajevski and Winterberg 1965; Pridmore-Brown 1966; Grad 1969; Chen and Hasegawa 1974a, b).
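As a quick numerical illustration of Eqs. (1) and (2), the following Python sketch evaluates the Alfvén speed in Gaussian units and the corresponding SAW frequency, which depends on \(k_\parallel \) only. The parameter values (a hydrogen plasma with \(B_0 = 10^4\) G and \(n_0 = 10^{13}\) cm\(^{-3}\)) and all function names are illustrative assumptions, not values taken from this review.

```python
import math

PROTON_MASS_G = 1.6726e-24  # proton mass in grams (Gaussian units)
C_LIGHT = 2.99792458e10     # speed of light in cm/s


def alfven_speed(b0_gauss, n0_cm3, ion_mass_g=PROTON_MASS_G):
    """Alfven speed v_A = B0 / sqrt(4 pi n0 m_i), in cm/s."""
    return b0_gauss / math.sqrt(4.0 * math.pi * n0_cm3 * ion_mass_g)


def saw_frequency(k_par, k_perp, v_a):
    """Ideal-MHD SAW frequency from Eq. (1); k_perp is accepted but unused."""
    return abs(k_par) * v_a


# Assumed illustrative parameters: B0 = 1e4 G, n0 = 1e13 cm^-3, protons.
v_a = alfven_speed(1.0e4, 1.0e13)
k_par = 1.0e-3  # cm^-1, assumed

# The frequency does not depend on k_perp, so d(omega)/d(k_perp) = 0 and the
# group velocity is directed along B0 with magnitude v_A.
omega_small_kperp = saw_frequency(k_par, 0.1, v_a)
omega_large_kperp = saw_frequency(k_par, 10.0, v_a)

# Eq. (2): |c dE_perp / dB_perp| = v_A, i.e. |dE_perp / dB_perp| = v_A / c.
e_over_b = v_a / C_LIGHT

print(f"v_A = {v_a:.3e} cm/s, |dE_perp/dB_perp| = {e_over_b:.3e}")
```

Because `saw_frequency` ignores its `k_perp` argument, the two evaluated frequencies coincide, which is the numerical counterpart of the statement that the SAW group velocity has no perpendicular component.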

In a non-uniform plasma, SAW attains the interesting property of a continuous spectrum. To illustrate this feature, let us consider the simplified slab model of a cold plasma with a non-uniform density, \(\varrho _m=\varrho _m(x)\), and a uniform \(\varvec{B}_0 = B_0 {\hat{\varvec{z}}}\). Assuming at \(t=0\) a localized initial perturbation \(\delta B_y (x,t=0) = \delta {\hat{B}}_y (x,0) = \exp (-x^2/\varDelta _x^2)\), \(|k_y \varDelta _x| \ll 1\), and \(\partial \delta B_y/\partial t = 0\), the perturbation then evolves according to the following wave equation:

$$\begin{aligned} \left[ \partial _t^2 + \omega _{\text {A}}^2 (x) \right] \delta B_y (x,t) = 0. \end{aligned}$$

Here, \(\omega _{\text {A}}^2 (x) = k_z^2 v_{\text {A}}^2(x)\) and the solution is:

$$\begin{aligned} \delta B_y (x,t) = \delta {\hat{B}}_y (x,0) \cos \left[ \omega _{\text {A}}(x) t \right]. \end{aligned}$$
Fig. 1

Three-component dynamic power spectrum of magnetic field data from the AMPTE CCE satellite [from the original figure in Refs. Engebretson (2011), Engebretson et al. (1987)]. The geomagnetic components \(B_{\text {R}}\), radially outward from the center of the Earth; \(B_{\text {E}}\), magnetically eastward; and \(B_{\text {N}}\), approximately along local magnetic field lines, correspond to, respectively, \(\delta B_x, \delta B_y, \,\text {and}\,\delta B_z\). L, MLT, and MLAT correspond, respectively, to the equatorial distance of the magnetic field line (in units of the Earth radius), magnetic local time, and magnetic latitude.

Equation (4) shows that every point in x oscillates at a different frequency, \(\omega _{\text {A}}(x)\). With a continuously varying \(\omega _{\text {A}}(x)\), the wave frequency, thus, constitutes a continuous spectrum. While the above result is based on a model with a one-dimensional non-uniformity in x, this general feature of the SAW continuous spectrum also holds in magnetized plasmas with two- or three-dimensional non-uniformities (Grad 1969; Chen and Cowley 1989; Schulze-Berge et al. 1991; Chen and Zonca 2016). A good example is geomagnetic pulsations in the Earth’s magnetosphere. Figure 1 shows the oscillations in the Earth’s magnetic field as observed by the satellite AMPTE CCE (Engebretson 2011; Engebretson et al. 1987), illustrating the three-component dynamic power spectrum of magnetic field data from a full orbit from 02:30 to 17:30 UT on March 6, 1987. Apogee is at the center of the figure. As the satellite moved outward from the morning side, \(\omega _{\text {A}}\) should decrease due to the decreasing \(|\varvec{B}_0|\) and \(|k_\parallel |\) (increasing field-line length), and this was clearly exhibited in the wave frequency of \(B_{\text {E}}\), the azimuthal (East–West) component of \(\delta \varvec{B}\) (i.e., the effective \(\delta B_y\)). \(B_{\text {E}}\) also shows that the wave frequency increased as the satellite moved inward toward the dusk side, consistent, again, with \(\omega _{\text {A}}\). Furthermore, the observed wave frequency consisted of several bands, which could be understood as harmonics of standing waves along the field line, i.e., different \(|k_\parallel |\).

\(\delta B_y (x,t)\) given by Eq. (4) also indicates a unique and important property of the SAW continuous spectrum; i.e., the spatial structure evolves with time. Specifically, the wave-number in the non-uniformity direction is, time asymptotically, given by:

$$\begin{aligned} |k_x|= & \left| \partial \ln \delta B_y/\partial x\right| \nonumber \\\simeq & | {\text {d}} \omega _{\text {A}}(x)/{\text {d}} x | t \equiv |\omega _{\text {A}}'| t. \end{aligned}$$

That \(|k_x|\) increases with t is significant, since it implies that any initially long-scale perturbation will evolve into short scales. This point is illustrated in Fig. 2, which shows the evolution of a smooth \(\delta B_y\) at \(t=0\) into a spatially fast-varying \(\delta B_y\) at a later t (Qiu 2010; Qiu et al. 2011).

Fig. 2

Snapshots of \(\delta B_y(x,t)\) spatial structure vs. x at different times, illustrating the formation of shorter scales at later times [from original figure in Refs. Qiu (2010), Qiu et al. (2011)]
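The growth of \(|k_x|\) predicted by Eq. (5) can be checked directly from the solution in Eq. (4). The Python sketch below is a minimal illustration, assuming a linear profile \(\omega _{\text {A}}(x) = \omega _0 (1 + x/L)\) with normalized \(\omega _0 = L = \varDelta _x = 1\); these choices, and the function names, are hypothetical and are not the parameters used for Fig. 2. The local wave-number is estimated by counting sign changes of \(\delta B_y\) across the interval.

```python
import math


def delta_by(x, t, omega0=1.0, length=1.0, width=1.0):
    """Eq. (4) for the assumed linear profile omega_A(x) = omega0 * (1 + x / length)."""
    omega_a = omega0 * (1.0 + x / length)
    return math.exp(-(x / width) ** 2) * math.cos(omega_a * t)


def estimate_kx(t, x_min=-1.0, x_max=1.0, n_points=20001, **profile):
    """Estimate |k_x| from the number of sign changes of delta_by on [x_min, x_max]."""
    dx = (x_max - x_min) / (n_points - 1)
    crossings = 0
    previous = delta_by(x_min, t, **profile)
    for i in range(1, n_points):
        current = delta_by(x_min + i * dx, t, **profile)
        if previous * current < 0.0:
            crossings += 1
        if current != 0.0:
            previous = current
    # Successive zeros of a sinusoid are separated by pi / k_x.
    return math.pi * crossings / (x_max - x_min)


for t in (10.0, 50.0, 100.0):
    # For this profile |omega_A'| = omega0 / length = 1, so Eq. (5) predicts k_x ~ t.
    print(f"t = {t:6.1f}  estimated k_x = {estimate_kx(t):7.2f}  Eq. (5): {t:7.2f}")
```

The estimate approaches \(|\omega _{\text {A}}'| t\) as t increases, which is the time-asymptotic sense in which Eq. (5) holds.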

Another consequence of \(|k_x|\) increasing with t is the temporal decay of \(\delta B_x\). From \({\varvec{\nabla }} \cdot \delta \varvec{B} \simeq {\varvec{\nabla }}_\perp \cdot \delta \varvec{B}_\perp = 0\), we can readily derive that, for \(|\omega _{\text {A}}' t| \gg |k_y|\):

$$\begin{aligned} \delta B_x (x,t) \simeq \frac{k_y}{\omega _{\text {A}}' (x) t} \delta {\hat{B}}_y (x,0) e^{ - i \omega _{\text {A}}(x) t } \left[ 1 + {{{\mathcal {O}}}} \left( \frac{k_y}{|\omega _{\text {A}}' t|} \right) + \cdots \right] . \end{aligned}$$

That is, \(\delta B_x\) decays temporally due to the phase mixing of increasingly more rapidly varying neighboring perturbations. This property also explains why, in Fig. 1, the radial component of \(\delta \varvec{B}\), \(B_\text {R}\), is much weaker than \(B_\text {E}\). We remark that, while we focus in this paper on the formation of SAW continuous spectrum due to non-uniform \(\omega _{\text {A}}\), similar effects can be expected due to flow shear, which often exists in laboratory and astrophysical plasmas (Kim 2007; Maiorano et al. 2020).

Since \(|k_x| \rightarrow \infty \) as \(t \rightarrow \infty \), the perturbation is expected to develop singular structures toward the steady state. Indeed, taking \(\partial _t = - i \omega \), the SAW governing wave equation for the cold plasma becomes (Chen and Hasegawa 1974a, b):

$$\begin{aligned} \left\{ \frac{\text {d}}{\text {d}x}\left[ \omega ^2 - \omega _{\text {A}}^2 (x) \right] \frac{\text {d}}{\text {d}x} - k_y^2 \left[ \omega ^2 - \omega _{\text {A}}^2 (x) \right] \right\} \delta B_x (x) = 0. \end{aligned}$$

\(\delta B_x\), thus, exhibits a logarithmic singularity at the Alfvén resonant point (layer), \(x_0\), where \(\omega ^2 = \omega _{\text {A}}^2(x_0)\), along with a finite resonant wave-energy absorption rate. Note that at the isolated extrema of the SAW continuum, \(|\omega _{\text {A}}^\prime | = 0\), phase mixing vanishes; consequently, the perturbation remains regular and experiences no damping via resonant absorption. This feature has important implications for Alfvén instabilities in laboratory plasmas (Chen and Zonca 2016).

That the solution exhibits singularities naturally suggests that the microscopic length-scale physics neglected in the ideal MHD fluid description should be included in the long-time-scale dynamics of SAWs. For low-frequency SAWs, one can readily recognize that the relevant perpendicular (to \(\varvec{B}_0\)) microscopic scales are the ion Larmor radius, \(\rho _{\text {i}} = v_{\text {ti}}/\varOmega _{\text {i}}\), with \(v_{{\text {ti}}}\) and \(\varOmega _{\text {i}}\) being, respectively, the ion thermal speed and ion cyclotron frequency, and/or \(\rho _{\text {s}} = c_{\text {S}}/\varOmega _{\text {i}}\), with \(c_{\text {S}}^2 = T_{\text {e}}/m_{\text {i}}\) and \(T_{\text {e}}\) being the electron temperature. Including the effects of finite \(\rho _{\text {i}}\) and/or \(\rho _{\text {s}}\) in the SAW dynamics then led to the discovery of the so-called kinetic Alfvén wave (KAW) (Hasegawa and Chen 1975, 1976).

The pioneering discovery of the KAW was made before the introduction of linear electromagnetic gyrokinetic theory (Catto 1978; Antonsen and Lane 1980) and, later, nonlinear electromagnetic gyrokinetic theory (Frieman and Chen 1982). The analyses employed, therefore, involved taking the low-frequency (\(|\omega |\ll |\varOmega _{\text {i}}|\)) limit of the Vlasov dynamics. This makes theoretical analysis of KAW dynamics in non-uniform plasmas with realistic \(\varvec{B}_0 (\varvec{x})\) intractable, especially when dealing with the nonlinear physics. Indeed, previous nonlinear analyses adopted either the drift-kinetic or the two-fluid description (Mikhailovskii et al. 2007; Onishchenko et al. 2004a, b; Pokhotelov et al. 2003, 2004; Zhao et al. 2011). As our later discussions will show, such approximations not only are inadequate for treating realistic plasma regimes but also often leave out important physics. The above discussions have, thus, motivated us (Chen and Zonca 2011, 2013; Zonca et al. 2015) to revisit and explore further the KAW physics employing the powerful gyrokinetic theories.

Section 2 presents a brief review of the linear gyrokinetic theory (cf. Sect. 2.1) and its applications to KAW (cf. Sects. 2.2 and 2.3) along with KAW observations by satellites (cf. Sect. 2.4). The nonlinear gyrokinetic theory is then briefly reviewed in Sect. 3.1. It is then applied to examine the physics of the three-wave parametric decay instabilities, the modulational instabilities associated with the spontaneous generation of convective cells, and the quasi-linear phase-space transport induced by KAW (cf. Sects. 3.2, 3.3, and 3.4). Results from corresponding numerical simulations are also presented. Final conclusions and discussions are given in Sect. 4.

2 Linear KAW physics

Here, we first introduce the foundation of the linear gyrokinetic formalism in Sect. 2.1. Linear KAW properties are then derived in Sect. 2.2 for uniform plasmas. Section 2.3 contains brief discussions of KAW in non-uniform plasmas, including the resonant mode conversion process. Observational evidence of KAWs from satellites is presented in Sect. 2.4.

2.1 Linear gyrokinetic theory

In magnetically confined plasmas, there exists a natural smallness parameter, \(\epsilon = \rho /a\) with \(\rho \) and a being, respectively, the charged particle’s Larmor radius and the macroscopic system scale length. Typically, we have \(\epsilon \lesssim {{\mathcal {O}}}(10^{-2}) \ll 1\). Since low-frequency but short-wavelength fluctuations are of interest here, one, thus, adopts the following linear gyrokinetic orderings (Catto 1978; Antonsen and Lane 1980; Frieman and Chen 1982; Sugama 2000; Brizard and Hahm 2007; Sugama 2017):

$$\begin{aligned} \left| \frac{\omega }{\varOmega _i} \right| \sim {{\mathcal {O}}}(\epsilon ), \quad \left| k_\perp \rho _i \right| \sim {{\mathcal {O}}}(1), \end{aligned}$$

and to include Landau resonance:

$$\begin{aligned} k_\parallel v_\parallel \sim \omega , \quad \mathrm{or} \quad \left| \frac{k_\parallel }{k_\perp } \right| \sim \left| \frac{\omega }{\varOmega _{\text {i}}} \right| \sim {{\mathcal {O}}}(\epsilon ). \end{aligned}$$

Noting, furthermore, for \(\left| k_\perp \rho _{\text {i}} \right| \sim {{\mathcal {O}}}(1)\) and \(\beta _{\text {i}} \lesssim {{\mathcal {O}}} (1)\), with \(\beta _{\text {i}} = 8\pi P_{0i}/B_0^2\) the ratio of plasma ion pressure to the background magnetic field energy density:

$$\begin{aligned} \left| \frac{\omega }{k_\perp v_{\text {A}}} \right| \sim \left| \frac{\omega }{\varOmega _{\text {i}}} \right| \left| \frac{1}{k_\perp \rho _{\text {i}}} \right| \beta _{\text {i}}^{1/2} \lesssim {{\mathcal {O}}}(\epsilon ), \end{aligned}$$

compressional Alfvén (fast) waves are systematically suppressed in the gyrokinetic orderings.
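To make the orderings in Eqs. (8) and (10) concrete, the following Python sketch evaluates the relevant dimensionless ratios for a proton plasma in Gaussian units and checks that the two sides of the scaling in Eq. (10) agree when \(T = m v_{\text {t}}^2/2\) and \(\rho _{\text {i}} = v_{\text {ti}}/\varOmega _{\text {i}}\) are used. The plasma parameters, wave frequency, and wave-number are assumed for illustration only.

```python
import math

# Gaussian-unit constants
E_CHARGE = 4.8032e-10     # statcoulomb
PROTON_MASS = 1.6726e-24  # g
C_LIGHT = 2.99792458e10   # cm/s
ERG_PER_EV = 1.6022e-12


def ordering_ratios(b0, n0, t_i_ev, omega, k_perp):
    """Return the dimensionless ratios entering the gyrokinetic orderings (protons)."""
    t_i = t_i_ev * ERG_PER_EV
    v_ti = math.sqrt(2.0 * t_i / PROTON_MASS)  # from T = m v_t^2 / 2
    omega_ci = E_CHARGE * b0 / (PROTON_MASS * C_LIGHT)
    rho_i = v_ti / omega_ci
    v_a = b0 / math.sqrt(4.0 * math.pi * n0 * PROTON_MASS)
    beta_i = 8.0 * math.pi * n0 * t_i / b0**2
    return {
        "omega/Omega_i": omega / omega_ci,
        "k_perp*rho_i": k_perp * rho_i,
        "beta_i": beta_i,
        "omega/(k_perp*v_A)": omega / (k_perp * v_a),
    }


# Assumed parameters: B0 = 1e4 G, n0 = 1e13 cm^-3, T_i = 1 keV,
# omega = 1e6 rad/s, k_perp = 2 cm^-1.
ratios = ordering_ratios(1.0e4, 1.0e13, 1.0e3, 1.0e6, 2.0)

# Right-hand side of the scaling: (omega/Omega_i) * (1 / k_perp rho_i) * beta_i^(1/2)
rhs = ratios["omega/Omega_i"] / ratios["k_perp*rho_i"] * math.sqrt(ratios["beta_i"])

for name, value in ratios.items():
    print(f"{name:>20s} = {value:.3e}")
print(f"{'scaling rhs':>20s} = {rhs:.3e}")
```

For these assumed values, \(|\omega /\varOmega _{\text {i}}|\) is small while \(|k_\perp \rho _{\text {i}}|\) is of order unity, and \(|\omega /k_\perp v_{\text {A}}|\) is smaller still, consistent with the suppression of the compressional branch.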

In the next step, linear gyrokinetic theories perform the following coordinate transformation from the charged particle’s phase space \((\varvec{x}, \varvec{v})\) to the corresponding guiding-center phase space \((\varvec{X}, \varvec{V})\), where:

$$\begin{aligned} \varvec{X}= & \varvec{X}_\perp + X_\parallel \varvec{b}_0, \quad \varvec{X}_\perp = \varvec{x}_\perp + \varvec{\rho }, \quad \varvec{\rho } = \varvec{v} \times \varvec{b}_0/\varOmega _c, \end{aligned}$$
$$\begin{aligned} \varvec{V}= & \left[ {{\mathcal {E}}} = v^2/2 , \mu = v_\perp ^2/2B_0 , \sigma = \mathrm{sgn}(v_\parallel ) \right] . \end{aligned}$$

Here, \(\varvec{b}_0 = \varvec{B}_0/B_0\), \(\varvec{\rho }\) is the gyroradius vector, \(v_\parallel = \varvec{v} \cdot \varvec{b}_0\), \(\mu \) is the magnetic moment adiabatic invariant (\(\mu = v_\perp ^2/2B_0\) at the leading order) and, assuming there is no equilibrium electrostatic potential, \({{\mathcal {E}}}\) is an equilibrium constant of motion.

In the guiding-center phase space, charged particle dynamics is naturally separated into the fast cyclotron motion and the slow guiding-center motion. One can then apply the gyrokinetic orderings and systematically average out the fast cyclotron motion (i.e., the gyrophase averaging) and obtain the asymptotically dominant (in terms of the smallness parameter \(\epsilon \)) perturbed distribution function response. This perturbed distribution function in the guiding-center phase space can then be inversely transformed back to the charged particle phase space and applied toward the field equations (i.e., Maxwell’s equations) for a self-consistent kinetic description (Catto 1978; Antonsen and Lane 1980).
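The separation between fast gyromotion and slow guiding-center motion can be illustrated with the transformation of Eq. (11). The Python sketch below is a minimal example, assuming a uniform \(\varvec{B}_0\) along z, no electric field, and normalized units; the helper names are hypothetical. It follows an exact gyro-orbit and shows that \(\varvec{X}_\perp = \varvec{x}_\perp + \varvec{\rho }\) stays fixed while the particle position rotates.

```python
import math


def gyro_orbit(t, x0, y0, v_perp, omega_c):
    """Exact orbit for dv/dt = omega_c * (v x z_hat), with B0 along z."""
    x = x0 - (v_perp / omega_c) * math.cos(omega_c * t)
    y = y0 + (v_perp / omega_c) * math.sin(omega_c * t)
    vx = v_perp * math.sin(omega_c * t)
    vy = v_perp * math.cos(omega_c * t)
    return (x, y), (vx, vy)


def guiding_center(position, velocity, omega_c):
    """Eq. (11): X_perp = x_perp + rho, with rho = v x b0 / omega_c and b0 = z_hat."""
    x, y = position
    vx, vy = velocity
    rho = (vy / omega_c, -vx / omega_c)  # perpendicular part of (v x z_hat) / omega_c
    return (x + rho[0], y + rho[1])


# Assumed normalized parameters: guiding center at (0.3, -0.2), v_perp = 2, omega_c = 5.
centers = []
for step in range(8):
    t = 0.1 * step
    position, velocity = gyro_orbit(t, 0.3, -0.2, 2.0, 5.0)
    centers.append(guiding_center(position, velocity, 5.0))
    print(f"t = {t:.1f}  x = ({position[0]:+.3f}, {position[1]:+.3f})  "
          f"X = ({centers[-1][0]:+.3f}, {centers[-1][1]:+.3f})")
```

In this uniform-field case the guiding-center position is an exact constant of the perpendicular motion, which is why gyrophase averaging is naturally carried out at fixed \(\varvec{X}\).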

For the purpose of the present review, we shall limit our considerations to that of a simple uniform plasma with an isotropic Maxwellian equilibrium distribution function. Readers interested in the detailed analyses and/or broader applications may consult References (Antonsen and Lane 1980; Chen and Hasegawa 1991). Assuming, furthermore, \(\beta \) (ratio between the plasma and magnetic pressures) \(\ll 1\), such that there is negligible magnetic compression, the particle velocity distribution is then given by:

$$\begin{aligned} f (\varvec{x}, \varvec{v}, t) = F_{\text {M}} ({{\mathcal {E}}}) + \delta f (\varvec{x}, \varvec{v}, t), \end{aligned}$$

where \(F_{\text {M}} ({{\mathcal {E}}}) = n_0/(\pi ^{3/2} v_{\text {t}}^3) \exp ( - {{\mathcal {E}}}/v_{\text {t}}^2 )\) is the Maxwellian distribution function, \(v_{\text {t}}\) is the thermal speed, and:

$$\begin{aligned} \delta f = \frac{q}{T} F_{\text {M}} ({{\mathcal {E}}}) \delta \phi + e^{-\varvec{\rho } \cdot {\varvec{\nabla }}} \delta g\; , \end{aligned}$$

Here, \(T = mv_{\text {t}}^2/2\) and \(\delta g\) satisfies the following linear gyrokinetic equation:

$$\begin{aligned} \left( \frac{\partial }{\partial t} + v_\parallel \varvec{b}_0 \cdot {\varvec{\nabla }} \right) \delta g= & \frac{q}{T} F_M ({{\mathcal {E}}}) \frac{\partial }{\partial t} \left\langle \delta L_{\text {g}} \right\rangle _\alpha, \end{aligned}$$
$$\begin{aligned} \delta L_{\text {g}}= & e^{\varvec{\rho } \cdot {\varvec{\nabla }}} \delta L\; , \end{aligned}$$
$$\begin{aligned} \delta L= & \delta \phi - v_\parallel \delta A_\parallel /c, \end{aligned}$$

and \(\left\langle \ldots \right\rangle _\alpha \) denotes averaging over the gyrophase angle, \(\alpha \). Here, the field variables are the scalar and vector potentials, \(\delta \phi \) and \(\delta \varvec{A}\), with \(\delta A_\parallel = \delta \varvec{A} \cdot \varvec{b}_0\) and the \({\varvec{\nabla }} \cdot \delta \varvec{A} = 0\) Coulomb gauge. The operator \(e^{\varvec{\rho } \cdot {\varvec{\nabla }}}\), meanwhile, represents the transformation between the particle and guiding center positions.

The corresponding field equations are Poisson’s equation and the parallel Ampère’s law, \(\nabla ^2 \delta A_\parallel = - 4\pi \delta J_\parallel /c\). In the low-frequency and \(|k \lambda _{\text {D}}|^2 \ll 1\) limit, with \(\lambda _{\text {D}}\) being the Debye length, Poisson’s equation can be approximated by the quasi-neutrality condition, \(\sum _j n_{0j} q_j \left\langle \delta f_j \right\rangle _{\varvec{v}} \simeq 0\). Here, \(\left\langle \ldots \right\rangle _{\varvec{v}} = \int d^3 \varvec{v} \left( \ldots \right) \) is the velocity-space integral, and subscript j runs over the particle species. Meanwhile, substituting the parallel Ampère’s law into the \({\varvec{\nabla }} \cdot \delta \varvec{J} \simeq 0\) quasi-neutrality condition as derived from Eq. (15) yields a generalized linear gyrokinetic vorticity equation, which is often convenient to use in studying SAW/KAW dynamics (Chen and Hasegawa 1991; Chen and Zonca 2011, 2016; Zonca et al. 2015).

2.2 Linear KAW properties

For plane-wave \((\omega ,\varvec{k})\) perturbations, Eq. (15) gives:

$$\begin{aligned} \delta g_{\varvec{k}} = -\frac{q}{T} F_{\text {M}} J_0(k_\perp \rho ) \frac{\omega }{k_\parallel v_\parallel - \omega } \left( \delta \phi - \frac{v_\parallel }{c} \delta A_\parallel \right) _{\varvec{k}}\; . \end{aligned}$$

Note, here, that \(J_0\) is the Bessel function and \(J_0(k_\perp \rho )\) corresponds to the gyro-averaging of the coordinate transformation, that is:

$$\begin{aligned} \left\langle \exp \left( - \varvec{\rho } \cdot {\varvec{\nabla }} \right) \right\rangle _\alpha = J_0(k_\perp \rho ). \end{aligned}$$
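Equation (19) can be verified numerically. For a plane wave, \(\varvec{\rho } \cdot {\varvec{\nabla }} \rightarrow i k_\perp \rho \cos \alpha \) (with \(\alpha \) measured from \(\varvec{k}_\perp \)), so the gyrophase average reduces to the standard integral representation of \(J_0\). The Python sketch below compares a direct gyrophase average against the power series of \(J_0\); the function names and the number of quadrature points are illustrative choices.

```python
import math


def gyro_average(k_perp_rho, n_angles=256):
    """Average of exp(-i k_perp rho cos(alpha)) over the gyrophase alpha.

    The imaginary part averages to zero by symmetry, so only the cosine is summed.
    """
    total = 0.0
    for j in range(n_angles):
        alpha = 2.0 * math.pi * (j + 0.5) / n_angles
        total += math.cos(k_perp_rho * math.cos(alpha))
    return total / n_angles


def bessel_j0_series(z, n_terms=40):
    """Power series J0(z) = sum_m (-1)^m (z/2)^(2m) / (m!)^2, adequate for moderate z."""
    term = 1.0
    total = 1.0
    for m in range(1, n_terms):
        term *= -((z / 2.0) ** 2) / (m * m)
        total += term
    return total


for z in (0.1, 1.0, 3.0):
    print(f"k_perp*rho = {z:3.1f}  gyro-average = {gyro_average(z):+.10f}  "
          f"J0 series = {bessel_j0_series(z):+.10f}")
```

The two columns agree to high accuracy, and the average decreases from unity as \(k_\perp \rho \) grows, reflecting the reduced response of particles whose orbits sample a wavelength or more of the perturbation.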

In SAW/KAW analyses, it is sometimes convenient to introduce an effective induced parallel potential defined by \(\varvec{b}_0 \cdot {\varvec{\nabla }} \delta \psi = - \partial _t \delta A_\parallel /c\) or:

$$\begin{aligned} \delta \psi _{\varvec{k}} = \omega \delta A_{\parallel \varvec{k}}/(c k_\parallel ). \end{aligned}$$

\(\delta \psi \), thus, gives rise to the induced parallel electric field; that is, the net parallel electric field is given by:

$$\begin{aligned} \delta E_\parallel= & - \varvec{b}_0 \cdot {\varvec{\nabla }} \left( \delta \phi - \delta \psi \right) ; \quad \mathrm{or} \nonumber \\ \delta E_{\parallel \varvec{k}}= & - i k_\parallel \left( \delta \phi - \delta \psi \right) _{\varvec{k}}. \end{aligned}$$

The quasi-neutrality condition then straightforwardly yields (Chen and Hasegawa 1991):

$$\begin{aligned} \sum _{j} \left( \frac{n_0 q^2}{T_0} \right) _j \left\{ \delta \phi _{\varvec{k}} + \varGamma _{0 kj} \left[ \xi _{kj} Z_{kj} \delta \phi _{\varvec{k}} - (1 + \xi _{kj} Z_{kj} ) \delta \psi _{\varvec{k}} \right] \right\} = 0. \end{aligned}$$

Here, \(\xi _{kj} = \omega /|k_\parallel | v_{tj}\), \(Z_{kj} = Z(\xi _{kj})\) with Z being the well-known plasma dispersion function, and \(\varGamma _{0 kj} = I_0(b_{kj}) \exp ( - b_{kj} )\) with \(I_0\) the modified Bessel function and \(b_{kj} = k_\perp ^2 \rho _j^2/2 = k_\perp ^2 (T_j/m_j)/\varOmega _j^2\). The linear gyrokinetic vorticity equation, meanwhile, is given by (Chen and Hasegawa 1991):

$$\begin{aligned} i \frac{c^2}{4\pi \omega } k_\parallel ^2 k_\perp ^2 \delta \psi _{\varvec{k}} - i \sum _{j} \left( \frac{n_0 q^2}{T_0} \right) _j ( 1 - \varGamma _{0kj} ) \omega \delta \phi _{\varvec{k}} = 0. \end{aligned}$$

Noting that, for KAW, \(|k_\perp \rho _i| \sim {{\mathcal {O}}}(1)\) and \(|k_\perp \rho _e| \ll 1\) and, thus, \(\varGamma _{0 ke} \simeq 1\), Eqs. (22) and (23) then become:

$$\begin{aligned} \epsilon _{s\varvec{k}} \delta \phi _{\varvec{k}}= & \left[ 1 + \xi _{ke} Z_{ke} + \tau \left( 1 + \varGamma _{k} \xi _{ki} Z_{ki} \right) \right] \delta \phi _{\varvec{k}} \nonumber \\= & \left[ 1 + \xi _{ke} Z_{ke} + \tau \varGamma _{k} \left( 1 + \xi _{ki} Z_{ki} \right) \right] \delta \psi _{\varvec{k}}, \end{aligned}$$


$$\begin{aligned} \omega ^2 \delta \phi _{\varvec{k}} = k_\parallel ^2 v_{\text {A}}^2 \frac{b_k}{1-\varGamma _k} \delta \psi _{\varvec{k}}. \end{aligned}$$

Here, \(\tau = T_{0e}/T_{0i}\), \(b_k = b_{ki}\), \(\varGamma _k = \varGamma _{0ki}\), and \(\epsilon _{{\text {s}} \varvec{k}}\) is the dielectric constant for the slow-sound (ion-acoustic) wave (SSW).

It is also instructive, as done in some of the literature, to define the effective parallel potential, \(\delta \phi _{\parallel \varvec{k}} = \delta \phi _{\varvec{k}} - \delta \psi _{\varvec{k}}\), and rewrite Eqs. (24) and (25) as:

$$ \epsilon _{{{s}}{\varvec{k}}} \delta \phi _{\parallel \varvec{k}} = - \tau \left( 1 - \varGamma _k \right) \delta \psi _{\varvec{k}}, $$


$$\begin{aligned} \left[ \omega ^2 - k_\parallel ^2 v_{\text {A}}^2 \frac{b_k}{1-\varGamma _k} \right] \delta \psi _{\varvec{k}} = - \omega ^2 \delta \phi _{\parallel \varvec{k}}. \end{aligned}$$

Equations (26) and (27) demonstrate the coupling between SAW and SSW via the finite \(|k_\perp \rho _{\text {s}}|\) term. In the \(|k_\perp \rho _{\text {i}}| \sim {{\mathcal {O}}}(1)\) short-wavelength limit, SAW evolves into KAW due to both the finite \(|k_\perp \rho _{\text {i}}|\) and \(|k_\perp \rho _{\text {s}}|\) effects. More specifically, the coupled KAW–SSW dispersion relation becomes:

$$\begin{aligned} \omega _{\varvec{k}}^2 \left[ 1 - \frac{\tau \left( 1 - \varGamma _k \right) }{\epsilon _{s\varvec{k}}} \right] = k_\parallel ^2 v_{\text {A}}^2 \frac{b_k}{1-\varGamma _k}. \end{aligned}$$

Let us concentrate on the KAW branch and, to further simplify the analysis, assume \(1 \gg \beta _{\text {i}} \sim \beta _e \gg m_{\text {e}}/m_{\text {i}}\). With \(|\omega | \sim |k_\parallel v_{\text {A}}|\), we then have \(|\xi _{ki}| = |\omega /k_\parallel v_{{\text {ti}}}| \sim \beta _{\text {i}}^{-1/2} \gg 1 \gg |\xi _{ke}| \sim (m_{\text {e}}/m_{\text {i}} \beta _{\text {e}})^{1/2}\), and, keeping only the lowest order \({{\mathcal {O}}}(1)\) terms:

$$\begin{aligned} \epsilon _{s\varvec{k}} \simeq 1 + \tau \left( 1 - \varGamma _k \right) \equiv \sigma _{k}. \end{aligned}$$

From Eq. (28), we then have:

$$\begin{aligned} \omega _{\varvec{k} r}^2 \simeq k_\parallel ^2 v_{\text {A}}^2 \frac{\sigma _k b_k}{1-\varGamma _k}. \end{aligned}$$

A sketch of \((\omega _{\varvec{k} r}/k_\parallel v_{\text {A}})^2\) versus \(b_k^{1/2}\) for different \(\tau \) values is given in Fig. 3.

Fig. 3

Dispersion curves illustrating \((\omega _{\varvec{k} r}/k_\parallel v_{\text {A}})^2\) versus \(b_k^{1/2}\) for different \(\tau \) values
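Curves of the kind sketched in Fig. 3 can be generated from Eqs. (29) and (30) with a few lines of Python. The sketch below is illustrative only: it evaluates \(1 - \varGamma _k\) from the integral representation \(I_0(b) = \pi ^{-1} \int _0^\pi e^{b \cos \theta } {\text {d}}\theta \), and the sampled \(\tau \) and \(b_k\) values are assumptions rather than the values used for the figure.

```python
import math


def one_minus_gamma0(b, n_theta=400):
    """1 - Gamma_0(b), with Gamma_0(b) = I0(b) exp(-b), by midpoint quadrature."""
    total = 0.0
    for j in range(n_theta):
        theta = math.pi * (j + 0.5) / n_theta
        # -expm1 avoids cancellation in 1 - exp(...) at small b
        total += -math.expm1(b * (math.cos(theta) - 1.0))
    return total / n_theta


def kaw_frequency_ratio_sq(b, tau):
    """(omega / k_par v_A)^2 = sigma_k * b_k / (1 - Gamma_k), from Eqs. (29) and (30)."""
    omg = one_minus_gamma0(b)
    sigma = 1.0 + tau * omg
    return sigma * b / omg


for tau in (0.5, 1.0, 2.0):
    row = [kaw_frequency_ratio_sq(s * s, tau) for s in (0.1, 0.5, 1.0, 2.0)]
    print(f"tau = {tau}: " + ", ".join(f"{value:.3f}" for value in row))
```

For \(b_k \ll 1\), the expansion \(\varGamma _k \simeq 1 - b_k + 3 b_k^2/4\) gives \((\omega _{\varvec{k} r}/k_\parallel v_{\text {A}})^2 \simeq 1 + (3/4 + \tau ) b_k\), so the ratio tends to the ideal SAW value of unity as \(b_k \rightarrow 0\) and increases with both \(b_k\) and \(\tau \).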

As to wave polarizations, which are useful for wave identification in observations, we can readily derive:

$$\begin{aligned} \left| \frac{c \delta \varvec{E}_\perp }{\delta \varvec{B}_\perp } \right| = v_{\text {A}} \left[ \frac{b_k}{\sigma _k \left( 1- \varGamma _k\right) }\right] ^{1/2}, \end{aligned}$$


$$\begin{aligned} \left| \frac{c \delta E_\parallel }{\delta \varvec{B}_\perp } \right| = v_{\text {A}} \left| \frac{k_\parallel }{k_\perp } \right| \tau \left[ \frac{b_k \left( 1- \varGamma _k\right) }{\sigma _k}\right] ^{1/2}. \end{aligned}$$

Fig. 4

Polarization curves illustrating \(|c \delta \varvec{E}_\perp /v_{\text {A}} \delta \varvec{B}_\perp |\) versus \(b_k^{1/2}\) for different \(\tau \) values

Sketches of \(|c \delta \varvec{E}_\perp /v_{\text {A}} \delta \varvec{B}_\perp |\) and \(| c \delta E_\parallel k_\perp /v_{\text {A}}\delta \varvec{B}_\perp k_\parallel \tau |\) are given in, respectively, Figs. 4 and 5.

Fig. 5

Polarization curves illustrating \(|c \delta E_\parallel k_\perp /v_{\text {A}}\delta \varvec{B}_\perp k_\parallel \tau |\) versus \(b_k^{1/2}\) for different \(\tau \) values

Equation (32) and Fig. 5 show that, for a fixed \(|k_\parallel /k_\perp |\), \(|\delta E_\parallel /\delta \varvec{B}_\perp |\) increases with \(b_k\). Since wave–particle energy and momentum exchanges are proportional to \(|\delta E_\parallel |\), short-wavelength KAWs are, thus, expected to play crucial roles in the heating, acceleration, and transport of charged particles.
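The normalized polarization ratios of Eqs. (31) and (32) can be evaluated in the same way. The following Python sketch is a hedged illustration with assumed sample values of \(b_k\) and \(\tau \); it returns \(|c \delta \varvec{E}_\perp /v_{\text {A}} \delta \varvec{B}_\perp |\) and \(|c \delta E_\parallel k_\perp /v_{\text {A}} \delta \varvec{B}_\perp k_\parallel \tau |\), i.e., the quantities on the ordinates of Figs. 4 and 5.

```python
import math


def one_minus_gamma0(b, n_theta=400):
    """1 - Gamma_0(b), with Gamma_0(b) = I0(b) exp(-b), by midpoint quadrature."""
    total = 0.0
    for j in range(n_theta):
        theta = math.pi * (j + 0.5) / n_theta
        total += -math.expm1(b * (math.cos(theta) - 1.0))
    return total / n_theta


def polarization(b, tau):
    """Return the normalized ratios of Eqs. (31) and (32).

    First value:  |c dE_perp / (v_A dB_perp)|
    Second value: |c dE_par k_perp / (v_A dB_perp k_par tau)|
    """
    omg = one_minus_gamma0(b)
    sigma = 1.0 + tau * omg
    e_perp = math.sqrt(b / (sigma * omg))
    e_par = math.sqrt(b * omg / sigma)
    return e_perp, e_par


for sqrt_b in (0.1, 0.5, 1.0, 2.0):
    e_perp, e_par = polarization(sqrt_b**2, 1.0)  # tau = 1 assumed
    print(f"sqrt(b_k) = {sqrt_b:3.1f}  E_perp ratio = {e_perp:.4f}  E_par ratio = {e_par:.4f}")
```

In the \(b_k \rightarrow 0\) limit the first ratio returns to unity, recovering the ideal SAW polarization of Eq. (2), while the second vanishes, consistent with the nearly vanishing parallel electric field of the ideal SAW.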

In addition to having a significant \(\delta E_\parallel \), another important property of KAW, in contrast to SAW, is that KAW has a finite perpendicular (to \(\varvec{B}_0\)) group velocity, \(\varvec{v}_{g \perp }\). Assuming \(|k_\perp \rho _{\text {i}}|^2 \ll 1\), we have, letting \(\omega _{\text {A}}^2 \equiv k_\parallel ^2 v_{\text {A}}^2\):

$$\begin{aligned} \omega _{\varvec{k}}^2 \simeq \omega _{\text {A}}^2 \left( 1 + k_\perp ^2 {\hat{\rho }}^2 \right) , \end{aligned}$$


$$\begin{aligned} {\hat{\rho }}^2 = \left( 3/4 + \tau \right) \rho _i^2. \end{aligned}$$


$$\begin{aligned} \varvec{v}_{g \perp } \simeq \frac{\omega _{\text {A}}^2}{\omega _{\varvec{k}}} {\hat{\rho }}^2 \varvec{k}_\perp . \end{aligned}$$
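As a consistency check, Eq. (35) should equal \(\partial \omega _{\varvec{k}}/\partial k_\perp \) computed from Eq. (33). The Python sketch below compares the analytic expression with a centered finite difference, using normalized units (\(\omega _{\text {A}} = {\hat{\rho }} = 1\)) and an assumed \(k_\perp {\hat{\rho }} = 0.3\); the function names are hypothetical.

```python
import math


def kaw_omega(k_perp, omega_a, rho_hat):
    """Eq. (33): omega_k = omega_A * sqrt(1 + k_perp^2 rho_hat^2)."""
    return omega_a * math.sqrt(1.0 + (k_perp * rho_hat) ** 2)


def group_velocity_perp(k_perp, omega_a, rho_hat):
    """Eq. (35): v_g_perp = omega_A^2 rho_hat^2 k_perp / omega_k."""
    return omega_a**2 * rho_hat**2 * k_perp / kaw_omega(k_perp, omega_a, rho_hat)


def finite_difference(k_perp, omega_a, rho_hat, dk=1.0e-6):
    """Centered finite-difference estimate of d(omega_k)/d(k_perp)."""
    upper = kaw_omega(k_perp + dk, omega_a, rho_hat)
    lower = kaw_omega(k_perp - dk, omega_a, rho_hat)
    return (upper - lower) / (2.0 * dk)


analytic = group_velocity_perp(0.3, 1.0, 1.0)
numeric = finite_difference(0.3, 1.0, 1.0)
print(f"Eq. (35): {analytic:.8f}   finite difference: {numeric:.8f}")
```

Setting `rho_hat = 0` in `group_velocity_perp` returns zero, which recovers the ideal SAW result that the group velocity has no perpendicular component.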

2.3 Linear mode conversion of KAW

Equation (33) has a significant implication in non-uniform plasmas. Consider, again, a slab plasma with a non-uniform \(\omega _{\text {A}}^2(x)\) and \(k_\perp ^2 = k_x^2(x)\) being the WKB wave-number in the non-uniformity direction, x. Equation (33) then indicates that the KAW is propagating (\(k_x^2 > 0\)) in the \(\omega _{\varvec{k}}^2 > \omega _{\text {A}}^2(x)\) region, and it is cut off (\(k_x^2 < 0\)) in the \(\omega _{\varvec{k}}^2 < \omega _{\text {A}}^2(x)\) region. That \(\varvec{v}_{g \perp }\) is finite also suggests that, in contrast to the SAW, an initially smooth perturbation will not only evolve into short wavelengths, but also propagate toward the lower-\(\omega _{\text {A}}^2(x)\) region. These features are illustrated in Fig. 6b, where the spatial–temporal evolution of the KAW is solved explicitly according to the following wave equation:

$$\begin{aligned} \left[ {\hat{\rho }}^2 \frac{\partial ^2}{\partial x^2} - 1 - \frac{1}{\omega _{\text {A}}^2(x)}\frac{\partial ^2}{\partial t^2} \right] \delta B_y (x,t) = 0. \end{aligned}$$

Note that Eq. (36) can be readily derived by letting \(\omega _{\varvec{k}} = i \partial /\partial t\) and \(k_\perp = - i \partial /\partial x\) in Eq. (33). The spatial profile of \(\omega _{\text {A}}^2(x)/\omega ^2 = 1/(1+x^2/L^2)\) is shown in Fig. 6a, with L indicating the profile length-scale, so that the KAW wave-packet frequency is assumed to be consistent with the SAW frequency at \(x=0\). Figure 6b shows the propagation of the KAW wave-packet in the direction of radial non-uniformity, consistent with Eq. (35).

Fig. 6

a Spatial dependence of \(\omega _{\text {A}}^2\). b Propagation of the KAW wave-packet in the non-uniformity direction
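Equation (36) can be integrated with a simple explicit scheme once it is written as \(\partial _t^2 \delta B_y = \omega _{\text {A}}^2(x) ( {\hat{\rho }}^2 \partial _x^2 - 1 ) \delta B_y\). The Python sketch below is a minimal leapfrog integrator on a periodic grid; the scheme, the periodic boundary condition, and the grid parameters are our illustrative assumptions and are not necessarily those used to produce Fig. 6b. As a check, it is run for a uniform \(\omega _{\text {A}}\), for which a single Fourier mode must oscillate at the frequency given by Eq. (33).

```python
import math


def evolve_kaw(u0, omega_a_sq, rho_hat, dx, dt, n_steps):
    """Leapfrog integration of Eq. (36), written as
    d^2 u / dt^2 = omega_A^2(x) * (rho_hat^2 d^2 u / dx^2 - u),
    on a periodic grid (assumed) with du/dt = 0 at t = 0.
    Returns u at time n_steps * dt.
    """
    n = len(u0)
    if n_steps == 0:
        return list(u0)

    def acceleration(u):
        acc = []
        for i in range(n):
            u_xx = (u[(i + 1) % n] - 2.0 * u[i] + u[i - 1]) / dx**2
            acc.append(omega_a_sq[i] * (rho_hat**2 * u_xx - u[i]))
        return acc

    u_prev = list(u0)
    # First step from a Taylor expansion with zero initial time derivative.
    acc = acceleration(u_prev)
    u_curr = [u_prev[i] + 0.5 * dt**2 * acc[i] for i in range(n)]
    for _ in range(n_steps - 1):
        acc = acceleration(u_curr)
        u_next = [2.0 * u_curr[i] - u_prev[i] + dt**2 * acc[i] for i in range(n)]
        u_prev, u_curr = u_curr, u_next
    return u_curr


# Uniform-plasma check with assumed normalized parameters (omega_A = 1).
n, rho_hat, k = 64, 0.1, 2
dx = 2.0 * math.pi / n
u0 = [math.cos(k * i * dx) for i in range(n)]
dt, n_steps = 0.01, 300
u = evolve_kaw(u0, [1.0] * n, rho_hat, dx, dt, n_steps)
omega = math.sqrt(1.0 + (k * rho_hat) ** 2)  # Eq. (33) with omega_A = 1
exact = math.cos(omega * dt * n_steps)
print(f"numerical u(0, t) = {u[0]:.5f}, Eq. (33) prediction = {exact:.5f}")
```

A non-uniform case can be explored by passing a profile such as that of Fig. 6a through `omega_a_sq`, provided the domain is large enough that the wave-packet does not reach the periodic boundary during the run and the time step resolves the fastest grid-scale oscillation.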

That there exists a finite perpendicular group velocity also implies, in the steady state, the removal of the “singular” resonance and the appearance of a linear mode conversion process (Hasegawa and Chen 1976). More specifically, the corresponding wave equation is given by:

$$\begin{aligned} \left\{ {\hat{\rho }}^2 \frac{\partial ^2}{\partial x^2} + \left[ \frac{\omega _0^2}{\omega _{\text {A}}^2(x)} - 1 \right] \right\} \delta {\hat{B}}_y (x) = \delta {\hat{B}}_{y 0}. \end{aligned}$$

Here, \(\omega _0\) is the external driving frequency. In the ideal SAW (\({\hat{\rho }} \rightarrow 0^+\)) limit, there is a resonance singularity at \(x_0\), where \(\omega _0^2 = \omega _{\text {A}}^2(x_0)\). Noting that, near \(x=x_0\), \(\omega _{\text {A}}^2(x) \simeq \omega _0^2 + \left( \omega _{\text {A}}^2 \right) '(x_0) (x - x_0) \equiv \omega _0^2 - (\omega _0^2/L_{\text {A}}) (x-x_0)\), Eq. (37) can be approximated as an inhomogeneous Airy equation and solved analytically. Equation (37) can then be solved, with appropriate boundary conditions, by connecting the solutions valid away from the \(x=x_0\) resonance layer via the analytic solution of the inhomogeneous Airy equation valid near \(x=x_0\) (Hasegawa and Chen 1975, 1976). The solutions away from the singular layer are given by:

$$\begin{aligned} \delta {\hat{B}}_y (x)= & \left\{ \begin{array}{l} \displaystyle \frac{\delta {\hat{B}}_{y 0}}{\epsilon _{\text {A}}(x)}, \quad \mathrm{for} \quad \omega _0^2 < \omega _{\text {A}}^2(x); \\ \, \\ \displaystyle \frac{\delta {\hat{B}}_{y 0}}{\epsilon _{\text {A}}(x)} -\displaystyle \frac{\sqrt{\pi }\delta {\hat{B}}_{y 0}}{({\hat{\rho }}^2/L_{\text {A}})^{1/2}} \left( \displaystyle \frac{{\hat{\rho }}^2}{\epsilon _{\text {A}} (x)} \right) ^{1/4} \end{array} \right. \; \nonumber \\&\times \exp \left[ i \int _{x_0}^x \left( \frac{\epsilon _{\text {A}}(x')}{{\hat{\rho }}^2} \right) ^{1/2} {\text {d}}x' + i \frac{\pi }{4} \right] , \quad \mathrm{for} \quad \omega _0^2 > \omega _{\text {A}}^2(x), \end{aligned}$$


$$\begin{aligned} \epsilon _{\text {A}}(x) = \frac{\omega _0^2}{\omega _{\text {A}}^2(x)} - 1. \end{aligned}$$

The corresponding numerical solutions are plotted in Fig. 7.

Fig. 7

Illustration of ideal MHD (dashed blue line) and KAW (red line) solutions, which asymptotically match Eq. (38) for \(|x-x_0|/\varDelta _0 \gg 1\). The Airy swelling factor is evident from the normalization of the ordinate

Both the analytical results and the mode conversion process exhibit two important features. One is that, instead of being singular, the amplitude at \(x=x_0\) (where \(\omega _{\text {A}}(x_0) = \omega _0\)) is amplified by the Airy swelling factor, \((L_{\text {A}}/{\hat{\rho }})^{2/3}\). Here, we recall that \(L_{\text {A}}\) is the scale length of \(\omega _{\text {A}}\) and \({\hat{\rho }}\), from Eq. (34), is of \({{\mathcal {O}}}(\rho _i)\); hence, \(|L_{\text {A}}/{\hat{\rho }}|\gg 1\). The other is that the singularity at \(x=x_0\) is replaced by a layer with the Airy scale length, \(\varDelta _0 = ({\hat{\rho }}^2 L_{\text {A}})^{1/3}\). Recalling, from Eq. (5), \(|k_x| \simeq |\omega _{\text {A}}'| t \simeq (\omega _0/L_{\text {A}}) t\), there then exists a KAW formation time scale given by \((\omega _0/L_{\text {A}}) t_0 \simeq 1/\varDelta _0\); i.e., \(\omega _0 t_0 \simeq (L_{\text {A}}/{\hat{\rho }})^{2/3}\). Taking, for example, a typical laboratory plasma with \(L_{\text {A}}/{\hat{\rho }} \simeq {{\mathcal {O}}}(10^3)\), we have \(\omega _0 t_0 \simeq {{\mathcal {O}}}(10^2)\), suggesting that it is reasonable to anticipate, in the presence of the SAW continuous spectrum, the appearance of KAW in such plasmas.
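The estimates above amount to simple arithmetic, which the following Python sketch makes explicit. Lengths are normalized to \({\hat{\rho }}\), and the value \(L_{\text {A}}/{\hat{\rho }} = 10^3\) is the order-of-magnitude example quoted in the text; the function name is hypothetical.

```python
def airy_scales(l_a, rho_hat):
    """Return (Delta_0, Airy swelling factor, omega_0 * t_0) for given L_A and rho_hat."""
    delta_0 = (rho_hat**2 * l_a) ** (1.0 / 3.0)  # Airy scale length
    swelling = (l_a / rho_hat) ** (2.0 / 3.0)    # amplitude amplification factor
    omega0_t0 = l_a / delta_0                    # from (omega_0 / L_A) t_0 ~ 1 / Delta_0
    return delta_0, swelling, omega0_t0


delta_0, swelling, omega0_t0 = airy_scales(1.0e3, 1.0)
print(f"Delta_0 / rho_hat = {delta_0:.2f}")
print(f"Airy swelling factor = {swelling:.1f}")
print(f"omega_0 * t_0 = {omega0_t0:.1f}")
```

The swelling factor and \(\omega _0 t_0\) coincide because both equal \(L_{\text {A}}/\varDelta _0\), so a sharper resonance layer implies both a larger amplification and a longer formation time in units of the wave period.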

2.4 Satellite observations of KAWs

Owing to diagnostic constraints in laboratory plasmas, most KAW observations have been made by satellites in the Sun–Earth space plasma environment. Shear Alfvénic oscillations in the magnetosphere have been linked to drivers from the upstream solar wind. Due to the collisionless nature of space plasmas, kinetic effects create large-amplitude waves and pressure pulses in the foreshock region upstream from the quasi-parallel bow shock. The foreshock is found to be an important source of continuous (Pc) magnetic pulsations in the magnetosphere in the Pc3 (period 10–45 s), Pc4 (period 45–150 s), and Pc5 (period 150–600 s) ranges (Fairfield et al. 1990; Engebretson et al. 1991; Chi et al. 1994; Clausen et al. 2009; Wang et al. 2019). The mode conversion process associated with the compressional modes of the foreshock waves has been suggested as a directly driven mechanism for the generation of the frequently observed discrete harmonic frequencies of shear Alfvénic field-line resonances (see Fig. 1) (Hasegawa et al. 1979; Hasegawa and Chen 1975; Lee et al. 1994). Indeed, near the magnetopause boundary, a sharp transition in wave polarization is frequently found, from predominantly compressional waves in the magnetosheath to transverse waves in the boundary layer (Song et al. 1993; Rezeau et al. 1989; Chaston et al. 2008). THEMIS observations by Chaston et al. (2008) show direct evidence of a turbulent spectrum of KAWs at the magnetopause with sufficient power to provide massive particle transport. Using coordinated observations in the foreshock and the magnetosphere, Wang et al. (2019) found direct evidence of Pc5 field-line resonances driven by the foreshock perturbations. As remarked earlier, the main mode identification method for KAWs is based on the measurement of the wave polarization, \(|c \delta \varvec{E}_\perp /v_{\text {A}}\delta \varvec{B}_\perp |\). Two cases are illustrated here. One is the observation by the Van Allen Probes in the Earth’s inner magnetosphere (Chaston et al. 2014) (cf. Fig. 8);

Fig. 8

a The time-averaged ratio \(\mathrm E_{\mathrm{YFAC}}/\mathrm B_{\mathrm{XFAC}}\) in field-aligned coordinates (MKS units). Red line shows the fit of the local KAW dispersion relation (cf. Fig. 4) [reproduced from Ref. Chaston et al. (2014)]. b Relative phase and coherency (red) between \(\mathrm E_{\mathrm{YFAC}}\) and \(\mathrm B_{\mathrm{XFAC}}\) [reproduced from Ref. Chaston et al. (2014)]

the other is the observation by the Cluster satellites in the solar wind (Salem et al. 2012) (cf. Fig. 9).

Fig. 9

a Prediction of \(|\delta \varvec{E}/\delta \varvec{B}|_{\text{s}/c}\) for kinetic Alfvén waves (red curves) or whistler waves (black and blue curves) with specified angle \(\theta \). Cluster measurements of \(|\delta E_y/\delta B_z|\) up to 2 Hz, or 12 \(f_{ci}\), are presented without (green solid) and with (green dashed) the EFW noise floor removed [reproduced from Ref. Salem et al. (2012)]. b Prediction of \(|\delta B_\parallel |/|\delta \varvec{B}|_{\text{s}/c}\) for kinetic Alfvén waves (red) or whistler waves (black/blue) with specified angle \(\theta \). Cluster FGM measurements up to 2 Hz, or 12 \(f_{ci}\), are shown in green [reproduced from Ref. Salem et al. (2012)]

Both observations showed that the measured polarizations, \(|c \delta \varvec{E}_\perp /v_{\text {A}}\delta \varvec{B}_\perp |\), agree qualitatively and/or quantitatively with those theoretically predicted for KAWs.

Finally, we remark that KAW physics has also been applied theoretically to laboratory fusion plasmas (Hasegawa and Chen 1975, 1976; Chen and Zonca 2013, 2016). Realistic plasma non-uniformities and magnetic field geometries often play crucially important roles in determining SAW/KAW mode structures and stability properties in such plasmas (see, e.g., Ref. Chen and Zonca (2016)). For example, in toroidal fusion plasmas, the Kinetic Toroidal Alfvén Eigenmodes (KTAEs) (Mett and Mahajan 1992) may exist within the SAW continuum and their dynamics are intrinsically related to those of KAWs. Furthermore, laboratory plasma experiments have shown evidence of coupling between SAW eigenmodes and KAWs (Wong et al. 1996) that may also be externally driven by mode conversion of fast modes (Fasoli et al. 1996). Since KAWs carry significant implications for plasma heating and transport, it will be interesting to see more focused investigations of KAW physics in laboratory plasma experiments and/or simulations.

3 Nonlinear KAW physics

In this section, we first discuss the nonlinear gyrokinetic orderings and present the corresponding equations. We then apply the nonlinear gyrokinetic equations to the fundamental three-wave parametric decay instabilities. Here, we emphasize the qualitative and quantitative differences between the results of nonlinear gyrokinetic theory and those based on the ideal MHD theory. Corresponding simulations not only support the gyrokinetic theory results, but also suggest the excitation of \(k_\parallel \simeq 0\) fluctuations; i.e., convective cells. This motivated studies of the spontaneous excitation of convective cells by KAWs. The results demonstrate the significant effects of finite ion Larmor radius and, thus, the power of nonlinear gyrokinetic theory as a theoretical tool. Finally, we present a quasi-linear description of plasma transport due to KAWs.

3.1 Nonlinear gyrokinetic theory

In extending the linear gyrokinetic theory to the nonlinear regime, one allows the fluctuations to be of finite amplitudes with, however, the constraint that the corresponding nonlinear frequencies, \(\omega _{n\ell }\), be much less than the cyclotron frequency. In other words, consistent with the linear gyrokinetic orderings:

$$\begin{aligned} \left| \omega _{n\ell } \right| \sim \left| \delta \varvec{u}_\perp \cdot {\varvec{\nabla }}_\perp \right| \sim \left| \omega \right| \sim {{\mathcal {O}}}(\epsilon ) \left| \varOmega _i \right| . \end{aligned}$$

Here, \(\delta \varvec{u}_\perp \) represents the fluctuation-induced particle (guiding-center) jiggling velocity. Taking, for example, \(\delta \varvec{u}_\perp \simeq v_\parallel \delta \varvec{B}_\perp /B_0\) due to magnetic fluctuation, \(\delta \varvec{B}_\perp \), \(v_\parallel \sim v_t\), and \(|{\varvec{\nabla }}_\perp | \sim 1/\rho _i\), we then obtain the following nonlinear gyrokinetic orderings (Frieman and Chen 1982):

$$\begin{aligned} \left| \delta f/F_0 \right| \sim \left| \delta \varvec{B}/B_0 \right| \sim \left| c \delta \varvec{E}_\perp /(B_0v_t) \right| \sim {{\mathcal {O}}}(\epsilon ). \end{aligned}$$
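The step from the frequency ordering to the amplitude ordering can be made explicit by a rough estimate. The lines below are our own sketch, assuming \(\rho _i \sim v_t/|\varOmega _i|\) and using the \(\delta \varvec{E}_\perp \times \varvec{B}_0\) drift, \(|\delta \varvec{u}_\perp | \sim c|\delta \varvec{E}_\perp |/B_0\), for the electric-field estimate:

$$\begin{aligned} \left| \delta \varvec{u}_\perp \cdot {\varvec{\nabla }}_\perp \right|\sim & \frac{v_t}{\rho _i} \left| \frac{\delta \varvec{B}_\perp }{B_0} \right| \sim \left| \varOmega _i \right| \left| \frac{\delta \varvec{B}_\perp }{B_0} \right| \sim {{\mathcal {O}}}(\epsilon ) \left| \varOmega _i \right| \quad \Longrightarrow \quad \left| \frac{\delta \varvec{B}_\perp }{B_0} \right| \sim {{\mathcal {O}}}(\epsilon ), \nonumber \\ \left| \delta \varvec{u}_\perp \cdot {\varvec{\nabla }}_\perp \right|\sim & \frac{c \left| \delta \varvec{E}_\perp \right| }{B_0 \rho _i} \sim \left| \varOmega _i \right| \left| \frac{c \delta \varvec{E}_\perp }{B_0 v_t} \right| \sim {{\mathcal {O}}}(\epsilon ) \left| \varOmega _i \right| \quad \Longrightarrow \quad \left| \frac{c \delta \varvec{E}_\perp }{B_0 v_t} \right| \sim {{\mathcal {O}}}(\epsilon ). \end{aligned}$$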

Again, let us consider the case of a uniform plasma to simplify the presentation and highlight the important underlying physics. In the uniform case, the perturbed distribution function, \(\delta f\), can be decomposed, as in the linear case, into adiabatic and non-adiabatic components; that is:

$$\begin{aligned} \delta f = -\frac{q}{T} F_{\text {M}} \delta \phi + \exp \left( - \varvec{\rho } \cdot {\varvec{\nabla }} \right) \delta g. \end{aligned}$$

Here, we have taken the background distribution to be Maxwellian, and \(\delta g\) satisfies the following nonlinear gyrokinetic equation (Frieman and Chen 1982):

$$\begin{aligned} \left[ \frac{\partial }{\partial t} + v_\parallel \varvec{b}_0 \cdot {\varvec{\nabla }} + \left\langle \delta \varvec{u}_{\perp \text{g}} \right\rangle _\alpha \cdot {\varvec{\nabla }} \right] \delta g = \frac{q}{T} F_\text{M} \frac{\partial }{\partial t} \left\langle \delta L_\text{g} \right\rangle _\alpha , \end{aligned}$$

with \(\delta L_\text{g}\) given by Eqs. (16) and (17), and:

$$\begin{aligned} \left\langle \delta \varvec{u}_{\perp g} \right\rangle _\alpha= & (c/B_0) \varvec{b}_0 \times {\varvec{\nabla }} \left\langle \delta L_{\text {g}} \right\rangle _\alpha \nonumber \\= & (c/B_0) \left\langle \delta \varvec{E}_{\perp \text{g}} \right\rangle _\alpha \times \varvec{b}_0 + v_\parallel \left\langle \delta \varvec{B}_{\perp \text{g}} \right\rangle _\alpha /B_0. \end{aligned}$$

Expanding in terms of plane-wave solutions, Eq. (43) yields:

$$\begin{aligned} i \left( k_\parallel v_\parallel - \omega _k \right) \delta g_k= & - i \omega _k \frac{q}{T} J_k \delta L_k F_\text{M} \nonumber \\&+ \frac{c}{B_0} \varLambda _{k'}^{k''} \left[ J_{k'} \delta L_{k'} \delta g_{k''} - J_{k''} \delta L_{k''} \delta g_{k'} \right] , \end{aligned}$$

where \(J_k \equiv J_0(k_\perp v_\perp /\varOmega )\), \(J_0\) is the zeroth-order Bessel function of the first kind,

$$\begin{aligned} \varLambda _{k'}^{k''}= & \varvec{b}_0 \cdot \left( \varvec{k}_\perp ' \times \varvec{k}_\perp '' \right) , \end{aligned}$$
$$\begin{aligned} \delta L_k= & \delta \phi _k - v_\parallel \delta A_{\parallel k}/c , \end{aligned}$$

and \((\omega _k, \varvec{k})\) satisfy frequency and wave-vector matching conditions; i.e., \(\omega _k = \omega _{k'} + \omega _{k''}\) and \(\varvec{k} = \varvec{k}' + \varvec{k}''\).
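The geometric content of the coupling coefficient \(\varLambda _{k'}^{k''}\) and of the matching conditions can be illustrated with a short Python sketch. It assumes \(\varvec{b}_0 = \hat{\varvec{z}}\), so that \(\varLambda _{k'}^{k''}\) reduces to the \(z\) component of \(\varvec{k}_\perp ' \times \varvec{k}_\perp ''\); the helper names `coupling_lambda` and `matched_wave` and the sample wave-vectors are hypothetical.

```python
def coupling_lambda(k1_perp, k2_perp):
    """Lambda = b0 . (k'_perp x k''_perp) for b0 = z-hat.

    k1_perp and k2_perp are (kx, ky) pairs.
    """
    return k1_perp[0] * k2_perp[1] - k1_perp[1] * k2_perp[0]


def matched_wave(wave1, wave2):
    """Return (omega, k) with omega = omega' + omega'' and k = k' + k''.

    Each wave is a pair (omega, (kx, ky, kz)).
    """
    (omega1, k1), (omega2, k2) = wave1, wave2
    return omega1 + omega2, tuple(a + b for a, b in zip(k1, k2))


k1 = (1.0, 0.0)
k2 = (0.0, 1.0)
lam_orth = coupling_lambda(k1, k2)          # orthogonal perpendicular wave-vectors
lam_swap = coupling_lambda(k2, k1)          # antisymmetric under k' <-> k''
lam_par = coupling_lambda(k1, (2.0, 0.0))   # vanishes for parallel wave-vectors
omega, k = matched_wave((0.6, (1.0, 0.0, 0.2)), (-0.2, (0.0, 1.0, 0.1)))
print(lam_orth, lam_swap, lam_par, omega, k)
```

The antisymmetry of \(\varLambda _{k'}^{k''}\) and its vanishing for collinear \(\varvec{k}_\perp '\) and \(\varvec{k}_\perp ''\) are what make the perpendicular nonlinearity strongest for orthogonal perpendicular wave-vectors.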

The field equations remain the same; i.e., Poisson’s equation or the quasi-neutrality condition, and the parallel Ampère’s law or the generalized nonlinear gyrokinetic vorticity equation. The quasi-neutrality condition is formally the same as in the linear theory, that is:

$$\begin{aligned} \left( 1 + \tau \right) e \delta \phi _k / T_\text{e} = \left\langle J_k \delta g_{ki} - \delta g_{ke} \right\rangle _v, \end{aligned}$$

with \(\tau \equiv T_{\text {e}}/T_{\text {i}}\), consistent with the definition introduced below Eq. (25) and where we dropped the subscript “0” on equilibrium temperature; and \(J_k = J_{ki}\) for brevity. The nonlinear gyrokinetic vorticity equation (Chen and Zonca 2016; Chen et al. 2001; Zonca and Chen 2014a, b), meanwhile, is given by:

$$\begin{aligned} i k_\parallel \delta J_{\parallel k} - i \frac{c}{4\pi } \frac{\omega _k}{v_{\text {A}}^2}\frac{k_\perp ^2}{b_k} \left( 1-\varGamma _k \right) \delta \phi _k = \left( \mathrm{NL} \right) _{\text {A}} + \left( \mathrm{NL} \right) _{\phi }, \end{aligned}$$

where \(b_{k} = k_\perp ^2 \rho _i^2/2\), \(\varGamma _{k} = I_0(b_{k}) \exp ( - b_{k} )\), consistent with the definitions introduced below Eq. (22):

$$\begin{aligned} \left( \mathrm{NL} \right) _{\text {A}} = - \varLambda _{k'}^{k''} \left( \delta A_{\parallel k'} \delta J_{\parallel k''} - \delta A_{\parallel k''} \delta J_{\parallel k'} \right) /B_0, \end{aligned}$$


$$\begin{aligned} \left( \mathrm{NL} \right) _{\phi }= & \left( ec/B_0 \right) \varLambda _{k'}^{k''} \left\langle \left( J_{k} J_{k'} - J_{k''} \right) \delta L_{k'} \delta g_{k''} \right. \nonumber \\&- \left. \left( J_{k} J_{k''} - J_{k'} \right) \delta L_{k''} \delta g_{k'} \right\rangle _{i,v}. \end{aligned}$$

We remark that \(\left( \mathrm{NL} \right) _{\text {A}}\) corresponds to the Maxwell stress term due to the \(\delta J_\parallel \varvec{b}_0 \times \delta \varvec{B}_\perp \) force with \(\delta J_\parallel \) mainly carried by electrons due to \(m_e\ll m_i\). \(\left( \mathrm{NL} \right) _{\phi }\), meanwhile, is the gyrokinetic stress tensor, which is dominated by ions and reduces to the well-known fluid Reynolds stress in the \(k_\perp ^2 \rho _i^2 \ll 1\) limit (Chen and Zonca 2016; Chen et al. 2001).
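To get a feel for the finite ion-Larmor-radius factors entering the vorticity equation, the Python sketch below evaluates \(\varGamma _k = I_0(b_k) \exp (-b_k)\) and \((1-\varGamma _k)/b_k\). It uses only the standard library and the integral representation \(I_0(b) = \pi ^{-1} \int _0^\pi \exp (b \cos \theta )\, {\text {d}}\theta \); the function names `gamma0` and `flr_factor` and the number of quadrature points are our choices, not part of the original formulation.

```python
import math


def gamma0(b, n=2000):
    """Gamma(b) = I0(b) exp(-b), from (1/pi) * int_0^pi exp(b (cos t - 1)) dt.

    A midpoint rule with n points is used; the exponential factor is folded
    into the integrand so that large b does not overflow.
    """
    h = math.pi / n
    total = sum(math.exp(b * (math.cos((j + 0.5) * h) - 1.0)) for j in range(n))
    return total / n


def flr_factor(b):
    """(1 - Gamma(b)) / b, which tends to 1 in the long-wavelength limit b -> 0."""
    if b == 0.0:
        return 1.0
    return (1.0 - gamma0(b)) / b


for b in (0.01, 0.5, 2.0, 50.0):
    print(f"b = {b:5.2f}  Gamma = {gamma0(b):.5f}  (1 - Gamma)/b = {flr_factor(b):.5f}")
```

For \(b_k \ll 1\), the factor \((1-\varGamma _k)/b_k\) approaches unity, while for \(b_k \gg 1\), \(\varGamma _k\) decays as \(1/\sqrt{2\pi b_k}\); both limits follow from the standard small- and large-argument expansions of \(I_0\).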

3.2 Parametric decay instabilities

Parametric decay instability (PDI) is a fundamental nonlinear process involving three nonlinearly coupled waves/oscillators (Jackson 1967; Liu and Rosenbluth 1976). One is the pump (“mother”) wave and the other two are the decay (“daughter”) waves. The PDI can be either resonant, if both decay waves are marginally stable or weakly damped normal modes, or non-resonant, if one of the decay waves is a heavily damped quasi-mode. Since the pump wave can be either spontaneously or externally excited, PDI is an important channel for wave energy transfer, with associated consequences for plasma heating, acceleration, and transport.

Interested readers may refer to the original work (Chen and Zonca 2011) for the detailed derivations of the KAW PDI dispersion relations. Here, we will just present the key points and results. Let the three interacting waves be the pump wave \(\varvec{\varOmega }_0 = (\omega _0, \varvec{k}_0)\), the low-frequency daughter SSW \(\varvec{\varOmega }_{\text {s}} = (\omega _{\text {s}}, \varvec{k}_{\text {s}})\), and the daughter KAW \(\varvec{\varOmega }_- = (\omega _-, \varvec{k}_-)\) with \(\omega _- = \omega _\text{s} - \omega _0 \) and \(\varvec{k}_- = \varvec{k}_{\text {s}} - \varvec{k}_0\), consistent with the frequency and wave-vector matching conditions. Let the small but finite pump wave amplitude be denoted as \(\varPhi _0 = e \delta \phi _0/T_{\text {e}}\). As \(\varvec{\varOmega }_{\text {s}}\) could be a quasi-mode, we need to retain \({{\mathcal {O}}}(|\varPhi _0|^2)\) terms to properly account for non-resonant PDI. Carrying out the straightforward algebra (Chen and Zonca 2011), we then derive the KAW PDI dispersion relation:

$$\begin{aligned} \epsilon _{\text{s}k}\left( \epsilon _{Ak-}+\chi _{A-}^{(2)}\right) =C_{k}\left| \varPhi _{0}\right| ^{2}. \end{aligned}$$

Here,

$$\begin{aligned} \epsilon _{{\text{s}}k}=1+\tau +\tau \varGamma _{{\text {s}}} \xi _{{\text {s}}} Z (\xi _{{\text {s}}}) \end{aligned}$$

and

$$\begin{aligned} \epsilon _{\text {A}k-}=\left[ \left( 1-\varGamma _{-}\right) / b_{-}-\left( k_{\Vert }^{2} v_{A}^{2} / \omega ^{2}\right) _{-} \sigma _{-}\right] \end{aligned}$$

are the linear dielectric constants of, respectively, the \(\varvec{\varOmega }_{\text {s}}\)-SSW and \(\varvec{\varOmega }_-\)-KAW decay waves. Meanwhile, again, \(b_k = k_\perp ^2 \rho _i^2/2\), \(\varGamma _k = I_0(b_k) \exp (-b_k)\), and, from Eq. (29), \(\sigma _k = 1 + \tau (1 - \varGamma _k)\). \(\chi _{A-}^{(2)}\), as will be further discussed later, corresponds to nonlinear ion Compton scattering:

$$\begin{aligned} \chi _{A-}^{(2)}= & \epsilon _{{\text {s}}k}\left( \lambda ^{2} / \varGamma _{{\text {s}}}\right) G\left| \varPhi _{0}\right| ^{2}, \end{aligned}$$
$$\begin{aligned} \lambda ^{2}= & \left( \varOmega _{i} / \omega _{0}\right) ^{2} \left( \rho _{{\text {s}}}^{2} \varLambda _0^{\text {s}}\right) ^2/\left( \sigma _{-} b_{{\text {s}}-}\right), \end{aligned}$$
$$\begin{aligned} G= & \left\langle J_{0}^{2} J_{-}^{2} F_{0i}\right\rangle _{v}/n_0-F_{1}^{2} / \varGamma _{{\text {s}}}, \end{aligned}$$


$$\begin{aligned} F_{1}=\left\langle J_{{\text {s}}} J_{0} J_{-} F_{0i} \right\rangle _{v}/n_0, \end{aligned}$$

where \(\rho _{\text {s}}^2 = \tau \rho _i^2\) and \(b_{{\text {s}}-} = \tau b_{-}\). Note that \(G\ge 0\) follows from the Schwarz inequality. \(C_{k}\) on the right-hand side of Eq. (52) represents the nonlinear coupling coefficient between the \(\varvec{\varOmega }_{\text {s}}\) and \(\varvec{\varOmega }_-\) daughter waves via the pump wave \(|\varPhi _0|\), and:
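The sign of \(G\) can also be checked numerically. The Python sketch below evaluates the Maxwellian velocity averages in \(G\) and \(F_1\) by simple quadrature, assuming the normalization \(\rho _i = v_{\text {ti}}/\varOmega _i\) with \(v_{\text {ti}}^2 = 2T_{\text {i}}/m_{\text {i}}\), so that \(\langle J_k^2 F_{0i} \rangle _v/n_0 = \varGamma _k\) with \(b_k = k_\perp ^2 \rho _i^2/2\). The helper names, the quadrature resolution, and the sample values of \(k_\perp \rho _i\) are illustrative assumptions only.

```python
import math


def bessel_j0(x, n=64):
    """J0(x) = (1/pi) * int_0^pi cos(x sin t) dt, midpoint rule (fine for |x| well below n)."""
    h = math.pi / n
    return sum(math.cos(x * math.sin((j + 0.5) * h)) for j in range(n)) / n


def maxwellian_average(func, u_max=6.0, n=600):
    """<func>_v / n0 over a Maxwellian: int_0^inf 2 u exp(-u^2) func(u) du, u = v_perp / v_ti."""
    h = u_max / n
    total = 0.0
    for j in range(n):
        u = (j + 0.5) * h
        total += 2.0 * u * math.exp(-u * u) * func(u)
    return total * h


def g_factor(a_s, a_0, a_m):
    """Return (G, Gamma_s, F1), where a_k = k_perp * rho_i for the s, 0 and - waves."""
    gamma_s = maxwellian_average(lambda u: bessel_j0(a_s * u) ** 2)
    f1 = maxwellian_average(
        lambda u: bessel_j0(a_s * u) * bessel_j0(a_0 * u) * bessel_j0(a_m * u)
    )
    j0jm = maxwellian_average(lambda u: (bessel_j0(a_0 * u) * bessel_j0(a_m * u)) ** 2)
    return j0jm - f1 * f1 / gamma_s, gamma_s, f1


# Hypothetical perpendicular wave-numbers (in units of 1/rho_i) obeying the triangle rule.
G, gamma_s, F1 = g_factor(a_s=1.0, a_0=0.5, a_m=1.2)
print(f"G = {G:.3e}, Gamma_s = {gamma_s:.5f}, F1 = {F1:.5f}")
```

Because the quadrature weights are positive, the discrete sums obey the same Cauchy–Schwarz bound as the integrals, so \(G \ge 0\) holds here up to rounding for any choice of the three wave-numbers.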

$$\begin{aligned} C_k= \left( \lambda H \right) ^2, \end{aligned}$$


$$\begin{aligned} H=\left( \sigma _{0} \sigma _{-}-F_{1} \sigma _{{\text {s}}} / \varGamma _{{\text {s}}}\right) . \end{aligned}$$

Furthermore, in the PDI dispersion relation, Eq. (52), we have dropped the term associated with nonlinear frequency shift to focus on the stability property (Chen and Zonca 2011).

Let us first consider the resonant decay, which occurs when both decay daughter waves, \(\varvec{\varOmega }_{\text {s}}\) and \(\varvec{\varOmega }_-\), are weakly damped normal modes. This generally requires \(\tau \equiv T_{\text {e}}/T_{\text {i}} \gtrsim 5\) (Hasegawa and Chen 1975, 1976) to minimize the ion-Landau damping of the \(\varvec{\varOmega }_{\text {s}}\) (SSW) mode. In this case, letting \(\omega _{\text {s}} = \omega _{{\text {s}}r} + i \gamma \) and noting that \(\epsilon _{{\text {s}}kr} (\omega _{{\text {s}}r}) = 0\) and \(\epsilon _{{\text {A}}k-r}(\omega _{{\text {A}}-r})=\epsilon _{{\text {A}}k-r}(\omega _{{\text {s}}r}-\omega _0)=0\), Eq. (52) reduces to:

$$\begin{aligned} \left( \gamma +\gamma _{{\text {dA}}-}\right) \left( \gamma +\gamma _{{\text {ds}}}\right) =\left( \lambda H\left| \varPhi _{0}\right| \right) ^{2}\left[ -\frac{\partial \epsilon _{{\text {s}}kr}}{\partial \omega _{{\text {s}} r}} \frac{\partial \epsilon _{{\text {A}}k-r}}{\partial \omega _{{\text {A}}-r}}\right] ^{-1}, \end{aligned}$$

where \(\gamma _{{\text {dA}}-}\) and \(\gamma _{{\text {ds}}}\) are, respectively, the linear damping rates of the KAW and SSW daughter waves. We also note that, to have parametric growth (\(\gamma >0\)), the square bracket term on the right-hand side of Eq. (61) must be positive, which can be shown to dictate \(\omega _{{\text {s}}r}\omega _0 > 0\). Thus, the KAW decay wave has its normal-mode real frequency lower than the KAW pump frequency, \(\omega _0\), by the amount of the SSW normal-mode frequency, \(\omega _{{\text {s}}r}\). Note that, for \(\beta \ll 1\), we have \(|\omega _0| \sim |k_{\parallel 0} v_{\text {A}}| \gg |k_{\parallel 0} c_{\text {S}}|\) and \(|\omega _{{\text {A}}-r}| \sim |k_{\parallel {\text {A}}-} v_{\text {A}}| \gg |k_{\parallel {\text {A}}-} c_{\text {S}}|\). Thus, to satisfy the frequency and wave-number matching conditions for the resonant decay with \(|\omega _{{\text {s}}r}| \sim |k_{\parallel {\text {s}}} c_{\text {S}}|\), we must have \(k_{\parallel {\text {A}}-} \simeq k_{\parallel 0}\) or, equivalently, \(k_{\parallel {\text {s}}} \simeq 2 k_{\parallel 0}\). Consequently, we have \((\omega _0/k_{\parallel 0}) (\omega _{{\text {A}}-r}/k_{\parallel {\text {A}}-})<0\); i.e., the decay KAW daughter wave, \(\varvec{\varOmega }_-\), has parallel (to \(\varvec{B}_0\)) group velocity opposite to that of the pump wave. In other words, \(\varvec{\varOmega }_-\) can be understood as a KAW due to backscattering of the \(\varvec{\varOmega }_0\) pump wave by \(\varvec{\varOmega }_{\text {s}}\) fluctuations. Finally, \(|\varPhi _0|\) must exceed a threshold value set by \(\gamma _{{\text {ds}}}\) and \(\gamma _{{\text {dA}}-}\) to achieve \(\gamma > 0\).
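The backscattering kinematics can be sketched with a few lines of Python. As a simplifying assumption that is ours and not part of the gyrokinetic derivation, both Alfvén branches are taken to obey \(\omega ^2 = k_\parallel ^2 v_{\text {A}}^2\) (finite \(k_\perp \rho _i\) corrections dropped) and the sound branch \(\omega _{\text {s}} = k_{\parallel {\text {s}}} c_{\text {S}}\); the function name and the sample values are hypothetical.

```python
def backscatter_kinematics(k_par0, v_a, c_s):
    """Solve the matching conditions for a backscattered Alfvenic daughter wave.

    Conventions follow the text: omega_- = omega_s - omega_0 and
    k_- = k_s - k_0, with omega_0 = k_par0 * v_a > 0.
    """
    omega0 = k_par0 * v_a
    # Requiring omega_- = -k_- * v_a (backscattered branch) gives k_s in closed form.
    k_s = 2.0 * k_par0 * v_a / (v_a + c_s)
    omega_s = k_s * c_s
    return {
        "omega0": omega0,
        "k_s": k_s,
        "omega_s": omega_s,
        "k_minus": k_s - k_par0,
        "omega_minus": omega_s - omega0,
    }


waves = backscatter_kinematics(k_par0=1.0, v_a=1.0, c_s=0.1)  # c_s << v_a for low beta
print(waves)
```

With \(c_{\text {S}} \ll v_{\text {A}}\), the sketch returns \(k_{\parallel {\text {s}}}\) slightly below \(2 k_{\parallel 0}\) and a daughter wave whose parallel phase velocity has the opposite sign to that of the pump.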

For \(\tau \equiv T_e/T_i \lesssim 5\), the \(\varvec{\varOmega }_{\text {s}}\) SSW mode is, in general, heavily ion-Landau damped; i.e., it becomes a quasi-mode. The \(\varvec{\varOmega }_-\) KAW mode, meanwhile, remains a weakly damped normal mode. The PDI growth rate, \(\gamma \), is then determined by the imaginary part of the dispersion relation [Eq. (52)]:

$$\begin{aligned} \left( \gamma +\gamma _{{\text {dA}}-}\right) \left( -\frac{\partial \epsilon _{{\text {A}}k-r}}{\partial \omega _{{\text {A}}-r}}\right)= & {\mathbb {I}}\mathrm{m} \left[ \chi _{{\text {A}}-}^{(2)} - \frac{C_{k}}{\epsilon _{{\text {s}}k}}\left| \varPhi _{0}\right| ^{2}\right] \nonumber \\= & \left| \lambda \varPhi _0\right| ^2 \left[ G/\varGamma _s + H^2/\left| \epsilon _{{\text {s}}k} \right| ^2\right] {\mathbb {I}}\mathrm{m} \epsilon _{{\text {s}}k}, \end{aligned}$$

where, again, \(G\ge 0\), H is given by Eq. (60),

$$\begin{aligned} {\mathbb {I}}\mathrm{m} \epsilon _{{\text {s}}k} = \tau \varGamma _{\text {s}} {\mathbb {I}}\mathrm{m} [\xi _{\text {s}} Z(\xi _{\text {s}})], \end{aligned}$$

and \(\xi _{\text {s}} = \omega _{{\text {s}}r}/|k_{\parallel {\text {s}}}| v_{{\text {ti}}} = (\omega _0+\omega _{{\text {A}}-r})/|k_{\parallel 0} + k_{\parallel {\text {A}}-}| v_{{\text {ti}}}\). Since \({\mathbb {I}}\mathrm{m} \epsilon _{{\text {s}}k}\) is maximized around \(\xi _{\text {s}} \sim {{\mathcal {O}}}(1)\), or \(\omega _{{\text {s}}r} \sim |k_{\parallel 0} + k_{\parallel {\text {A}}-}| v_{{\text {ti}}}\), and we have \(|\omega _0| \sim |\omega _{{\text {A}} - r}| \gg |k_{\parallel 0}| v_{{\text {ti}}} \sim |k_{\parallel {\text {A}}-}| v_{{\text {ti}}}\), the \(\varvec{\varOmega }_-\) KAW mode is, again, a backscattered KAW normal mode with frequency lower than the pump wave frequency \(\omega _0\).
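The statement that \({\mathbb {I}}\mathrm{m} \epsilon _{{\text {s}}k}\) peaks at \(\xi _{\text {s}} \sim {{\mathcal {O}}}(1)\) can be checked directly. For real \(\xi \), the plasma dispersion function has \({\mathbb {I}}\mathrm{m} Z(\xi ) = \sqrt{\pi } \exp (-\xi ^2)\), so that \({\mathbb {I}}\mathrm{m} [\xi Z(\xi )] = \sqrt{\pi }\, \xi \exp (-\xi ^2)\). The Python sketch below scans this expression on a grid of our choosing; it treats \(\xi _{\text {s}}\) as real, which is an approximation for a growing mode.

```python
import math


def im_xi_z(xi):
    """Im[xi Z(xi)] for real xi, using Im Z(xi) = sqrt(pi) exp(-xi^2)."""
    return math.sqrt(math.pi) * xi * math.exp(-xi * xi)


grid = [0.001 * j for j in range(1, 4001)]  # 0 < xi <= 4
xi_peak = max(grid, key=im_xi_z)
peak_value = im_xi_z(xi_peak)
print(f"peak near xi = {xi_peak:.3f}, Im[xi Z] = {peak_value:.3f}")
```

The maximum sits at \(\xi _{\text {s}} = 1/\sqrt{2}\), where the ion-Landau resonance is most effective, and the response falls off rapidly for \(\xi _{\text {s}} \gg 1\).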

Note, from Eqs. (61) and (62), that the parametric decay instability growth rates increase with the nonlinear coupling coefficient, \(\left| C_k |\varPhi _0|^2\right| \) of Eq. (52), which can be readily shown to scale with \(|k_\perp \rho _{\text {i}}|^4 |\delta \varvec{B}_{\perp 0}/B_0|^2\) for \(|k_\perp \rho _{\text {i}}|^2\ll 1\) and \(|\delta \varvec{B}_{\perp 0}/B_0|^2/|k_\perp \rho _{\text {i}}|\) for \(|k_\perp \rho _{\text {i}}|^2\gg 1\). The decay instabilities are, thus, strongest when \(|k_\perp \rho _{\text {i}}| \sim {{\mathcal {O}}}(1)\); and it clearly demonstrates the necessity of keeping FILR kinetic effects in dealing with the decay instabilities of KAW.

Finally, it is illuminating to compare the decay instabilities of KAWs versus those of SAWs in the MHD regime (Sagdeev and Galeev 1969). In a nutshell, employing the ideal MHD fluid theory, the PDI dispersion relation takes the form similar to the KAW PDI dispersion relation, Eq. (52), with KAW terms replaced by corresponding SAW terms; e.g., \(\epsilon _{{\text {A}}k-}\) by \(\epsilon _{{\text {A}}-}\) etc. The more fundamental change lies in the nonlinear coupling term; that is, \(C_k\) is replaced by \(C_{\text {I}}\) given as:

$$\begin{aligned} C_{\text {I}}= & \left| \varvec{k}_{\perp 0} \rho _s \cdot \varvec{k}_{\perp -} \rho _{\text {s}} \right| ^2/\left[ b_{{\text {s}} -} ( 1 + \varGamma _{\text {i}} T_{\text {i}}/T_{\text {e}})\right] \nonumber \\= & \left[ b_{{\text {s}} 0} / ( 1 + \varGamma _{\text {i}} T_{\text {i}}/T_{\text {e}})\right] \cos ^2 \theta _0. \end{aligned}$$

Here, \(\theta _0\) is the angle between \(\varvec{k}_{\perp 0}\) and \(\varvec{k}_{\perp -}\), and \(\varGamma _i\) is the ion ratio of specific heats. \(C_k\), meanwhile, can be expressed as:

$$\begin{aligned} C_k = (\varOmega _{\text {i}}/\omega _0)^2 (b_{{\text {s}} 0}/\sigma _-) H^2 \sin ^2 \theta _0. \end{aligned}$$

We then have:

$$\begin{aligned} |C_k|/|C_{\text {I}}| \sim {{\mathcal {O}}} ( | \varOmega _i H/ \omega _0|^2), \end{aligned}$$

which becomes, noting H given by Eq. (60):

$$\begin{aligned} |C_k|/|C_{\text {I}}| \sim {{\mathcal {O}}} ( | \varOmega _i/ \omega _0|^2) |k_\perp \rho _{\text {i}}|^4; \quad \mathrm{for} \quad |k_\perp \rho _{\text {i}}|^2\ll 1, \end{aligned}$$


$$\begin{aligned} |C_k|/|C_{\text {I}}| \sim {{\mathcal {O}}} ( | \varOmega _{\text {i}}/\omega _0|^2); \quad \mathrm{for} \quad |k_\perp \rho _{\text {i}}| \sim {{\mathcal {O}}}(1). \end{aligned}$$

Equation (67) indicates that, for \(1> |k_\perp \rho _{\text {i}}|^2\ > |\omega _0/\varOmega _{\text {i}}|\), nonlinear couplings via kinetic effects dominate. Noting that \(|\omega _0/\varOmega _{\text {i}}| \sim {{\mathcal {O}}}(10^{-3})\) in typical laboratory plasmas, the validity regime of MHD fluid theory for the SAW nonlinear physics is rather limited. Furthermore, at the \(|k_\perp \rho _{\text {i}}| \sim {{\mathcal {O}}}(1)\) regime where KAW nonlinear effects maximize, we have \(|H| \sim {{\mathcal {O}}}(1)\) and \(|C_k|/|C_{\text {I}}| \sim {{\mathcal {O}}} ( | \varOmega _{\text {i}}/\omega _0|^2) \sim {{\mathcal {O}}}(10^{6})\) for typical parameters.
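These order-of-magnitude statements are easy to reproduce. The Python sketch below encodes the scalings of Eqs. (67) and (68) with all order-unity coefficients dropped; the function names and the piecewise switch at \(|k_\perp \rho _{\text {i}}| = 1\) are our simplifications, and the sketch is only meant for \(|k_\perp \rho _{\text {i}}| \lesssim 1\).

```python
import math


def coupling_ratio(k_perp_rho_i, omega0_over_omega_ci):
    """Order-of-magnitude |C_k| / |C_I|, with order-unity factors dropped."""
    base = 1.0 / omega0_over_omega_ci ** 2
    if k_perp_rho_i < 1.0:
        return base * k_perp_rho_i ** 4  # long-wavelength scaling, Eq. (67)
    return base                          # |k_perp rho_i| ~ O(1), Eq. (68)


def kinetic_onset(omega0_over_omega_ci):
    """Value of k_perp * rho_i above which the ratio exceeds unity."""
    return math.sqrt(omega0_over_omega_ci)


freq_ratio = 1.0e-3  # typical laboratory value of |omega_0 / Omega_i| quoted in the text
onset = kinetic_onset(freq_ratio)
print(f"kinetic coupling dominates for k_perp rho_i > {onset:.3f}")
print(f"ratio at k_perp rho_i ~ 1: {coupling_ratio(1.0, freq_ratio):.1e}")
```

For \(|\omega _0/\varOmega _{\text {i}}| \sim 10^{-3}\), the kinetic coupling already dominates for \(|k_\perp \rho _{\text {i}}|\) of a few percent, which is the sense in which the MHD validity regime is rather limited.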

In addition to the significantly enhanced PDI growth rates, there is, perhaps, a more significant qualitative difference between KAW and SAW PDI in terms of the wave-vector of the scattered daughter wave with respect to that of the pump wave. Note, from Eq. (64), \(C_{\text {I}} \propto \cos ^2 \theta _0\) and, thus, the SAW scattering maximizes around \(\theta _0 = 0\) and \(\pi \); i.e., when \(\varvec{k}_{\perp -}\) is parallel or anti-parallel to \(\varvec{k}_{\perp 0}\); or \(\varvec{k}_{0}\) and \(\varvec{k}_{-}\) are co-planar. In contrast, we have, from Eq. (65), \(C_k \propto \sin ^2 \theta _0\) and, thus, the KAW scattering maximizes around \(\theta _0 = \pm \pi /2\); i.e., when \(\varvec{k}_{\perp 0}\) and \(\varvec{k}_{\perp -}\) are orthogonal. This difference not only affects, as might be expected, the nonlinear evolution of KAW turbulence, but also, as we will argue further below and perhaps more significantly, the charged particle transport induced by the KAW decay processes.

Let us take the pump wave to be the mode-converted KAW at the Earth’s dayside magnetopause; thus, \(\varvec{k}_{\perp 0} = k_{\perp 0} \hat{\varvec{r}}\) with \(\hat{\varvec{r}}\) being in the Sun–Earth radial direction. Now, according to the ideal MHD theory, the decay wave tends to have \(\varvec{k}_{\perp -} = k_{\perp -} \hat{\varvec{r}}\) and, thus, the East–West azimuthal symmetry is in general kept. In other words, the charged particle’s East–West azimuthal generalized momentum, \(P_\phi \), is conserved, which implies little or no radial transport (Chen 1999). On the other hand, in the KAW regime, the decay wave would have its wave-vector in the East–West azimuthal direction; i.e., \(\varvec{k}_{\perp -} = k_{\perp -} \hat{\varvec{\phi }}\). Hence, the East–West azimuthal symmetry is broken by the daughter wave and, consequently, \(P_\phi \) is no longer conserved and finite radial transport could occur (Chen 1999). These features are observed in the numerical simulations to be discussed below. In addition, the MHD fluid theory would suggest that the turbulence in the plane perpendicular to \(\varvec{B}_0\) is preferentially anisotropic in the \(\hat{\varvec{r}}\) direction, while KAW turbulence would tend to be more isotropic.

Insight into the above qualitative and quantitative transitions in the nonlinear coupling coefficient between the long-wavelength MHD fluid and the short-wavelength KAW regimes can also be gained by examining the responsible nonlinear coupling mechanisms. More specifically, while in the MHD regime, slow-sound fluctuations are nonlinearly generated by the \((\delta \varvec{J}_\perp \times \delta \varvec{B}_\perp )\cdot \varvec{b}_0/c\) parallel (to \(\varvec{B}_0\)) force; in the KAW regime, the nonlinear force is due to the \(m_i n_i (\delta \varvec{u} \cdot {\varvec{\nabla }}) \delta u_\parallel \) convective nonlinear term. Similarly, while in the MHD regime, scatterings of the SAW by the slow-sound fluctuations occur via the \(\delta n_s (\partial \delta \varvec{u}_0/\partial t)\) nonlinear ion density modulation; scatterings of the KAW occur, again, via the \(n_i (\delta \varvec{u} \cdot {\varvec{\nabla }}) \delta \varvec{u}_0\) convective nonlinearity.

Numerical simulations of the linear mode conversion of KAWs and the ensuing nonlinear wave generation were carried out by Lin et al. (2012) using a three-dimensional hybrid model, in which ions are treated as fully kinetic particles and electrons are treated as a massless fluid. Readers are referred to the original work for details. Here, we summarize and discuss the essentials. Specifically, consider a slab plasma with \(\varvec{B}_0 = B_0 \hat{\varvec{z}}\) and non-uniformities in the x (radial) direction. Simulations demonstrated that an incoming fast compressional Alfvén wave is mode converted into a short-wavelength KAW with \(|k_x \rho _i| \sim {{\mathcal {O}}}(1)\) localized about the Alfvén resonance point. This mode-converted KAW then serves as a pump KAW and nonlinearly excites secondary KAWs with, preferentially, short azimuthal wavelengths; i.e., \(|k_\perp \rho _i| \sim |k_y \rho _{\text {i}}| \sim {{\mathcal {O}}}(1)\). To analyze the nonlinear wave generation mechanism in more detail, Lin et al. (2012) further carried out dedicated simulations with a prescribed pump KAW in a uniform plasma. The resultant \((k_\parallel , \omega )\) spectra of \(\delta B_x\) and \(\delta E_\parallel \) are shown in Fig. 10.

Fig. 10

\(k_\parallel \)-\(\omega \) spectra of \(\delta B_x\) and \(\delta E_\parallel \) obtained from the simulation of decay of an initial pump KAW in a uniform plasma. The solid black line indicates the dispersion relation of the MHD shear Alfvén mode for reference. Multiples of the parallel pump KAW wave-number, \(k_{\parallel 0}\) (indicated here as \(k_{\parallel {\text {p}}}\), consistent with the original figure), are also shown [reproduced from Ref. Lin et al. (2012)]

In the right plot of \(\delta E_\parallel \), we can see the pump KAW at \(\varvec{\varOmega }_0 = (k_{\parallel 0} = 0.2, \omega _0 = 0.6)\) and the slow-sound wave at \(\varvec{\varOmega }_{\text {s}} = (k_{\parallel {\text {s}}} \simeq 2 k_{\parallel 0} = 0.4, \omega _{\text {s}} = 0.2)\). Correspondingly, in the left plot of \(\delta B_x\), we see the backscattered decay KAW with \(\varvec{\varOmega }_- = (k_{\parallel -} = - 0.2, \omega _- = 0.4)\). Note that, since \(T_{\text {e}}/T_{\text {i}} \simeq 0.4\), the slow-sound wave, \(\varvec{\varOmega }_{\text {s}}\), is a heavily ion-Landau damped quasi-mode, and the PDI corresponds to the nonlinear ion-induced scattering. Both the \(\varvec{\varOmega }_-\) and \(\varvec{\varOmega }_{\text {s}}\) modes have preferentially short wavelengths in the \(\hat{\varvec{y}}\) direction; i.e., \(|k_{\perp {\text {s}}} \rho _{\text {i}}| \sim |k_{y {\text {s}}} \rho _{\text {i}}| \sim |k_{y -} \rho _{\text {i}}| \sim {{\mathcal {O}}} (1)\). The simulation results are, thus, consistent with the analytical theories discussed above.
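The quoted spectral peaks can be checked against the matching conditions with a few lines of Python. The sketch uses the positive-frequency labeling of the figure, in which the pump equals the sum of the two daughter waves, and takes the backscattered KAW frequency to be 0.4, as the frequency matching requires; the dictionary layout and the helper name `mismatch` are ours, and the numbers are in the normalized units of the figure.

```python
pump = {"k_par": 0.2, "omega": 0.6}
sound = {"k_par": 0.4, "omega": 0.2}
daughter = {"k_par": -0.2, "omega": 0.4}


def mismatch(parent, wave_a, wave_b):
    """Return (frequency mismatch, parallel wave-number mismatch) of a three-wave triad."""
    d_omega = parent["omega"] - wave_a["omega"] - wave_b["omega"]
    d_k = parent["k_par"] - wave_a["k_par"] - wave_b["k_par"]
    return d_omega, d_k


d_omega, d_k = mismatch(pump, sound, daughter)
pump_phase_velocity = pump["omega"] / pump["k_par"]
daughter_phase_velocity = daughter["omega"] / daughter["k_par"]
print(d_omega, d_k, pump_phase_velocity, daughter_phase_velocity)
```

Both mismatches vanish to rounding error, and the opposite signs of the two parallel phase velocities identify the daughter KAW as backscattered.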

As noted by Lin et al. (2012), the \(\delta B_x\) spectrum also showed excitations around \((k_\parallel \approx 0, \omega \approx 0)\) with \(|k_{y} \rho _{\text {i}}| \sim {{\mathcal {O}}} (1)\) short wavelengths; which, as suggested, correspond to magnetostatic convective cells (Chu et al. 1978). The nonlinear excitations of convective cells also explain the appearance of \(\delta B_x\) fluctuations at \(k_{\parallel 0}\) and \(\omega _0\), since the pump KAW with \(k_{y0} \simeq 0\) has \(\delta B_{x 0} \simeq 0\). These interesting simulation results, thus, naturally lead to the following gyrokinetic analytic theory on excitations of convective cells via the modulational instabilities of a KAW pump wave.

3.3 Nonlinear excitations of convective cells

Convective cells have been of theoretical interest since the 1970s (Chu et al. 1978; Lin et al. 1978; Okuda and Dawson 1973; Taylor and McNamara 1971), since they lead to vortex dynamics perpendicular to the confining magnetic field and, consequently, carry significant implications for the cross-field transport (Sagdeev et al. 1978). Historically, convective cells have been classified into two categories, the electrostatic convective cells (ESCC) with \(\delta \varvec{E} = \delta \varvec{E}_\perp \) (Okuda and Dawson 1973; Taylor and McNamara 1971) and the magnetostatic convective cells (MSCC) with \(\delta \varvec{B} = \delta \varvec{B}_\perp \) (Chu et al. 1978). In recent years, there has been renewed interest in convective cells since they may be regarded as paradigms of the so-called zonal structures in laboratory fusion plasmas (Chen and Zonca 2013; Zonca et al. 2015). Zonal structures are fluctuations with \(\varvec{k} \cdot \varvec{B}_0 = 0\) and varying only in the radial direction (Hasegawa et al. 1979). Zonal structures may have frequencies either around \(\omega = 0\) or a finite frequency (i.e., the so-called geodesic acoustic mode (Winsor et al. 1968)). The \(\omega =0\) zero-frequency zonal structures could be either zonal flow or zonal field/current; corresponding, respectively, to ESCC and MSCC. In this respect, zonal structures may be regarded as a subset of convective cells.

Since convective cells have \(\omega \approx 0\), they are nominally damped by viscosity and/or resistivity and, thus, generally require nonlinear excitation to achieve finite intensities. In laboratory fusion plasmas, nonlinear excitations of convective cells (i.e., zonal structures) usually occur via mode–mode couplings of ambient drift-wave and/or Alfvén-wave instabilities. In this respect, zonal structures may be regarded as spontaneous growth of corrugations of the radial equilibrium profiles, which, in turn, scatter the ambient instabilities into the radially short-wavelength stable domain. Zonal structures, therefore, provide self-regulatory mechanisms for the ambient turbulence and the associated transport. We refer readers interested in this important topic to the recent review (Chen and Zonca 2016).

In the present review, we will focus on nonlinear excitations of convective cells by KAWs in uniform plasmas to explore in sufficient detail the underlying physics mechanisms. Since convective cells have \(\varvec{k} \cdot \varvec{B}_0 = 0\), their nonlinear excitations involve couplings between co-propagating SAWs with the same \(k_\parallel \), which vanish in the ideal MHD limit due to the cancelation between the Reynolds and Maxwell stresses; i.e., the pure Alfvénic state (Alfvén 1942, 1950; Walén 1944). It has, thus, long been recognized that only non-ideal MHD fluctuations, such as KAWs, can nonlinearly excite convective cells (Chen and Zonca 2013; Mikhailovskii et al. 2007; Onishchenko et al. 2004a, b; Pokhotelov et al. 2003, 2004; Zhao et al. 2011). Furthermore, since convective cells have \(\omega =0\), it is also recognized that their spontaneous excitation by KAWs takes the form of modulational instabilities; that is, deviation from the wave periodic behavior is further reinforced by nonlinearity, which may lead to spectral sidebands and possibly to breaking of the periodic fluctuation into modulated pulses (Benjamin and Feir 1967; Chen and Zonca 2013; Zonca et al. 2015). Previous theoretical studies, however, suffer from two limiting considerations: (1) employing two-fluid or drift-kinetic descriptions, and (2) assuming that ESCC and MSCC are decoupled. By (1), effects due to finite ion-Larmor radii (FILR) are ignored. Both limiting considerations have been adopted to simplify the theoretical analysis and, as will be shown here, lead to erroneous conclusions on the nonlinear excitation mechanisms. Here, we will employ the nonlinear gyrokinetic equation and demonstrate that both the FILR and the finite coupling between ESCC and MSCC play qualitatively crucial roles in the dynamics of the modulational excitations of convective cells. Only key points of the theoretical analysis and results will be highlighted here. Readers are referred to the original works for details.

We consider a uniform Maxwellian plasma immersed in a confining magnetic field, \(\varvec{B}_0 = B_0 \hat{\varvec{z}}\). Furthermore, we assume \(1 \gg \beta _{\text {e}}, \beta _{\text {i}} \gg m_{\text {e}}/m_{\text {i}}\) and ignore the compressional Alfvén wave; i.e., \(\delta B_\parallel \approx 0\). Denoting \(\varvec{\varOmega }_0 = (\omega _0, \varvec{k}_0)\) as the finite-amplitude pump KAW and \(\varvec{\varOmega }_z = (\omega _z, \varvec{k}_z)\) as the convective cell (CC) mode, four-wave modulational instability then involves couplings with the upper and lower KAW sidebands denoted, respectively, as \(\varvec{\varOmega }_+ = (\omega _+=\omega _z+\omega _0, \varvec{k}_+ = \varvec{k}_z + \varvec{k}_0)\) and \(\varvec{\varOmega }_- = (\omega _-=\omega _z-\omega _0, \varvec{k}_- = \varvec{k}_z - \varvec{k}_0)\). With compressional Alfvén wave suppressed due to frequency separation, the field variables are \(\delta \phi _{\varvec{k}}\) and \(\delta A_{\parallel \varvec{k}}\) with \(\varvec{k} = 0, \, z, \, \pm \) corresponding to the \(\varvec{\varOmega }_0\), \(\varvec{\varOmega }_z\) and \(\varvec{\varOmega }_\pm \) fluctuations. The governing equations, meanwhile, are the nonlinear gyrokinetic equation, Eq. (45), the quasi-neutrality condition, Eq. (48), and the nonlinear gyrokinetic vorticity equation, Eq. (49).

Carrying out the standard perturbative analysis to \({{\mathcal {O}}}(|\delta \phi _0|^2)\), we then derive, after some straightforward but lengthy algebra (Zonca et al. 2015), the following coupled equations between \(\delta \phi _z\) and \(\delta \psi _z \equiv \omega _0 \delta A_{\parallel z}/(k_{\parallel 0}c)\):

$$\begin{aligned} \left[ \gamma _z^2 + \frac{\varDelta ^2}{1 + \varDelta /\omega _0} \right] \delta \phi _z= & - \alpha _\phi (\delta \phi _z - \delta \psi _z) + \beta _\phi \delta \psi _z, \nonumber \\ \left[ \gamma _z^2 + \frac{\varDelta ^2}{1 + \varDelta /\omega _0} \right] \delta \psi _z= & - \alpha _\psi (\delta \phi _z - \delta \psi _z) + \beta _\psi \delta \psi _z. \end{aligned}$$

Here, we have let \(\omega _z = i \gamma _z\) and \(\varvec{k}_z \perp \varvec{k}_0\) to maximize the nonlinear couplings:

$$\begin{aligned} \frac{\varDelta }{\omega _0} = \frac{b_+ \sigma _+ (1 - \varGamma _0) - b_0 \sigma _0 (1 - \varGamma _+)}{2 b_0 \sigma _0 (1 - \varGamma _+)} \end{aligned}$$

is the frequency mismatch between the normal-mode frequency of KAW at \(\varvec{k}_\pm \) and \(\omega _0\), \(\varDelta > 0\), and we have noted that \(b_-=b_+\) as well as \(\varGamma _- = \varGamma _+\) for \(\varvec{k}_z \cdot \varvec{k}_0 =0\), and applied the KAW dispersion relation, Eq. (30). Furthermore:

$$\begin{aligned} \alpha _\phi= & \left| \frac{c}{B_0} k_z k_{\perp 0} \delta \phi _0 \right| ^2 \frac{1}{1 - \varGamma _+} \left[ \frac{\varGamma _0 - \varGamma _+}{1-\varGamma _z} \left( \varGamma _0 - \varGamma _z \right. \right. \nonumber \\&\left. \left. - \frac{b_+}{b_0} \frac{1- \varGamma _0}{1 + \varDelta /\omega _0} \right) + \frac{b_z (1 - \varGamma _0)}{b_0(1-\varGamma _z)} \left( \left( 1 - \varGamma _+ \right) \sigma _0 \right. \right. \nonumber \\&\left. \left. - \frac{\left( \varGamma _0 - \varGamma _z \right) \sigma _+}{1 + \varDelta /\omega _0} \right) \right], \end{aligned}$$
$$\begin{aligned} \beta _\phi= & \left| \frac{c}{B_0} k_z k_{\perp 0} \delta \phi _0 \right| ^2 \frac{1}{1 - \varGamma _+} \left[ \frac{b_z (1 - \varGamma _0)}{b_0(1-\varGamma _z)} \frac{\sigma _+}{ 1 + \varDelta /\omega _0} \right. \nonumber \\&\left. - \frac{\varGamma _0 - \varGamma _+}{1-\varGamma _z} \right] \left( 1 - \varGamma _z - b_z \frac{1- \varGamma _0}{b_0} \right) , \end{aligned}$$
$$\begin{aligned} \alpha _\psi= & \left| \frac{c}{B_0} k_z k_{\perp 0} \delta \phi _0 \right| ^2 \frac{\sigma _0}{1 - \varGamma _+} \frac{\varDelta /\omega _0}{1 + \varDelta /\omega _0} \nonumber \\&\times \left[ \left( 1 - \varGamma _+ \right) \sigma _0 - \left( \varGamma _0 - \varGamma _z \right) \sigma _+ \right] , \end{aligned}$$


$$\begin{aligned} \beta _\psi= & \left| \frac{c}{B_0} k_z k_{\perp 0} \delta \phi _0 \right| ^2 \frac{\sigma _0}{1 - \varGamma _+} \frac{\varDelta /\omega _0}{1 + \varDelta /\omega _0} \nonumber \\&\times \sigma _+ \left( 1 - \varGamma _z - b_z \frac{1- \varGamma _0}{b_0} \right) . \end{aligned}$$

Equation (69) clearly indicates that \(\delta \phi _z\) (ESCC) and \(\delta \psi _z \propto \delta A_{\parallel z}\) (MSCC) are intrinsically coupled. From Eq. (69), one readily obtains the following modulational instability dispersion relation for the spontaneous excitations of the CCs by the \(\varvec{\varOmega }_0\) pump KAW:

$$\begin{aligned} Y^2 + Y (\alpha _\phi - \alpha _\psi - \beta _\psi ) + (\alpha _\psi \beta _\phi - \alpha _\phi \beta _\psi ) = 0, \end{aligned}$$


$$\begin{aligned} Y = \gamma _z^2 + \varDelta ^2/(1 + \varDelta /\omega _0). \end{aligned}$$
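As a hedged illustration of how Eq. (75) and the accompanying definition of \(Y\) might be evaluated once the four coupling coefficients are known, the following minimal Python sketch solves the quadratic for \(Y\), converts each root to \(\gamma _z^2\), and recovers the polarization ratio \(\delta \psi _z/\delta \phi _z\) from the first row of Eq. (69). The function name, the argument names and the numerical inputs are illustrative assumptions; they are not values taken from the original works.

```python
import cmath


def solve_cc_dispersion(alpha_phi, alpha_psi, beta_phi, beta_psi, delta, omega0):
    """Return both roots of Eq. (75) with gamma_z^2 and delta_psi_z/delta_phi_z.

    All arguments are plain numbers in consistent units. They are placeholders
    supplied by the caller, not coefficients computed from the full expressions
    for alpha and beta given in the text.
    """
    # Eq. (75): Y^2 + p*Y + q = 0
    p = alpha_phi - alpha_psi - beta_psi
    q = alpha_psi * beta_phi - alpha_phi * beta_psi
    disc = cmath.sqrt(p * p - 4.0 * q)

    # Definition of Y: Y = gamma_z^2 + Delta^2 / (1 + Delta/omega0)
    shift = delta**2 / (1.0 + delta / omega0)

    branches = []
    for y in ((-p + disc) / 2.0, (-p - disc) / 2.0):
        # First row of Eq. (69): (Y + alpha_phi) dphi_z = (alpha_phi + beta_phi) dpsi_z
        # (this sketch assumes alpha_phi + beta_phi is nonzero)
        ratio = (y + alpha_phi) / (alpha_phi + beta_phi)
        branches.append({"Y": y, "gamma_z_sq": y - shift, "psi_over_phi": ratio})
    return branches


# Made-up coefficients, chosen only to exercise the algebra.
branches = solve_cc_dispersion(
    alpha_phi=1.0, alpha_psi=0.5, beta_phi=-2.0, beta_psi=1.0, delta=0.5, omega0=1.0
)
for branch in branches:
    print(branch)
```

A root is unstable when the real part of `gamma_z_sq` is positive; complex arithmetic is used so that a negative discriminant does not raise an error.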

Equation (75), in general, needs to be solved numerically and the numerical results will be presented later (cf. also Appendix 1 for further details). It is, however, instructive to examine the stability properties in two limiting cases. First, let us consider the long wavelength limit, where \(|b_{\varvec{k}}| \ll 1\). Straightforward algebra then readily shows that the unstable (or least stable) branch of the modulational instability dispersion relation, Eq. (75), is given by:

$$\begin{aligned} Y\simeq & {} \gamma _z^2 + \left[ \left( \frac{3}{4} + \tau \right) \frac{b_z}{2} \omega _0\right] ^2 \simeq \frac{(\alpha _\psi \beta _\phi - \alpha _\phi \beta _\psi )}{(\alpha _\psi - \alpha _\phi )} \nonumber \\\simeq & {} \frac{3}{32} \left| \frac{c}{B_0} k_z k_{\perp 0} \delta \phi _0 \right| ^2 \left( \frac{3}{4} + \tau \right) ^2 b_z^3 \left( 1 - \frac{b_z^2}{b_0^2} \right), \end{aligned}$$

where we have applied the \(|b_{\varvec{k}}| \ll 1\) limits of Eqs. (71)–(74). Equation (77) indicates that a necessary condition for instability is \(b_z < b_0\) and that the corresponding threshold condition is:

$$\begin{aligned} \left| \frac{\delta B_{\perp 0}}{B_0} \right| _{th}^2 = \frac{8}{3} \left( \frac{b_0^2/b_z^2}{1 - b_z^2/b_0^2}\right) \frac{k_{\parallel 0}^2 \rho _{\text {i}}^2}{b_0^2} . \end{aligned}$$

Here, we have noted \(\delta \phi _0 = \delta \psi _0/\sigma _0 \simeq \omega _0 \delta A_{\parallel 0}/(k_{\parallel 0} c)\), \(k_{\perp 0} \delta A_{\parallel 0} = \delta B_{\perp 0}\), and expressed the amplitude in terms of \(\delta B_{\perp 0}\), which is more convenient for comparisons with simulations. Equation (78) indicates that, for \(|k_\perp \rho _{\text {i}}|^2 \ll 1\), \(|\delta B_{\perp 0}/B_0|_{th}\) rapidly increases as \(|k_{\perp 0} \rho _i|^{-2}\) and, hence, finite \(|k_\perp \rho _{\text {i}}|\) effects are necessary for the instability to set in. Well above the threshold condition, we have:

$$\begin{aligned} \left( \frac{\gamma }{\varOmega _{\text {i}}} \right) \simeq \frac{\sqrt{3}}{4 \beta _{\text {i}}^{1/2}} \left( \frac{3}{4} + \tau \right) b_z^2 \sqrt{1-\frac{b_z^2}{b_0^2}} \left| \frac{\delta B_{\perp 0}}{B_0} \right| . \end{aligned}$$

Furthermore, since

$$\begin{aligned} |\delta \psi _z|/|\delta \phi _z| \simeq 1, \end{aligned}$$

ESCC and MSCC are, indeed, strongly coupled, and arbitrary de-coupling assumptions could lead to erroneous conclusions on the stability. It is readily seen that the threshold in Eq. (78) is minimized for \(b_0 = \sqrt{2} b_z\), which yields \(|\delta B_{\perp 0}/B_0|^2_{th, \mathrm {min}} = (32/3) k_{\parallel 0}^2 \rho _i^2/b_0^2\). Thus, \(b_0^2 = (32/3) k_{\parallel 0}^2 \rho _{\text {i}}^2/|\delta B_{\perp 0}/B_0|^2_{th, \mathrm {min}}\ll 1\) for effective mode excitation, which is hard to meet at long wavelength. For this reason, in the original works on CC nonlinear excitation by KAW via modulational instability (Chen and Zonca 2013; Zonca et al. 2015), it was noted that nonlinear excitations of convective cells by KAW are always suppressed in the long-wavelength limit, although only the \(Y \simeq (\alpha _\psi - \alpha _\phi ) < 0\) root was discussed therein and in the recent review on this subject (Chen and Zonca 2016).
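Two statements of the long-wavelength analysis can be checked numerically with a short sketch: that the frequency mismatch reduces to \(\varDelta /\omega _0 \simeq (3/4 + \tau ) b_z/2\) for \(|b_{\varvec{k}}| \ll 1\), and that the threshold of Eq. (78) is smallest at \(b_0 = \sqrt{2} b_z\). The sketch below assumes the definitions \(\varGamma _k = I_0(b_k) e^{-b_k}\) and \(\sigma _k = 1 + \tau (1 - \varGamma _k)\), which we take to be those used earlier in this review; all function names and parameter values are hypothetical and serve only as an illustration.

```python
import math


def one_minus_gamma0(b, n=400):
    """1 - Gamma_0(b), with Gamma_0(b) = I_0(b) exp(-b) (assumed definition).

    Uses I_0(b) exp(-b) = (1/pi) * integral_0^pi exp(b (cos t - 1)) dt,
    evaluated with a midpoint rule; expm1 avoids cancellation at small b.
    """
    h = math.pi / n
    return -sum(math.expm1(b * (math.cos((j + 0.5) * h) - 1.0)) for j in range(n)) / n


def kaw_factor(b, tau):
    """sigma_k b_k / (1 - Gamma_k), the combination appearing in Delta/omega_0."""
    omg = one_minus_gamma0(b)
    return (1.0 + tau * omg) * b / omg


def mismatch(b0, bz, tau):
    """Delta/omega_0 for k_z perpendicular to k_0, so that b_+ = b_0 + b_z."""
    return 0.5 * (kaw_factor(b0 + bz, tau) / kaw_factor(b0, tau) - 1.0)


def threshold_sq(b0, bz, kpar_rho):
    """|delta B_perp0 / B_0|^2 at threshold, Eq. (78); requires b_z < b_0."""
    return (8.0 / 3.0) * (b0**2 / bz**2) / (1.0 - bz**2 / b0**2) * kpar_rho**2 / b0**2


tau, kpar_rho = 1.0, 0.02  # illustrative values

# (i) long-wavelength limit of the frequency mismatch
exact = mismatch(0.01, 0.005, tau)
approx = (0.75 + tau) * 0.005 / 2.0
print("Delta/omega0:", exact, "vs small-b estimate:", approx)

# (ii) scan b_z at fixed b_0 to locate the minimum threshold
b0 = 0.1
bz_scan = [b0 * (0.05 + 0.9 * i / 200) for i in range(201)]
bz_best = min(bz_scan, key=lambda bz: threshold_sq(b0, bz, kpar_rho))
print("b_z/b_0 at minimum:", bz_best / b0)
print("minimum threshold squared:", threshold_sq(b0, bz_best, kpar_rho))
```

With these illustrative inputs, the minimum squared threshold is about 0.43, that is, an order-unity pump amplitude, which is one way to see why the condition is hard to meet at long wavelength.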

The other limit is the short-wavelength limit; i.e., \(|b_{\varvec{k}}|\gg 1\), where FILR effects manifest themselves distinctly. Taking this limit and, to further simplify the analysis, assuming \(b_z \ll b_0\), we can readily show that Eq. (75) yields the following unstable solution:

$$\begin{aligned} Y \simeq \gamma _z^2 + \omega _0^2 \left( \frac{b_z}{2b_0}\right) ^2 \simeq \tau \alpha _\phi , \end{aligned}$$


$$\begin{aligned} \alpha _\phi \simeq \left| (c/B_0) k_z k_{\perp 0} \delta \phi _0 \right| ^2 (b_z/b_0) (1+\tau ). \end{aligned}$$

Equation (81) indicates that, in this \(|b_{\varvec{k}}| \gg 1\) short-wavelength limit, convective cells can be modulationally excited when the pump KAW amplitude exceeds the following threshold value, noting \(\delta \phi _0 \simeq \delta \psi _0/(1+\tau )\):

$$\begin{aligned} \left| \frac{\delta B_{\perp 0}}{B_0} \right| _{th}^{2} = \frac{(1+\tau )}{4 \tau } \left( \frac{k_{\parallel 0}}{k_{\perp 0}}\right) ^2. \end{aligned}$$
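
The threshold above can be retraced from Eq. (81) in a few lines. The sketch assumes, as stated in the text, \(\delta \phi _0 \simeq \delta \psi _0/(1+\tau )\) with \(\delta \psi _0 = \omega _0 \delta A_{\parallel 0}/(k_{\parallel 0} c)\) and \(k_{\perp 0} \delta A_{\parallel 0} = \delta B_{\perp 0}\), together with \(b_z/b_0 = k_z^2/k_{\perp 0}^2\). The pump factor then becomes:

$$\begin{aligned} \left| \frac{c}{B_0} k_z k_{\perp 0} \delta \phi _0 \right| ^2 \simeq \left( \frac{k_z \omega _0}{k_{\parallel 0} (1+\tau )} \right) ^2 \left| \frac{\delta B_{\perp 0}}{B_0} \right| ^2 , \end{aligned}$$

and setting \(\gamma _z = 0\) in Eq. (81), i.e., \(\tau \alpha _\phi = \omega _0^2 (b_z/2b_0)^2\), gives:

$$\begin{aligned} \frac{\tau }{1+\tau } \frac{b_z}{b_0} \frac{k_z^2}{k_{\parallel 0}^2} \left| \frac{\delta B_{\perp 0}}{B_0} \right| _{th}^{2} = \frac{b_z^2}{4 b_0^2} \quad \Longrightarrow \quad \left| \frac{\delta B_{\perp 0}}{B_0} \right| _{th}^{2} = \frac{(1+\tau )}{4 \tau } \frac{b_z}{b_0} \frac{k_{\parallel 0}^2}{k_z^2} = \frac{(1+\tau )}{4 \tau } \left( \frac{k_{\parallel 0}}{k_{\perp 0}}\right) ^2 . \end{aligned}$$

As a purely illustrative number, not taken from the original works, \(\tau = 1\), \(k_{\parallel 0} \rho _{\text {i}} = 0.02\) and \(k_{\perp 0} \rho _{\text {i}} = 2\) give \(|\delta B_{\perp 0}/B_0|_{th} \simeq 7 \times 10^{-3}\).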

Well above the threshold value, we have:

$$\begin{aligned} \left| \frac{\gamma _z}{\omega _0} \right| \simeq \left( \frac{\tau }{1+\tau }\right) ^{1/2} \left| \frac{k_{\perp 0}}{k_{\parallel 0}} \right| \left( \frac{b_z}{b_0}\right) ^{3/2} \left| \frac{\delta B_{\perp 0}}{B_0} \right| . \end{aligned}$$

Meanwhile, ESCC and MSCC remain strongly coupled:

$$\begin{aligned} |\delta \psi _z|/|\delta \phi _z| \simeq (1 + \tau )/2. \end{aligned}$$

We emphasize that the above two limiting analyses clearly demonstrate that finite \(|k_\perp \rho _{\text {i}}|\) effects are necessary for the modulational excitations of convective cells, and that ESCC and MSCC are intrinsically coupled. Taking \(\varvec{k}_{\perp 0} = \hat{\varvec{x}} k_x\), \(\varvec{k}_{\perp z} = \hat{\varvec{y}} k_y\) and \(\delta \varvec{B}_{\perp 0} = \hat{\varvec{y}} \delta B_y \sin (\omega _0 t - k_x x - k_{\parallel 0} z )\), we shall assume \(\delta B_y/B_0 = 2 \delta B_{\perp 0}/B_0\) in the comparison of numerical simulation results with theoretical predictions discussed above. The complete dispersion relation is numerically solved in the \((k_x \rho _{\text {i}}, k_y \rho _{\text {i}})\)-plane for fixed \(k_{\parallel 0} \rho _i = 0.02\), \(\tau = 1\) and \(\beta _e = \beta _{\text {i}} = 0.2\) and different values of \(\delta B_y/B_0\).

Fig. 11

Marginal stability curves in the \((k_x \rho _{\text {i}}, k_y \rho _i)\)-plane as a function of the pump KAW amplitude \(\delta B_y/B_0\). Fixed parameters are \(k_{\parallel 0} \rho _{\text {i}} = 0.02\), \(\tau = 1\), and \(\beta _e = \beta _{\text {i}} = 0.2\) [from original figure in Ref. Zonca et al. (2015)]

Figure 11 shows the marginal stability curves. It clearly demonstrates, consistent with the above analytical predictions, the crucial roles of the finite \(k_\perp \rho _{\text {i}}\) effects in the stability properties. The marginal stability curves also reveal a necessary condition for instability, \(b_z > k_{y \ell }^2 \rho _{\text {i}}^2 \equiv b_{z \ell }\), with \(b_{z \ell }\) given by (Zonca et al. 2015):

$$\begin{aligned} \frac{b_{z \ell }\left( 1-\varGamma _{z \ell }\right) }{2 \varGamma _{z \ell }-\tau \left( 1-\varGamma _{z \ell }\right) }= \frac{4 k_{\Vert 0}^{2} \rho _{i}^{2}}{\left| \delta B_{y} / B_{0}\right| ^{2}}, \end{aligned}$$

where \(\varGamma _{z\ell } \equiv \varGamma _z(b_z = b_{z \ell })\), which holds for \(b_0 \gg 1\) and arbitrary \(b_{z \ell }\). Figure 12, meanwhile, plots the calculated growth rates vs. \(\delta B_y/B_0\) for \((k_x \rho _i, k_y \rho _i) = (0.8, 0.6)\) and \((k_x \rho _i, k_y \rho _i) = (1.0, 0.8)\). Corresponding hybrid simulations have also been carried out to investigate the nonlinear excitations of convective cells by a pump KAW (Zonca et al. 2015).
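Returning to the necessary condition for \(b_{z \ell }\) stated above (valid, according to the text, for \(b_0 \gg 1\)), the lower cutoff can be obtained by simple root finding. The sketch below is a minimal illustration that assumes \(\varGamma _z = I_0(b_z) e^{-b_z}\) and rewrites the condition in a pole-free form before bisecting; the function names, tolerance and amplitudes are hypothetical choices, not part of the original analysis.

```python
import math


def one_minus_gamma0(b, n=400):
    """1 - Gamma_0(b), with Gamma_0(b) = I_0(b) exp(-b) (assumed definition).

    Midpoint-rule evaluation of (1/pi) * integral_0^pi [1 - exp(b (cos t - 1))] dt.
    """
    h = math.pi / n
    return -sum(math.expm1(b * (math.cos((j + 0.5) * h) - 1.0)) for j in range(n)) / n


def cutoff_residual(b, tau, rhs):
    """Condition rewritten without a denominator:
    b (1 - Gamma) - rhs * [2 Gamma - tau (1 - Gamma)] = 0.
    The residual is negative at b = 0 and increases monotonically with b.
    """
    omg = one_minus_gamma0(b)
    return b * omg - rhs * (2.0 * (1.0 - omg) - tau * omg)


def lower_cutoff(amp, kpar_rho, tau, tol=1e-12):
    """Solve for b_zl given amp = |delta B_y / B_0| and k_par0 * rho_i."""
    rhs = 4.0 * kpar_rho**2 / amp**2
    lo, hi = 0.0, 1.0
    while cutoff_residual(hi, tau, rhs) <= 0.0:  # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol:  # plain bisection
        mid = 0.5 * (lo + hi)
        if cutoff_residual(mid, tau, rhs) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


tau, kpar_rho = 1.0, 0.02  # same fixed parameters as quoted for Fig. 11
for amp in (0.05, 0.1, 0.2):
    b_zl = lower_cutoff(amp, kpar_rho, tau)
    print(f"dBy/B0 = {amp}: b_zl = {b_zl:.4f}, k_yl rho_i = {math.sqrt(b_zl):.4f}")
```

Because the left-hand side of the condition grows monotonically with \(b_{z \ell }\) up to the zero of its denominator, the cutoff decreases as the pump amplitude increases.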

Fig. 12

Modulational instability growth rates (continuous lines), including finite \(\gamma _z/\omega _0\) (cf. Appendix 1) vs. \(\delta B_y/B_0\) are compared with hybrid simulation results (circles: error bars are a measure of discrete particle noise) for \((k_x \rho _{\text {i}}, k_y \rho _i) = (0.8, 0.6)\) (blue) and \((k_x \rho _i, k_y \rho _{\text {i}}) = (1.0, 0.8)\) (red). Fixed parameters are the same as in Fig. 11 [from original figure in Ref. Zonca et al. (2015)]

The observed growth rates, as shown in Fig. 12, agree reasonably well with the theoretically predicted values. Meanwhile, simulations also show that, for \(\varvec{k}_z = \hat{\varvec{y}} k_y\), the ESCC (\(\delta E_{y z}\)) and MSCC (\(\delta B_{x z}\)) are coupled and both are spontaneously excited; consistent, again, with the theoretical predictions. For \(\gamma _z/\omega _0 = {{\mathcal {O}}}(1)\), in general, it is necessary to solve for the CC dispersion relation and polarization from Eq. (120) in Appendix 1, which determines both the CC magnetic perturbation and the corresponding inductive electric field; that is:

$$\begin{aligned} \left| \frac{\delta B_{x z}}{\delta B_{y 0}} \right|= & \frac{|k_y \rho _i|}{2 \sigma _0 |k_x \rho _i|} \frac{|\delta \psi _z|}{|\delta \phi _0|}, \end{aligned}$$
$$\begin{aligned} \left| \frac{c \delta E_{\parallel z}}{v_{\text {A}} \delta B_{y 0}} \right|= & \frac{\sqrt{\beta _{\text {i}}/2}}{2 \sigma _0 |k_x \rho _{\text {i}}|} \left| \frac{\gamma _z}{\varOmega _{\text {i}}}\right| \frac{|\delta \psi _z|}{|\delta \phi _0|}. \end{aligned}$$

Finally, we remark that, as noted above, the nonlinear excitation of CCs by KAWs may be regarded as a breaking up of the pure Alfvénic state in uniform plasmas via non-ideal FILR effects. The counterparts of CCs in laboratory fusion plasmas are termed zonal flows and currents, or zonal field structures. There, however, effects due to realistic plasma non-uniformities and magnetic field geometries can often break up the pure Alfvénic states more efficiently and render excitations of the zonal field structures possible at a lower pump threshold, \(|\delta B/B_0| \sim {{\mathcal {O}}}(10^{-3})\) or less (Chen and Zonca 2012).

3.4 Quasi-linear transports induced by KAWs

In the presence of finite \(\delta E_\parallel \), KAWs can exchange energy and generalized momenta with charged particles when the wave–particle resonance condition is satisfied. Such energy–momentum exchanges, thus, could lead to efficient acceleration/heating, current/flow, as well as cross-field transport; that is, wave-induced collisionless transport in the charged particles’ phase space. The self-consistent analysis of the charged particles’ phase-space dynamics and the corresponding dynamics of collective electromagnetic fluctuations represents, indeed, a fundamental and complex investigation on the frontier of plasma physics research. Such complexities, in one aspect, are associated with the complexities of phase-space dynamics of charged particles in the presence of electromagnetic fluctuations that vary, self-consistently, in space and time. Detailed analyses on this topic are beyond the intended scope of this review. Interested readers are referred to Chen and Zonca (2016) for further discussions. In this review, we assume the fluctuations have sufficiently broad spectral widths and finite but small intensities, such that charged particles diffuse stochastically in the phase space and we may employ the quasi-linear description. Furthermore, we will limit our considerations to KAWs in a slab plasma. More general analyses in realistic geometries can be found in Chen (1999).

Let x be the non-uniformity (radial) direction and \(\mathbf{B}_0 = B_0(x) \hat{\varvec{z}}\). Assuming \(\beta \ll 1\), \(B_0\) is then approximately constant. The particle distribution function, f, can be de-coupled into an “equilibrium” component, \(F_0\), and a fluctuating component, \(\delta f\), that is:

$$\begin{aligned} f=F_0(x,\mu ,v_\parallel , \epsilon t) + \delta f(\mathbf{x}, t), \end{aligned}$$

where \(\epsilon t\) with \(\epsilon \ll 1\) denotes that \(F_0\) is slowly varying in time, and \(F_0\) varies spatially only in the non-uniformity (x) direction. \(\delta f\), meanwhile, is given by the linear gyrokinetic equations, Eqs. (14) and (15) with, however, \(F_{\text {M}}\) and \((-q F_{\text {M}}/T)\) replaced, respectively, by \(F_0\) and \((q/m v_\parallel ) (\partial F_0/\partial v_\parallel )\). Employing the nonlinear gyrokinetic equations (Frieman and Chen 1982; Brizard 1995), it is then straightforward to show that \(F_0\) satisfies the following quasi-linear gyrokinetic equations (Chen 1999):

$$\begin{aligned} {\partial \over \partial t} (B_0 F_0) + {\partial \over \partial x} (B_0 \overline{\delta \dot{x} \delta G_{res} }) + {\partial \over \partial v_\parallel } (B_0 \overline{ \delta \dot{v}_\parallel \delta G_{res}} )= 0, \end{aligned}$$


$$\begin{aligned} \delta \dot{x} = \left( {c \mathbf{b} \over B_0} \times {\varvec{\nabla }} \langle \delta L_{\text {g}} \rangle _\alpha \right) \cdot \hat{\varvec{x}}, \end{aligned}$$

\(\delta L_{\text {g}}\) is given by Eqs. (16) and (17), that is:

$$\begin{aligned} \langle \delta L_{\text {g}} \rangle _\alpha= & \langle e^{\mathbf{\rho } \cdot {\varvec{\nabla }}} (\delta \phi - v_\parallel \delta A_\parallel /c)\rangle _\alpha , \end{aligned}$$
$$\begin{aligned} \delta \dot{v}_\parallel= & - {q \over m} \mathbf{b} \cdot {\varvec{\nabla }} \langle \delta L_{\text {g}}\rangle _\alpha , \end{aligned}$$

and \(\mathbf{b} = \hat{\varvec{z}}\).

Meanwhile, \(\delta G_{res}\) in Eq. (90) represents the contribution of resonant particles to \(\delta f\) (Chen 1999), i.e., from Eq. (15):

$$\begin{aligned} \delta G_{res} = i \pi {q \over m} \delta \left[ -i(v_\parallel \mathbf{b} \cdot {\varvec{\nabla }} + {\partial \over \partial t})\right] QF_0 \langle \delta L_{\text {g}}\rangle _\alpha , \end{aligned}$$


$$\begin{aligned} QF_0 \langle \delta L_{\text {g}}\rangle _\alpha = i\left[ {1 \over v_\parallel } {\partial F_0 \over \partial v_\parallel } {\partial \over \partial t} + {\varvec{\nabla }} F_0 \cdot \left( {\mathbf{b} \over \varOmega } \right) \times {\varvec{\nabla }}\right] \langle \delta L_{\text {g}}\rangle _\alpha . \end{aligned}$$

Finally, in Eq. (90), \(\overline{(\dots )}\) denotes averaging over the (fast) wave periods.

Taking perturbations to be of the following form:

$$\begin{aligned} \delta \phi (\mathbf{x}, t) = {1 \over 2} \sum _\mathbf{k} \{ \delta {{\hat{\phi }}}_k (x) \exp \left[ i(\mathbf{k} \cdot \mathbf{x} - \omega _k t) \right] + c.c. \}, \end{aligned}$$

with \(\mathbf{k} = k_y \hat{\varvec{y}} + k_\parallel \mathbf{b}\), where \(\hat{\varvec{y}}\) corresponds to the azimuthal (east-west) direction. We then have:

$$\begin{aligned} \delta \dot{x}= & -i {c \over 2B} \sum _\mathbf{k} \{ k_y J_0 (k_\perp \rho ) \delta {\hat{L}}_k \exp \left[ i ( \mathbf{k} \cdot \mathbf{x} - \omega _k t) \right] + c.c. \}, \end{aligned}$$
$$\begin{aligned} \delta \dot{v}_\parallel= & -i {q \over 2m} \sum _\mathbf{k} \{ k_\parallel J_0 (k_\perp \rho ) \delta {\hat{L}}_k \exp \left[ i ( \mathbf{k} \cdot \mathbf{x} - \omega _k t) \right] + c.c. \}, \end{aligned}$$


$$\begin{aligned} \delta {\hat{L}}_k = (\delta {\hat{\phi }} - v_\parallel \delta {\hat{A}}_\parallel /c)_\mathbf{k}. \end{aligned}$$

Note that, for resonant particles:

$$\begin{aligned} \delta \left[ -i(v_\parallel \mathbf{b} \cdot {\varvec{\nabla }} + {\partial \over \partial t} )\right] = \delta (k_\parallel v_\parallel - \omega _k), \end{aligned}$$

and, hence:

$$\begin{aligned} \delta {\hat{L}}_{k, res} = (\delta {\hat{\phi }} - \delta {\hat{\varPsi }})_k \equiv \delta {\hat{\phi }}_{\parallel k}. \end{aligned}$$

Here, we recall Eq. (20), \(\delta {\hat{\varPsi }}_k = (\omega \delta {\hat{A}}_\parallel / c k_\parallel )_k\), and Eq. (21), \(\delta {\hat{E}}_{\parallel k} = -i k_\parallel \delta {\hat{\phi }}_{\parallel k}\). \(\delta G_{res}\) can then be expressed, correspondingly, as:

$$\begin{aligned} \delta G_{{\text {res}}} = {1 \over 2} \{ \delta {\hat{G}}_{k, {\text {res}}} \exp \left[ i( \mathbf{k} \cdot \mathbf{x} - \omega _k t) \right] + c.c. \}, \end{aligned}$$


$$\begin{aligned} \delta {\hat{G}}_{k, {\text {res}}} = i \pi {q \over m} \delta (k_\parallel v_\parallel - \omega _k) {\hat{Q}}_k F_0 J_0(k_\perp \rho ) \delta {\hat{\phi }}_{\parallel k}, \end{aligned}$$


$$\begin{aligned} {\hat{Q}}_k F_0 = (\omega _k {\partial \over v_\parallel \partial v_\parallel } + {k_y \over \varOmega } {\partial \over \partial x}) F_0. \end{aligned}$$

Substituting Eqs. (98), (101), and (102) into Eq. (90), we can readily derive the expression of the quasi-linear gyrokinetic equation in terms of \(\delta {\hat{\phi }}_{\parallel k}\). More specifically, we have:

$$\begin{aligned} \overline{ \delta \dot{x} \delta G_{res}} = -\left( {\pi \over 2}\right) \left( {c \over B_0}\right) \left( {q \over m}\right) \sum _\mathbf{k} \left[ \delta (k_\parallel v_\parallel - \omega _k) k_y J_0^2 \left| \delta {\hat{\phi }}_{\parallel k} \right| ^2 {\hat{Q}}_k F_0 \right] , \end{aligned}$$

and, similarly, obtain:

$$\begin{aligned} \overline{ \delta \dot{v}_\parallel \delta G_{{\text {res}}}} = -\left( {\pi \over 2}\right) \left( {q \over mc}\right) ^2 \sum _\mathbf{k} \left[ \delta (k_\parallel v_\parallel - \omega _k) k_\parallel J_0^2 \left| \delta {\hat{\phi }}_{\parallel k} \right| ^2 {\hat{Q}}_k F_0 \right] . \end{aligned}$$

Taking the various moments of Eq. (90), we then obtain the transport equations for density, parallel momentum/current, and energy. Specifically, defining the slowly varying “equilibrium” density, \(N(x, \epsilon t)\), as:

$$\begin{aligned} N(x, \epsilon t) = 2\pi \int B_0 {\text {d}}\mu {\text {d}}v_\parallel F_0 \equiv \langle F_0 \rangle _v, \end{aligned}$$

Eq. (90) along with Eq. (105) then yields the following particle transport equation (Chen 1999; Hasegawa and Mima 1978; Lee et al. 1994):

$$\begin{aligned} {\partial N \over \partial t} + {\partial \over \partial x} \varGamma _x = 0, \end{aligned}$$


$$\begin{aligned} \varGamma _x= & \varGamma _{xc} + \varGamma _{xd}, \end{aligned}$$
$$\begin{aligned} \varGamma _{xc}= & -\left( {\pi \over 2}\right) \left( {q \over m}\right) \left( { c \over B_0}\right) \sum _\mathbf{k} \langle \delta (k_\parallel v_\parallel - \omega _k ) J_0^2 \left| \delta {\hat{\phi }}_{\parallel k} \right| ^2 k_y \omega _k \left( {\partial \over v_\parallel \partial v_\parallel }\right) F_0 \rangle _v, \end{aligned}$$
$$\begin{aligned} \varGamma _{xd}= & -\left( {\pi \over 2}\right) \left( {q \over m}\right) \left( { c \over B_0}\right) \sum _\mathbf{k} \langle \delta (k_\parallel v_\parallel - \omega _k ) J_0^2 \left| \delta {\hat{\phi }}_{\parallel k} \right| ^2 \left( {k_y^2 \over \varOmega }\right) \left( {\partial \over \partial x}\right) F_0 \rangle _v. \end{aligned}$$

Equations (109)–(111) demonstrate that the particle flux, \(\varGamma _x\), intrinsically consists of a convective, \(\varGamma _{xc}\), and a diffusive, \(\varGamma _{xd}\) component, even though charged particles diffuse stochastically in the phase space. Note also that \(\left| \varGamma _{xc} \right| \) and \(\left| \varGamma _{xd} \right| \) scale, respectively, with \(k_y\) and \(k_y^2\). Thus, no transport occurs if \(k_y=0\). This, of course, is expected, since for \(k_y=0\), \(P_y = mv_y + q A_y/c\) is conserved. As \(\langle P_y \rangle _\alpha = qA_y(X_{gc})/c\) with \(X_{gc}\) being the guiding-center position in x, long-time transport will occur only if \(P_y\) conservation is broken by finite-\(k_y\) symmetry-breaking perturbations. Equations (110) and (111) also indicate that the relative magnitudes between \(\varGamma _{xc}\) and \(\varGamma _{xd}\) depend on the detailed spectral properties of \(\left| \delta {\hat{\phi }}_{\parallel k} \right| ^2\). We can also employ Eq. (90) to derive the equation for energy transport and heating. Letting \(K= \langle mv^2 F_0/2 \rangle _{\text {v}}\), it is then straightforward to show (Chen 1999):

$$\begin{aligned} {\partial K \over \partial t} + {\partial \over \partial x} q_x = S, \end{aligned}$$

where \(q_x=q_{xc} + q_{xd}\) is the energy flux with:

$$\begin{aligned} q_{xc}= & -\left( {\pi \over 2}\right) \left( {c q \over B_0}\right) \sum _\mathbf{k} \langle \delta (k_\parallel v_\parallel - \omega _k) \left( {v^2 \over 2}\right) J_0^2 \left| \delta {\hat{\phi }}_{\parallel k} \right| ^2 (k_y \omega _k) \left( {\partial \over v_\parallel \partial v_\parallel }\right) F_0 \rangle _{\text {v}}, \end{aligned}$$
$$\begin{aligned} q_{xd}= & -\left( {\pi \over 2}\right) \left( {c q \over B_0}\right) \sum _\mathbf{k} \langle \delta (k_\parallel v_\parallel - \omega _k) \left( {v^2 \over 2}\right) J_0^2 \left| \delta {\hat{\phi }}_{\parallel k} \right| ^2 \left( {k_y^2 \over \varOmega }\right) \left( {\partial \over \partial x}\right) F_0 \rangle _{\text {v}}, \end{aligned}$$

and S is the local heating rate:

$$\begin{aligned} S = -\left( {\pi \over 2}\right) \left( {q^2 \over m}\right) \sum _\mathbf{k} \omega _k \langle \delta (k_\parallel v_\parallel - \omega _k) J_0^2 \left| \delta {\hat{\phi }}_{\parallel k} \right| ^2 \left( \omega _k {\partial \over v_\parallel \partial v_\parallel } + {k_y \over \varOmega } {\partial \over \partial x} \right) F_0 \rangle _v. \end{aligned}$$

Similarly, we can derive the following equation for parallel momentum transport and generation (Chen 1999):

$$\begin{aligned} {\partial \over \partial t} P_\parallel + {\partial \over \partial x} \varPi _x = F_\parallel , \end{aligned}$$

where \(P_\parallel = \langle mv_\parallel F_0 \rangle _v\), \(\varPi _x = \varPi _{xc} + \varPi _{xd}\) is the parallel momentum flux:

$$\begin{aligned} \varPi _{xc}= & -\left( {\pi \over 2}\right) \left( {qc \over B_0}\right) \sum _\mathbf{k} \langle \delta (k_\parallel v_\parallel - \omega _k) (v_\parallel ) J_0^2 \left| \delta {\hat{\phi }}_{\parallel k} \right| ^2 (k_y \omega _k) \left( {\partial \over v_\parallel \partial v_\parallel }\right) F_0 \rangle _{\text {v}}, \end{aligned}$$
$$\begin{aligned} \varPi _{xd}= & -\left( {\pi \over 2}\right) \left( {qc \over B_0}\right) \sum _\mathbf{k} \langle \delta (k_\parallel v_\parallel - \omega _k) (v_\parallel ) J_0^2 \left| \delta {\hat{\phi }}_{\parallel k} \right| ^2 \left( {k_y^2 \over \varOmega }\right) \left( {\partial \over \partial x}\right) F_0 \rangle _{\text {v}}, \end{aligned}$$


$$\begin{aligned} F_\parallel = -\left( {\pi \over 2}\right) \left( {q^2 \over m}\right) \sum _\mathbf{k} k_\parallel \langle \delta (k_\parallel v_\parallel - \omega _k) J_0^2 \left| \delta {\hat{\phi }}_{\parallel k} \right| ^2 \left( \omega _k {\partial \over v_\parallel \partial v_\parallel } + {k_y \over \varOmega } {\partial \over \partial x} \right) F_0 \rangle _v \end{aligned}$$

is the local effective parallel force due to KAWs. Multiplying Eq. (116) by (q/m) naturally leads to the equation of current transport and generation.

As noted in Chen (1999), the transport equations derived above admit the appealing physical picture that transport as well as acceleration/heating may be viewed as “collisions” between charged particles of energy \(m {{\mathcal {E}}} = mv^2/2\) and generalized momentum \(\mathbf{P} = m\mathbf{v} + q\mathbf{A}/c\) and wave packets or quasi-particles of energy \(\omega \) and momentum \(\mathbf{k}\). The transport equations also clearly demonstrate that, in collisionless plasmas, wave–particle resonances are responsible for phase-space transport, which consists of both convective and diffusive components with coefficients depending critically on the spectral properties of, in this case, KAWs.
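
To make the convective/diffusive split concrete, consider, purely as an illustrative assumption that is not made in the analysis above, a local Maxwellian \(F_0\) with density \(N(x)\) and uniform temperature \(T\). Then \((1/v_\parallel ) \partial F_0/\partial v_\parallel = -(m/T) F_0\) and \(\partial F_0/\partial x = ({\text {d}} \ln N/{\text {d}}x) F_0\), so that:

$$\begin{aligned} {\hat{Q}}_k F_0 = -\frac{m}{T} \left( \omega _k - \omega _{*k} \right) F_0, \qquad \omega _{*k} \equiv \frac{k_y T}{m \varOmega } \frac{{\text {d}} \ln N}{{\text {d}} x}, \end{aligned}$$

where \(\omega _{*k}\) is a diamagnetic-drift-type frequency introduced here only for this sketch. The contributions of a single mode to Eqs. (110) and (111), denoted \(\varGamma _{xc,k}\) and \(\varGamma _{xd,k}\), then satisfy:

$$\begin{aligned} \frac{\varGamma _{xc,k}}{\varGamma _{xd,k}} = -\frac{\omega _k}{\omega _{*k}}, \qquad \varGamma _{xc,k} + \varGamma _{xd,k} = \left( {\pi \over 2}\right) \left( {q c \over B_0 T}\right) k_y \left( \omega _k - \omega _{*k} \right) \langle \delta (k_\parallel v_\parallel - \omega _k ) J_0^2 \left| \delta {\hat{\phi }}_{\parallel k} \right| ^2 F_0 \rangle _v . \end{aligned}$$

Under the same assumptions, the contribution of each mode to the local heating rate \(S\) is proportional to \(\omega _k (\omega _k - \omega _{*k})\), so that resonant particles gain energy from modes with \(\omega _k (\omega _k - \omega _{*k}) > 0\), and the relative weight of convection and diffusion is set by where the spectrum sits relative to \(\omega _{*k}\).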

4 Conclusions and discussion

In this paper, we argue that short-wavelength KAWs are ubiquitous in realistic non-uniform magnetized plasmas due to the very existence of SAW continuous spectra. Employing the powerful theoretical tool of gyrokinetic equations, we then re-examine and explore further the linear and nonlinear physics of KAWs. Our analyses clearly demonstrate that kinetic effects due to, e.g., finite ion-Larmor radii, can qualitatively and quantitatively modify the nonlinear processes. More specifically, we show that, in contrast to the MHD fluid description, the FILR effects lead to significantly enhanced electron–ion de-coupling, which, in turn, leads to significantly enhanced nonlinear coupling coefficients. Our analyses, furthermore, suggest that the KAW turbulence spectra will be more isotropic than those according to the MHD description. In addition, convective cells could be nonlinearly excited only in the short-wavelength regime. These spectral properties obviously carry important implications for the symmetry-breaking wave-induced transport of charged particles. In other words, based on our theoretical studies, we submit that one needs to employ first-principles-based self-consistent kinetic or gyrokinetic theories to develop a reliable and accurate understanding of KAW physics, especially when effects associated with nonlinearities, realistic non-uniformities, and geometries are considered.

Since the primary aim of the present paper is to illuminate the physics of KAWs based on the gyrokinetic theory approach, our focus has been on the fundamental processes. This paper, thus, is not, and, indeed, was never intended to be, a comprehensive review of all aspects of the rich KAW physics. For complementary reading, we refer to the monograph by Wu (2012), the review article by Chen and Zonca (2016), and the recent works by Qiu et al. (2019) for KAWs in fusion tokamak plasmas.

As we, hopefully, have demonstrated, there are many interesting issues, especially in the nonlinear regime, associated with KAW physics. Some of them remain little explored; for example, the phase-space dynamics of nonlinear wave–particle interactions as well as the physics of the fully developed KAW turbulence, including frequency/wave-number cascading and, possibly, filamentary structures via nonlinear excitations of convective cells. Obviously, careful studies of these physics issues employing the powerful gyrokinetic approach analytically and/or via numerical simulation will significantly deepen our understanding of charged particle dynamics and Alfvén wave turbulence in natural and laboratory plasmas.