1 Introduction

1.1 Discovery of cosmic rays

Cosmic rays (CRs) are relativistic charged particles that originate from outside the Solar System. The highest energy CRs are the fastest nuclei in the Universe, moving close to the speed of light. Most CRs are protons and alpha particles. Collisions between these primary CRs and atoms or ions in space produce secondary CRs. In particular, collisions of primary CRs with the atoms in Earth’s atmosphere produce CR muons that can be directly detected at sea level (thousands of these particles will pass through your body in the few minutes it takes to read the introduction to this review). CRs were discovered over a century ago by the Austrian-American physicist Victor Hess in Bohemia, which was then part of the Austro-Hungarian Empire. In a series of high-altitude balloon flights, Hess measured the rate of ionization as a function of height above ground using an electroscope. At the time these experiments were conducted, it was generally believed that atmospheric ionization was due to radioactivity of the ground, but results were inconclusive, in part due to instrumentation defects. The source of the ionization had therefore remained a mystery ever since the discovery of radioactivity in 1896 by Henri Becquerel. In 1912, Hess determined that, above a certain altitude, the ionization rate increased with height, reaching up to four times the value measured at sea level, and this conclusion was unaffected even when the experiments were conducted during a partial solar eclipse (Hess 1912b).Footnote 1 These experiments firmly established that the source of ionization was extraterrestrial and that the Sun was not the main contributor to the ionizing flux.Footnote 2 The first evidence for cosmic “rays” being charged particles was provided by the Dutch physicist Jacob Clay, who observed a decrease in the ionizing flux closer to the equator during his sea voyages between Italy and Indonesia (Clay 1927, 1928).
This finding was later interpreted as the result of CR deflection by the Earth’s magnetic field, demonstrating that CRs are charged particles rather than neutral ionizing radiation such as \(\gamma \) rays. In 1936, Victor Hess received a Nobel Prize for the discovery of CRs.Footnote 3

1.2 Basic properties of cosmic rays

A variety of modern instruments have been used to determine the spectrum of CRs across many orders of magnitude in particle energy. Figure 1 combines the results of various measurements including data for the extragalactic \(\gamma \)-ray and neutrino backgrounds for comparison. The spectrum in this figure is multiplied by the square of particle energy to provide a proxy for the energy density per decade in particle energy. As stated above, the energy density of CRs is dominated by contributions from hydrogen and helium, followed by that of heavier nuclei, electrons, positrons, and antiprotons. In particular, the electron-to-proton energy density ratio at energies of 10 GeV (which are large enough so that those CR fluxes are not affected by the magnetized solar wind) is about \(K_{\textrm{ep}}\approx 0.02\). Elemental abundances of CR nuclei closely track those seen in the Solar System, with the notable exception of lithium, beryllium, and boron (as well as the elements surrounding titanium, chromium, and manganese), which are more abundant in CRs. This deviation can be attributed to the spallation process in which CR carbon nuclei (for the light elements) and CR iron nuclei (for the heavy element groups) collide with interstellar medium (ISM) hydrogen atoms to form these more abundant elements.

The spectrum of CRs exhibits a number of important features. At proton energies well below several GeV (and for heavier nuclei below energies that are larger by a factor equal to the ion charge), it is a power-law characterized by a spectral index \(\alpha \sim 0\) (\(\textrm{d}n/\textrm{d}E \propto E^{-\alpha }\)). When observed close to Earth, the spectrum in this energy range is severely attenuated by solar modulation – a process in which CRs interacting with the turbulent and magnetized solar wind are effectively scattered and partially prevented from reaching Earth. The low-energy data shown in Fig. 1 comes from the Voyager probes after they had passed the heliopause, which is the boundary between the solar wind and the ISM, and is thus unaffected by the modulation. The maximum contribution to the CR energy density is provided by particles with energies of a few GeV. At larger particle energies, the spectrum declines as a power-law in energy, \(\textrm{d}n/\textrm{d}E \propto E^{-2.7}\), where n is the number density of particles of energy E. This power-law is commonly attributed to a combination of CR acceleration in supernova remnant (SNR) shock waves yielding a spectrum close to \(\textrm{d}n/\textrm{d}E \propto E^{-2.2}\) and diffusive escape losses that steepen the spectral index of CRs remaining within the disk by 0.5. Given the finite duration of the shock waves and the decreasing shock speed in the energy-conserving Sedov–Taylor phase, the maximum energy to which CR protons can be accelerated via this process is limited to about a few \(\times ~10^{14}\) eV possibly reaching as far as the “knee” (\(\sim 3\times 10^{15}\) eV) depending on the physics of the magnetic field amplification, particle transport, and ambient ISM conditions. 
Past the knee, the composition of CRs includes substantial contributions from heavier ions and the all-particle spectrum steepens to a power law with spectral index \(\alpha \approx 3.1\) that extends up to the “ankle” (\(\sim 10^{18}\) eV), where it flattens again.Footnote 4
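The steepening from an injected index of 2.2 to an observed index of 2.7 can be illustrated with a minimal sketch. It assumes a steady-state balance in which the spectrum of CRs remaining in the disk is the product of the injected spectrum and an energy-dependent escape time (a simple leaky-box picture). The function names and the arbitrary normalizations are illustrative assumptions; only the indices 2.2 and 0.5 are taken from the text.

```python
import math

def injected_spectrum(E):
    """Source spectrum from SNR shocks, dn/dE ∝ E^-2.2 (arbitrary units)."""
    return E ** -2.2

def escape_time(E):
    """Assumed escape time that shortens with energy as E^-0.5 (arbitrary units)."""
    return E ** -0.5

def disk_spectrum(E):
    """Steady state (assumption): injection rate times residence time."""
    return injected_spectrum(E) * escape_time(E)

def log_slope(f, E1, E2):
    """Logarithmic slope d ln f / d ln E between two energies."""
    return math.log(f(E2) / f(E1)) / math.log(E2 / E1)

# Energies in GeV, chosen above the few-GeV peak and below the knee.
alpha_disk = -log_slope(disk_spectrum, 10.0, 1e5)

# Slope of E^2 dn/dE, the quantity plotted in Fig. 1 (energy per decade).
alpha_energy_weighted = -log_slope(lambda E: E**2 * disk_spectrum(E), 10.0, 1e5)

print(f"index of CRs in the disk: {alpha_disk:.2f}")
print(f"decline of E^2 dn/dE:     E^-{alpha_energy_weighted:.2f}")
```

In this sketch, the energy-weighted spectrum falls with energy, consistent with the statement that particles with energies of a few GeV dominate the CR energy density.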

Fig. 1

Cosmic ray spectrum obtained using a variety of instruments. Original figure adapted from Lenok (2022, see references therein for data sources) and extended to include electron and positron data detected by Voyager 2 (Stone et al. 2019). Note that the Voyager 2 data were measured outside the heliopause and are thus free from the effects of solar modulation (shown with a fading color gradient towards small energies below 10 GeV for protons and below higher energies for heavier ions). Thin colored lines connecting the Voyager to Earth-bound low-energy CR data indicate the shape of the interstellar CR spectra in the Local Bubble. Notably the Voyager 2 data show an electron flux that dominates over that of protons at the lowest energies

At the location of the ankle, the sources of CRs likely change from Galactic to extragalactic accelerators because, at this energy, the CR gyroradius is already comparable to the thickness of the Galactic disk (e.g., Kampert 2007). Collisions of CR protons with cosmic microwave background (CMB) photons can generate electron-positron pairs resulting in a pair production dip in the CR proton spectrum between \(1\times 10^{18}\) and \(4\times 10^{19}\) eV, which could be responsible for (some of) the observed spectral flattening (Berezinsky et al. 2006). Following the flattening, the spectrum exhibits a sharp cutoff at around \(5\times 10^{19}\) eV, the reason for which is still the subject of active research. Initially, this cutoff was thought to be due to the Greisen–Zatsepin–Kuzmin (GZK, Greisen 1966; Zatsepin and Kuz’min 1966; Aloisio 2013) limit, which arises because of the collisions of CR protons with the CMB photons that lead to the photo-production of pions, a process that drains energy from protons. Additional data from the Pierre Auger Collaboration (Aab et al. 2017) established a change in trend from a proton-dominated CR composition at the ankle to an increasingly heavier composition with oxygen-group elements dominating the region at the cutoff (Aab et al. 2020). This casts doubt on the GZK interpretation and suggests that the maximum energies of heavier CR nuclei could be limited by photo-disintegration due to collisions with the CMB and extragalactic background light. Alternatively, the cutoff in the CR spectrum could signal a limiting voltage of the acceleration process because it is very challenging for astrophysical accelerators to reach the highest CR energies (Kotera and Olinto 2011). 
The largest detected CR energy is \(\sim 3\times 10^{20}\) eV, which coincidentally is comparable to the kinetic energy of a standard tennis ball travelling at a typical serve speed of 100 miles per hour (corresponding to \(\sim 160~\textrm{km}\) per hour), and exceeds the highest particle energy achievable at the world’s largest particle accelerator, the Large Hadron Collider at CERN, by a factor of about \(3\times 10^7\).Footnote 5
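The two energy comparisons in this paragraph can be checked with a few lines of arithmetic. The sketch below assumes a tennis ball mass of 57 g and an LHC proton beam energy of 6.8 TeV; neither value is quoted in the text, so both should be read as illustrative inputs.

```python
EV_TO_JOULE = 1.602176634e-19  # exact SI definition of the electronvolt

# Largest detected CR energy quoted in the text.
E_cr_eV = 3e20
E_cr_J = E_cr_eV * EV_TO_JOULE

# Kinetic energy of a tennis ball served at 100 miles per hour.
m_ball_kg = 0.057                    # assumed ball mass
v_ball_ms = 100 * 1609.344 / 3600    # 100 mph converted to m/s
E_ball_J = 0.5 * m_ball_kg * v_ball_ms**2

# Ratio to the energy of a single LHC proton.
E_lhc_eV = 6.8e12                    # assumed beam energy per proton
ratio_to_lhc = E_cr_eV / E_lhc_eV

print(f"CR energy:          {E_cr_J:.0f} J")
print(f"tennis ball energy: {E_ball_J:.0f} J")
print(f"CR / LHC proton:    {ratio_to_lhc:.1e}")
```

Both macroscopic energies come out at a few tens of joules, and the ratio to the accelerator energy is of order \(10^7\).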

1.3 Role of cosmic rays in galaxy evolution

Switching the focus to a seemingly unrelated topic – galaxy formation and evolution in a cosmologically expanding Universe – allows us to relate the underlying physics of galaxy formation to CR transport and to demonstrate that CRs could hold the key to solving some of the most important problems in this field. The standard cosmological cold dark matter model of the Universe with a cosmological constant \(\varLambda \) predicts that baryons should follow dark matter potential wells and sink to their centers as a result of very efficient cooling. However, the baryon content of dark matter halos formed during the process of hierarchical structure formation falls significantly below the cosmological mean baryon density, which is known as the “missing baryon problem” (e.g., Bregman 2007; Tumlinson et al. 2017). In particular, the distribution of the ratios of stellar-to-dark matter halo mass as a function of halo mass (Moster et al. 2010, see bottom graph in Fig. 2) reveals that even at halo masses around the Milky Way (\(\sim 10^{12}\,\textrm{M}_{\odot }\)), where star formation is most efficient, only \(\sim \)20% of the available baryons are converted into stars, and the star formation efficiency falls off rapidly toward both ends of the mass spectrum. These missing baryons either did not fall into dark matter potential wells (or remain undetected) or were expelled due to feedback processes, which can be ejective (when gas is expelled from galaxies before it can form stars) and/or preventative (when gas is heated and prevented from accreting onto galaxies). Star formation suppression is thought to be due to stellar and supermassive black hole feedback below and above this critical halo mass threshold, respectively (e.g., Somerville and Davé 2015).

Fig. 2

Top: From left to right the images show the starburst galaxy M82 (infrared in red: NASA/JPL-Caltech/Univ. of AZ/Engelbracht; optical in yellow: NASA/ESA/STScI/AURA/The Hubble Heritage Team; X-ray in blue: NASA/CXC/JHU/Strickland), Fermi Bubbles in the Milky Way (Platz et al. 2022, reproduced with permission, copyright by ESO), and a LOFAR image of the giant elliptical galaxy M87 (de Gasperin et al. 2012, reproduced with permission, copyright by ESO). Bottom: Stellar-to-halo mass ratio as a function of dark matter halo mass (Moster et al. 2010); reproduced with permission, copyright by AAS

Key challenges facing commonly employed thermal and radiation pressure–driven stellar feedback models are (i) the overcooling problem, where the thermal energy injected by supernovae (SNe) is quickly radiated away, thus diminishing the impact of SN explosions on regulating star formation (or, in the case of high-resolution simulation models, causing the expelled gas to be recycled too fast and re-accreted in fountain flows), and (ii) poor coupling of stellar radiation to the gas and thus reduced momentum imparted to the gas due to photoionization and/or radiation pressure that open up low-density channels in the optically thick gas and dust enshrouding star-forming environment, through which radiation can escape following the path of least resistance (Rosdahl et al. 2015). Both of these issues severely reduce the efficiency of stellar feedback and galactic wind launching. However, as we will argue in this review, CRs accelerated at SNR shocks (Blandford and Eichler 1987) provide an efficient feedback mechanism that addresses both challenges because CR cooling times are typically much longer than those of thermal gas, and CRs are generally well coupled to the plasma via particle-wave interactions facilitated by CR-driven plasma instabilities as well as direct inelastic collisions with the gas. These processes result in CRs imparting momentum to and heating of the plasmas, and can change in fundamental ways the properties and evolution of astrophysical objects. In fact, the CR energy density (\(\sim 1\) eV cm\(^{-3}\)) is close to equipartition with the thermal, magnetic, and turbulent energy densities in the ISM of our Milky Way (Boulares and Cox 1990), suggesting that CRs are dynamically very important in regulating star formation.

Although CRs travel at speeds close to the speed of light, c, arguments based on the production of secondary CRs via spallation reactions and the observed boron-to-carbon ratio reveal that they are confined to the Galactic disk over timescales (\(\tau _\textrm{diff}\sim 3\times 10^{7}\) yr) much longer than the light crossing time of the disk thickness (\(\sim 3\times 10^{3}\) yr; assuming disk thickness \(h_{\textrm{disk}}\sim 1\) kpc). When the escape of CRs from the Galaxy over these long timescales is balanced by their acceleration in SNRs, CR pressure can build up to the observed dynamically important levels (Ginzburg and Syrovatskii 1964). Since the typical mean free path of these CRs is on the order of \(\lambda _\textrm{mfp,cr}\sim 3h_{\textrm{disk}}^{2}/(c\tau _{\textrm{diff}})\sim \) 0.3 pc, i.e., much smaller than for photons, CRs may be well coupled to the ISM, and their diffusion from the disk can establish dynamically important pressure gradients that may drive strong galactic winds in late-type galaxies, as first realized by Ipavich (1975). Transport of CRs into the circumgalactic medium (CGM) can affect its observational properties and provide non-thermal pressure support in the CGM, which could in turn affect thermal instability, precipitation, and thus feeding of star formation in the disk, as we will explain in this review. On even larger scales characteristic of active galactic nuclei (AGN) jet outflows in early-type galaxies in galaxy groups and clusters, CRs accelerated in relativistic jets may play a key role in addressing the classic cooling flow (or cool core) problem by facilitating efficient heating of the intracluster medium (ICM) and preventing excessive star formation (e.g., Guo and Oh 2008). Even when the CR pressure support is low in the CGM and ICM, CR heating of the gas can be competitive with radiative cooling, thus helping to prevent excessive mass accretion rates.
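The light crossing time and the mean free path quoted above follow from a short calculation, sketched here in cgs units. The diffusion estimate \(h_{\textrm{disk}}^{2}\sim \kappa \tau _{\textrm{diff}}\) with \(\kappa =\lambda c/3\) is the standard random-walk relation implied by the expression in the text; the variable names and rounded unit conversions are illustrative choices.

```python
PC_CM = 3.0857e18     # parsec in cm
YR_S = 3.156e7        # year in s
C_LIGHT = 2.998e10    # speed of light in cm/s

h_disk = 1e3 * PC_CM      # disk thickness of 1 kpc
tau_diff = 3e7 * YR_S     # CR confinement time of 3e7 yr

# Time for light to cross the disk thickness, in years.
t_light_yr = h_disk / C_LIGHT / YR_S

# Diffusion coefficient from h^2 ~ kappa * tau, then lambda = 3 kappa / c.
kappa = h_disk**2 / tau_diff          # cm^2 / s
lambda_mfp_pc = 3 * kappa / C_LIGHT / PC_CM

print(f"light crossing time:   {t_light_yr:.0f} yr")
print(f"diffusion coefficient: {kappa:.1e} cm^2/s")
print(f"CR mean free path:     {lambda_mfp_pc:.2f} pc")
```

The implied diffusion coefficient of order \(10^{28}~\textrm{cm}^2~\textrm{s}^{-1}\) is a by-product of this estimate rather than a value stated in the paragraph.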

It is evident from the above considerations that CRs should be crucially important for our understanding of cosmological galaxy formation and evolution. The last decade has seen tremendous progress in our understanding of the role of CRs in these feedback processes. On the theoretical side, our ability to model CRs has improved dramatically thanks to a significant and concerted effort in the community to model from first-principles the transport of CRs in the context of galactic wind launching and AGN feedback. One of the key challenges is to identify and fully explore the essential physics needed to realistically model CR feedback. This will likely be accomplished via a combination of research involving particle-in-cell (PIC) simulations of CR transport, one-dimensional models needed to interpret results from multi-dimensional and multi-physics simulations, and fully cosmological high-resolution simulations incorporating the physics-driven parameterizations of CR transport and momentum and energy deposition from stars and AGN. On the observational side, new data from telescopes such as LOFAR, MeerKAT, Jansky VLA, Fermi, H.E.S.S., MAGIC, VERITAS, HAWC, IceCube, and space missions such as Voyager, AMS-02 and others have transformed our understanding of the role of CRs in feedback processes, and new and upcoming observatories such as Square-Kilometre Array (SKA; radio), James Webb Space Telescope (JWST; optical to the mid-infrared), and Cherenkov Telescope Array (CTA; \(\gamma \) rays) will continue to drive progress in this very active field of research. This conviction is shared by the Decadal Survey on Astronomy and Astrophysics 2020 panel which has voiced its appreciation of the importance of CR feedback in galaxy formation: “The impact of CRs is one of the largest uncertainties in understanding feedback in galaxy formation. 
The primary uncertainty is how CRs are scattered by small-scale fluctuations in the magnetic field, which sets whether CRs can escape a region or whether their pressure builds up to the point where it can drive an outflow. [...] It is remarkable that tiny solar-system scale fluctuations in the galactic magnetic field are a key ingredient in understanding how galaxies drive winds on scales of tens of kiloparsecs, or that the large-scale magnetic field properties or distant supernovae can affect the formation of pre-stellar cores” (Astro2020 Decadal Survey 2021).

Perhaps the following closing statement from the Nobel lecture by Cecil F. Powell in 1950 accurately captures not only the spirit of the heady years after the original CR discovery but also the current sentiment and excitement in the field of galaxy formation and CR feedback: “It will indeed be of great interest if the contemporary studies of the primary radiation lead us [...] to the study of some of the most fundamental problems in the evolution of the cosmos.”Footnote 6

1.4 Scope and context of the review

The overarching goal of this review is to present a comprehensive and pedagogical account of the field of astrophysical CR feedback. Our approach encompasses an overview of the essential physical processes that play key roles in CR feedback, followed by a systematic review of the role of CRs over a very wide range of physical scales. In the process, we emphasize the connections between CR physics operating on various scales and put CR feedback processes in the context of other stellar and supermassive black hole feedback mechanisms.

A number of other reviews discussing CRs have appeared recently. While these reviews mention CRs, our review occupies a distinct niche as it focuses almost exclusively on the in-depth discussion of the physics of CR feedback across a wide range of physical scales – from the scales of individual SNe to the scales of galaxy clusters. Here we list some of the other reviews, stating also their main focus, and order them from small plasma kinetic to large cosmological scales, starting with (i) Zweibel (2017), a concise overview of CR physics with applications to feedback; (ii) Caprioli (2016) and Pohl et al. (2020), particle-in-cell simulations of CR acceleration; (iii) Marcowith et al. (2020), simulations of particle acceleration in astrophysical systems; (iv) Padovani et al. (2020) and Gabici (2022), impact of low-energy CRs in the ISM; (v) Grenier et al. (2015), CR impact on the ISM, focusing on interstellar chemistry and CR propagation in molecular clouds; (vi) Becker Tjus and Merten (2020), observational aspects of CRs and their secondaries; (vii) Bykov et al. (2020), high-energy particles and radiation in star-forming regions; (viii) Veilleux et al. (2020), in-depth review of the observational aspects of galactic outflows; (ix) Amato and Blasi (2018), CR transport in the Milky Way; (x) Recchia (2020), a concise overview of CR-driven winds including discussion of one-dimensional models; (xi) Yang et al. (2018), origin of the Fermi Bubbles; (xii) Faucher-Giguère and Oh (2023), physical processes in the CGM; (xiii) Hlavacek-Larrondo et al. (2022), AGN feedback in groups and clusters; (xiv) Yang and Bourne (2023), macro- and micro-physics of AGN jet feedback in galaxy clusters; (xv) Kunz et al. (2022), plasma physics of the ICM; (xvi) Vogelsberger et al. (2020), short overview of recent advances in galaxy formation; (xvii) Kotera and Olinto (2011), astrophysics of ultra-high energy CRs; and (xviii) Hanasz et al. (2021), numerical methods for CR transport.

1.5 Outline of the review

We start the review with an in-depth discussion of the physics relevant to CR interactions with plasmas (Sect. 2). In that section, we describe a number of processes that are essential for the modelling of the various astrophysical phenomena and the interpretation of multi-wavelength observations. These processes include particle-wave interactions and CR-driven instabilities (Sect. 2.1), CR acceleration (Sect. 2.2), spatial and spectral CR transport (Sects. 2.3 and 2.5), and CR energy loss mechanisms (Sect. 2.4). In the process, we also present a brief overview of the approaches to study these processes (kinetic, hybrid, and fluid descriptions).

Following this pedagogical physics introduction, we transition in Sect. 3 to the applications of these processes to astrophysical situations. In that section, we organize the discussion by relevant astrophysical scales, focusing first on CR impact on small scales before moving on to the phenomena relevant to larger physical scales. We also separate the discussion of low-energy CRs (with energies \(\lesssim \) GeV) that are critically important for ISM ionization, our understanding of non-thermal emission, and calibrating transport coefficients, but do not contribute significantly to gas pressure, from the discussion of CRs with energies above \(\sim \) GeV that play a crucial role in dynamical feedback. Specifically, we discuss the role of CR physics in the following astrophysical settings: low-energy CR ionization in the ISM (Sect. 3.1), stellar feedback and CR-driven winds (starting with one-dimensional models and moving on to progressively more sophisticated descriptions of the dynamical interactions of CRs with the ISM including the role of CRs in the dynamics of SNe, CR interactions with cold clouds and multiphase ISM, impact of CRs on star formation, and the physics of galactic wind launching; Sect. 3.2), impact of CRs in cosmological galaxy formation simulations (Sect. 3.3), the role of CRs in thermal instability in the CGM and ICM (Sect. 3.4), and the impact of CR feedback on the CGM and ICM in massive hot halos of the largest galaxies and galaxy clusters (Sect. 3.5).

As the next step, we discuss the observational signatures of CR feedback (Sect. 4), covering CR propagation in the Milky Way (Sect. 4.1), CR-aided outflows in the Milky Way (Sect. 4.2), non-thermal emission from galaxies (Sect. 4.3), observational signatures of CR feedback in the CGM (Sect. 4.4) and ICM (Sects. 4.5 and 4.6), and current and future observatories (Sect. 4.7). We conclude in Sect. 5, where we identify open questions and future directions. While we discuss the GeV-TeV CR connection as a means of calibrating CR feedback, we refrain from discussing issues related to the maximum energy of Galactic accelerators (e.g., hadronic PeVatrons) and do not cover the topic of ultra-high energy CRs.

2 Physics

Before we discuss the details and subtleties of CR feedback in galaxies and clusters, we introduce the basics of CR acceleration and transport. We put particular emphasis on the different physics models used to study CRs, from plasma-kinetic simulations to hydrodynamical models that characterize the CR population with a small number of thermodynamic variables. We start our discussion with CR interactions with electromagnetic waves in Sect. 2.1 and explain the concept of CR pitch-angle scattering and CR-driven instabilities. In Sect. 2.2, we review our knowledge on CR acceleration and escape from SNRs to the ISM. In particular, we provide an overview of the general picture of acceleration processes and discuss in detail diffusive shock acceleration of ions and electrons. We then transition to larger scales and discuss CR spatial transport in Sect. 2.3, including a detailed explanation of one-moment and two-moment CR hydrodynamics. This is followed by a discussion of new results on the CR streaming instability, wave damping mechanisms and CR self-confinement as well as CR scattering with magneto-hydrodynamic (MHD) turbulence, which gives rise to external CR confinement by turbulence. In Sect. 2.4, we discuss radiative and non-radiative CR (ion and electron) interactions and their cooling times. Finally, we review the CR spectral transport in Sect. 2.5. Specifically, we discuss how CR transport acquires a momentum dependence in the self-confinement and external-confinement picture of CR transport and provide an overview of the various numerical methods developed for evolving the CR momentum spectrum in space and time.

2.1 Cosmic ray interactions with electromagnetic waves

In order to motivate the importance of collective interactions of CRs with plasma waves, we first provide some order of magnitude estimates of CR number densities. We then explain the various ways in which CRs interact with electromagnetic waves and end by introducing CR-driven resonant plasma instabilities.

2.1.1 Estimates of cosmic ray number densities

CRs are charged particles that form a collisionless species that is embedded in a magnetized background plasma, which is composed of various phases ranging from cold clouds (with temperatures \(T\sim 10\) K) to ionized and neutral warm (\(T\sim 10^4\) K) and hot phases (\(T\sim 10^6\) K in galaxies and \(T\sim 10^7\)–\(10^8\) K in galaxy clusters), which dominate by volume (Cox 2005; Draine 2011). In the midplane of the Milky Way, the energy densities of CRs, magnetic fields, turbulence, and the thermal population are in equipartition (Boulares and Cox 1990; Cox 2005; Naab and Ostriker 2017), suggesting that these components are crucial for maintaining the energy equilibrium of the ISM. As shown in Fig. 1, the Galactic CR population is dominated by particles with \(\sim \) GeV energies while the warm ISM is dominated by thermal particles with energies of \(\sim \) eV, which amounts to a ratio of \(10^9\) in particle energy. Because the two populations are in approximate pressure equilibrium, we conclude that, to order of magnitude, CRs must be extremely rare: only one in \(\sim 10^9\) ISM particles is a CR ion (i.e., mostly a CR proton), implying a CR number density of about \(10^{-9}~\textrm{cm}^{-3}\) for a mean density of the warm ISM phase of \(1~\textrm{cm}^{-3}\). The intracluster plasma is characterized by particle energies of \(k_{\textrm{B}} T\sim 1\)–10 keV and densities \(\sim 10^{-3}~\textrm{cm}^{-3}\). The non-detection of \(\gamma \) rays from clusters translates into an upper limit of the CR-to-thermal pressure ratio of \(\lesssim 10^{-2}\) in the bulk of the ICM (Ackermann et al. 2010; Aleksić et al. 2010, 2012; Arlen et al. 2012; Ackermann et al. 2014a) so that we obtain a CR-to-background density ratio of \(\lesssim 10^{-7}\), implying CR number densities in clusters of \(\lesssim 10^{-10}~\textrm{cm}^{-3}\). Because of the very low CR number density in the ISM and ICM, CRs almost never collide with each other.
The interactions of CRs with background ions via Coulomb or hadronic particle collisions cool the CR population on time scales that are too long to establish a collisional equilibrium between CRs and thermal gas particles (see Sect. 2.4.2). Instead, CRs collectively interact with the background plasma predominantly via particle-wave scattering, which substantially reduces the effective mean free path of CRs (Wentzel 1974).
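The order-of-magnitude estimates of this subsection can be reproduced with a short sketch. It assumes that pressure scales as number density times characteristic particle energy, so that \(n_{\textrm{cr}}\sim X\,n_{\textrm{th}}\,E_{\textrm{th}}/E_{\textrm{cr}}\) for a CR-to-thermal pressure ratio X. The function name and argument names are hypothetical; the input values are those quoted in the text, with the upper end of the ICM temperature range used for the cluster limit.

```python
def cr_number_density(n_thermal, e_thermal_eV, e_cr_eV, pressure_ratio):
    """Order-of-magnitude CR number density (cm^-3).

    Assumes pressure ~ number density x characteristic particle energy,
    so n_cr ~ pressure_ratio * n_thermal * e_thermal / e_cr.
    """
    return pressure_ratio * n_thermal * e_thermal_eV / e_cr_eV

# Warm ISM: ~1 cm^-3, ~eV thermal particles, ~GeV CRs, pressure equilibrium.
n_cr_ism = cr_number_density(1.0, 1.0, 1e9, 1.0)

# ICM: ~1e-3 cm^-3, k_B T ~ 10 keV, CR-to-thermal pressure ratio <~ 1e-2.
n_cr_icm_limit = cr_number_density(1e-3, 1e4, 1e9, 1e-2)

print(f"ISM CR number density:       ~{n_cr_ism:.0e} cm^-3")
print(f"ICM CR number density limit: <~{n_cr_icm_limit:.0e} cm^-3")
```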

2.1.2 Cosmic ray-wave scattering and diffusion

In the following, we first explore the interaction of a single CR particle with an MHD wave and then discuss collective wave–particle interactions of CR populations. There are two types of CR interactions with MHD waves.

(i) A gyro-resonant interaction occurs when the Doppler-shifted rotation rate \(\omega _{\textrm{r}}\) of a circularly polarized plasma wave is a multiple of the CR gyro frequency (Schlickeiser 2002),

$$\begin{aligned} k_\parallel \mu v- \omega _{\textrm{r}} = \pm n \varOmega , \qquad \text{ where } \qquad \varOmega =\frac{qB}{\gamma mc} \end{aligned} \tag{1}$$

is the relativistic gyro frequency of a CR (ion or electron) population with characteristic Lorentz factor \(\gamma =[1-(v/c)^2]^{-1/2}\), velocity \(v\), particle mass m and charge \(q=Ze\) (where Z is the charge number and e is the elementary charge). Here, c is the light speed, \(k_\parallel \) is the component of the wave number parallel to the mean magnetic field, and \(\mu =\cos \theta =\varvec{p}\varvec{\cdot }\varvec{B}/(pB)\) is the cosine of the pitch angle between the magnetic field and momentum vectors, \(\varvec{B}\) and \(\varvec{p}\), while \(B=|\varvec{B}|\) and \(p=|\varvec{p}|\) are the magnitude of the magnetic field and the relativistic CR momentum, respectively, all measured in the frame that is comoving with the background plasma. For a positive sign of the wave vector, the sign ± denotes right and left circular polarization of the CR. Note that n is a natural number and \(n\ge 1\) denotes the order of the resonant interaction. The case \(n=1\) denotes the interaction with plasma waves propagating parallel to the magnetic field and \(n>1\) accounts for the interaction with obliquely propagating waves, which can be seen by exploring the geometry of the wave–particle interaction.

To understand the physics of the resonant wave–particle interaction, we consider the motion of a particle in a uniform and constant magnetic field along which a small-amplitude shear Alfvén wave travels.Footnote 7 Because this represents a small perturbation to the constant magnetic field, the resulting particle motion is approximately described by a circular gyration in the plane perpendicular to the constant field, while the particle moves at uniform velocity along the field. Evaluating the two solutions of the resonance condition (1), we find that for super-Alfvénic particle motions (\(\mu v-v_{\textrm{a}}>0\)), the magnetic field perturbation as seen by the particle has the same sense of polarization as the particle itself (i.e., the sense of rotation direction for a fixed coordinate along the field but for varying time). For sub-Alfvénic motions, the particle also probes the same wave polarization for each solution, but with an opposite polarization in comparison to the super-Alfvénic case. Hence, a resonant wave–particle interaction requires a Doppler-shifted Alfvén wave with the same rotation direction and frequency as that of the particle gyration frequency in its rest frame.Footnote 8

(ii) The case \(n=0\) in equation (1) (also called the “Landau resonance”) describes a non-resonant wave–particle interaction named “transit time damping” with \(\omega _{\textrm{r}}=k_\parallel \mu v=k_\parallel v_\parallel \). This implies that the time it takes for a particle to traverse the confining region (i.e., the “transit time” of the particle), \(\tau =\lambda _\parallel /v_\parallel = 2\pi /(k_\parallel v_\parallel )\), matches the wave period, \(T=2\pi /\omega _{\textrm{r}}\). Physically, this means that an electron or proton is confined by a magnetic mirror force, which requires the presence of compressible electromagnetic perturbations. In MHD, those compressible perturbations are caused by fast and slow wave modes,Footnote 9 with slow modes containing most of the compressible energy in subsonic turbulence. The particle surfs the wave (i.e., the particle experiences an accelerating electrostatic field of the wave) and gains energy because head-on interactions between particle and wave are more frequent than head-tail interactions.
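To give a sense of scale for the gyro-resonance condition (1) in case (i), the following sketch evaluates the gyro frequency and the resonant parallel wavelength for a proton interacting with an Alfvén wave (\(\omega _{\textrm{r}}=k_\parallel v_{\textrm{a}}\), \(n=1\)) in Gaussian units. The Lorentz factor of 2, the field strength of 1 µG, the pitch-angle cosine of 0.5, and the Alfvén speed of \(10^{-4}c\) are assumed illustrative values, and the function names are hypothetical.

```python
import math

# Physical constants in Gaussian (cgs) units.
E_CHARGE = 4.8032e-10   # elementary charge in esu
M_PROTON = 1.6726e-24   # proton mass in g
C_LIGHT = 2.9979e10     # speed of light in cm/s
AU_CM = 1.496e13        # astronomical unit in cm

def gyro_frequency(b_gauss, gamma, charge_number=1, mass=M_PROTON):
    """Relativistic gyro frequency, Omega = q B / (gamma m c), in rad/s."""
    return charge_number * E_CHARGE * b_gauss / (gamma * mass * C_LIGHT)

def resonant_wavelength(mu, v, v_alfven, omega, n=1):
    """Parallel wavelength satisfying k_par (mu v - v_a) = n Omega, in cm."""
    k_par = n * omega / (mu * v - v_alfven)
    return 2 * math.pi / abs(k_par)

gamma = 2.0                 # assumed Lorentz factor (total energy ~1.9 GeV)
b_field = 1e-6              # assumed magnetic field strength in gauss
mu = 0.5                    # assumed cosine of the pitch angle
v = C_LIGHT * math.sqrt(1 - 1 / gamma**2)
v_alfven = 1e-4 * C_LIGHT   # assumed Alfven speed, ~1e4 times below c

omega = gyro_frequency(b_field, gamma)
wavelength_au = resonant_wavelength(mu, v, v_alfven, omega) / AU_CM

print(f"gyro frequency:      {omega:.2e} rad/s")
print(f"resonant wavelength: {wavelength_au:.2f} AU")
```

For these inputs the resonant wavelength is of the order of an astronomical unit, i.e., far smaller than any galactic scale.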

Fig. 3

The schematic drawing shows a CR proton (red) orbiting around a magnetic field line (left) and resonantly interacting with an Alfvén wave (center), i.e., when its wavelength equals the particle’s gyroradius. In this drawing, we choose the phase of the Alfvén wave and gyrating CR such that the resulting Lorentz forces act opposite to the parallel velocity component of the proton along its entire orbit and decelerate it. Note that for a CR pitch angle \(<90^\circ \) the magnetic vector of the circularly polarized Alfvén wave must be tilted such that the resulting Lorentz force decelerates the parallel velocity component. Since there are no electric fields in the reference frame of the moving wave, the proton energy (and total velocity \(v^2=(v_\parallel - v_{\textrm{a}})^2+v_\perp ^2\)) is conserved and the perpendicular velocity component must increase, increasing the pitch angle between the momentum and magnetic field vectors. Switching the magnetic field perturbations (\(\delta \varvec{B}\)) by \(180^\circ \) would result in an accelerating Lorentz force along the CR orbit and a decreasing pitch angle. Hence, for a CR population with random phase shifts between the Alfvén wave and CR gyro orbits, the CR protons would experience random gyro-averaged Lorentz forces with random net pitch angle changes. In consequence, proton scattering by an Alfvén wave packet (right) can be described by a diffusion process leading to an isotropization of the CRs in the reference frame of the wave for frequent CR-wave scatterings (see Fig. 4). In this case, CRs propagate on average at the Alfvén velocity, which in the ISM is about \(10^4\) times smaller than the speed of light at which individual relativistic CR particles approximately travel (Jacob & Pfrommer)

Fig. 4

A visualization of the CR scattering process with an Alfvén wave in CR velocity space, where velocities are measured in the frame that is comoving with the Alfvén wave. An anisotropic CR distribution moving either rightward (red) or leftward (blue) has initial values of the cosine of the pitch angle \(|\mu |=|v_\parallel /v|\lesssim 1\). Scattering off of an Alfvén wave leads to smaller (larger) values of \(v_\parallel \) if initially \(v_\parallel >v_{\textrm{a}}\) (\(v_\parallel <v_{\textrm{a}}\)) while conserving the particle energy in the rest frame of the Alfvén wave (see also Fig. 3). This process can be described as a diffusion process in \(\mu \) along the equal-energy circle shown here so that the initially anisotropic distribution diffuses over time to assume a homogeneous distribution in \(\mu \)

We now return to the resonant interaction of a CR particle with a shear Alfvén wave. As demonstrated in Fig. 3, upon resonantly interacting with an Alfvén wave, CRs change their pitch angle. This can be visualized by adopting the dispersion relation of Alfvén waves, \(\omega _{\textrm{r}}=k v_{\textrm{a}}\), and evaluating the resonance condition (1) for wave propagation parallel to the magnetic field, yielding \(\left( v_\parallel - v_{\textrm{a}} \right) k_\parallel = \pm \varOmega \). This is a manifestation of gyroresonance and can be visualized as a circle in CR velocity space, centered on \(v_{\textrm{a}}\). Shifting our frame of reference so that it is moving with the Alfvén speed has interesting consequences. In this frame, the Alfvén wave only retains a magnetic field while the electric field vanishes. As a result, CRs can neither change their energy nor their total velocity, because this would require the presence of an electric field. In other words, \(v^2 = (v_\parallel - v_{\textrm{a}})^2 + v_\perp ^2 = \mathrm {const.}\), which is the mathematical representation of the velocity space circle in Fig. 4. The remaining magnetic field of the Alfvén wave is able to shift the pitch-angle of the CRs in an energy-conserving manner, which leads to the motion of CRs along the velocity space circle. The interaction of an Alfvén wave with a CR population that is randomly distributed in space leads to random phase shifts between the individual gyro motions and the wave (see Fig. 3). This causes a random change in pitch angle for each scattering event so that we are witnessing a random walk in \(\mu \), implying that this can be described by a diffusion process along the equal energy surface in velocity space (see Fig. 4). 
CR particles with \(v_\parallel >v_{\textrm{a}}\) decrease their parallel velocity component (or equivalently, decrease their pitch angle cosine \(\mu =v_\parallel /v\), which corresponds to an increasing angle \(\theta \)) and particles with \(v_\parallel <v_{\textrm{a}}\) increase their \(\mu \). Formally, scattering across pitch angles of \(\sim 90^\circ \) (corresponding to \(\mu \sim 0\)) would require wave modes with vanishingly small wavelengths, \(\lambda = 2\pi /k_\parallel \rightarrow 0\). However, those modes do not exist because of the fast damping of these waves.

It turns out that there are several effects that circumvent this hypothetical problem, rendering it irrelevant in practice. First, the scattering coefficient for \(\mu \rightarrow 0\) remains finite by retaining \(\omega \) in the resonance condition, implying a broadening of the resonance as a result of dielectric effects (Fedorenko 1983; Schlickeiser 1989). Second, the momentum of particles with \(\mu \rightarrow 0\) can be reversed by mirror interactions with long-wavelength compressible modes (Felice and Kulsrud 2001), thereby emphasizing the important role of non-resonant wave–particle interactions in the scattering process. Finally, resonance broadening as a result of non-linear effects of the interaction enables populating the other hemisphere of CR velocity space so that the system eventually approaches an isotropic CR pitch angle distribution (Shalchi 2005). Mathematically, these collective CR pitch angle scatterings (representing the random walk in \(\mu \)) can be described by a diffusion process for the gyro-averaged CR distribution \(f=f(\varvec{x},p,\mu )\) in \(\mu \):

$$\begin{aligned} \frac{\partial f}{\partial t} \Bigg \vert _{\textrm{scatt}} = \frac{\partial }{\partial \mu } \left[ \frac{1-\mu ^2}{2} \,\nu (p,\mu )\, \frac{\partial }{\partial \mu } f\right] , \end{aligned}$$
(2)

where the time and the pitch angle are evaluated in the Alfvén wave frame, \(\nu (p,\mu )\) denotes the scattering frequency and the factor \((1-\mu ^2)/2=(\sin ^2\theta )/2\) derives from gyro averaging the Lorentz force terms. A transparent derivation of quasi-linear theory of CR transport and pitch-angle scattering of CRs starting from elementary physics considerations can be found in Thomas (2022).
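To illustrate how equation (2) drives an anisotropic distribution toward isotropy, the following minimal sketch integrates the pitch-angle diffusion equation with an explicit finite-volume scheme. It assumes a constant scattering frequency \(\nu \) (a simplification; the text allows \(\nu (p,\mu )\)), and the function names `mu_grid` and `scatter` are hypothetical helpers introduced only for this example.

```python
import numpy as np


def mu_grid(n_cells):
    """Cell faces and cell centres of a uniform grid in mu on [-1, 1]."""
    faces = np.linspace(-1.0, 1.0, n_cells + 1)
    centres = 0.5 * (faces[:-1] + faces[1:])
    return faces, centres


def scatter(f0, nu, t_end, cfl=0.4):
    """Evolve f(mu) under df/dt = d/dmu[(1 - mu^2)/2 * nu * df/dmu].

    Assumption: nu is a constant (independent of p and mu).
    The diffusion coefficient vanishes at mu = +-1, so the boundary
    fluxes are zero and the integral of f over mu is conserved.
    """
    f = np.array(f0, dtype=float)
    n_cells = f.size
    faces, _ = mu_grid(n_cells)
    dmu = 2.0 / n_cells
    diff_face = 0.5 * (1.0 - faces**2) * nu      # coefficient on cell faces
    # Explicit stability limit: dt <= dmu^2 / (2 * max(D)) = dmu^2 / nu.
    n_steps = max(1, int(np.ceil(nu * t_end / (cfl * dmu**2))))
    dt = t_end / n_steps
    flux = np.zeros(n_cells + 1)                 # boundary fluxes stay zero
    for _ in range(n_steps):
        flux[1:-1] = diff_face[1:-1] * np.diff(f) / dmu
        f += dt * np.diff(flux) / dmu
    return f


# Example: a weak dipole anisotropy, f = 1 + 0.5 * mu, relaxing over time.
_, mu = mu_grid(64)
f_initial = 1.0 + 0.5 * mu
f_later = scatter(f_initial, nu=1.0, t_end=1.0)
```

For constant \(\nu \), the dipole term \(\propto \mu \) is an eigenfunction of the operator in equation (2) and decays as \(\exp (-\nu t)\), which provides a simple check of the scheme.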

The effective spatial scattering coefficient along the magnetic fields can be derived using the following heuristic random walk arguments. CRs can scatter on Alfvén waves if the resonance condition (1) is met. Adopting the dispersion relation for Alfvén waves \(\omega _{\textrm{r}}=v_{\textrm{a}}k\) and realizing that those propagate at typical speeds of tens of km s\(^{-1}\), i.e., \(v_{\textrm{a}}/c\ll 1\), we drop \(\omega _{\textrm{r}}\ll k_\parallel v\) (because CRs move relativistically, \(v\sim c\)) and obtain the simplified resonance condition \(k_{\parallel }^{-1} = \mu r_{\textrm{L}}\), where \(r_{\textrm{L}}=p c/(qB)\) is the CR gyroradius. This defines a minimum CR momentum that is in resonance with the Alfvén wave of a given \(k_\parallel \),

$$\begin{aligned} p_{\textrm{min}}=\frac{qB}{ck_\parallel }. \end{aligned}$$
(3)

This corresponds to individual scattering events each lasting \(\sim r_{\textrm{L}}/c\), i.e., a gyration time \(\sim \varOmega ^{-1}\). In the frame of the Alfvén wave, Lorentz forces acting during a scattering event alter the CR pitch angle by \(|\delta \alpha |\sim \delta B/B\ll 1\), where \(\delta B\) is the magnetic field fluctuation associated with the wave and B is the mean magnetic field, but do not change CR energy. Multiple and uncorrelated wave–particle interactions lead to the random walk of CRs in pitch angle, which approaches CR isotropization when \(N_{\textrm{scatt}}(\delta \alpha )^{2}\sim 1\), where \(N_{\textrm{scatt}}\) is the number of scattering events. The isotropization timescale \(t_{\textrm{i}}\sim N_{\textrm{scatt}}\varOmega ^{-1}\) of the CR is thus \(t_{\textrm{i}}\sim \varOmega ^{-1}(\delta B/B)^{-2}\), and the corresponding spatial diffusion coefficient along the mean direction of the magnetic field is \(\kappa \sim c^{2}t_{\textrm{i}}\sim c r_{\textrm{L}}(\delta B/B)^{-2}\) (Blandford and Eichler 1987). In the following sections, we will improve upon this heuristic argument and distinguish between spatial CR transport owing to diffusion and streaming, the latter of which is the limiting case of CR transport of an almost isotropic CR distribution in the Alfvén frame.
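The random walk estimate above can be turned into numbers with a few lines of code. The sketch below evaluates \(r_{\textrm{L}}=pc/(qB)\), the resonant momentum of equation (3), and \(\kappa \sim c\,r_{\textrm{L}}(\delta B/B)^{-2}\) in Gaussian units. The chosen inputs (a 1 GeV proton, \(B=1\,\mu \)G, \(\delta B/B=10^{-3}\)) are illustrative assumptions, not values taken from the text, and the function names are hypothetical.

```python
# Physical constants in Gaussian (cgs) units.
C_LIGHT = 2.99792458e10        # speed of light [cm/s]
E_CHARGE = 4.80320471e-10      # elementary charge [esu]
ERG_PER_GEV = 1.602176634e-3   # 1 GeV in erg


def gyroradius_cm(pc_gev, b_gauss, charge=E_CHARGE):
    """Gyroradius r_L = p c / (q B) for a particle with momentum p (given as pc in GeV)."""
    return pc_gev * ERG_PER_GEV / (charge * b_gauss)


def resonant_pc_gev(k_par, b_gauss, charge=E_CHARGE):
    """Minimum resonant momentum of equation (3), returned as p_min * c in GeV.

    p_min = q B / (c k_par), hence p_min c = q B / k_par (in erg).
    """
    return charge * b_gauss / k_par / ERG_PER_GEV


def isotropization_time_s(pc_gev, b_gauss, db_over_b):
    """t_i ~ Omega^-1 (dB/B)^-2 with Omega^-1 ~ r_L / c for v ~ c."""
    return gyroradius_cm(pc_gev, b_gauss) / C_LIGHT / db_over_b**2


def kappa_parallel(pc_gev, b_gauss, db_over_b):
    """Heuristic parallel diffusion coefficient kappa ~ c^2 t_i [cm^2/s]."""
    return C_LIGHT**2 * isotropization_time_s(pc_gev, b_gauss, db_over_b)


# Illustrative (assumed) parameters: 1 GeV proton, B = 1 microgauss, dB/B = 1e-3.
r_l = gyroradius_cm(1.0, 1e-6)
kappa = kappa_parallel(1.0, 1e-6, 1e-3)
```

With these assumed inputs the gyroradius is a few \(10^{12}\) cm and \(\kappa \) is of order \(10^{29}~\mathrm {cm^2~s^{-1}}\), which shows how strongly the result depends on the (poorly constrained) fluctuation amplitude through the factor \((\delta B/B)^{-2}\).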

2.1.3 Cosmic ray driven plasma instabilities

In the previous section, we discussed the effect that Alfvén wave interactions with CRs have on their motion through phase space. We implicitly assumed that the waves have been driven by some unspecified process and have not accounted for a possible backreaction of the CRs on wave propagation. Here, we show that this backreaction naturally results from various resonant CR-driven instabilities while we defer a discussion of the non-resonant hybrid instability (Bell 2004) to Sect. 2.2.2.

The nature of CR-driven plasma instabilities can be understood by studying the response of a plasma to resonant driving provided by streaming (and gyrating) CRs. Perturbing the Vlasov equation for the CR distribution function and deriving a Fokker-Planck equation for CR transport leads to the picture of quasi-linear diffusion of CRs in phase space (Schlickeiser 2002). Assuming an equilibrium configuration with a uniform background magnetic field and no net electric field, we can analyze the behavior of small perturbations by linearizing the system. The resulting coupled system of equations describes electromagnetic modes in the plasma that are excited by anisotropically drifting CRs and consists of the dispersion tensor applied to electric perturbations, \(T^{ij}\delta E^j=0\) with the boundary condition that these electric perturbations obey the linearized set of Maxwell’s equations in Fourier space, \(\varvec{k} \varvec{\times }\delta \varvec{E}=\omega \delta \varvec{B}\) and \(\varvec{k} \varvec{\cdot }\delta \varvec{B} = 0\). The dispersion tensor, \(T^{ij}\), is complicated in general (see Chapter 8 in Schlickeiser 2002) but simplifies for parallel propagating wave modesFootnote 10:

$$\begin{aligned} T^{ij} = \left( \begin{array}{ccc} T^{11} &{} 0 &{} 0 \\ 0 &{}T^{22} &{}T^{23} \\ 0 &{} -T^{23} &{}T^{22} \\ \end{array} \right) . \end{aligned}$$
(4)

The explicit form of the matrix elements is given in Shalaby et al. (2021, Appendix B) and we assume that the magnetic field and wave modes are aligned with the x axis, i.e., \(\varvec{B}_0 = B_0 \varvec{{\hat{x}}}\) and \(\varvec{k} = k\,\varvec{{\hat{x}}}\).

The stability of electromagnetic plasma modes is determined by the linear dispersion relation in Fourier space, which assesses whether a perturbation of a certain wave mode decays or grows over time. The dispersion relation is the determinant of the dispersion tensor, \(|T^{ij}| = T^{11} \left[ (T^{22})^2 + (T^{23})^2 \right] =0\). The solutions to \(T^{11}=0\) represent electrostatic wave modes for which only the parallel electric field is finite while the solutions to \(T^{22} \pm \textrm{i} T^{23} =0\) represent electromagnetic wave modes that are characterized by transverse electric and magnetic field components and a vanishing parallel electric field. Physically, the linear dispersion relation represents a relation between the complex wave frequency, \(\omega =\omega _{\textrm{r}}+ \textrm{i} \varGamma \), and the wave number k of waves propagating along the magnetic field:

$$\begin{aligned} D^\pm \equiv T^{22} \pm \textrm{i} T^{23} = 1-\frac{k^2 c^2}{\omega ^2} + \sum _s \chi _s^\pm = 0, \end{aligned}$$
(5)

where \(\chi _s^\pm \) is the linear plasma response for species s, which includes the background electrons and ions as well as the drifting CR electrons and ions (e.g., Shalaby et al. 2021). The circularly polarized eigenmodes of the background plasma relevant for CR scattering lie on the electron and ion cyclotron branches: on large scales with wave numbers \(k d_{\textrm{i}}\ll 1\), the electron cyclotron branch hosts forward-propagating Alfvén waves, which rotate at a rate \(\omega _{\textrm{r}}= kv_{\textrm{a}}\). These scales are larger than the ion skin depth, \(d_{\textrm{i}}=c/\omega _{\textrm{i}}\) (where \(\omega _{\textrm{i}}\) is the ion plasma frequency) and represent the MHD limit. The Alfvén waves turn into whistler wavesFootnote 11 at smaller scales for \(k d_{\textrm{i}}>1\) (with a dispersion relation \(\omega _{\textrm{r}} = k^2 d_{\textrm{i}}v_{\textrm{a}}\) for parallel whistlers) to approach electron cyclotron waves at scales smaller than the electron skin depth, \(k d_{\textrm{e}}=k d_{\textrm{i}}\sqrt{m_{\textrm{e}}/m_{\textrm{i}}}\gg 1\). Those electron cyclotron waves show a constant wave rotation rate \(\omega _{\textrm{r}} = |\varOmega _{\textrm{e}}|\) (see the upper dashed black line in Fig. 5). The ion cyclotron branch hosts backward-propagating Alfvén waves on scales \(k d_{\textrm{i}}\ll 1\), which directly turn into ion cyclotron waves on smaller scales with \(\omega _{\textrm{r}} = -\varOmega _{\textrm{i}}\) (see the lower dashed black line in Fig. 5). By contrast, a population of gyrating and drifting CRs along the mean magnetic field excites a CR ion–cyclotron wave, which is a rotating and propagating electromagnetic wave with the rotation rate \(\omega _{\textrm{r}}=-\varOmega _{\textrm{i}}+kv_\parallel \) in the background plasma frame, i.e., its rotation rate equals the Doppler-shifted CR ion–cyclotron frequency. 
Hence, it is no coincidence that these gyrating CRs also obey the gyro-resonance condition (1) for parallel wave modes, which in this case is interpreted such that the Doppler-shifted plasma wave frequency resonates with the CR gyro frequency.

A resonant CR-driven instability emerges if the rotating electromagnetic field of the CR ion–cyclotron wave as seen in the background frame equals one of the circularly polarized eigenmodes of the background plasma, namely those provided by the ion and electron cyclotron branches. As realized by Shalaby et al. (2023), this deep insight has a simple graphical interpretation, which is visualized in Fig. 5: the locations at which the rotation rate of the CR ion–cyclotron wave \(\omega _{\textrm{r}}=-\varOmega _{\textrm{i}}+kv_\parallel \) intersects the wave frequency \(\omega _{\textrm{r}}\) of the electron and ion cyclotron branches maximize the linear growth rate of a CR-driven instability. Interestingly, CR ion–cyclotron modes are driven unstable not only at the points of intersection with the background circularly polarized eigenmodes but also when the rotation frequency of a given CR ion–cyclotron mode is smaller in magnitude than that of the closest background mode (in \(\omega _{\textrm{r}}\)) of either the ion or the electron cyclotron branch (see Fig. 5; Shalaby et al. 2023).

Starting on large scales, the first two intersection points result from a resonance of the CR ion–cyclotron wave with the backward- and then the forward-propagating Alfvén wave, with \(\omega _{\textrm{r}}=\mp kv_{\textrm{a}}\), respectively. In consequence, the two peaks of the gyro-resonant instability (Kulsrud and Pearce 1969) emerge at wave numbers \(kd_{\textrm{i}}=(v_\parallel /v_{\textrm{a}}\pm 1)^{-1}\), where \(v_\parallel =v\mu \) is the parallel CR velocity component. This instability can be seen to the left in the growth rate plot in the bottom panel of Fig. 5. On smaller scales, streaming CRs excite the intermediate-scale instability (Shalaby et al. 2021), with the shortest unstable wave mode at \(kv_\parallel =|\varOmega _{\textrm{e,0}}|=eB/(m_{\textrm{e}}c)\), where \(\varOmega _{\textrm{e,0}}\) denotes the non-relativistic electron gyro frequency. Interestingly, the steepening of the wave rotation rate in the whistler regime causes the emergence of the intermediate-scale instability that extends to the electron cyclotron scale, where the real part of the electron cyclotron branch flattens again and approaches \(\varOmega _{\textrm{e}}\).
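The two gyro-resonant peak positions follow from intersecting the straight line \(\omega _{\textrm{r}}=-\varOmega _{\textrm{i}}+kv_\parallel \) with \(\omega _{\textrm{r}}=\mp kv_{\textrm{a}}\). The short sketch below does this in dimensionless units with \(\varOmega _{\textrm{i}}=d_{\textrm{i}}=v_{\textrm{a}}=1\), which is consistent because \(\varOmega _{\textrm{i}}d_{\textrm{i}}=v_{\textrm{a}}\) for the non-relativistic ion gyro frequency. The function names are hypothetical, and the restriction to \(v_\parallel >v_{\textrm{a}}\) is an assumption made so that both intersections lie at positive k.

```python
def gyro_resonant_peaks(vpar_over_va):
    """Wave numbers k*d_i where the CR ion-cyclotron line meets the Alfven branches.

    Units: Omega_i = d_i = v_a = 1 (uses Omega_i * d_i = v_a).
    Returns (k_backward, k_forward) from  -1 + k*v = -k  and  -1 + k*v = +k.
    """
    if vpar_over_va <= 1.0:
        # Assumption of this sketch: only super-Alfvenic parallel velocities,
        # otherwise the forward intersection is not at positive k.
        raise ValueError("requires v_parallel > v_a")
    k_backward = 1.0 / (vpar_over_va + 1.0)
    k_forward = 1.0 / (vpar_over_va - 1.0)
    return k_backward, k_forward


def cr_line(k, vpar_over_va):
    """Rotation rate of the CR ion-cyclotron wave, omega_r = -Omega_i + k v_par."""
    return -1.0 + k * vpar_over_va


# Parameters of Fig. 5: v_par / v_a = 2.7.
k_b, k_f = gyro_resonant_peaks(2.7)
```

For the value \(v_\parallel /v_{\textrm{a}}=2.7\) used in Fig. 5, the two peaks lie at \(kd_{\textrm{i}}=1/3.7\) and \(kd_{\textrm{i}}=1/1.7\).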

Fig. 5

Graphical visualization of resonant CR ion driven instabilities for reduced parameters \(v_{\textrm{a}}=10^{-4}c\), \(n_{\textrm{cr}}/n_{\textrm{i}}=10^{-6}\), \(v_\parallel /v_{\textrm{a}}= 0.9\sqrt{m_{\textrm{i}}/m_{\textrm{e}}}/2=2.7\), \(v_\perp =v_{\textrm{a}}\), and \(m_{\textrm{i}}=36\,m_{\textrm{e}}\) for visual clarity. Shown are wave rotation rate \(\omega _{\textrm{r}}\) (top panel) and instability growth rate in the linear regime (bottom panel) versus wave number times the ion skin depth \(d_{\textrm{i}}\). Resonant CR-driven instabilities are excited right at the intersection points (vertical red dashed lines) of the CR ion–cyclotron wave, \(\omega _{\textrm{r}}=-\varOmega _{\textrm{i}}+k v_\parallel \) (red), and the circularly polarized plasma waves (black dashed). At the gyroscale, there are two peaks, which correspond to the crossing of the CR ion–cyclotron wave with backward and forward Alfvén waves (left to right), \(\omega _{\textrm{r}}=\mp kv_{\textrm{a}}\), respectively (zoom-in panel). The electron cyclotron branch steepens at smaller scales to become a (parallel) whistler wave with \(\omega _{\textrm{r}} \propto k^2\) until it levels off at \(k d_{\textrm{e}}\gtrsim 1\) to turn into an electron cyclotron wave with \(\omega _{\textrm{r}}=|\varOmega _{\textrm{e}}|\) (top panel). The bottom panel shows the growth rate of the gyro-resonant instability on large scales and the intermediate-scale instability toward smaller scales (larger k values). Image reproduced with permission from Shalaby et al. (2023)

Physically, the gyro-resonant streaming instability describes the ability of CRs to excite resonant Alfvén waves (Kulsrud and Pearce 1969). While Fig. 5 was computed for gyro-tropic CRs with a specific pitch angle, in practice the CR population shows a distribution in \(\mu \). In this case, we can approximate the linear growth rate of the gyro-resonant instability for \(n_\textrm{cr}/n_{\textrm{i}}\ll 1\) (Kulsrud 2005; Thomas 2022) in the frame that is comoving with the gas:

$$\begin{aligned} \varGamma _{\textrm{gyro}} \approx \pi \int \textrm{d}^3p \, \frac{1-\mu ^2}{2} \frac{q^2 v}{p} \left[ p \frac{\partial f}{\partial p} + \left( \frac{v}{v_\textrm{a}} - \mu \right) \frac{\partial f}{\partial \mu }\right] \delta _{\textrm{D}}\left[ k (\mu v- v_\textrm{a}) \mp \varOmega _{\textrm{i}}\right] , \end{aligned}$$
(6)

where \(\delta _{\textrm{D}}\) denotes Dirac’s \(\delta \) distribution and represents the resonance condition (1) for CR-Alfvén wave interactions. However, this condition is now interpreted from the viewpoint of the Alfvén waves and solely selects those CRs that meet this condition of resonantly interacting with parallel-propagating Alfvén waves of wave number k to excite or damp them, depending on the overall sign of the momentum-derivatives of the CR distribution function in the square brackets of equation (6): a positive sign of the bracket signals instability while a negative sign indicates wave damping. Note that the combination of momentum derivatives in the square brackets reduces to \(\partial f/\partial \mu \) in the wave frame (see discussion surrounding Eq. 2 and Fig. 4). Importantly, in the gyro-averaged CR rest frame, the CR gyro-motion needs to match the wave rotation frequency so that the CRs resonate with the Doppler-shifted wave oscillation. For a power-law momentum distribution of CRs, \(f\propto p^{-\alpha }\), this growth rate can be simplified to yield (Kulsrud 2005)

$$\begin{aligned} \varGamma _{\textrm{gyro}} \sim \frac{\pi }{4}\,\varOmega _{\textrm{i,0}} C_\alpha \,\frac{n_\textrm{cr}(>p_{\textrm{min}})}{n_{\textrm{i}}}\,\left( \frac{v_{\textrm{d}}}{v_{\textrm{a}}} - 1\right) , \end{aligned}$$
(7)

where \(\varOmega _{\textrm{i,0}}=qB/(m_{\textrm{i}}c)\) is the non-relativistic ion gyro frequency, \(C_\alpha =(\alpha -3)/(\alpha -2)\) is a constant of order unity, \(p_{\textrm{min}}\) is defined in equation (3), \(n_\textrm{cr}(>p_{\textrm{min}})\propto p_{\textrm{min}}^{3-\alpha }\) is the number density of resonant CRs, \(n_{\textrm{i}}\) is the number density of thermal ions, and \(v_{\textrm{d}}\) is the CR drift velocity that defines a frame with respect to which the CR distribution is isotropic. Equation (7) shows that only super-Alfvénic CRs can drive Alfvén waves unstable with a growth rate that depends on the resonant CR-to-thermal number density ratio, which decreases towards larger CR momenta due to the diminishing spectra at high energies for \(\alpha >4\).
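As a rough numerical illustration of equation (7), the sketch below evaluates the gyro-resonant growth rate and shows the sign change at \(v_{\textrm{d}}=v_{\textrm{a}}\). The inputs (\(B=1\,\mu \)G, \(\alpha =4.7\), \(n_\textrm{cr}(>p_{\textrm{min}})/n_{\textrm{i}}=10^{-9}\), \(v_{\textrm{d}}=2v_{\textrm{a}}\)) are assumptions chosen only for illustration, protons are assumed as the ion species, and the function names are hypothetical.

```python
import math

# Gaussian (cgs) constants; protons are assumed as the thermal ion species.
C_LIGHT = 2.99792458e10        # [cm/s]
E_CHARGE = 4.80320471e-10      # [esu]
M_PROTON = 1.67262192e-24      # [g]


def ion_gyrofrequency(b_gauss, charge=E_CHARGE, mass=M_PROTON):
    """Non-relativistic ion gyro frequency Omega_i0 = q B / (m_i c) [rad/s]."""
    return charge * b_gauss / (mass * C_LIGHT)


def c_alpha(alpha):
    """Order-unity constant C_alpha = (alpha - 3) / (alpha - 2) of equation (7)."""
    if alpha <= 3.0:
        raise ValueError("equation (7) is used here only for alpha > 3")
    return (alpha - 3.0) / (alpha - 2.0)


def gyro_growth_rate(omega_i0, alpha, ncr_over_ni, vd_over_va):
    """Order-of-magnitude growth rate of equation (7) [1/s].

    Positive for v_d > v_a (wave growth), negative for v_d < v_a (damping).
    """
    return 0.25 * math.pi * omega_i0 * c_alpha(alpha) * ncr_over_ni * (vd_over_va - 1.0)


# Illustrative (assumed) inputs.
omega = ion_gyrofrequency(1e-6)
gamma = gyro_growth_rate(omega, alpha=4.7, ncr_over_ni=1e-9, vd_over_va=2.0)
```

Because the growth rate is linear in \(n_\textrm{cr}(>p_{\textrm{min}})/n_{\textrm{i}}\), the steep decline of the resonant CR number density with momentum translates directly into slower wave growth for higher-energy CRs.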

On smaller scales, CRs with a finite pitch angle resonantly excite parallel electromagnetic waves on scales between the ion and electron gyro-resonances through the intermediate-scale instability (Shalaby et al. 2021). As we can infer from Fig. 5, CR ions drive background ion–cyclotron modesFootnote 12 unstable, which are comoving with the CR ions. While the electron modes have the wrong sense of rotation to interact with the CR ions in the background frame, Lorentz transforming to a frame that is comoving with the CR ions changes the sense of rotation of these wave modes to adopt that of CR ions and hence to enable resonance. Provided that \(n_\textrm{cr}\ll n_{\textrm{i}}\) and \(v_{\textrm{a}}\ll c\), Shalaby et al. (2021) approximate the peak growth rate by

$$\begin{aligned} \frac{ \varGamma _{\textrm{inter}} }{\varOmega _{\textrm{i,0}} } \approx \left( \frac{n_\textrm{cr}}{n_{\textrm{i}}}\right) ^{3/4}+ \left( \frac{n_\textrm{cr}}{3n_{\textrm{i}}}\right) ^{1/3} \left( \frac{v_\parallel v_{\perp }}{v_{\textrm{a}}^2} \right) ^{2/3}, \end{aligned}$$
(8)

where \(v_\perp \) is the perpendicular CR velocity of a gyrotropic distribution. This intermediate-scale instability typically grows faster by more than an order of magnitude in comparison to the previously discussed gyro-resonant streaming instability. The two peaks of the intermediate-scale instability are approximately given by \(kc/\omega _{\textrm{i}} = v_\parallel /v_{\textrm{a}}\) and \(k c/ \omega _{\textrm{i}} = m_{\textrm{r}} v_{{\textrm{a}}}/v_\parallel -v_\parallel /v_{\textrm{a}} \), where \(\omega _{\textrm{i}} = (4\pi e^2 n_{\textrm{i}}/m_{\textrm{i}})^{1/2}\) is the ion plasma frequency, \(m_{\textrm{r}}=m_{\textrm{i}}/m_{\textrm{e}}\) is the ion-to-electron mass ratio, and \(k c/ \omega _{\textrm{i}} = m_{\textrm{r}} v_{{\textrm{a}}}/v_\parallel \) is the electron gyroscale. This amounts to a peak separation of

$$\begin{aligned} \frac{ \varDelta k c }{ \omega _{\textrm{i}} } \approx m_{\textrm{r}} \frac{v_{\textrm{a}}}{v_\parallel } - 2 \frac{ v_\parallel }{v_{\textrm{a}}} \qquad \Rightarrow \qquad \frac{ \varDelta k v_\parallel }{ \varOmega _{\textrm{i,0}} } \approx m_{\textrm{r}} - 2 \left( \frac{v_\parallel }{v_{\textrm{a}}} \right) ^2, \end{aligned}$$
(9)

which is an excellent fit for low values of \(v_\parallel /v_{\textrm{a}}\). For increasing \(v_\parallel /v_{\textrm{a}}\), the peak separation decreases to the point where both peaks merge at \(v_\parallel /v_{\textrm{a}}=\sqrt{m_{\textrm{r}}}/2\) (i.e., the factor 2 is replaced by 4 in equation 9). For larger CR drift speeds, the instability ceases to exist because there is no intersection of the CR ion–cyclotron wave with the circularly polarized electron cyclotron branch at the whistler scales. Using the graphical representation in Fig. 5, the slope of the CR ion–cyclotron wave is too steep to intersect and resonate with the overturning electron cyclotron branch towards small scales. While the gyro-resonant instability is governing CR transport in interstellar, circumgalactic, and intracluster media, the intermediate-scale instability may also play an important role in CR scattering and transport (Shalaby et al. 2021) as well as in pre-accelerating electrons at collisionless shocks to enable them to participate in diffusive shock acceleration (Shalaby et al. 2022, see also Sect. 2.2).
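The approximate expressions for the intermediate-scale instability, equation (8) and the peak positions leading to equation (9), are easy to evaluate directly. The following sketch does so for the reduced parameters quoted for Fig. 5 (\(n_\textrm{cr}/n_{\textrm{i}}=10^{-6}\), \(v_\parallel /v_{\textrm{a}}=2.7\), \(v_\perp =v_{\textrm{a}}\), \(m_{\textrm{r}}=36\)). The function names are hypothetical, and returning `None` above the merging threshold \(v_\parallel /v_{\textrm{a}}=\sqrt{m_{\textrm{r}}}/2\) is a design choice of this example.

```python
import math


def intermediate_growth_rate(ncr_over_ni, vpar_over_va, vperp_over_va):
    """Peak growth rate of equation (8) in units of Omega_i0.

    Valid (per the text) for n_cr << n_i and v_a << c.
    """
    term_density = ncr_over_ni ** 0.75
    term_drift = (ncr_over_ni / 3.0) ** (1.0 / 3.0) \
        * (vpar_over_va * vperp_over_va) ** (2.0 / 3.0)
    return term_density + term_drift


def intermediate_peaks(vpar_over_va, mass_ratio):
    """Approximate peak positions k*c/omega_i of the intermediate-scale instability.

    Returns (k_low, k_high), or None once v_par/v_a >= sqrt(m_r)/2, where the
    text states that the peaks merge and the instability ceases to exist.
    The expressions are a good fit only for low v_par/v_a.
    """
    if vpar_over_va >= 0.5 * math.sqrt(mass_ratio):
        return None
    k_low = vpar_over_va
    k_high = mass_ratio / vpar_over_va - vpar_over_va
    return k_low, k_high


# Reduced parameters quoted for Fig. 5.
rate = intermediate_growth_rate(1e-6, 2.7, 1.0)
k_low, k_high = intermediate_peaks(2.7, 36.0)
separation = k_high - k_low   # equals m_r * v_a/v_par - 2 * v_par/v_a, equation (9)
```

For these parameters the second term of equation (8) dominates by more than two orders of magnitude, so the growth rate scales approximately as \((n_\textrm{cr}/n_{\textrm{i}})^{1/3}\).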

2.2 Cosmic ray acceleration and escape from sources

Provided the CR distribution exhibits only a small degree of anisotropy, we can expand the CR distribution in a suitable system of basis functions (Legendre or Taylor polynomials) and truncate the expansion at finite order (e.g., after the dipolar or quadrupolar anisotropy) to derive a moment-based description of CR transport. This is governed by the wave–particle scattering rate, which itself depends on the amplitudes of resonant waves that result from the interplay of wave growth and damping. The resulting system of equations can be readily applied to the problem of galactic CR propagation, as we will detail in Sect. 2.3. In situations where large CR fluxes (and associated anisotropies) are present, which is realized in the vicinity of shock fronts, one must solve the full kinetic equations and study the excitation of plasma instabilities in the regime of large CR fluxes, which scatter the particles and thereby reduce the CR flux, as will be discussed in this section.

2.2.1 Cosmic ray sources

The sources of Galactic CRs are generally assumed to be SNRs in the Milky Way. While there is an ongoing debate about whether this source class is solely responsible for all CRs up to the “knee” in the CR momentum spectrum (at \(E\approx 3\times 10^{15}\) eV for CR protons while heavier CR ions have a break at larger energies) or whether other sources contribute some flux, energetic considerations clearly argue for a dominant contribution of SNRs for generating the pressure-carrying (GeV–TeV) CRs. This was suggested in a visionary paper by Baade and Zwicky (1934) and made quantitative by Ginzburg and Syrovatskii (1964); here we will sketch the argument by estimating the luminosity required to supply all the Galactic CRs and balance their escape losses:

$$\begin{aligned} \mathcal {L}_\textrm{cr}=\frac{V \varepsilon _\textrm{cr}}{\tau _{\textrm{esc}}}\sim 3\times 10^{40}~ \mathrm {erg~s}^{-1}, \end{aligned}$$
(10)

where V is the volume of the thick galactic disk (“the CR scattering halo”) with half-height \(H\approx 2\) kpc and radius \(R\approx 8\) kpc, \(\varepsilon _\textrm{cr}\approx 0.8\,\mathrm {eV~cm}^{-3}\) is the CR energy density, and \(\tau _{\textrm{esc}}\approx 3\times 10^7\) yr is the diffusive escape time (as inferred from the boron-to-carbon ratio of CRs and assuming a mean hydrogen density along the CR column in the scattering halo of \({\bar{n}}_{\textrm{H}}\approx 0.1~\textrm{cm}^{-3}\), see Sect. 4.1.1). This power can be delivered by Galactic SNe if 10% of their kinetic energy is transferred to CRs:

$$\begin{aligned} \mathcal {L}_\textrm{cr}\sim 0.1\mathcal {L}_{\textrm{sn}}\sim \frac{0.1E_{\textrm{sn}}}{\tau _{\textrm{sn}}}\sim \frac{0.1\times 10^{51}\textrm{erg}}{100\,\textrm{yr}}\sim 3\times 10^{40}~ \mathrm {erg~s}^{-1}, \end{aligned}$$
(11)

where \(\tau _{\textrm{sn}}\) is the average time scale between Galactic SNe. Within the Galaxy, various other (less prominent) sources of CRs exist. These include termination shocks found at stellar wind bubbles and superbubbles, shocks resulting from the collision of stellar winds in binary systems, bow-shocks generated by massive runaway stars, and pulsar wind nebulae. Time-averaging the mass loss rate times the square of the terminal velocity of stellar winds throughout different stages such as the main sequence, red supergiant, and Wolf-Rayet phases, the collective stellar wind luminosity from all massive stars in the Galaxy amounts to approximately \(\mathcal {L}_{\textrm{w}}\approx 1\times 10^{41}~\mathrm {erg~s}^{-1}\) (Seo et al. 2018), which is about 1/3 of the power of SN explosions, \(\mathcal {L}_{\textrm{sn}}\approx 3\times 10^{41}~\mathrm {erg~s}^{-1}\). Internal recollimation shocks in relativistic AGN jets as well as the termination or back-flow shocks are other important sites of CR acceleration, yet primarily for injecting CRs into the intergalactic and intracluster plasmas.
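The energy budget of equations (10) and (11) can be reproduced with a short calculation. The sketch below uses the values quoted in the text and assumes that the scattering halo is a cylinder of radius R and full height 2H; the function names and the rounded unit conversions are choices made for this example.

```python
import math

# Unit conversions (rounded).
CM_PER_KPC = 3.0857e21
ERG_PER_EV = 1.602176634e-12
SEC_PER_YR = 3.156e7


def cr_luminosity(half_height_kpc=2.0, radius_kpc=8.0,
                  eps_cr_ev_cm3=0.8, tau_esc_yr=3e7):
    """CR luminosity L_cr = V * eps_cr / tau_esc of equation (10) [erg/s].

    Assumption: the scattering halo is a cylinder with volume pi R^2 (2 H).
    """
    volume = math.pi * (radius_kpc * CM_PER_KPC) ** 2 \
        * 2.0 * half_height_kpc * CM_PER_KPC
    return volume * eps_cr_ev_cm3 * ERG_PER_EV / (tau_esc_yr * SEC_PER_YR)


def sn_cr_luminosity(efficiency=0.1, e_sn_erg=1e51, tau_sn_yr=100.0):
    """Power delivered to CRs by supernovae, equation (11) [erg/s]."""
    return efficiency * e_sn_erg / (tau_sn_yr * SEC_PER_YR)


l_cr = cr_luminosity()        # required luminosity, of order 3e40 erg/s
l_sn_cr = sn_cr_luminosity()  # supplied luminosity at 10% efficiency
```

Both numbers come out near \(3\times 10^{40}~\mathrm {erg~s}^{-1}\), so an acceleration efficiency of about 10% is sufficient under these assumptions.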

2.2.2 Particle acceleration and magnetic amplification

General picture. We now focus on the mechanism of diffusive shock acceleration that energizes CR electrons and protons at astrophysical non-relativistic shocks (Krymskii 1977; Axford et al. 1978; Bell 1978a, b; Blandford and Ostriker 1978) and discuss the popular case of an SNR shock that causes the surrounding ISM to expand radially. Since the mean free path in the hot phase of the ISM is larger than the SNR, the shock wave cannot set the ISM in motion by momentum transfer from individual particle-particle collisions, but only by particles scattering at electromagnetic fluctuations in the surrounding plasma that are driven unstable by the expanding flux of plasma (e.g., for negligible magnetization through the Weibel instability as shown by Medvedev and Loeb 1999, while other electromagnetic plasma instabilities dominate the magnetized case). Such a shock wave is referred to as a collisionless shock wave. Essential to the CR acceleration process is the amplification of standing Alfvén waves by currents driven by high-energy ions (Bell 2004) that outrun the shock towards the upstream region to generate a foreshock precursor (see Fig. 6 for a schematic drawing). This sources an (electron) return current in the thermal plasma that exponentially grows a spectrum of helical Alfvén waves with wavelengths smaller than the gyro radii of the high-energy protons. These waves act as a mediator that absorbs energy and can effectively scatter resonant protons and electrons at lower energies (Fig. 3) that are diffusing into the precursor. The scattering rate is close to, or below, the Bohm limit for a range of energies (Stage et al. 2006), i.e. they are scattered once per gyro orbit, implying that scattering in self-excited turbulence significantly reduces the diffusion coefficient (Reville et al. 2008). 
This leads to isotropization of charged particles in the respective reference frames (before and after the shock front) so that they pick up momentum through head-on scatterings with the magnetic field without recoil to (re)cross the shock front.Footnote 13 In each of these cycles (downstream, upstream and back downstream), a part of the accelerated particles is advected with the plasma behind the shock front, so that they do not participate in the further acceleration process. Many acceleration cycles, i.e., repeated reflections on both sides of the shock wave, generate relativistic CR particles. As long as the gyroradius is smaller than the radius of the shock front, this scale-invariant acceleration process produces a (non-thermal) power law in the CR momentum distribution with an efficiency that scales with \(v_{\textrm{s}}/c\), where \(v_{\textrm{s}}\) is the shock velocity in the laboratory frame. This process is sometimes referred to as first-order Fermi acceleration, even though Fermi did not discover it. Effectively, all particle acceleration at shocks is rooted in the electrostatic field \({{\varvec{E}}}\) of the shock, which slows down the incoming flow and causes the global velocity divergence.
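The power law mentioned above can be made concrete with the standard test-particle result of diffusive shock acceleration, \(f\propto p^{-\alpha }\) with \(\alpha =3r/(r-1)\) for a shock compression ratio r. This formula and the Rankine-Hugoniot compression ratio are textbook relations that are not stated in this section, so the sketch below should be read as an illustration under those assumptions (ideal gas with adiabatic index 5/3, no CR back-reaction); the function names are hypothetical.

```python
def compression_ratio(mach, gamma=5.0 / 3.0):
    """Rankine-Hugoniot density jump of a hydrodynamic shock with Mach number `mach`."""
    m2 = mach * mach
    return (gamma + 1.0) * m2 / ((gamma - 1.0) * m2 + 2.0)


def dsa_momentum_index(r):
    """Test-particle slope alpha of f(p) ~ p^-alpha for compression ratio r."""
    if r <= 1.0:
        raise ValueError("a shock requires a compression ratio r > 1")
    return 3.0 * r / (r - 1.0)


def nonrel_energy_index(alpha):
    """Slope s of dN/dE ~ E^-s for non-relativistic particles (E = p^2 / 2m).

    dN/dE ~ p^2 f(p) dp/dE ~ p^(1 - alpha) ~ E^(-(alpha - 1) / 2).
    """
    return 0.5 * (alpha - 1.0)


# Strong-shock limit (r -> 4 for gamma = 5/3) and a Mach 20 shock.
alpha_strong = dsa_momentum_index(4.0)
alpha_mach20 = dsa_momentum_index(compression_ratio(20.0))
```

In the strong-shock limit this gives \(f\propto p^{-4}\), which corresponds to an energy spectrum \(\propto E^{-1.5}\) for non-relativistic particles.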

Fig. 6

Density structure of a quasi-parallel collisionless shock that accelerates CRs. The plasma is entering from the right and is gradually slowed down over the “foreshock precursor” where electron and ion-driven plasma instabilities drive plasma turbulence that scatters ionized particles. As the plasma enters the shock transition of strong electromagnetic fluctuations that stretches across several ion gyro radii \(r_{\textrm{gi}}\), frequent particle-wave scattering ensues. This slows down the incoming flow and steeply increases the plasma density in the form of a “ramp” and an “overshoot”, after which the plasma settles to a downstream state (Bohdan, private comm.)

As originally pointed out by Fermi (1949, 1954), particles can also be accelerated without a shock through interacting with externally driven turbulence. Resonant scattering off of moving magnetic irregularities, with \({{\varvec{E}}}=\textbf{0}\) in the local rest frame, causes isotropic and elastic scattering in the scattering center rest frame. As a result, there is a momentum gain for head-on collisions and a momentum loss for tail-on collisions in the laboratory frame. On average, the mean energy is conserved but because there are statistically more head-on than tail-on collisions due to the particle motion, this causes the width of the particle distribution to increase so that a small fraction of particles in the high-energy tail experiences acceleration. In the quasi-linear picture of particle transport in a bath of linear (Alfvén, magnetosonic) waves (Kennel 1966; Skilling 1971, 1975; Schlickeiser 2002; Shalchi 2009), this can be described by a diffusion process in momentum space, which models the energy gain through resonant interactions. This process is comparably inefficient because it scales as \((v_{\textrm{w}}/c)^2\), i.e., it is only second order in the dimensionless wave velocity of the plasma wave scattering centers \(v_{\textrm{w}}\) in the laboratory frame. In the following, we will not consider this second-order Fermi process for particle acceleration (while it may play an important role in particle transport).

The (microscopic) processes at shocks are governed by plasma kinetics and can be studied by means of particle-in-cell (PIC) simulations that use macro particles to represent electrons and ions of the thermal plasma and the energetic CRs. According to Maxwell’s equations, moving charges source currents that perturb electromagnetic fields. These generate Lorentz forces that accelerate charged particles and hence modify the charge distribution and currents. The system is evolved by numerically iterating this loop on a fraction of the electron plasma timescale for macro-particles representing the individual elementary particles of a plasma. This methodology is ideal for exploring kinetic instabilities in the collisionless plasma around a shock. Prominent PIC codes include Tristan-MP (Spitkovsky 2005), VPIC (Bowers et al. 2008), photo-plasma (Haugbølle et al. 2013), SHARP (Shalaby et al. 2017, 2021), and Warp-X (Vay et al. 2018); see also Pohl et al. (2020) for a review. However, the need to resolve the electron plasma timescale limits the physical length scale and total simulation time of PIC simulations to microscopic dimensions. In order to allow for longer run times of the simulations and/or multiple spatial dimensions, the electron timescale is integrated out in hybrid-PIC simulations, where the electron population is represented by an adiabatic fluid and the ions are treated as macro particles in the kinetic PIC model. Approaching even larger time and length scales (to study certain properties of CR transport) requires treating the entire background either as an MHD fluid or as separate fluids for each particle species, which are coupled to the PIC component representing CR particles (see Sect. 2.3.5). This approach is, however, not well suited to study CR acceleration at shocks as this would require adopting a recipe for injecting non-thermal particles from the thermal (MHD) fluid, which then participate in the diffusive shock acceleration process.
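To make the particle half of the PIC loop more tangible, the sketch below implements a single Lorentz-force update with the Boris scheme, a widely used particle pusher. The text does not specify which integrator the cited codes employ, so this choice is an assumption made for illustration; the example is non-relativistic, uses normalized units in which \(\textrm{d}\varvec{v}/\textrm{d}t=(q/m)(\varvec{E}+\varvec{v}\varvec{\times }\varvec{B})\), and omits the field solve and current deposition entirely.

```python
import numpy as np


def boris_push(v, e_field, b_field, q_over_m, dt):
    """Advance a velocity by one time step with the (non-relativistic) Boris scheme.

    Normalized units are assumed: dv/dt = (q/m) * (E + v x B).
    The update splits into a half electric kick, a magnetic rotation,
    and a second half electric kick.
    """
    half_kick = 0.5 * q_over_m * dt * e_field
    v_minus = v + half_kick                      # first half acceleration by E
    t_vec = 0.5 * q_over_m * dt * b_field        # rotation vector
    s_vec = 2.0 * t_vec / (1.0 + np.dot(t_vec, t_vec))
    v_prime = v_minus + np.cross(v_minus, t_vec)
    v_plus = v_minus + np.cross(v_prime, s_vec)  # rotation about B preserves |v|
    return v_plus + half_kick                    # second half acceleration by E


def gyrate(v0, b_field, q_over_m, dt, n_steps):
    """Follow a particle in a uniform magnetic field without an electric field."""
    v = np.array(v0, dtype=float)
    zero_e = np.zeros(3)
    for _ in range(n_steps):
        v = boris_push(v, zero_e, b_field, q_over_m, dt)
    return v


# Example: gyration about B = z-hat; the speed should stay constant.
v_start = np.array([1.0, 0.0, 0.5])
v_end = gyrate(v_start, np.array([0.0, 0.0, 1.0]), q_over_m=1.0, dt=0.1, n_steps=1000)
```

In a full PIC step this push would be followed by moving the particles, depositing their currents on the grid, and advancing the fields with Maxwell’s equations before the loop repeats.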

Fig. 7

Hybrid-PIC simulation of CR ion acceleration at a collisionless, non-relativistic strong shock. The top panel shows the downstream ion energy spectrum of a quasi-parallel shock, color coded by different times. The thermal distribution can be accurately described by a Maxwellian distribution with a temperature that is 80% of the expected temperature for a shock (with a Mach number of 20) that does not accelerate particles (dashed line). The remaining energy is shared among magnetic turbulence and CRs that follow a power-law spectrum, which has an increasing maximum energy with time. To demonstrate the consistency with the theoretically predicted energy scaling in the non-relativistic regime of diffusive shock acceleration of test particles, the spectrum is scaled by \(E^{1.5}\). This corresponds to a momentum spectrum \(\propto p^{-4}\) (see inset). The bottom two panels show the magnitude of the total magnetic field for \(\mathcal {M}=50\) shocks and a magnetic obliquity of \(\theta _{Bn}=0^\circ \) and \(80^\circ \), respectively (see gray arrows), implying that magnetic field amplification and thus CR acceleration operate efficiently only in quasi-parallel shocks. Image reproduced with permission from Caprioli and Spitkovsky (2014a), copyright by AAS

Fig. 8

Visualization of the underlying principle of Bell’s non-resonant streaming instability. The CR current, \({{\varvec{j}}}_{\textrm{d}}\), induces a return current in the background electrons, \(-{{\varvec{j}}}_{\textrm{d}}\), which amplifies a helical magnetic perturbation and stretches it via the Lorentz force \({{\varvec{F}}} = -({{\varvec{j}}}_{\text {d}} \varvec{\times } {{\varvec{B}}})\,c^{-1}\). Image reproduced with permission from Zirakashvili et al. (2008), copyright by AAS

Ion acceleration at shocks. Hybrid-PIC simulations with large computational domains and long run times demonstrate the formation of non-thermal tails at strong, non-relativistic shocks. While the efficiency of ion acceleration far into the relativistic regime approaches zero for quasi-perpendicular shocks (with magnetic fields nearly perpendicular to the shock normal, \(\theta _{Bn}\approx 90^\circ \)), it is maximized for quasi-parallel shock geometries (Caprioli and Spitkovsky 2014a, b; see Fig. 7, bottom panels) because of efficient magnetic field amplification through the Bell instability (Bell 2004). This non-resonant instability is the dominant CR streaming instability for sufficiently large CR fluxes, i.e., when the CR drift speed obeys (Shalaby et al. 2021)

$$\begin{aligned} v_{\textrm{d}}> \left( \frac{3^{1/2}\gamma n_{\textrm{cr}}}{4 n_{\textrm{i}}}\right) ^{-2/3}v_{\textrm{a}}, \end{aligned}$$
(12)

where \(\gamma \) is the Lorentz factor of CR ions. This can be realized in quasi-parallel shock configurations where the acceleration process at the shock transition launches a strong CR flux towards the upstream along the mean magnetic field. The positive CR current induces a return current in the background electrons, which destabilizes helical magnetic field fluctuations. The associated Lorentz force pulls the field lines outwards (Fig. 8), so that they start to approach each other, which thus exponentially amplifies the circularly polarized magnetic field (Bell 2013). Because CRs with a broad spectrum of momenta grow this field incoherently, there is a spectrum of Alfvénic turbulence generated on scales much smaller than the CR gyroradii. Clearly, this instability relies on a CR current that is aligned with the mean magnetic field. Thus, the magnetic amplification process saturates once the highest energy CRs become magnetized, i.e., once their trajectories start to be bent in the amplified field. This happens when the amplified magnetic energy density is comparable to the CR energy flux density at the shock divided by the light speed, yielding a saturated value for the amplified magnetic energy density (Bell 2004; Niemiec et al. 2008; Riquelme and Spitkovsky 2009; Ohira et al. 2009; Gargaté et al. 2010; Blasi et al. 2015; Zacharegkas et al. 2022):

$$\begin{aligned} \varepsilon _{B,\textrm{sat}} \sim \frac{1}{2}\frac{v_{\textrm{s}}}{c}\,\varepsilon _\textrm{cr}, \end{aligned}$$
(13)

where \(v_{\textrm{s}}\) is the shock speed, which is equal to the CR drift velocity relative to the upstream plasma.
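
To get a feeling for the field strength implied by Eq. (13), the short sketch below converts the saturated magnetic energy density into a field strength via \(\varepsilon _B=B^2/(8\pi )\) in Gaussian units. The input values are illustrative assumptions that are not given in the text: a CR energy density equal to 10% of the upstream ram pressure \(\rho v_{\textrm{s}}^2\), a hydrogen number density of \(1~\textrm{cm}^{-3}\), and a shock speed of \(5000~\mathrm {km~s}^{-1}\). The function name `saturated_field` is hypothetical.

```python
import math

C_LIGHT = 2.998e10        # speed of light [cm/s]
M_PROTON = 1.6726e-24     # proton mass [g]


def saturated_field(eps_cr, v_shock):
    """Saturated magnetic field strength [G] implied by Eq. (13).

    eps_cr  : CR energy density [erg cm^-3]
    v_shock : shock speed [cm/s]
    """
    eps_b = 0.5 * (v_shock / C_LIGHT) * eps_cr   # Eq. (13)
    return math.sqrt(8.0 * math.pi * eps_b)      # eps_B = B^2 / (8 pi)


# Illustrative (assumed) inputs, see the lead-in paragraph.
v_s = 5.0e8                                      # 5000 km/s in cm/s
eps_cr = 0.1 * M_PROTON * 1.0 * v_s**2           # 10% of rho v_s^2 for n = 1 cm^-3
b_sat_microgauss = saturated_field(eps_cr, v_s) / 1.0e-6
```

With these assumed inputs, `b_sat_microgauss` comes out at roughly \(90~\mu \)G, far above typical interstellar field strengths of a few \(\mu \)G. The result scales as \(B_{\textrm{sat}}\propto (v_{\textrm{s}}\,\varepsilon _\textrm{cr})^{1/2}\).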

The average distance that energetic ions travel before experiencing pitch-angle scattering is similar to their gyroradii calculated in the turbulence they generate. In the case of moderately strong shocks, the magnetic field amplification can be characterized in the quasi-linear regime. This implies that all particles diffuse with their self-generated diffusion coefficient and undergo scattering once per gyro orbit (i.e., they diffuse in the Bohm limit). By contrast, in the case of very strong shocks, the magnetic field undergoes significant amplification, reaching non-linear levels. Most of the magnetic energy is concentrated in modes with wavelengths similar to the gyroradii of the highest-energy ions. As a result, only those particles undergo Bohm-like diffusion, while others scatter less efficiently (Caprioli and Spitkovsky 2014c). The evolution of Bell’s (2004) instability at high-Mach-number, quasi-parallel shocks is qualitatively similar in two- and three-dimensional simulations (van Marle et al. 2019).

Interestingly, quasi-parallel shocks are not stationary but reform quasi-periodically on ion cyclotron timescales. This indicates that when ions encounter the steepest gradient of the shock discontinuity, they undergo specular reflection. This only happens for a quarter of the time. As a result, those ions are energized via shock-drift accelerationFootnote 14 and only a fraction of those gain enough energy to be injected into the process of diffusive shock acceleration (Caprioli et al. 2015). Hence, the popular thermal leakage model, in which all particles above a critical momentum in the tail of the Maxwell-Boltzmann distribution are accelerated, is likely too simplified to capture the correct physics. Energy transfer to a non-thermal (relativistic) particle population modifies shocks: the density jump at the shock transition increases because the relativistic (CR) population lowers the adiabatic index and thus raises the compressibility. As a result, flatter CR spectra should emerge at high energies in comparison to the test-particle prediction \(\propto p^{-4}\) (Drury and Voelk 1981; Malkov and Drury 2001). This is in contradiction to the steeper slopes inferred from the non-thermal emission at SNRs, which imply momentum spectra \(\propto p^{-4.3}\). Possible explanations for this spectral steepening include (i) that the energy required to turbulently amplify the magnetic field during particle acceleration at shocks is extracted from the CR population, which steepens the CR energy spectrum (Bell et al. 2019), (ii) that CRs potentially isotropize in the frame of Alfvénic turbulence downstream of the shock in a region known as the “postcursor” so that they would be advected downstream with this turbulence, implying a steepening of the CR momentum spectrum despite the density jump exceeding the canonical value of four for a non-relativistic gas (Caprioli et al. 2020), and (iii) varying magnetic obliquity along the shock surface (Hanusch et al. 2019).

Electron acceleration at shocks. By contrast, the gyroradii of electrons are smaller than those of ions by the mass ratio, \(r_{\textrm{e}}/r_{\textrm{i}}=m_{\textrm{e}}/m_{\textrm{i}}\), so that they random walk through the shock transition and do not see a coherent electrostatic shock potential. This implies that thermal electrons cannot directly participate in the diffusive shock acceleration process that accelerates electrons to highly relativistic energies. Instead, they require a pre-acceleration process, which increases their momentum by a factor \(\sim (m_{\textrm{i}}/m_{\textrm{e}})(c/v_{\textrm{s}})\sim 3\times 10^4\), thereby constituting the famous “electron injection problem” at shocks. Several processes have been suggested that differ depending on the magnetic obliquity in the upstream. Quasi-perpendicular shocks (with an obliquity \(\theta _{Bn}\sim 90^\circ \)) are characterized by a narrow shock transition so that particles cannot escape the shock and are (inefficiently) accelerated via shock-surfing (Shimada and Hoshino 2000; Hoshino and Shimada 2002; Bohdan et al. 2019a, b), magnetic reconnection (Matsumoto et al. 2015; Bohdan et al. 2020), stochastic Fermi acceleration (Matsumoto et al. 2017; Bohdan et al. 2019b), and compressional heating. In oblique shocks (with an obliquity \(\theta _{Bn}\approx 50^\circ - 70^\circ \)), electrons can escape the shock transition region to form the electron foreshock where they generate electrostatic electron-acoustic waves far upstream and whistler waves closer to the shock that can accelerate electrons (Xu et al. 2020; Morris et al. 2022; Bohdan et al. 2022). In addition, shock-surfing, magnetic reconnection, and stochastic shock drift acceleration energize electrons (Matsumoto et al. 2017; Katou and Amano 2019; Amano et al. 2020).

However, by far the most efficient electron accelerators are quasi-parallel shocks (\(\theta _{Bn}\lesssim 50^\circ \)) where ions can escape the shock transition region, forming the ion foreshock (Park et al. 2015; Hanusch et al. 2020; Arbutina and Zeković 2021). In particular, the recently found intermediate-scale instability (Shalaby et al. 2021) provides a natural way to produce large-amplitude electromagnetic fluctuations in parallel shocks. The instability destabilizes ion–cyclotron waves that are comoving with the upstream plasma at the shock front (see Sect. 2.1.3). These unstable ion–cyclotron modes scatter the electrons parallel to the magnetic field so that some of them get accelerated to the energies required for injection into diffusive shock acceleration (Shalaby et al. 2022). However, these PIC simulations are numerically very challenging, which limits the achievable simulation time and/or dimensionality of the problem. Hence, we do not yet have a complete picture of the kinetic plasma physics that is responsible for particle acceleration.

Meso-scale models of shock acceleration. There is an enormous range of scales between an entire SNR (with a typical radius of \(r\sim 3~\textrm{pc}\approx 10^{19}\) cm) and typical scales of a plasma. In a cold plasma, the ion and electron distributions oscillate at their characteristic plasma frequencies, \(\omega _{\textrm{i,e}} = (4\pi e^2 n_{\textrm{i,e}}/m_{\textrm{i,e}})^{1/2}\), which implies corresponding ion and electron skin depths of \(d_{\textrm{i}} = c/\omega _{\textrm{i}}\sim 2\times 10^7~n_{{\textrm{i,}}0}^{-1/2}~\textrm{cm}\) and \(d_{\textrm{e}} = c/\omega _{\textrm{e}}\sim 5\times 10^5~n_{{\textrm{e,}}0}^{-1/2}~\textrm{cm}\), where \(n_{{\textrm{e,}}0}\) and \(n_{{\textrm{i,}}0}\) denote the electron and ion number densities in units of \(1~\textrm{cm}^{-3}\). Those length and time scales need to be resolved by PIC models of non-relativistic shocks at SN blast waves, which have typical shock transition widths of a few \(d_{\textrm{i}}\). In consequence, these PIC simulations can only afford a limited simulation run time, typically corresponding to \(300~ \varOmega _0^{-1}=300\, m_{\textrm{p}}c/(eB)\sim 9 B_{\mu \textrm{G}}^{-1}~\textrm{hours}\) of physical time, and rarely model two or even three spatial dimensions, though hybrid-PIC simulations are able to significantly relax these constraints. To bridge this range in scales, several different meso-scale approaches have been developed, which aim at answering different open questions of the CR shock acceleration problem. The answer to the question of whether SNR shocks are able to accelerate CRs to the “knee” at \(\sim 3\) PeV (for CR protons and greater energies for heavier ions) depends on the specific properties of the SNR shock, the CR diffusion coefficient, and ultimately on the magnetic amplification upstream of the shock.
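
The numbers quoted above follow directly from the stated formulas. The sketch below evaluates them in Gaussian units for a density of \(1~\textrm{cm}^{-3}\) and a field of \(1~\mu \)G; the function and variable names are illustrative and the physical constants are rounded.

```python
import math

E_CHARGE = 4.8032e-10     # elementary charge [esu]
M_ELECTRON = 9.1094e-28   # electron mass [g]
M_PROTON = 1.6726e-24     # proton mass [g]
C_LIGHT = 2.998e10        # speed of light [cm/s]
PARSEC = 3.0857e18        # parsec [cm]


def skin_depth(n, mass):
    """Skin depth c / omega_p [cm] for a number density n [cm^-3]."""
    omega_p = math.sqrt(4.0 * math.pi * E_CHARGE**2 * n / mass)
    return C_LIGHT / omega_p


def inverse_gyrofrequency(b_gauss):
    """Inverse proton cyclotron frequency m_p c / (e B) [s]."""
    return M_PROTON * C_LIGHT / (E_CHARGE * b_gauss)


d_i = skin_depth(1.0, M_PROTON)          # ion skin depth [cm]
d_e = skin_depth(1.0, M_ELECTRON)        # electron skin depth [cm]
run_time_hours = 300.0 * inverse_gyrofrequency(1.0e-6) / 3600.0
scale_ratio = 3.0 * PARSEC / d_i         # SNR radius in units of d_i
```

The resulting `scale_ratio` is of order \(10^{11}\) ion skin depths per SNR radius, which quantifies the gap that the meso-scale approaches have to bridge.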

To demonstrate this, we balance the acceleration time against the lifetime of the SNR and provide an order-of-magnitude argument that reaching the knee is only possible if the magnetic field is amplified at the shock to levels of \(100~\mu \)G (Lagage and Cesarsky 1983a, b; Bell 2013). Assuming that the CR number density \(n_{\textrm{cr}}\) in a small energy range has a uniform diffusion coefficient ahead of the shock, we can balance the CR diffusion flux \(-\kappa \partial n_{\textrm{cr}}/\partial x\) with the advective flux \(v_{\textrm{s}}n_{\textrm{cr}}\) to obtain an exponential precursor in the immediate upstream of the shock: \(n_{\textrm{cr}}=n_{\textrm{s}}\exp (-x/L)\), where x is the coordinate reaching from the shock to the upstream, \(L=\kappa /v_{\textrm{s}}\) is the precursor scale height, \(n_{\textrm{s}}\) is the CR number density at the shock discontinuity, and \(v_{\textrm{s}}\) is the shock velocity. The CR number in the precursor region is given by \(N=n_{\textrm{s}}LA\), where A is the unit shock surface area. The rate at which CRs transition across the shock boundary from the upstream to the downstream and back to the upstream can be calculated from kinetic theory to yield \({\dot{N}}=n_{\textrm{s}}cA/4\). Combining these estimates enables us to estimate the average residency time of CRs ahead of the shock between crossings, \(\varDelta t=N/{\dot{N}}=4\kappa /(c v_{\textrm{s}})\). The CR acceleration rate is then given by \(\textrm{d}E/\textrm{d}t\approx \varDelta E/\varDelta t=E v_{\textrm{s}}^2/(4\kappa )\), where we have used the average fractional energy gain of CRs per shock transition, \(\varDelta E/E=v_{\textrm{s}}/c\) (Bell 2013). This simple calculation only accounts for the time the CRs spend in the upstream, which is reasonable if the magnetic field in the downstream region is increased due to shock compression and if it is highly turbulent, implying a reduced diffusion coefficient. 
Assuming that additionally accounting for the time CRs spend downstream doubles the total acceleration time yields \(\tau _{\textrm{acc}}=8\kappa /v_{\textrm{s}}^2\), which can be equated to the SNR lifetime \(\tau \) to obtain a maximum CR energy, \(E_{\textrm{max}}\). We use the expression for the diffusion coefficient, \(\kappa =c\lambda /3\), where \(\lambda \) is the CR scattering mean free path, which we express in units of the CR gyroradius, \(r_{\textrm{L}}= E/(ZeB)\). Hence, the maximum CR energy is

$$\begin{aligned} E_{\textrm{max}}=\frac{3ZeB}{8c} \left( \frac{\lambda }{r_{\textrm{L}}}\right) ^{-1} \tau v_{\textrm{s}}^2 =3\times 10^{15}\,\left( \frac{\lambda }{r_{\textrm{L}}}\right) ^{-1} \left( \frac{v_{\textrm{s}}}{5000~\mathrm {km~s}^{-1}}\right) ^2 \,B_{2}\tau _{3}~\textrm{eV}, \end{aligned}$$
(14)

where \(B_{2}\) is the magnetic field in units of \(100\,\mu \)G, \(\tau _{3}\) is the SNR age in units of 1000 years, and we adopted CR protons with \(Z=1\). Because \(\lambda \) cannot be smaller than \(r_{\textrm{L}}\) and because \(\tau _3\approx 0.4\) for historic SNRs such as Cas A, Kepler or Tycho, it is a challenge to reach the CR “knee” (Bell 2013). In particular, reaching this CR “knee” requires field strengths to be much larger than the \(\mu \)G fields of the average ISM. This provides circumstantial evidence for an efficient magnetic amplification mechanism at SNR shocks such as Bell’s instability.
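
The normalization of Eq. (14) can be checked numerically. The sketch below evaluates the first expression in Eq. (14) in Gaussian units and converts the result to eV; the function name `e_max_ev` and its argument names are illustrative, and the constants are rounded.

```python
E_CHARGE = 4.8032e-10     # elementary charge [esu]
C_LIGHT = 2.998e10        # speed of light [cm/s]
ERG_PER_EV = 1.6022e-12   # erg per electron volt
YEAR = 3.156e7            # year [s]


def e_max_ev(b_gauss, age_yr, v_shock_kms, mfp_over_gyroradius=1.0, charge_z=1):
    """Maximum CR energy [eV] from Eq. (14), i.e. from 8 kappa / v_s^2 = tau.

    mfp_over_gyroradius is lambda / r_L; the value 1 is the Bohm limit.
    """
    v_s = v_shock_kms * 1.0e5                    # km/s -> cm/s
    tau = age_yr * YEAR                          # yr -> s
    e_erg = (3.0 * charge_z * E_CHARGE * b_gauss * tau * v_s**2
             / (8.0 * C_LIGHT * mfp_over_gyroradius))
    return e_erg / ERG_PER_EV


e_ref = e_max_ev(1.0e-4, 1000.0, 5000.0)         # reference values of Eq. (14)
e_young_ism = e_max_ev(3.0e-6, 400.0, 5000.0)    # assumed 3 muG field, 400 yr
```

For the reference values of Eq. (14), `e_ref` reproduces the prefactor of about \(3\times 10^{15}\) eV. For an unamplified field of \(3~\mu \)G (an assumed, typical interstellar value) and an age of 400 years, even Bohm diffusion yields `e_young_ism` below \(10^{14}\) eV, which illustrates why magnetic amplification is needed to approach the knee.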

Three-dimensional MHD-kinetic simulations, which employ a spherical harmonic expansion of the Vlasov–Fokker–Planck equation (Bell et al. 2013; Reville and Bell 2013), find that at the present time, historical SNRs such as Cas A, Tycho and Kepler appear to be expanding too slowly to accelerate CRs to the knee and argue for a larger maximum energy at an earlier stage of their evolution. Coupling the hydrodynamics to the transport equations for CRs and magnetic turbulence in one spherical dimension and zooming in on the shock transition enables one to resolve the scales of the fast growing Bell instability in the shock precursor and the associated Bohm diffusion while following diffusive acceleration of CRs (Brose et al. 2020). This yields the time evolution of the CR spectrum and facilitates accurate comparisons to observational data. The simulated \(\gamma \)-ray emission spectra (Brose et al. 2021) of the different stages of the remnant’s evolution such as free expansion, energy-conserving Sedov–Taylor, and early post-adiabatic expansion phases appear to match the observed time sequence of TeV \(\gamma \)-ray spectra (Funk 2015).

Three-dimensional MHD models of shock acceleration. A complementary line of research uses three-dimensional MHD simulation models, where a suitable algorithm finds the shock surface during the run-time of the simulation and characterises its jump conditions, Mach number, and magnetic obliquity (Pfrommer et al. 2017a; Pais et al. 2018; Dubois et al. 2019; Böss et al. 2023). In the next step, CR ions are injected and transported either in the grey approximation (Pfrommer et al. 2017a) or by accounting for the Fokker-Planck evolution of the full CR ion spectrum (Yang and Ruszkowski 2017; Girichidis et al. 2019, 2022), which enables us to account for the dynamical back reaction of CRs on the MHD flow. Additionally, a full spectrum of (primary, shock-accelerated) CR electrons can be injected according to a “subgrid” model, which is inferred from PIC simulations or motivated by theoretical considerations. The CR electron spectrum is transported with the flow while accounting for all re-acceleration and cooling processes that modify the spectrum (Winner et al. 2019). At every time step, we can compute the hadronic proton-proton reaction that generates pions, which decay into \(\gamma \) rays, secondary electrons/positrons, and neutrinos (see Sect. 2.4). Subsequently, the (primary and secondary) leptons radiate synchrotron emission from the radio to the X-ray band (provided the shock generated sufficiently energetic electrons and amplified magnetic fields). In addition, they undergo IC interactions, which result in emission from the hard X-ray to the \(\gamma \)-ray regime that can be compared to observational data.

A comparison between observed and simulated emission maps, ranging from the radio to X-rays to the \(\gamma \)-ray regime, and the detailed multi-frequency spectrum of SN1006 argues in favor of a preferred quasi-parallel electron and ion acceleration for TeV electrons (Winner et al. 2020, see Fig. 9). While simulated radial profiles in the equatorial sectors match X-ray data very well, there is some residual flux visible in the radio observations, which is missed by those simulations. This may suggest that the acceleration efficiency for GeV electrons is less strongly dependent on magnetic obliquity while the acceleration efficiency for TeV electrons, which radiate synchrotron X-rays, is suppressed by a factor of about 10 for quasi-perpendicular shocks in comparison to quasi-parallel shock geometries. By incorporating the obliquity-dependent CR ion acceleration model of Caprioli and Spitkovsky (2014a) into these comprehensive three-dimensional MHD models, we can account for the puzzling range of morphologies observed in TeV shell-type SNRs by adjusting the magnetic morphology and attribute regions exhibiting high (low) \(\gamma \)-ray intensity to quasi-parallel (quasi-perpendicular) shock configurations. Moreover, this enables constraining the magnetic coherence scale in the environment of several SNRs by relating the degree of \(\gamma \)-ray patchiness of SNRs to the correlation scale in the magnetic field (Pais et al. 2020; Pais and Pfrommer 2020).

Fig. 9

Multi-frequency maps of the SNR SN 1006: the bottom row shows observations in the 1.4 GHz radio band (Dyer et al. 2009), the soft X-ray band (Cassam-Chenaï et al. 2008), and at very-high-energy \(\gamma \) rays (Acero et al. 2010). Those are compared to simulated maps derived from global MHD simulations of Sedov explosions, which account for CR ion acceleration at the shock and evolve the CR electron spectrum (Winner et al. 2019). Initially, the magnetic field points from the bottom right to the top left. These simulations account for an acceleration efficiency that depends on magnetic obliquity, implying efficient acceleration at quasi-parallel shocks (i.e., when the shock travels at a narrow angle relative to the orientation of the magnetic field in the upstream region), and inefficient acceleration at quasi-perpendicular shock configurations. Hence, they explain the polar cap morphology seen in the maps. The simulated \(\gamma \)-ray map is convolved with the point-spread function of the H.E.S.S. instrument and incorporates Gaussian distributed noise, which matches the observed amplitude and correlation structure. The maps have a side length of 21 pc or 42.5’ at a distance of 1.66 kpc. Image reproduced with permission from Winner et al. (2020), copyright by the author(s)

2.2.3 Particle escape from supernova remnants to the interstellar medium

Diffusive shock acceleration is a self-regulating process: provided the level of upstream turbulence is low, many particles escape upstream, which sources a large current and a very efficient amplification of resonant and non-resonant Alfvén waves through streaming instabilities (Kulsrud and Pearce 1969; Bell 2004). In turn, this causes the level of Alfvénic upstream turbulence to increase until the magnetic energy density in the amplified field is comparable to the CR energy flux density at the shock divided by the light speed, see Eq. (13). The strong Alfvénic turbulence scatters particles more efficiently and suppresses particle escape in the upstream (Schure and Bell 2014; Cardillo et al. 2015). Besides regulating the shock itself, this phenomenon also has a nonlinear impact on the escape and confinement times of CRs in the immediate vicinity of their sources (e.g., Bell et al. 2013; Marcowith et al. 2021). The self-generated magnetic turbulence increases the scattering rate of CRs, resulting in a reduction of the diffusion coefficient. Consequently, CRs accumulate within the environment of the sources and start to build up a significant pressure gradient. This gradient excavates a cavity around the source and causes the formation of a CR-dominated bubble, so that the CRs can accumulate significant grammage \(X(t)=\rho c t\) in the bubbles before moving into the ISM (Reville et al. 2008; Malkov et al. 2013; Schroer et al. 2021, 2022; Recchia et al. 2022); the level of the increased CR scattering depends on the ISM properties in the immediate environment of the sources (Reville et al. 2007, 2021; Nava et al. 2016, 2019).

A crucial point to consider when discussing CR feedback in galaxies is the presence of a spatially varying and reduced diffusion coefficient within the source regions in comparison to its galactic average value. This modification affects the phase-space structure of star-forming regions, leading to the accumulation of CRs. As a result, even in situations where the disk is unstable, the accumulation of CRs prevents local fragmentation and preserves a well-defined spiral structure (Semenov et al. 2021). Provided the level of electromagnetic fluctuations is small enough so that the quasi-linear theory of CR transport applies, advancing to the next level calls for the two-moment model of CR transport, which accounts for a temporally and spatially varying diffusion coefficient in the self-confinement picture (Jiang and Oh 2018; Thomas and Pfrommer 2019, 2021; Thomas et al. 2021). Such models hold the promise to elucidate how exactly CRs migrate from the sites of their increased confinement around sources to the galactic environment where transport is faster.

2.3 Cosmic ray spatial transport

2.3.1 Theoretical background

To model CR transport in macroscopic systems such as galaxies, galaxy clusters, or AGN jets, and to study the dynamical impact of CRs on the evolution of these systems, we need to coarse grain the kinetic physics and develop a fluid description for CRs that is coupled to MHD. Here, we sketch the derivation and various approximations of the different approaches. The CR distribution is defined in phase space, which is spanned by the momentum and spatial coordinates \(\varvec{p}\) and \(\varvec{x}\), respectively, and reads

$$\begin{aligned} f\equiv f(\varvec{x},\varvec{p},t)=\frac{\textrm{d}^6N}{\textrm{d}{x^3}\textrm{d}{p^3}}. \end{aligned}$$
(15)

Its time evolution is governed by the Vlasov equation in the semi-relativistic regime in the frame that is comoving with the fluid,

$$\begin{aligned} \frac{\partial f}{\partial t} + (\varvec{v} + \varvec{v}_\textrm{cr}) \varvec{\cdot } \varvec{\nabla }_{\varvec{x}} f + \varvec{F} \varvec{\cdot } \varvec{\nabla }_{\varvec{p}} f = 0, \end{aligned}$$
(16)

where the distribution function is advected by the total velocity that consists of the mean gas velocity \(\varvec{v}\) and the CR velocity \(\varvec{v}_\textrm{cr}\). Note that \(\varvec{v}\) is measured in the laboratory frame while CR velocity and momentum \(\varvec{p}\) are measured in the comoving frame. The total force is represented by \(\varvec{F}\). Working in the comoving frame introduces pseudo forces, which are denoted as \(\varvec{F}_{\textrm{pseudo}}\) and which come about because the momentum measured by a comoving observer changes as a result of a change of the frame velocity \(\varvec{v}\). In addition, the momentum of CRs is altered by the Lorentz force, which can be divided into contributions from large-scale and small-scale electromagnetic fields denoted as \(\varvec{F}_{\textrm{macro}}\) and \(\varvec{F}_{\textrm{micro}}\), respectively (as derived with a covariant formalism by Thomas and Pfrommer 2019, Appendix C):

$$\begin{aligned} \varvec{F}&=\varvec{F}_{\textrm{pseudo}} + \varvec{F}_{\textrm{macro}} + \varvec{F}_{\textrm{micro}} \end{aligned}$$
(17)
$$\begin{aligned}&=-m\frac{\textrm{d}\varvec{v}}{\textrm{d}t}-\varvec{(\varvec{p} \varvec{\cdot } \varvec{\nabla })} \varvec{v} + q\frac{\varvec{v}_\textrm{cr}\varvec{\times } \varvec{B}}{c} + q \left( \delta \varvec{E} + \frac{\varvec{v}_\textrm{cr}\varvec{\times } \delta \varvec{B}}{c}\right) , \end{aligned}$$
(18)

where m and q are the CR mass and charge, \(\varvec{B}\) is the local mean magnetic field, and \(\textrm{d}/\textrm{d}t = \partial / \partial t + \varvec{v} \varvec{\cdot } \varvec{\nabla }\) denotes the Lagrangian time derivative. The first pseudo force arises from the acceleration of the comoving frame itself. From the perspective of a comoving observer, a CR at rest in the laboratory frame appears to be accelerating. The second pseudo force stems from spatial variations in the velocity field of the background plasma. If a CR moves in the laboratory frame, any change in its position leads to a change in the frame velocity due to the inhomogeneities. This relationship between the laboratory and comoving frames results in acceleration in the comoving frame. The small-scale fluctuations, denoted as \(\delta \varvec{E}\) and \(\delta \varvec{B}\), represent electric and magnetic fluctuations, respectively. These fluctuations are generated by plasma waves, especially Alfvén waves on MHD scales and ion–cyclotron waves that are comoving with the CRs on smaller scales, and they serve as a source of scattering for CRs (see Fig. 3):

$$\begin{aligned} \frac{\partial f}{\partial t} \Bigg \vert _{\textrm{scatt}} = \varvec{F}_{\textrm{micro}} \varvec{\cdot } \varvec{\nabla }_{\varvec{p}} f. \end{aligned}$$
(19)

The waves are either provided by external turbulence and cascading from large scales or generated by the CR-driven gyroresonant instability.

The CR proton gyroradius is \(r_{\textrm{g}}=p_\perp c/(eB)=0.22~\textrm{AU}\, E_{\textrm{GeV}}\,B_{\mu \textrm{G}}^{-1}\) and is much smaller than any macroscopic scale in galaxies or galaxy clusters. Hence, we can project out the full-phase dynamics of CRs by taking their gyroaverage to arrive at the focused transport equation (Skilling 1971; Zank 2014), which is expressed in the reference frame that moves with the average velocity of the gas

$$\begin{aligned}&\frac{\partial f}{\partial t} + (\varvec{v} + \mu v_\textrm{cr}\varvec{b}) \varvec{\cdot } \varvec{\nabla } f + \left[ \frac{1-3\mu ^2}{2} (\varvec{b} \varvec{\cdot } \varvec{\nabla } \varvec{v} \varvec{\cdot } \varvec{b}) - \frac{1-\mu ^2}{2} \varvec{\nabla } \varvec{\cdot } \varvec{v} \right] p \frac{\partial f}{\partial p} \nonumber \\&\quad + \left[ v_\textrm{cr}\varvec{\nabla } \varvec{\cdot } \varvec{b} + \mu \varvec{\nabla } \varvec{\cdot } \varvec{v} - 3 \mu (\varvec{b} \varvec{\cdot } \varvec{\nabla } \varvec{v} \varvec{\cdot } \varvec{b}) \right] \frac{1-\mu ^2}{2} \frac{\partial f}{\partial \mu } = \frac{\partial f}{\partial t} \Bigg \vert _{\textrm{scatt}}. \end{aligned}$$
(20)

Here, the mean gas velocity \(\varvec{v}\) and the direction of the mean magnetic field on large scales, \(\varvec{b} = \varvec{B} / B\), are measured in the laboratory frame while the magnitude of the CR momentum, \(p=|\varvec{p}|\), and the cosine of the pitch angle, \(\mu = \varvec{v}_\textrm{cr}\varvec{\cdot } \varvec{b} / v_\textrm{cr}\), are given in the comoving gas frame. As a result, the CR distribution function depends on 6 variables, \(f=f(\varvec{x}, p, \mu , t)\). If the CR distribution function is characterized by a high degree of anisotropy (which is the case for CR escape ahead of shocks), we would have to directly solve the focused transport equation (20) without making further approximations. Here, we define the CR anisotropy as the deviation of the CR momentum-space distribution from an isotropic distribution, \(\delta f/f_0\), where \(f_0=f_0(\varvec{x}, p, t)\) is the isotropic part of the CR distribution function.
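
The scale separation that justifies the gyroaverage can be checked with a few lines of arithmetic. The sketch below evaluates the proton gyroradius \(r_{\textrm{g}}=p_\perp c/(eB)\) quoted above in Gaussian units, taking \(p_\perp c\) in GeV; the function name `gyroradius_cm` is illustrative, the constants are rounded, and the comparison scale of 1 kpc is an assumed stand-in for a macroscopic galactic scale.

```python
E_CHARGE = 4.8032e-10     # elementary charge [esu]
ERG_PER_GEV = 1.6022e-3   # erg per GeV
AU = 1.496e13             # astronomical unit [cm]
PARSEC = 3.0857e18        # parsec [cm]


def gyroradius_cm(pc_gev, b_microgauss):
    """Proton gyroradius p c / (e B) [cm] for p c in GeV and B in microgauss."""
    return pc_gev * ERG_PER_GEV / (E_CHARGE * b_microgauss * 1.0e-6)


r_g_au = gyroradius_cm(1.0, 1.0) / AU                    # GeV proton in 1 muG
r_g_over_kpc = gyroradius_cm(1.0, 1.0) / (1.0e3 * PARSEC)
```

The first number reproduces the normalization of about 0.22 AU, and the ratio to 1 kpc is of order \(10^{-9}\), so that the gyro motion is unresolved on any galactic scale by many orders of magnitude.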

2.3.2 One-moment cosmic-ray hydrodynamics

In our Galaxy, the CR anisotropy at particle energies in the range of GeV to TeV is observed to be very small, of order \(\mathcal {O}(10^{-4})\) (Kulsrud 2005). The simplest approximation for CR transport is the one-moment method, which approximates f by the isotropic part of the CR distribution function and is derived from the quasi-linear theory of CR transport that assumes electromagnetic fluctuations with a small amplitude relative to the mean field. Taking the zeroth \(\mu \)-moment of the focused transport equation (20) yields a Fokker-Planck equation for CR transport (Skilling 1971, 1975; Schlickeiser 2002),

$$\begin{aligned} \frac{\partial f_0}{\partial t} + (\varvec{v} + \varvec{v}_{{\textrm{st}}}) \varvec{\cdot }\varvec{\nabla } f_0 = \varvec{\nabla }\varvec{\cdot }\left[ \kappa \varvec{b} \left( \varvec{b}\varvec{\cdot }\varvec{\nabla } f_0\right) \right] + \frac{1}{3} p\frac{\partial f_0}{\partial p}\,\varvec{\nabla }\varvec{\cdot }(\varvec{v} + \varvec{v}_{{\textrm{st}}}) + \frac{1}{p^2}\frac{\partial }{\partial p}\left[ p^2\varGamma _{\textrm{p}}\,\frac{\partial f_0}{\partial p}\right] ,\nonumber \\ \end{aligned}$$
(21)

where \(p=|\varvec{p}|\) is the magnitude of the momentum, \(\varGamma _{\textrm{p}}\) is the momentum diffusion rate related to second-order Fermi acceleration, and we have omitted CR sources and non-adiabatic CR losses. \(\varvec{v}_{{\textrm{st}}}\) and \(\kappa \) are the CR streaming speed and spatial diffusion coefficient along the mean magnetic field, which will be introduced in detail below. This equation holds true only for time periods significantly longer than the relaxation time for pitch angle scattering \(\tau \sim \mathcal {O}(D_{\mu \mu }^{-1})\) where \(D_{\mu \mu }=D_{\mu \mu }(\varvec{x},p,\mu )\) is the Fokker-Planck coefficient representing the frequency of pitch angle scattering of CRs by hydromagnetic waves.

Multiplying the Fokker-Planck equation (21) by the CR kinetic energy and integrating it over momentum space results in the evolution equation for the CR energy density, \(\varepsilon _\textrm{cr}\):

$$\begin{aligned} \frac{\partial \varepsilon _{{\textrm{cr}}}}{\partial t} + \varvec{\nabla }\varvec{\cdot }\left[ P_{{\textrm{cr}}} \varvec{v}_{{\textrm{st}}} + \varepsilon _{{\textrm{cr}}} (\varvec{v} + \varvec{v}_{{\textrm{st}}} + \varvec{v}_{{\textrm{di}}}) \right]= & {} - P_{{\textrm{cr}}}\varvec{\nabla }\varvec{\cdot }\varvec{v} + \varvec{v}_{{\textrm{st}}}\varvec{\cdot }\varvec{\nabla }P_{{\textrm{cr}}}, \end{aligned}$$
(22)
$$\begin{aligned} \frac{\partial \varepsilon }{\partial t} + \varvec{\nabla }\varvec{\cdot }\left[ (\varepsilon +P_{{\textrm{th}}}+P_{{\textrm{cr}}}) \varvec{v}\right]= & {} \quad P_{{\textrm{cr}}}\varvec{\nabla }\varvec{\cdot }\varvec{v} -\varvec{v}_{{\textrm{st}}}\varvec{\cdot }\varvec{\nabla }P_{{\textrm{cr}}}. \end{aligned}$$
(23)

Note that the CR equation (22) resembles the evolution equation for the thermal and kinetic energy density, \(\varepsilon = \varepsilon _{{\textrm{th}}}+\rho v^2/2\), shown in equation (23). The equations of state relate the thermal and CR pressures, \(P_{\textrm{th}}\) and \(P_\textrm{cr}\), to the corresponding energy densities via

$$\begin{aligned} P_{\textrm{th}}&= (\gamma _{\textrm{th}} - 1) \varepsilon _{\textrm{th}},\quad \gamma _{\textrm{th}} = \frac{5}{3}, \end{aligned}$$
(24)
$$\begin{aligned} P_\textrm{cr}&= (\gamma _{\textrm{cr}} - 1) \varepsilon _\textrm{cr}, \quad \gamma _{\textrm{cr}} = \frac{4}{3}. \end{aligned}$$
(25)

CR streaming and diffusion are described by the characteristic velocities \(\varvec{v}_{{\textrm{st}}}\) and \(\varvec{v}_{{\textrm{di}}}\) (defined below). Note that, for the sake of transparency, we have neglected second-order Fermi acceleration and any non-adiabatic gain and loss terms for the thermal plasma and CRs. Adding eqs. (22) and (23) causes the terms on the right-hand side to cancel identically. This means that in the absence of non-adiabatic gain and loss terms, the total energy in the form of CRs, thermal plasma and kinetic energy is conserved, as can be readily inferred by integrating the sum of eqs. (22) and (23) over volume and applying Gauss’ theorem. The physics of the CR energy equation (22) can be discussed more transparently by expanding the \(P_{{\textrm{cr}}} \varvec{v}_{{\textrm{st}}}\) term in the brackets to arrive at

$$\begin{aligned} \frac{\partial \varepsilon _{{\textrm{cr}}}}{\partial t} + \varvec{\nabla }\varvec{\cdot }\left[ \varepsilon _{{\textrm{cr}}} (\varvec{v}+\varvec{v}_{{\textrm{st}}} + \varvec{v}_{{\textrm{di}}})\right] = - P_{{\textrm{cr}}}\varvec{\nabla }\varvec{\cdot }(\varvec{v}+\varvec{v}_{{\textrm{st}}}). \end{aligned}$$
(26)

This equation states that \(\varepsilon _{{\textrm{cr}}}\) is advected with the total CR velocity, \(\varvec{v}+\varvec{v}_{{\textrm{st}}}+\varvec{v}_{{\textrm{di}}}\). CRs can either adiabatically gain or lose energy, depending on the sign of the total velocity divergence: for adiabatic compression, \(\varvec{\nabla }\varvec{\cdot }(\varvec{v}+\varvec{v}_{{\textrm{st}}})<0\), so that CRs gain energy.
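For reference, the step from equation (22) to equation (26) only requires the product rule applied to the \(P_{{\textrm{cr}}} \varvec{v}_{{\textrm{st}}}\) flux term, which can be sketched as

$$\begin{aligned} \varvec{\nabla }\varvec{\cdot }\left( P_{{\textrm{cr}}} \varvec{v}_{{\textrm{st}}}\right) = P_{{\textrm{cr}}}\varvec{\nabla }\varvec{\cdot }\varvec{v}_{{\textrm{st}}} + \varvec{v}_{{\textrm{st}}}\varvec{\cdot }\varvec{\nabla }P_{{\textrm{cr}}}. \end{aligned}$$

The second term on the right cancels the streaming term \(\varvec{v}_{{\textrm{st}}}\varvec{\cdot }\varvec{\nabla }P_{{\textrm{cr}}}\) on the right-hand side of equation (22). Moving the first term to the right-hand side combines it with \(-P_{{\textrm{cr}}}\varvec{\nabla }\varvec{\cdot }\varvec{v}\) into the single adiabatic term \(-P_{{\textrm{cr}}}\varvec{\nabla }\varvec{\cdot }(\varvec{v}+\varvec{v}_{{\textrm{st}}})\) of equation (26).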

This surprising simplicity of CR transport can be understood by considering the kinetic physics of CR-wave interactions. Because the magnetic field is flux frozen into the ionized plasma in the MHD approximation, CRs are advected with the field lines as they are themselves advected with the fluid velocity \(\varvec{v}\). When CRs propagate along the magnetic field, their microphysical transport depends crucially on the scattering frequency. A small but non-negligible scattering rate implies diffusive transport along the magnetic field. In addition to diffusion, CR particles can collectively drift at an average velocity: when CRs move at a speed exceeding the local Alfvén speed, they generate gyroresonant Alfvén waves through a phenomenon known as the streaming instability (Kulsrud and Pearce 1969). In turn, these waves increase the CR scattering (Fig. 3), which decelerates the CR population along the mean magnetic field, effectively transferring CR energy to further growing resonant Alfvén waves. This self-reinforcing feedback loop is stopped when CRs flow on average at the speed of Alfvén waves because there is no more net energy transfer from CRs to Alfvén waves.

Wave-damping processes transfer wave energy originally borrowed from the CRs to the background thermal plasma. This means that CRs exert pressure on the thermal plasma by scattering off Alfvén waves. In the presence of weak wave damping, the high (remaining) wave amplitudes enable a strong coupling, causing CRs to stream alongside Alfvén waves. Conversely, when wave damping is strong, wave amplitudes are reduced to a level at which they cannot maintain the frequent scattering that isotropizes CRs in the wave frame, so that the CRs preferentially diffuse. This self-limiting picture of CR transport is confirmed by recent simulations using the PIC technique (Holcomb and Spitkovsky 2019; Shalaby et al. 2021), hybrid PIC (Weidl et al. 2019; Haggerty et al. 2019) as well as MHD PIC models (Lebiga et al. 2018; Bai et al. 2019; Plotnikov et al. 2021; Bambic et al. 2021). The highly idealized pure CR transport modes (advection, diffusion and streaming) are visualized in Fig. 10.

Fig. 10

Left: The magnetic fields, which are frozen-in, are carried along with the plasma at an average velocity, \(v=v_\textrm{adv}\). Hence, CRs orbiting individual field lines are also advected alongside the plasma. Middle: When the amplitude of Alfvén waves is strongly damped, CRs are weakly scattered and (after an initial time) diffuse away from the source. They are transported with a characteristic root-mean-square velocity of \(\sqrt{2\kappa /t}\), where t is the time and \(\kappa \) denotes the diffusion coefficient along the magnetic field. Right: If Alfvén waves are weakly damped, their amplitude remains large so that CRs are effectively scattered and stream at the Alfvén speed, \(v_{\textrm{a}}\), along the orientation of the magnetic field. These different transport modes of CR fluids are not realized in their pure forms in Nature as they are instead superposed onto each other. Image reproduced with permission from Thomas et al. (2020), copyright by AAS

To make this qualitative discussion concrete, we define the pitch-angle averaged frequencies of CR scattering off right- and left-ward propagating Alfvén waves in terms of the wave energy densities \(\varepsilon _{{\textrm{a}},\pm }\) (Thomas and Pfrommer 2019):

$$\begin{aligned} {\bar{\nu }}_\pm =\frac{3 \pi \, \varOmega }{16} \frac{\varepsilon _{{\textrm{a}},\pm }}{\varepsilon _B} =\frac{c^2}{3\kappa _\pm }, \end{aligned}$$
(27)

where \(\varepsilon _B=B^2/(8\pi )\) is the magnetic energy density and \(\kappa _\pm \) denote the CR diffusion coefficients associated with the corresponding scattering frequencies. Hence, a larger level of Alfvénic fluctuations increases the scattering rate. CRs stream down their own pressure gradient relative to the background plasma along the local direction of \({{\varvec{B}}}\) with a velocity (Skilling 1975)

$$\begin{aligned} \varvec{v}_{{\textrm{st}}}=\varvec{v}_{{\textrm{a}}}\, \frac{{\bar{\nu }}_+ - {\bar{\nu }}_-}{{\bar{\nu }}_+ + {\bar{\nu }}_-} \rightarrow -\varvec{v}_{{\textrm{a}}}\,\textrm{sgn}(\varvec{B}\varvec{\cdot }\varvec{\nabla }P_\textrm{cr}). \end{aligned}$$
(28)

The limit on the right-hand side is realized in the self-confinement picture of CR transport where CRs streaming down the CR gradient excite resonant Alfvén waves. Those waves move in the direction of the streaming CRs so that their wave amplitudes dominate over those of the counter-propagating waves, which are efficiently damped through various collisionless damping processes (see Sect. 2.3.5). If \(\varvec{B}\) points in the opposite direction of \(\varvec{\nabla }P_\textrm{cr}\), equation (28) implies streaming along the magnetic field, \(\varvec{v}_{{\textrm{st}}}=\varvec{v}_{{\textrm{a}}}\), and a dominating scattering rate \({\bar{\nu }}_+\). Efficient scattering of CRs isotropizes them in the Alfvén wave rest frame (see Sect. 2.1). As discussed above, CRs can also diffuse in the wave frame along the local direction of \({{\varvec{B}}}\) due to less frequent pitch angle scattering by MHD waves (Thomas and Pfrommer 2019):

$$\begin{aligned} \varvec{v}_{{\textrm{di}}}=-\kappa {{\varvec{b}}}\, \frac{{{\varvec{b}}}\varvec{\cdot }\varvec{\nabla } \varepsilon _{{\textrm{cr}}}}{\varepsilon _{{\textrm{cr}}}}, \qquad \text{ where } \qquad \kappa =\frac{c^2}{3({\bar{\nu }}_+ + {\bar{\nu }}_-)} \end{aligned}$$
(29)

is the total CR diffusion coefficient, which decreases for an increased CR-wave scattering rate. Because \(\varvec{v}_{{\textrm{di}}}\) depends on the gradient of the CR energy density, this type of transport process spreads the initial CR distribution in time. As a result, the diffusion operator is characterized by a Laplacian (combining eqs. (26) and (29) and assuming \(\kappa =\textrm{const}\)). In one dimension, the solution of the diffusion equation is given by convolving the initial condition with a Gaussian representing the Green’s function of the diffusion operator. This causes the diffusive spread of CRs away from a local maximum of CRs along the magnetic field.
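The definitions in equations (27) to (29) and the Gaussian Green's function solution can be illustrated with a minimal Python sketch. It is not taken from any of the cited codes: the function names, the CGS value of the speed of light, and the grid used in the test are assumptions made for illustration, and the convolution assumes a uniform grid with an odd number of points and a constant \(\kappa \).

```python
import numpy as np

C_LIGHT = 2.998e10  # speed of light in cm/s (CGS units assumed)


def scattering_frequency(gyro_frequency, eps_wave, eps_b):
    """Pitch-angle averaged scattering frequency, following eq. (27)."""
    return 3.0 * np.pi * gyro_frequency / 16.0 * eps_wave / eps_b


def streaming_speed(v_a, nu_plus, nu_minus):
    """Streaming speed along b, following the first equality of eq. (28)."""
    return v_a * (nu_plus - nu_minus) / (nu_plus + nu_minus)


def diffusion_coefficient(nu_plus, nu_minus):
    """Total diffusion coefficient, following eq. (29)."""
    return C_LIGHT**2 / (3.0 * (nu_plus + nu_minus))


def diffuse_1d(x, eps0, kappa, t):
    """Convolve eps0(x) with a Gaussian of variance 2*kappa*t.

    Assumes a uniform grid x with an odd number of points, so that the
    kernel is centered and the result is not shifted by one cell.
    """
    dx = x[1] - x[0]
    offsets = (np.arange(x.size) - x.size // 2) * dx
    kernel = np.exp(-offsets**2 / (4.0 * kappa * t))
    kernel /= kernel.sum()  # unit sum conserves the total CR energy
    return np.convolve(eps0, kernel, mode="same")
```

If only one wave type is present, `streaming_speed` returns \(\pm v_{\textrm{a}}\), and for equal scattering frequencies it returns zero, which mirrors the limits discussed above. The variance of a diffusing Gaussian profile grows by \(2\kappa t\), consistent with the root-mean-square velocity \(\sqrt{2\kappa /t}\) quoted in the caption of Fig. 10.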

The first numerical simulations to include dynamical CR feedback on the hydrodynamics studied the problem of diffusive shock acceleration at a SN blast wave (Dorfi 1984, 1985) and employed the so-called “two-fluid” model laid out in eqs. (22) and (23), which are closed by the equations of state with the corresponding ratio of specific heats, (24) and (25), for the thermal and CR populations (Drury and Voelk 1981; Kang and Jones 1990). These early studies assumed simplified expressions for the spatial and momentum dependence of the CR diffusion coefficient and neglected the CR streaming term. Using a simplified analytic two-fluid model of the SN expansion, Drury et al. (1989) account for dynamical CR feedback as well as the influence of Alfvén wave heating on the diffusion coefficient. The first fully time-dependent and non-linear numerical simulations of spherical SN blast waves with the “two-fluid” model of CR-hydrodynamics find that the shock can transfer \(\sim \)10% of its kinetic energy to the CR population at the end of the adiabatic, Sedov–Taylor phase (Jones and Kang 1990; Dorfi 1990). The first CR hydrodynamical simulations coupled to an equation for the resonant Alfvén wave energy demonstrate the emergence of CR-driven galactic winds with one-dimensional flux-tube models (Breitschwerdt et al. 1991, 1993).

In the new century, new computational capabilities enabled the study of the dynamical effect of CR pressure and heating in galaxy formation and galaxy clusters, using three-dimensional (magneto-)hydrodynamical simulations. Pioneering work was performed with the Eulerian MHD code PIERNIK that followed anisotropic CR diffusion (Hanasz and Lesch 2003) and was followed by simulations of galaxies, clusters and cosmological volumes with the smoothed particle hydrodynamics code Gadget, which evolves a CR population described by a single power-law momentum spectrum while accounting for isotropic CR diffusion (Pfrommer et al. 2006; Enßlin et al. 2007; Jubelgas et al. 2008) and CR streaming (Uhlig et al. 2012). The promising results of galactic winds driven by CR streaming were confirmed by several other groups that modelled isotropic CR diffusion using adaptive mesh-refinement codes Ramses (Booth et al. 2013), Enzo (Salem and Bryan 2014), and ART (Semenov et al. 2021). Studies of anisotropic CR diffusion coupled to MHD were made possible by CR implementations in the Eulerian Pencil code (Snodin et al. 2006), the adaptive mesh-refinement codes Flash (Yang et al. 2012; Girichidis et al. 2014) and Ramses (Dashyan and Dubois 2020), and the unstructured moving-mesh code AREPO (Pakmor et al. 2016a; Pfrommer et al. 2017a). Anisotropic CR streaming in MHD simulations was modelled by Ruszkowski et al. (2017b, with Flash), Butsky and Quinn (2018, with Enzo), and Dubois et al. (2019, with Ramses) while employing the regularization of the anisotropic CR streaming term proposed by Sharma et al. (2010a).

2.3.3 Two-moment cosmic-ray hydrodynamics

The previously discussed picture of one-moment CR hydrodynamics has the disadvantage that the (direction of the) CR streaming is ill-defined at the extremes of the CR distribution (see equation 28). This causes numerical instabilities in the CR streaming term, which can be cured by regularizing the sharp transition of the sign function and smoothing it out (Sharma et al. 2010a), but at the expense of introducing a numerical parameter that dominates the solution in the case of a CR source on top of a CR background (see Fig. 6 of Thomas and Pfrommer 2019). Capitalizing on analogies of CR and radiation hydrodynamics, Jiang and Oh (2018) propose a two-moment CR hydrodynamics scheme that assumes a steady-state CR scattering approximation and additionally solves for the CR flux density as an independent and evolved quantity, thereby solving the problem of non-uniqueness of the CR streaming flux at the extremes of f (see also Chan et al. 2019, for a similar development).

In an alternative approach, we can expand f into a complete set of basis functions in pitch-angle (such as the Legendre polynomials, see Sect. 2.1), which has a long history in CR transport (see e.g. Klimas and Sandri 1971; Earl 1973; Webb 1987; Zank et al. 2000; Snodin et al. 2006; Litvinenko and Noble 2013; Rodrigues et al. 2019). Using the Eddington approximation for the CR distribution, \(f = f_0 + 3 \mu f_1\), and requiring that the isotropic term dominates over the anisotropy, \(f_0 \gg f_1\), Thomas and Pfrommer (2019) compute the zeroth and first \(\mu \)-moments of the focused transport equation (20) to arrive at evolution equations for the isotropic and anisotropic parts of the CR distribution, \(f_0\) and \(f_1\). Multiplying those with the particle energy and the energy flux, respectively, and integrating the equations for \(f_0\) and \(f_1\) over momentum space yields the evolution equations for the CR energy and momentum density, \(\varepsilon _\textrm{cr}\) and \(\varvec{f}_\textrm{cr}/c^2\). In the laboratory frame, the pressure, energy and momentum densities transform as follows (neglecting the relativistic corrections to the CR energy and pressure):

(30)

so that we obtain the governing equations for the CR energy and momentum density as measured in the laboratory frame (Thomas and Pfrommer 2019, see Appendix E):

(31)
(32)

where the Alfvén wave velocity in the laboratory frame is given by \(\varvec{w}_\pm = \varvec{v} \pm \varvec{v}_{{\textrm{a}}}\), \(S_\varepsilon \) and \(\varvec{S}_f\) are non-adiabatic source terms of CR energy and flux, respectively, and the Lorentz force density in the ultra-relativistic limit is \(c^2\varvec{g}_{\textrm{Lorentz}} = \varvec{f}_\textrm{cr}\varvec{\times }\varvec{\varOmega }\), where \(\varvec{\varOmega } = \varOmega \varvec{b}\).

These equations also contain the previously discussed three CR transport modes: advection, streaming, and diffusion. In the absence of sources and along the magnetic field (where \(\varvec{g}_{\textrm{Lorentz}} = \varvec{0}\)), we recover CR streaming and diffusion. In the presence of efficient CR-wave scattering, i.e., for a large CR scattering frequency \({\bar{\nu }}_\pm = c^2/(3\kappa _\pm )\), we approach a homogeneous CR flux with a steady-state CR energy density, implying \(\partial _t \varepsilon _{\textrm{cr},\textrm{lab}} + \varvec{\nabla } \varvec{\cdot } \varvec{f}_{\textrm{cr},\textrm{lab}} = 0\). In consequence, the CR enthalpy streams with the Alfvén wave frame, \(\varvec{f}_{\textrm{cr},\textrm{lab}} = \varvec{w}_\pm (\varepsilon _{\textrm{cr},\textrm{lab}} + P_{\textrm{cr},\textrm{lab}})\), where the sign of \(\varvec{w}_\pm \) determines the CR propagation direction. If CR scattering is inefficient but finite, we also approach a steady state CR flux (\(\partial _t \varvec{f}_{\textrm{cr},\textrm{lab}} = \varvec{0}\)) and recover the CR diffusion velocity, \(\varvec{v}_{\textrm{di}} \varepsilon _{\textrm{cr},\textrm{lab}} = -\kappa _\pm \varvec{b}\varvec{b}\varvec{\cdot }\varvec{\nabla }\varepsilon _{\textrm{cr},\textrm{lab}}\), where we have identified the non-vanishing bracket on the right-hand side with \(\varvec{v}_{\textrm{di}} \varepsilon _{\textrm{cr},\textrm{lab}}\) and used the ultra-relativistic equation of state for CRs, \(P_\textrm{cr}= \varepsilon _\textrm{cr}/3\). 
Finally, we recover CR advection in steady state, \(\varvec{f}_{\textrm{cr},\textrm{lab}} = \varvec{v} (\varepsilon _{\textrm{cr},\textrm{lab}} + P_{\textrm{cr},\textrm{lab}})\) due to the balance of forces parallel and perpendicular to the magnetic field: while the perpendicular pressure force balances the Lorentz force acting on CRs, \(\varvec{\nabla }_\perp P_{\textrm{cr},\textrm{lab}} = -\varvec{g}_{{\textrm{Lorentz}}}\) (which may however act on kinetic time scales), we have forces due to CR streaming and diffusion balancing the pressure forces parallel to the magnetic field.

These equations closely resemble the lab-frame equations for radiation energy and momentum density, \(\varepsilon \) and \(\varvec{f}/c^2\) (Mihalas and Mihalas 1984; Lowrie et al. 1999)

(33)
(34)

where is the radiation pressure tensor, \(\sigma _{\textrm{s}}\) is the scattering coefficient, and \(S_{\textrm{m}}\) describes photon absorption and emission processes, thus coupling radiation to matter. The similarities of the CR and photon transport equations are apparent and result from the same underlying physics: while efficient photon scattering causes the photon enthalpy to be advected with the gas, efficient CR-wave scattering causes the CR enthalpy to stream with the Alfvén waves. However, there are two main differences: (i) the propagation direction of photons is given by the path of least resistance, unlike for CRs, whose gyrotropic motion is directed along the local mean magnetic field, and (ii) in contrast to photon transport, the CR lab-frame equations require resolving rapid gyro kinetics owing to the Lorentz force. As pointed out by Thomas and Pfrommer (2019), transforming those lab-frame equations into the comoving frame enables us to project out the fast gyro kinetics so that the CR hydrodynamics theory can be applied to macroscopic astrophysical scales.

CR scattering and hence their spatial transport depend on the amplitudes of resonant Alfvén waves, which are excited by the gyro-resonant instability and lose energy due to various collisionless damping processes as well as second-order Fermi acceleration. Hence, a complete theory of two-moment CR hydrodynamics not only needs to couple the equations of CR energy and flux density to the MHD equations but also to the equations for resonant Alfvén wave energy (Dewar 1970; Jacques 1977; Breitschwerdt et al. 1991; Zweibel 2017) to account for spatial and temporal variations of wave growth and damping. The non-linearly coupled subsystem of four CR-wave equations in the comoving frame reads as follows (Thomas and Pfrommer 2019):

$$\begin{aligned}&\frac{\partial \varepsilon _\textrm{cr}}{\partial t} + \varvec{\nabla } \varvec{\cdot } [\varvec{v} (\varepsilon _\textrm{cr}+ P_\textrm{cr}) + \varvec{b} f_\textrm{cr}] = \varvec{v} \varvec{\cdot } \varvec{\nabla } P_\textrm{cr}\nonumber \\&\quad - \frac{v_{\textrm{a}}}{3\kappa _+} \left[ f_\textrm{cr}- v_{\textrm{a}} (\varepsilon _\textrm{cr}+ P_\textrm{cr}) \right] + \frac{v_\textrm{a}}{3\kappa _-} \left[ f_\textrm{cr}+ v_{\textrm{a}} (\varepsilon _\textrm{cr}+ P_\textrm{cr}) \right] , \end{aligned}$$
(35)
$$\begin{aligned}&\frac{\partial f_\textrm{cr}/c^2}{\partial t} + \varvec{\nabla } \varvec{\cdot } \left( \varvec{v} f_\textrm{cr}/c^2 \right) + \varvec{b} \varvec{\cdot } \varvec{\nabla } P_\textrm{cr}= - ( \varvec{b} \varvec{\cdot } \varvec{\nabla } \varvec{v}) \varvec{\cdot } (\varvec{b} f_\textrm{cr}/c^2) \nonumber \\&\quad -\frac{1}{3\kappa _+} \left[ f_\textrm{cr}- v_{\textrm{a}} (\varepsilon _\textrm{cr}+ P_\textrm{cr}) \right] - \frac{1}{3\kappa _-} \left[ f_\textrm{cr}+ v_{\textrm{a}} (\varepsilon _\textrm{cr}+ P_\textrm{cr}) \right] ,\ \end{aligned}$$
(36)
$$\begin{aligned}&\frac{\partial \varepsilon _{{\textrm{a}},\pm }}{\partial t} + \varvec{\nabla } \varvec{\cdot } \left[ \varvec{v} (\varepsilon _{{\textrm{a}},\pm } + P_{{\textrm{a}},\pm }) \pm v_{\textrm{a}} \varvec{b} \varepsilon _{{\textrm{a}},\pm } \right] = \varvec{v} \varvec{\cdot } \varvec{\nabla } P_{{\textrm{a}},\pm } \nonumber \\&\quad \pm \frac{v_{\textrm{a}}}{3\kappa _\pm } \left[ f_\textrm{cr}\mp v_{\textrm{a}} (\varepsilon _\textrm{cr}+ P_\textrm{cr})\right] - S_{\textrm{a},\pm }, \end{aligned}$$
(37)

where \(P_{{\textrm{a}},\pm }=\varepsilon _{{\textrm{a}},\pm }/2\) are the ponderomotive pressures of Alfvén waves, which arise due to the nonlinear force experienced by a charged particle in an oscillating electromagnetic field with spatial variations. These Alfvén waves have wave lengths comparable to the gyroradii of pressure-carrying CRs, well below the resolved scales in the simulation. \(S_{\textrm{a},\pm }\) are wave energy loss terms due to damping processes.
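As a consistency check, the one-moment flux can be recovered from equation (36). The following is a sketch under the assumptions of a static background (\(\varvec{v}=\varvec{0}\)), a steady-state flux (\(\partial _t f_\textrm{cr}= 0\)), and the ultra-relativistic equation of state, \(P_\textrm{cr}=\varepsilon _\textrm{cr}/3\). Writing \(1/(3\kappa _\pm ) = {\bar{\nu }}_\pm /c^2\) from equation (27), equation (36) reduces to an algebraic relation for \(f_\textrm{cr}\):

$$\begin{aligned} \varvec{b}\varvec{\cdot }\varvec{\nabla }P_\textrm{cr}&= -\frac{{\bar{\nu }}_+ + {\bar{\nu }}_-}{c^2}\, f_\textrm{cr}+ \frac{{\bar{\nu }}_+ - {\bar{\nu }}_-}{c^2}\, v_{\textrm{a}} (\varepsilon _\textrm{cr}+ P_\textrm{cr}), \\ f_\textrm{cr}&= v_{\textrm{a}}\, \frac{{\bar{\nu }}_+ - {\bar{\nu }}_-}{{\bar{\nu }}_+ + {\bar{\nu }}_-}\, (\varepsilon _\textrm{cr}+ P_\textrm{cr}) - 3\kappa \, \varvec{b}\varvec{\cdot }\varvec{\nabla }P_\textrm{cr}= v_{\textrm{st}} (\varepsilon _\textrm{cr}+ P_\textrm{cr}) + v_{\textrm{di}}\, \varepsilon _\textrm{cr}, \end{aligned}$$

where \(v_{\textrm{st}}\) and \(v_{\textrm{di}}\) denote the components along \(\varvec{b}\) of the streaming and diffusion velocities of equations (28) and (29), \(\kappa = c^2/[3({\bar{\nu }}_+ + {\bar{\nu }}_-)]\), and \(3\kappa \,\varvec{b}\varvec{\cdot }\varvec{\nabla }P_\textrm{cr}= \kappa \,\varvec{b}\varvec{\cdot }\varvec{\nabla }\varepsilon _\textrm{cr}\) follows from the equation of state. This is the field-aligned flux \(P_\textrm{cr}\varvec{v}_{{\textrm{st}}} + \varepsilon _\textrm{cr}(\varvec{v}_{{\textrm{st}}} + \varvec{v}_{{\textrm{di}}})\) that appears in the one-moment energy equation (22).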

In contrast to the laboratory frame equations of CR transport (31) and (32), here the CR flux density is aligned with the local mean magnetic field, \(\varvec{f}_\textrm{cr}= \varvec{b} f_\textrm{cr}\) to reduce the dimensionality of the system of equations. Clearly, the terms of the laboratory frame equations are also apparent in this comoving frame. In addition, there are pseudo forces that result from transforming into the non-inertial comoving frame as discussed in Sect. 2.3.1. Notably, the adiabatic term on the right-hand side of Eq. (35) is the manifestation of a pseudo force in this two-moment method. Like all hydrodynamical theories that are derived from a moment hierarchy and are truncated at finite order, a closure relation is required that relates the highest-order moment to lower-order moments in order to arrive at a consistent and closed system of evolution equations. Unlike the case of radiation transport, where there are significant differences in the solutions with the various closure relations, the different closures for the CR transport models produce the same results for frequent CR-wave scattering, which is typically expected in the (warm and hot phases of the) interstellar, circumgalactic or intracluster media (Thomas and Pfrommer 2021). This is because CR transport is primarily constrained to magnetic field lines, allowing CRs to propagate only along or against the direction of the magnetic field. This is not the case for radiation, because its transport may occur in arbitrary directions.

Fig. 11

A radio harp in the MeerKAT observation by Heywood et al. (2019) at the Galactic center (left). The observed radio brightness profiles along the individual radio-emitting filaments, the “harp strings”, are compared with model calculations (orange) of streaming and diffusion (middle) and pure diffusion (right). In the lower two profiles, where the CR electrons had more time to propagate, the models show significant differences and favor the streaming model. Image reproduced with permission from Thomas et al. (2020), copyright by AAS

Are there methods to observationally validate these theoretical considerations? The observed level of CR anisotropy is an indirect measure of the degree of isotropization, which should be of order \(v_{\textrm{a}}/c \sim 10^{-4}\) in the ISM if CRs are indeed streaming close to the Alfvén speed (Kulsrud 2005). Observing the time sequence of CR profiles as they are propagating from a source would provide a strong insight into the physics. In recent radio observations of the center of our Milky Way using the MeerKAT telescope, radio-emitting structures with nearly parallel filaments have been discovered. These filaments are ordered by length, resulting in a morphology reminiscent of a harp, with the filaments resembling radio-synchrotron emitting “strings” (Heywood et al. 2019, see left-hand panel of Fig. 11). These structures are formed when massive stars or pulsars fly through ordered magnetic fields of the ISM, discharging CR electrons along their paths into these magnetic fields either (i) via magnetic reconnection of shocked ISM fields with the magnetic field of pulsar wind nebulae or (ii) via turbulent mixing of the shocked ISM and shocked stellar wind, which contains CRs accelerated at a strong stellar wind termination shock. The particles propagate along the ISM magnetic field lines, usually transverse to the star’s orbit, causing the magnetic fields to be illuminated in the radio through the synchrotron process, thus explaining their appearance in the form of strings of a harp (with the different string lengths corresponding to the timing of the discharge). If this propagation were a diffusion process, the radio profiles should have a rounded bell shape (see right-hand panel of Fig. 11). 
Instead, the radio profiles have a flat-top shape, which can only be explained by streaming CRs: propagating relativistic electrons excite the magnetic fields of the “strings” to resonantly oscillate with their gyro orbits, which then amplifies Alfvén waves through the streaming instability (Kulsrud and Pearce 1969). This in turn decelerates CRs by the described scattering processes so that CRs can be described by a streaming fluid (plus a small amount of diffusion, Thomas et al. 2020). This validates the predictions of the theory of self-limiting hydrodynamic transport of CRs by streaming and diffusion. Recently, this picture obtained observational support from angular associations of compact radio and infra-red objects with non-thermal filaments that both share similar latitude distributions, suggesting that they both co-exist spatially (Yusef-Zadeh et al. 2022a).

The two-moment method of CR transport cures the numerical problems of the streaming term and provides a more faithful representation of CR transport by including a spatially and temporally varying diffusion coefficient. However, it comes with new challenges. In addition to Alfvén characteristics, the system of eqs. (35) to (37) also contains light-like characteristics with eigenvalues \(\pm c/\sqrt{3}\) (which is not equal to c because of the truncation of the expansion at linear order). Resolving time steps associated with this velocity would not only make the solutions numerically very expensive, but also increase the numerical diffusivity at a fixed numerical resolution (Thomas et al. 2021), suggesting the use of a reduced speed of light. This has only a minor impact on the transport of CR momentum and energy because there are no large mass fluxes associated with these light-like characteristics, unlike in the case of radiation transport, where this velocity impacts the propagation speed of ionization fronts. Most importantly, if the magnetic dynamo is not fully resolved and the magnetic field does not saturate at its true value, the resulting CR transport would be too slow and bias the solution. In addition, if a plasma instability is not modelled in the macroscopic fluid model or if the adopted damping rates are too strong/weak (perhaps because of an incomplete modelling of the multi-phase nature of the ISM), the CR transport speed would also be wrong and may not supply the correct amount of feedback. These considerations make a strong case for an increased effort in micro- and meso-scale studies of CR transport.
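The time-step cost of the light-like characteristics can be illustrated with a short Python sketch of an explicit Courant-type time-step limit. The cell size, the reduced speed of light, and the Courant number below are hypothetical values chosen only for illustration, and the function names are not taken from any of the cited codes.

```python
import math

C_LIGHT = 2.998e10  # speed of light in cm/s


def light_like_speed(c_eff=C_LIGHT):
    """Fastest signal speed of the truncated moment system, c_eff / sqrt(3)."""
    return c_eff / math.sqrt(3.0)


def cfl_time_step(dx, signal_speed, cfl_number=0.5):
    """Explicit time-step limit dt = cfl_number * dx / signal_speed."""
    return cfl_number * dx / signal_speed


# Hypothetical numbers, chosen only for illustration.
dx = 3.086e20      # cell size of 100 pc in cm
c_reduced = 1.0e8  # reduced speed of light of 1000 km/s in cm/s

dt_full = cfl_time_step(dx, light_like_speed())
dt_reduced = cfl_time_step(dx, light_like_speed(c_reduced))
speedup = dt_reduced / dt_full  # equals C_LIGHT / c_reduced
```

In this sketch the admissible time step scales inversely with the adopted speed of light, so reducing it by a given factor lengthens the time step by the same factor.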

2.3.4 Cosmic ray energy vs. entropy methods

The one- and two-moment methods of CR hydrodynamics both evolve the CR energy. However, this is not a conserved quantity because only the total energy composed of kinetic, magnetic, thermal and CR energies is conserved. As a result, the CR energy equation (26) contains an adiabatic source term that establishes a coupling between CRs and the thermal gas. This problem is also present in the two-moment approach of CR hydrodynamics because the one-moment equation emerges in the limit of a steady CR flux density. This can cause problems for different discretizations of this adiabatic term at shocks, where kinetic energy is dissipated into thermal energy while the CR energy should only be adiabatically compressed in the picture of grey CR hydrodynamics. In practice, because the energy formulation does not ensure the conservation of CR entropy, there could be artificial numerical CR entropy generated that would render the solution inaccurate, thereby defining the “non-uniqueness problem” of the two-fluid equations. While Gupta et al. (2021) find that the formulation where the adiabatic source term of Eq. (26) is integrated over the cell volume and Gauss’ theorem is applied does minimize the problem, Semenov et al. (2022) instead argue in favor of the entropy formulation of CR hydrodynamics.

However, there are several points to consider as pointed out by Weber et al. (2023): (i) in the limit of low numerical resolution (which is the typical case for galaxy-scale simulations), both schemes perform well and are largely unaffected by the shock Mach number; (ii) however, the absolute truncation error is substantially larger for fixed-grid Eulerian methods because of the increased numerical diffusion in comparison to moving mesh formulations of MHD as, e.g., employed in the one-moment CR hydrodynamics formulation (Pfrommer et al. 2017a; Pakmor et al. 2016a) in the AREPO code (Springel 2010; Pakmor et al. 2016c), and (iii) a true collisionless shock can accelerate CRs because of self-regulating plasma kinetic processes, which cause CRs to stream into the upstream, to generate magnetic turbulence, which ensures CR scattering, acceleration and a non-linearly modified shock structure as discussed in Sect. 2.2. Hence, only accounting for adiabatic compression of CRs at a shock is an inaccurate academic representation of the underlying plasma physics and must be accompanied by a subgrid model that accounts for CR acceleration (Pfrommer et al. 2017a). Moreover, when accounting for CR acceleration in low-resolution simulations, the entropy formulation shows artificial density oscillations in the post-shock regime, which are sourced by numerical noise as a result of converting injected CR energy to entropy. This causes the shock to propagate with a much faster velocity in the entropy formulation in comparison to the CR energy formulation (Weber et al. 2023).
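For orientation, the purely adiabatic limit referred to above can be written down in a few lines of Python. This is a sketch under stated assumptions: the quantity \(P_\textrm{cr}/\rho ^{\gamma _\textrm{cr}}\) is used as a generic adiabatic invariant and is not necessarily the variable evolved in the cited entropy schemes, the compression ratio of 4 is the strong-shock value for a gas with \(\gamma _{\textrm{th}}=5/3\), and all names and pre-shock values are hypothetical.

```python
import math

GAMMA_CR = 4.0 / 3.0  # adiabatic index of the CR fluid, eq. (25)


def adiabatic_invariant(p_cr, rho, gamma_cr=GAMMA_CR):
    """Quantity P_cr / rho**gamma_cr, constant under adiabatic changes."""
    return p_cr / rho**gamma_cr


def compress_adiabatically(p_cr, rho, compression_ratio, gamma_cr=GAMMA_CR):
    """Return (P_cr, rho) after adiabatic compression by the given ratio."""
    return p_cr * compression_ratio**gamma_cr, rho * compression_ratio


# Hypothetical pre-shock state in arbitrary units, compressed by a factor of 4.
p_pre, rho_pre = 1.0, 1.0
p_post, rho_post = compress_adiabatically(p_pre, rho_pre, 4.0)
```

Any growth of `adiabatic_invariant` across a numerical shock beyond this adiabatic jump would then indicate either spurious numerical CR entropy production or an explicitly modelled acceleration term.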

2.3.5 Streaming instability, wave damping mechanisms and cosmic ray self-confinement

Setting the stage. CR-driven instabilities play an essential role in CR propagation in galaxies and galaxy clusters (Kulsrud and Pearce 1969; Shalaby et al. 2021). On the one hand, these instabilities amplify magnetic field fluctuations at the scale of CR gyro radii, which is explained in great detail in Kulsrud (2005) and Thomas (2022). On the other hand, collisionless wave damping processes reduce the wave power. This modulates CR-wave scattering and tightly couples the intrinsically collisionless CR population to the thermal plasma to enable dynamical feedback on macroscopic scales. Among the various (collisionless) wave damping mechanisms, there are linear and non-linear Landau damping, turbulent damping, and ion–neutral damping, which are explained in the following paragraphs.

Landau damping describes the energy exchange between an electromagnetic wave with phase velocity \(v_{\textrm{ph}}\) and charged particles with velocity along the mean magnetic field close to \(v_{\textrm{ph}}\). Provided the particle velocities are slightly less than \(v_{\textrm{ph}}\), the electric field of the wave will accelerate the particles to move at \(v_{\textrm{ph}}\), while particles with velocities slightly greater than \(v_{\textrm{ph}}\) experience a decelerating Lorentz force so that they lose energy to the wave. As a result, particles tend to synchronize with the wave. Non-linear Landau damping occurs because of the interaction of two circularly polarized Alfvén waves with similar wave numbers \(k_1\) and \(k_2\) that propagate in the same direction. The interaction generates a beat wave (Lee and Völk 1973; Volk and McKenzie 1981; Miller 1991; Kulsrud 2005; Wiener et al. 2013a), which propagates at the group velocity

$$\begin{aligned} v_{\textrm{beat}} = \frac{\omega _1 - \omega _2}{k_1 - k_2}. \end{aligned}$$
(38)

The magnetic mirror force associated with this beat wave accelerates thermal particles travelling at similar velocities, implying a net extraction of wave energy by the particles and leading to efficient wave damping. Physically, the interaction between the two waves occurs via their beat wave at the Landau resonance with thermal particles, \(v_{\textrm{th}}\approx v_{\textrm{beat}}\).

Magneto-hydrodynamic Alfvénic turbulence is anisotropic on spatial scales much smaller than the turbulent injection scale (Goldreich and Sridhar 1995) so that the elongated “eddies” are aligned with the mean magnetic field. Turbulent damping is not a classical wave-damping process but occurs because of the shearing of two counter-propagating Alfvén wave packets, which causes field-line wandering. While propagating along the perturbed field lines of the colliding partner, one wave packet undergoes transverse distortion with respect to the average magnetic field on a time comparable to the eddy turnover time (Lithwick and Goldreich 2001). This causes cascading of energy to higher wave numbers \(k_\parallel \), which decreases the wave energy at the resonant scale and makes CR transport more diffusive (Farmer and Goldreich 2004; Lazarian 2016; Lazarian and Xu 2022).

The ion–neutral damping of Alfvén waves occurs as a result of the frictional forces between ions and neutrals within a partially ionized medium (see Appendix C of Kulsrud and Pearce 1969). Ion–neutral collisions equilibrate the temperature of these two species so that their velocities approach \(v_{\textrm{i}}^2/v_{\textrm{n}}^2=m_{\textrm{n}}/m_{\textrm{i}}\), where neutrals refer to hydrogen and helium (for damping rates of this three-component fluid, see Soler et al. 2016). Additionally, ions are accelerated by the Lorentz force exerted by the Alfvén waves, which are self-generated by streaming CRs. However, this force is opposed by friction between neutral and ionized particle species, which damps the waves. In summary, the energy lost by the waves is converted to the thermal energy of ions and neutrals.

Furthermore, there are other processes that quench the CR streaming instability via CR trapping by magnetic bottles (Holcomb and Spitkovsky 2019), as well as pressure anisotropy (Zweibel 2020) or streaming bottlenecks as CRs are propagating in a multi-phase plasma with warm, dense clouds embedded in a hot phase in (approximate) pressure equilibrium: provided the magnetic field does not substantially differ between these phases (which can be obtained via thermal instability and collapse along the magnetic field), those cool clouds are characterized by a lower Alfvén velocity. As CRs propagate into the dense cloud, they are decelerated by efficient scattering to adjust to the smaller Alfvén speed. As a consequence, a reservoir of stationary CRs accumulates ahead of the cloud and a CR pressure gradient builds up across the cloud, which in turn is accelerated by this pressure gradient (Wiener et al. 2017a, 2019; Thomas et al. 2021). In all these processes, inhomogeneities of the magnetic field and/or the density of the background plasma cause additional confinement of the CRs, which modifies their transport speed and hence their momentum and energy transfer to the thermal plasma.

CR scattering and transport in the kinetic picture. The PIC technique allows for the computation of the nonlinear development of the gyroresonant streaming instability. It also enables the study of the concurrent evolution of electron and ion distributions along with the self-generated wave spectra; albeit only in one-dimensional setups due to the numerical complexity. Holcomb and Spitkovsky (2019) perform such simulations of the CR streaming instability and show that the initial instability growth of Alfvén waves agrees with the predictions from linear physics. However, the behavior of the instability during the non-linear saturation stage differs depending on the degree of anisotropy of the initial CR distribution: CR distributions that are highly anisotropic cannot efficiently isotropize in the Alfvén frame due to the reduced generation of left-handed resonant modes. Most importantly, because the CR gyroradius is required to be sufficiently small so that the resonant modes are well-resolved within the computational domain, numerical limitations enforce the choice of a rather high value of the Alfvén speed of \(v_{\textrm{a}} = 0.1 c\). This causes the streaming instability to saturate via particle trapping in magnetic bottles rather than via turbulent or non-linear Landau damping, which should be the dominating damping mechanisms in the ionized interstellar plasma, where \(v_{\textrm{a}} \sim 10^{-4} c\). Nevertheless, streaming CRs efficiently couple momentum and energy to the background plasma, causing the emergence of bulk flows and confirming the microphysical basis of CR-driven winds. If the CRs have a non-zero pitch angle, the nature of the dominant instability changes and instead CRs excite background ion–cyclotron modes in the frame that is comoving with the CRs (Shalaby et al. 2021). The associated growth rate is typically much larger in comparison to that of the resonant streaming instability at the ion gyroscale. 
Because this new instability grows waves on intermediate scales between the gyroradii of CR ions and electrons, lower-energy (resonant) CRs should get efficiently scattered and more strongly coupled to the background plasma (see Sect. 2.1.3).

Fluid-PIC modeling of CR scattering and transport. In order to access inhomogeneities on larger scales, simulate multi-dimensional effects, or approach realistic ISM parameters (with a large separation between the Alfvén and light speeds and a CR-to-thermal background density ratio of \(\sim 10^{-9}\)), the background needs to be treated as a fluid, of which the simplest description is provided by MHD models, while the CR component is modeled in the kinetic picture with the PIC method (Bai et al. 2015). MHD-PIC simulations of the CR streaming instability show that the CR distribution can be fully isotropized in the wave frame as a result of non-linear wave–particle interactions (rather than mirror reflections, Lebiga et al. 2018; Bai et al. 2019). In order to reduce the Poisson noise inherent to the PIC method, Bai et al. (2019) employ the \(\delta f\) method, where the CR distribution is split into an (analytically known) isotropic part \(f_0\) and the remainder \(\delta f = f-f_0\), which is evolved using Lagrangian PIC particles.

Adopting ion–neutral drag to damp Alfvén waves in a portion of the simulation domain, the simulations of Bambic et al. (2021) show the emergence of spatial CR gradients across the fully ionized region, which are directed opposite to the CR flux. This supports predictions of CR hydrodynamics, in which the combination of a CR energy density gradient and time-dependent energy flux balances wave–particle scattering throughout the domain. As the ion–neutral damping rate increases, Alfvén waves are efficiently damped and CRs are no longer efficiently isotropized (Plotnikov et al. 2021). A systematic MHD-PIC study of CR scattering rates in a steady state where the CR streaming instability balances ion–neutral damping yields momentum scalings consistent with quasi-linear theory, but with a reduced normalization, thus offering the promise of calibrating CR hydrodynamic models with kinetic simulations (Bai 2022). MHD-PIC simulations of charged dust and CRs embedded in magnetized gas show the growth of resonant drag instabilities, which cause the formation of charged dust concentrations (Ji et al. 2022). Those excite Alfvén waves that efficiently scatter the initially perfectly streaming CRs, which eventually become fully isotropic and are decelerated to drift at the Alfvén speed. The associated momentum transport to the background plasma may be responsible for outflows in the dusty CGM around quasars or superluminous galaxies.

All these models use MHD to describe the background plasma, which cannot capture Landau damping by separate electron and ion fluids and integrates out the electron scale. Moreover, the MHD limit operates on scales larger than the ion skin depth, \(kd_{\textrm{i}}\ll 1\), which also makes it impossible to resolve the faster-growing intermediate-scale instability (see Shalaby et al. 2023 and Fig. 5) and thus precludes studying the complete plasma system. A more promising approach to include these physical processes is to model the background with separate fluids for the electron, ion and neutral components (which are equipped with Landau closures for the electron and ion populations). Those fluids are then coupled with the CR population that is represented by PIC particles via Maxwell’s equations. Simulations in one spatial dimension and three velocity space dimensions with this approach show that the CR streaming instability indeed saturates via processes that remove wave energy from the resonant scale, namely non-linear Landau damping as well as (inverse) cascading of wave energy towards larger scales in the regime of small Alfvén velocities \(v_{\textrm{a}}\lesssim 10^{-2}c\) (Lemmerz et al. 2023).

2.3.6 Cosmic ray scattering on MHD turbulence – external confinement by turbulence

CR interactions with Alfvén waves and fast and slow magnetosonic waves. Provided MHD turbulence is injected on scales much larger than the CR gyro radii, Alfvén and slow modes are less efficient in pitch-angle scattering particles (Chandran and Dennis 2006; Yan and Lazarian 2004; Maiti et al. 2022; Xu and Lazarian 2018). This is a consequence of the critical balance condition in the inertial-range between linear wave periods and non-linear turnover timescales in MHD turbulence (Goldreich and Sridhar 1995), which leads to elongated “eddies” along the magnetic field on small spatial scales with a wave number scaling \(k_\perp \propto k_\parallel ^{3/2}\). While the Alfvénic cascade perpendicular to the mean magnetic field exhibits Kolmogorov scaling with \(E(k_\perp )\textrm{d}k_\perp \propto k_\perp ^{-5/3}\textrm{d}k_\perp \), the parallel scaling is significantly steeper,

$$\begin{aligned} E(k_\parallel ) \textrm{d}k_\parallel = E\left[ k_\perp (k_\parallel )\right] \frac{\textrm{d}k_\perp }{\textrm{d}k_\parallel }\textrm{d}k_\parallel \propto k_\parallel ^{-2}\textrm{d}k_\parallel . \end{aligned}$$
(39)
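The exponent in Eq. (39) follows from simple power counting and can be checked numerically. The short Python sketch below is illustrative only: the function name `parallel_slope` and its default arguments are our own choices, and it assumes pure power laws, \(E(k_\perp )\propto k_\perp ^{-5/3}\) and \(k_\perp \propto k_\parallel ^{3/2}\), throughout the inertial range.

```python
import math


def parallel_slope(perp_slope=-5.0 / 3.0, aniso_exp=1.5, k_lo=10.0, k_hi=1.0e3):
    """Logarithmic slope of E(k_par) implied by E(k_perp) and k_perp ~ k_par**aniso_exp.

    All arguments are hypothetical illustration parameters (arbitrary units).
    """

    def e_par(k_par):
        k_perp = k_par**aniso_exp                        # critical-balance anisotropy
        jacobian = aniso_exp * k_par**(aniso_exp - 1.0)  # dk_perp / dk_par
        return k_perp**perp_slope * jacobian             # E[k_perp(k_par)] * Jacobian

    # Slope measured between two wave numbers of a pure power law.
    return math.log(e_par(k_hi) / e_par(k_lo)) / math.log(k_hi / k_lo)


if __name__ == "__main__":
    print(f"d ln E / d ln k_par = {parallel_slope():.3f}")
```

With the default arguments the returned slope should be very close to \(-2\), i.e., \(\tfrac{3}{2}\times (-\tfrac{5}{3})+\tfrac{1}{2}\), in agreement with Eq. (39).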

Because CRs resonate with the parallel wave vectors (see equation 1 and Fig. 3), the steep spectrum leaves little wave energy to scatter CRs on the resonant scale. In configuration space, this result can be intuitively understood because of the alignment of the elongated eddies along the direction of the mean magnetic field so that the gyroradius of a CR encloses numerous eddies. These incoherently aligned neighboring eddies exert incoherent Lorentz forces on the CR and cause it to random walk during a gyro orbit. This phenomenon attenuates and broadens the gyro resonance, resulting in a decrease in the efficiency of CR scattering. Provided the compressible fast-mode cascade is isotropic and independent of the strength of the large-scale field with respect to the turbulence (Cho and Lazarian 2003; Makwana and Yan 2020), and provided it has a flat spectral slope similar to the Iroshnikov-Kraichnan (Iroshnikov 1963; Kraichnan 1965) cascade \({\propto ~}k^{-3/2}\) (Zakharov and Sagdeev 1970; Cho and Lazarian 2003; Makwana and Yan 2020), fast modes may dominate CR propagation for typical interstellar conditions (Yan and Lazarian 2004; Maiti et al. 2022). Nonetheless, fast modes experience significant kinetic damping on collisionless scales (Klein et al. 2012; Told et al. 2016). The precise details of their cascade and spectral slope remain uncertain.

CR interactions with compressible modes can also accelerate these particles via a second-order Fermi process. In the scenario where CRs are only advected with the gas, any adiabatic energy gain through compression will be completely lost during the process of rarefaction. Additionally accounting for diffusion changes this picture: in a single compression event, the CRs adiabatically gain energy as they develop a peaked distribution while diffusing outwards conserves CR energy. Thus, there is a net CR energy gain in a compression event while there is a net energy loss in a rarefaction event. Because interactions with compressible waves are more probable than with expanding waves, the gain in mean CR energy is second order in the wave velocity divided by the light speed (Ptuskin 1988). This acceleration efficiency depends on the balance between advective and diffusive timescales, \(\kappa /(c_{\textrm{s}}L)\), where \(c_{\textrm{s}}\) is the sound speed and L is the size of the compressible perturbation. The net CR energy gain is reduced if we account for CR streaming in addition to CR diffusion because (i) CRs drain energy from gas motions at a reduced rate that is modified by the factor \(1-v_{\textrm{a}}/c_{\textrm{s}}\) and (ii) the excitation of the streaming instability increases the wave energy at the expense of CR energy (which is less important because the CR pressure gradients are typically misaligned with the magnetic field; Bustard and Oh 2022). This removal of compressible wave energy leads to a steepening of the compressive turbulent power spectrum when the time required for wave damping (\(t_{\textrm{damp}}\sim \rho v^2/{\dot{\varepsilon }}_{\textrm{cr}} \propto \varepsilon _{\textrm{cr}}^{-1}\)) becomes similar to the timescale of the turbulent cascade process (Bustard and Oh 2023).

Waves in strong MHD turbulence are not long-lived but have a decay time that is comparable to their eddy turn-over time. Hence wave–particle interactions are not efficiently mediated through linear resonances, but are instead mediated through non-linearly broadened resonances (Yan and Lazarian 2008; Lynn et al. 2012). This is particularly important for particles with perpendicular pitch angles with respect to the local magnetic field orientation since those particles would otherwise find no scattering modes that meet the resonance condition on the correspondingly tiny length scales because those modes are subject to efficient damping.

MHD and thermal effects on CR transport. Using a characteristic ISM volume that is initially in a thermally unstable state, Commerçon et al. (2019) study CR propagation within the turbulent and magnetised interstellar plasma. They identify a clear transition in the ISM dynamics where the thermal instability is suppressed for CR diffusion coefficients below a critical value of \(\kappa _{\textrm{crit}}\sim 10^{24}\)–\(10^{25}~\textrm{cm}^2~\textrm{s}^{-1}\) or in regions where the CR pressure is at least ten times larger than the thermal pressure. This is because of the efficient trapping of CRs in these regions and because of the substantially larger cooling times of CR ions with relativistic energies (see Sect. 2.4) compared to the quickly cooling thermal plasma (Jubelgas et al. 2008). The transport of streaming CRs depends critically on the level of ionic Alfvén fluctuations. Beattie et al. (2022) study compressible MHD turbulence simulations and find that for sub-Alfvénic turbulence, the probability density function of the ionic Alfvén velocity only depends on the density fluctuations that result from shocks forming parallel to the magnetic field. By contrast, for super-Alfvénic turbulence, the correlations between magnetic and density fluctuations are more complex. Sampson et al. (2023) use a large ensemble of MHD turbulence simulations to quantify how plasma properties affect the transport of streaming CRs. They find that the macroscopic CR transport can be described by a combination of streaming along the mean field and superdiffusion along and across it. The Alfvén Mach number \(\mathcal {M}_{\textrm{a}}\) (that characterizes the strength of the large-scale field with respect to the turbulence) sets the anisotropy between parallel and perpendicular diffusion and the ionization fraction modulates the magnitude of the diffusion coefficient. CR transport does not depend on the compressibility except in the sub-Alfvénic (\(\mathcal {M}_{\textrm{a}}\lesssim 0.5\)) regime.

2.4 Radiative and non-radiative cosmic ray processes and their cooling times

2.4.1 Overview

Fig. 12

Schematic overview of relativistic particle populations and non-thermal radiative processes in galaxies and galaxy clusters. Cosmologically growing galaxies and groups accrete matter and merge with other halos to assemble larger structures, thus releasing gravitational binding energy in the form of kinetic energy. Other sources of kinetic energy injection are SNe and AGNs as the most important non-gravitational energy sources (in red). These energy sources give rise to plasma processes at shocks and interactions with unstable electromagnetic plasma modes (in green), which accelerate relativistic particle populations, so-called CRs (in blue). Non-thermal observables (in yellow) link these CR populations to the underlying physical acceleration process. However, there is a degeneracy because the radio synchrotron and inverse Compton (IC) radiation could be emitted by any of the primary, secondary, and re-accelerated CR electron populations. This degeneracy can be (partially) broken by observing the characteristic spectral pion-decay feature in the \(\gamma \)-ray spectrum that is associated with hadronic CR interactions with gas protons. This characteristic serves as a distinctive indicator of the presence of a population of CR protons and has been observed in SNRs and in the diffuse emission in the Milky Way. The decay of charged pions that are also produced in hadronic CR interactions yields neutrinos at a very low flux (not shown). Image reproduced with permission from Pfrommer et al. (2008), copyright by the author(s)

Figure 12 provides an overview of the various relativistic particle populations and radiative processes in galaxies and galaxy clusters. Shocks driven by cosmologically growing structures, AGN jets, and galactic winds can directly accelerate primary CR electrons and ions. The hadronic interaction between CR protons and protons of the surrounding gas results in the production of secondary relativistic electrons and positrons (see the decay chain in equation 41). These two CR electron populations quickly cool at high energies via synchrotron emission in the ubiquitous magnetic fields in galaxies and clusters and by means of inverse Compton (IC) interactions with the radiation fields provided by the CMB or by stellar light to settle down at Lorentz factors \(\gamma _{\textrm{e}}=E_{\textrm{e}}/(m_{\textrm{e}} c^2)\sim 100\)–300. This makes them invisible in our accessible observational windows in the radio and in \(\gamma \) rays. Continuous in-situ re-acceleration of these CR electrons by means of interactions with turbulent MHD waves leads to the emergence of a distinct population of re-accelerated relativistic electrons.

All three CR electron populations contribute to the observed radio synchrotron emission (which is described in detail below) and should Compton upscatter CMB and starlight photons into the X-ray and \(\gamma \)-ray regime. Owing to the uncertainty in the distribution of magnetic field strengths, the predictive ability of radio synchrotron emission alone is limited. In the case of clusters, the observation of the conceptually simpler IC emission is challenging due to the intense radiation background in the soft and hard X-ray range. In galaxies, IC interactions with the intense starlight photon fields generate a significant level of \(\gamma \) rays and rival the pion-decay \(\gamma \)-ray emission resulting from hadronic CR-proton interactions. Generally, it is hard to distinguish these leptonic and hadronic emission components. The “pion bump”, i.e., the spectral decay signature at half the neutral pion’s rest mass at around 67.5 MeV in the spectral photon density, which originates from the two photons leaving the interaction site back to back, is a unique electromagnetic signature of hadronic interactions. It has been detected at old SNRs (Ackermann et al. 2013) and can be analytically modeled (Pfrommer and Enßlin 2004a). The Fermi Gamma-ray Space Telescope (hereafter called Fermi) produced a \(\gamma \)-ray map of the entire sky, which is composed of resolved and unresolved point sources, and the diffuse and compact emission from various source classes in the Milky Way and nearby galaxies. Using information field theory (Enßlin et al. 2009), the \(\gamma \)-ray map can be decomposed into several independent emission components while simultaneously employing correlations of the diffuse \(\gamma \)-ray flux in angular and energy space. Reconstructing one point source component and three diffuse components allows one to separate spectrally and spatially distinct diffuse components that resemble the leptonic IC emission and two pion-decay components that trace the cold and warm phases of the ISM in the Milky Way, respectively (see Fig. 13, Platz et al. 2022, improving upon an earlier analysis by Selig et al. 2015). Alternatively, the accompanying neutrinos provide another unique signal of hadronic CR proton reactions.

Fig. 13

Decomposition of the diffuse \(\gamma \)-ray sky observed by the Fermi \(\gamma \)-ray space telescope in a Mollweide projection. This physics-informed model reconstructs one point source component (not shown) and two hadronic pion decay components, one of which uses the Planck dust map as a modifiable template and probes CR interactions with the cold ISM phase that is narrowly distributed around the Galactic midplane (panel a) while the other one traces the more extended warm phase of the ISM (with a soft spectral index, visualized with an orange color in panel b). The third diffuse component exhibits a harder spectral index, characteristic of the leptonic IC component (denoted by blue colors in panel b), and also includes the Fermi Bubbles. The sum of all three diffuse components is shown in panel c. Image reproduced with permission from Platz et al. (2022), copyright by ESO

To quantify these considerations, in the following we introduce the physics of hadronic and leptonic CR interactions and pay special attention to the cooling timescales of these relativistic particle species. The reader is referred to Rybicki and Lightman (1979) for a more detailed exposition of the physics of radiative processes or to Sarazin (1999) for solutions of the energy spectrum of primary CR electrons under simplified assumptions.

2.4.2 Cosmic ray ion interactions

Streaming instability losses. As CRs stream down their gradient, they resonantly excite Alfvén waves through the streaming instability (Kulsrud and Pearce 1969). The associated transfer of energy to Alfvén waves is given by the last term of equation (22) and causes CRs to lose their energy density at a rate

$$\begin{aligned} {{\dot{\varepsilon }}}_{\textrm{st}} = -\left| \varvec{v}_{\textrm{st}}\varvec{\cdot }\varvec{\nabla }P_\textrm{cr}\right| \quad \Rightarrow \quad \tau _{\textrm{st}} = \frac{\varepsilon _\textrm{cr}}{\left| {{\dot{\varepsilon }}}_{\textrm{st}}\right| }, \end{aligned}$$
(40)

where \(\tau _{\textrm{st}}\) is the CR loss timescale due to CR streaming and \(\varvec{v}_{\textrm{st}}\) is the CR streaming velocity in the steady-state streaming limit defined in equation (28). This definition of the energy density loss rate ensures that the CR cooling goes to zero in the limit of balanced turbulence, i.e., for \({\bar{\nu }}_+={\bar{\nu }}_-\), while it is maximized for imbalanced turbulence, \({\bar{\nu }}_\pm \gg {\bar{\nu }}_\mp \), which can be realized in the self-confinement picture for strong CR fluxes. As detailed in Sect. 2.3.5, the wave energy is dissipated through various collisionless wave-damping processes, thereby heating the background plasma.
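For an order-of-magnitude feeling for Eq. (40), the following Python sketch evaluates \(\tau _{\textrm{st}}\) under simplifying assumptions that are ours and not part of the equation itself: the CR pressure gradient is aligned with the streaming velocity and has a single gradient length \(L_{\textrm{cr}}=P_{\textrm{cr}}/|\varvec{\nabla }P_{\textrm{cr}}|\), and the CR fluid obeys \(P_{\textrm{cr}}=(\gamma _{\textrm{cr}}-1)\,\varepsilon _{\textrm{cr}}\) with \(\gamma _{\textrm{cr}}=4/3\). The function name and the example values are hypothetical.

```python
KPC_CM = 3.0857e21   # 1 kpc in cm
KM_S_CM = 1.0e5      # 1 km/s in cm/s
MYR_S = 3.15576e13   # 1 Myr in s


def streaming_loss_time_myr(v_st_kms, l_cr_kpc, gamma_cr=4.0 / 3.0):
    """tau_st = eps_cr / |v_st . grad P_cr| for |grad P_cr| = P_cr / L_cr.

    Assumes (illustration only) that grad P_cr is aligned with v_st and that
    P_cr = (gamma_cr - 1) * eps_cr.
    """
    p_over_eps = gamma_cr - 1.0                       # P_cr / eps_cr
    # Fractional energy loss rate |eps_dot_st| / eps_cr in 1/s.
    rate = v_st_kms * KM_S_CM * p_over_eps / (l_cr_kpc * KPC_CM)
    return 1.0 / rate / MYR_S


if __name__ == "__main__":
    # Hypothetical values: streaming at 30 km/s down a 1 kpc gradient length.
    print(f"tau_st = {streaming_loss_time_myr(30.0, 1.0):.0f} Myr")
```

Under these assumptions the loss time reduces to \(\tau _{\textrm{st}}=L_{\textrm{cr}}/[(\gamma _{\textrm{cr}}-1)\,v_{\textrm{st}}]\), which is roughly 100 Myr for the example values and scales linearly with the gradient length.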

Hadronic interaction. Of particular relevance for deciphering the CR proton population is the hadronic reaction of a CR proton with a thermal proton: when the momentum of CR protons surpasses the kinematic threshold of approximately \(0.78~\text{ GeV }/c\), the interaction generates pions that subsequently decay, producing secondary electrons, positrons, neutrinos, and gamma raysFootnote 15:

$$\begin{aligned} \pi ^\pm&\rightarrow \mu ^\pm + \nu _{\mu }/{\bar{\nu }}_{\mu } \rightarrow e^\pm + \nu _{e}/{\bar{\nu }}_{e} + \nu _{\mu } + {\bar{\nu }}_{\mu }, \\ \pi ^0&\rightarrow 2 \gamma . \end{aligned}$$
(41)

Thus, only CR protons with momentum exceeding this kinematic threshold are observable through the detection of their decay products, either directly in the form of \(\gamma \)-ray and neutrino emissionFootnote 16 or indirectly via radiative processes such as synchrotron and IC emission of secondary electrons and positrons. The cooling timescale due to hadronic processes above the kinematic threshold for pion production is given by

$$\begin{aligned} \tau _{\textrm{pp}} = \frac{1}{0.5\,n_{\textrm{n}} v_{\textrm{cr}}\sigma _{\textrm{pp}}} \approx 6.6\,\left( \frac{n_{\textrm{n}}}{10^{-2}~\textrm{cm}^{-3}}\right) ^{-1}\,\textrm{Gyr}. \end{aligned}$$
(42)

Here, \(\sigma _{\textrm{pp}} \approx 32~\)mbarn represents the inelastic cross section of protons with an inelasticity of approximately 0.5. Additionally, \(n_{\textrm{n}}=\rho /m_{\textrm{p}}\) denotes the number density of target nucleons for the hadronic reaction (assuming gas of mass density \(\rho \) that is mainly composed of hydrogen and helium) and \(v_{\textrm{cr}}\approx c\) is the CR proton velocity that approaches the light speed c for relativistic CR energies.
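The normalization of Eq. (42) can be reproduced with a few lines of Python. This is a minimal sketch using the values quoted above (\(\sigma _{\textrm{pp}}\approx 32\) mbarn, inelasticity 0.5, \(v_{\textrm{cr}}\approx c\)); the function name and the rounded unit conversion constants are our own choices.

```python
C_LIGHT = 2.998e10    # speed of light in cm/s
MBARN_CM2 = 1.0e-27   # 1 mbarn in cm^2
GYR_S = 3.15576e16    # 1 Gyr in s


def hadronic_cooling_time_gyr(n_n, sigma_pp_mbarn=32.0, inelasticity=0.5,
                              v_cr=C_LIGHT):
    """tau_pp = 1 / (inelasticity * n_n * v_cr * sigma_pp), returned in Gyr.

    n_n is the target nucleon density in cm^-3.
    """
    rate = inelasticity * n_n * v_cr * sigma_pp_mbarn * MBARN_CM2  # 1/s
    return 1.0 / rate / GYR_S


if __name__ == "__main__":
    for n_n in (1.0e-3, 1.0e-2, 1.0):   # illustrative densities in cm^-3
        print(f"n_n = {n_n:g} cm^-3 -> tau_pp = {hadronic_cooling_time_gyr(n_n):.3g} Gyr")
```

For \(n_{\textrm{n}}=10^{-2}~\textrm{cm}^{-3}\) this returns approximately 6.6 Gyr, matching Eq. (42); the timescale is inversely proportional to the nucleon density.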

Coulomb interactions of CR ions with a thermal plasma. The Coulomb field of electrons of the background plasma can deflect a CR ion. The resulting momentum and energy transfer from the CR ion to background electrons (i.e., the stopping power) decelerates the ion. The calculation is most easily done in the center-of-momentum reference frame, which nearly coincides with the CR ion rest frame. In this frame (that is denoted by primed quantities), the electron is deflected by an angle \(\theta _{\textrm{d}}'\) and attains a perpendicular momentum change of \(\varDelta p_{\textrm{e}}'=m_{\textrm{e}}^{} v_{\textrm{e}}'\theta _{\textrm{d}}'\) (in the non-relativistic limit). Because this change of momentum occurs perpendicular to the boost direction from the lab to the CR ion rest frame, we have \(\varDelta p_{\textrm{e}}'=\varDelta p_{\textrm{e}}\). Hence in the lab frame, the associated energy gain of the electron corresponds to the energy loss of the ion of

$$\begin{aligned} \varDelta E=\frac{\left( \varDelta p_{\textrm{e}}'\right) ^2}{2 m_{\textrm{e}}} =\frac{m_{\textrm{e}}}{m_{\textrm{i}}}\,\theta _{\textrm{d}}'^2 E, \end{aligned}$$
(43)

where \(E=\frac{1}{2}m_{\textrm{i}}^{}v_{\textrm{i}}^2\) is the CR ion energy in the lab frame.Footnote 17 Note that the contribution of ion–ion scattering to the energy loss of the incoming fast ion is suppressed because the inertia of the background ions is much larger than that of the background electrons. If the impact parameter of the interaction is less than a critical impact parameter (at which the electron’s kinetic energy balances the electrostatic potential energy on average in the CR ion frame),

$$\begin{aligned} b_0=\frac{2 Z e^2}{m_{\textrm{e}} v_{\textrm{i}}^2} =\frac{2 r_0 c^2}{v_{\textrm{i}}^2}, \end{aligned}$$
(44)

we observe a (rare) large-angle scattering event. In the above expression Ze and \(v_{\textrm{i}}\) are the charge and velocity of the CR ion, \(r_0=Ze^2/(m_{\textrm{e}} c^2)\) denotes the classical electron radius, \(m_{\textrm{e}}\) is the electron mass and e is the elementary charge. Because there are many more electrons at distances larger than \(b_0\), small-angle deflections at large impact parameters up to the Debye length (which characterizes the scale at which the charge of a plasma particle is screened) dominate the Coulomb scattering rate by a factor of \(2\ln \varLambda \), where \(\ln \varLambda \sim 35\)–40 is the Coulomb logarithm. The timescale at which the average squared deflection angle \(\langle \theta _{\textrm{d}}'^2\rangle \) in Eq. (43) becomes approximately equal to unity corresponds to the deflection time, \(\tau _{\textrm{d}}^\textrm{ei}\). The Coulomb cooling timescale (\(\tau _{\textrm{Coul,i}}\)) for a CR ion as it moves through a plasma is determined by dividing the particle’s energy by its rate of energy loss. In other words, \(\tau _{\textrm{Coul,i}}\) can be calculated as the deflection timescale divided by the average relative energy transfer to the background plasma:

$$\begin{aligned} \tau _{\textrm{Coul,i}} = \frac{E}{|{\dot{E}}|}\Bigg |_{\textrm{Coul,i}} \approx \tau _{\textrm{d}}^\textrm{ei}\, \frac{m_{\textrm{i}}}{m_{\textrm{e}}} = \frac{m_{\textrm{i}}}{m_{\textrm{e}} n_{\textrm{e}} v_{\textrm{i}} \sigma _{{\textrm{ei}}}} = \frac{m_{\textrm{i}}}{m_{\textrm{e}} n_{\textrm{e}} v_{\textrm{i}} \pi b_0^2\, 2\ln \varLambda } = \frac{m_{\textrm{i}}^{}v_{\textrm{i}}^3}{8\pi m_{\textrm{e}} n_{\textrm{e}} r_0^2 c^4\ln \varLambda }, \end{aligned}$$
(45)

where \(n_{\textrm{e}}\) is the electron number density and \(\sigma _{{\textrm{ei}}}\) is the Coulomb cross section. In the second step, we adopted the non-relativistic limit for simplicity. While the hadronic and Coulomb cooling times both scale with the inverse gas density, the strong velocity dependence of \(\tau _{\textrm{Coul,i}}\propto v_{\textrm{i}}^3\) implies that for proton energies \(\lesssim 1\) GeV, Coulomb interactions are more effective than the hadronic reaction in removing energy from the CR proton (see right-hand panel of Fig. 14). A more precise calculation for this process by Gould (1972a) also takes into account quantized plasma oscillations, which slightly modify the Coulomb logarithm.
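The chain of equalities in Eq. (45) can be verified numerically. The Python sketch below evaluates the Coulomb cooling time of a non-relativistic CR proton once via the cross section \(\sigma _{\textrm{ei}}=\pi b_0^2\,2\ln \varLambda \) with \(b_0\) from Eq. (44), and once via the closed form. The function names, the choice \(\ln \varLambda =37.5\) (the midpoint of the range quoted above), and the example parameters are assumptions made for illustration; the expressions are only meaningful for \(v_{\textrm{i}}\ll c\).

```python
import math

C_LIGHT = 2.998e10     # speed of light in cm/s
R_E = 2.818e-13        # classical electron radius in cm (r_0 for Z = 1)
MP_OVER_ME = 1836.15   # proton-to-electron mass ratio
MYR_S = 3.15576e13     # 1 Myr in s


def coulomb_time_via_cross_section(beta, n_e, ln_lambda=37.5,
                                   mass_ratio=MP_OVER_ME, r0=R_E):
    """tau = m_i / (m_e n_e v_i sigma_ei) with sigma_ei = pi b_0^2 2 ln(Lambda), in s."""
    v_i = beta * C_LIGHT
    b0 = 2.0 * r0 / beta**2                        # Eq. (44): b_0 = 2 r_0 c^2 / v_i^2
    sigma_ei = math.pi * b0**2 * 2.0 * ln_lambda
    return mass_ratio / (n_e * v_i * sigma_ei)


def coulomb_time_closed_form(beta, n_e, ln_lambda=37.5,
                             mass_ratio=MP_OVER_ME, r0=R_E):
    """tau = m_i v_i^3 / (8 pi m_e n_e r_0^2 c^4 ln(Lambda)), in s."""
    return mass_ratio * beta**3 / (8.0 * math.pi * n_e * r0**2 * C_LIGHT * ln_lambda)


if __name__ == "__main__":
    beta, n_e = 0.1, 1.0e-2   # hypothetical proton speed (v/c) and electron density (cm^-3)
    t1 = coulomb_time_via_cross_section(beta, n_e)
    t2 = coulomb_time_closed_form(beta, n_e)
    print(f"cross-section form: {t1 / MYR_S:.3g} Myr, closed form: {t2 / MYR_S:.3g} Myr")
```

Both forms agree to rounding error, and doubling the velocity increases the cooling time by a factor of eight, which illustrates the \(v_{\textrm{i}}^3\) scaling.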

Fig. 14

Cooling timescales of CRs in various astrophysical plasmas as a function of kinetic energy. Left: CR electron cooling times resulting from Coulomb and IC/synchrotron interactions for densities and magnetic field strengths as indicated. Right: CR proton cooling times as a result of Coulomb and hadronic interactions for the same parameters as in the left panel. It is evident that CR protons with energies above 10 GeV have a lifespan that is at least 60 times longer than that of CR electrons at any energy. CR electrons can only persist for a Hubble time without re-acceleration in the dilute outer regions of the CGM and the ICM. Electrons emitting radio waves at a frequency of 1.4 GHz possess an energy of approximately 5 GeV in microgauss magnetic fields, resulting in a lifespan of 0.2 Gyr or less. If these electrons are generated through hadronic interactions of CRs, their parent CR protons would have had energies around 80 GeV, so that they can be injected over considerably longer lifetimes. Image reproduced with permission from Enßlin et al. (2011), copyright by ESO

Ionization interactions of CR ions. Ionization losses are important both for low-energy CRs and for the ISM. Ionization losses for CR protons that move with velocity \(v_{\textrm{p}}\) can be obtained from the Bethe-Bloch equation (Groom and Klein 2000; Enßlin et al. 2007). The associated ionization timescale in the non-relativistic limit reads

$$\begin{aligned} \tau _{\textrm{ion}} = \frac{E}{|{\dot{E}}|}\Bigg |_{\textrm{ion}} \approx \frac{m_{\textrm{p}}^{}v_{\textrm{p}}^3}{\displaystyle 8\pi m_{\textrm{e}} r_0^2 c^4\sum _Z Z n_Z\ln \left( 2 m_{\textrm{e}} v_{\textrm{p}}^2/I_Z\right) }, \end{aligned}$$
(46)

where \(n_Z\) represents the density of atomic species characterized by an electron number Z, \(I_Z\) is the ionization potential (\(I_Z=13.6\) and 24.6 eV for hydrogen and helium, respectively), and we ignore a density correction factor of order unity. The similarity of ionization and Coulomb loss timescales (Eq. 45) is not a coincidence but is rooted in the interaction physics that only differs by the ionization process.
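As an illustration of Eq. (46), the Python sketch below evaluates the non-relativistic ionization loss time of a CR proton in neutral gas. The function name, the example composition (a helium-to-hydrogen number ratio of 0.1), and the proton velocity are hypothetical choices; the density correction factor is ignored as in the text, and the expression only applies as long as \(2 m_{\textrm{e}} v_{\textrm{p}}^2 > I_Z\) and \(v_{\textrm{p}}\ll c\).

```python
import math

C_LIGHT = 2.998e10     # speed of light in cm/s
R_E = 2.818e-13        # classical electron radius in cm
ME_C2_EV = 510998.95   # electron rest energy in eV
MP_OVER_ME = 1836.15   # proton-to-electron mass ratio
MYR_S = 3.15576e13     # 1 Myr in s


def ionization_time_s(beta, species, mass_ratio=MP_OVER_ME):
    """Non-relativistic ionization loss time of Eq. (46) in seconds.

    beta    : proton velocity v_p / c
    species : iterable of (Z, n_Z in cm^-3, I_Z in eV) for each atomic species
    """
    # sum_Z Z n_Z ln(2 m_e v_p^2 / I_Z); requires 2 m_e v_p^2 > I_Z.
    stopping_sum = sum(
        z * n_z * math.log(2.0 * ME_C2_EV * beta**2 / i_z)
        for z, n_z, i_z in species
    )
    return mass_ratio * beta**3 / (8.0 * math.pi * R_E**2 * C_LIGHT * stopping_sum)


if __name__ == "__main__":
    # Hypothetical neutral gas: hydrogen at 1 cm^-3 plus 10 per cent helium by number.
    gas = [(1, 1.0, 13.6), (2, 0.1, 24.6)]
    print(f"tau_ion = {ionization_time_s(0.1, gas) / MYR_S:.3g} Myr")
```

Comparing with Eq. (45) shows that the sum \(\sum _Z Z n_Z\ln (2 m_{\textrm{e}} v_{\textrm{p}}^2/I_Z)\) takes the role of \(n_{\textrm{e}}\ln \varLambda \), so that for equal electron densities the two timescales differ only by the ratio of the logarithms.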

2.4.3 Cosmic ray lepton interactions

Coulomb interactions of CR electrons with a thermal plasma. Those interactions can be derived analogously to the case of CR ions, but now we consider the scattering of a CR electron in the Coulomb field of an electron of the background plasma. This increases the energy transfer by a factor of the mass ratio \(m_{\textrm{i}}/m_{\textrm{e}}\) (compared to the energy transfer rate for ion-electron collisions) because of the identical masses of the scattering partners, thus making this process more efficient than electron-ion scattering while the scattering rates of both processes are nearly identical. Hence, we obtain from Eq. (45):

$$\begin{aligned} \tau _{\textrm{Coul,e}} = \frac{E_{\textrm{e}}}{|{\dot{E}}_{\textrm{e}}|}\Bigg |_{\textrm{Coul,e}} \approx \frac{1}{n_{\textrm{e}} v_{\textrm{e}} \sigma _{{\textrm{e e}}}} = \frac{1}{n_{\textrm{e}} v_{\textrm{e}} \pi b_0^2\, 2\ln \varLambda } = \frac{v_{\textrm{e}}^3}{8\pi n_{\textrm{e}} r_0^2 c^4\ln \varLambda }, \end{aligned}$$
(47)

where \(\sigma _{{\textrm{ee}}}\) is the Coulomb cross section and \(b_0=2 r_0 c^2/v_{\textrm{e}}^2\) is the impact parameter where the electron’s kinetic energy balances its electrostatic potential energy. Interestingly, the Coulomb cooling timescale of CR electrons also decreases steeply towards low electron energies, as it does for the CR ions (see left-hand panel of Fig. 14). A more precise calculation by Gould (1972b) results in a Coulomb energy loss timescale for CR electrons that differs slightly from that of CR ions for two reasons: first, exchange effects slightly modify the Coulomb logarithm, and second, CR electrons and positrons can lose a significant fraction of their energy in a single interaction with a plasma electron.
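The chain of equalities in equation (47) can be checked numerically. The sketch below evaluates the timescale once from the closed form and once from the cross section \(\sigma _{{\textrm{ee}}}=\pi b_0^2\,2\ln \varLambda \). The function names and the default Coulomb logarithm of 40 are assumptions made for illustration only; the formula applies to non-relativistic electrons.

```python
import math

C_LIGHT = 2.99792458e10   # speed of light [cm/s]
R_0 = 2.8179403e-13       # classical electron radius [cm]
YR = 3.15576e7            # Julian year [s]


def tau_coul_e(beta, n_e, coulomb_log=40.0):
    """Closed form of Eq. (47): v^3 / (8 pi n_e r_0^2 c^4 ln Lambda), in seconds.

    beta: v_e / c (should be << 1), n_e: electron density [cm^-3].
    coulomb_log = 40 is an assumed, illustrative value.
    """
    v = beta * C_LIGHT
    return v**3 / (8.0 * math.pi * n_e * R_0**2 * C_LIGHT**4 * coulomb_log)


def tau_from_cross_section(beta, n_e, coulomb_log=40.0):
    """Same timescale written as 1 / (n_e v sigma_ee)."""
    v = beta * C_LIGHT
    b0 = 2.0 * R_0 / beta**2                       # b0 = 2 r0 c^2 / v^2
    sigma_ee = math.pi * b0**2 * 2.0 * coulomb_log
    return 1.0 / (n_e * v * sigma_ee)


# Assumed example: beta = 0.2 electron in a plasma with n_e = 1e-3 cm^-3
t_closed = tau_coul_e(0.2, 1e-3)
t_sigma = tau_from_cross_section(0.2, 1e-3)
print(f"tau_Coul,e ~ {t_closed / YR:.2e} yr (closed form), "
      f"{t_sigma / YR:.2e} yr (cross section)")
```

Both routes agree to rounding error, and halving the velocity shortens the timescale by a factor of eight, which is the steep decrease towards low energies noted above.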

Inverse Compton interactions. At highly relativistic energies, CR electrons experience synchrotron interactions with the magnetic field and IC interactions with the ambient photon field. In the intergalactic medium and in the bulk of the intracluster plasma, the photon energy density is dominated by the CMB while the energy density of starlight photons (in the infrared-to-ultraviolet regime) exceeds that of the CMB in and around galaxies. In general, the photon energy density is the sum of the CMB and starlight, \(\varepsilon _{\textrm{ph}}=\varepsilon _{\textrm{cmb}}+\varepsilon _\star \).

To derive the energy loss rate of electrons due to IC interactions, we transform to the electron rest frame. In this frame, we can assume elastic photon scattering with a relativistic electron of energy \(E_{\textrm{e}}=\gamma _{\textrm{e}} m_{\textrm{e}} c^2\) in the Thomson regime, i.e., when the Lorentz-boosted photon energy is much less than the electron rest mass energy, \(\gamma _{\textrm{e}}\langle E \rangle \ll m_{\textrm{e}}c^2\), where \(\langle E \rangle \) is the average photon energy before scattering. Hence, after Lorentz boosting the photon into the electron rest system, reversing the normal component of the photon momentum as a result of elastic scattering, and Lorentz de-boosting it into the lab system, we pick up two Lorentz factors and find a net photon energy gain of \(\langle E_1\rangle =\frac{4}{3}\,\gamma _{\textrm{e}}^2\,\langle E\rangle \), where the factor 4/3 derives from averaging over an isotropic photon field. This photon energy gain corresponds to the IC energy loss rate of a CR electron of

$$\begin{aligned} {\dot{E}}_{\textrm{e}} = -\sigma _{\textrm{T}}c n_{\textrm{ph}} \langle E_1 \rangle = -\frac{4}{3} \sigma _{\textrm{T}}c \varepsilon _{\textrm{ph}} \gamma _{\textrm{e}}^2 = -\frac{\sigma _{\textrm{T}}c}{6\pi }\, B_{\textrm{ph}}^2 \gamma _{\textrm{e}}^2, \end{aligned}$$
(48)

where \(\sigma _{\textrm{T}}=8\pi r_0^2/3= 8\pi Z^2 e^4/ (3 m_{\textrm{e}}^2 c^4)\) is the Thomson cross section and \(\varepsilon _{\textrm{ph}}=\langle E\rangle n_{\textrm{ph}}=B_{\textrm{ph}}^2/(8\pi )\) is the photon energy density in the laboratory frame with an equivalent magnetic field strength \(B_{\textrm{ph}}\). In general, the Thomson cross section of a charged particle of mass m interacting with a photon scales as \(\sigma _{\textrm{T}}\propto m^{-2}\) so that the IC interaction rate of ions in comparison to that of electrons is suppressed by the square of the electron-to-ion mass ratio.

Synchrotron interactions. The Lorentz force associated with the magnetic field causes a CR electron or positron to gyrate and hence to emit synchrotron radiation. Formally, this can be described by a scattering process, which obeys the same Feynman diagram as the IC interaction: while the IC process involves an electron scattering off a real photon, in a synchrotron interaction, the electron borrows a “virtual photon” from the magnetic field. Hence, the total energy loss rate of a CR electron at high energies is given by

$$\begin{aligned} {\dot{E}}_{\textrm{e}}= -\frac{\sigma _{\textrm{T}}c}{6\pi }\, \left( B_{\textrm{ph}}^2+B^2\right) \gamma _{\textrm{e}}^2. \end{aligned}$$
(49)

The first term in equation (49), \({\dot{E}}_{\textrm{e}}\propto B_{\textrm{ph}}^2\), represents the energy loss resulting from IC scattering with photons from the radiation field. The second term \(\propto B^2\) represents the energy loss due to synchrotron emission. The magnetic field strength equivalent to the energy density of the CMB is \(B_{\textrm{cmb}}\simeq 3.2\,(1+z)^2~\mu \textrm{G}\) at a redshift z. Thus, for synchrotron emission to dominate over the IC process, the magnetic field must exceed \(B_{\textrm{cmb}}\), or \(B_{\textrm{ph}}\) if the energy density of the stellar radiation field exceeds that of the CMB. The cooling time \(\tau _{\textrm{cool}}=E_{\textrm{e}}/|{\dot{E}}_{\textrm{e}}|\) of a relativistic electron due to synchrotron and IC interactions can be calculated as follows:

$$\begin{aligned} \tau _{\textrm{cool}}=\frac{E_{\textrm{e}}}{|{\dot{E}}_{\textrm{e}}|}= \frac{6\pi m_{\textrm{e}} c}{\sigma _{\textrm{T}}\,\left( B_{\textrm{ph}}^2+B^2\right) \gamma _{\textrm{e}}} \approx 200\,\textrm{Myr}, \end{aligned}$$
(50)

for \(B=1~\mu \)G and \(\gamma _{\textrm{e}}=10^4\) and we assume a negligible starlight contribution. A CR electron population that was injected at one epoch and cools for a time \(t=\tau _{\textrm{cool}}\) shows an exponentially suppressed electron spectrum above the energy/Lorentz factor that corresponds to \(\tau _{\textrm{cool}}\). In practice, CR electrons are generated over a finite time interval so that the cooled spectrum probes a range of cooling times and associated spectral energy cutoffs. Hence, if we observed such a CR electron population at time \(t\gtrsim \tau _{\mathrm {cool,\,init}}\) (the cooling time of the initially injected CR electron population), we would expect to observe a considerably steepened power-law spectrum in comparison to the injected spectrum. The synchrotron frequency, \(\nu _{\textrm{syn}}\), in the monochromatic approximation (Enßlin and Sunyaev 2002; Pfrommer et al. 2022) is given by

$$\begin{aligned} \nu _{\textrm{syn}}= & {} \frac{3 e B}{2\pi \, m_{\textrm{e}} c}\,\gamma _{\textrm{e}}^2 \simeq 1 \, \left( \frac{B}{\mu \text{ G }}\right) \, \left( \frac{\gamma _{\textrm{e}}}{10^4}\right) ^2 \text{ GHz }. \end{aligned}$$
(51)

By combining equations (50) and (51) and eliminating the Lorentz factor \(\gamma _{\textrm{e}}\), we can obtain the cooling time of electrons emitting at frequency \(\nu _{\textrm{syn}}\),

$$\begin{aligned} \tau _{\textrm{cool}} = \frac{\sqrt{54\pi \, m_{\textrm{e}} c\, e B \nu _{\textrm{syn}}^{-1}}}{\sigma _{\textrm{T}}\,(B_{\textrm{ph}}^2+B^2)} \lesssim 190\,\left( \frac{\nu _{\textrm{syn}}}{1.4\,\textrm{GHz}}\right) ^{-1/2}\textrm{Myr}. \end{aligned}$$
(52)

The cooling time \(\tau _{\textrm{cool}}\) of electrons emitting at a given frequency is thus bounded from above by a value that is independent of the magnetic field. In the case of a negligible starlight contribution, it attains its maximum at \(B=B_{\textrm{cmb}}/\sqrt{3} \simeq 1.8\,(1+z)^2~\mu \textrm{G}\).
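The numbers quoted in equations (50) to (52) can be reproduced with a few lines of Python. The sketch below is a minimal illustration in CGS units, assuming a negligible starlight contribution (\(B_{\textrm{ph}}=B_{\textrm{cmb}}\)) at redshift zero; the function names are hypothetical and chosen only for this example.

```python
import math

# CGS constants
C_LIGHT = 2.99792458e10    # speed of light [cm/s]
M_E = 9.1093837e-28        # electron mass [g]
E_CHARGE = 4.80320471e-10  # elementary charge [esu]
SIGMA_T = 6.6524587e-25    # Thomson cross section [cm^2]
MYR = 3.15576e13           # 1 Myr in seconds
MICRO_GAUSS = 1e-6         # 1 muG in Gauss


def b_cmb(z=0.0):
    """Equivalent CMB field strength, 3.2 (1+z)^2 muG, returned in Gauss."""
    return 3.2 * MICRO_GAUSS * (1.0 + z) ** 2


def tau_cool(gamma_e, b, b_ph):
    """Eq. (50): IC plus synchrotron cooling time [s]; fields in Gauss."""
    return 6.0 * math.pi * M_E * C_LIGHT / (SIGMA_T * (b_ph**2 + b**2) * gamma_e)


def nu_syn(gamma_e, b):
    """Eq. (51): synchrotron frequency in the monochromatic approximation [Hz]."""
    return 3.0 * E_CHARGE * b / (2.0 * math.pi * M_E * C_LIGHT) * gamma_e**2


def tau_cool_at_frequency(nu, b, b_ph):
    """Eq. (52): cooling time [s] of electrons emitting at frequency nu [Hz]."""
    numerator = math.sqrt(54.0 * math.pi * M_E * C_LIGHT * E_CHARGE * b / nu)
    return numerator / (SIGMA_T * (b_ph**2 + b**2))


t50 = tau_cool(1e4, 1.0 * MICRO_GAUSS, b_cmb()) / MYR
b_max = b_cmb() / math.sqrt(3.0)          # field strength maximizing Eq. (52)
t52 = tau_cool_at_frequency(1.4e9, b_max, b_cmb()) / MYR
print(f"Eq. (50): {t50:.0f} Myr, Eq. (52) at 1.4 GHz and B_max: {t52:.0f} Myr")
```

Scanning `tau_cool_at_frequency` over the magnetic field strength at fixed frequency shows a single maximum at \(B_{\textrm{cmb}}/\sqrt{3}\), which is the origin of the upper bound in equation (52).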

Figure 14 shows the cooling timescales of CR electrons (left) and CR protons (right) as a function of their kinetic energy. It is evident that CR protons with energies above 10 GeV have a lifespan at least 60 times longer than that of CR electrons at any energy. CR electrons can only persist for a Hubble time without undergoing re-acceleration within the low-density regions of galaxy clusters or in the outer CGM. But in this case, they cool down to Lorentz factors \(\gamma _{\textrm{e}}\sim 100\)–300 with kinetic energies \(E_{\textrm{e}}\sim (50\)–150) MeV where they cannot be observed on Earth because the ionospheric plasma cutoff prevents radio waves from propagating through the Earth’s atmosphere at frequencies below 1–10 MHz.Footnote 18 The electrons that emit in the GHz radio range have an energy of approximately 5 GeV in \(\mu \)G magnetic fields (as indicated by equation 51). Consequently, their lifespan is estimated to be 0.2 Gyr or shorter. If they are of hadronic origin, their parent CR protons had energies of about \(E_{\textrm{p}}\approx 16 \langle E_{{\textrm{e}}^\pm }\rangle \approx 80\) GeV, and such protons have considerably longer lifetimes.

2.4.4 Equilibrium electron distribution

We now discuss the connection between the radio synchrotron spectrum and the radiating CR electron population. In particular, we have to distinguish between freshly accelerated electrons and an equilibrium distribution of electrons where the injection is balanced by radiative losses. CR electrons are either directly accelerated at shocks (driven by structure formation in galaxy clusters or by SNe in galaxies) or injected in hadronic CR proton interactions. This implies a CR electron source function \(s_{\textrm{e}}=C_{\textrm{inj}}E_{\textrm{e}}^{-\alpha _{\textrm{inj}}}\), with a spectral index \(\alpha _{\textrm{inj}}\simeq 2.1\)–2.4. Note that the test particle limit of diffusive shock acceleration yields a spectral index \(\alpha _{\textrm{inj}}=2\) in case of a strong shock. In steady state, the acceleration and injection of CR electrons are balanced by the cooling effects of synchrotron and IC processes:

$$\begin{aligned} \frac{\partial }{\partial E_{\textrm{e}}} \left[ \dot{E_{\textrm{e}}}(E_{\textrm{e}}) f_{\textrm{e}} (E_{\textrm{e}}) \right] = s_{\textrm{e}}( E_{\textrm{e}}), \end{aligned}$$
(53)

where the electron energy loss rate, \(\dot{E_{\textrm{e}}}\), is given by equation (49). For \(\dot{E_{\textrm{e}}}(E_{\textrm{e}}) < 0\), the solution to this equation is

$$\begin{aligned} f_{\textrm{e}} (E_{\textrm{e}})&= \frac{1}{|\dot{E_{\textrm{e}}}(E_{\textrm{e}})|} \int _{E_{\textrm{e}}}^\infty \textrm{d}E_{\textrm{e}}' s_{\textrm{e}}( E_{\textrm{e}}') = \frac{C_{\textrm{inj}}}{(\alpha _{\textrm{inj}}-1)\,|\dot{E_{\textrm{e}}}(E_{\textrm{e}})|}\,E_{\textrm{e}}^{1-\alpha _{\textrm{inj}}} \propto E_{\textrm{e}}^{-\alpha _{\textrm{inj}}-1}, \end{aligned}$$
(54)

assuming that the dominant processes are synchrotron and IC losses in the last step (see equation 49). Hence, the electron spectral index steepens by unity in steady state, \(\alpha _{\textrm{e}}=\alpha _{\textrm{inj}}+1\). CR electrons with a power-law spectrum, \(f_{\textrm{e}}={C_{\textrm{e}}}E_{\textrm{e}}^{-\alpha _{\textrm{e}}}\), radiate synchrotron emission with a power law in frequency,

$$\begin{aligned} j_\nu \propto C_{\textrm{e}} B^{\alpha _\nu +1}\nu ^{-\alpha _\nu }, \end{aligned}$$
(55)

where \(\alpha _\nu \equiv -\textrm{d}{\log }j_\nu /\textrm{d}\log \nu =(\alpha _{\textrm{e}}-1)/2\). Observationally, the spectral index is determined by comparing radio surface brightness maps at two different frequencies \(\nu _1\) and \(\nu _2\),

$$\begin{aligned} \alpha _{\nu _1}^{\nu _2}\equiv -\frac{\log (S_{\nu _2}/S_{\nu _1})}{\log (\nu _2/\nu _1)}. \end{aligned}$$
(56)
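The relations between the injection index, the steady-state electron index, and the radio spectral index are simple enough to encode directly. The short Python sketch below adopts the sign convention of equation (55), \(S_\nu \propto \nu ^{-\alpha _\nu }\), so that a declining spectrum has a positive index; the function names and the flux values in the example are hypothetical.

```python
import math


def alpha_e_steady(alpha_inj):
    """Steady-state electron index for IC/synchrotron cooling: alpha_inj + 1."""
    return alpha_inj + 1.0


def alpha_nu_from_alpha_e(alpha_e):
    """Synchrotron spectral index of a power-law electron spectrum."""
    return (alpha_e - 1.0) / 2.0


def alpha_inj_from_uncooled_alpha_nu(alpha_nu):
    """Invert alpha_nu = (alpha_inj - 1) / 2 for a freshly accelerated population."""
    return 2.0 * alpha_nu + 1.0


def two_point_index(s1, nu1, s2, nu2):
    """Spectral index between two frequencies, assuming S ~ nu^(-alpha)."""
    return -math.log(s2 / s1) / math.log(nu2 / nu1)


# Strong shock in the test-particle limit, alpha_inj = 2
print(alpha_nu_from_alpha_e(alpha_e_steady(2.0)))   # cooled, steady state
print(alpha_nu_from_alpha_e(2.0))                   # directly at the shock

# Hypothetical fluxes (arbitrary units) at 150 MHz and 1.4 GHz
print(two_point_index(10.0, 150e6, 1.0, 1.4e9))
```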

Hence, for a steady-state CR electron population that has been accelerated by a strong shock, we expect \(\alpha _{\textrm{e}}=\alpha _{\textrm{inj}}+1=3\) and \(\alpha _\nu =(\alpha _{\textrm{e}}-1)/2=1\) in the test particle limit. If we instead were to resolve the freshly accelerated CR electron population directly at the shock, we would obtain \(\alpha _\nu =(\alpha _{\textrm{inj}}-1)/2=0.5\) at the shock provided we can neglect radiative cooling losses. Observed spectral indices at SNR shocks of \(\alpha _\nu \simeq 0.65\) imply \(\alpha _{\textrm{inj}}\simeq 2.3\), which may require a revision of the theory of diffusive shock acceleration (see discussion in Sect. 2.2.2). In steady state and for a negligible starlight contribution, the synchrotron emissivity can be obtained by combining equations (54) and (55), yielding

$$\begin{aligned} j_\nu \propto \frac{C_{\textrm{inj}}B^{\alpha _\nu +1}\nu ^{-\alpha _\nu }}{B_{\textrm{cmb}}^2+B^2}, \end{aligned}$$
(57)

which is nearly independent of B in the synchrotron cooling regime, \(B>B_{\textrm{cmb}} \simeq 3.2\, (1+z)^2\,\mu \)G, because \(\alpha _\nu \simeq 1\) in steady state so that \(j_\nu \propto B^{\alpha _\nu -1}\). While IC emission only depends on the amount of CR electrons and the photon energy density, the synchrotron emission depends on B in the IC cooling regime, \(B<B_{\textrm{cmb}}\). Figure 15 shows the impact of the various leptonic cooling processes discussed in this section on the CR electron distribution while neglecting any re-acceleration processes. On the left-hand side, we show a freely cooling electron spectrum that develops a cutoff at low and high energies as a result of Coulomb and IC/synchrotron cooling, respectively. After 1 Gyr, only CR electrons with Lorentz factors \(\gamma _{\textrm{e}}\sim 100\)–300 survive, as expected from our discussion of Fig. 14. The right-hand side of Fig. 15 shows the build-up of a steady-state spectrum where continuous injection (with spectral index 2.1) balances cooling due to Coulomb and IC/synchrotron interactions. At early times \(t\lesssim 1\) Gyr, the transition from the acceleration/injection spectrum to the cooled spectrum with an index of \(\alpha _{\textrm{cool}}=\alpha _{\textrm{inj}}+1=3.1\) is clearly visible. With time, this break moves to lower energies and nearly vanishes for \(t\gtrsim 2\) Gyr.
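The build-up of the cooled spectrum can be illustrated with the standard analytic solution for continuous power-law injection and losses of the form \({\dot{E}}_{\textrm{e}}=-b E_{\textrm{e}}^2\), as in equation (49). The Python sketch below is a simplified illustration that neglects Coulomb losses and works in dimensionless units with \(b=1\); it is not the semi-analytical or numerical scheme used to produce Fig. 15, and all function names are hypothetical.

```python
import math


def f_continuous_injection(energy, t, c_inj=1.0, alpha_inj=2.1, b_loss=1.0):
    """Electron spectrum after injecting s = c_inj E^-alpha_inj for a time t.

    Losses are dE/dt = -b_loss E^2 (IC/synchrotron only; an assumption).
    Electrons now at E started at E / (1 - b E t), which limits the
    injection integral of Eq. (54) below the break energy 1 / (b t).
    """
    steady = c_inj * energy ** (-alpha_inj - 1.0) / (b_loss * (alpha_inj - 1.0))
    x = b_loss * energy * t
    if x >= 1.0:
        return steady                      # fully cooled regime, Eq. (54)
    return steady * (1.0 - (1.0 - x) ** (alpha_inj - 1.0))


def local_slope(energy, t, eps=1e-4, **kwargs):
    """Numerical estimate of -dlog f / dlog E at the given energy."""
    lo, hi = energy * (1.0 - eps), energy * (1.0 + eps)
    f_lo = f_continuous_injection(lo, t, **kwargs)
    f_hi = f_continuous_injection(hi, t, **kwargs)
    return -(math.log(f_hi) - math.log(f_lo)) / (math.log(hi) - math.log(lo))


for t in (0.1, 1.0, 10.0):
    e_break = 1.0 / t                      # break energy for b_loss = 1
    print(f"t = {t:5.1f}: break at E = {e_break:.2f}, "
          f"slope below = {local_slope(0.01 * e_break, t):.2f}, "
          f"slope above = {local_slope(100.0 * e_break, t):.2f}")
```

Well below the break the slope equals the injection index, well above it the slope is steeper by unity, and the break energy \(1/(b t)\) moves to lower energies with time, in line with the behaviour described for the right-hand panel of Fig. 15.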

Fig. 15

Cooling processes shaping the CR electron spectrum. The left-hand panel shows a freely cooling power-law momentum spectrum with spectral index \(\alpha _{\textrm{e}}=2.5\) where Coulomb cooling at low energies and IC/synchrotron cooling at high energies progressively narrow down the spectrum. The right-hand panel illustrates the accumulation of a steady-state spectrum resulting from continuous injection (with \(\alpha _{\textrm{inj}}=2.1\)) and cooling. In the cooled regime on the right-hand side, the spectrum steepens so that the spectral index increases by unity (\(\alpha _{\textrm{cool}}=3.1\)). The break energy from the unimpeded injection to the cooled spectrum moves to lower energies with time. The solid line represents the semi-analytical solution, while the dotted lines represent the fully numerical solutions to the Fokker-Planck equation of electron transport. The gas density, magnetic field strength and photon energy density are given by \({n_\textrm{gas} = 10^{-3}\,\textrm{cm}^{-3}}\), \(B = 5~\mu \textrm{G}\) and \(\varepsilon _{\textrm{ph}} = 6\,\varepsilon _\textrm{cmb}\), respectively. Image reproduced with permission from Winner et al. (2019), copyright by the author(s)

2.5 Cosmic ray spectral transport

2.5.1 Momentum-dependence of spatial cosmic ray transport

Self-confinement of CRs. Following our discussion of the various radiative and non-radiative cooling processes that shape the CR electron and ion spectra, we now consider the momentum dependence of spatial CR transport. Provided the self-excited Alfvén waves are only weakly damped, CRs are efficiently scattered and stream close to the local Alfvén velocity. In the opposite regime when waves are efficiently damped, CRs are no longer confined to the frame of the local Alfvén waves and diffuse with an effective velocity that exceeds the Alfvén velocity, \(v_{\textrm{di}}\gg v_{\textrm{a}}\). The value of the CR drift speed \(v_{\textrm{d}}\) is determined by the equilibrium between the wave damping rate and the growth rate of the CR streaming instability, \(\varGamma _{\textrm{gyro}}\propto n_\textrm{cr}(>p_{\textrm{min}})/n_{\textrm{i}} \times (v_{\textrm{d}}/v_{\textrm{a}} - 1)\), as shown in equation (7). The CR streaming instability growth rate depends indirectly on the CR momentum through the number density of CRs, \(n_\textrm{cr}(>p_{\textrm{min}})\), above a resonant minimum momentum \(p_{\textrm{min}}\). Because the CR spectra are typically soft, there are far fewer high-momentum particles, which decreases the wave growth rate towards high CR momenta. The damping rates scale differently with CR particle momentum (Wiener et al. 2013a), depending on whether we consider a linear or non-linear wave-damping process (Kempski and Quataert 2022) and on the specifics of the wave damping process: e.g., ion–neutral damping is essentially independent of wavelength while damping by background turbulence scales as \(\varGamma \propto k^{1/2}\) (Farmer and Goldreich 2004). 
Provided the prevailing damping process depends more weakly on \(n_\textrm{cr}\) than the wave growth rate, the equilibrium of wave growth and damping evolves towards a dominant CR diffusion regime at high CR momenta, which implies that high-momentum CRs drift faster than their lower momentum analogues. Hence, this transport physics introduces an energy dependence of the self-generated CR diffusion coefficient, \(\kappa \propto E^\alpha \) with \(\alpha >0\). Depending on which process dominates wave damping, the correction term to the CR drift velocity is neither truly diffusive- nor streaming-like in nature (Wiener et al. 2013a).

External confinement of CRs. In this case, CR (ion or electron) scattering is dominated by externally driven turbulence, and the specifics of the scattering modes (Alfvénic vs. fast mode turbulence) and their spectral slopes (see Sect. 2.3.6) may imprint an energy dependence on the CR diffusion coefficient as we briefly sketch here. A CR scattering with a parallel propagating Alfvén wave obeys the resonance condition \(r_{\textrm{g}}=2\pi /k_\parallel \) (see equation (1) or Fig. 3), where the gyroradius of a CR particle of mass m and charge q in a magnetic field of strength B is given by

$$\begin{aligned} r_{\textrm{g}} = \frac{p_\perp c}{q B} = \frac{\beta _\perp c}{\varOmega } \rightarrow \frac{E}{qB}. \end{aligned}$$
(58)

Here, \(\varOmega =qB/(\gamma mc)\) denotes the relativistic gyro frequency, \(p_\perp \) and \(\beta _\perp \) are the perpendicular momentum and relativistic \(\beta \) factor, respectively, and the limit in the last step applies to the relativistic regime. The resonant interaction of a CR with a parallel propagating Alfvén wave with energy density \(\varepsilon _{{\textrm{w}},\pm }\) thus generates a CR diffusion coefficient (eqs. 29 and 27)

$$\begin{aligned} \kappa =\frac{c^2}{3({{\bar{\nu }}}_+ + {{\bar{\nu }}}_-)} \propto \frac{c^2}{3\varOmega }\,\frac{\varepsilon _B}{\varepsilon _{{\textrm{w}},+} + \varepsilon _{{\textrm{w}},-}} \rightarrow \frac{c r_{\textrm{g}}}{3}\,\frac{\varepsilon _B}{\varepsilon _{{\textrm{w}},+} + \varepsilon _{{\textrm{w}},-}}, \end{aligned}$$
(59)

where the limit in the last step applies to the relativistic regime. Depending on the wave spectrum and degree of anisotropy, we can distinguish three important cases:

(60)

Inserting the different wave spectra into the expression for the CR diffusion coefficient in equation (59), and adopting the resonance condition, enables us to derive its energy dependence,

(61)
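To get a feeling for the scales involved in equations (58) and (59), the following Python sketch evaluates the gyroradius and the resulting diffusion coefficient in the relativistic limit. It treats the proportionality in equation (59) as an equality, and the wave-to-field energy density ratio of \(10^{-6}\) is a purely illustrative assumption; the function names are hypothetical.

```python
# CGS constants
C_LIGHT = 2.99792458e10    # speed of light [cm/s]
E_CHARGE = 4.80320471e-10  # elementary charge [esu]
GEV = 1.602176634e-3       # 1 GeV in erg
PC = 3.0856776e18          # 1 parsec in cm
MICRO_GAUSS = 1e-6         # 1 muG in Gauss


def gyroradius(energy_erg, b_gauss, charge=E_CHARGE):
    """Relativistic limit of Eq. (58): r_g = E / (q B), in cm."""
    return energy_erg / (charge * b_gauss)


def kappa(energy_erg, b_gauss, wave_to_field_ratio):
    """Relativistic limit of Eq. (59), with the proportionality taken as equality.

    wave_to_field_ratio = (eps_w+ + eps_w-) / eps_B (dimensionless).
    Returns the diffusion coefficient in cm^2/s.
    """
    return C_LIGHT * gyroradius(energy_erg, b_gauss) / (3.0 * wave_to_field_ratio)


r_g = gyroradius(1.0 * GEV, 1.0 * MICRO_GAUSS)
k = kappa(1.0 * GEV, 1.0 * MICRO_GAUSS, wave_to_field_ratio=1e-6)  # assumed ratio
print(f"r_g = {r_g:.2e} cm = {r_g / PC:.2e} pc, kappa = {k:.2e} cm^2/s")
```

In this form, \(\kappa \) grows linearly with energy at fixed wave energy density; any additional energy dependence enters through the resonant wave spectrum.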

2.5.2 Numerical methods for evolving the cosmic ray momentum spectrum in space

Because CR cooling processes and spatial transport depend on CR momentum, this calls for the development of numerical methods that evolve the CR distribution function simultaneously in space and momentum. While the time evolution of the CR electron spectrum is essential for understanding observational signatures at different frequencies, following the CR ion spectrum may be additionally important for improving our modeling of CR energy and momentum feedback in galaxies. As shown in Fig. 14, and as discussed in Sect. 2.4, the electron cooling times are substantially shorter than those of CR ions, which suggests the need for different numerical treatments for CR electrons and CR ions.

Spectral CR propagation codes, such as GALPROP (Strong and Moskalenko 1998; Moskalenko and Strong 1998), USINE (Maurin et al. 2001; Putze et al. 2010), DRAGON (Evoli et al. 2008, 2017; Maccione et al. 2011), PICARD (Kissmann 2014), and SPINNAKER (Heesen et al. 2018) aim at numerically solving the CR transport equation (21) in the one-moment approximation for a given magnetic field, gas density, source distribution, and stationary background flow. Another approach is to transform the CR transport equation (21) into a set of equivalent stochastic differential equations, partially equipped with advection fields (Kopp et al. 2012; Merten et al. 2017, 2018). Coupling this approach with a Monte Carlo code for simulating the propagation of ultra-high-energy CRs was achieved in the CRPropa code (Armengaud et al. 2007; Alves Batista et al. 2022), which uses a set of stochastic differential equations for propagating CRs in its latest release. Because these codes aim at understanding CR and \(\gamma \)-ray observables in the Milky Way or radio maps of nearby galaxies, the magnetic field and density distributions are typically inferred from other observations that adopt certain simplifying assumptions (Boulanger et al. 2018). While adequate for these scientific goals, the background state does not necessarily represent a self-consistent solution of the MHD equations and as such cannot be used to study the dynamical impact of CRs. Moreover, these codes face difficulty in modeling a scenario in which the transport of CRs transitions from predominantly streaming to predominantly diffusion as a function of CR energy. To improve upon these issues, a set of stochastic differential equations suited to follow CR transport in simulations of MHD turbulence was implemented in the CRIPTIC code (Krumholz et al. 2022) and used to determine an effective transport theory for CRs that stream through a turbulent MHD plasma (Sampson et al. 2023). 
However, common to all those approaches is that they do not allow for the back-reaction of CR pressure forces or heating on the (magneto-)hydrodynamics. As such, those approaches cannot be used to study CR feedback in galaxies and galaxy clusters.

Early works that numerically solve the coupled time-dependent CR transport equation (21) and hydrodynamic equations in one spatial dimension (assuming planar or spherical geometry) address the problem of non-linear self-regulation of CR acceleration at shocks (Falle and Giddings 1987; Bell 1987; Kang and Jones 1991). These studies use piece-wise constant values to discretize the CR momentum spectrum. However, when the CR transport equations are coupled to the system of MHD equations, this discretization requires approximately 40–80 bins per momentum decade to produce accurate results of the numerical integration (Winner et al. 2019). A more efficient discretization is given by a piece-wise power-law representation of the CR momentum distribution on a logarithmically spaced momentum grid, which is evolved by assuming the continuity of the momentum spectrum and employing CR number conservation, where the backreaction of CRs on the MHD is either not included (Jones et al. 1999) or included (Yang and Ruszkowski 2017).

Unfortunately, there is a numerical instability associated with this approach for a localized energy injection in momentum space (e.g. as a result of energy-dependent spatial diffusion or CR shock acceleration). Girichidis et al. (2019) show that the continuity assumption enforces changes in the local logarithmic slope that lead to a non-physical, oscillatory behavior across the entire spectrum with alternating convex and concave regions of the spectrum. Generally, the piece-wise power-law representation of the CR distribution exhibits two degrees of freedom, the normalization and spectral slope in every bin. Ensuring CR energy and number conservation (in the absence of CR sources) while evolving the momentum distribution and abandoning spectral continuity is a very promising approach, which is similar in spirit to the discontinuous Galerkin method of hydrodynamics. This so-called “coarse-grained momentum finite volume” method is employed for evolving the CR electron spectrum on Lagrangian particle trajectories in MHD simulations, where the CR pressure does not back-react on the gas dynamics (Miniati 2001; Mimica et al. 2009; Vaidya et al. 2018; Böss et al. 2023).
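The statement that each momentum bin carries two degrees of freedom can be made concrete with a small Python sketch. It represents a single bin by a power law \(f(p)=f_{\textrm{lo}}\,(p/p_{\textrm{lo}})^{-q}\), computes the CR number and energy content of the bin, and then recovers the normalization and slope from these two moments. The sketch assumes ultra-relativistic particles with kinetic energy \(T=pc\) and units with \(c=1\); the function names and the bisection solver are illustrative choices, not the implementation of any of the codes cited above.

```python
import math


def power_integral(s, r):
    """Integral of x^(s-1) from 1 to r; expm1 keeps it accurate near s = 0."""
    if s == 0.0:
        return math.log(r)
    return math.expm1(s * math.log(r)) / s


def bin_moments(f_lo, q, p_lo, p_hi):
    """Number and energy density of f(p) = f_lo (p/p_lo)^-q in [p_lo, p_hi].

    Uses n = 4 pi int p^2 f dp and e = 4 pi int p^2 f T dp with T = p (c = 1).
    """
    r = p_hi / p_lo
    n = 4.0 * math.pi * f_lo * p_lo**3 * power_integral(3.0 - q, r)
    e = 4.0 * math.pi * f_lo * p_lo**4 * power_integral(4.0 - q, r)
    return n, e


def reconstruct(n, e, p_lo, p_hi, q_min=-20.0, q_max=20.0):
    """Recover (f_lo, q) from the two moments by bisection on the slope q."""
    r = p_hi / p_lo
    target = e / (n * p_lo)            # mean momentum in units of p_lo

    def mean_momentum(q):
        return power_integral(4.0 - q, r) / power_integral(3.0 - q, r)

    lo, hi = q_min, q_max              # mean_momentum decreases with q
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mean_momentum(mid) > target:
            lo = mid                   # spectrum too hard: steepen it
        else:
            hi = mid
    q = 0.5 * (lo + hi)
    f_lo = n / (4.0 * math.pi * p_lo**3 * power_integral(3.0 - q, r))
    return f_lo, q


n_bin, e_bin = bin_moments(f_lo=2.0, q=4.5, p_lo=1.0, p_hi=10.0)
f_rec, q_rec = reconstruct(n_bin, e_bin, p_lo=1.0, p_hi=10.0)
print(f"recovered f_lo = {f_rec:.6f}, q = {q_rec:.6f}")
```

Because the pair (number, energy) maps one-to-one onto (normalization, slope) within a bin, a scheme that updates the two conserved moments can always rebuild the local power law without imposing continuity across bin edges.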

In order to study CR-modified shocks or CR-driven galactic winds, we need to account for the non-linear feedback of the CR pressure on the MHD while simultaneously solving for the CR transport equation (21). To this end, advection and diffusion processes in physical and momentum spaces are followed by the coarse-grained momentum finite volume method on Eulerian fixed meshes (Jones and Kang 2005; Girichidis et al. 2019; Ogrodnik et al. 2021) or on unstructured moving Voronoi meshes (Girichidis et al. 2022). The very small cooling times at low CR electron and ion momenta as well as high CR electron momenta would substantially slow down the computations, rendering fully three-dimensional MHD simulations of galaxies or SN blast waves with spectral CRs prohibitively expensive. To improve the efficiency and accuracy of these computations, analytical solutions in the low- and high-momentum regimes are connected to numerical solutions in the intermediate momentum range, thus enabling computations at numerically affordable costs (Winner et al. 2019; Girichidis et al. 2019). This allows us to study the impact of obliquity-dependent CR shock acceleration at SN blast waves and to generate multi-frequency emission maps from radio to \(\gamma \) rays (Winner et al. 2020), the dynamical impact of spectral CR ions during galaxy formation (Girichidis et al. 2023), and non-thermal processes in galaxy clusters such as radio halos and relics (Miniati et al. 2001a; Miniati 2003; Pfrommer et al. 2007, 2008; Pfrommer 2008; Donnert et al. 2013; Pinzke and Pfrommer 2010; Pinzke et al. 2013, 2017; Böss et al. 2023). The CR spectral algorithm has also found application in galaxy simulations with star formation and feedback that employ the spatial two-moment method for CR hydrodynamics to yield CR spectra of electrons, positrons, (anti)protons, and heavier nuclei (Hopkins et al. 2022, using a meshless finite-mass discretization of the underlying MHD). 
This novel approach allows us to study the size of the CR scattering halo reaching into the CGM in realistic galaxies and to address which CR spectral features arise from local structures rather than CR transport physics.

3 Astrophysical systems

Fig. 16

A collage of simulations of CR-driven winds performed by different groups using various approaches, physics, and numerical techniques: (i) cosmological simulations (left panel; Ji et al. (2021); Liang et al. (2016); Salem et al. (2016); Butsky et al. (2020); Buck et al. (2020); GIZMO, RAMSES, ENZO, CHANGA, AREPO, respectively), (ii) global galactic disks (middle panel; Ruszkowski et al. (2017b); Booth et al. (2013); Butsky and Quinn (2018); Hanasz et al. (2013); Semenov et al. (2021); Pakmor et al. (2016b); FLASH, RAMSES, ENZO, PIERNIK, ART, AREPO, respectively), (iii) stratified boxes, each representing a zoom-in on the galactic disk (right panel; Simpson et al. (2016); Farber et al. (2018); Girichidis et al. (2018); Armillotta et al. (2021); AREPO, FLASH, FLASH, ATHENA, respectively). In all cases, groups of images corresponding to a given code illustrate the differences arising due to varying CR physics

As argued in the Introduction (Sect. 1), CRs may shape in fundamental ways how the astrophysical feedback processes operate in nature. We now begin the discussion of the applications of the physical principles governing CR interactions with plasma waves and matter (Sect. 2) to these feedback processes. Our general philosophy is to separate the discussion by the relevant astrophysical scales, where the discussion of feedback on small scales informs that on larger scales. This allows us to first focus on the in-depth introduction of the detailed impact of CR physics on the dynamical state of the ISM before transitioning to progressively larger scales of galactic halos, where we connect the processes operating on different scales to synthesize the emerging picture of CR feedback. To this end, we discuss the role of CR physics in SNe, cold ISM clouds, multiphase ISM, star formation, galactic wind launching, interactions of galactic winds with the CGM and cosmological infall, and the largest galaxies and galaxy clusters. As we progress from smaller to larger scales, the preferred approach to modeling feedback tends to change from one-dimensional simulations, simulations of individual gas clouds and multiphase medium, small gravitationally stratified boxes, global models of galaxies, to fully cosmological simulations. Examples of images from simulations based on some of these approaches are shown in Fig. 16. In many instances, these approaches are to some extent complementary. Consequently, as an example, we may resort to discussing and interpreting the results from cosmological simulations also using one-dimensional analytical models to help build intuition.

3.1 Cosmic ray ionization

The subject of low-energy CRs is extensively discussed in recent reviews by Padovani et al. (2020) and Gabici (2022). Here we present a limited discussion of this topic focusing on key concepts that have implications for CR feedback. In particular, we discuss the impact of CRs by concentrating on the effect of CR-induced ionization in the ISM. Broadly speaking, this discussion is motivated by the fact that this type of ionization is essential for maintaining the coupling of the magnetic fields with the ISM plasma and for facilitating complex ISM chemistry. Furthermore, these processes have important implications for the dynamical interactions of CRs with the ISM, which we discuss in more detail in Sect. 3.2.2.

The rate of star formation in giant molecular clouds is controlled by the competition between gravity and non-thermal pressure due to turbulence and magnetic fields (Crutcher 2012). The degree to which magnetic forces can participate in this process, and couple to the ISM plasma, depends on the level of ionization in the gas. While the observed ionization fractions in molecular clouds are very low, they are nevertheless sufficient to at least partially couple the magnetic fields to the gas, which allows them to, e.g., enable magnetic braking of interstellar clouds, slow down star formation, and enable magneto-rotational instability to operate in protoplanetary disks. Interestingly, these low ionization levels significantly exceed those expected from photoionization due to UV stellar radiation because of the very large column densities of the clouds (McKee 1989). This strongly suggests that the additional ionization may come from CRs that can penetrate the clouds. The kinetic energy of the electrons generated via CR ionization of the ISM gas also provides the heating needed to maintain the molecular clouds at the observed temperatures. Both the ionization and heating of the gas are due to low-energy CRs with energies up to \(\sim \)1 GeV (e.g., Field et al. 1969). This picture is also consistent with the observations of complex chemistry in molecular clouds. The self-shielding of the clouds prevents the photodissociation of the complex molecules and allows them to form. However, the formation rates of these molecules due to the interaction between neutral species, or neutral-neutral reactions, are too slow. This implies that additional ionization sources, again likely due to low-energy CRs, could lead to ion–neutral reactions that are much faster (e.g., Bergin and Tafalla 2007).

In addition to playing a pivotal role in shaping the properties of molecular clouds in the ISM, low-energy CRs are also responsible for producing the light elements lithium (Li), beryllium (Be), and boron (B) in the Universe. These elements are easily destroyed in thermonuclear reactions in stellar interiors. The abundances of these light elements in the solar system are much smaller than those of other elements characterized by comparable atomic numbers, e.g., carbon, nitrogen, and oxygen (e.g., Fig. 1 of Tatischeff and Gabici 2018). Curiously, when the comparison is restricted to the composition of CRs, the abundances of Li, Be, and B are comparable to those of the neighboring elements in the periodic table, thus underscoring the role of low-energy CRs in the synthesis of these elements. The primary channel for the formation of these elements is the spallation of CNO nuclei by low-energy CRs. In this process, the heavier nuclei (C, N, O) are split as a result of collisions with the low-energy CRs, or heavier CR species, such as carbon, collide with ISM hydrogen and eject nucleons to form lighter CR species (e.g., Meneguzzi et al. 1971, see also Sect. 4.1.1).

The physical origin of low-energy CRs is not fully understood. While most of the CR energy may be generated in SN explosions, superbubbles, and stellar wind termination shocks (e.g., Bykov 2014, see also Sect. 2.2.1) followed by Coulomb and ionization losses (see Sect. 2.4.2), low-energy CRs could also be accelerated in other environments, e.g., along bipolar protostellar jet shocks or at the surfaces of protostars, where the matter from the accretion disk is channelled by strong magnetic fields and hits the protostellar surface (Padovani et al. 2020). The life cycle of molecular gas in star-forming regions appears to rely significantly on protostellar outflows. The interaction between jets and magnetic fields could potentially enable prolonged star formation across multiple dynamical periods. This process will indirectly affect CR propagation, as the outflow properties will determine the turbulent characteristics of the cold phase (Wang et al. 2010). Below we briefly discuss direct measurements and indirect constraints on low-energy CRs.

3.1.1 Direct and indirect constraints on low-energy cosmic rays

Unravelling the origin of low-energy CRs is complicated by the fact that their spectra are attenuated by interactions with the solar wind. These interactions modulate the CR flux measured close to Earth at low particle energies (below \(\sim \) 10 GeV; Vos and Potgieter 2015) with a period of \(\sim \) 11 years. During the solar maximum, the attenuation is the strongest because the low-energy CRs are unable to penetrate the strong, turbulent, and magnetized solar wind, while the high-energy CRs remain unaffected. This effect is termed solar modulation and it prevents us from directly measuring the interstellar flux of low-energy CRs on Earth, though the flux can be approximately de-modulated by considering poorly constrained CR propagation models. However, thanks to recent data from the Voyager probes that have travelled past the heliopause, we now have direct measurements of the low-energy CR flux that is nearly constant in time and unaffected by the modulation. The CR spectra shown in Fig. 1 include the data from the Voyager missions (Cummings et al. 2016; Stone et al. 2019) in addition to high-energy CR data from other missions and instruments that are unaffected by the solar modulation (see caption of that figure for details). The spectra unaffected by the solar modulation can be translated into reliable measurements of the total energy density of CR ions and electrons in the solar vicinity (i.e., outside the heliopause).
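
As a rough illustration of why solar modulation suppresses the flux mainly at low energies, the sketch below uses the widely used force-field approximation. This is a simplified textbook description and not necessarily the modulation model adopted in the works cited above. The power-law interstellar spectrum `lis_proton` and the values of the modulation potential `phi` are illustrative assumptions only, not fits to Voyager or AMS-02 data.

```python
M_P = 938.272  # proton rest energy in MeV


def lis_proton(T):
    """Hypothetical local interstellar proton spectrum (arbitrary units).

    A power law in total energy, chosen purely for illustration.
    """
    return (T + M_P) ** -2.7


def modulated(T, phi):
    """Force-field estimate of the proton flux at Earth.

    T   : kinetic energy at Earth in MeV
    phi : modulation potential in MV (for protons, the mean energy
          loss in MeV is numerically equal to phi)
    """
    T_lis = T + phi  # energy the particle had outside the heliosphere
    factor = T * (T + 2.0 * M_P) / (T_lis * (T_lis + 2.0 * M_P))
    return lis_proton(T_lis) * factor


def suppression(T, phi):
    """Ratio of the modulated flux to the interstellar flux at energy T."""
    return modulated(T, phi) / lis_proton(T)


if __name__ == "__main__":
    # Compare a weaker (assumed 400 MV) and a stronger (assumed 1000 MV) potential.
    for T in (100.0, 1_000.0, 10_000.0, 100_000.0):
        print(f"T = {T:9.0f} MeV  "
              f"phi=400 MV: {suppression(T, 400.0):.3f}  "
              f"phi=1000 MV: {suppression(T, 1000.0):.3f}")
```

In this sketch the suppression is severe at about 100 MeV and approaches unity at the highest energies, and a larger potential gives a stronger suppression, in qualitative agreement with the behavior described above.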

In addition to direct measurements, low-energy CR spectra can be constrained via indirect means based on \(\gamma \)-ray emission observations. CR protons colliding with interstellar protons can produce neutral pions that decay into \(\gamma \)-ray photons (see also Sect. 2.4.2 for a more detailed discussion of hadronic processes). This reaction is possible as long as the kinetic energy of a CR proton exceeds a threshold value of \(\sim \) 280 MeV. Similarly, heavier CR nuclei can also produce \(\gamma \)-ray emission via collisions with the ISM gas. Since the \(\gamma \)-ray photons can easily penetrate the ISM, \(\gamma \)-ray observations open up a window for observations of CRs including mildly relativistic, low-energy CR nuclei. In fact, the diffuse \(\gamma \)-ray spectrum from the Milky Way disk above \(\sim \) 100 MeV is dominated by this process (Ackermann et al. 2012b; Selig et al. 2015; Platz et al. 2022, see also Fig. 13).
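
The threshold quoted above follows from relativistic kinematics for the reaction p + p → p + p + π⁰ with the target proton at rest. The short sketch below evaluates it. The function name is hypothetical, and the rest energies are standard rounded values.

```python
M_P = 938.272    # proton rest energy in MeV
M_PI0 = 134.977  # neutral pion rest energy in MeV


def pion_threshold_kinetic_energy(m_p=M_P, m_pi=M_PI0):
    """Minimum kinetic energy (MeV) of a CR proton striking a proton at
    rest for the reaction p + p -> p + p + pi0 to be possible.

    For a fixed target, the squared invariant mass is
        s = 4 m_p^2 + 2 m_p T,
    and at threshold all final-state particles are at rest in the
    centre-of-mass frame, so s = (2 m_p + m_pi)^2. Solving for T gives
        T = 2 m_pi + m_pi^2 / (2 m_p).
    """
    return 2.0 * m_pi + m_pi ** 2 / (2.0 * m_p)


if __name__ == "__main__":
    print(f"Threshold kinetic energy: {pion_threshold_kinetic_energy():.1f} MeV")
```

The result is slightly larger than twice the pion rest energy because part of the projectile energy must go into the motion of the centre of mass.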

The observed \(\gamma \)-ray emission comes from a convolution of the spatial distribution of CRs and the ISM. Thus, if the spatial distribution of the diffuse gas can be reliably modeled, then the spatial distribution of CR energy density can be obtained. In the local ISM, the CR spectra derived using this approach exceed the directly measured spectra from AMS-02 (at high CR energies, where the effects of solar modulation are negligible) by only a few tens of percent (Strong 2016; Orlando 2018). Similarly, when the comparison is made between the CR energy density measured directly in the local ISM and at other galacto-centric radii, the scatter in the typical values is less than a factor of two (Acero et al. 2016). In a similar vein, giant molecular clouds (GMCs) can be used as “CR barometers.” A key difference compared to the diffuse ISM case discussed above is that a GMC can serve as a probe of very localized CR energy density. The appeal of using GMCs as CR probes is their high density, which can make them bright \(\gamma \)-ray sources (Black and Fazio 1973). Consequently, they can be detected at both nearby and remote locations in the Milky Way. The expected \(\gamma \)-ray flux from GMCs depends on a number of factors such as the ability of diffuse CRs to penetrate the clouds or the cloud masses, both of which are poorly constrained. Measurements of CR energy densities based on \(\gamma \)-ray emission from GMCs yield similar results to those based on diffuse \(\gamma \)-ray emission in the Galactic disk (Aharonian et al. 2020; Peron et al. 2021).
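
The “CR barometer” idea can be sketched with a simple scaling: for a cloud that CRs penetrate freely, the hadronic \(\gamma \)-ray flux is proportional to the cloud mass and the ambient CR density, and inversely proportional to the distance squared. The minimal sketch below makes several simplifying assumptions: the cloud is pure hydrogen, CRs penetrate it fully, and the emissivity per hydrogen atom for the local CR density, `q_local`, is supplied by the user. The numerical value of `q_local` in the demo is a placeholder for illustration, not a measurement taken from the works cited above, and all function names are hypothetical.

```python
import math

M_SUN_G = 1.989e33   # solar mass in grams
M_P_G = 1.6726e-24   # proton mass in grams
KPC_CM = 3.0857e21   # kiloparsec in centimetres


def n_hydrogen(mass_msun):
    """Number of hydrogen nuclei in a cloud, treating it as pure hydrogen."""
    return mass_msun * M_SUN_G / M_P_G


def expected_flux(mass_msun, dist_kpc, q_local, cr_ratio=1.0):
    """Photon flux (cm^-2 s^-1) from a cloud bathed in CRs.

    q_local  : photons per second per H atom for the local CR density
    cr_ratio : CR density in the cloud relative to the local value
    """
    d_cm = dist_kpc * KPC_CM
    return cr_ratio * q_local * n_hydrogen(mass_msun) / (4.0 * math.pi * d_cm ** 2)


def cr_density_ratio(flux, mass_msun, dist_kpc, q_local):
    """Invert the scaling: CR density relative to local from an observed flux."""
    return flux / expected_flux(mass_msun, dist_kpc, q_local)


if __name__ == "__main__":
    q_demo = 2e-25  # placeholder emissivity (photons s^-1 per H atom), illustrative only
    flux = expected_flux(1e5, 1.0, q_demo, cr_ratio=1.5)
    print(f"flux = {flux:.3e} photons cm^-2 s^-1")
    print(f"recovered CR ratio = {cr_density_ratio(flux, 1e5, 1.0, q_demo):.2f}")
```

Because the inferred CR density scales as the flux times \(d^2/M\), any uncertainty in the cloud mass or distance propagates directly into the result, which is why the poorly constrained cloud masses noted above limit this method.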

In addition to hadronic \(\gamma \)-ray emi