Simulations of cosmic ray propagation

We review numerical methods for simulations of cosmic ray (CR) propagation on galactic and larger scales. We present the development of algorithms designed for phenomenological and self-consistent models of CR propagation in kinetic description based on numerical solutions of the Fokker–Planck equation. The phenomenological models assume a stationary structure of the galactic interstellar medium and incorporate diffusion of particles in physical and momentum space together with advection, spallation, production of secondaries and various radiation mechanisms. The self-consistent propagation models of CRs include the dynamical coupling of the CR population to the thermal plasma. The CR transport equation is discretized and solved numerically together with the set of MHD equations in various approaches treating the CR population as a separate relativistic fluid within the two-fluid approach or as a spectrally resolved population of particles evolving in physical and momentum space. The relevant processes incorporated in self-consistent models include advection, diffusion and streaming propagation as well as adiabatic compression and several radiative loss mechanisms. We discuss, applications of the numerical models for the interpretation of CR data collected by various instruments. We present example models of astrophysical processes influencing galactic evolution such as galactic winds, the amplification of large-scale magnetic fields and instabilities of the interstellar medium.


Introduction
Cosmic rays (CRs) are charged particles with non-thermal energy distributions (Strong et al. 2007;Grenier et al. 2015;Gabici et al. 2019). There are both hadronic and leptonic cosmic rays among which protons and electrons are the most abundant particles. In the hadronic component the composition resembles approximately the element distribution in the universe (e.g. Blasi and Amato 2012;Gaisser et al. 2013), so protons and helium nuclei are by far most abundant and account for most of the energy stored in high energy particles. Heavier elements and in particular unstable isotopes are rare but provide valuable information on the dynamics of CRs and serve as clocks for CR acceleration and transport. CR electrons are usually negligible for the dynamics but provide important information on the magnetic field via loss processes that produce radio and synchrotron radiation (e.g. Gaisser 1991). Being high-energy charged particles, CRs mainly interact with the gas via the magnetic field (e.g. Zweibel 2013). The transport processes and the dynamical interaction with the gas are thus tightly coupled to plasma processes.
The integrated energy in hadronic CRs is large enough to have a dynamical impact. In the interstellar medium the CR proton energy is comparable to the thermal and kinetic counterpart (e.g. Ferrière 2001). On galactic scales there is evidence that CR protons are an important agent in driving galactic winds (see, e.g. Naab and Ostriker 2017;Zweibel 2017). In addition they might have an impact on the distribution of gas in the galactic disc and thus alter the star formation process on molecular cloud scales, even though their dynamical impact on molecular clouds is expected to be much weaker compared to other drivers in the ISM like radiation or supernovae. In the densest part of cloud cores, into which CR protons are able to penetrate they ionize and heat the gas (e.g. Padovani et al. 2020). If CRs are efficiently coupled to the gas or penetrate into regions dense enough such that direct particle-particle collisions become relevent, they provide a temperature floor, which directly impacts the fragmentation scale and the seeds of star formation.

Origin of CRs
Most of the CRs are believed to be produced in shocks via diffusive shock acceleration (DSA). Early theoretical models (Krymskii 1977;Axford et al. 1977;Bell 1978;Blandford and Ostriker 1978) as well as recent numerical simulations (e.g. Caprioli and Spitkovsky 2014) confirm this paradigm, see also Marcowith et al. (2020) for a recent review. In the Milky Way and most of the star forming galaxies supernova remnants (SNR) are by far the most abundant source of strong shocks that provide the conditions to accelerate CRs up to an energy of approximately 10 15 eV, so the particles up to that energy are likely to be of Galactic origin (Ackermann et al. 2013). CRs are however observed up to energies of 10 20 eV, which cannot be explained by SNe. Instead, Active Galactic Nuclei (AGN) are an alternative candidate to provide sufficiently energetic environments. In addition, the gyro-radii of those ulta-high energy CRs are larger than the disk of the Milky Way. If coming from a local source within our Galaxy, the individual CRs could therefore be traced back to their origin with an anisotropic distribution on the sky. However, this is not the case and those ultra high energy CRs are thus of extragalactic origin (Kotera and Silk 2016).
The spectral distribution of CRs shows a peak or a flattening at a few GeV followed by a power law with remarkably little scatter. Up to energies of 10 15 eV a spectrum with E À2:7 is observed. Above that energy the scaling steepens, which leads to naming the kink at 10 15 eV the ''knee''. At 10 20 eV the spectrum flattens again, which is called the ''ankle''. At the highest energies the spectrum is less clear, mainly due to the small statistics. Due to the steep spectrum, most of the energy can be located in a narrow energy range of from 1 to 10 GeV per nucleon. Models accounting for the dynamical impact of CRs therefore focus on GeV protons.

Numerical approaches of CR transport and scope of this review
CRs span a large energetic range from particle energies just exceeding the thermal distribution (* MeV) up to the GZK cutoff (e.g. Kotera and Olinto 2011;Amato and Blasi 2018;Gabici et al. 2019). Over this range in momentum the dynamical range of the particle distribution function spans approximately 10 orders of magnitude. Consequently, the effective (numerical) models that describe CRs and the interactions with their environment differ. First, we need to distinguish between a macroscopic and a microscopic perspective; we start with the macroscopic models.
For low-energy CRs with small gyroradii we are often interested in the overall energy of CRs rather than the individual particle trajectories (e.g. Padovani et al. 2020). On the other hand, for ultra-high-energy CRs with gyro-radii comparable to entire galactic discs the particles' trajectory is of greater interest (Kotera and Olinto 2011). Equally important to the energy per particle is the fraction of CR energy compared to the other energies like the magnetic, thermal, radiation or kinetic one for the coupling between the CRs and their environment. If the integrated CR energy is much smaller that the other components, CRs can be treated as tracers without dynamical back-reaction. Depending on the system under consideration the effective transport of the CR tracers can differ from advection with the gas, to diffusion and to free streaming. This is basically always the case for CR electrons, so CRs can be treated as tracer particles or passive fluid. For the hadronic component we can identify three main regimes, which we outline in more detail in Sect. 1.4. The low-energy CRs do not significantly contribute to the dynamics via their total pressure, so again a passive tracer fluid or tracer particles are appropriate. The regime around a GeV contains comparable energy densities to other components in galaxies, so the back-reaction onto the dynamics needs to be included. There are several models of different complexity, which we outline in this review, ranging from simply adding another pressure to the hydrodynamical equations up to a more complex way of including plasma processes between CR particles and the hydromagnetic waves. The very steep spectrum results again in an overall negligible integrated energy above $ 100 GeV. The passive treatment is then again appropriate. However, the large gyro-radii combined with the low number of particles favours the treatment as passive individual particles and their trajectories.
In the microscopic CR models the interaction of individual particles with the magnetic field is of interest in order to understand plasma processes. This approach is a different numerical class, known as particle-in-cell methods, where the Lorentz force for each particle is computed and the particle distribution function is actually sampled by a large number of CRs. The complexity of this plasma approach exceeds the scope of this review and we would like to highlight the review by Marcowith et al. (2020).
The field of CR physics is large ranging from their acceleration at small spatial scales of individual shock fronts up to the trajectory of the highest energy CRs and their specific origin from extra-galactic sources. Depending on the individual setups and environmental conditions, different models are chosen and combined. Among the three main types of models (particle-in-cell, test particle models and an effective fluid description) we mainly cover the last one in different applications.
In this review we distinguish between the following numerical models: • We consider CR propagation described by the canonical Fokker-Planck equation in a static magnetic field, referred to as 'phenomenological models', which are discussed in Sects. 3.1 and 3.2. • In the second part we discuss models considering joint solutions of the Fokker-Planck equation for CRs combined with the equations of magneto-hydrodynamics (MHD). These models are referred to as 'self-consistent models' (Strong et al. 2007) and are discussed in Sects. 4.1-5.1. They can be further distinguished in two ways.
• The first one covers a dynamical coupling of the CRs to the MHD system by simply including an additional pressure term related to cosmic rays. The interactions between CRs and magnetic fields are included as subgrid models with effective coupling coefficients, i.e. without including the streaming process explicitly in the system. This class of models however, has recently been extended to include spectrally resolved CRs. • The latest approach also considers CR generated Alfvén waves together with CR scattering off self-excited waves (streaming) which additionally couple CRs to the energy equation of the thermal plasma through dissipation of Alfvén waves (cosmic ray heating).

Basics and approaches
We begin with some basic concepts of CR propagation, and relating these to observational data. Most of our knowledge of CR propagation comes via secondary CRs, with additional information from c-rays and synchrotron radiation. It is instructive to point out why secondary nuclei are an ideal probe of CR propagation: the fact that the primary nuclei are measured (albeit only near the solar position) means that the secondary production functions can be computed from primary spectra, cross-sections and interstellar gas densities; the secondaries can then be 'propagated' and compared with observations. Since it became known that CRs fill the Galaxy it has been clear that nuclear interactions imply that their composition contains information on their propagation (Bradt and Peters 1950). An important breakthrough was the advent of satellite measurements of isotopic Li, Be, B in the 1970's (Garcia-Munoz et al. 1975). Since then the subject has expanded continuously with models of increasing degrees of sophistication. The observation that the composition of CRs is different from that of solar, in that rare solar-system nuclei like Boron are abundant in CR, proves the importance of propagation in the interstellar medium. Simplified estimates of the lifetime time of CRs in the interstellar medium conclude that there is a canonical column density of a few 'few g cm -2 ' of traversed material before the CRs interact with the gas and lose their energy (see Sect. 2.1).

Particle versus kinetic versus fluid approach
One main distinction of CR transport is the difference between particle and fluid approaches. If the mean free path of CRs is small compared to the characteristic simulated scales, i.e. CRs scatter efficiently, they are often treated as a fluid. On the other hand, if the mean free path is comparable or larger than the system under consideration the trajectory of individual CRs need to be considered. Which approach is more appropriate depends on the spatial scales, the scattering frequency, and the energy of the CRs. To get an order of magnitude estimate of the typical scales we can investigate the gyro-radius where p ? is the momentum perpendicular to the magnetic field line, |q| is the absolute value of the charge and B is the magnetic field strength. Figure 1 shows the gyro radius of CR protons for different magnetic field strengths as a function of CR momentum together with typical sizes of astrophysical systems. The plot illustrates that for low-energy CRs the gyro-radii are perceptibly smaller than typical astrophysical objects. We would like to highlight that the gyro-radius is not a definite measure for how the CRs need to be treated, but it nicely illustrates the separation of scales.
The confinement of CRs is more complicated and depends also on the total energy in CRs compared to the energy of the background system, i.e. whether CRs can be treated as tracer particles moving through the medium or whether CR themselves modify the magnetic field and gas properties. We can broadly separate three different regimes.
CR momenta (GeV/c The spectrum at low CR energies is not well constrained due to solar modulation (e.g . Webber 1998a;Potgieter 2013). The best data of the CR spectrum is based on Voyager (e.g. Cummings et al. 2016) and AMS-02 (e.g. Aguilar et al. 2014b, a) measurements, which is still in our very local astrophysical neighbourhood. At low energies the cross section of CRs with gas atoms and molecules increases, which leads to efficient losses (Coulomb losses, ionisation losses). On typical scales above a few parsec the gas scatters efficiently and the integrated energy of CRs is low enough such that those CRs are often treated as a diffusing fluid. On scales of star forming regions and protoplanetary discs the individual trajectories might be of interest.
CR momenta between 0.1 and $ 10 GeV/c This regime of the CR spectrum is the most difficult one because the total integrated CR energy is dominated by the contribution of the GeV particles. With energy densities exceeding the magnetic one, CRs do not just scatter off a background system but heavily influence the magnetic field and-connected via MHD-the local turbulent and thermal properties. Typically the scattering is very efficient, which allows a fluid approach. However, the simple diffusion approximation is likely to be violated. The resulting transfer of energy and the CR losses are complicated by the interplay between Coulomb and ionisation losses, hadronic losses and the interactions between CRs and the magnetic field.
CR momenta above $ 10 GeV/c At higher momenta each CR is much more energetic than other particles in the universe. However, the spectrum is so steep that the total energy stored in those CRs is subdominant, so they are dynamically irrelevant. The low scattering efficiency makes those CRs a reliable probe of their Fig. 1 Gyro-radius of CRs as a function of particle momentum. For gyro-radii much smaller than the system under consideration a fluid approximation seems reasonable travel through the magnetised universe. Therefore, in this energy regime CRs are often treated by following individual particles along their trajectories.

Grey versus spectrally resolved CR fluids
In order to account for the global effects of CRs in numerical models one often uses a grey approach, i.e. CR properties integrated over the full momentum range of the distribution function. In this simplified approach the CRs are treated as a fluid with global effective transport properties and interaction efficiencies. As in thermal gas dynamics an underlying spectrum needs to be assumed. Spectrally resolved methods allow for a more accurate treatment of the cooling.

The canonical CR propagation equation
The canonical CR propagation equation can be written as (Skilling 1971(Skilling , 1975a where f ¼ f ðr; p; tÞ is the CR density per unit momentum p at position r and at time t, f ðpÞ dp ¼ 4pp 2 f ðpÞ dp in terms of phase-space density f ðpÞ, b l represents mechanical and adiabatic losses and qðr; p; tÞ denotes the source terms. The effective CR advection velocity is where v 0 is the advection velocity of thermal plasma and v s ¼ Àbv A signðb Á re cr Þ ð 4Þ accounts for scattering of CR particles off self-excited Alfvén waves with b denoting the unit vector aligned with magnetic field, and v A is the Alfvén speed. The waves are excited due to a resonant coupling with the streaming CR population. The condition for resonance is that the Doppler-shifted wave frequency x À kv is an integral multiple of the cyclotron frequency nX (Kulsrud and Pearce 1969). For n ¼ 0 the gyroresonance occurs for waves at the same phase velocity as the particle velocity. If the drift velocity of CRs exceeds the Alfvén speed, the instability occurs for the waves propagating in the same direction. CRs interacting with the wave traveling in the same direction are mainly scattered in pitch angle and give a small amount of energy to the wave. The effect is known as the streaming instability. Although the instability generates directly forward waves propagating along magnetic field in the same direction as cosmic rays, wave-wave interactions imply the existence of backward waves, propagating in the opposite direction (Chin and Wentzel 1972;Skilling 1975a).
The spatial diffusion coefficient is generally anisotropic and thus described by a diffusion tensor with 'bb' meaning the dyadic product of vector b. In one dimension or for isotropic diffusion this reduces to a diffusion coefficient, D xx , which is given by where m AE means the collision frequency against forward (?) and backward (-) waves, v is particle velocity and l ¼ p Á b=p denotes the particle pitch-angle cosine. Diffusive reacceleration is described as diffusion in momentum space determined by the coefficient where m is the particle mass and c is its Lorentz factor. CR sources are usually assumed to be concentrated near the Galactic disk and to have a radial distribution like for example supernova remnants (SNR). A source injection spectrum and its isotopic composition are required; the composition is usually initially based on primordial solar but can be determined iteratively from the CR data themselves for later comparison with solar. The spallation part of qðr; p; tÞ depends on all progenitor species and their energy-dependent cross-sections, and the gas density qðrÞ; it is generally assumed that the spallation products have the same kinetic energy per nucleon as the progenitor. K-electron capture and electron stripping can be included via the time scale for loss by fragmentation s f and q. D xx is in general a function of ðr; b; p=ZÞ where b ¼ v=c and Z is the charge, and p/Z determines the gyro-radius in a given magnetic field; D xx may be isotropic, or more realistically anisotropic, and may be influenced by the CR themselves (e.g. in wavedamping models). The coefficient D pp is related to D xx by D pp D xx / p 2 , with the proportionality constant depending on the theory of stochastic reacceleration (Berezinskii et al. 1990, Seo andPtuskin 1994) as described in Sect. 2.5. v is a function of r and t. The term in $ Á v represents adiabatic momentum gain or loss in the non-uniform flow of gas with a frozen-in magnetic field whose inhomogeneities scatter the CR. s f depends on the total spallation cross-section and qðrÞ. Observationally, the density qðrÞ can be based on surveys of atomic and molecular gas, but can also incorporate small-scale variations such as the region of low gas density surrounding the Sun. In hydrodynamical simulations, the density is naturally included in every computational cell. This equation only treats continuous momentum-loss; catastrophic losses can be included via s f and q. CR electrons, positrons and antiprotons propagation constitute just special cases of this equation, differing only in their energy losses and production rates.
The boundary conditions depend on the model; often f ¼ 0 is assumed at the 'halo boundary' where particles escape into intergalactic space, but this obviously just an approximation (since the intergalactic flux is not zero) which can be relaxed for models with a physical treatment of the boundary.
Equation (2) is a time-dependent equation; often the steady-state solution is required, which can be obtained either by setting of =ot ¼ 0 or following the time dependence until a steady state is reached; the latter procedure is much easier to implement numerically. Depending on the model setup the time-dependence of q can be neglected unless effects of nearby recent sources or the stochastic nature of sources are being studied. In hydrodynamical models, local sources are easily incorporated in regions of star formation or locally identitfied shocks. By starting with the solution for the heaviest primaries and using this source term to compute the spallation source for their products, the complete system can be solved including secondaries, tertiaries etc. Then the CR spectra at the solar position can be compared with direct observations, including solar modulation if required.
Source abundances are determined iteratively, comparing propagation calculations with data; for nuclei with very small source abundances, the source values are masked by secondaries and cross-section uncertainties and are therefore hard to determine. Webber (1998b) gives a ranking from 'easy' to 'impossible' for the possibility of getting the source abundances using Advanced Composition Explorer (ACE) data. A review of the high-precision abundances from ACE is in Wiedenbeck et al. (2001) and for Ulysses in Connell (2001). For a useful summary of the various astrophysical abundances relevant to interpreting CR abundances see Binns et al. (2005).

Streaming and diffusion
Equation (2) contains the term representing advection of the CR population with the velocity u ¼ v 0 þ v s , given by formula (3). This includes the streaming velocity along the direction pointing down the CR energy density gradient. This term originates from the consideration of the cyclotron resonance streaming instability, discovered by Kulsrud and Pearce (1969), which sets in when a population of CRs move with a bulk speed greater than the Alfvén speed v A . Even a slight anisotropy of CRs, which naturally arises in the presence of sources, causes unstable growth of the waves due to momentum transfer from the CRs to waves via pitch-angle scattering. The streaming CRs transfer their energy to Alfvén waves and subsequently scatter off self-generated waves (see e.g. Wiener et al. 2013, for a detailed introduction). Streaming and diffusion are considered respectively as the first and second order effects in expansion of the distribution function in powers of inverse wave particle scattering frequency m (e.g. Skilling 1971;Wiener et al. 2017b).
The waves are subject to damping due to ion-neutral friction, due to waveparticle interactions between thermal ions and the low-frequency beat waves formed by two interfering Alfvén waves (non-linear Landau damping) and due to the cascade of waves to smaller scales (turbulent damping) (see Thomas and Pfrommer 2019, for a brief summary on this topic). The system reaches equilibrium when the wave damping rate becomes equal to their growth rate due to the streaming instability. The scattering leads to isotropisation of the CR distribution in the reference frame co-moving with the waves, implying that the CRs bulk velocity becomes equal to the Alfvén velocity plus a local fluid velocity.
Scattering of CR particles off Alfvén waves manifests itself as diffusion process in phase space, with spatial and momentum diffusion coefficients given respectively by formulae (6) and (7). Diffusion of CRs is intrinsically related to the streaming process which depends on local plasma conditions. In a more general case diffusion might be related also to an extrinsic turbulence driven by sources other than CRs. The effective diffusion coefficient depends on the dominant wave damping mechanism.
Wave damping means dissipation of wave energy and heating of the interstellar medium. The heating rate due to dissipation of Alfvén waves is If the coupling of CRs is reduced, due to wave damping mechanisms, the effective propagation speed of the CR population might be higher than the Alfvén speed, however, it is argued that the heating rate is still given by the expression (8), because all momentum and energy transfer between the CRs and gas is mediated by hydromagnetic waves which propagate at the speed v A (e.g. Zweibel 2013). The concept of CR diffusion explains why energetic charged particles have highly isotropic distributions and why they are retained well in the Galaxy. The Galactic magnetic field which tangles the trajectories of particles plays a crucial role in this process. Typical values of the diffusion coefficient found from fitting to CR data is D xx $ ð3 À 5Þ Â 10 28 cm 2 s À1 at an energy of $ 1 GeV per nucleon and it increases with magnetic rigidity as R 0:3 À R 0:6 in different versions of the empirical diffusion model of CR propagation. Here, the magnetic rigidity is defined as R ¼ pc=Ze with the momentum p and the charge Ze. In a given magnetic field configuration, particles with the same rigidity follow the same trajectory.
On the microscopic level the diffusion of CR results from particle scattering off random MHD waves and discontinuities. The effective ''collision integral'' for charged energetic particles moving in a magnetic field with small random fluctuations dB ( B is given by the standard quasi-linear theory of plasma turbulence (Kennel and Engelmann 1966). The wave-particle interaction is of resonant character so that an energetic particle is predominantly scattered by those irregularities of magnetic field which have their projection of the wave vector on the average magnetic field direction equal to k k ¼ AEs= r g l À Á , where l is the particle pitch angle. The integers s ¼ 0; 1; 2. . . correspond to cyclotron resonances of different order. The efficiency of scattering depends on the polarization of the waves and on their distribution in k-space. The first-order resonance s ¼ 1 is the most important for the isotropic and also for the one-dimensional distribution of random MHD waves along the average magnetic field. In some cases-for the calculation of scattering at small l and for the calculation of perpendicular diffusion-the broadening of resonances and magnetic mirroring effects should be taken into account. The resulting spatial diffusion is strongly anisotropic locally and goes predominantly along the magnetic field lines. However, strong fluctuations of the magnetic field on large scales of L $ 100 pc, where the strength of the random field is several times higher than the average field strength, lead to the isotropization of global CR diffusion in the Galaxy. The rigorous treatment of this effect is not trivial, since the field is almost static and the strictly one-dimensional diffusion along the magnetic field lines does not lead to non-zero diffusion perpendicular to B, see Casse et al. (2002) and the references therein.
Following several detailed reviews of the theory of CR diffusion (Jokipii 1971;Toptygin 1985;Berezinskii et al. 1990;Schlickeiser 2002) the diffusion coefficient at r g \L can be roughly estimated as D xx % dB res =B ð Þ À2 vr g =3, where dB res is the amplitude of the random field at the resonant wave number k res ¼ 1=r g . The spectral energy density of interstellar turbulence has a power law form wðkÞdk $ k À2þa dk, a ¼ 1=3 over a wide range of wave numbers 1=ð10 20 cmÞ\k\1=ð10 8 cm), see Elmegreen and Scalo (2004), and the strength of the random field at the main scale is dB % 5 lG. This gives an estimate of the diffusion coefficient D xx % 2 Â 10 27 bR 1=3 GV cm 2 s -1 for all CR particles with magnetic rigidities R\10 8 GV, in a fair agreement with the empirical diffusion model (the version with distributed reacceleration). The scaling law D xx $ R 1=3 is determined by the value of the exponent a ¼ 1=3, typical for a Kolmogorov spectrum. Theoretically (Goldreich and Sridhar 1995) the Kolmogorov type spectrum might refer only to some part of the MHD turbulence which includes the (Alfvénic) structures strongly elongated along the magnetic-field direction and which are not able to provide the significant scattering and required diffusion of cosmic rays. In parallel, the more isotropic (fast magnetosonic) part of the turbulence, with a smaller value of random field at the main scale and with the exponent a ¼ 1=2 typical for the Kraichnan type turbulence spectrum, may exist in the interstellar medium (Yan and Lazarian 2004). The Kraichnan spectrum gives a scaling D xx $ R 1=2 which is close to the high-energy asymptotic form of the diffusion coefficient obtained in the 'plain diffusion' version of the empirical propagation model. Thus the approach based on kinetic theory gives a proper estimate of the diffusion coefficient and predicts a power-law dependence of diffusion on magnetic rigidity, but the determination of the actual diffusion coefficient has to make use of fitting to models of CR propagation in the Galaxy.

Convection/advection
The transport of CRs is a combination of the locally advective contribution, where CRs are advected with the flow, plus the diffusive and/or streaming transport relative to the gas. Depending on the system under consideration convection and advection have been distinguished. From a hydrodynamical perspective the CRs are advected with the flow. Historically, global models of galaxies do not resolve the gas flow but cover the turbulent motions in the galaxy and the perturbations globally as a convective system. In the literature we therefore find both terminologies.
While the most frequently considered mode of CR transport is diffusion, the existence of galactic winds in many galaxies suggests that advective transport should also be important. Winds are common in galaxies and can be CR-driven (e.g. Naab and Ostriker 2017) CRs also play a dynamical role in galactic halos (Breitschwerdt et al. 1991(Breitschwerdt et al. , 1993. Convection not only transports CR, it can also produce adiabatic energy losses as the wind speed increases away from the disk. Convection was first considered by Jokipii (1976) and followed up by Owens and Jokipii (1977a, b), Jones (1978Jones ( , 1979, Bloemen et al. (1993). Both 1-zone and 2-zone models have been studied: a 1-zone model has convection and diffusion everywhere, a 2-zone model has diffusion alone up to some distance from the plane, and diffusion plus convection beyond.
For one-zone diffusion/convection models a good diagnostic is the energydependence of the secondary-to-primary ratio: a purely convective transport would have no energy dependence (apart from the velocity-dependence of the reaction rate), contrary to what is observed. If the diffusion rate decreases with decreasing energy, any convection will eventually take over and cause the secondary-toprimary ratio to flatten at low energy: this is observed but convection does not reproduce e.g. the Boron-to-Carbon ratio (B/C) very well . Ptuskin et al. (1997) studied a self-consistent two-zone model with a wind driven by CR and thermal gas in a rotating Galaxy. The CR propagation is entirely diffusive in a zone jzj\1 kpc, and diffusive-convective outside. CR reaching the convective zone do not return, so it acts as a halo boundary with height varying with energy and Galactocentric radius. It is possible to explain the energy-dependence of the secondary-to-primary ratio with this model, and it is also claimed to be consistent with radioactive isotopes. The effect of a Galactic wind on the radial CR gradient has been investigated (Breitschwerdt et al. 2002); they constructed a selfconsistent model with the wind driven by CR, and with anisotropic diffusion. The convective velocities involved in the outer zone are large (* 100 km s -1 ) but this model is still consistent with radioactive CR nuclei which set a much lower limit , since this limit is only applicable in the inner zone. Observational support of such models would require direct evidence for a Galactic wind in the halo.

Reacceleration
In addition to spatial diffusion, the scattering of CR particles on randomly moving MHD waves leads to stochastic acceleration which is described in the transport equation as diffusion in momentum space with some diffusion coefficient D pp . One can estimate it as D pp ¼ p 2 V 2 a = 9D xx ð Þwhere the Alfvén velocity V a is introduced as a characteristic velocity of weak disturbances propagating in a magnetic field, see Berezinskii et al. (1990), Schlickeiser (2002 for rigorous formulas.
2 Early models

CR propagation -basic relations and motivations
Assuming that CRs are accelerated from the ISM of our Galaxy, we expect a similar composition of CRs and the thermal gas in the ISM. However, their observed abundances show a relative over-abundance of the light elements Li, Be, and B, which must be produced during their transport through the ISM. These elements are produced in spallation processes and are thus secondary particles. A classical measure of the ratio of stable secondaries to primaries is the B/C ratio, which depends on particle energy, but can be approximated to zeroth order to be $ 0:3. The grammage is the amount of material that the CRs have to pass through before they interact with an atom in the ISM, The grammage can be related to the spallation products by comparing the travelled depth with the mean mass per particle in the ISM m ISM and the typical cross section for the spallation of carbon to boron, Using typical numbers for the ISM we find a grammage of $ 10 g cm À2 . At a disc surface density of approximately R gas $ 10 M pc À2 ¼ 2 Â 10 À3 g cm À2 the CR must cross the disc 10 3 times, assuming that the grammage is accumulated by travelling through the disc. The associated travel or residence time, which can also be understood as the lifetime of a CR in the Galaxy or its escape time from the disc, is given by for a gas scale height of h $ 100 pc and a CR velocity v ¼ c.
The linear distance that a CR travels during the residence time is l ¼ t res c $ 500 kpc, so much larger than galactic scales and the CRs need to be confined to the Galaxy. In order to estimate escape timescales of the CRs unstable secondaries provide a valuable tool. The longest lived and best measured unstable isotope is 10 Be. Its decay time is t decay ¼ 1:39 Myr, which is of the order of the residence time. The cross sections for the spallation of carbon into stable ( 9 Be) and unstable ( 10 Be) beryllium are similar, r C! 10 Be $ r C! 9 Be . This means that an initial abundance ratio ( 10 Be/ 9 Be) of unity at production decreases over time by t decay ð 10 BeÞ=t res . Measurements of this ratio reveal a residence time of 10 À 20 Myr, so larger than the residence time of CRs.
A simple transport equation along the vertical dimension reads Here, N is the number density of CRs, Q 0 is the CR source function, D is the diffusion coefficient, and d is the Dirac-d distribution. By assuming steady state and an injection of CRs close to the midplane at z ¼ 0, we can reduce the equation to and find for z [ 0 where H is the size of the halo, which is poorly constrained to a value of approximately 5 kpc. The total grammage v ¼ qt res c will be reached by moving though the average density of the total volume (disc plus halo), q ¼ lm p n ISM h=H, with a disc scale height of h ¼ 100 pc, an average mean molecular weight of l ¼ 1:4, and an average ISM density of 1 cm À3 . The diffusion coefficient can be estimated to be Even though these estimates are very simplified, they provide two valuable features of GeV CRs that remain valid even with much more complex assumptions. The first is that the CRs are distributed relatively smoothly through the ISM. Locally, the CR density varies, but much less than the gas structures, so molecular clouds are located in an almost uniform sea of GeV CRs. The second is that due to frequent scattering, the CR distribution is locally isotropic.

Weighted slabs and leaky boxes
The closely related leaky-box and weighted slab formalisms have provided the basis for most of the literature interpreting CR data.
In the leaky-box model, the diffusion and convection terms are approximated by the leakage term with some characteristic escape time of CR from the Galaxy. The escape time s esc may be a function of particle energy (momentum), charge, and mass number if needed, but it does not depend on the spatial coordinates. There are two cases when the leaky box equations can be obtained as a correct approximation to the diffusion model: (1) the model with fast CR diffusion in the Galaxy and particle reflection at the CR halo boundaries with some probability to escape (Ginzburg and Syrovatskii 1964), (2) the formulae for CR density in the Galactic disk in the flat halo model ðz h ( RÞ with thin source and gas disks (z gas ( z h Þ which are formally equivalent to the leaky-box model formulae in the case when stable nuclei are considered (Ginzburg and Ptuskin 1976). The nuclear fragmentation is actually determined not by the escape time s esc but rather by the escape length in g cm À2 : x ¼ vqs esc , where q is the average gas density of interstellar gas in a galaxy with the volume of the cosmic ray halo included.
The solution of a system of coupled transport equations for all isotopes involved in the process of nuclear fragmentation is required for studying CR propagation. A powerful method, the weighted-slab technique, which consists of splitting the problem into astrophysical and nuclear parts was suggested for this problem (Davis 1960;Ginzburg and Syrovatskii 1964) before the modern computer epoch. The nuclear fragmentation problem is solved in terms of the slab model wherein the CR beam is allowed to traverse a thickness x of the interstellar gas and these solutions are integrated over all values of x weighted with a distribution function G(x) derived from an astrophysical propagation model. In its standard realization (Protheroe et al. 1981;Garcia-Munoz et al. 1987) the weighted-slab method breaks down for low energy CRs where one has strong energy dependence of nuclear cross sections, strong energy losses, and energy dependent diffusion. Furthermore, if the diffusion coefficient depends on the nuclear species the method has rather significant errors. After some modification ) the weighted-slab method becomes rigorous for the important special case of separable dependence of the diffusion coefficient on particle energy (or rigidity) and position with no convective transport. The modified weighted-slab method was applied to a few simple diffusion models in Jones et al. (2001aJones et al. ( , 2001b. The weighted-slab method can also be applied to the solution of the leaky-box equations. It can easily be shown that the leaky-box model has an exponential distribution of path lengths GðxÞ / expðÀx=XÞ with the mean grammage equal to the escape length X. In a purely empirical approach, one can try to determine the shape of the distribution function G(x) which best fits the data on abundances of stable primary and secondary nuclei (Shapiro and Silberberg 1970). It has been established that the shape of G(x) is close to exponential: GðxÞ / expðÀx=XðR; bÞÞ, and this justifies the use of the leaky-box model in this case. There are various calculations of G(x) (Stephens and Streitmatter 1998;Davis et al. 2000;Jones et al. 2001a, b).
The possible existence of truncation, a deficit at small path lengths (below a few g cm À2 at energies near 1 GeV/n), relative to an exponential path-length distribution, has been discussed for decades (Shapiro and Silberberg 1970;Garcia-Munoz et al. 1987;Webber 1993;Duvernois et al. 1996). The problem was not solved mainly because of cross-sectional uncertainties. In a consistent theory of CR diffusion and nuclear fragmentation in the cloudy interstellar medium, the truncation occurs naturally if some fraction of CR sources resides inside dense giant molecular clouds (Ptuskin and Soutoul 1990).
For radioactive nuclei, the classical approach is to compute the 'surviving fraction' which is the ratio of the observed abundance to that expected in the case of no decay. Often the result is given in the form of an effective mean gas density, to be compared with the average density in the Galaxy, but this density should not be taken at face value. The surviving fraction can better be related to physical parameters (Ptuskin and Soutoul 1998). None of these methods can face the complexities of propagation of CR electrons and positrons with their large energy and spatially dependent energy losses.

Explicit models
Finally the mathematical effort required to put the 3-D Galaxy into a 1-D formalism becomes overwhelming, and it seems better to work in physical space from the beginning: this approach is intuitively simple and easy to interpret. We can call these 'explicit models'. The explicit solution approach including secondaries was pioneered by (Ginzburg and Ptuskin 1976) and applied to newer data by Webber et al. (1992), Bloemen et al. (1993) with analytical solutions for 2D diffusionconvection models with a cosmic-ray source distribution, which however had many restrictive approximations to make them tractable (no energy losses, simple gas model). More recently a semi-empirical model which is 2D and includes energylosses and reacceleration has been developed (Maurin et al. 2001(Maurin et al. , 2002. This is a closed-form solution expressed as a Green's function to be integrated over the sources. It incorporates a radial CR source distribution, but the gas model is a simple constant density within the disk. Taillet et al. (2004) give an analytical solution for the time-dependent case with a generalized gas distribution.
A 'myriad sources model' (Higdon and Lingenfelter 2003), which is actually a Green's function method without energy losses, yields similar results to Strong et al. (2004) for the diffusion coefficient and halo size.
The most advanced explicit solutions to date are the fully numerical models described in other sections. Even this has limitations in treating some aspects (e.g. when particle trajectories become important at high energies) so one might ask whether a fully Monte-Carlo approach (as is commonly done for energies [ 10 15 eV) would not be better in the future, given increasing computing power. This would allow effects like field-line diffusion (important for propagation perpendicular to the Galactic plane) to be explicitly included. However it is still challenging: a GeV particle diffusing with a mean free path of 1 pc in a Galaxy with 4 kpc halo height takes $ ð4000=1Þ 2 % 10 7 scatterings to leave the Galaxy, which would even now need supercomputers to obtain adequate statistics. Hence we expect a numerical solution of the propagation equations to remain an important approach for the foreseeable future.

GALPROP
The GALPROP project  was invented with the following aims: 1. to enable simultaneous predictions of all relevant observations including CR nuclei, electrons and positrons, c-rays and synchrotron radiation, 2. to overcome the limitations of analytical and semi-analytical methods, taking advantage of advances in computing power, as CR, c-ray and other data become more accurate, 3. to incorporate the best current information on Galactic structure and CR source distributions, 4. to provide a publicly-available code as a basis for further expansion.
The first point was the real driving factor, the idea being that all data relate to the same system, the Galaxy, and one cannot for example allow a model which fits CR secondary/primary ratios while not fitting c-rays or not being compatible with the known interstellar gas distribution. There are so many simultaneous constraints, and that to find one model satisfying all of them is a challenge, which in fact has not been met up to now. GALPROP has been adopted as the standard for diffuse Galactic c-ray emission for NASA's Fermi-LAT c-ray observatory.
We give a very brief summary of GALPROP; for details we refer the reader to the relevant papers Moskalenko and Strong 1998;Strong et al. 2000;Moskalenko et al. 2002;Strong et al. 2004;Ptuskin et al. 2006) and two Annual Reviews articles, (Strong et al. 2007) with new developments described in Grenier et al. (2015). Developments of GALPROP have continued. It is maintained as public software 1 which includes an Explanatory Supplement describing the method in full detail. Recent developments are described in (Porter et al. 2017(Porter et al. , 2019Boschini et al. 2020b).
The CR propagation equation is solved numerically on a spatial grid, either in 2D with cylindrical symmetry in the Galaxy or in full 3D. The boundaries of the model in Galactocentric radius and height above the disk, and the grid spacing, are userdefinable. In addition there is a grid in momentum; momentum (not e.g. kinetic energy) is used because it is the natural quantity for propagation. Parameters for all processes in Eq. (2) can be controlled with input parameters. The distribution of CR sources can be freely chosen, typically to represent supernova remnants (SNR). Source spectral shape and isotopic composition (relative to protons) are input parameters. Interstellar gas distributions are based on current HI and CO surveys, and the interstellar radiation field (ISRF), for lepton energy losses and inverse Compton scattering is based on a separate detailed calculation. CR fragmentation and destruction cross-sections are based on extensive compilations and parameterizations (Mashnik et al. 2004). The numerical solution proceeds in time until a steady-state is reached; a time-dependent solution is also an option. Checks for convergence are implemented. Starting with the heaviest primary nucleus considered (e.g. 64 Ni) the propagation solution is used to compute the source term for its spallation products, which are then propagated in turn, and so on down to protons, secondary electrons and positrons, and antiprotons. In this way secondaries, tertiaries etc. are included. (Production of 10 B via the 10 Be-decay channel is important and requires a second iteration of this procedure.) GALPROP includes K-capture and electron stripping processes, where a nucleus with an electron (Hlike) is considered a separate species because of the difference in the lifetime. Since H-like atoms have only one K-shell electron, the K-capture decay half-life has to be increased by a factor of 2 compared to the measured half-life value.
Primary electrons are treated separately. Normalization of protons, helium and electrons to experimental data is provided (all other isotopes are determined by the source composition and propagation). c-rays and synchrotron are computed using interstellar gas survey data (for pion-decay and bremsstrahlung) and the ISRF model (for inverse Compton). Spectra of all species on the chosen grid and the c-ray and synchrotron skymaps are output in a standard astronomical format for comparison with data. Extensions to GALPROP includes non-linear wave damping (Ptuskin et al. 2006).
We remark that while GALPROP has the ambitious goal of being 'realistic', it is obvious that any such model can only be a crude approximation to reality. Some known limitations are: boundary condition (flux set to zero) at the halo boundary is not physical, only energies below 10 15 eV are treated (no trajectory calculations), spatially uniform source abundances are assumed, though stochastic sources in space and time are also implemented.
CR propagation is traditionally treated as a spatially smooth, steady-state process. Because of the rapid diffusion and long containment time-scales in the Galaxy this is to first order a sufficient approximation, but there are cases where it breaks down. The rapid energy loss of electrons and positrons above about 100 GeV and the stochastic nature of their sources produces spatial and temporal variations. Supernovae are stochastic events and each SNR source of CR accelerates for only 10 4 À 10 5 years, which leaves an imprint on the distribution of electrons. This leads to large fluctuations in the CR electron/positron density at high energies, so that the lepton spectrum measured near the Sun may not be typical (Strong et al. 2004). These effects are much smaller for nucleons since there are essentially no energy losses except ionization at low energies, but they are still included. Such effects can influence the B/C ratio (Taillet et al. 2004;Büsching et al. 2005). A recent timedependent GALPROP model is described in Porter et al. (2019).
Here we give some technical details of the GALPROP package, taken from the GALPROP Explanatory Supplement supplied with the code. These can be compared with the other approaches described in this review.
Transport equation. GALPROP solves the transport equation with a given source distribution and boundary conditions for all cosmic-ray species. This includes Galactic wind (convection), diffusive reacceleration in the interstellar medium, energy losses, nuclear fragmentation, and decay. The numerical solution of the transport equation is based on an implicit second-order scheme (e.g. Press et al. 1992). The spatial boundary conditions assume either zero CR density at the boundaries or, more physically plausible, free particle escape at the boundaries. Since we have a 3-dimensional (R, z, p) or 4-dimentional (x, y, z, p) problem (spatial variables plus momentum) we use ''operator splitting'' to handle the implicit solution.
The propagation equation is written in the form: where N ¼ N ðr; p; tÞ is the density per unit of total particle momentum, N ðpÞdp ¼ 4pp 2 f ðpÞ in terms of phase-space density f ðpÞ, qðr; pÞ is the source term, D xx is the spatial diffusion coefficient, V is the convection velocity, reacceleration is described as diffusion in momentum space and is determined by the coefficient D pp , _ p dp=dt is the momentum loss rate, s f is the time scale for fragmentation, and s r is the time scale for the radioactive decay.
One can estimate Þwhere the Alfvén velocity V a is introduced as a characteristic velocity of weak disturbances propagating in a magnetic field, see (Berezinskii et al. 1990;Schlickeiser 2002) for rigorous formulas.
For a given halo size the diffusion coefficient as a function of momentum and the reacceleration or convection parameters is determined by boron-to-carbon ratio data. The spatial diffusion coefficient is taken as D xx ¼ bD 0 ðq=q 0 Þ d if necessary with a break (d ¼ d 1;2 below/above rigidity q 0 ), where the factor b (¼ v=c) is a consequence of a random-walk process. For the case of reacceleration the momentum-space diffusion coefficient D pp is related to the spatial coefficient D xx where d ¼ 1=3 for a Kolmogorov spectrum of interstellar turbulences. The convection velocity (in z-direction only) V(z) is assumed to increase linearly with distance from the plane (dV=dz [ 0 for all z); this implies a constant adiabatic energy loss. Since the wind cannot blow in both directions at z ¼ 0 this formulation requires a zero velocity there. A more general case where the wind starts at zero and reaches a constant value at a specified z has therefore been implemented using a tanh function.
The distribution of cosmic-ray sources is parameterized and usually chosen to follow the pulsar distribution from radio observations, since pulsars should be a good tracer of SNR, which are difficult to detect at large distances. The injection spectrum of nucleons is assumed to be a power law in momentum, dqðpÞ=dp / p Àc . Energy losses for nuclei by ionization and Coulomb interactions are included, and for electrons by ionization, Coulomb interactions, bremsstrahlung, inverse Compton, and synchrotron. The code uses cross-section measurements and energy dependent fitting functions.
The code calculates the production and propagation of secondary antiprotons from pp collisions. Secondary positrons and electrons in cosmic rays are the final product of decay of charged pions and kaons which in turn created in collisions of cosmic-ray particles with gas. Pion production by pp; pHe; Hep; HeHe collisions are included.
The nuclei are aligned on the same kinetic energy per nucleon E kin since this simplifies the secondary-to-primary computation, where primaries produce secondaries of the same E kin . However the basic CR density used has units of density per total momentum p since this is natural for propagation. The actual units used When the flux IðE kin Þ in cm -2 sr -1 s -1 (MeV/nucleon) À1 is necessary, it can be simply obtained from where A is the nucleus mass number. This follows from dp ¼ A b dE kin . The combined requirements of transport and fragmentation are thus elegantly met. The normal units for presentation of CR data are cm -2 sr -1 s -1 (MeV/nucleon) À1 , and with this scheme the conversion is trivial. The nucleus energy scales are logarithmic in E kin .

Numerical solution of the propagation equation
Full explicit method. The diffusion, reacceleration, convection and loss terms in eq. (18) can all be finite-differenced for each dimension (R, z, p) or (x, y, z, p) in the form where all terms are functions of (R, z, p) or (x, y, z, p). This is the fully time-explicit method (Press et al. 1992) where the updating scheme is which generalizes simply to any number of dimensions since all the quantities are known from the current step. It gives more accurate solutions, which tend to the exact solution according to the computed diagnostics, but are not unconditionally stable (while Crank-Nicolson is). For this reason it is only applicable for short enough timesteps. Since no solution of matrix equations is required, this method is faster than Crank-Nicolson for the same timesteps, and this compensates for the need for smaller steps.
Fully implicit method. The diffusion, reacceleration, convection and loss terms in eq. (18) can all be finite-differenced for each dimension (R, z, p) or (x, y, z, p) in the form where all terms are functions of (R, z, p) or (x, y, z, p). This is the fully time-implicit method where the updating scheme is This method is unconditionally stable for all a and Dt, but is only 1st-order accurate in time.
The tridiagonal system of equations is solved for the N tþDt i by standard methods. Note that for energy losses we use 'upwind' differencing to enhance stability, which is possible since we have only loss terms (adiabatic energy gain is not included here).
Crank-Nicolson method. Alternatively, the propagation equation can be finitedifferenced in the form This is the Crank-Nicolson method where the updating scheme is It thus uses a combination of implicit and explicit terms, forming the time-average of the differentials. Like the fully implicit method, this method is unconditionally stable for all a and Dt, but is 2nd-order accurate in time, so that larger time-steps are possible than with the 1st-order scheme.
The tridiagonal system of equations is again solved for the N tþDt i by the standard method. Note that the RHS has all known quantities from the current time-step.
The Crank-Nicolson method described above applies to a one-dimensional case; the application to 2 or 3 spatial and one momentum dimension requires a generalization. A straightforward expansion to more dimensions implies solving large matrix equations (no longer tridiagonal); instead the so-called ADI (alternating direction implicit) method is used, in which the implicit solution is applied to each dimension in turn. Each application uses just the operator for that dimension, so the tridiagonal scheme can still be used. This however is not completely valid since it solves a slightly different problem from that with the full operator; however for small enough timesteps the solution is accurate (see Section on Tests of GALPROP in the GALPROP Explanatory Supplement).
The explicit method, where the full operator can be used in each timestep without any overhead for solving matrix equations, is also useful for obtaining an accurate solution at the end of a run. Although it is not unconditionally stable, this does not matter provided the timesteps are small enough, which is in any case required for the implicit methods to maximise their accuracy. A suitable mix of explicit and implicit methods to obtain an accurate solution with minimum computing requirements, is the goal.
For 2D, three spatial boundary conditions may be imposed at each iteration. This is not physically expected, although it is a common conventional assumption. More physically plausible is free escape at the boundaries, which is not the same. For this, it is sufficient simply to not impose the above conditions in the updating scheme, since the a 1 ; a 3 coefficients do not act outside the boundaries, so there is no diffusive or convective flux inwards at the boundaries. The resulting solutions then have N [ 0 at the boundaries. In future more physical boundary conditions could be implemented, e.g. specifying the outward streaming velocity or the escape probability at the boundaries. No boundary conditions are imposed or required at R = 0 or in p. Grid intervals are typically DR ¼ 1 kpc, Dz ¼ 0:1 kpc; for p a logarithmic scale with ratio typically 1.2 is used. Although the model is symmetric around z ¼ 0 the solution is generated for Àz h \z\z h since this is required for the tridiagonal system to be valid. For 3D, the spatial boundary conditions N ðAEx h ; y; z; pÞ ¼ N ðx; AEy h ; z; pÞ ¼ N ðx; y; AEz h ; pÞ ¼ 0 ð30Þ may be imposed at each iteration, and free escape at the boundary is an option as for 2D. Again no boundary conditions are imposed in p. Grid intervals are typically Dx ¼ Dy ¼ 0:5 kpc, Dz ¼ 0:1 kpc. Since we have a 3-dimensional (R, z, p) problem we use 'operator splitting' to handle the implicit solution, as follows. We apply the implicit updating scheme alternately for the operator in each dimension in turn, keeping the other two coordinates fixed. The source and fragmentation, decay terms are used in every step, so to account for the 3 substeps, 1 3 q i and 1 3s are used instead of q i , 1=s for the source term and the fragmentation, decay terms respectively. The coefficients for 3 spatial dimension are the same except that R is replaced by (x,y) and the finite differencing coefficients have the same form as for z, and 1 4 q i and 1 4s are used for the source term and the fragmentation, decay terms respectively, to account for the 4 substeps. With this scheme the solution can be done via the tridiagonal solution for each dimension in turn, as described in Press et al. (1992). The spatial 3D scheme is simpler than the 2D one since the diffusion operator is easier to formulate (x,y,z have the same form), and in addition it does not have the problem of the boundary condition at R=0. In the case of anisotropic diffusion, D zz is used in the z-direction.
Finally all nuclei are normalized to the proton flux from the parameter file, using the relative abundances given as parameters. The value of E kin is taken as the reference value for the proton normalization. All results are output to FITS files for comparison with data and further analysis.

Other codes
Following the success of the GALPROP approach described above, other projects with similar goals were started. Here we just mention these without details, which can be found in the papers. The DRAGON project (Gaggero et al. 2014) and references therein which extends the CR propagation to anisotropic and spatiallydependent diffusion. DRAGON is also publicly available. 2 A more physical approach to diffusion with turbulence incorporated in DRAGON is given in Evoli and Yan (2014).
USINE is a semi-analytical CR-propagation package, which has the advantage of speed for model parameter explorations, and which has been the basis of many recent investigations. USINE is publicly available 3 as described by Maurin (2020). It has been used for many years already (Putze et al. 2011) and references therein. It has recently been used to study secondary antiprotons ) and the size of the Galactic CR halo (Weinrich et al. 2020).
The numerical packages mentioned have limitations in terms of both accuracy and speed, and hence the spatial resolution achievable. Hence their use has mainly been restricted to 2D models with cylindrical symmetry. A new approach is implemented in the PICARD model (Werner et al. 2013;Kissmann 2014), which is fully 3D in concept and has state-of-the art numerical techniques. This makes it possible to handle models with spiral structure at good (e.g. 10 pc) resolution with reasonable computer resources.

The system of equations
To reduce the physical and computational complexity of the CR propagation problem numerous authors neglected the explicit consideration of the streaming process by eliminating the Alfvén wave-related component from the CR propagation speed u in Eq. (2) and by defining the spatial and momentum diffusion coefficients as free parameters of the model. In phenomenological models the values of diffusion coefficient are deduced from secondary to primary CR abundances taken from observational data. Depending on particular needs Eq. (2) might be integrated directly or with the aid of its number density or energy density moments.
The latter quantities enable construction of numerical solvers in a conservative manner, using finite volume methods both in spatial and in momentum dimensions. In the following considerations we neglect particle acceleration processes, therefore we assume D pp ¼ 0. We also omit particle acceleration and radioactive decay.

CR number density
The number density of CR particles in an arbitrarily chosen range ½p L ; p R in momentum space is defined as We multiply Eq. (2) by 4pp 2 and integrate over p to get the evolution equation for the particle number density where Q LR is the spatial density of CR sources and hD n i is the momentum-averaged spatial diffusion tensor of the particle number density (see also Sect. 4.6). In the limit of p L ¼ 0 and p R ¼ 1 we get the conventional form of the diffusion-advection equation for CR particle number density on cr ot ¼ Àr Á ðun cr Þ þ r hD n irn cr ð ÞþQ: ð33Þ

CR energy density
The CR energy density in a section ðp L ; p R Þ of the momentum axis is defined as e LR cr ðx; tÞ Again, we integrate Eq. (2) multiplied by 4pp 2 T over p, where T ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi p 2 c 2 þ m 2 c 4 p À mc 2 is kinetic energy and m is the rest-mass of CR protons, electrons or other particles. The resulting equation for the CR energy density reads where hD e i denotes the momentum averaged diffusion coefficient (see Sect. 4.6) of the CR energy and S LR sources of CR energy. The CR pressure contribution from particles in the momentum range ½p L ; p R is In the limit of p L ¼ 0, p R ¼ 1 and b l ¼ 0 we get the conventional form (see e.g. Schlickeiser and Lerche 1985) of the momentum-integrated diffusion-advection equation for the CR particle energy density together with an adiabatic equation of state relating the momentum integrated CR pressure to CR energy density P cr ¼ ðc cr À 1Þe cr with adiabatic index equal to 4/3 for a relativistic gas or its value is in the range [4/3, 5/3] if the trans-relativistic population of CR particles is significant.

Two-fluid diffusion-advection models
The development of self-consistent methods for the CR transport equation started with theoretical work by Drury and Voelk (1981), Axford et al. (1982) who integrated the kinetic equation (2) in momentum space to get a single equation for the total CR pressure. They supplemented the resulting equation to the set of fluid equations for the conventional gas. The two components, thermal gas and the population of the nonthermal CR particles were coupled by the CR pressure term. The resulting two-fluid system was applied in analytical studies of hydrodynamic shock structure in the presence of CRs. These preliminary studies have shown that even for moderately strong shock waves most of the upstream energy flux in the background plasma is transferred to cosmic rays, and thus demonstrated the importance of the CR induced dynamical effects in the system composed of CRs and thermal plasma. Schlickeiser and Lerche (1985) used the two-fluid system of equations for studies of the dynamics of interstellar gas in galactic disks. Drury and Falle (1986) presented a first numerical algorithm to model the propagation of CRs together with thermal gas in 1-D. The algorithm, based on the solution of an appropriate Riemann problem (see Sect. 4.3.1) has been used to study the stability of CR-modified shocks. Kang and Jones (1990) extended the two-fluid model of diffusive shock acceleration by inclusion of an in-situ CR injection at steady state shocks. In a companion paper Jones and Kang (1990) presented the results of time-dependent numerical simulations of CR-mediated shocks based on the two-fluid model. By means of the piecewise parabolic method (PPM) Colella and Woodward (1984) modelled the evolution of plane parallel, piston-driven shocks and spherical adiabatic blast waves. They investigated shocks that sweep up ambient cosmic rays as well as those that inject the cosmic rays directly. Jones and Kang (1993) developed a time-dependent two fluid model of CR acceleration during the impact of shocks on dense two-dimensional clouds.
Diffusive shock acceleration of relativistic protons and their dynamical feedback on the background flow was included in the two-fluid model of Jones et al. (1994) who simulated the dynamical evolution of cosmic gas clouds moving supersonically through a uniform low-density medium. In their simulations more than 10% of the initial kinetic energy of the flow was converted into a combination of thermal and CR energy. The fraction of energy going to CRs exceeded in some cases the energy transfered to gas. Frank et al. (1994) carried out numerical simulations with a new conservative total variation diminishing (TVD) MHD scheme based on a Godunov-type method by , . They treated CRs as a massless, diffusive fluid governed by a conservation equation for the CR energy derived from the momentum-dependent diffusion-advection equation (Skilling 1975a). The CR energy equation was solved using a second order monotonic advection and Crank-Nicolson scheme. The paper demonstrated first results of time-dependent two-fluid CR modified shock simulations in one spatial dimension. Frank et al. (1995) performed dynamical simulations of oblique MHD cosmicray (CR)-modified plane shock evolution. They solved the system of two-fluid CR-MHD equations and also a more complete system consisting of MHD equations and the momentum dependent diffusion-advection equation. The authors compare the results of two-fluid and momentum-dependent diffusion-advection approaches.
A comprehensive discussion of the strength and weakness of the two-fluid model was presented by Kang and Jones (1997). They found a good agreement of the twofluid model with the dynamical properties of the more detailed diffusion-advection results. They also found that the validity of the two-fluid formalism does not necessarily mean that steady state two-fluid models provide a reliable tool for predicting the efficiency of particle acceleration in real shocks. Hanasz and Lesch (2003) presented a new numerical algorithm for two-fluid diffusion-advection propagation of CRs coupled dynamically with thermal plasma, with anisotropic, magnetic field-aligned diffusion. The algorithm was implemented into the MHD code ZEUS-3D (Stone and Norman 1992a, b). The paper focussed on the development of a stable algorithm for anisotropic diffusion of CRs along magnetic field vectors defined on a staggered mesh. Details of the anisotropic diffusion algorithm are presented in Sect. 5.2. The CR propagation algorithm, involving an explicit diffusion algorithm coupled to the MHD system, has been demonstrated to be stable within the standard CFL time-step restriction Dt 0:5ðDxÞ 2 =D, where D is the diffusion coefficient. The algorithm was applied in simulations of the Parker instability triggered by cosmic rays injected by a SN remnant and subsequently in numerical experiments of the CR-driven dynamo process (Hanasz et al. 2004 and in numerical models of CR-driven winds . Snodin et al. (2006) realized that the usual Fickian approach to the diffusion, which assumes that the flux of the diffusive quantity is proportional to its instantaneous gradient, leads to several problems. They noted that the anisotropic part of CR diffusion tensor becomes singular at magnetic X-points, leading to infinite CR propagation speeds, and consequently limits the timestep of an explicit time-stepping algorithm to zero. In order to ensure finite propagation speed they modified the equation for the diffusive flux to a non-Fickian form motivated by the turbulent transport of passive scalars (Blackman and Field 2003). The resulting equation for the diffusive quantity was reduced to the form of telegraph equation containing an extra second time-derivative, which acts as the displacement current in electrodynamics. The new term was included with an artificially reduced speed of light in order to reduce the propagation speed to numerically acceptable values.

Riemann problem
The aim of this section is to introduce the Riemann problem as well as the principle of the Godunov method to clarify the terms we use in subsequent sections. The basis for the dynamical CR models are the fluid equations for the thermal gas, which can be expressed in primitive (q, u, P) or conserved quantities (q, qu, ). The differential equations of fluid dynamics in primitive form and one spatial dimension read where we use the short notation q t oq=ot. The three equations describe the conservation of mass, momentum and total energy. We define e tot ¼ qð where U is the vector of conservative quantities, F is the vector of corresponding fluxes. The equations of hydrodynamics are hyperbolic partial differential equations (see, e.g. LeVeque 2002; Toro 2009; Balsara 2017), whose time dependent solutions are given by the so-called charateristics. The characteristic curves are lines in spacelime along which U stays constant. Their slope is given by eigenvalues of the Jacobian joF i =ox j j. For the advection equation of quantity q with constant velocity u in one dimension, the characteristcs are linear functions that simply translate the quantity q. More generally we find solutions by geometrical translation of the initial state to a given point along characteristic lines. In the full set of equations we find that besides advection the system is influenced by the pressure in the gas and the corresponding sound waves c 0 ¼ ffiffiffiffiffiffiffiffiffiffiffiffiffiffi cP 0 =q 0 p . The system yields three different characteristcs, which are the ones for advection (k 0 ), the advected fluid and a left-going sound wave (k À1 ) as well as the advection and a right-going sound wave (k þ1 ). The characteristics of this problem are illustrated in Fig. 2.
In numerical setups we discretize the domain into cells i and use the discretized form of conservation laws. Integration over a control volume-a finite section of space and time-leads to the evolution equation (Godunov method) where U n i are averaged conservative quantities over the cell volume and F iAE 1 2 are the fluxes between cells averaged over the timestep.
The Riemann problem describes an initial value problem using a discontinuous initial state S with piecewise constant values on either side of the interface x 0 The time evolution is described by three types of wave structures, namely shocks, rarefaction waves and contact discontinuities. The solutions can be found by solving the so-called Rankine-Hugoniot jump conditions which represent conservation laws across discontinuities which is explicitly written as Using the Mach number we can write the ratios of densities, pressures and temperatures as follows Exact solutions can be found with some help of numerical root finding. In numerical practice it is computationally less expensive to compute approximate solutions which do not rely on root finding routines, but instead rely on purely algebraical methods, using explicitely Rankine-Hugoniot conditions (HLL). A particularly important setup is the Sod shock tube (Sod 1978) with the initial conditions The most prominent example is a shock tube, with four important interfaces, which are illustrated in Fig. 3. From left to right we highlight the head and the tail of the rarefaction wave, the contact discontinuity as well as the shock front moving to the right.
The system described so far does not include magnetic fields, which extend the set of characteristics by magnetic waves. Despite the importance of the magnetic field for CRs in general we refrain from deriving the additional equations here. This setup is for one simple thermal fluid. CRs can be included in three different manners into this setup. The first and most simple inclusion is simply the advection of the CR fluid with the gas velocity. The second possibility includes the CR pressure into the jump condition and accounts for the possibility of two different adiabatic indices for the different fluids as described in Pfrommer et al. (2006). The third possibility does not just include CRs as an existing fluid, but includes the acceleration of CRs in a phenomenological way. A fraction of the dissipated energy at the shock is converted to CRs and treated as a source term in the shocked region. This approach is described in Pfrommer et al. (2017). Shock tube solution at late time. One can clearly distringush between the unperturbed states at the very left and right, the rarafaction wave, the contact discontinuity and the shock front. Image reproduced with permission from LeVeque (2002), copyright by CUP

Extensions of the Riemann problem including CRs
To address the question of dynamical importance of shock-injected CRs during the structure formation in the KCDM universe (Pfrommer et al. 2006) derived a complete analytic solution of the Riemann shock-tube problem (Sod 1978) for the medium composed of thermal gas and relativistic CR component in two-fluid approximation. They applied their solution in smooth particle hydrodynamics (SPH) cosmological simulations designed for studies of CR energy injection at cosmological shocks. Miniati (2007) added the number density moment of the CR propagation equation (2) to the system of HD equations and performed characteristics analysis of the system of quasilinear equations describing the dynamical evolution of thermal gas combined with a spectral evolution of a CR population. The study focussed on the hydrodynamical part of the momentum-dependent algorithm (see Sect. 4.5 for details) for the evolution of CRs, introduced in the COSMOCR code (Miniati 2001). The CR population was approximated with a piecewise power-law distribution function. A set of conservation laws for the number density of CR particles in individual spectral bins, treated as passive scalars, has been supplemented to the system of conservation laws for thermal gas density, momentum and total (gas plus CR) energy density. The exchange of energy between the thermal fluid and the CR components is modeled with flux conserving methods both in configuration and in momentum space. The proposed numerical method is based on a combination of Glimm's method (Glimm 1965) preserving the discontinuous character of shocks and a higher order Godunov method (Toro 2009) in the smooth flows. Kudoh and Hanawa (2016) analyzed the CR-MHD equations and proposed a fully conservative approach to the system of equations consisting of the set of MHD equations and the equation describing the number density of CR particles in a twofluid approach. They proposed a numerical scheme providing solutions that satisfy the Rankine-Hugoniot conditions. By using the CR concentration normalized to the thermal gas density v ¼ q cr =q they have shown that their conditions are equivalent to those obtained by Pfrommer et al. (2006). They derived the corresponding Roetype MHD solver (Roe 1981) for the full system of MHD equations (with the total energy including the energy of thermal gas, cosmic rays and magnetic fields) supplemented with the evolution equation for the CR concentration. They found that solutions obtained for different forms for the CR energy equations, with source terms v Á $p cr (Kuwabara et al. 2004) or ÀP cr $ Á v Yang et al. 2012;Dubois and Commerçon 2016), differ from the Riemann solution between the shock front and the contact discontinuity. They find that the Rankine-Hugoniot relation is seriously violated when the CR pressure dominates in the postshocked gas. Gupta et al. (2021) found that the standard CR two-fluid model described in terms of three conservation laws (expressing conservation of mass, momentum and total energy) and one additional equation (for the CR pressure) cannot be cast in a satisfactory conservative form. The presence of non-conservative terms with spatial derivatives in the model equations prevents a unique weak solution behind a shock. Nevertheless, all methods converge to a unique result if the energy partition between the thermal and non-thermal fluids at the shock is prescribed a priori. This highlights the closure problem of the two-fluid equations at shocks. They extended the analysis further and made a comprehensive comparison of different discretization strategies. They constructed a full classification of the available discretization options for the two-fluid cosmic-ray hydrodynamics by comparison of numerical aspects of the 'pdv' and 'vdp' methods applied for different combinations of mathematically equivalent energy equations. They compared the cases of the gas energy equation with the CR energy equation, the total energy equation with the CR energy equation and total energy equation plus CR entropy equation in two variants including the operator split and unsplit method. After extensive numerical testing, they found that the numerical results are consistent for various reconstruction schemes only with the 'pdv' method in a fully unsplit fashion with v and p cr computed from the HLL Riemann solver values of momentum and density states.

CR injection in SN shocks
A separate issue is how to properly inject CRs in astrophysical shocks. The standard approach is to follow the assumption of Diffusive Shock Acceleration theory (DSA), relating the injection efficiency of CRs at shocks to the Mach number of the shock (Kang and Jones 2007). They find that the shock Mach number can be determined using Rankine-Hugoniot boundary conditions across shocked cells (Miniati et al. 2000;Ryu et al. 2003a;Skillman et al. 2008;Vazza et al. 2009). Vazza et al. (2012a) implement an algorithm for CR energy injection in the ENZO code and apply an approach based on the differences of the gas and CR pressure between cells. Their algorithm selects candidate shocked cells by requiring that the flow in the cell is converging: rv\0. Then the 3D distribution of cells around the candidate cell is analyzed along the three axes, and it is checked that the gas temperature T, and the gas pseudo-entropy, S ¼ pq Àc change in the same direction, rS Á rq [ 0 (Ryu et al. 2003a;Skillman et al. 2008;Vazza et al. 2009). The gradient of the temperature then sets the candidate post-shock and pre-shock cells. The Mach number is then evaluated from the information of the pressure jump between cells given by the Rankine-Hugoniot condition If the shock Mach number is known the energy flux of shock accelerated protons can be computed as where q u is the comoving upstream gas density and gðMÞ is a function of M (Kang and Jones 2007). The injected energy density of CRs within each cell results from integration of the CR energy flux over time step where q is gas density in the cell, dx is the cell size and dt is the timestep. The method has been further extended by inclusion of the effects of the upstream CR-tothermal ratio by Kang and Ryu (2013) and the magnetic obliquity by Pais et al. (2018). Pfrommer et al. (2017) presented new methods to integrate the CR evolution equations coupled to magneto-hydrodynamics on an unstructured moving mesh of the parallel AREPO code for cosmological simulations. They included an effective recipe for diffusive shock acceleration of CRs at resolved shocks based on Schaal and Springel (2015) and at supernova remnants in the interstellar medium. CR losses are included in terms of Coulomb and hadronic interactions, with CR streaming neglected. The accuracy of their model was demonstrated using simulations of plane-parallel shock tubes that were compared to exact solutions of the Riemann shock tube problem for a composite of CRs and thermal gas with effective CR acceleration. Capabilities of the new algorithm have been verified in large scale cosmological simulations and in simulations of galaxy formation.

Unified Alfvén wave regulated CR transport
One of the first attempts to incorporate the Alfvén wave regulated CR transport introduced in Sect. 1.5.1 in numerical solutions of CR propagation was undertaken by Breitschwerdt et al. (1991) in studies of galactic wind driving mechanisms. Their system of equations, following the analytical work by MacKenzie and Völk (1982), included the set of stationary MHD equations extended with the CR propagation (energy balance) equation and with a separate equation for the energy density of the fluctuating field of Alfvén waves propagating down the CR pressure gradient. A further extension of the model by Dorfi and Breitschwerdt (2012) discussed time dependent non-stationary solutions of galactic winds. Sharma et al. (2010) incorporated CR energy in the form derived by MacKenzie and Völk (1982) where P cr is the cosmic ray pressure, v is the plasma velocity and is the streaming velocity with v a ¼ B ffiffiffiffiffi ffi 4pq p representing Alfvén velocity directed down the CR pressure gradient and b ¼ B=jBj is the magnetic unit vector. Equation (63) describes the streaming of CR gas along the magnetic field direction down the gradient of its pressure.
Equation (63) is challenging for numerical algorithms because the standard upwind time-evolution algorithms with a CFL-limited time-step result in spurious oscillations at the grid scale. These oscillations arise due to unphysically large flux gradients near the minima and maxima and propagate through the computational domain. Using a hydro-based CFL (Dt ¼ CDx=v s ) leads to spurious oscillations which originate from the local extrema even for a smooth initial condition. Monotonicity can be achieved at high computational price for an upwind update by setting the timestep Dt / Dx 3 . Numerical solutions of Eq. (63) are unstable even with implicit methods and regularization of the equation by an additional smoothing term, resembling incorporation of explicit viscosity in Euler's equations is necessary. Sharma et al. (2010) proposed to replace the discontinuous streaming velocity in Eq. (63) by its smooth approximation tanhðv s =Þ, where is a small parameter. This operation introduces an additional diffusive term to the CR propagation equation with the diffusion coefficient v s = and numerical stability is achieved for the timestep limitation given by It has been demonstrated that stable numerical solutions are achievable due to the regularization. Uhlig et al. (2012) and later on Dubois et al. (2019) showed that streaming can be recasted into a diffusion term and be solved with an implicit solver for diffusion. The streaming transport term in the CR energy equation $ Á ½ðe cr þ P cr Þv s becomes and can be interpreted as anisotropic diffusive term, where D s is the diffusion coefficient corresponding to the streaming of CRs along magnetic field lines. The streaming-diffusion process has been modeled with an anisotropic implicit diffusion algorithm and is an interesting alternative to the one proposed by Sharma et al. (2010). Characteristic for the solutions obtained with the two different approaches are the flattened extrema apparent in numerical tests, such as those initiated with sinusoidal perturbations (see Fig. 4).
To avoid the ad hoc smoothing Jiang and Oh (2018) proposed a new numerical algorithm based on the two-moment methods of radiative transfer under the reduced speed of light approximation. The system of equations for the propagation of CR energy density and flux reads where V m is the maximum speed at which CRs can propagate which, for practical reasons, is reduced with respect to the speed of light c. The authors demonstrated that their results are not sensitive to the exact value of V m as long as it is much larger than the maximum Alfvén and flow velocities in the simulations. On the other hand V m can be assumed much smaller than the speed of light, which is important for an effective mitigation of the CFL condition. Essential advantage of the method is that it ensures a stable evolution for isotropic and field aligned streaming and diffusion without any regularization procedure. The timestep given by the standard Courant condition scales linearly with resolution. The full set of equations consists of the standard set of conservative MHD equations extended with the set of the two Eq. (67) and with additional CR-related source terms in the momentum and energy equations. The difference with respect to the approach by Sharma et al. (2010) is that the interaction coefficient (inverse CR diffusion coefficient) r cr becomes zero when $P cr approaches zero. In this regime the term 1 V 2 m oF cr ot becomes important, which is the main difference between these two approaches. The flat extrema in the spatial CR distribution can be understood as the result of CR streaming outwards until the flat top develops. CRs cannot stream further to produce an inverted profile because that would require CRs to stream up their gradient. An important element of the approach is the closure relation between the CR pressure tensor and the CR energy density. Jiang and Oh (2018) assume that P cr ¼ Ie cr =3. This assumption is correct only when CRs are strongly coupled to the gas, but in the case of strong wave dumping CRs may stream freely and the CR pressure can be very anisotropic. The algorithm has been implemented in the publicly available MHD code Athena?? (Stone et al. 2020). Thomas and Pfrommer (2019) developed a new macroscopic transport theory, which includes both diffusion and streaming of CRs in the selfconfinement picture. They incorporated resonant excitation of Alfvén waves through the gyroresonant instability by CRs streaming along magnetic field lines. CR scattering off these waves modulates the macroscopic CR transport. Their approach relies on the system  Sharma et al. (2010), copyright by SIAM of three equations including an equation for the CR energy density e cr and its flux density F cr in the Eddington approximation of radiative transport and the equation describing the evolution of the energy density of Alfvén waves e a;AE excited by the streaming CRs. The ± signs denote co-and counter-propagating waves with respect to the large scale magnetic field. The Alfvén-wave regulated CR transport equations [eqs. (10-16) in Thomas and Pfrommer (2019)] in the fluid (comoving) reference frame are where e cr and F cr , are measured with respect to the comoving (fluid) frame and the Alfvén wave energy density e a;AE is measured in the lab frame, b ¼ B=B is the unit vector directed along magnetic field and are the Lorentz forces due to small-scale magnetic field fluctuations of Alfvén waves. The subsystem is closed by the grey approximation for the CR diffusion coefficients linking the diffusion of CR energy density directly to the energy density of Alfvénic turbulence where X ¼ ZeB=ðcmcÞ is the relativistic gyrofrequency of a CR population with charge Ze and Lorentz factor c.
The scattering rate of CRs particles depends on the Alfvén wave energy density, so this energy density has to be dynamically evolved together with the propagation of CRs. The transport of Alfvén waves depends on several damping processes such as non-linear Landau damping, ion-neutral damping, turbulent and linear Landau damping as well as by sub-Alfvénically streaming CRs.
The CR-Alfvénic subsystem is coupled to the set of ideal MHD equations in a manner ensuring energy and momentum conservation in the non-relativistic limit of MHD with the vector of conserved variables U, the fluxes F FðUÞ, and the source terms S. The three vectors can be described as (Thomas and Pfrommer 2019) Here, q is the mass density, v is the velocity, and Q þ , Q À are the source terms of thermal energy due to Alfvén wave energy losses. For the ion-neutral damping Q inAE ¼ C in e a;AE , where C in is the damping rate for Alfvén waves due to ion-neutral collisions. For the nonlinear Landau damping Q nllAE ¼ ae 2 a;AE , where the interaction coefficient a ¼ ffiffi hki is a function of thermal velocity, magnetic field energy density and an averaged wavenumber. The turbulent damping and linear Landau damping lead to Q turbþllAE ¼ ðC turb þ C ll Þe a;AE with turbulent damping rate given by C turb ' v a ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi k k;min k mhd;turb p , where k mhd;turb is the wavenumber at which the large scale MHD turbulence is driven and k k;min $ 1=r L , r L is the Larmor radius [details see Thomas and Pfrommer (2019), and references therein].
Symbol g represents forces between CRs, Alfvén waves and the thermal gas: where g Lorentz ¼ À ð1 À b bÞÞ Á $ ? P cr ; ð76Þ are the Lorentz force due to the large-scale magnetic field and the ponderomotive force respectively, and P a;þ þ P a;À are the ponderomotive pressures. The thermal and CR energy densities are given by e th and e cr , respectively. The total pressure P and the total energy density excluding CRs are e ¼ e th þ qv 2 2 þ B 2 2 : The system of the equations is closed using an equation of state for both the thermal as well as the CR fluid, The adiabatic indices are usually set to the canonical value of c th ¼ 5=3 and to c cr ¼ 4=3 implying a fully relativistic fluid. Thomas and Pfrommer (2019) presented numerical solutions of the new CR-Alfvén wave subsystem in one spatial dimension along the magnetic field. The inclusion of the nonequilibrium kinetic effects in the Alfvén wave propagation advances significantly the treatment of CR propagation over the approaches assuming a fixed Alfvén wave background. One of the tests reported by the authors exhibits the expected property of the system that the non-linear Landau damping process reduces the wave energy, increases the CR flux density and makes the CR transport more diffusive. The formalism itself does not depend on free parameters, since the CR diffusion and the wave Landau damping coefficients are given by MHD quantities and the characteristic gyrofrequency of CR population in the grey approximation. The only tunable parameter-the reduced speed of light-is chosen so that the solution does not depend on its specific value.
The aproach by Thomas and Pfrommer (2019) recovers the equations of Sharma et al. (2010) in the steady-state limit. The major difference between the formalisms of Jiang and Oh (2018) and Thomas and Pfrommer (2019) is that the diffusion coefficient of CRs in the first case is reconstructed from the distribution of e cr while in the latter case it is related to the energy density of Alfvén waves and may be influenced by various damping processes, such as Landau damping, inaccessible directly in the former approach. The results obtained with the method by Jiang and Oh (2018) are almost identical to those obtained with the Thomas and Pfrommer (2019) in a test case initiated with a Gaussian profile of CR energy density, however a similar test involving the Gaussian profile superposed to an initial uniform CR background, shown in Fig. 5 indicates advantages of the separate treatment of Alfveń wave component in the method proposed by Thomas and Pfrommer (2019). Another interesting alternative for the direct incorporation of the streaming process on large galactic scales has been proposed by Krumholz et al. (2020) who develop theoretical models of CR propagation, relating CR diffusion to the streaming process in magnetic field line random walk (FLRW). They describe the FLRW as the macroscopic diffusion coefficient to indicate that (1) the averaging is done over scales larger than the coherence length of MHD turbulence, (2) the process relies on the Alfvén wave regulated transport, which arises from the first-order term in distribution of pitch angles, while diffusion of CR relative to the bulk flow represents a second-order microscopic diffusion. They present an explicit formula for the CR parallel diffusion coefficient (their Eq. 20) and show its dependence on energy (their Fig. 2) for three starbust galaxies and the Milky Way. They apply the computed diffusion coefficient to a simple model of CR escape and loss, and show that the resulting c-ray spectra are in good agreement with the observed spectra of the starbursts NGC 253, M82, and Arp 220. The model reproduces relatively hard GeV c-ray spectra and softer TeV spectra without the need for any fine-tuning of advective escape times or the shape of the CR injection spectrum.

Momentum-dependent diffusion-advection models
Up to now we have discussed numerical treatments of the diffusion-advection equation (2) integrated over momentum space which leads us to the two-fluid approaches applied for the combination of a thermal fluid and CRs treated as a relativistic fluid. Neglecting the evolution of the whole CR population in momentum space is a significant simplification.
However, the Fokker-Planck equation (2) can be also discretized and solved numerically in both spatial and momentum spaces to determine a time-evolution of the distribution function f ðx; pÞ in a way similar to the approaches implemented in GALPROP , PICARD (Kissmann 2014), DRAGON2  (2018) and by Sharma et al. (2010). Profiles of the CR energy distribution, energy flux and streaming spead reveal signifficant differences. Image reproduced with permission from Thomas and Pfrommer (2019), copyright by the authors (Evoli et al. 2017), but with dynamical coupling to the thermal plasma taken into account.
Including the additional momentum dimensions in full generality becomes computationally demanding because the space of independent variables expands from three to six dimensions. A practical option is to reduce the problem to four dimensions by assuming that the distribution function is isotropic. A standard discretization with piecewise constant values for f requires relatively high spectral resolution in order to obtain accurate results. Because of a typically high dynamical range of the power-low distribution function a few hundred bins are necessary to achieve a reasonable accuracy of the numerical integration (see e.g. Winner et al. 2019;Girichidis et al. 2020b). Falle and Giddings (1987) and Bell (1987) used partial differential equation solvers to investigate the time-dependent structure of plane parallel shocks propagating parallel to the large-scale magnetic field. They solved the CR transport equation (2) in momentum space, together with the transport of gas and CRs in one spatial dimension. They used first-order Godunov methods to evolve the gas in a conservative manner together with a non-conservative treatment of the CR population represented by the function g ¼ p 4 f on a uniform grid of the independent variable y ¼ ln p. The effectively smaller dynamical range of g resulted in a possibility to reduce resolution of the momentum grid. These authors applied a Lagrangian formalism for advection and diffusion of the CR spectrum and used the Crank-Nicholson method. They included injection of particles at the shock position and assumed a momentum-dependent CR diffusion coefficient.
Another way to reduce the computational cost is to consider CRs as a passive component needed only for observational diagnostics of the simulated object. In this case a smaller spatial resolution of synthetic images is usually sufficient, therefore the spectrum can be evolved for a relatively small set of Lagrangian particles. If dynamical impact of CRs on the thermal fluid is to be taken into account, the spectrum is needed for each computational element of the spatial grid. This would imply the need to process hundreds of momentum bins on multi-dimensional spatial grids leading to very high computational costs of numerical models.
To mitigate the computational demands several methods have been proposed to solve the Fokker-Planck equation using a piecewise power-law representation. The currently popular methods can be classified as one-moment and two-moment approaches, involving number density n cr and/or energy density e cr moments of the distribution function f defined in Sect. 4.1.

One-moment approaches
To achieve a more accurate evolution of the spectrum with a lower number of momentum bins one needs an accurate representation of the spectrum within a single bin. Since the observed spectra of CRs are power-laws to a good approximation, a natural choice is the numerical representation by the piecewise power-low distribution function. Jun and Jones (1999) assumed a piece-wise powerlaw electron spectrum of the type where edges of i-th bin are placed at p iÀ1=2 and the logarythmic slope q i is attributed to bin centers. They used the number density as the dependent variable representing CRs to obtain the number density moment of the CR discretized transport equation (2). They noted that only a few bins may capture the basic structure of the spectrum. They assumed a continuous spectrum at the bin boundaries. By imposing the same values of the slope q i for two lowest bins in the log of momentum space one can solve, with the use of continuity assumption, the system of equations recursively for the entire spectrum. The method has been combined with the ideal MHD equations for an adiabatic plasma, solved using the ZEUS-3D Eulerian code (Stone and Norman 1992a, b;Jun et al. 1994). The advection equation for the electron number density was solved by the standard fluid transport algorithm and the adiabatic compression term was updated implicitly for the time-centered number density of electrons in each momentum bin. The CR electrons were treated as a passive population without a significant pressure affecting the thermal plasma component. The diffusion term in the CR propagation and the synchrotron losses were ignored. The code has been used to study the MHD evolution of a Type Ia supernova remnant with CR electrons and for SN interactions with clouds in two spatial dimensions.
Using the piecewise-powerlaw approximation within another code Jones et al. (1999) solved the equations of ideal nonrelativistic magnetohydrodynamics (MHD) in cylindrical coordinates. Tregillis et al. (2001) used the method combined with 3D MHD simulations of electron transport in radio-galaxies.
The code is an MHD extension of the Harten (1983) conservative, second-order finite difference total variation diminishing scheme, as detailed in , Ryu et al. ( , 1998. The electron transport equation was derived from the general momentum-dependent diffusion advection equation (2) with the effects of spatial diffusion neglected. To properly treat the particle density at shock discontinuities their algorithm evolved the normalized concentration of relativistic particles b i ¼ n i =q, where q is the gas density.
The one-moment piece-wise power-law method proposed by Jun and Jones (1999) is relatively straightforward and computationally inexpensive, however, there is a fundamental problem with a continuous description of f in particular for a locally varying spectrum. Numerical tests (Girichidis et al. 2020b) show that if energy is injected at one part of the spectrum, the continuity assumption enforces changes of the local slope across the entire spectrum. The resulting continuous representation then alternates between a concave and a convex spectrum. Artificial oscillations superposed to the initially smooth spectrum have a tendency to grow towards high energy bins revealing a numerical instability of the scheme. To avoid these oscillations, the energy in all bins need to be adjusted. The problem is less severe for a small number of bins, equal or less than 10. An alternative for the continuous spectrum is the discontinuous modeling, which requires to constrain two degrees of freedom. Within the one-moment approach one may consider the additional restriction imposing a constant curvature of the spectrum through the relation q iþ1 À q i ¼ q i À q iÀ1 , which turns out to be suitable only for CR nucleons, but not for the fast cooling electrons (Miniati 2001).
An interesting one-moment approach to the propagation of CR electrons, not affected by the numerical instability described above, was proposed by Mimica et al. (2009) and implemented in SPEV code and followed by Vaidya et al. (2018) who incorporated their algorithm in the PLUTO code. These authors presented a hybrid framework which describes the spectral evolution of highly energetic particles by means of (mesh-less) Lagrangian macro-particles embedded in a classical or relativistic MHD fluid. The main purpose of this work was the inclusion of sub-grid micro-physical processes at macroscopic astrophysical scales where the fluid approximation is adequate. The MHD equations were integrated by means of standard Godunov-type finite-volume schemes, while propagation of macroparticles representing CRs is described by the relativistic version of the cosmicray transport equation (2). In the numerical implementation the number density moment of the equation corresponding to Eq. (32) was solved for the CR particle number density normalized to the fluid number density v p ¼ n p =n (Vaidya et al. 2018). Spatial diffusion of CR particles as well as back-reaction from particle to the fluid were not included. The novel aspect of their approach is the Lagrangian approach in momentum space assuming that bin edge positions drift in accordance with the flow of particles in momentum space due to actual cooling and heating processes. Position changes of bin-edges were calculated by means of the method of characteristics in momentum space reducing the problem to the set of ordinary differential equations for each Lagrangian particle moving in physical coordinate space. The Lagrangian discretization in momentum space implies that the particle number within each bins remains constant. The power-law index q needed for computing of synchrotron emissivity was computed a posteriori.

Two-moment approaches
Another method proposed by Miniati (2001) binds the pair of distribution function parameters ðf iÀ1=2 ; q i Þ to its two moments: number density n i and energy density e i . In absence of energy loss processes (radiative and adiabatic losses) these two moments are conservative quantities and therefore they are suitable for an accurate evolution with conservative transport schemes in both the spatial and momentum domains. A summary of the two-moment piece-wise power-law method is presented in Sect. 5.3.
Using a similar approach Jones and Kang (2005) modelled the propagation of CRs including advection plus diffusion in space and advection in momentum. The propagation equation does not include any strong cooling effects, such as the ionization or synchrotron cooling which act predominantly on one of the edges of the spectrum. Therefore the spectrum can be processed in an arbitrarily chosen range of momentum space, with the momentum boundary condition setting the particle number density to zero at the high energy end. At the low energy range of the spectrum the distribution was matched to the thermal particle distribution. The authors refer to the method as to ''Coarse-Grained Momentum finite Volume'' or ''CGMV''. It extends the ideas introduced in Jones et al. (1999), Miniati (2001) for the case of CR transport across an Eulerian grid including advection and diffusion. Jones and Kang (2005) extended the CGMV method so that it can be applied to the treatment of fully nonlinear CR modified shocks. The algorithm has been implemented in the CRASH code which is based on the high order Godunov-like shock tracking algorithm. The hydrodynamics routine in that code employs a nonlinear Riemann solver to follow shock discontinuities within the zones of an initially uniform grid. Thus, gas sub-shocks in CR-modified shocks remain discontinuous throughout a simulation, allowing CR transport to be modeled down almost to the scale of the physical shock thickness. CRASH also employs adaptive mesh refinement (AMR) around shocks in order to reduce the computational effort on the spatial grid.
With the aim of studying the Fermi bubbles (kpc-scale gamma-ray features centred on the Galactic Centre) Yang and Ruszkowski (2017) develop a new CRSPEC module for FLASH code to track the CR spectrum during MHD simulations. They construct a scenario for the Fermi bubbles resulting from AGN jet activity in the center of Milky Way. The specific features of the Fermi bubble's spectra including their spatial uniformity, shapes of the spectra and high energy cutoff at 110 GeV suggest a leptonic origin. The physical nature of the problem requires, therefore, a method to evolve the spectra of CR electrons dynamically together with the fluid-dynamical evolution of the expanding bubble. The method they adopted to trace the CR electrons spectral evolution follows the piece-wise power-law approach by Miniati (2001). They solve the two-moment set of equations for a population of CR electrons advected along trajectories of a set of tracer particles while CR diffusion neglected. They assumed a hydrodynamical evolution of the thermal component and a fixed magnetic field taken from GALPROP models. In this sense their approach represents a intermediate step between the CR propagation in a fixed plasma background and the self-consistent models. Winner et al. (2019) presented an efficient post-processing code for Cosmic Ray Electron Spectra that are evolved in Time (CREST) on Lagrangian tracer particles. The novel element of their method is the division of the overall spectrum into three parts treated with different methods. They note that for very low momenta the timescale related to the Coulomb cooling process becomes shorter than the typical time step of MHD simulations. Similarly, for very high momenta the synchrotron and inverse Compton energy losses proceed on arbitrarily short timescales. In order to efficiently calculate the CR electron spectrum with time steps similar to the MHD time step, they use analytical solutions for low and high momenta together with the fully numerical treatment for intermediate momenta. The situation is outlined in Fig. 6.
The analytical approach relies on the consideration of the momentum losses corresponding to the cooling terms in the CR transport equation (2). The momentum losses integrated over a time step lead to the change of the distribution function due to the cooling which shifts particles with from p ini to p within Dt The cooled spectrum can be interpreted as a momentum shift of the initial spectrum at time t 0 multiplied with a momentum-dependent cooling factor. In the central part of the spectrum the CR transport equation (2) is solved with the aid of direct numerical integration relying on the operator split technique. The spectrum is calculated on a discrete momentum grid with piecewise constant values. A cubic interpolation function is used to calculate the analytic solution after a time step Dt. The combination of analytical and numerical methods enables the integration of the spectra on reasonably large MHD time steps. The hybrid analytical-numerical method is implemented in the CREST module and coupled to the cosmological MHD code AREPO (Springel 2010;Pakmor et al. 2016c) with CR protons modeled as a relativistic fluid with a constant adiabatic index of 4/3 in a two-fluid approximation (Pfrommer et al. 2017). The coupling of CR propagation to the MHD system is done with the aid of Lagrangian tracer particles, which trace the velocity field (Genel et al. 2013) and are passively advected with the gas. The CR electron transport equation is solved for each particle in a post-processing routine on every MHD time step. The hybrid technique enables modeling of the spectral CR electron evolution such as adiabatic expansion and compression, Coulomb losses, radiative losses: inverse Compton, bremsstrahlung and synchrotron processes, diffusive shock acceleration and re-acceleration, and Fermi-II reaccelleration. The authors demonstrate that the CR electron spectra are efficiently and accurately evolved in shock-tube and Sedov-Taylor blast wave simulations. This technique opens up the possibility to produce self-consistent synthetic observables of nonthermal emission processes in various astrophysical environments like SNR (Winner et al. 2020).
A similar technique based on a combination of the numerical integration with the aid of piecewise power-law method and the analytical prescription of the Coulomb cooling has been applied by Girichidis et al. (2020b). The analytical solution helps to overcome the excessive shortening of the integration timestep whenever the cooling timescale is significantly shorter than the hydrodynamical timescale. The numerical integration scheme incorporated the two-moment approach by Miniati (2001). The original algorithm was modified in the part responsible for cooling losses, that were computed through a modification of inter-bin fluxes, rather than by inclusion of source terms (incorporated through the 'R'-term in the original approach, see Sect. 5) to the evolution equation for the spectral energy density of CRs. The modification enables a more accurate computation of the spectral slope and is a less-diffusive alternative for the propagation of CR population in momentum space. Tests of the algorithm have confirmed that a combination of injection and cooling of CRs including Coulomb and hadronic losses shows a very good agreement with theoretical steady state spectra.
Solutions of the CR transport equation (2) including advection, diffusion and spectral evolution coupled with the MHD evolution of the thermal plasma turn out to be challenging. Up to now only approaches restricted to a subset of physically relevant processes have been successful. The GALPROP-type codes include an extensive set of physical interactions of CR particles under the assumption of fixed magnetic field, the CGMV method of Jones and Kang (2005) is capable of modeling spectral evolution together with diffusion, advection and dynamical coupling of the CR population to the thermal plasma, but cannot cope with the fast cooling processes. Other implementations (Miniati 2001;Winner et al. 2019) engaging Lagrangian test particles include particle acceleration and various cooling processes, but ignore diffusion and dynamical coupling to thermal plasma.
Limitations of the original CGMV method become severe in the presence of fast cooling processes, such as synchrotron emission, which naturally tends to evacuate all particles from high energy bins, producing steep spectra near the upper cutoff. To avoid these limitations Ogrodnik et al. (2021) extended the CGMV method with movable boundaries, which change in response to CR momentum gains or losses near the extremes of the population distribution. The extension relies on an operational definition of spectral cutoffs which define spectral boundaries of the spectrum. The extension involves a special treatment of momentum bins containing spectral cutoffs. Contrary to the regular bins of fixed width, those bins have variable-width, and their outer edges coincide with spectral cutoffs.
The CR spectra may have different cutoffs in each spatial grid cell, because local cooling conditions (magnetic and velocity fields) are generally different. Due to advective and diffusive propagation of the CRs across the spatial grid, different populations of particles mix in every cell. The essential part of the numerical problem is to estimate an effective cutoff for the mixture of different populations inflowing from neighboring cells with different cutoffs. The cutoff positions are estimated from the particle number density and energy density in the outer bins for an assumed value of an additional parameter e small representing the smallest physically significant level of CR spectral energy density. Estimation of cutoff positions from the number density and energy density moments enables a numerically stable computation of power-law indexes in the active bins, especially those containing the cutoffs, what would not be possible in the case of fixed boundary conditions in momentum space.
The algorithm is designed to follow spectral evolution of CRs coupled with an MHD system on Eulerian grids and is particularly useful for modeling of CR electron population subject to strong synchrotron and inverse Compton cooling in the high-energy part of the spectrum. The algorithm is suitable for studies of CR electron spectrum evolution in MHD simulations of the galactic interstellar medium.

Advances in theory of CR propagation
An important development has been to explicitly compute the trajectories of charged particles numerically in the magnetised ISM, thus going beyond the analytical approximations of e.g. the quasi-linear theory. In principle this involves just the Lorentz force and Maxwell's equations, i.e. very basic physics. This approach was pioneered by Jokipii (1994, 1999). The turbulent magnetic field is modelled as a superposition of modes with a power-law spectrum, using Alfvén waves. The trajectories are computed for large numbers of particles, and then statistical properties-diffusion tensors etc-are calculated (e.g. Reville et al. 2008). The diffusion coefficient perpendicular to the B-field is found to be much smaller than the parallel coefficient. Casse et al. (2002) extended this approach to higher energies and a wider range of environments, including SNR, superbubbles and radio-galaxies. Desiati and Zweibel (2014) show that even simple magnetic nonuniformities combined with pitch angle scattering can enhance cross field line transport by several orders of magnitude, while pitch angle scattering is unnecessary for enhanced transport if the field is chaotic. Perpendicular diffusion remains relatively small compared to parallel diffusion. Snodin et al. (2016) study CR diffusion by means of particle simulations in turbulent magnetic fields. They obtain direct estimates of the diffusion tensor from test particle simulations in random magnetic fields, with the Larmor radius scale being fully resolved and find that diffusion coefficients obtained are consistent with existing transport theories that include the random walk of magnetic lines. Shukurov et al. (2017) calculate cosmic-ray diffusivity in intermittent dynamo-generated magnetic fields using test particle simulations. The results are compared with those obtained from non-intermittent magnetic fields having identical power spectra. The presence of magnetic intermittency significantly enhances cosmic-ray diffusion over a wide range of particle energies. The authors demonstrate that the results can be interpreted in terms of a correlated random walk. Seta et al. (2018) use test particle simulations, tracing the propagation of charged particles (protons) through a random magnetic field, to study the cosmic ray distribution at scales comparable to the correlation scale of the turbulent flow in the interstellar medium (* 100 pc in spiral galaxies). They find that there is no spatial correlation between the cosmic ray number density and the magnetic field energy density. Low-energy cosmic rays can become trapped between magnetic mirrors, whose location depends more on the structure of the field lines than on the field strength. These results are relevant for interpreting synchrotron emission data which often assume energy equipartition of CRs with the magnetic field energy.
Meanwhile observation-based models of the Galactic magnetic fields and related theory are improving, and these can be input to CR propagation studies, as addressed elsewhere in this review.

Spatial diffusion of spectrally resolved CRs
In contrast to advection, spatial diffusion includes an interaction of spatial and spectral changes. Being energy conserving, spatial diffusion itself does not transfer CRs in momentum space at one position in space, i.e., of =op ¼ 0j x . But the transport of number and energy density to neighbouring cells depend on the spatial derivatives of f. Depending on the values in neighbouring fluid elements and combined with a different diffusion tensor for n and e will result in changes of both amplitude f iÀ1=2 and slope q i within one spectral bin, and with that the other physical processes in the following evolution. We derive the terms for number and energy density separately. For the former one the spatial diffusion in momentum range ½p 1 ; p 2 is given by with the diffusion tensor D, The components include the orientation of the magnetic fields. The components are chosen to be (Ryu et al. 2003b), with normalised components of the magnetic field, b i ¼ B i =jBj. The dependency on momentum is expressed as with p 10 ¼ 10 GeV=c. Parallel and perpendicular diffusion coefficients with respect to the magnetic field are denoted as D k;10 and D ?;10 . Equation (86) can be solved directly by replacing the individual components yielding Here, o k ¼ o=ok is the partial derivative with k 2 fx; y; zg, so the spatial derivatives of f need to be taken into account. Alternatively, we can write where the equation formally takes the form of a simple diffusion equation with only a modified diffusion tensor D n h i. The individual components of the diffusion tensor take the form For D ij ¼ D 0 ij ðp=p 10 Þ a we find In an analogous way we treat the energy density, with the modified diffusion coefficients which can be rewritten to include the scaling with momentum The full computation of the spectrally resolved diffusion involves many derivatives and introduces numerical errors. The usual approach of applying limiters is a very powerful method for conserved quantities. For a simple numerical representation of the particle distribution function, e.g. a piecewise constant representation, a limiter can simply be applied. But if more complex representations are chosen like piecewise powerlaws as in e.g. Girichidis et al. (2019), the derivatives include the terms that scale with the spatial derivative of the amplitude as well as the derivative of the slope. Applying limiters to those individual terms are not simply applicable as the quantities are not conserved. A simple approximate solution to the full problem described above is derived in Jones and Kang (2005), where the number density weighted diffusion coefficient reads The corresponding energy weighted counterpart can be expressed as with g i D=p 2 i ¼ R p iþ1 p i pDf dp .

Numerical treatment of CR diffusion
Numerically solving diffusive CR transport faces two main challenges. First of all, the diffusion equation is an elliptic PDE, which means that the solution propagates infinitely fast, i.e. the domain of dependence is the entire domain. Ideally, one would like to solve the diffusion equation with an implicit algorithm including the entire domain. However, this is numerically demanding and approximate solutions are favoured. The solution using an explicit algorithm requires to satisfy the explicit von Neumann stability criterion with a maximum integration time step where Dx is the resolution of an individual cell and D is the diffusion coefficient.
The second complication is that the diffusion of CRs is highly anisotropic with the largest diffusion parallel to the magnetic field lines. The diffusion coefficient thus extends to a diffusion tensor, with coefficients describing the orientation of a flow with respect to the magnetic field orientation. For an explicit algorithm the time step criterion changes to in the case of a dimensional split numerical scheme, where D k and D ? are the parallel and perpendicular diffusion coefficients, repsectively, and C cr \1 is the Courant number (e.g. Hanasz and Lesch 2003). In the case of an unsplit scheme the numerical coefficient changes to 0.3. Whereas the implicit solution for isotropic diffusion can simply be computed using the Green's function, the anisotropic case does not have a simple form of the Green's function. Therefore, iterative methods are used.

Anisotropic diffusion on a regular mesh
In this section we present details of the implementation of the anisotropic diffusion algorithm presented by Hanasz and Lesch (2003). The algorithm has been designed for the staggered mesh setup of the ZEUS-3D MHD code, with the MHD algorithm involving the constraint transport (CT) (Evans & Hawley 1988) method to ensure the divergence-free evolution of the magnetic field. The staggered mesh locates the individual components of the magnetic field on different faces of grid cells: B x on yz faces, B y on xz faces and B z on xy. The CR energy density is located in cell centers and fluxes of CRs are located on cell faces. Different centering of the relevant quantities implies the need of interpolation ensuring numerical stability of the anisotropic diffusion algorithm. The CR diffusion tensor D depends on the spatially variable magnetic field B. Let us consider the diffusive part of the diffusion-advection equation (37) with the diffusion tensor given by Eq. (88): In the discrete representation, the 3-dimensional conservation law reads e nþ1 cr;i;j;k ¼ e n cr;i;j;k À Dt Dx F cr;iþ 1 2 ;j;k À F cr;iÀ 1 2 ;j;k À Dt Dy F cr;i;jþ 1 2 ;k À F cr;i;jÀ 1 2 ;k À Dt Dz F cr;i;j;kþ 1 2 À F cr;i;j;kÀ 1 2 ; ð103Þ where e n cr;i;j;k and e nþ1 cr;i;j;k are volume averaged CR energy densities in the cell i, j, k, and F cr;iÀ 1 2 ;j;k , F cr;iþ 1 2 ;j;k are the CR fluxes through the left and right cell boundaries, in x-direction. The fluxes appearing in equation (103) should be understood as timeaveraged fluxes through respective cell boundaries. If we approximate these fluxes by their values computed at the time-level t n , then we obtain an explicit algorithm for the numerical integration of the CR diffusion equation. To compute the diffusion tensor components at cell faces, we need all components of the unit vector n, parallel to B at each cell face. We start with computing magnetic field aligned unit vectors at cell-faces perpendicular to the x-axis: and then do the same for the other faces. The magnetic field component B x is already located at proper faces, but the location of B y and B z is different, thus an interpolation is necessary. The linearly interpolated magnetic field vector located at ðx iÀ 1 2 ; y j ; z k Þ is therefore The interpolation scheme for B y , at the cell-face perpendicular to x-axis, is shown in Fig. 7.
To compute the diffusive fluxes of CRs across cell faces, one needs all components of the CR energy density gradient at each cell face. The components of re cr , perpendicular to cell faces, are centered at cell-faces, e.g.: o x e cr ð Þ ðiÀ 1 2 ;j;kÞ ' 1 Dx ðe cr;i;j;k À e cr;iÀ1;j;k Þ; however, an interpolation is needed for the remaining components of CR energydensity gradient. The procedure follows in three steps, as depicted in Fig. 8: 1. interpolation of e cr to centers of cell faces (1) 2. computation of left-and right finite differences of e cr , with respect to coordinates parallel to cell faces, at positions (2) Fig. 7 Interpolation of magnetic-field components to cell-faces shown in 2D projection onto xy-plane o y;l e cr 1 2Dy ðe cr;iÀ1;j;k þe cr;i;j;k ÞÀðe cr;iÀ1;jÀ1;k þe cr;i;jÀ1;k Þ À Á ; ð107Þ o y;r e cr 1 2Dy ðe cr;iÀ1;jþ1;k þe cr;i;jþ1;k ÞÀðe cr;iÀ1;j;k þe cr;i;j;k Þ À Á : 3. computation of face-centered slopes at positions (3) We note that the monotonization of re cr components (step 3.), parallel to the considered cell faces, is essential for stability of the overall (CR?MHD) algorithm. The same procedure is applied to get the monotonized slope of e cr in the zdirection.
o z e cr ð Þ ðiÀ 1 2 ;j;kÞ ' 1 4 o z;l e cr þ o z;r e cr Þð1 þ signð1; o z;l e cr o z;r e cr Þ À Á : Computation of ð$e cr Þ i;jÀ 1 2 ;k and ð$e cr Þ i;j;kÀ 1 2 follows in the analogous way. Another variant of transverse flux computation, designed to avoid negative CR energy densities when there is a large gradient, has been applied by Yang et al. (2012) following Sharma and Hammett (2007). Their implementation relies on a At this point we are ready to compute all CR-fluxes F cr ¼ ÀD$e cr at all cellfaces. We apply the directional-splitting technique to compute the total diffusive change of CR-energy density, in a single timestep. Subsequent updates of CRenergy density proceed in a directionally split manner: where e nþa cr;i;j;k is CR-energy density after the advection step, e nþb cr;i;j;k and e nþ cr;i;j;k are CRenergy densities after these quantities have been updated in x and y-directions, respectively For more recent models of finite difference and finite volume algorithms for anisotropic diffusion we would like to point to van Es et al. (2014van Es et al. ( , 2016.

A semi-implicit extension
Sharma and Hammett (2011) consider purely parallel diffusion along the magnetic field lines, i.e. D ? ¼ 0, Defining the components of the diffusion operator on the right hand side of Eq. (114) as then the method can be formulated as An extension to adaptive mesh refinement using an implicit scheme was introduced by Dubois and Commerçon (2016) in the RAMSES code (Teyssier 2002). In their algorithm the general anisotropic diffusion is split into an isotropic component and a component parallel to the magnetic field, The updated CR energy in cell i, j is given (in 2D) by for cell position i, j. These quantities are evaluated with the centred symmetric scheme proposed by Günter et al. (2005) for the anisotropic part of the flux. The anisotropic flux at cell interfaces F ani iAE1=2;j and F ani i;jAE1=2 are evaluated from their cell corner fluxes F ani iAE1=2;jAE1=2 , thus The anisotropic cell corner flux is where barred quantities are arithmetic averages over the cells connected to the corner. The authors use the diffusion algorithm introduced in Commerçon et al. (2014) including AMR and adaptive time-stepping on a level-by-level basis. Each refinement level ' is evolved with a corresponding time step Dt ' , where each time step is twice as large for each coarser level ' À 1, Dt ' ¼ Dt 'À1 =2. This means that level ' evolves with two consecutive time steps before one time step of level ' À 1 is applied. Each level of refinement ' is connected to two types of non-uniform interfaces, namely the fine-to-coarse interface (between level ' and ' À 1) and the coarse-to-fine interface (between level ' and ' þ 1). Dubois and Commerçon (2016) use Dirichlet boundary conditions, where the cell values at level boundaries are imposed. On the other hand, for Neumann boundary conditions the fluxes are imposed, which are needed to guarantee energy conservation similar to hydrodynamical solvers.
At the interface from fine-to-coarse, Dubois and Commerçon (2016) use the values of ' À 1 at time n as an imposed boundary condition for level '. They use the minmod scheme (van Leer 1979) and interpolate the values of level ' À 1 on a finer virtual grid at level ' to determine the fine-to-coarse boundary. For the coarse-tofine interface they use values of ' þ 1 at time n þ 1 as the imposed boundary conditions for level '. The value of the boundary coarse cell at level ' is restricted to the average value of the 2 dim cells of level ' þ 1 to impose the coarse-to-fine boundary. Equation 119 remains correct since the diffusion solver only deals with data estimated at the same level of refinement. The combination of Dirichlet boundary conditions at the level interfaces and the interpolation or restriction operations does not break the symmetry of matrix A. The imposed values of neighbouring cells at different levels of refinement enter the right-hand side (in vector c) of the above matrix system Ax ¼ c. Pakmor et al. (2016a) implement the diffusion of the CR strictly along the magnetic field lines

Diffusion on a Voronoi mesh
with the CR energy density e cr and the parallel diffusion coefficient D k . Similar to Günter et al. (2005) and Sharma and Hammett (2007) they compute the gradient estimates at the centre of an interface between cells based on the gradient estimates at the corners. In three dimensions every corner of a Voronoi cell connects to four adjacent cells. The residual of the gradient fit reads with the unknown value at the corner of a cell / c ð Þ, the gradient at the corner position c, $/, and the value at the centre of the cell s i , /ðs i Þ. The residual can be rewritten in matrix form where r, q, and Y are N-vectors (N ¼ 4), and X is an N Â N matrix. The individual components are defined as The residual is then minimized by solving There is a unique solution for q with zero residual because X is a square matrix. In order to solve for q a multiplication with X T X À Á À1 from the left is used, which with M ¼ X T X À Á À1 X T . Here, M only depends on the geometry of the mesh. The vector Y only contains values at the cell centers. Therefore, M only needs to be computed once every timestep as the geometry of the mesh does not change. One can obtain the value as well as the gradient of any quantity at a corner from a simple matrix-vector multiplication. Moreover, Eq. (125) shows the linear dependence of the gradient at the corner on the values in the adjacent cells. This fact is needed for the implicit time integration. The estimate of the gradient is given by $e cr;n ¼ e cr;L À e cr;R c L À c R j j The solution is obtained by a semi-implicit time integration, In a second step, the CR energy is advanced according to the fluxes associated with the normal component of the CR energy density gradients computed at the interfaces. This is done with an implicit backward Euler step: The term $e nþ1 cr;n;ij scales linearly with the CR energy densities. This yields a system of coupled linear equations, which can be solved using a matrix solver. Pakmor et al. (2016a) solve the linear system in a two-step procedure using the solvers from the HYPRE library (Falgout and Yang 2002).

Numerical scheme for selfconsistent, spectral CR transport
Here we summarise the general framework formulated by Miniati (2001), which we extend to middly relativistic and non-reativistic ranges of CR momenta.

Evolution of the isotropic CR spectrum on the momentum grid
We assume a piecewise power-law, isotropic (in momentum space) distribution function where the distribution function amplitudes f iÀ 1 2 are defined on left edges of momentum bins p iÀ 1 2 and the spectral indices q i are attributed to bin interiors. The number density of particles in a single momentum bin is The particle density equation in the discretised form (spatial propagation terms and local sources are neglected here) reads where b(p) is the loss term. We integrate Eq. (132) over the timestep interval where Dn Dt We shall approximate kinetic energy of CR particles by another piecewise powerlaw function defined for each momentum bin separately where the function amplitudes g iÀ 1 2 are defined on left edges of momentum bins p iÀ 1 2 and the power indices s i are attributed to bin interiors. The amplitudes g iÀ 1 2 are computed directly by evaluation of gðp iÀ 1 2 Þ on bin edges and the corresponding slopes s i are Thus, the general relation between energy and distribution function amplitudes is Energy equation in the discretised form reads The energy loss term (the integral) on the right-hand side of (139) contains the derivative of kinetic energy Since T(p) is approximated by a power-law function we can use Eq. (136) to represent its derivative We substitute the second right-hand side term in equation (139), by e i R i : where R i expresses the energy loss rate per unit energy For adiabatic cooling To deal with the full (relativistic and non-relativistic) energy range we substitute (141) in (143) where the dots stand for other cooling mechanisms. By elimination of the energy density we find that We integrate now Eq. (142) The following part of this section will detail the calculation of the Dn Dt iÀ1=2 and De Dt iÀ1=2 terms, which involves fluxes of particles and their energies through bin boundaries. The procedure relies by construction on the upwind computation of the fluxes in momentum space.
Computation of upstream momentum p u . Particles loose or gain momentum due to physical effects underlying the source term b(p) in Eq. (139) bðpÞ À dp dt We integrate (150) for the case of the adiabatic process In the case of heating the amount of particle energy transferred from i-th bin through the right bin-face:

Conversion between quantities
The spectral behaviour is best computed using the Fokker-Planck equation using the amplitudes f and the slopes q. For the coupling to hydrodynamical simulations is more convenient to use number density and energy density. Both sets of quantities (f, q and n, e) contain equivalent information, so it is possible to use f and q for the spectral evolution and then convert to n and e for the coupled part to hydrodynamics. The computation of n and e based on f and q is straight forward. The backward conversion can be done by solving the ratio of e i =n i for q i . The amplitude f iÀ1=2 cancels in the ratio and the solution for q i is unique.
6 Astrophysical applications 6.1 Modeling galactic CR propagation with GALPROP 6.1.1 Examples of GALPROP model predictions Figure 9 shows example predictions for a typical GALPROP model: protons, secondary-to-primary boron/carbon ratio, primary electrons and secondary positrons. This model includes diffusive reacceleration, and breaks in both primary injection spectra and diffusion coefficient. The halo height is 4 kpc. The predictions are for the solar position in the Galaxy (though predictions are made for the entire Galaxy). The parameters can be found in the GALPROP parameter file with name galdef_54_reltest23 supplied with the code. Interstellar and example modulated results (simple force-field approximation) are shown. Sample data are also shown for comparison; the fit is reasonable, although no attempt has been made to use the latest available data, nor is any attempt made here to fit the data; this is covered in literature (Grenier et al. 2015;Boschini et al. 2020a, b). Figure 10 shows sample all-sky maps from hadronic interactions with the interstellar gas, and synchrotron radiation from electrons and positrons on the magnetic field. The gamma-ray maps are generated by GALPROP using the computed p and He spectra over the Galaxy, and hadronic cross-sections and the GALPROP neutral, molecular and ionized hydrogen and Helium model, and integration over the line-of-sight from the solar position. The synchrotron maps use The plots were made with the GALPLOT package, available at https://gitlab.mpcdf.mpg.de/aws/galplot the electron and positron spectra over the Galaxy, the magnetic field model, and the full formula for synchrotron including the directional variation with respect to the magnetic field for the regular component. Such models can be directly compared to gamma-ray observations e.g. by the Fermi-LAT instrument and radio surveys by radiotelescopes and the Planck satellite.

Diffusive reacceleration and alternatives
Diffusive reacceleration has been a process frequently included since it explains B/ C without an ad-hoc break in D xx ðpÞ and is compatible with the Kolmogorov index 0.33 in this term. It is simply diffusion in momentum due to the gain and loss of momentum off moving scatterers; hence there is a basic relation between D pp and D xx (see e.g. Strong et al. 2007). However the question remains of the actual importance of this process in the insterstellar medium. In fact in the models with large reacceleration, a significant amount of energy is being injected into CR from the ISM, so that CR acceleration involves more than just the standard sources like SNR. The source of this energy poses a problem. The issue has been recently addressed by Thornbury and Drury (2014). This article has a useful and clear derivation of the reacceleration formula and clarifies its relation to the original Fermi second-order mechanism. In future the energetics involved should be studied in more detail. Reacceleration at the level often invoked seems incompatible with low-frequency synchrotron spectrum which is sensitive to the electron and positron spectra around a few GeV.
On the other hand, pure diffusion models without reacceleration do have a problem to reproduce B/C without either a very large break in D xx ðpÞ  or an additional velocity dependence in D xx ðpÞ (Ptuskin et al. 2006); the latter paper also invokes dissipation of MHD waves as an alternative way to produce the required D xx ðpÞ. The use of B/C depends however critically on the level of solar modulation used to convert the interstellar ratio to the observed one; a lower level of modulation than the value %500 MV often adopted would ease the tension. Lave et al. (2013) compare both reacceleration and pure diffusion models with B/C data including the latest ACE data at a few 100 MeV; a modulation potential of 250 MV is adopted there. More likely convection plays a role in the sub-GeV/nucleon range as well, since it gives an energy-independent escape time.
It is also possible to include the effect of CR on the diffusion coefficient Ptuskin et al. (2006). They studied the possibility that the nonlinear MHD cascade sets the power-law spectrum of turbulence that scatters charged energetic particles. They found that the dissipation of waves due to the resonant interaction with cosmic-ray particles may terminate the Kraichnan-type cascade below wavelengths 10 13 cm. The effect of this wave dissipation has been incorporated in the GALPROP numerical propagation code in order to asses the impact on measurable astrophysical data. The energy dependence of the cosmic-ray diffusion coefficient found in the resulting self-consistent model may explain the peaks in the secondary to primary nuclei ratios observed at about 1 GeV nucleon -1 .

CRs in star formation
Star formation occurs in the densest and coldest regions of the turbulent interstellar medium (Mac Low and Klessen 2004;McKee and Ostriker 2007;Girichidis et al. 2020a). Typical densities of star forming cores exceed 10 6 cm À3 where the gas is as cold as $ 10 K. The spatial extents are small fractions of parsecs. The role of CRs in these regions includes several aspects. As CRs penetrate deeply into the protostellar cores, they deposit energy into regions which are opaque to electromagnetic radiation (see e.g. recent review by Padovani et al. 2020). This effectively sets a minimum temperature on the gas temperature in dense gas. As the Jeans mass scales with the temperature as T 3=2 the impact of CRs on the fragmentation might be dynamically relevant. The second aspect is the impact on the chemical composition. With their high energy, CRs are able to unbind and ionize many molecules. The changes in the chemical reactions alter the observable tracers and contain valuable information on the local conditions (e.g. Bisbas et al. 2017). In the context of star formation the main focus is on low-energy CRs because of their enhanced cross section with the thermal particles. Integrated, the CR energy density in the lowenergy component is generally not comparable to the other energy densities. Consequently, the CR pressure is also not directly driving the motions of the gas and to first order CRs can be treated as tracer particles or a tracer fluid.
Overall, the time scales for CR transport through star forming regions are much shorter than the dynamical times scales of the gas. This means that adiabatic gains and losses of CRs can be neglected contrary to galactic scales where advective and diffusive time scales are comparable. Given the strong magnetic field strengths of * mG (Crutcher 2012) and the low energy of CRs (.GeV) the gyro-radii are smaller than the spatial extents of star forming cores and the CRs can be followed collectively following the magnetic field lines along the path s.
The distribution function is generally a function of the pitch angle, which depends on the local strength of the regular magnetic field, and CR scattering that occurs due to resonant CR interactions with magnetohydrodynamic (MHD) turbulence and gas nuclei. Depending on the degree of pitch angle scattering, there are two CR transport regimes-free streaming and diffusive (Padovani et al. 2018;Silsbee and Ivlev 2019;Padovani et al. 2020). The free-streaming approximation [also known as the continuous slowing-down approximation, CSDA, (see e.g. Takayanagi 1973;Padovani et al. 2009)] is the most common approach used to calculate the propagation of CRs in molecular clouds. Scattering processes are inefficient in this regime, so that the resulting mean squared deviation of the pitch angle along a CR track is small, so l is conserved. The dominant regime of CR transport in regions surrounding dense cores embedded within molecular clouds is debated. MHD turbulence in these regions can resonantly scatter the pitch angles of penetrating CRs (Kulsrud and Pearce 1969), leading to spatial diffusion. The spectrum of MHD turbulence determines the magnitude of the CR diffusion coefficient and its dependence on the particle energy (Schlickeiser et al. 2016;Silsbee and Ivlev 2019). However, MHD turbulence can also be driven by anisotropy in the CR distribution function (Skilling and Strong 1976;Morlino and Gabici 2015;Ivlev et al. 2018), which arises in response to CR absorption in dense cores.
Observationally, the ionisation rate of CRs can be connected to observations of H þ 3 , whose chemical formation chain is primarily regulated by CRs (see, e.g. Indriolo et al. 2007;Indriolo and McCall 2012).

CRs in the interstellar medium
In this section we discuss the interactions of CRs with the interstellar medium. This covers the effects in the immediate surroundings of a supernova, the heating in the interstellar medium, and the dynamical impact within the ISM. The interactions on larger scales like the impact on galactic outflows are discussed further below. We will not cover the acceleration mechanism in SN remnant shocks as this involves the kinetic approach of CRs and is beyond the scope of this review.
Simulations of the interstellar medium allow to resolve the different thermal phases of the gas including the degree of ionisation. The spatial resolution also allows to resolve MHD shocks. On the one hand this allows to investigate in more detail the coupling of CRs with the thermal gas. On the other hand CRs can be injected at the position of the shock and allow a more consistent injection mechanism.
In the ISM the transport speeds of CRs and the local tubulent motions can be comparable, so the local conditions determine whether the CRs diffuse faster than they are advected or vice versa. Commerçon et al. (2019) perfom simulations of the turbulent ISM and find that for most cases the effective diffusion speeds are dominating. As a result, no significant CR pressure gradient can build up and no dynamical effects are expected on scales of a few tens of parsecs. Pfrommer et al. (2017) extend the two-fluid shock tube by CR injection at the shock and applies an on-the-fly shock finder (Schaal and Springel 2015) in several idealised setups such as a Sedov explosion (Sedov 1959) using the AREPO code (Springel 2010). Pais et al. (2018) added an obliquity-dependent injection efficiency with respect to the magnetic field orientation and investigated the impact of CRs injected in the shock region on the Sedov explosion finding that the solution remains self-similar because the ellipticity of the propagating blast wave stays constant over time. Furthermore, their comparison to observed SN remnants suggests a lower injection efficiency of only 5 per cent rather than the canonical 10 per cent of the SN energy. This result is independent of the assumed magnetic coherence length. Dubois et al. (2019) use a similar approach to inject CRs at the loci of the MHD shocks with an on-the-fly shock finder in the larger boxes of the interstellar medium using RAMSES. The efficiency is also coupled to the upstream magnetic obliquity. Their models include CR streaming as well as ansiotropic diffusion. Similar to Pais et al. (2018) they find supernova bubbles with large polar caps in case of a homogeneous background magnetic field, and a patchy structure of the CR distribution in case of inhomogeneous background fields. The application in a turbulent box shows that the presence of shock-injected CRs significantly modifies the structure of the gas.
A more idealized study by Wiener et al. (2017a) investigates a stream of CRs interacting with cold clouds in a hot dilute environment. Using one-dimensional models including CR streaming they find that the bottleneck effect, which occurs when a population of CRs undergoes the streaming instability, will accelerate and heat the gas cloud. Corresponding two-dimensional models including radiative cooling of the clouds (Wiener et al. 2019) confirm the acceleration of clouds by CRs, but reveal that the thermal impact might dominate. The cooling effect will keep the clouds intact to CR wave heating.

CR driven large-scale instabilities of the ISM
Observational data indicate that gas, magnetic fields and CRs appear in approximate energetic equipartition, which is interpreted as the effect of dynamical coupling of the diverse components of the interstellar medium. The stability of a system in vertical equilibrium without CRs was analysed in Newcomb (1961); the inclusion of CRs dates back to Parker (1966). Zweibel and Kulsrud (1975) generalized the model for a family equilibria. The dynamical role of CRs was first recognized by Parker (1966), who noted that vertically stratified ISM consisting of thermal gas, magnetic fields and CRs is unstable due to buoyancy of the weightless components, i.e., the magnetic fields and the CRs. The physical mechanism of the instability stems from the fact that two weightless components-magnetic fields and cosmic rays-contribute to the total pressure of the gravitationally stratified interstellar medium. The thermal gas component is inflated by these nonthermal pressure contributions. The system is prone to a kind of convective instability tending to expel the weightless components from the system and therefore to reduce potential energy of the thermal gas. Linear stability analysis demonstrates that small-amplitude vertical corrugations of magnetic field lines lead to gas motions down to valleys of magnetic fields unweighting tops of the field lines and therefore enhancing the upward motion of the upper parts of the corrugations and a downward motion of gas into the valleys.
The instability, hereafter named Parker instability, was recognized as a plausible mechanism leading to the formation of clouds of gas collecting in valleys of the large-scale galactic magnetic fields (Parker 1967;Blitz and Shu 1980). Parker (1992) proposed, furthermore, that the instability enhanced by continues replenishment of CRs in supernova remnants may lead to an amplification of large-scale magnetic fields in galaxies.
Numerous papers addressing the Parker instability in the ISM by means of linear stability analysis adopted a simplifying assumption that the diffusive propagation of CRs is fast enough to ensure a constant pressure of CRs along magnetic field lines. Ryu et al. (2003b) performed a linear analysis of the Parker instability for the medium composed of thermal gas, horizontal magnetic field and anisotropicaly diffusing CRs. The relevant set of equations consisted of the set of ideal MHD equations and the diffusion-advection equation equivalent to Eq. (37) with the diffusion tensor in the form of Eq. (88) describing the magnetic-field aligned anisotropic propagation of CRs. Their results have shown a strong dependence of the growth rate of the Parker instability on the value of the parallel diffusion coefficient. Lower diffusion coefficients turned out to reduce the growth rate of the instability with respect to the case of constant CR pressure along magnetic field lines, representing the limit of an infinite diffusion coefficient.
The nonlinear evolution of the Parker instability with CRs was addressed by Hanasz and Lesch (2003), who extended the ZEUS 3D code with a stable numerical algorithm combining anisotropic diffusion and advection of CRs and coupled the momentum-integrated diffusion equation with the MHD system equations describing a thermal plasma (see Sect. 4.2). Numerical simulations have shown that the growth rate of the Parker instability depends essentially on the adopted value of the CR diffusion coefficient and that results differ from predictions made on the grounds of linear stability analysis. The difference was plausibly caused by a different triggering mechanism of the instability, which in Ryu et al. (2003b) relied on excitation of a specific 'idealized' eigenmodes, while in ) the instability was excited by an instantaneous injection of CRs in a SN remnant.
The linear and nonlinear analysis of the effects of CR diffusion on the Parker instability was subsequently extended by Kuwabara et al. (2004) who have shown that the growth rate of the Parker instability becomes smaller if the coupling between CRs and thermal gas is stronger (i.e., if the CR diffusion coefficient is smaller). MHD simulations of the Parker instability with an appropriate perturbation confirmed this result. Similar conclusions were derived from studies of Parker-Jeans instability with anisotropic CR diffusion (Kuwabara and Ko 2006). The system is less unstable when the CR diffusion coefficient is smaller (i.e., the coupling between the CRs and plasma is stronger) and if the CR pressure is larger. This conclusion is consistent with the fact that Jeans instability and Parker instability are less unstable when the pressure is higher. In the nonlinear regime the Parker-Jeans instability leads to the formation of gas filaments whose orientation depends on the parallel diffusion coefficient (Kuwabara and Ko 2020). When the diffusion coefficient is large gas filaments form preferentially perpendicular to the magnetic field and when the diffusion coefficient is small the filaments are aligned parallel to the magnetic field. Rodrigues et al. (2016) examined the evolution of the Parker instability in galactic disks using 3D numerical simulations. They have found that the instability develops a multimodal 3D structure, which cannot be quantitatively predicted from the earlier linearized studies. They calculated synthetic polarized intensity and Faraday rotation measure (RM) maps, and the associated structure functions. They suggested that correlation scales inferred from RM maps can be used to probe the cosmic-ray content of galaxies. Heintz and Zweibel (2018) performed a stability analysis of a stratified layer for three different cosmic-ray transport models: decoupled, corresponding to the classic Parker instability, coupled with c c ¼ 4=3 but not streaming, named as modified Parker instability, and coupled with streaming at the Alfvén speed. They demonstrated that cosmic-ray heating of the gas is responsible for the destabilization of the system and concluded that Parker instability with cosmic-ray streaming may play an important role in cosmic-ray feedback. Heintz et al. (2020) expanded the work by including radiative cooling. Heating due to CR streaming has a destabilizing effect which affects significantly the nonlinear regime of the Parker instability. While cooling depressurizes the dense gas, streaming CRs heat and inflate the diffuse extraplanar gas, greatly modifying the phase structure of the medium. The fastest growth affects typically the modes characterized by short wavelengths in the horizontal direction perpendicular to the background magnetic field.

Cosmic ray driven galactic dynamo
The idea of the CR-driven dynamo was originally raised by Parker (1992), who postulated that the buoyancy of CRs together with the Coriolis-force, galactic differential rotation and magnetic reconnection can lead to efficient amplification of galactic magnetic fields.
Preliminary numerical experiments pursued with the aid of the thin flux-tube approximation (Hanasz and Lesch 2000) have shown that the buoyancy of CRs injected in SN remnants may lead, in the presence of the Coriolis force, to an efficient generation of the poloidal magnetic field component out of the initial azimuthal magnetic field (named as a-effect) and to the diffusive transport of magnetic flux in the vertical direction. The CR-induced diffusive transport coefficients for the large mean magnetic field, plugged into the mean-field dynamo equation , (see also Kowal et al. 2006) resulted in exponentially growing solutions, confirming the Parker's conjecture that CRs may efficiently drive the dynamo action on galactic scales.
First MHD numerical simulation models (Hanasz et al. 2004(Hanasz et al. , 2009aSiejkowski et al. 2010) of the CR-driven dynamo were realized in a local, rectangular patch of galactic disk with shearing boundary conditions and rotational pseudo-forces (tidal and Coriolis forces) incorporated to study the magnetic field evolution in rotating galactic disks.
The first global model of the CR-driven dynamo in a Milky Way-type galaxy ) assumed that: (1) Supernovae convert 10% of their explosion kinetic energy into cosmic rays. (2) A weak initial magnetic field of stellar origin was suppled to the system in selected (plerionic type) SN remnants. (3) Differential rotation of the interstellar gas results from an assumed analytical model of an axisymmetric galactic gravitational potential (e.g. Allen and Santillan 1991) or from a computational model of an N-body galactic disk ( Wóltański 2015). (4) The initial gas distribution follows the global Milky Way model by Ferriere (1998).
Magnetic field amplification originating from the small-scale, randomly oriented dipolar magnetic fields is apparent through the exponential growth by several orders of magnitude of both the magnetic flux and the magnetic energy. The growth of magnetic field saturates at about t ¼ 4 Gyr, reaching values of 3 À 5 lG in the disk. During the amplification phase, magnetic flux and total magnetic energy grow by about 6 and 10 orders of magnitude, respectively. The average e-folding time of magnetic flux amplification is approximately equal to 270 Myr, corresponding to the orbital rotation period at the galactocentric radius (% 10 kpc). The magnetic field originated from randomly oriented magnetic dipoles was initially chaotic, as shown in Fig. 11. Later on, the toroidal magnetic field component formed a spiral structure revealing reversals in the plane of the disk. The Magnetic field structure evolved gradually by increasing its correlation scale. The toroidal magnetic field component became almost uniform inside the disk around t ¼ 2:5 Gyr. The volume occupied by the well-ordered magnetic field expanded continuously until the end of the simulation.
To visualise the magnetic field structure in a manner resembling radio observations of external galaxies, synthetic radio maps of the synchrotron radioemission were deduced in a simplified way from the distribution of CR protons. The polarised intensity of synchrotron emission is shown in Fig. 12 together with polarisation vectors. Electric vectors, computed on the basis of integrated Stokes parameters, are rotated by 90 to reproduce the magnetic field direction projected onto the plane of sky. Polarisation vectors, indicating the mean magnetic field direction, reveal a regular spiral structure in the face-on view, and the so-called Xshaped structure in the edge-on view. A particular similarity can be noticed between the edge-on synthetic radio map and the radio maps of observed edge-on galaxies such as NGC 891 (Krause 2009). Kulpa-Dybeł et al. (2011) added an analytical elliptical component to the axisymmetric gravitational potential. In the presence of the bar perturbation the CRdriven dynamo reveals new properties, such as the presence of a ring-like structure as well as a shift of the magnetic arms with respect to the crests of spiral density waves. Wóltański (2015) used N-body simulations to compute gravitational field of a nonaxisymmetric disk with spiral density wave perturbations. Density waves were excited in the disk by addition of a small satellite galaxy. The amplification of magnetic flux by the CR-driven dynamo leads the large-scale magnetic field to the saturation level around 10 lG with local maxima reaching 20 À 30 lG, located near the spiral crests of gas density. Magnetic field vectors are parallel to the gaseous Fig. 11 Distribution of toroidal magnetic field at t ¼ 20 Myr (top-left), t ¼ 700 Myr (top-right), t ¼ 2:5 Gyr (bottom-left), and t ¼ 4:8 Gyr (bottom-right). Unmagnetised regions of the volume are white, while positive and negative toroidal magnetic fields are marked with red and blue respectively. Note that the colour scale in magnetic field maps is saturated to enhance weaker magnetic field structures in disk peripheries. The maximum magnetic field strength are 5:9 Â 10 À4 , 4:4 Â 10 À3 , 1.5 and 29 lG at t ¼ 0:02, 0.7, 2.5 and 4.8 Gyr respectively. Image reproduced with permission from Hanasz et al. (2009b), copyright by AAS are present in practically all numerical realizations of the CR-driven dynamo. The commonly present CR-driven winds leaving galaxies imply magnetic field transport out of the disk and effective magnetisation of intergalactic space. This process is particularly efficient in dwarf galaxies (Siejkowski et al. 2014).
An interesting observation has been made by Butsky et al. (2017) who performed numerical simulations of an isolated galaxy with a stellar feedback prescription in which thermal energy and magnetic energy are supplied by supernova explosions, but CRs are not included. They noted that, similarly to the model of the CR-driven dynamo, the magnetic field reaches equipartition levels over gigayear timescales and raised the question of the relative importance of the two primary types of energy supplied by supernovae: turbulence driven by the thermal expansion of the remnants themselves, or energy released as CRs, since either by itself appears to be sufficient.  Hanasz et al. (2009b), copyright by AAS 6.6 Cosmic ray driven galactic winds As supernovae drive strong shocks into the interstellar medium (ISM), some fraction of the explosion energy is consumed to accelerate ionised particles to relativistic energies, which are then injected into the ISM (Krymskii 1977;Bell 1978;Blandford and Ostriker 1978). This relativistic fluid is coupled to the galactic magnetic field and, in particular the hadronic component, is less prone to energy losses than the gaseous component of the ISM.
The idea of CR wind driving has been proposed by Ipavich (1975) and developed by numerous authors including Breitschwerdt et al. (1991Breitschwerdt et al. ( , 1993, Zirakashvili et al. (1996), Ptuskin et al. (1997), Breitschwerdt et al. (2002), Uhlig et al. (2012), Dorfi and Breitschwerdt (2012), who find that CRs together with magnetic fields and thermal pressure can contribute to the galactic winds phenomenon. Everett et al. (2008) applied a wind model, driven by combined cosmic-ray and thermal-gas pressure, to the Milky Way, and shown that the observed Galactic diffuse soft X-ray emission can be better explained by a wind than by previous static gas models. They find that cosmic-ray pressure is essential to driving the observed wind.
CRs are strongly coupled to magnetic fields and their mutual interaction should be followed in a self-consistent way. Hanasz et al. (2004Hanasz et al. ( , 2009b, Siejkowski et al. (2010), Kulpa-Dybeł et al. (2011) and Heintz and Zweibel (2018) have shown that CRs induce buoyancy effects in the ISM, leading to the break-out of magnetic fields from galactic disks (Parker 1992). Plausibly, such processes are also relevant for star-forming galaxies at high redshift which are observed to have significant magnetic fields at the level of tens of lG (Bernet et al. 2008). Recent observations even demonstrate the existence of large magnetic fields up 50 kpc away from the galaxy, indicating strong large-scale magnetised winds (Bernet et al. 2013).
6.6.1 1D models An early popular model for investigating the impact of CRs on galactic winds are one-dimensional models. Usually, they focus on large scales compared to the details of galactic substructure, such that details of the interstellar medium and substructures inside the galaxy are neglected. The infinitesimally thin disc is modelled as the source of CRs. The simplified geometry does not allow for details of the magnetic field or turbulence to be included, but under the assumption of external confinement of CRs by isotropic turbulent motions or a magnetic field model the radial dynamics of a wind including CRs can be modelled. The first application of CRs hydrodynamics in the context of galactic winds was performed by Breitschwerdt et al. (1991) assuming a steady state flow along magnetic flux tubes. They found that CRs can drive supersonic mass loss for a wide range of parameters. Exploring the parameter space of thermal and CR pressure on the mass loss has been done by Everett et al. (2008) finding that in the Milky Way CRs and thermal pressure contribute equally to the outflow. Extensions of the models to time-dependent winds Dorfi and Breitschwerdt (2012) confirm asymptotic solutions of previous steady state problems and also offer an explanation to the observed galactic CR spectrum. More recent models include a full CR spectrum in a semi-analytic model of advective and diffusive CR transport (Recchia et al. 2016b(Recchia et al. , a, 2017. Whereas the qualitative results of winds are reproduced, the inferred spectra are in tension with observed ones.

3D models
Three dimensional models of the interstellar medium and galaxies including CRs allow for a more natural combination of turbulent motions, a dynamical magnetic field evolution and the interactions with CRs. However, the numerical cost only allows for reduced CR physics and a small parameter range to be explored. Jubelgas et al. (2008) conduct simulations of isolated galaxies including the feedback of the diffusive CR component included in the GADGET code. They find that CRs can significantly reduce the star formation efficiencies of small galaxies, with virial velocities below $ 80 km s À1 , an effect that becomes progressively stronger towards low-mass scales. Yang et al. (2012) simulate the onset of a CR-driven outflow comparable to the Milky Way Fermi bubbles using anisotropic CR diffusion. Uhlig et al. (2012) use SPH simulations of entire isolated galaxies to test the wind dynamics including CRs including CR heating due to streaming. Without magnetic fields, the simulations approximated the Alfvén velocity by the sound speed and only focus on isotropic transport. Similar approaches have been employed by Booth et al. (2013), Salem and Bryan (2014) using isotropic CR diffusion. Both studies find that CRs are able to drive winds, but only for some fraction of the probed parameter space. Booth et al. (2013) note that the effects of CR-driven winds are much larger in dwarf galaxies compared to their Milky Way models, which is still affected by CRs. More recent simulations by Dashyan and Dubois (2020) investigate in more detail the impact of CR on dwarf systems with different CR transport mechanisms. They report a stronger impact on the different transport mechanism than the effective diffusion coefficient.
In a similar framework Salem et al. (2016) investigate the impact of CRs on a forming 10 12 M halo. They find that CR-inclusive runs, contrasted with a run with star formation and energetic feedback, but no CRs, substantially enrich the circumgalactic medium with metals due to robust and persistent outflows from the disk. The CR-inclusive models reveal more diffuse gas at lower temperatures, down to 10 4 K , than the non-CR cases. The CR inclusion leads to a better match of HI, SiIV, CII, and OIV line intensities than the case without CRs. Comparison of gamma-ray luminosity to observational data favor CR diffusion coefficients close to the Milky Way canonical values 3 Á 10 28 cm 2 s À1 .
Galaxy simulations including magnetic fields and anisotropic CR diffusion are performed by Hanasz et al. (2013). Their models are not only able to drive galactic winds with mass loading factors of order unity but also sustain a CR-driven dynamo with magnetic field strength at a several lG level. Pakmor et al. (2016b) compare isotropic and anisotropic diffusion models finding that anisotropic diffusion leads to more realistic magnetic field strengths than in the isotropic case. Including CR streaming in galactic models was investigated by Wiener et al. (2017b), Ruszkowski et al. (2017b). Butsky and Quinn (2018) compare the role of isotropic and anisotropic diffusion and streaming mechanisms for shaping the structure of circumgalactic medium (CGM). They find that all three transport mechanisms result in strong, metal-rich outflows but differ in the temperature and ionization structure of their CGM. Isotropic diffusion results in a spatially uniform, warm CGM that underpredicts the column densities of low ions. Anisotropic diffusion develops a reservoir of cool gas that extends farther from the galactic center, but disperses rapidly with distance. CR streaming projects cool gas out to radii of 200 kpc, supporting a truly multiphase medium.
A systematic parameter study of the impact of CR for different halo masses was performed by Jacob et al. (2018) finding that CRs only have a significant effect on the launching of a wind for halo masses below 10 12 M . The combined effects of AGN and CRs has been simulated by Wang et al. (2020) in isolated galaxies, where the authors argue that CR streaming and the corresponding heating has to be included in order to prevent too efficient cooling of the circumgalactic medium.
Besides the isolated disc setups, recent models include CR transport in cosmological zoom simulations, in which individual galaxies are evolved with high resolution. Buck et al. (2020) rerun simulations from the AURIGA project including CRs. They limit the transport coefficients to the Alfvén speed and include the corresponding heating effect, which changes the structure of the outflow. The modified structure of the circumgalactic medium results in a different distribution of angular momentum and thereby alters the stellar and gaseous disk. In a similar cosmologial setup Hopkins et al. (2021) compare a large variety of subgrid models for CR confinements, which are connected to the MHD equations using effective transport coefficients in the diffusion, the streaming as well as combined unified CR transport. Besides the moderate effect that CRs have on the galactic winds, their models favour effective diffusion coefficients that range from 10 29 À 10 33 cm 2 s À1 , which is based on the comparison with gamma-ray emission.
The details of how CR winds are driven and in particular the relative importance and dynamical interplay between SN-driven motions and CR dynamics cannot be investigated in full galactic models at the current resolutions. Instead simulations of a smaller fraction of the galaxy are needed. Girichidis et al. (2016Girichidis et al. ( , 2018 and Peters et al. (2015) compare thermally and CR driven winds in more sophisticated model of the ISM, which includes a chemical evolution to accurately model the phases of the ISM. The models confirm that CRs alone are able to drive and support an outflow with mass loading of order unity. The CR driven outflows are denser, cooler and smoother than their thermally driven counterparts. Figure 13 illustrates the difference between outflows that are only driven by thermal energy injection by SNe (left-hand panel) and the outflows driven 10% of the energy injected as CRs (three right-hand panels). The two central panels show two different diffusion coefficients. The column density on the right-hand side depicts the structure of the ISM if all SNe explode in dense regions. The resulting strong overcooling illustrates that CRs alone are able to lift gas out of the disk. Similar models by Simpson et al. (2016) compare simulations with advection only and advection plus diffusion. They conclude that diffusion is needed in order to drive an outflow. The decoupling of CR and the thermal gas in a neutral environment leads to a broader spatial distribution of cosmic rays and higher wind speed compared to the uniformly applied advectiondiffusion approach (Farber et al. 2018). A more sophisticated, locally determined coupling between CRs and the gas confirms these results (Holguin et al. 2019).

Cosmic rays in galaxy clusters
CRs are also expected to affect galaxy clusters. One particularly important aspect is the possible heating effect provided by CR protons from AGN. Simulations Fig. 13 Comparison of the column density structure in simulations of the interstellar medium. The lefthand column shows a setup with only thermal energy injection. The three other simulations include CR injection with an efficiency of 10% of the SN energy. The two central panels show two different diffusion coefficients. In the right-hand panel the SNe all explode in dense regions, which limits their thermal impact. This illustrates that CRs alone are able to lift gas out of the galactic disc. Image adapted from Girichidis et al. (2018), copyright by the authors including CR protons produced at shocks reveal that they could provide a substantial additional pressure in the intracluster medium (Miniati et al. 2001b). A follow-up study included CR electrons as an additional non-thermal component (Miniati et al. 2001a). In both cases the CRs are transported with the gas and include cooling effects. Simulations including diffusion of CR protons that are injected in shock regions close to the AGN find large X-ray cavities and radio lobes in the hot diffuse gas of the central regions of galaxy clusters. The cavities are long-lived if the diffusion coefficient does not exceed 10 28 cm 2 s À1 Brighenti 2007, 2008;Ruszkowski et al. 2008). A range of galaxy clusters and shock related CR injection models has been performed by Pfrommer et al. (2007) omitting CR diffusion. They find that CRs can be efficiently accelerated in strong structure formation shocks such as accretion and merger shocks. The high relative fraction of CR pressure in the central region increases the compressibility and allows for more star formation. A detailed discussion on the differences between CR advection, diffusion and streaming by Enßlin et al. (2011) suggest that the details of the transport mechanisms are essential to understand the non-thermal signatures of clusters. Hydrodynamical simulations including CR streaming and the resulting heating effect have been run by Ruszkowski et al. (2017a) find that the CR can efficiently heat the gas and provide a viable channel for the AGN energy thermalization. More recently, Ehlert et al. (2018) couple CRs to an AGN jet model and simulate galaxy clusters including a simplified streaming approach. They introduce an effective CR diffusion coefficient j cr $ l cr v A with the CR gradient length l cr and the Alfvén speed v A , which emulates the combined effects of streaming and spatial diffusion (Sharma et al. 2009;Wiener et al. 2017b). They conclude that the CR-induced Alfvén heating matches the CR heating rates needed to solve the cooling flow problem. Vazza et al. (2012b) model CRs in galaxy clusters by injecting them in numerically identified shocks and advect them with the gas flow. The simulations reveal a small but not dominant impact of CRs on the evolution with changes at the percent level for the temperature, pressure and density distribution.
One of the first attempts to investigate the spectrum of relativistic electrons in cosmological shocks has been made by Keshet et al. (2003) who studies the radiation emitted due to inverse Compton process by shock-accelerated electrons in hydrodynamic cosmological simulations of a Lambda cold dark matter (KCDM) universe. They performed a posteriori detection and analysis of shocks forming in cosmological simulations to predict the spectrum of CR electrons and the spectrum of c-ray photons emitted from cosmological shocks. They subsequently constructed all-sky maps of c-ray emission from the nearby universe. Pfrommer et al. (2006) derived an analytic solution to the one-dimensional Riemann shock tube problem for a composite plasma of CRs and thermal gas. They applied their solution to study the properties of structure formation shocks in highresolution hydrodynamic simulations of the KCDM universe. They found that most of the energy is dissipated in weak internal shocks with Mach numbers M $ 2. They recognized the dynamical importance of shock-injected CRs which is comparatively large in the low-density, peripheral halo infaling regions, buts is less important for the weaker flow shocks ocuring in central high-density regions of haloes. They raised questions of cosmological implications of the CRs component and for observational signatures of this radiation.
In the context of AGN feedback Sijacki et al. (2008) include CRs together with thermal feedback. The two-component fluid with an effectively softer equation of state and the less efficient cooling for CRs compared to the thermal gas allows CR supported bubbles to rise to the outskirts of the clusters and leak into the surrounding intergalactic medium. However, neither the accretion rates onto the black hole nor the star formation rate are significanty affected by the non-thermal component.

Conclusions and outlook
We have reviewed the numerical treatment of CRs assuming a fluid description of this high-energy component. We identified three main approaches in treating CRs numerically. The first is a spectrally resolved approach with a focus on the physical processes of CRs like the primary CRs and the production of secondaries. The most prominent code that represents this approach is GALPROP with more recent frameworks DRAGON and PICARD. The second numerical model focuses on the dynamical coupling of CRs and the thermal gas in (magneto)-hydrodynamical simulations. Here, CRs take the role of dynamical drivers of hydrodynamical flows. In the context of the interstellar medium and galaxy formation CRs are an important additional reservoir of energy. The numerical complexity in this approach is more in the interaction between the two fluids. Simple linear diffusion models are challenged by more complex models that account for the CR streaming instability and couple CRs to the gas including more plasma effects. A third approach tries to spectrally resolve CRs and at the same time include the dynamical impact onto the gas dynamics. The numerical and computational complexity forces this approach to be limited in both spectral complexity and multiple species of CRs and in the details of plasma interactions of relativistic and thermal gas.
Depending on the application the currently developed models still face major limitations. In the interstellar medium and for the dynamics of galaxies all three approaches would need to be merged and coupled. For the connection with observations-in particular for a comparison with the Milky Way-a detailed modeling of secondaries is of major importance. Similarly, a spectrally resolved modeling of CR electrons as tracers of local thermal and CR properties are key to understand the galaxies. The structure of the magnetic field in galaxies is strongly connected to the dynamics of CRs. Magnetohydrodynamical simulations naturally include an evolving magnetic field that follows and/or shapes spiral structures and interacts with dynamos and local instabilities of the gas like Parker loops. The combined CR-MHD models using a single CRs fluid approach reveal that CRs can be dynamically important for shaping the magnetic field, structuring the interstellar medium, and driving galactic outflows and winds. However, the relative importance of CRs compared to other processes in galaxies as well as which details of CR physics are the crucial ones to resolve, include and simulate are still strongly debated. Currently, there is converging consensus that CRs-in particular the dynamically more relevant CR protons-should be coupled to the thermal gas and treated in a time-dependent manner. The missing physical aspects in this approach like a spectral CR description and the inclusion of multiple species and secondaries are on the agenda for the near future.