1 Preface

At this stage in the development of the theory of core-collapse supernovae two possible explosion mechanisms are most often discussed: neutrino-driven and magneto-rotationally-driven. The state of the theory is not sufficiently well developed to determine whether or not there is a clear break between these two cases or whether they represent limiting cases of a continuum. Nonetheless, in any scenario, the physics discussed here is relevant. It either dominates, leading to a neutrino-driven explosion, or sets the stage for a magneto-rotationally-driven supernova. That is, core-collapse supernova theorists have no choice but to first master and, more important, implement realistic models of neutrino transport in core-collapse supernova environments. What is meant by “realistic” will hopefully become clear as we progress through this review, but what will also hopefully become clear: Challenges to achieving realism will be faced on multiple fronts: physical, numerical, and computational.

When charged to write this review, we were asked not to provide an encyclopedic review of past work in the field but, rather, to present the current issues and challenges faced by the core-collapse supernova modeling community, particularly as they pertain to what is arguably the most difficult aspect to model: neutrino transport. Thus, with this charge in mind, we have written our review with an emphasis on the future, on what modelers must and will face to develop realistic models of these most important events.

2 Setting the stage

The idea that core-collapse supernovae could be neutrino driven was first proposed more than 50 years ago by Colgate and White (1966) in their seminal numerical study. This work set neutrinos front and center in core-collapse supernova theory, which has remained the case ever since. The Colgate and White studies were followed by the early studies of Wilson (1971) that cast doubt on the efficacy of their proposal. But the development of the electroweak theory, which predicted the existence of weak neutral currents, would change all that. Given weak neutral currents, Freedman recognized that it would be possible for neutrinos to scatter off of the nucleons in a nucleus collectively. The cross sections for such scattering would be proportional to the nuclear neutron number, N, and would consequently be large. Shortly thereafter, Wilson (1974), using the new weak interaction cross sections for this process, demonstrated that the Colgate and White proposal was in fact viable. The recognition of this intertwined relationship between core-collapse supernova physics and neutrino weak interaction physics drives continued research to this day. Nearly 40 years of further study in the context of the assumption of spherical symmetry was set in motion by this early and foundational work, which traversed a range of descriptions of neutrino transport in stellar cores, a range of sophistication of the treatment of the microphysics input included in the models, which includes the neutrino weak interaction physics and the equations of state describing a stellar core’s nuclear, leptonic, and photonic degrees of freedom.

Neutrino mediation of core-collapse supernova dynamics in its modern instantiation is through charged-current absorption of electron neutrinos and antineutrinos on neutrons and protons, respectively. The nucleons become available as the stalled supernova shock wave dissociates the nuclei in the infalling stellar core material as the material passes through it. The neutrino absorption heats the material, depositing energy behind the shock. The shock loses energy initially to dissociation and neutrino losses. When sufficient energy is deposited by neutrino heating, the shock again becomes dynamical, propagates outward in radius, and reverses the infall of material passing through it, to disrupt the star in a core-collapse supernova (Wilson 1985; Bethe and Wilson 1985). This modern instantiation of neutrinos’ role in the supernova mechanism relies on the developments surrounding the large neutrino–nucleus scattering cross sections discussed earlier. Arnett (1977) was the first to show that such cross sections led to the trapping of the electron neutrinos produced during stellar core collapse through electron capture on nuclei and protons. He demonstrated that, despite their nature as weakly interacting particles, the densities in the stellar core rise sufficiently rapidly to render the electron neutrino mean free paths smaller than the size of the stellar core. Neutrino trapping gives rise to a trapped degenerate sea of electron neutrinos in the inner stellar core that emerge after stellar core bounce and the launch of the supernova shock wave from the proto-neutron star on diffusive time scales.

The proto-neutron star comprises the inner cold unshocked core and a hot shocked mantle of material above it that is not ejected by the shock. Electron degeneracy is lifted in the hot mantle, leading to a significant population of electron–positron pairs, which in turn leads to the production of neutrinos and antineutrinos of all three flavors via electron–positron annihilation. The densities in the mantle are sufficiently high that neutrinospheres for all three flavors of neutrinos and antineutrinos exist, all lying within kilometers of each other, as a function of flavor and energy, in the density cliff that defines the proto-neutron star surface. The post-bounce stratification of the core, setting the stage for neutrino shock revival is shown in Fig. 1. Neutrinos of all three flavors emerge from their respective neutrinospheres at the proto-neutron star surface. Between the proto-neutron star surface and the shock, neutrino heating and cooling take place through charged-current electron neutrino and antineutrino absorption on and emission by nucleons, respectively. The different radial dependencies of neutrino heating and cooling lead to net heating above the “gain radius” and net cooling below it. The region between the gain radius and the shock, where net neutrino heating takes place, is known as the gain region.

Fig. 1
figure 1

Schematic showing the characteristic structure after stellar core bounce and the stall of the supernova shock wave seen in all core-collapse supernova models. All three flavors of neutrinos, together with their antineutrino partners, emanate from the proto-neutron star. Here, a single surface characterizes the proto-neutron star surface and the “neutrinosphere,” the surface of last scattering for the neutrinos. In reality, there are multiple surfaces, although they are very close together. The neutrino interaction cross sections are flavor and energy dependent. Consequently, there is a neutrinosphere for each neutrino flavor and energy “group” in core-collapse supernova models. Between the proto-neutron star surface and the stalled shock wave is the so called gain radius, separating the region of net neutrino cooling (below the gain radius) from net neutrino heating (above the gain radius). Neutrino heating is mediated by charged-current absorption of electron neutrinos (antineutrinos) on neutrons (protons) below the shock, liberated by shock dissociation of nuclei as they pass through it. Cooling is mediated by the inverse weak interactions. Neutrino heating in the “gain region” between the gain radius and the shock is central to the neutrino-driven core-collapse supernova mechanism. Given this neutrino heating, the gain region becomes convectively unstable. Neutrino-driven turbulent convection in this region assists neutrino heating to generate a supernova. The goal is to reverse the infall of the material ahead of the shock and for the shock itself to propagate outward. The neutrino heating in the gain region is sensitive to the neutrino luminosities, spectra, and angular distributions there, all of which depend on the transport of neutrinos through the semitransparent neutrinospheric region, where the neutrinos are neither diffusive nor free streaming

The energy deposition rate per gram of material in the gain region can be expressed in terms of the electron neutrino and antineutrino luminosities, squared rms energies, and inverse flux factor as

$$\begin{aligned} {\dot{\varepsilon }}=\frac{X_n}{\lambda _{0}^{a}}\frac{L_{\nu _e}}{4\pi r^2} \left\langle E^{2}_{\nu _e} \right\rangle \left\langle \frac{1}{{\mathscr {F}}_{\nu _e}} \right\rangle +\frac{X_p}{{\bar{\lambda }}_{0}^{a}}\frac{L_{{\bar{\nu }}_e}}{4\pi r^2} \left\langle E^{2}_{{\bar{\nu }}_e} \right\rangle \left\langle \frac{1}{{\mathscr {F}}_{{\bar{\nu }}_e}} \right\rangle , \end{aligned}$$
(1)

where \(\varepsilon \) is the internal energy of the stellar core fluid per gram, \(X_{n,p}\) are the neutron and proton mass fractions, respectively, \(L_{\nu _e,{\bar{\nu }}_e}\) are the electron neutrino and antineutrino luminosities, respectively, \({\mathscr {F}}_{\nu _e,{\bar{\nu }}_e}\) are the inverse flux factors for the electron neutrinos and antineutrinos, respectively, and \(\lambda _{0}^{a}, {\bar{\lambda }}_{0}^{a}\) are constants related to the weak interaction coupling constants. Thus, knowledge of the neutrino luminosities, spectra, and angular distributions are needed to compute the neutrino heating rates. This requires knowledge of the neutrino distribution functions, \(f_{\nu _e,{\bar{\nu }}_e}(r,\theta ,\phi ,E,\theta _{p},\phi _{p},t)\), from which these quantities can be calculated. The neutrino distribution functions are determined by solving their respective Boltzmann kinetic equations, which will be discussed later. Thus, the core-collapse supernova problem is a phase space problem, in the end involving 6 dimensions plus time. The common parlance, dividing core-collapse supernova models between “1D” (spherical symmetry), “2D” (axisymmetry), or “3D” models is quite misleading. In reality, the dimensionality is 3D for spherical symmetry, involving 1 spatial dimension (radius) and 2 momentum-space dimensions (neutrino energy and a single direction cosine), 5D for axisymmetry, involving 2 spatial dimensions (radius and \(\theta \)) and 3 momentum-space dimensions (neutrino energy and 2 direction cosines), and 6D, involving 3 spatial dimensions (radius, \(\theta \), and \(\phi \)) and 3 momentum-space dimensions (neutrino energy and 2 direction cosines).

The central densities of the proto-neutron star reach values between \(10^{14}\) and \(10^{15}\mathrm {\ g\ cm}^{-3}\). Its mass, which is \(O(1)\,M_\odot \), is initially contained within a radius O(100) km. Such conditions are not Newtonian. Detailed comparisons made in the context of spherically symmetric models of core-collapse supernovae (Bruenn et al. 2001) between Newtonian and general relativistic models revealed the dramatic differences in the overall “compactification” of the postbounce core configuration defined by the neutrinosphere, gain, and shock radii, as well as the dramatic difference between the magnitudes of the infall velocities through the gain region. Moreover, neutrino luminosities and rms energies were increased in the general relativistic case due to the higher core temperatures. These studies made obvious the fact the core-collapse supernova environment is a general relativistic environment.

Models that assume spherical symmetry reached the needed level of sophistication only fairly recently, with fully general relativistic models that included Boltzmann neutrino transport, an extensive set of neutrino weak interactions, and, at the time, an industry-standard equation of state (Liebendörfer et al. 2001; Lentz et al. 2012b). The outcomes of these models were quite discouraging. In all cases, the shock radius reaches a maximum and then recedes with time until the simulations are terminated. Explosion does not occur, and the end outcome of each simulation would be the formation of a stellar-mass black hole.

With the exception of the lowest-mass massive stars (Kitaura et al. 2006) it became clear the Colgate and White proposal was doomed to fail without the aid of additional physics. Specifically, the assumption of spherical symmetry had to be eliminated. In retrospect, it is now obvious why: Neutrino emission by the proto-neutron star, driving the explosion above, is fueled by the accretion of stellar core material onto it. Explosion in spherical symmetry would cut off such accretion entirely once initiated, cutting off the fuel that drives the neutrino emission that drives the explosion. Unless accretion and explosion can occur simultaneously, we are presented with a Goldilocks problem: Enough energy has to be deposited behind the shock before explosion occurs. But for sufficiently energetic explosions, an explosion cannot occur too soon. And given that the accretion rates decrease with time, due to stellar core density profiles, an explosion also cannot occur too late.

The first two-dimensional core-collapse supernova simulations by Herant et al. (1992, 1994) demonstrated that accretion and explosion naturally coexist in the postshock flow. Heating by the proto-neutron star from below generates convection in the gain region. Such “neutrino-driven” convection allows continued accretion while some of the material is heated, expands, and moves outward. Lower-entropy, accreting fingers are evident in Fig. 2, as well as higher-entropy rising plumes. The Herant et al. studies opened the next, much-needed chapter in core-collapse supernova theory. As with spherically symmetric modeling, axisymmetric modeling continues to this day. (See Müller 2020 for a focused and comprehensive review on convection and other fluid instabilities in core collapse supernova environments that are integral to the supernova explosion mechanism.)

Fig. 2
figure 2

Image reproduced with permission, copyright by AAS

Snapshot of neutrino-driven convection at 25 ms after bounce in the two-dimensional core-collapse supernova model of Herant et al. (1994) initiated from a \(25\,M_\odot \) progenitor

The core-collapse supernova modeling community has not yet produced general relativistic axisymmetric models with Boltzmann neutrino transport and with industry-standard weak interaction physics and equations of state, but significant progress has been made. The first simulations to evolve both the neutrino spectra and their angular distributions were performed by Ott et al. (2008). Included were the spatial advection terms on the left-hand side of the Boltzmann equation (corresponding to neutrino transport in each of the spatial dimensions) and the collision term on the right-hand side of the equation (corresponding to neutrino sources and sinks due to emission, absorption, and scattering) with a subset of the weak interactions considered complete today. The simulations were purely Newtonian. Neglected were all relativistic effects in the Boltzmann kinetic equations, describing special relativistic Doppler shift of neutrino energies, general relativistic blue and red shift of neutrino energies, angular aberration of neutrino propagation, etc. Outcomes from their multi-angle, multi-frequency approach were compared with outcomes from a similar simulation performed with multigroup flux-limited diffusion. Notable differences were obtained between the two transport approaches in the results obtained for neutrino radiation field quantities entering the expression for neutrino heating, Eq. (1)—specifically, the inverse flux factors and rms energies—, which translated into notable differences in neutrino heating, which were up to a factor of 3 for rapidly rotating cores. More recent studies assuming axisymmetry by Nagakura et al. (2018) implemented special relativistic Boltzmann neutrino transport with a subset of the neutrino weak interactions regarded as essential in today’s leading multi-physics models, coupled to Newtonian hydrodynamics and gravity. In light of their Boltzmann implementation, these authors were able to make assessments regarding the fundamental assumption at the heart of the most commonly used closure prescription—the so-called M1 closure—currently in use in most multi-dimensional supernova studies deploying multidimensional neutrino transport in a moments approach we will discuss shortly. Nagakura et al. find that the assumption that the neutrino radiation field is not in fact axisymmetric about the outward radial direction, reflected in non-negligible off-diagonal components of the Eddington tensor—specifically, \(k^{r\theta }\). The authors emphasize how such components play a non-negligible role in the evolution of the neutrino fluxes, increasing the neutrino luminosities by \(\sim \)10%. The neutrino heating rate, Eq. (1), is then increased commensurately. Experience has shown that corrections at this level in any or all of the quantities entering the neutrino heating rate are noteworthy and warrant continued exploration, perhaps for all models, but especially in light of marginal cases of explosion for some, perhaps many, progenitors.

Not unexpectedly, given the physical complexity and the computational cost, no simulations have been performed to date that deploy three-dimensional general relativistic Boltzmann neutrino transport in general relativistic core-collapse supernova models—i.e., including general relativistic hydrodynamics and gravity. This is a long-term goal and, as made clear by what we have learned in the context of studies in spherical and axisymmetry, a needed goal. Nonetheless, three-dimensional core-collapse supernova modeling of increasing sophistication is ongoing. The first three-dimensional core-collapse supernova models were performed by Fryer and Warren (2004) using gray (neutrino angle- and energy-integrated) radiation hydrodynamics. The first spectral (neutrino-angle integrated) three-dimensional models were performed by Hanke et al. (2013). The current stable of spectral three-dimensional models fall under two categories. Both implement spectral (but not multi-angle) neutrino transport in a one- or two-moment approach. In one category, the so-called “ray-by-ray” approximation is used. In the other, the neutrino transport is three dimensional. (A clarifying remark: The simulations by Hanke et al. used a Boltzmann solver in the context of their ray-by-ray approach. As such, some angular dependence was kept. However, three-dimensional models require two angles to describe a neutrino’s propagation direction, and in the ray-by-ray approach the angular dependence in one of the angles is approximate in the sense that it is computed assuming spherical symmetry.)

The earliest three-dimensional models—e.g., those of Hanke et al.–implemented ray-by-ray transport. In the ray-by-ray approach, the three-dimensional neutrino transport problem is broken up into \(N=N_{\theta }\times N_{\phi }\) spherically symmetric problems, where \(N_{\theta ,\phi }\) are then number of \(\theta \), \(\phi \) zones used in the simulation. The ray-by-ray approximation follows lateral neutrino transport under the assumption of spherical symmetry, meaning there is lateral transport of individual neutrinos, but the net lateral flux is zero. (For example, neutrinos can propagate along the segment between A and B in Fig. 3, but an equal number of neutrinos must propagate along the path between C and B, such that the net flux at point B is purely radial.) Moreover, as illustrated by Fig. 3, neutrino heating at a point in the gain region may be over- or under-estimated. Consider the point B in the heating region. The backward cone emanating from point B subtends a portion of the neutrinosphere, between points A and C, that is the source of the neutrinos that heat the material at point B. The ray-by-ray approximation, which assumes spherical symmetry for each ray, assumes that the thermodynamic conditions across the neutrinosphere between points A and C are the same as those at point D. If point D is a hot spot, the ray-by-ray approximation will compute the heating at point B assuming the neutrinosphere between points A and C is hot. For neutrino heating at point F, and assuming that point H is not a hot spot, the ray-by-ray approximation will assume that conditions at point H are mimicked across the portion of the neutrinosphere between points E and G, regardless of the fact that point D is hot and within that portion of the surface. Thus, the neutrino heating at point B will be overestimated, whereas the neutrino heating at point F will be underestimated. Whether or not the ray-by-ray approximation leads to significant over- or under-estimations of the neutrino heating over the course of the shock reheating epoch will of course depend on whether or not such variations in the thermodynamic conditions across the neutrinosphere persist, which requires a comparison taking into consideration the time dependence of such thermodynamic conditions. Comparisons between ray-by-ray and non-ray-by-ray approaches in the context of axisymmetric core-collapse supernova models found notable differences in, among other outcomes, the time to explosion (Skinner et al. 2016). However, more recent comparisons in the context of three-dimensional models found no significant differences between the two approaches (Glas et al. 2019). Of course, without three-dimensional transport implementations, it would be difficult to assess the efficacy of using the ray-by-ray approach, or other approximations. In the end, such approximations must be removed, if only just to check them. The ray-by-ray approach of the Oak Ridge group is based on one-moment closure through flux-limited diffusion (Bruenn et al. 2020). They follow the evolution for the lowest angular moment of the neutrino distribution: the number density. The Max Planck group’s ray-by-ray implementation is based on two-moment closure (Rampp and Janka 2000). They solve an approximate Boltzmann equation for the purposes of computing the variable Eddington factor needed to close the system of equations describing the evolution of the first two moments of the neutrino distribution (in spherical symmetry, there is only one first moment, corresponding to the radial number flux, together with the zeroth moment, the neutrino number density).

Fig. 3
figure 3

Schematic showing the key characteristics of the ray-by-ray neutrino transport approximation. Along each radial ray (e.g., along segments DB or HF), a complete solution to the spherically symmetric neutrino transport equations is obtained assuming spherical conditions given by the conditions along each ray. This approximation afforded the ability to implement sophisticated transport solvers that had been developed in the context of models of core-collapse supernovae assuming spherical symmetry, at the expense of ignoring net lateral transport that would occur in multiple spatial dimensions. In spherical symmetry, neutrinos can propagate along the segment AB, which is clearly not a purely radial segment. Therefore, there is lateral transport. However, in spherical symmetry, every neutrino propagating along AB is matched by a neutrino propagating along CB, and the net flux at point B is purely radial. The lateral fluxes cancel exactly. Focusing on neutrino heating at point B, the ray-by-ray approach assumes that the thermodynamic conditions across the proto-neutron star surface (i.e., the neutrinosphere) between points A and C are uniform and given by the thermodynamic conditions at point D. Given a temporary hot spot at point D on the surface, the neutrino heating at point B would be overestimated. Moreover, were point H significantly cooler, relatively speaking, at the same instant, heating at point F would be underestimated because the hot spot at point D would be ignored even though it is within the cone of neutrino trajectories contributing to the neutrino heating at F. Thus, the ray-by-ray approximation may lead to larger angular variations in the neutrino radiation field than would be present were three-dimensional transport used—particularly if the hot spots on the proto-neutron star surface persist

For both two- and three-dimensional core-collapse supernova models that attempt to include general relativity at some level of approximation, if not exactly, Newtonian or general relativistic hydrodynamics, and two- or three-dimensional neutrino transport are all based on the solution of the neutrino moments equations describing the evolution of the lowest angular moments of the neutrino distribution function. For example, in terms of the neutrino distribution function, the number moments (spectral number density, spectral number flux) are defined as

$$\begin{aligned} {\mathscr {N}}(r,\theta ,\phi ,E,t)\equiv & {} \int _{0}^{2\pi }d\phi _p\int _{-1}^{+1}d\mu f(r,\theta ,\phi ,\mu ,\phi _p,E,t), \end{aligned}$$
(2)
$$\begin{aligned} {\mathscr {F}}^{i}(r,\theta ,\phi ,E,t)\equiv & {} \int _{0}^{2\pi }d\phi _p\int _{-1}^{+1}d\mu n^{i}f(r,\theta ,\phi ,\mu ,\phi _p,E,t), \end{aligned}$$
(3)

where \(\mu \equiv \cos \theta _p\) is the neutrino direction cosine defined by \(\theta _p\), one of the angles of propagation defined in terms of the outward pointing radial vector defining the neutrino’s position at time t. In three dimensions, two angles are needed to uniquely define a neutrino propagation direction. The angle \(\phi _p\) provides the second. \(n^i\) is the neutrino direction cosine in the \(i{\mathrm{th}}\) direction, whose components are given as functions of \(\mu \) and \(\phi _p\). E is the neutrino energy. \(E,\theta _p,\phi _p\) can be viewed as spherical momentum space coordinates. Above, \({\mathscr {N}}\) and \({\mathscr {F}}^i\) are the number density and number fluxes, respectively. In three dimensions, there is of course a number flux for each of the three spatial dimensions, delineated by the superscript i. Integration of the neutrino Boltzmann equation over the angles \(\theta _p\) and \(\phi _p\), weighted by 1, \(n^i\), \(n^{i}n^j, \ldots \) defines an infinite set of evolution equations for the infinite number of angular moments of the distribution function, which is obviously impossible to solve. In a moments approach to neutrino transport, the infinite set of equations is rendered finite by truncation, after the equation for the zeroth moment in the case of one-moment closure (e.g., flux-limited diffusion) or after the equations for the first moments in the case of two-moment closure (e.g., M1 closure). In the latter case, closure can be “prescribed” (e.g., M1 closure) or computed (e.g., through a variable Eddington tensor approach). We will discuss these approaches in greater detail later in our review. It is important to understand the essence of the approximations being made in moments approaches to neutrino transport in core-collapse supernova models. One does not integrate out all of the angular information contained within the neutrino distribution function. Some angular information remains. The higher the closure is made in the order of moment equations, the more angular information is kept. For example, two-moment closure keeps the fundamental angular dependencies. The ratio of the number flux in any of the three dimensions to the number density, at any spacetime point, is a measure of how forward peaked the neutrino angular distribution is in that dimension at that point. Thus, a moments approach retains much of the information of the neutrino radiation field contained within the neutrino distribution functions, while providing a sophisticated modeling path forward that is achievable on present leadership-class computing systems. Direct Boltzmann solutions for the neutrino radiation field will have to wait until sustained exascale computing platforms become available over the next decade.

Three-dimensional models that include an approximation to general relativistic gravity in the form of an “effective potential,” Newtonian hydrodynamics, ray-by-ray one- or two-moment neutrino transport with some corrections for special relativity \((O(v=c))\) or general relativity (gravitational redshift of neutrino energies), and a state-of-the-art set of neutrino weak interactions have been performed by the Max Planck and Oak Ridge groups (Hanke et al. 2013; Lentz et al. 2015; Melson et al. 2015a, b; Summa et al. 2018). Three-dimensional models that include general relativistic hydrodynamics and gravity, and three-dimensional, general relativistic, O(1) or fully relativistic (special and general) two-moment neutrino transport with an extensive set of neutrino weak interactions have been performed by Roberts et al. (2016) and Kuroda et al. (2016), respectively. Three-dimensional models that couple Newtonian hydrodynamics and approximate general relativistic gravity, as above, to three-dimensional two-moment neutrino transport with corrections for special and general relativity, as above, and an extensive set of neutrino weak interactions were performed by O’Connor and Couch (2018); Vartanyan et al. (2019); Burrows et al. (2019).

It is clear the core-collapse supernova modeling state of the art in three dimensions is evolving, with some models classifiable as more complete macrophysically—i.e., that implement three-dimensional, general relativistic gravity, hydrodynamics, and neutrino transport—and some models classifiable as more complete microphysically—i.e., that include state-of-the-art microphysics.

Neutrino mass, albeit small in relation to the neutrino energies attained in core collapse supernovae, leads to neutrino flavor transformations. There is growing, though still inconclusive, evidence that such transformations may play a role in neutrino shock reheating (e.g., see Tamborra et al. 2017; Abbar et al. 2019; Delfan Azari et al. 2019). The existence of so called “fast” flavor transformations, which can exist even in the baryon-laden environment below the supernova shock wave, was first brought to the attention of the supernova modeling community by Sawyer (2005). Prior to this work, it was assumed that quantum mechanical coherence among the neutrinos in the region beneath the shock would de-cohere due to neutrino–matter collisions, thereby rendering such effects unimportant to neutrino shock reheating. However, fast modes operate on scales much shorter than a neutrino mean free path and, in fact, are not wiped out by collisions and beg to be considered. As in the classical case, the story boils down to capturing the neutrino angular distributions for all three flavors of neutrinos, as a function of space and time during the evolution of the supernova. The neutrinospheres for the three neutrino flavors are distinguished first and foremost by their interactions with the stellar fluid, with electron neutrinos and antineutrinos interacting through both charged and neutral currents and the muon and tau neutrinos interacting only through neutral currents. Moreover, the preponderance of neutrons over protons reduces the opacity of the stellar fluid to electron antineutrinos, and a hierarchy sets in, with the muon and tau neutrinospheres at the highest densities, followed by the neutrinosphere associated with the electron antineutrinos, followed in turn by the neutrinosphere associated with the electron neutrinos, at the lowest densities, relatively speaking. Given the layering of the neutrinospheres, at a given time during neutrino shock revival, the neutrino angular distributions at a given spatial location in the cavity between the neutrinospheres and the shock will differ by flavor. It is the differences between the angular distributions of each flavor that sets the stage for fast flavor transformation.

Thus, the need, in the classical case, for a Boltzmann description of the neutrino radiation field is multifold: (1) Moments approaches are approximations, whose efficacy cannot be known a priori and must be checked against the exact (classical) result. Examples of this will be discussed here. (2) The development of closure prescriptions for moment models is rife with difficulty, partially because of nonlinearities introduced by the closure procedure. For example, a numerical method for two-moment, multifrequency, general relativistic neutrino transport that respects Fermi–Dirac statistics does not yet exist and will be difficult to develop. Furthermore, the development of nonlinear moment models beyond the two-moment approximation, to capture more kinetic effects, will be even more challenging. (3) Boltzmann and low-order moments approaches can be used together to accelerate convergence of the solution to the Boltzmann equation, potentially becoming competitive, in terms of speed and memory use, with nonlinear, high-order moments approaches. (4) The exploration of fast flavor transformations on the core collapse supernova mechanism will require precise knowledge of the neutrino angular distributions for all three flavors across spacetime of a supernova model. Such information can be obtained only through a solution of the classical Boltzmann kinetic equations for each neutrino flavor in association with simulation of the coherent quantum effects—i.e., through a solution of the multi-angle, multi-frequency neutrino quantum kinetics equations for all neutrino flavors.

While the justification for deploying Boltzmann kinetics in the classical case can be made, it is through a combination of Boltzmann and moments approaches that progress will be made in both the near and the long term. We are attempting to address myriad science questions, and past experience already tells us that the answer to these questions will vary with characteristics of the massive progenitors in which core collapse supernovae occur. How do massive stars explode? Which explode and which do not? Among those that explode, what elements do they produce? How do they contribute to galactic chemical evolution? And the list goes on. At present, there is no foreseeable time at which all of these questions will be addressable with Boltzmann methods, let alone quantum kinetics. An uncountable number of models will ultimately be required to understand the death of the diverse population of stars we are presented with in nature, as well as the death of any one of them. Our understanding of stellar death will not come from a single “hero” simulation, but from many simulations. Thus, it is in the application of both Boltzmann (classically) and moments approaches and, through this, the development of ever more realistic moments approaches that we will be able to advance our knowledge of one of the most important phenomena in the Universe. This is already clear from the modeling history to date. We have come a long way since Colgate and White’s seminal work through precisely the hybrid approach discussed here. Hence, this review will focus on both approaches, as well as point to potentially efficacious hybrid approaches that could be developed and deployed in the future.

3 Design specifications

There have been many lessons learned during the 54 years that have passed since the first numerical simulations of core-collapse supernovae were performed by Colgate and White. These lessons can now be used to construct a list of design specifications for models of neutrino transport that will be used in future core-collapse supernova models:

  1. 1.

    Ultimately, definitive simulations of core-collapse supernovae in the classical limit will require a Boltzmann kinetic description of neutrino transport for all three flavors of neutrinos and their antineutrino partners.

  2. 2.

    In the event sufficient evidence points to the need to consider in greater detail the impact of neutrino quantum kinetics on the supernova explosion mechanism, a quantum kinetics description of neutrino transport would be required. A classical Boltzmann description would be the natural, and required, starting point for the development of a such a quantum kinetics treatment.

  3. 3.

    The simulations must be general relativistic. They must include special and general relativistic effects such as Doppler and red/blue shifts of neutrino energy, respectively, and angular aberration in both cases, due to fluid motion and spacetime curvature.

  4. 4.

    These simulations must include all of the neutrino weak interactions that have been to date demonstrated to be important, and the description of the interactions must be state of the art.

  5. 5.

    The quality of core-collapse supernova simulations will ultimately be gauged by, among other things, the degree to which lepton number and energy are conserved. More specifically, the discretizations of the integro-partial differential Boltzmann equations must conserve lepton number and energy simultaneously.

  6. 6.

    The discretizations of the Boltzmann equations—in particular, the collision terms—must accommodate both small- and large-energy scattering.

  7. 7.

    The numerical methods must also accommodate realistic equations of state for the nuclear, leptonic, and photonic components. In cases where the neutrino opacities depend on the nuclear force model, the neutrino opacities and the equation of state must be consistent.

  8. 8.

    In the interim when moments approaches to neutrino transport must be used until Boltzmann approaches become feasible, all of the above design specifications still hold.

  9. 9.

    For moments models, the closures used must respect the Fermi–Dirac statistics of neutrinos, reflecting the fact that the neutrino distribution functions are bounded.

4 The equations of neutrino radiation hydrodynamics

In core-collapse supernova models, the stellar fluid is modeled as a perfect fluid, augmented by an equation for the electron density in order to accommodate a nuclear equation of state. (For brevity of presentation, we will not include effects due to electromagnetic fields.) The relevant equations are then

$$\begin{aligned} \nabla _{\nu }J_{\text{ B }}^{\nu }&= 0, \end{aligned}$$
(4)
$$\begin{aligned} \nabla _{\nu }T_{\text{ fluid }}^{\mu \nu }&= - G^{\mu }(f_{\nu _{e}},f_{{\bar{\nu }}_{e}},\ldots ), \end{aligned}$$
(5)
$$\begin{aligned} \nabla _{\nu }J_{e}^{\nu }&= - m_{\text{ B }}\,L(f_{\nu _{e}},f_{{\bar{\nu }}_{e}},\ldots ), \end{aligned}$$
(6)

where the baryon rest-mass density current is

$$\begin{aligned} J_{\text{ B }}^{\nu } = \rho \,u^{\nu }, \end{aligned}$$
(7)

where \(\rho =m_{\text{ B }}\,n_{\text{ B }}\) is the baryon rest-mass density, \(m_{\text{ B }}\) the average baryon (rest) mass, \(n_{\text{ B }}\) the baryon density, and \(u^{\nu }\) is the fluid four-velocity. The fluid energy-momentum tensor is

$$\begin{aligned} T_{\text{ fluid }}^{\mu \nu } = \rho \,h\,u^{\mu }\,u^{\nu } + p\,g^{\mu \nu }, \end{aligned}$$
(8)

where \(h=1+(e+p)/\rho \) is the specific enthalpy, e the internal energy density, and p the pressure. The electron density current is given by

$$\begin{aligned} J_{e}^{\nu } = \rho \,Y_{e}\,u^{\nu }, \end{aligned}$$
(9)

where \(Y_{e}\) is the electron fraction. The electron density (technically electron minus positron density) is \(n_{e}=\rho \,Y_{e}/m_{\text{ B }}\). To close the system given by Eqs. (4)–(6), the pressure p is given by an equation of state (EOS); e.g., \(p=p(\rho ,e,Y_{e})\).

The source terms on the right-hand sides of Eqs. (5) and (6), \(-G^{\mu }\) and \(-L\), describe four-momentum and lepton exchange between the fluid and neutrinos. These terms depend on the neutrino distribution functions (or moments of the neutrino distribution functions), as already noted in Sect. 2 , as well as on thermodynamic properties of the stellar fluid. This nonlinear coupling is the key to the supernova mechanism, and associated observables, and is the topic of the present review.

Fig. 4
figure 4

Plots of the neutrino and antineutrino mean free paths at 100 ms after bounce, during the neutrino shock reheating epoch, for all three flavors of neutrinos at select energies. The upper left and right panels show plots of the electron-neutrino and anti-neutrino mean free paths, respectively. The lower left and right panels show plots of the heavy-flavor (\(\mu \) and \(\tau \)) neutrinos and antineutrinos, respectively. The data used to generate the plots are taken from a supernova model beginning with a \(12\,M_\odot \) progenitor and evolved with the Chimera supernova code. To set the correct physical scale against which the mean free paths can be compared, we indicate the location of the various neutrinospheres and the shock wave. All four plots demonstrate that, as we move out in radius to lower densities, all of the mean free paths plotted vary from being much less than to much greater than the neutrinosphere radii—i.e., to the characteristic spatial scale of the proto-neutron star. Consequently, the neutrinos will not behave in a fluid-like manner everywhere, and a kinetic rather than a fluid description of them is necessary

4.1 The need for a kinetic description of neutrinos

Figure 4 shows the magnitude of the neutrino transport mean free paths for the electron neutrino, electron antineutrino, and heavy-flavor neutrinos (muon and tau neutrinos and their antineutrinos). The mean free paths are given at a time of 100 ms after bounce, during the critical shock reheating epoch, in the context of a Chimera supernova simulation of a \(12\,M_\odot \) star. They are given as a function of radius, for select neutrino energies. Also shown are the neutrinospheres for the select energies, as well as the radius of the stalled shock wave. For all neutrino flavors and energies, the mean free paths exceed the respective neutrinosphere radii, as well as the shock radius, at some radius as we move outward. That is, the neutrino mean free paths exceed the scale of the proto-neutron star, as well as the shock radius scale, before we reach the shock radius. Under these circumstances, the neutrinos are not well described as components of the proto-neutron star fluid everywhere within it, and therefore, they are certainly not well described as a fluid in the critical heating layer between the proto-neutron star and the shock. A kinetic description of the neutrinos is required. Such a description, based on the Boltzmann kinetic equations, would supply the neutrino distributions functions, \(f(r,\theta ,\phi ,\mu ,\phi _{p},E,t)\), for each species of neutrino and antineutrino, where \(\mu \) is a the direction cosine taken with respect to the outward radial direction, \(\phi _{p}\) is the corresponding second angle describing the neutrino propagation direction in these momentum-space spherical polar coordinates, and \(E=|p|\) is the neutrino energy. Deep in the proto-neutron star, neutrinos and the proto-neutron star fluid are in weak-interaction equilibrium. The distribution functions are then given by their equilibrium counterparts and the neutrinos are well described as an additional component of the fluid. Of course, the neutrinos fall out of weak equilibrium as the neutrinospheres are approached, and beyond them stream freely. Thus a fluid description of them would be limited to only a small portion of the simulation domain and would be of equally limited utility. The nature of the weak interactions demands the greater computational challenge and the higher computational cost of a kinetic description of neutrino transport in the proto-neutron star and above it in the cavity between it and the shock.

4.2 The choice of phase-space coordinates

The expansion from the four dimensions of spacetime to the seven dimensions of relativistic phase space brings with it additional choices. Now, in addition to making what will hopefully be optimal choices for spacetime coordinates, we will also need to consider optimal choices for momentum-space coordinates. And this is not without some give and take. Simplification in some respects afforded by one choice is always accompanied by complexification in other respects.

There is, however, an overarching consideration that guides the typical choice made by most modelers: Neutrino–matter interactions are most naturally and, consequently, most easily described in the frame of reference of the inertial observer instantaneously comoving with the fluid. (The fluid is accelerating, but the instantaneously comoving observer is not.) In this frame, the matter is instantaneously at rest, and the neutrino four-momentum components that enter the expressions for the neutrino weak interaction rates are the components measured by the comoving observer. However, while the description of neutrino–matter interactions are simplified in this picture, the choice to use four-momenta measured by instantaneously comoving observers introduces additional terms on the left-hand side of the Boltzmann equation that correspond to relativistic angular aberration and Doppler shift due to the fact that two spatially-adjacent instantaneously-comoving observers do not necessarily have the same velocity—in general, they will measure different neutrino angles of propagation and energies. In the context of Newtonian gravity, this would certainly add considerable complexity to the left-hand side of the Boltzmann equation. But in the general relativistic case, such momentum-space advection terms that involve derivatives with respect to the neutrino angles of propagation (or their direction cosines) and the neutrino energy are already there in light of general relativistic angular aberration and frequency shift in curved spacetime. While the character of the physical effects—special versus general relativistic—is different and, as such, presents different numerical challenges, the relative additional complexity of adding terms corresponding to special relativistic effects—e.g., relativistic Doppler shift and angular aberration—to the left-hand side of the Boltzmann equation versus the significant simplification of the collision term when comoving-frame neutrino four-momenta are used has led most modelers to choose comoving frame neutrino four momenta as phase-space coordinates. With regard, then, to the difficulties associated with the terms/effects added to the advection of neutrinos in phase-space, as we will see in this review, very different numerical approaches have been taken to describe them.

In what follows, we will adopt the following notation: We will designate the neutrino four-momentum components measured by an inertial observer instantaneously comoving with the fluid as \(p^{{\hat{\mu }}}\). Neutrino four-momentum components measured by an Eulerian observer will be designated as \(p^{{\bar{\mu }}}\). Finally, the neutrino four-momentum components in the coordinate basis will be designated as \(p^{\mu }\).

4.3 The general relativistic Boltzmann equation

In light of the need to conserve simultaneously both energy and lepton number, we wish to begin with a version of the Boltzmann equation that is manifestly conservative across all phase-space dimensions. As we will show, this is not true of the standard formulation of the general relativistic Boltzmann equation. In this section, we outline the derivation of both as presented by Cardall and Mezzacappa (2003) to illustrate the differences and, of course, to arrive at a form of the Boltzmann equation that is better suited to numerical application. Before we begin, we emphasize the following: While spacetime is endowed with a natural metric, \(g_{{\mu }{\nu }}\), which is determined by Einstein’s equations given the stress–energy content of spacetime, phase space is not. Consequently, the development of general relativistic neutrino radiation hydrodynamics requires the full machinery of the metric-free language of the differential and integral calculus of forms. That is, the derivation we present below is not a matter of taste. Treatments of non-relativistic kinetic theory typically assume that phase space is endowed with a Euclidean metric. This can serve as a bookkeeping device at best, and it is important to interpret the theory accordingly.

The one-particle phase space for particles of arbitrary mass is an eight-dimensional space, which we label M, of spacetime position x and four-momentum p. If we specify a mass for the particle, m, which satisfies

$$\begin{aligned} m^2=-g_{\mu \nu }p^{\mu }p^{\nu }, \end{aligned}$$
(10)

we confine ourselves to a hypersurface of M, which we write as \(M_m\), which is the phase space for particles of mass m. The flow in \(M_m\) defined by the particle trajectories (xp) is generated by the Liouville operator

$$\begin{aligned} {{L_m}={p^{\mu }}{{{{{{{\mathscr {L}}}}}^{\mu }}}_{\mu }}\frac{\partial }{\partial {x^{\mu }}} -{{{{\varGamma }^{i}}}_{\nu \rho }}{p^{\nu }}{p^{\rho }} \frac{\partial }{\partial {p^{i}}}}. \end{aligned}$$
(11)

\({{{\mathscr {L}}}^{{\hat{\mu }}}}_{\mu }\) is the composite transformation that takes us, first, from the coordinate basis to the orthonormal frame of the Eulerian observer at rest with respect to the “laboratory” and, second, via a Lorentz transformation, from the Eulerian frame to the frame of reference comoving with the stellar core fluid:

$$\begin{aligned} {{{{\mathscr {L}}}}^{{\hat{\mu }}}}_{\mu } = {\varLambda ^{{\hat{\mu }}}}_{{\bar{\mu }}}{e^{{\bar{\mu }}}}_{\mu }. \end{aligned}$$
(12)

\({{{\mathscr {L}}}^{\mu }}_{{\hat{\mu }}}\) is the inverse transformation. \({\varGamma ^{{\hat{\mu }}}}_{{\hat{\nu }}{\hat{\rho }}}\) are the Ricci Rotation Coefficients and are given by

$$\begin{aligned} {{\varGamma ^{{\hat{\mu }}}}}_{{\hat{\nu }}{\hat{\rho }}} = {\varGamma {{\hat{\mu }}}}_{{\mu }} {{{\mathscr {L}}}^\nu }_{{\hat{\nu }}} {\varGamma ^{\hat{\rho }}}_{{{\rho }}} \,{\varGamma ^\mu }_{\nu \rho } + {\varGamma ^{{\hat{\mu }}}}_{{\mu }} {{{\mathscr {L}}}^\rho }_{{\hat{\rho }}} \frac{\partial {{{\mathscr {L}}}^\mu }_{{\hat{\nu }}}}{\partial x^\rho }, \end{aligned}$$
(13)

where \({\varGamma ^{\mu }}_{\nu \rho }\) are the Levi-Civita connection coefficients corresponding to the spacetime metric \(g_{\mu \nu }\).

For a given type of particle of mass m, the distribution function, f, gives the density of such particles in phase space. An equation for the distribution function, the Boltzmann equation, is derived by considering a closed six-dimensional hypersurface \(\partial D\) bounding a region D in \(M_m\). The net number of particles flowing through the boundary of D is given by the generalized Stokes’ Theorem

$$\begin{aligned} {{N}[\partial D]=\int _{\partial D} f\omega =\int _D d(f\omega ),} \end{aligned}$$
(14)

where the infinitesimal surface element \(\omega \) normal to the flow across D is given by

$$\begin{aligned} \omega ={L_m}\cdot \varOmega \end{aligned}$$
(15)

and \(\varOmega \) is an infinitesimal volume element in \(M_m\). The product rule gives

$$\begin{aligned} {d(f\omega )={df}\wedge \omega ={df}\wedge ({L_m}\cdot \varOmega ),} \end{aligned}$$
(16)

where we have used the fact that \(d\omega =0\) (an expression of the general relativistic Liouville’s Theorem that tells us that the phase-space flow is incompressible). But f, \(L_m\), and \(\varOmega \) obey the identity

$$\begin{aligned} {{df}\wedge (L_m\cdot \varOmega )=L_m[f]\varOmega .} \end{aligned}$$
(17)

Then

$$\begin{aligned} {{N}[\partial D]=\int _D {L_m}[f]\varOmega .} \end{aligned}$$
(18)

Finally, the number of particles crossing the boundary \(\partial D\) of D in \(M_m\) is given by the change in the number of particles in D due to emission, absorption, and scattering. Defining the “collision term,” \({\mathscr {C}}[f]\), as the spacetime density of such events, we have

$$\begin{aligned} {{N}[\partial D]=\int _D {\mathscr {C}}[f]\varOmega ,} \end{aligned}$$
(19)

and

$$\begin{aligned} {{L_m}[f]={\mathscr {C}}[f].} \end{aligned}$$
(20)

Substituting for \(L_m\) using Eq. (11), we arrive at the Boltzmann equation in “standard” form:

$$\begin{aligned} {{p^{\mu }}{{{{{{{\mathscr {L}}}}}^{\mu }}}_{\mu }}\frac{\partial f}{\partial {x^{\mu }}} -{{{{\varGamma }^{j}}}_{\nu \rho }}{p^{\nu }}{p^{\rho }} \frac{\partial {u^{i}}}{\partial {p^{j}}}\frac{\partial f}{\partial {u^{i}}}={\mathscr {C}}[f].} \end{aligned}$$
(21)

Note that to obtain the Boltzmann equation, we had to consider integration on our phase-space manifold \(M_m\) on which there is no natural metric. This necessitates the use of the language of differential forms.

If we integrate over momentum space, we obtain the balance equation for particle number

$$\begin{aligned} {\frac{1}{{\sqrt{-g}}}\frac{\partial }{\partial {x^{\mu }}}\big ({\sqrt{-g}}{N^{\mu }}\big )=\int {\mathscr {C}}[f]{{\pi }_m},} \end{aligned}$$
(22)

where

$$\begin{aligned} {{N^{\mu }(x)}=\int {{{{f {{{\mathscr {L}}}}}^{\mu }}}_{\mu }}{p^{\mu }}{{\pi }_m}=\int f{p^{\mu }}{{\pi }_m}} \end{aligned}$$
(23)

is the particle 4-current density and

$$\begin{aligned} {{{\pi }_m}=\frac{1}{E(\mathbf{p})}\Big |\det \big [\frac{\partial \mathbf{p}}{\partial \mathbf{u}}\big ]\Big |{{{du}}^{123}}} \end{aligned}$$
(24)

is the invariant momentum-space 3-volume expressed in terms of the spherical momentum-space coordinates: \(u^{{\hat{i}}}=(E=\Vert p\Vert /c,\mu \equiv \cos \theta _{p},\phi _{p})\). But in light of the fact that the Boltzmann equation is not expressed in manifestly conservative form it is not obvious how we arrive at Eq. (22) by integrating over momentum space. We desire to reexpress the Boltzmann equation in terms of spacetime and momentum-space divergences so that it is manifestly conservative with respect to an integration over a spacetime region, a momentum-space region, or both—i.e., a phase-space region.

Of course, the generalized Stokes’ Theorem, Eq. (14), is an expression of manifest conservation, equating the change in a quantity within a volume of phase space in terms of a surface term involving its flux on the volume’s boundary. The key insight by Cardall and Mezzacappa (2003) was to recognize that the total exterior derivative \(d(f\omega )\) in Eq. (14) can instead be expressed as

$$\begin{aligned} {d(f\omega )={\mathscr {N}}[f]\varOmega ,} \end{aligned}$$
(25)

where

$$\begin{aligned} {\mathscr {N}}[f]\equiv & {} \frac{1}{{\sqrt{-g}}}\frac{\partial }{\partial {x^{\mu }}}\big ({\sqrt{-g}}{{{{{{{\mathscr {L}}}}}^{\mu }}}_{\mu }} {p^{\mu }}f\big ) \nonumber \\&-E(\mathbf{p})\left\| \det \left[ \frac{\partial \mathbf{p}}{\partial \mathbf{u}}\right] \right\| ^{-1} \frac{\partial }{\partial {u^{i}}}\left( \frac{1}{E(\mathbf{p})}\left\| \det \left[ \frac{\partial \mathbf{p}}{\partial \mathbf{u}}\right] \right\| {{{{\varGamma }^{j}}}_{\nu \rho }}{p^{\nu }}{p^{\rho }} \frac{\partial {u^{i}}}{\partial {p^{j}}}f\right) .\nonumber \\ \end{aligned}$$
(26)

Substituting Eq. (25) in Eq. (14) and using Eq. (19), we arrive at

$$\begin{aligned}&\frac{1}{{\sqrt{-g}}}\frac{\partial }{\partial {x^{\mu }}}\big ({\sqrt{-g}}{{{{{{{\mathscr {L}}}}}^{\mu }}}_{\mu }} {p^{\mu }}f\big ) \nonumber \\&\qquad -E(\mathbf{p})\left\| \det \left[ \frac{\partial \mathbf{p}}{\partial \mathbf{u}}\right] \right\| ^{-1} \frac{\partial }{\partial {u^{i}}}\left( \frac{1}{E(\mathbf{p})}\left\| \det \big [\frac{\partial \mathbf{p}}{\partial \mathbf{u}}\big ]\right\| {{{{\varGamma }^{j}}}_{\nu \rho }}{p^{\nu }}{p^{\rho }} \frac{\partial {u^{i}}}{\partial {p^{j}}}f\right) \nonumber \\&\quad = {\mathscr {C}}[f], \end{aligned}$$
(27)

which is the manifestly conservative formulation of the Boltzmann equation. It is now obvious that upon integration over momentum space, for example, the momentum derivative terms on the left-hand side of the Boltzmann equation in Eq. (27) will give rise only to surface terms. The counterpart equation for 4-momentum conservation can be derived in the same way (Cardall and Mezzacappa 2003) and is given by

$$\begin{aligned}&\frac{1}{{\sqrt{-g}}}\frac{\partial }{\partial {x^{\nu }}}\big ({\sqrt{-g}}{{{{{\mathscr {T}}}}}^{{\mu \nu }}}\big ) \nonumber \\&\qquad - {E(\mathbf{p})}\left\| \det \left[ \frac{\partial \mathbf{p}}{\partial \mathbf{u}}\right] \right\| ^{-1} \frac{\partial }{\partial {u^{i}}}\left( \frac{1}{{E(\mathbf{p})}}\left\| \det \left[ \frac{\partial \mathbf{p}}{\partial \mathbf{u}}\right] \right\| {{{{\varGamma }^{j}}}_{\nu \rho }}{p^{\rho }} \frac{\partial {u^{i}}}{\partial {p^{j}}}{{{{{{{\mathscr {L}}}}}^{\nu }}}_{\nu }}{{{{{\mathscr {T}}}}}^{{\mu \nu }}}\right) \nonumber \\&\quad = - {{{{\varGamma }^{\mu }}}_{{\nu \rho }}}{{{{{\mathscr {T}}}}}^{{\nu \rho }}} + {{{{{{{\mathscr {L}}}}}^{\mu }}}_{\mu }}{p^{\mu }}{\mathbb {C}}[f], \end{aligned}$$
(28)

where

$$\begin{aligned} {{{{{{\mathscr {T}}}}}^{{\mu \nu }}}\equiv {{{{{{{\mathscr {L}}}}}^{\mu }}}_{\mu }}{{{{{{{\mathscr {L}}}}}^{\nu }}}_{\nu }}{p^{\mu }}{p^{\nu }}f} \end{aligned}$$
(29)

is the specific particle stress-energy tensor.

As an illustrative example, we specialize Eq. (27) to the case of spherical symmetry, Lagrangian coordinates, and \({\mathscr {O}}(v/c)\) transport, as in Mezzacappa and Bruenn (1993a, 1993b, 1993c). As shown by Cardall and Mezzacappa, Eq. (27) reduces to

$$\begin{aligned}&{\partial \over \partial t}\left( f\over \rho \right) + {\partial \over \partial m}\left( 4\pi r^2\rho \mu \, {f\over \rho }\right) + {1\over \varepsilon ^2}{\partial \over \partial \varepsilon } \left( \varepsilon ^3\left[ \mu ^2\left( {3 v\over r} + {\partial \ln \rho \over \partial t}\right) -{v\over r}\right] {f\over \rho }\right) \nonumber \\&\qquad + {\partial \over \partial \mu }\left( (1-\mu ^2)\left[ {1\over r}+\mu \left( {3 v\over r}+ {\partial \ln \rho \over \partial t}\right) \right] {f\over \rho }\right) ={1\over \rho \, \varepsilon }\,{\mathscr {C}}[f], \end{aligned}$$
(30)

in agreement with the conservative formulation of the Boltzmann equation used in Mezzacappa and Bruenn (1993a, 1993b, 1993c). In spherical symmetry and to \({\mathscr {O}}(v/c)\) one can arrive at a manifestly conservative form of the Boltzmann equation through trial and error. However, in three dimensions and with full general relativity, such trial and error approaches are doomed to failure. A manifestly conservative starting point becomes paramount.

4.4 The 3 + 1 formulation of general relativity

The fundamental building blocks of the “3 + 1” formulation of general relativity are the spacelike hypersurfaces corresponding to surfaces of constant \(\tau \), where \(\tau \) is some scalar function of the spacetime coordinates \(x^\mu \): \(\tau =\tau (x^0,x^1,x^2,x^3)\). It is natural to choose \(\tau \) to be \(x^0=t\). The spacelike hypersurfaces, \(\varSigma _t\), are threaded by a timelike congruence of constant-spatial-coordinate curves. The points of constant \(x^i(t)\) between two hypersurfaces separated by dt are connected by the four-vector t. At each point of the hypersurface \(\varSigma _t\), there is a unit timelike normal four-vector n satisfying \(n_\mu n^\mu = -1\). n corresponds to the four-velocity of the observer at rest with respect to the hypersurface. This is the generalization of the definition of the Eulerian observer familiar from non-relativistic formulations. The four-vector \(\beta \), known as the “shift” vector, describes how the spatial coordinates move within each hypersurface. The proper time between two hypersurfaces \(\varSigma _t\) and \(\varSigma _{t+dt}\) is given by \(\alpha dt\). \(\alpha \) is known as the “lapse” function. Given such a foliation of spacetime and such a coordinatization, the squared spacetime line element becomes

$$\begin{aligned} ds^{2} = - (\alpha ^{2}-\beta _{i}\beta ^{i})dt^{2} + 2\beta _{i}dx^{i}dt+\gamma _{ij}dx^{i}dx^{j}, \end{aligned}$$
(31)

where \(\gamma _{ij}\) is the metric on the hypersurface \(\varSigma _t\). From Eq. (31), the spacetime metric can be read off as

$$\begin{aligned} g_{\mu \nu } = \left( \begin{array}{cc} -\alpha ^{2}+\beta _{i}\beta ^{i} &{} \beta _{i} \\ \beta _{i} &{} \gamma _{ij} \end{array} \right) , \end{aligned}$$
(32)

whose determinant g can be computed directly to find \(\sqrt{-g}=\alpha \,\sqrt{\gamma }\), where \(\gamma \) is the determinant of the spatial metric.

In addition to the intrinsic geometry—specifically, the intrinsic curvature—of each spacelike hypersurface, which is determined by the metric \(\gamma _{ij}\), we describe how such a hypersurface is embedded in the four-dimensional spacetime by its extrinsic curvature, \({\mathsf {K}}_{ij}\), which is related to the three-metric by

$$\begin{aligned} \partial _{t}\gamma _{ij}=-2\alpha {\mathsf {K}}_{ij}+D_{i}\beta _{j}+D_{j}\beta _{i}. \end{aligned}$$
(33)

Here \(D_{i}\) corresponds to the covariant derivative on \(\varSigma _{t}\) corresponding to the Levi–Civita connection associated with \(\gamma _{ij}\). We can regard the coordinates of this formulation as the metric components, \(\gamma _{ij}\), and the components of the extrinsic curvature, \({\mathsf {K}}_{ij}\), as the velocities. The dynamics is supplied by the Einstein equations, which provide the following evolution equations for the six independent components of \({\mathsf {K}}_{ij}\):

$$\begin{aligned} \partial _{t}{\mathsf {K}}_{ij}= & {} -D_{i}D_{j}\alpha +\beta ^{k}\partial _{k}{\mathsf {K}}_{ij}+{\mathsf {K}}_{ik}\partial _{j}\beta ^{k}+{\mathsf {K}}_{kj}\partial _{i}\beta ^{k} \nonumber \\&+ \alpha \left( ^{(3)}R_{ij}+{\mathsf {K}}{\mathsf {K}}_{ij}-2{\mathsf {K}}_{ik}{\mathsf {K}}^{k}_{j} \right) +4\pi \alpha [\gamma _{ij} (S-E) - 2S_{ij}], \end{aligned}$$
(34)

where \({\mathsf {K}}\) is the trace of the extrinsic curvature tensor, and \(^{(3)}R_{ij}\) is the Ricci curvature tensor for the spacelike hypersurface. The source terms in Eq. (34) are given in terms of the stress–energy tensor, \(T_{\alpha \beta }\), by

$$\begin{aligned} S_{\mu \nu }= & {} {\gamma ^{\alpha }}_{\mu }{\gamma ^{\beta }}_{\nu }T_{\alpha \beta }, \end{aligned}$$
(35)
$$\begin{aligned} S_{\mu }= & {} -{\gamma ^{\alpha }}_{\mu }n^{\beta }T_{\alpha \beta }, \end{aligned}$$
(36)
$$\begin{aligned} S= & {} {S^{\mu }}_{\mu }, \end{aligned}$$
(37)
$$\begin{aligned} E= & {} n^{\alpha }n^{\beta }T_{\alpha \beta }, \end{aligned}$$
(38)

where

$$\begin{aligned} n^{\mu } = \frac{1}{\alpha }(1,-\beta ^{i}) \quad \text {with}\quad n_{\mu } = (-\alpha ,0) \end{aligned}$$
(39)

and

$$\begin{aligned} \gamma ^{\alpha }_{\mu }=\delta ^{\alpha }_{\mu }+n^{\alpha }\,n_{\mu } \end{aligned}$$
(40)

provide timelike and spacelike projections, respectively. While not drawn here, there is a corresponding spacelike hypersurface to which the fluid four-velocity

$$\begin{aligned} u^{\mu } = W\,(\,n^{\mu }+v^{\mu }\,) \end{aligned}$$
(41)

is the unit timelike normal, which defines the timelike basis element of the orthonormal frame of reference of the inertial observer instantaneously comoving with the fluid and at rest with respect to the hypersurface. This is our generalized Lagrangian observer in this formalism. The projection into the slice defined by the normal \(u^{\mu }\) is given by

$$\begin{aligned} h^{\alpha }_{\mu }=\delta ^{\alpha }_{\mu }+u^{\alpha }\,u_{\mu }. \end{aligned}$$
(42)

Here, \(W=-n_{\mu }u^{\mu }\) is the Lorentz factor and \(v^{\mu }=(\gamma ^{\mu }_{\nu }u^{\nu })/W\) the fluid three-velocity.

4.5 3 + 1 general relativistic hydrodynamics

The 3 + 1 slicing of spacetime allows us to formulate the radiation-hydrodynamics equations in a form suitable for numerical solution. Here we briefly summarize the 3 + 1 form of the hydrodynamics equations given by Eqs. (4)–(6) (see, e.g., Anile 1989; Rezzolla and Zanotti 2013 for details). The mass conservation equation [cf. Eq. (4)] becomes

$$\begin{aligned} \frac{1}{\alpha \sqrt{\gamma }} \big [\, \partial _{t}{}\big (\,\sqrt{\gamma }\,D\,\big ) + \partial _{i}{}\big (\,\sqrt{\gamma }\,D\,\big [\,\alpha \,v^{i}-\beta ^{i}\,\big ]\,\big ) \,\big ] =0, \end{aligned}$$
(43)

where \(D=W\,\rho \), while the electron number conservation equation [cf. Eq. (6)] becomes

$$\begin{aligned} \frac{1}{\alpha \sqrt{\gamma }} \big [\, \partial _{t}{}\big (\,\sqrt{\gamma }\,D\,Y_{e}\,\big ) + \partial _{i}{}\big (\,\sqrt{\gamma }\,D\,Y_{e}\,\big [\,\alpha \,v^{i}-\beta ^{i}\,\big ]\,\big ) \,\big ] =-m_{\text{ B }}\,L. \end{aligned}$$
(44)

Conservative forms of the energy and momentum equations are derived by decomposing Eq. (5) into components relative to the spatial hypersurface. The energy equation becomes

$$\begin{aligned}&\frac{1}{\alpha \sqrt{\gamma }} \big [\, \partial _{t}{}\big (\,\sqrt{\gamma }\,\tau _{\text{ fluid }}\,\big ) +\partial _{i}{}\big (\,\sqrt{\gamma }\,\big [\,\alpha \,(S^{i}-D\,v^{i})-\tau _{\text{ fluid }}\,\beta ^{i}\,\big ]\,\big ) \,\big ] \nonumber \\&\quad =\frac{1}{\alpha }\,\big [\,\alpha \,S^{ik}\,{\mathsf {K}}_{ik}-S^{i}\partial _{i}{\alpha }\,\big ]+n_{\mu }\,G^{\mu }, \end{aligned}$$
(45)

where \(\tau _{\text{ fluid }}=E-D\), \(E=\rho \,h\,W^{2}-p\), \(S^{\mu }=\rho \,h\,W^{2}\,v^{\mu }\), and \(S^{\mu \nu }=\rho \,h\,W^{2}\,v^{\mu }\,v^{\nu }+p\,\gamma ^{\mu \nu }\), while the momentum equation is given by

$$\begin{aligned}&\frac{1}{\alpha \sqrt{\gamma }} \big [\, \partial _{t}{}\big (\,\sqrt{\gamma }\,S_{j}\,\big ) +\partial _{i}{}\big (\,\sqrt{\gamma }\,\big [\,\alpha \,S^{i}_{j}-\beta ^{i}\,S_{j}\,\big ]\,\big ) \,\big ] \nonumber \\&\quad =\frac{1}{\alpha }\,\big [\,S_{i}\,\partial _{j}{\beta ^{i}}+\frac{1}{2}\,\alpha \,S^{ik}\partial _{j}{\gamma _{ik}}-E\,\partial _{j}{\alpha }\,\big ] - \gamma _{j\mu }\,G^{\mu }. \end{aligned}$$
(46)

The source terms modeling lepton and four-momentum exchange due to neutrino–matter interactions (\(-L\), \(-n_{\mu }\,G^{\mu }\), and \(\gamma _{j\mu }\,G^{\mu }\), respectively) will be discussed in detail in Sect. 5.

4.6 The 3 + 1 general relativistic Boltzmann equation

The general relativistic Boltzmann equation in both conservative form and using the spacetime coordinates associated with the 3 + 1 decomposition of spacetime was derived by Cardall et al. (2013a). Essential to the derivation is the recognition that the composite transformation \({\mathscr {L}^{\mu }}_{{\hat{\mu }}}\) can be viewed as the coordinate basis components (\(\mu \)) of the element of the tetrad of four-vectors (\({\hat{\mu }}\)) corresponding to the frame carried by the observer instantaneously comoving with the fluid. The Eulerian decomposition of \({\mathscr {L}^{\mu }}_{{\hat{\mu }}}\) into timelike and spacelike components is

$$\begin{aligned} {\mathscr {L}^{\mu }}_{{\hat{\mu }}}={{{\mathscr {L}}}}_{{\hat{\mu }}}n^{\mu }+{l^{\mu }}_{{\hat{\mu }}}, \end{aligned}$$
(47)

where \({{{\mathscr {L}}}}_{{\hat{\mu }}}\) is the coefficient of the timelike component of the tetrad element (four-vector) designated by \({\hat{\mu }}\), and \({l^{\mu }}_{{\hat{\mu }}}\) is the spacelike component of this tetrad element. Explicit expressions for \({{{\mathscr {L}}}}_{{\hat{\mu }}}\) and \({l^{\mu }}_{{\hat{\mu }}}\) can be found in Cardall et al. (2013a). The Ricci Rotation Coefficients can be expressed as

$$\begin{aligned} {\varGamma ^{{\hat{\rho }}}}_{{\hat{\nu }}{\hat{\mu }}}={\mathscr {L}^{{\hat{\rho }}}}_{\nu } {\mathscr {L}^{\mu }}_{{\hat{\mu }}}\nabla _{\mu } {\mathscr {L}^{\nu }}_{{\hat{\nu }}}. \end{aligned}$$
(48)

Using the decomposition (47), we are left with three terms to evaluate:

$$\begin{aligned} {\mathscr {L}^{{\hat{\rho }}}}_{\nu } {\mathscr {L}^{\mu }}_{{\hat{\mu }}} \left( \mathcal{L}_{{\hat{\nu }}}\nabla _{\mu }n^{\nu } +n^{\nu }\nabla _{\mu }\mathcal{L}_{{\hat{\nu }}} +\nabla _{\mu }{l^{\nu }}_{{\hat{\nu }}} \right) . \end{aligned}$$
(49)

The results can be found in Cardall et al. (2013a). With the decomposition of the momentum-space transformation matrix \({P^{{\tilde{i}}}}_{{\hat{i}}}\) into elements parallel and perpendicular to the three-momentum \(p^{{\hat{i}}}\),

$$\begin{aligned} {P^{{\tilde{i}}}}_{{\hat{i}}} = \frac{Q^{{\tilde{i}}}p_{{\hat{i}}}}{p} + {U^{{\tilde{i}}}}_{{\hat{i}}}, \end{aligned}$$
(50)

with

$$\begin{aligned} Q^{{\tilde{i}}}= & {} \frac{{P^{{\tilde{i}}}}_{{\hat{i}}} p^{{\hat{i}}}}{p}, \end{aligned}$$
(51)
$$\begin{aligned} p= & {} \sqrt{p^{{\hat{i}}}p_{{\hat{i}}}}, \end{aligned}$$
(52)
$$\begin{aligned} {U^{{\tilde{i}}}}_{{\hat{i}}}= & {} {P^{{\tilde{i}}}}_{{\hat{j}}} {k^{{\hat{j}}}}_{{\hat{i}}}, \end{aligned}$$
(53)
$$\begin{aligned} {k^{{\hat{j}}}}_{{\hat{i}}}= & {} {\delta ^{{\hat{j}hat{i}}}} + \frac{p^{{\hat{j}}}p_{{\hat{i}}}}{p^2}. \end{aligned}$$
(54)

The 3 + 1 general relativistic Boltzmann equation can now be written as

$$\begin{aligned} S_N + M_N = C[f], \end{aligned}$$
(55)

where the spacetime divergence is

$$\begin{aligned} S_N = \frac{\left( -p_{{{\hat{0}}}} \right) }{\alpha \sqrt{\gamma }}\left[ \frac{\partial \left( D_N \right) }{\partial t} + \frac{\partial \left( F_N \right) ^i }{\partial x^i} \right] , \end{aligned}$$
(56)

with

$$\begin{aligned} D_N= & {} \frac{\sqrt{\gamma }}{\left( -p_{{{\hat{0}}}} \right) } \, {\mathscr {L}}_{{{\hat{\mu }}}} \, p^{{{\hat{\mu }}}} f, \end{aligned}$$
(57)
$$\begin{aligned} \left( F_N \right) ^i= & {} \frac{\sqrt{\gamma }}{\left( -p_{{{\hat{0}}}} \right) } \left( \alpha \, {\ell ^i}_{{{\hat{\mu }}}} - \beta ^i {\mathscr {L}}_{{{\hat{\mu }}}} \right) p^{{{\hat{\mu }}}} f. \end{aligned}$$
(58)

\(D_N\) and \(\left( F_N \right) ^i \) are, respectively, the conserved number density and number flux. The momentum-space divergence, \(M_N\), can be expressed as

$$\begin{aligned} M_N= & {} \frac{1}{\alpha \sqrt{\gamma }} \frac{\left( -p_{{{\hat{0}}}} \right) }{\sqrt{\lambda }} \frac{\partial }{\partial p^{{{\tilde{\imath }}}}} \left\{ \sqrt{\lambda } \, \frac{Q^{{{\tilde{\imath }}}} \left( -p_{{{\hat{0}}}}\right) }{p}\! \left[ \left( R_N \right) ^{{{\hat{0}}}} + \left( O_N \right) ^{{{\hat{0}}}} \right] \right. \nonumber \\&\left. + \sqrt{\lambda } \, {U^{{{\tilde{\imath }}}}}_{{{\hat{\imath }}}} \left[ \left( R_N \right) ^{{{\hat{\imath }}}} + \left( O_N \right) ^{\hat{\imath }} \right] \right\} , \end{aligned}$$
(59)

where

$$\begin{aligned} \left( R_N \right) ^{{{\hat{\rho }}}}= & {} \frac{\alpha \sqrt{\gamma }}{\left( -p_{{{\hat{0}}}} \right) }\, p^{{{\hat{\nu }}}} p^{{{\hat{\mu }}}} f \nonumber \\&\times \left[ {\mathscr {L}}^{{{\hat{\rho }}}}\, {\ell ^j}_{{{\hat{\nu }}}} \left( \frac{ {\mathscr {L}}_{{{\hat{\mu }}}} }{\alpha } \frac{\partial \alpha }{\partial x^j} - {\ell ^k}_{{{\hat{\mu }}}} \, {\mathsf {K}}_{jk} \right) \right. \nonumber \\&\left. - {\ell ^{{{\hat{\rho }}} j}} \! \left( \! \frac{{\mathscr {L}}_{{{\hat{\nu }}}} {\mathscr {L}}_{{{\hat{\mu }}}} }{\alpha } \frac{\partial \alpha }{\partial x^j} \!-\! \frac{\ell _{k{{\hat{\nu }}}} \, {\mathscr {L}}_{{{\hat{\mu }}}}}{\alpha } \frac{\partial \beta ^k}{\partial x^j} \!-\! \frac{{\ell ^k}_{{{\hat{\nu }}}} \,{\ell ^i}_{{{\hat{\mu }}}}}{2} \frac{\partial \gamma _{ki} }{\partial x^j} \!\right) \!\right] \! \end{aligned}$$
(60)

describes momentum shifts (i.e., redshift and angular aberration in momentum-space spherical coordinates) due to gravity as embodied in the spacetime geometry,

$$\begin{aligned} \left( O_N \right) ^{{{\hat{\rho }}}}= & {} \frac{\sqrt{\gamma }}{\left( -p_{{{\hat{0}}}} \right) } \, p^{{{\hat{\nu }}}} p^{{{\hat{\mu }}}} f \nonumber \\&\times \left\{ {\mathscr {L}}^{{{\hat{\rho }}}} \left[ {\mathscr {L}}_{{{\hat{\mu }}}} \frac{\partial {\mathscr {L}}_{{{\hat{\nu }}}}}{\partial t} + \left( \alpha \, {\ell ^j}_{{{\hat{\mu }}}} - \beta ^j {\mathscr {L}}_{{{\hat{\mu }}}}\right) \frac{\partial {\mathscr {L}}_{{{\hat{\nu }}}}}{\partial x^j} \right] \right. \nonumber \\&\left. - \ell ^{{{\hat{\rho }}} k} \left[ {\mathscr {L}}_{{{\hat{\mu }}}} \frac{\partial \ell _{k {{\hat{\nu }}}}}{\partial t} + \left( \alpha \, {\ell ^j}_{{{\hat{\mu }}}} - \beta ^j {\mathscr {L}}_{{{\hat{\mu }}}}\right) \frac{\partial \ell _{k {{\hat{\nu }}}}}{\partial x^j} \right] \right\} \end{aligned}$$
(61)

are ‘observer corrections’ due to the acceleration of the fluid and, consequently, changing comoving observers with different velocities (and partially entangled with the geometry as well), and

$$\begin{aligned} \sqrt{\lambda }=\left\| \det \left[ \frac{\partial \mathbf{p}}{\partial \mathbf{u}}\right] \right\| . \end{aligned}$$
(62)

4.7 Multi-frequency moment kinetics and the closure problem

Because of the prohibitively high computational cost associated with solving the Boltzmann equation with sufficient phase-space resolution, most (all in three spatial dimensions) supernova models to date employ a moments approach to neutrino transport. In the moments approach, one solves for a finite number of moments of the distribution function (instead of the distribution function), and the hierarchy of moment equations is closed by a closure procedure, relating higher-order moments to the evolved lower-order moments.

The basic idea of the moments approach can be illustrated by considering the Boltzmann equation in one spatial dimension

$$\begin{aligned} \partial _{t}{f}+\mu \partial _{x}{f} = \chi \,(f_{0}-f) + \sigma \,(\langle f \rangle -f), \end{aligned}$$
(63)

where, for simplicity, we let the distribution function depend on spatial position, x, momentum-space angle cosine, \(\mu \), and time, t. \(\chi \) is the absorption opacity, \(f_{0}\) is the isotropic equilibrium distribution, and \(\sigma \) is the scattering opacity due to isotropic and isoenergetic scattering. A finite number (\(N+1\)) of angular moments of the distribution function can be formed as weighted integrals over angle:

$$\begin{aligned} m^{(k)}(x,t)=\langle \,f,\,\mu ^{k}\,\rangle \equiv \frac{1}{2}\int _{-1}^{1}f(\mu ,x,t)\,\mu ^{k}\,d\mu ,\quad k=0,1,\ldots ,N. \end{aligned}$$
(64)

Thus, in a truncated moments approach the distribution function is approximated by the moments vector

$$\begin{aligned} {\mathbf {m}}_{N}=\big (\,m^{(0)},m^{(1)},\ldots ,m^{(N)}\,\big )^{T} \end{aligned}$$
(65)

so that

$$\begin{aligned} f(\mu ,x,t) \approx \sum _{k=0}^{N}c^{(k)}\,m^{(k)}(x,t)\,\mu ^{k}, \end{aligned}$$
(66)

where \(c^{(k)}\) are normalization constants. Similarly, by taking moments of the Boltzmann equation in Eq. (63), the hierarchy of moment equations is given by

$$\begin{aligned} \partial _{t}{m^{(0)}}+\partial _{x}{m^{(1)}}&=\chi \,(f_{0} - m^{(0)}), \end{aligned}$$
(67)
$$\begin{aligned} \partial _{t}{m^{(k)}}+\partial _{x}{m^{(k+1)}}&= \chi \,(\langle f_{0},\mu ^{k} \rangle -m^{(k)}) + \sigma \,(m_{0}^{(k)}-m^{(k)}), \quad \text {for}\quad k>0, \end{aligned}$$
(68)

where on the right-hand sides we have defined

$$\begin{aligned} \langle f_{0},\mu ^{k} \rangle =f_{0}\,\frac{[1+(-1)^{k}]}{2\,(k+1)} \quad \text{ and }\quad m_{0}^{(k)}=m^{(0)}\,\frac{[1+(-1)^{k}]}{2\,(k+1)}. \end{aligned}$$
(69)

When considering the expansion in Eq. (66), the moments approach is simply an approximation to the angular dependence of the distribution function in terms of the monomial basis \(\{\mu _{k}\}_{k=0}^{N}\). The power of the moments approach becomes evident when collisions are moderate to strong. In this case, collisions tend to drive the zeroth moment \(m^{(0)}\) towards the isotropic distribution \(f_{0}\), the higher-order odd moments decay exponentially to zero (\(m^{(k)}\rightarrow 0\); k odd), and the higher-order even moments tend to \(m^{(k)}\rightarrow m^{(0)}/(k+1)\) (k even). Thus, the angular dependence of the distribution is captured well by only a few moments. In the absence of collisions, more moments are typically needed to capture the angular shape of the distribution function. Note in particular that in Eq. (68), the equation for the k-th moment contains the \(k+1\)-th moment. Thus, in a truncated moment model based on \(N+1\) moments, \({\mathbf {m}}_{N}\), the equation for \(m^{(N)}\) contains the moment \(m^{(N+1)}\), which must be related to the lower order moments by a closure procedure—i.e., \(m^{(N+1)}:=g({\mathbf {m}}_{N})\)—in order to form a closed system of equations. This is referred to as the closure problem. Typically, the closure function g is a nonlinear function of \({\mathbf {m}}_{N}\), which can make the construction of numerical methods for moment models more difficult. There are several challenges associated with the construction of closures for moment hierarchies (see, e.g., Levermore 1996), one being the construction of closures that preserve the hyperbolic character of the system of moment equations; see, e.g., Pons et al. (2000), for a discussion of this topic in the context of two-moment models. In the remainder of this section, we will discuss relativistic two-moment models (\(N=1\) in the simpler formalism above). In the multi-dimensional setting, the two-moment model evolves four unknowns (e.g., the energy density and three components of the momentum density), and, in the relativistic setting considered here, second and third moments appear in the equations for the first moments.

Conservative, 3 + 1 general relativistic, multi-frequency (or multi-energy) two-moment formalisms have been developed by Shibata et al. (2011) and Cardall et al. (2013b). The formalism of Shibata et al. (2011) is based on the formalism of Thorne (1981), while the formalism of Cardall et al. (2013b) starts out with the conservative formulation of kinetic theory from Cardall and Mezzacappa (2003) discussed in Sect. 4.3. Both approaches, of course, lead to the same result, which we summarize here.

Covariant expressions for the first few moments of the distribution function f are given by

$$\begin{aligned} N^{\mu }(x,t)&= \int _{V_{p}} f(p,x,t)\,p^{\mu }\,\pi _{m}, \end{aligned}$$
(70)
$$\begin{aligned} T^{\mu \nu }(x,t)&=\int _{V_{p}} f(p,x,t)\,p^{\mu }\,p^{\nu }\,\pi _{m}, \end{aligned}$$
(71)
$$\begin{aligned} Q^{\mu \nu \rho }(x,t)&=\int _{V_{p}} f(p,x,t)\,p^{\mu }\,p^{\nu }\,p^{\rho }\,\pi _{m}, \end{aligned}$$
(72)

where \(N^{\mu }\) is the four-current density, \(T^{\mu \nu }\) the stress-energy tensor, and the rank three tensor of moments \(Q^{\mu \nu \rho }\) is sometimes referred to as the tensor of fluxes or heat flux tensor. When expressed in terms of comoving frame spherical-polar momentum coordinates \((\varepsilon ,\vartheta ,\varphi )\), the invariant momentum-space 3-volume in Eq. (24) is

$$\begin{aligned} \pi _{m} = \varepsilon \,\sin \vartheta \,d\vartheta \,d\varphi \,d\varepsilon . \end{aligned}$$
(73)

Higher-order moments can be constructed similarly in a straightforward way, but we will limit the discussion to moment models involving the moments in Eqs. (70)–(72). Note that the moments defined above depend only on position x and time t. However, because neutrino heating and cooling rates are sensitive to the neutrino energy (cf. Sect. 2), supernova models based on moment descriptions for neutrino transport retain the energy dimension and solve for angular moments, or spectral moments, defined by

$$\begin{aligned} {\mathscr {N}}^{\mu }(\varepsilon ,x,t)&=\frac{1}{4\pi }\int _{{\mathbb {S}}^{2}}f\,p^{\mu }\,\frac{d\omega }{\varepsilon }, \end{aligned}$$
(74)
$$\begin{aligned} {\mathscr {T}}^{\mu \nu }(\varepsilon ,x,t)&=\frac{1}{4\pi }\int _{{\mathbb {S}}^{2}}f\,p^{\mu }\,p^{\nu }\,\frac{d\omega }{\varepsilon }, \end{aligned}$$
(75)
$$\begin{aligned} {\mathscr {Q}}^{\mu \nu \rho }(\varepsilon ,x,t)&=\frac{1}{4\pi }\int _{{\mathbb {S}}^{2}}f\,p^{\mu }\,p^{\nu }\,p^{\rho }\,\frac{d\omega }{\varepsilon }, \end{aligned}$$
(76)

where \(d\omega =\sin \vartheta d\vartheta d\varphi \) and the integrals extend over the sphere

$$\begin{aligned} {\mathbb {S}}^{2} = \big \{\,\omega \in (\vartheta ,\varphi )\,|\,\vartheta \in [0,\pi ],\,\varphi \in [0,2\pi )\,\big \}, \end{aligned}$$
(77)

where \(\vartheta \) and \(\varphi \) are momentum-space angular coordinates. The angular moments defined in Eqs. (74)–(76) depend on the neutrino energy, \(\varepsilon \), position, x, and time, t. They are related to the moments in Eqs. (70)–(72) by the integral over energy

$$\begin{aligned} \big \{\,N^{\mu },\,T^{\mu \nu },\,Q^{\mu \nu \rho }\,\big \}(x,t) =\int _{0}^{\infty }\big \{\,{\mathscr {N}}^{\mu },\,{\mathscr {T}}^{\mu \nu },\,{\mathscr {Q}}^{\mu \nu \rho }\,\big \}(\varepsilon ,x,t)\,dV_{\varepsilon }, \end{aligned}$$
(78)

where the infinitesimal energy-space shell-volume element is \(dV_{\varepsilon }=4\pi \varepsilon ^{2}d\varepsilon \). In forming the angular moments we have used the freedom in choosing distinct spacetime and momentum space coordinates: x and t are spacetime coordinates in a global coordinate basis, while \(\{\varepsilon ,\vartheta ,\varphi \}\) are momentum coordinates in a comoving basis.

Moment equations governing the evolution of the angular moments are derived from the general relativistic Boltzmann equation discussed in Sect. 4.3. Since current supernova modelers employing angular moment models use either a flux-limited diffusion (one-moment) or a two-moment approach, we will limit the discussion to these approaches. In this context, we will need evolution equations for the spectral neutrino number density, energy density, and three-momentum density. The evolution equation for the neutrino number density is obtained by multiplying Eq. (27) by \(1/(4\pi \varepsilon )\) and integrating over \({\mathbb {S}}^{2}\):

$$\begin{aligned} \nabla _{\nu }{\mathscr {N}}^{\nu } -\frac{1}{\varepsilon ^{2}}\frac{\partial }{\partial \varepsilon }\big (\,\varepsilon ^{2}\,{\mathscr {T}}^{\mu \nu }\,\nabla _{\mu }u_{\nu }\,\big ) =\frac{1}{4\pi }\int _{{\mathbb {S}}^{2}}{\mathscr {C}}[f]\,\frac{d\omega }{\varepsilon }, \end{aligned}$$
(79)

where \(u_{\nu }\) is the four-velocity of the observer measuring neutrino energy \(\varepsilon \) (i.e., the comoving observer). Note that the left-hand side of Eq. (79) is in divergence form, and the use of spherical momentum-space coordinates is apparent from the form of the second term. Integrating over energy (\(dV_{\varepsilon }\)) gives rise to the balance equation

$$\begin{aligned} \nabla _{\nu }N^{\nu } = \int _{V_{p}}{\mathscr {C}}[f]\,\pi _{m}, \end{aligned}$$
(80)

where the left-hand side is in conservative form. The right-hand side gives rise to lepton exchange sources and sinks due to neutrino–matter interactions (e.g., emission and absorption). In a similar manner, conservative evolution equations for the neutrino four-momentum are obtained by multiplying the four-momentum conservative Boltzmann equation in Eq. (28) by \(1/(4\pi \varepsilon )\) and integrating over \({\mathbb {S}}^{2}\):

$$\begin{aligned} \nabla _{\nu }{\mathscr {T}}^{\mu \nu } -\frac{1}{\varepsilon ^{2}}\frac{\partial }{\partial \varepsilon }\big (\,\varepsilon ^{2}\,{\mathscr {Q}}^{\mu \nu \rho }\,\nabla _{\nu }u_{\rho }\,\big ) =\frac{1}{4\pi }\int _{{\mathbb {S}}^{2}}{\mathscr {C}}[f]\,p^{\mu }\,\frac{d\omega }{\varepsilon }. \end{aligned}$$
(81)

Again, integrating this equation over energy results in the balance equation

$$\begin{aligned} \nabla _{\nu }T^{\mu \nu } = \int _{V_{p}}{\mathscr {C}}[f]\,p^{\mu }\,\pi _{m}, \end{aligned}$$
(82)

where the left-hand side is in conservative form, and the right-hand side gives rise to four-momentum exchange with the fluid.

Equation (81) forms a basis for the two-moment model for neutrino transport. Since neutrinos exchange lepton number and four-momentum with the fluid, Eq. (79) needs to be considered, as well. However, these equations are not independent. Due to the relations [obvious from the definitions in Eqs. (74)–(76)]

$$\begin{aligned} {\mathscr {N}}^{\nu } = -\frac{u_{\mu }}{\varepsilon }\,{\mathscr {T}}^{\mu \nu } \quad \text {and}\quad {\mathscr {T}}^{\nu \rho } = -\frac{u_{\mu }}{\varepsilon }\,{\mathscr {Q}}^{\mu \nu \rho }, \end{aligned}$$
(83)

Equations (79) and (81) are related in a similar way: Eq. (79) can be obtained from Eq. (81) by a contraction with \(-u_{\mu }/\varepsilon \). In a numerical implementation targeting both lepton and four-momentum exchange between neutrinos and the stellar fluid, such consistency is desirable since the numerical method then preserves a critical structure of the moment system. In the following, we provide versions of the two-moment model in the 3 + 1 framework of general relativity. Before we delve into the details, we briefly discuss two useful decompositions of the angular moments.

4.7.1 Lagrangian decompositions

With comoving frame four-momentum coordinates, Lagrangian decompositions of tensors is a natural way to express the angular moments in Eqs. (74)–(76) in terms of elementary moments of the distribution function. This is achieved with the Lagrangian decomposition of the particle four-momentum

$$\begin{aligned} p^{\mu } = \varepsilon \,(\,u^{\mu }+\ell ^{\mu }\,), \end{aligned}$$
(84)

where \(u^{\mu }\) is the four-velocity of the Lagrangian observer, and \(\ell ^{\mu }\) is a unit four-vector orthogonal to \(u^{\mu }\); i.e., \(\ell _{\mu }\ell ^{\mu }=1\) and \(u_{\mu }\ell ^{\mu }=0\). Then, \(\varepsilon =-u_{\mu }p^{\mu }\) is the neutrino energy measured by the Lagrangian observer. In terms of the composite transformation of the neutrino four-momentum, \(p^{\mu }={\mathscr {L}}^{\mu }_{{\hat{\mu }}}p^{{\hat{\mu }}}=\varepsilon \big ({\mathscr {L}}^{\mu }_{{\hat{0}}}+{\mathscr {L}}^{\mu }_{{\hat{\imath }}}\,\ell ^{{\hat{\imath }}}\big )\), a comparison with Eq. (84) implies that \({\mathscr {L}}^{\mu }_{{\hat{0}}}=u^{\mu }\) and \(\ell ^{\mu }={\mathscr {L}}^{\mu }_{{\hat{\imath }}}\,\ell ^{{\hat{\imath }}}\), where

$$\begin{aligned} \ell ^{{\hat{\imath }}} = \big \{\,\cos \vartheta ,\,\sin \vartheta \cos \varphi ,\,\sin \vartheta \sin \varphi \,\big \} \end{aligned}$$
(85)

are components of the spatial unit vector in the orthonormal comoving frame. (See Sect. 4.3 for the definition of \({\mathscr {L}}^{\mu }_{{\hat{\mu }}}\).) Inserting Eq. (84) into Eq. (74) results in the Lagrangian decomposition of the spectral neutrino four-current density

$$\begin{aligned} {\mathscr {N}}^{\mu } = {\mathscr {D}}\,u^{\mu } + {\mathscr {I}}^{\mu }, \end{aligned}$$
(86)

where the angular moments

$$\begin{aligned} \big \{{\mathscr {D}},{\mathscr {I}}^{\mu }\big \}(\varepsilon ,x,t) = \frac{1}{4\pi }\int _{{\mathbb {S}}^{2}}f(\omega ,\varepsilon ,x,t)\,\big \{\,1,\,\ell ^{\mu }\,\big \}\,d\omega \end{aligned}$$
(87)

are the comoving spectral number density and number flux, respectively. Using the fluid four-velocity \(u^{\mu }\) and the projector in Eq. (42), these components are obtained from \({\mathscr {D}}=-u_{\mu }{\mathscr {N}}^{\mu }\) and \({\mathscr {I}}^{\mu }=h^{\mu }_{\nu }{\mathscr {I}}^{\nu }\). The moments in Eq. (87) are the most elementary in the moment hierarchy, and for the two-moment model, these are used in the closure procedure to determine the higher-order moments in terms of \({\mathscr {D}}\) and \({\mathscr {I}}^{\mu }\). Note that for an isotropic distribution function \(f=f_{0}\) (where \(f_{0}\) is independent of \(\omega \)), \({\mathscr {D}}=f_{0}\) and \({\mathscr {I}}^{\mu }=0\).

In a similar way, using Eq. (84) in Eq. (75), the Lagrangian decomposition of the stress-energy tensor is given by

$$\begin{aligned} {\mathscr {T}}^{\mu \nu } = {\mathscr {J}}\,u^{\mu }\,u^{\nu } + {\mathscr {H}}^{\mu }\,u^{\nu } + u^{\mu }\,{\mathscr {H}}^{\nu } + {\mathscr {K}}^{\mu \nu }, \end{aligned}$$
(88)

where

$$\begin{aligned} \big \{\,{\mathscr {J}},\,{\mathscr {H}}^{\mu },\,{\mathscr {K}}^{\mu \nu }\,\big \}(\varepsilon ,x,t) = \frac{\varepsilon }{4\pi }\int _{{\mathbb {S}}^{2}}f(\omega ,\varepsilon ,x,t)\,\big \{1,\,\ell ^{\mu },\,\ell ^{\mu }\ell ^{\nu }\,\big \}\,d\omega , \end{aligned}$$
(89)

and \({\mathscr {H}}^{\mu }\) and \({\mathscr {K}}^{\mu \nu }\) are orthogonal to \(u_{\mu }\) (spacelike in the comoving frame); i.e., \(u_{\mu }{\mathscr {H}}^{\mu }=u_{\mu }{\mathscr {K}}^{\mu \nu }=u_{\nu }{\mathscr {K}}^{\mu \nu }=0\). In Eq. (89), \({\mathscr {J}}\), \({\mathscr {H}}^{\mu }\), and \({\mathscr {K}}^{\mu \nu }\) are respectively the spectral energy density, momentum density, and stress measured by a Lagrangian observer. The four-velocity \(u_{\mu }\) and the associated orthogonal projector \(h_{\mu \nu }\) are used to extract components of the Lagrangian decompositions of \({\mathscr {T}}^{\mu \nu }\):

$$\begin{aligned} {\mathscr {J}} = u_{\mu }\,u_{\nu }\,{\mathscr {T}}^{\mu \nu }, \quad {\mathscr {H}}^{\mu } =-u_{\nu }\,h^{\mu }_{\rho }\,{\mathscr {T}}^{\nu \rho }, \quad \text {and}\quad {\mathscr {K}}^{\mu \nu } =h^{\mu }_{\rho }\,h^{\nu }_{\sigma }\,{\mathscr {T}}^{\rho \sigma }. \end{aligned}$$
(90)

Note that the Lagrangian energy density and momentum density are related to the number density and flux by a factor \(\varepsilon \); i.e.,

$$\begin{aligned} \big \{\,{\mathscr {J}},\,{\mathscr {H}}^{\mu }\,\big \} = \varepsilon \,\big \{\,{\mathscr {D}},\,{\mathscr {I}}^{\mu }\,\big \}. \end{aligned}$$
(91)

Finally, a Lagrangian decomposition of the rank-three tensor in Eq. (76) gives

$$\begin{aligned} {\mathscr {Q}}^{\mu \nu \rho }&= \varepsilon \,\big (\, {\mathscr {J}}\,u^{\mu }\,u^{\nu }\,u^{\rho } + {\mathscr {H}}^{\mu }\,u^{\nu }\,u^{\rho } + {\mathscr {H}}^{\nu }\,u^{\mu }\,u^{\rho } + {\mathscr {H}}^{\rho }\,u^{\mu }\,u^{\nu } \nonumber \\&\quad + {\mathscr {K}}^{\mu \nu }\,u^{\rho } +{\mathscr {K}}^{\mu \rho }\,u^{\nu } +{\mathscr {K}}^{\nu \rho }\,u^{\mu } + {\mathscr {L}}^{\mu \nu \rho } \,\big ), \end{aligned}$$
(92)

where the spectral rank-three tensor measured by a Lagrangian observer,

$$\begin{aligned} {\mathscr {L}}^{\mu \nu \rho }(\varepsilon ,x,t) = \frac{\varepsilon }{4\pi }\int _{{\mathbb {S}}^{2}}f(\omega ,\varepsilon ,x,t)\,\ell ^{\mu }\ell ^{\nu }\ell ^{\rho }\,d\omega , \end{aligned}$$
(93)

is orthogonal to \(u_{\mu }\)—i.e., \(u_{\mu }{\mathscr {L}}^{\mu \nu \rho }=u_{\nu }{\mathscr {L}}^{\mu \nu \rho }=u_{\rho }{\mathscr {L}}^{\mu \nu \rho }=0\)—and is obtained from \({\mathscr {Q}}^{\mu \nu \rho }\) using the orthogonal projector:

$$\begin{aligned} {\mathscr {L}}^{\mu \nu \rho } = \frac{1}{\varepsilon }\,h^{\mu }_{\sigma }\,h^{\nu }_{\kappa }\,h^{\rho }_{\lambda }\,{\mathscr {Q}}^{\sigma \kappa \lambda }. \end{aligned}$$
(94)

4.7.2 Eulerian decompositions

Eulerian projections of tensors are particularly useful when deriving evolution equations in the context of moment models for neutrino transport, as it is the Eulerian number density, energy density, and three-momentum density that are governed by conservation laws. In a manner similar to the Lagrangian decomposition in Eq. (86), the Eulerian decomposition of the spectral number current density is

$$\begin{aligned} {\mathscr {N}}^{\mu } = {\mathscr {N}}\,n^{\mu } + {\mathscr {G}}^{\mu }, \end{aligned}$$
(95)

where \(n_{\mu }\,{\mathscr {G}}^{\mu }=0\). The four-velocity \(n_{\mu }\) and the projector \(\gamma _{\mu \nu }=g_{\mu \nu }+n_{\mu }\,n_{\nu }\) can be used to extract the Eulerian components

$$\begin{aligned} {\mathscr {N}} = -n_{\mu }\,{\mathscr {N}}^{\mu } \quad \text {and}\quad {\mathscr {G}}^{\mu } = \gamma ^{\mu }_{\nu }\,{\mathscr {N}}^{\nu }, \end{aligned}$$
(96)

where \({\mathscr {N}}\) and \({\mathscr {G}}^{\mu }\) are the spectral number density and number flux density measured by an Eulerian observer, respectively. Note that \({\mathscr {N}}\) and \({\mathscr {G}}^{\mu }\) are still considered functions of \(\varepsilon \), the neutrino energy measured by a Lagrangian observer. Thus, the definition in Eq. (95) should merely be viewed as a decomposition of \({\mathscr {N}}^{\mu }\) in a different basis than in Eq. (86), not as moments of the distribution with respect to Eulerian momentum coordinates. Inserting the Lagrangian decomposition in Eq. (86) into the expressions in Eq. (96), the Eulerian number density and number flux density are expressed in terms of the Lagrangian number density and number flux density as

$$\begin{aligned} {\mathscr {N}}&= W\,{\mathscr {D}} + v_{\mu }\,{\mathscr {I}}^{\mu }, \end{aligned}$$
(97)
$$\begin{aligned} {\mathscr {G}}^{\mu }&=\big [\,\delta ^{\mu }_{\nu }-n^{\mu }v_{\nu }\,\big ]{\mathscr {I}}^{\nu } + W\,{\mathscr {D}}\,v^{\mu }. \end{aligned}$$
(98)

Similarly, the Eulerian decomposition of the stress-energy tensor is

$$\begin{aligned} {\mathscr {T}}^{\mu \nu } = {\mathscr {E}}\,n^{\mu }\,n^{\nu } + {\mathscr {F}}^{\mu }\,n^{\nu } + n^{\mu }\,{\mathscr {F}}^{\nu } + {\mathscr {S}}^{\mu \nu }, \end{aligned}$$
(99)

where \({\mathscr {E}}\), \({\mathscr {F}}^{\mu }\), and \({\mathscr {S}}^{\mu \nu }\) are respectively the spectral energy density, momentum density, and stress measured by an Eulerian observer. The Eulerian momentum density and stress are spacelike (i.e., \(n_{\mu }{\mathscr {F}}^{\mu }=n_{\mu }{\mathscr {S}}^{\mu \nu }=n_{\nu }{\mathscr {S}}^{\mu \nu }=0\)), and the components of the Eulerian decomposition of \({\mathscr {T}}^{\mu \nu }\) are extracted using \(n_{\mu }\) and the associated orthogonal projector \(\gamma _{\mu \nu }\):

$$\begin{aligned} {\mathscr {E}} = n_{\mu }\,n_{\nu }\,{\mathscr {T}}^{\mu \nu }, \quad {\mathscr {F}}^{\mu } =-n_{\nu }\,\gamma ^{\mu }_{\rho }\,{\mathscr {T}}^{\nu \rho }, \quad {\mathscr {S}}^{\mu \nu } =\gamma ^{\mu }_{\rho }\,\gamma ^{\nu }_{\sigma }\,{\mathscr {T}}^{\rho \sigma }. \end{aligned}$$
(100)

Inserting the Lagrangian decomposition in Eq. (88) into the expressions in Eq. (100), the Eulerian energy density, momentum density, and stress are expressed in terms of the corresponding Lagrangian quantities as [cf. Equations (B8)–(B10) in Cardall et al. 2013b]

$$\begin{aligned} {\mathscr {E}}&=W^{2}{\mathscr {J}} + 2\,W\,v_{\mu }\,{\mathscr {H}}^{\mu } + v_{\mu }\,v_{\nu }\,{\mathscr {K}}^{\mu \nu }, \end{aligned}$$
(101)
$$\begin{aligned} {\mathscr {F}}^{\mu }&=W\,v^{\mu }\,\big (\,W{\mathscr {J}} + v_{\nu }\,{\mathscr {H}}^{\nu }\,\big ) + \big [\,\delta ^{\mu }_{\rho }-n^{\mu }\,v_{\rho }\,\big ]\,\big (\,W{\mathscr {H}}^{\rho }+v_{\nu }{\mathscr {K}}^{\nu \rho }\,\big ), \end{aligned}$$
(102)
$$\begin{aligned} {\mathscr {S}}^{\mu \nu }&=W^{2}{\mathscr {J}}v^{\mu }v^{\nu } + Wv^{\nu }\big [\,\delta ^{\mu }_{\rho }-n^{\mu }v_{\rho }\,\big ]\,{\mathscr {H}}^{\rho } +Wv^{\mu }\big [\,\delta ^{\nu }_{\sigma }-n^{\nu }v_{\sigma }\,\big ]{\mathscr {H}}^{\sigma } \nonumber \\&\quad +\big [\,\delta ^{\mu }_{\rho }-n^{\mu }v_{\rho }\,\big ]\big [\,\delta ^{\nu }_{\sigma }-n^{\nu }v_{\sigma }\,\big ]{\mathscr {K}}^{\rho \sigma }. \end{aligned}$$
(103)

Finally, and similar to Eqs. (95) and (99), the Eulerian decomposition of the rank three tensor in Eq. (76) is given by

$$\begin{aligned} {\mathscr {Q}}^{\mu \nu \rho }&=\varepsilon \,\big (\, {\mathscr {X}}\,n^{\mu }\,n^{\nu }\,n^{\rho } + {\mathscr {Y}}^{\mu }\,n^{\nu }\,n^{\rho } + {\mathscr {Y}}^{\nu }\,n^{\mu }\,n^{\rho } + {\mathscr {Y}}^{\rho }\,n^{\mu }\,n^{\nu } \nonumber \\&\quad + {\mathscr {Z}}^{\mu \nu }\,n^{\rho } +{\mathscr {Z}}^{\mu \rho }\,n^{\nu } +{\mathscr {Z}}^{\nu \rho }\,n^{\mu } + {\mathscr {W}}^{\mu \nu \rho } \,\big ), \end{aligned}$$
(104)

where the Eulerian components are obtained from

$$\begin{aligned} {\mathscr {X}}&= - \frac{1}{\varepsilon }\,n_{\mu }\,n_{\nu }\,n_{\rho }\,{\mathscr {Q}}^{\mu \nu \rho }, \end{aligned}$$
(105)
$$\begin{aligned} {\mathscr {Y}}^{\mu }&=\frac{1}{\varepsilon }\,\gamma ^{\mu }_{\sigma }\,n_{\nu }\,n_{\rho }\,{\mathscr {Q}}^{\sigma \nu \rho }, \end{aligned}$$
(106)
$$\begin{aligned} {\mathscr {Z}}^{\mu \nu }&=-\frac{1}{\varepsilon }\,\gamma ^{\mu }_{\sigma }\,\gamma ^{\nu }_{\kappa }\,n_{\rho }\,{\mathscr {Q}}^{\sigma \kappa \rho }, \end{aligned}$$
(107)
$$\begin{aligned} {\mathscr {W}}^{\mu \nu \rho }&=\frac{1}{\varepsilon }\,\gamma ^{\mu }_{\sigma }\,\gamma ^{\nu }_{\kappa }\,\gamma ^{\rho }_{\lambda }\,{\mathscr {Q}}^{\sigma \kappa \lambda }. \end{aligned}$$
(108)

These components can be expressed in terms of the Lagrangian moments by inserting the Lagrangian decomposition in Eq. (92). We will not repeat these tedious expressions here, but see Eqs. (B15), (B16), (B17), and (B18) in Cardall et al. (2013b) for expressions relating respectively \({\mathscr {Z}}\), \({\mathscr {Y}}_{\mu }\), \({\mathscr {Z}}_{\mu \nu }\), and \({\mathscr {W}}_{\mu \nu \rho }\) in terms of the Lagrangian moments \({\mathscr {J}}\), \({\mathscr {H}}^{\mu }\), \({\mathscr {K}}^{\mu \nu }\), and \({\mathscr {L}}^{\mu \nu \rho }\) (note the difference of the factor of \(\varepsilon \) between our definition of \({\mathscr {Q}}^{\mu \nu \rho }\) and the corresponding variable in Cardall et al. 2013b).

While components of Lagrangian decompositions are more closely related to the distribution function, Eulerian decompositions appear to be more natural to use in the 3 + 1 approach, and powerful in simplifying terms appearing in the moment equations, especially for the energy derivative terms in Eqs. (79) and (81), which contain contractions with the covariant derivative of the fluid four-velocity. As elaborated on in Cardall et al. (2013b), Eulerian decompositions of \({\mathscr {T}}^{\mu \nu }\) and \({\mathscr {Q}}^{\mu \nu \rho }\), in combination with the Eulerian decomposition of \(u^{\mu }\), in Eq. (41) result in surprisingly simple expressions, without explicit reference to connection coefficients [cf. Eq. (13)]. Moreover, as emphasized by Cardall et al. (2013b), consistent use of Eulerian decompositions in spacetime and momentum-space divergences in the moment equations turns out to simplify the elucidation of the relationship between the equations for four-momentum and number conservation in the 3 + 1 case.

4.7.3 Two-moment kinetics

In this section we review two-moment models in the 3 + 1 formulation of general relativity, which can serve as a basis for the development of numerical methods and their implementation in codes to model neutrino transport in core-collapse supernovae. We present three versions, all based on Eq. (81), but using different projections. The projection of Eq. (81) orthogonal and tangential to the spacelike slice of the Eulerian observer (using \(n_{\mu }\) and \(\gamma _{\mu \nu }\)) gives rise to the Eulerian two-moment model, while the projection of Eq. (81) orthogonal and tangential to the spacelike slice of the Lagrangian observer (using \(u_{\mu }\) and \(h_{\mu \nu }\)) gives rise to the Lagrangian two-moment model. We also present a number conservative two-moment model, which is closely related to the Lagrangian two-moment model, but uses projections based on \(u_{\mu }/\varepsilon \) and \(h_{\mu \nu }/\varepsilon \). This results in one of the evolved equations being Eq. (79), which is neutrino number conservative. Analytically, all these formulations are equivalent, but they could have different numerical properties.

Eulerian two-moment model The Eulerian two-moment model evolves the spectral energy density and momentum density measured by an Eulerian observer (\({\mathscr {E}}\) and \({\mathscr {F}}_{j}\), respectively). The energy equation is obtained as the projection of Eq. (81) onto the four-velocity of the Eulerian observer [i.e., contracting \(-n_{\mu }\) with Eq. (81)]. The result is:

$$\begin{aligned}&\frac{1}{\alpha \sqrt{\gamma }} \big [\,\partial _{t}{}\big (\,\sqrt{\gamma }\,{\mathscr {E}}\,\big )+\partial _{i}{}\big (\,\sqrt{\gamma }\,\big [\,\alpha \,{\mathscr {F}}^{i}-\beta ^{i}\,{\mathscr {E}}\,\big ]\,\big )\,\big ] -\frac{1}{\varepsilon ^{2}}\frac{\partial }{\partial \varepsilon }\big (\,\varepsilon ^{2}\,(-n_{\mu })\,{\mathscr {Q}}^{\mu \nu \rho }\,\nabla _{\nu }u_{\rho }\,\big ) \nonumber \\&\quad =\frac{1}{\alpha }\,\big [\,\alpha \,{\mathscr {S}}^{ij}\,{\mathsf {K}}_{ij}-{\mathscr {F}}^{i}\,\partial _{i}{\alpha }\,\big ] +\frac{W}{4\pi }\int _{{\mathbb {S}}^{2}}{\mathscr {C}}(f)\,d\omega +\frac{v^{j}}{4\pi }\int _{{\mathbb {S}}^{2}}{\mathscr {C}}(f)\,\ell _{j}\,d\omega , \end{aligned}$$
(109)

where the sources on the right-hand side are due to spacetime geometry and energy exchange between neutrinos and the fluid. The left-hand side is in divergence form, where the divergence operates on the spacetime-plus-energy phase-space. In expressing the terms inside the energy derivative (last term on the left-hand side), we make use of the Eulerian decomposition in Eq. (104) to write

$$\begin{aligned}&-\frac{n_{\mu }}{\varepsilon }\,{\mathscr {Q}}^{\mu \nu \rho }\,\nabla _{\nu }u_{\rho } \nonumber \\&\quad = \big (\, {\mathscr {X}}\,n^{\nu }\,n^{\rho } + {\mathscr {Y}}^{\nu }\,n^{\rho } + n^{\nu }\,{\mathscr {Y}}^{\rho } + {\mathscr {Z}}^{\nu \rho } \,\big )\,\nabla _{\nu }u_{\rho } \nonumber \\&\quad = \frac{W}{\alpha }\, \Big \{\, \big (\,{\mathscr {Y}}^{i} - {\mathscr {X}}\,v^{i}\,\big )\,\partial _{i}{\alpha } +{\mathscr {Y}}_{k}\,v^{i}\,\partial _{i}{\beta ^{k}} +\alpha \,{\mathscr {Z}}^{ki}\,\left( \,\frac{1}{2}\,v^{m}\,\partial _{m}{\gamma _{ki}} - {\mathsf {K}}_{ki}\,\right) \,\Big \} \nonumber \\&\qquad +\frac{1}{\alpha }\, \Big \{\, {\mathscr {Y}}_{k}\,\partial _{t}{}\big (Wv^{k}\big ) - {\mathscr {X}}\,\partial _{t}{W} - \big (\,\alpha \,{\mathscr {Y}}^{i} - {\mathscr {X}}\,\beta ^{i}\,\big )\,\partial _{i}{W} \nonumber \\&\qquad + \big (\,\alpha \,{\mathscr {Z}}_{k}^{i} - {\mathscr {Y}}_{k}\,\beta ^{i}\,\big )\,\partial _{i}{}\big (Wv^{k}\big ) \,\Big \}, \end{aligned}$$
(110)

which account for changes in the spectral energy density due to gravitational energy shifts and the fact that adjacent comoving observer velocities in spacetime are generally different.

The momentum equation is obtained as the projection of Eq. (81) into the slice with normal given by \(n^{\mu }\) [i.e., contracting \(\gamma _{j\mu }\) with Eq. (81)], which results in

$$\begin{aligned}&\frac{1}{\alpha \sqrt{\gamma }} \big [\partial _{t}{}\big (\,\sqrt{\gamma }{\mathscr {F}}_{j}\,\big )+\partial _{i}{}\big (\,\sqrt{\gamma }\big [\,\alpha {\mathscr {S}}^{i}_{j}-\beta ^{i}{\mathscr {F}}_{j}\,\big ]\,\big )\big ] -\frac{1}{\varepsilon ^{2}}\frac{\partial }{\partial \varepsilon }\big (\,\varepsilon ^{2}\,\gamma _{j\mu }\,{\mathscr {Q}}^{\mu \nu \rho }\,\nabla _{\nu }u_{\rho }\,\big ) \nonumber \\&\quad =\frac{1}{\alpha }\,\big [\,{\mathscr {F}}_{i}\,\partial _{j}{\beta ^{i}}+\frac{1}{2}\,\alpha \,{\mathscr {S}}^{ik}\,\partial _{j}{\gamma _{ik}}-{\mathscr {E}}\,\partial _{j}{\alpha }\,\big ] +\frac{1}{4\pi }\int _{{\mathbb {S}}^{2}}{\mathscr {C}}(f)\,\ell _{j}\,d\omega +\frac{Wv_{j}}{4\pi }\int _{{\mathbb {S}}^{2}}{\mathscr {C}}(f)\,d\omega , \end{aligned}$$
(111)

where the right-hand side gives rise to changes in the spectral momentum density due to spacetime geometry and neutrino–matter interactions. Again, using the Eulerian decomposition in Eq. (104), the terms inside the energy derivative can be written as

$$\begin{aligned}&\frac{\gamma _{j\mu }}{\varepsilon }\,{\mathscr {Q}}^{\mu \nu \rho }\,\nabla _{\nu }u_{\rho } \nonumber \\&\quad = \big (\, {\mathscr {Y}}_{j}\,n^{\nu }\,n^{\rho } + {\mathscr {Z}}_{j}^{\nu }\,n^{\rho } + {\mathscr {Z}}_{j}^{\rho }\,n^{\nu } + {\mathscr {W}}_{j}^{\nu \rho } \,\big )\,\nabla _{\nu }u_{\rho } \nonumber \\&\quad = \frac{W}{\alpha }\, \Big \{\, \big (\,{\mathscr {Z}}_{j}^{i} - {\mathscr {Y}}_{j}\,v^{i}\,\big )\,\partial _{i}{\alpha } +{\mathscr {Z}}_{jk}\,v^{i}\,\partial _{i}{\beta ^{k}} +\alpha \,{\mathscr {W}}_{j}^{ki}\,\left( \,\frac{1}{2}\,v^{m}\,\partial _{m}{\gamma _{ki}} - {\mathsf {K}}_{ki}\,\right) \,\Big \} \nonumber \\&\qquad +\frac{1}{\alpha }\, \Big \{\, {\mathscr {Z}}_{jk}\,\partial _{t}{}\big (Wv^{k}\big ) - {\mathscr {Y}}_{j}\,\partial _{t}{W} - \big (\,\alpha \,{\mathscr {Z}}_{j}^{i} - {\mathscr {Y}}_{j}\,\beta ^{i}\,\big )\,\partial _{i}{W} \nonumber \\&\qquad + \big (\,\alpha \,{\mathscr {W}}_{jk}^{i} - {\mathscr {Z}}_{jk}\,\beta ^{i}\,\big )\,\partial _{i}{}\big (Wv^{k}\big ) \,\Big \}, \end{aligned}$$
(112)

which account for changes in the spectral momentum density due to gravitational comoving observer effects.

An obvious advantage of the Eulerian two-moment model given by Eqs. (109) and (111) is the conservative form. Integrating these equations over energy space (using \(dV_{\varepsilon }=4\pi \varepsilon ^{2}d\varepsilon \)) results in the radiation energy equation

$$\begin{aligned}&\frac{1}{\alpha \sqrt{\gamma }} \big [\,\partial _{t}{}\big (\,\sqrt{\gamma }\,E\,\big )+\partial _{i}{}\big (\,\sqrt{\gamma }\,\big [\,\alpha \,F^{i}-\beta ^{i}\,E\,\big ]\,\big )\,\big ] \nonumber \\&\quad =\frac{1}{\alpha }\,\big [\,\alpha \,S^{ij}\,{\mathsf {K}}_{ij}-F^{i}\,\partial _{i}{\alpha }\,\big ] +W\int _{V_{p}}{\mathscr {C}}[f]\,\varepsilon \,\pi _{m} +v^{j}\int _{V_{p}}{\mathscr {C}}[f]\,\varepsilon \,\ell _{j}\,\pi _{m} \end{aligned}$$
(113)

and radiation momentum equation

$$\begin{aligned}&\frac{1}{\alpha \sqrt{\gamma }} \big [\,\partial _{t}{}\big (\,\sqrt{\gamma }F_{j}\,\big )+\partial _{i}{}\big (\,\sqrt{\gamma }\big [\,\alpha S^{i}_{j}-\beta ^{i}F_{j}\,\big ]\,\big )\,\big ] \nonumber \\&\quad =\frac{1}{\alpha }\,\big [\,F_{i}\,\partial _{j}{\beta ^{i}}+\frac{1}{2}\,\alpha \,S^{ik}\,\partial _{j}{\gamma _{ik}}-E\,\partial _{j}{\alpha }\,\big ] +\int _{V_{p}}{\mathscr {C}}[f]\,\varepsilon \,\ell _{j}\,\pi _{m} +Wv_{j}\int _{V_{p}}{\mathscr {C}}[f]\,\varepsilon \,\pi _{m}, \end{aligned}$$
(114)

where the energy-integrated Eulerian moments are given by

$$\begin{aligned} \big \{\,E,\,F^{\mu },\,S^{\mu \nu }\,\big \} = \int _{0}^{\infty }\big \{\,{\mathscr {E}},\,{\mathscr {F}}^{\mu },\,{\mathscr {S}}^{\mu \nu }\,\big \}\,dV_{\varepsilon }. \end{aligned}$$
(115)

Equations (113) and (114) are conservation laws for radiation energy and momentum in the sense that in the case of Cartesian coordinates in flat spacetime, with no neutrino–matter interactions, the right-hand sides vanish, and the equations express exact conservation of radiation energy and momentum.

The Eulerian two-moment model presented here is the basis for several codes developed to model neutrino transport in core-collapse supernovae (O’Connor 2015; Kuroda et al. 2016; Roberts et al. 2016): the GR1D code (O’Connor 2015) solves the equations in spherical symmetry; the Zelmani code (Roberts et al. 2016) solves the equations in three spatial dimensions, but does not include velocity dependent terms (i.e., \(v^{i}=0\) in the transport equations); and Kuroda et al. (2016) solve the full system in three spatial dimensions.

Lagrangian two-moment model The Lagrangian two-moment model is an alternative to the Eulerian two-moment model discussed above, where the spectral energy density and momentum density measured by the Lagrangian observer with four-velocity \(u_{\mu }\) are evolved (\({\mathscr {J}}\) and \({\mathscr {H}}_{j}\), respectively). The energy equation is obtained as the projection of Eq. (81) along the four-velocity of the Lagrangian observer [i.e., contracting \(-u_{\mu }\) with Eq. (81)], which results in

$$\begin{aligned}&\frac{1}{\alpha \sqrt{\gamma }} \big [\,\partial _{t}{}\big (\,\sqrt{\gamma }\,\big [\,W{\mathscr {J}}+v^{i}{\mathscr {H}}_{i}\,\big ]\,\big ) +\partial _{i}{}\big (\,\sqrt{\gamma }\,\big [\,\alpha \,{\mathscr {H}}^{i}+\big (\,\alpha \,v^{i}-\beta ^{i}\,\big )\,W{\mathscr {J}}\,\big ]\,\big )\,\big ] \nonumber \\&\qquad -\frac{1}{\varepsilon ^{2}}\frac{\partial }{\partial \varepsilon }\big (\,\varepsilon ^{3}\,{\mathscr {T}}^{\mu \nu }\,\nabla _{\mu }u_{\nu }\,\big ) =-{\mathscr {T}}^{\mu \nu }\nabla _{\mu }u_{\nu } + \frac{1}{4\pi }\int _{{\mathbb {S}}^{2}}{\mathscr {C}}(f)\,d\omega , \end{aligned}$$
(116)

where the contraction of the stress-energy equation with the covariant derivative of the Lagrangian observer’s four-velocity is given in \(3\,+\,1\) form as

$$\begin{aligned}&{\mathscr {T}}^{\mu \nu }\,\nabla _{\mu }u_{\nu } \nonumber \\&\quad = \big (\, {\mathscr {E}}\,n^{\mu }\,n^{\nu } + {\mathscr {F}}^{\mu }\,n^{\nu } + n^{\mu }\,{\mathscr {F}}^{\nu } + {\mathscr {S}}^{\mu \nu } \,\big )\,\nabla _{\mu }u_{\nu } \nonumber \\&\quad = \frac{W}{\alpha }\, \Big \{\, \big (\,{\mathscr {F}}^{i} - {\mathscr {E}}\,v^{i}\,\big )\,\partial _{i}{\alpha } +{\mathscr {F}}_{k}\,v^{i}\,\partial _{i}{\beta ^{k}} +\alpha \,{\mathscr {S}}^{ki}\,\left( \,\frac{1}{2}\,v^{m}\,\partial _{m}{\gamma _{ki}} - {\mathsf {K}}_{ki}\,\right) \,\Big \} \nonumber \\&\qquad +\frac{1}{\alpha }\, \Big \{\, {\mathscr {F}}_{k}\,\partial _{t}{}\big (Wv^{k}\big ) - {\mathscr {E}}\,\partial _{t}{W} - \big (\,\alpha \,{\mathscr {F}}^{i} - {\mathscr {E}}\,\beta ^{i}\,\big )\,\partial _{i}{W} \nonumber \\&\qquad + \big (\,\alpha \,{\mathscr {S}}_{k}^{i} - {\mathscr {F}}_{k}\,\beta ^{i}\,\big )\,\partial _{i}{}\big (Wv^{k}\big ) \,\Big \}, \end{aligned}$$
(117)

which accounts for changes to the spectral energy density from gravitational effects and from the fact that adjacent comoving observers in spacetime have different velocities. In Eq. (117), we made use of the Eulerian decomposition of the stress-energy tensor, which, as discussed at the end of Sect. 4.7.2, is more convenient than using the Lagrangian decomposition, since it keeps the number of terms in the expression to a minimum and simplifies book-keeping. The components of the Eulerian decomposition are related to the Lagrangian components by Eqs. (101)–(103).

The Lagrangian momentum equation is obtained by projecting Eq. (81) tangential to the slice with \(u^{\mu }\) as the normal [i.e., contracting \(h_{j\mu }\) with Eq. (81)], which gives

$$\begin{aligned}&\frac{1}{\alpha \sqrt{\gamma }} \big [\,\partial _{t}{}\big (\,\sqrt{\gamma }\,\big [\,W{\mathscr {H}}_{j}+v^{i}{\mathscr {K}}_{ij}\,\big ]\,\big ) +\partial _{i}{}\big (\,\sqrt{\gamma }\,\big [\,\alpha \,{\mathscr {K}}^{i}_{j}+\big (\,\alpha \,v^{i}-\beta ^{i}\,\big )\,W{\mathscr {H}}_{j}\,\big ]\,\big )\,\big ] \nonumber \\&\qquad -\frac{1}{\varepsilon ^{2}}\frac{\partial }{\partial \varepsilon }\big (\,\varepsilon ^{2}\,h_{j\mu }\,{\mathscr {Q}}^{\mu \nu \rho }\,\nabla _{\nu }u_{\rho }\,\big ) ={\mathscr {T}}^{\mu \nu }\,\big (\,\nabla _{\nu }h_{j\mu } + \varGamma ^{\rho }_{j\nu }h_{\rho \mu }\,\big ) + \frac{1}{4\pi }\int _{{\mathbb {S}}^{2}}{\mathscr {C}}(f)\,\ell _{j}\,d\omega , \end{aligned}$$
(118)

where the “geometry” source on the right-hand side can be written as

$$\begin{aligned}&{\mathscr {T}}^{\mu \nu }\,\big (\,\nabla _{\nu }h_{j\mu } + \varGamma ^{\rho }_{j\nu }h_{\rho \mu }\,\big ) \nonumber \\&\quad =\frac{1}{2}\,{\mathscr {T}}^{\mu \nu }\,\partial _{j}{g_{\mu \nu }} +Wv_{j}\,{\mathscr {T}}^{\mu \nu }\nabla _{\mu }u_{\nu } +u_{\mu }{\mathscr {T}}^{\mu \nu }\,\partial _{\nu }{}\big (\,Wv_{j}\,\big ). \end{aligned}$$
(119)

Again, using the Eulerian decomposition of \({\mathscr {T}}^{\mu \nu }\), the first term on the right-hand side of Eq. (119) can be written as

$$\begin{aligned} \frac{1}{2}\,{\mathscr {T}}^{\mu \nu }\,\partial _{j}{g_{\mu \nu }} =\frac{1}{\alpha }\,\big [\,{\mathscr {F}}_{i}\,\partial _{j}{\beta ^{i}}+\frac{1}{2}\,\alpha \,{\mathscr {S}}^{ik}\,\partial _{j}{\gamma _{ik}}-{\mathscr {E}}\,\partial _{j}{\alpha }\,\big ], \end{aligned}$$
(120)

which also appears on the right-hand side of Eq. (111). Similarly, the third term on the right-hand side of Eq. (119) can be written as

$$\begin{aligned} u_{\mu }{\mathscr {T}}^{\mu \nu }\,\partial _{\nu }{}\big (\,Wv_{j}\,\big )&=-\frac{W}{\alpha }\,\Big \{\,{\mathscr {E}}-v^{k}\,{\mathscr {F}}_{k}\,\Big \}\,\partial _{t}{}\big (Wv_{j}\big )\nonumber \\&\quad -\frac{W}{\alpha }\, \Big \{\, \big (\,\alpha \,{\mathscr {F}}^{i}-\beta ^{i}\,{\mathscr {E}}\,\big ) -v^{k}\,\big (\,\alpha {\mathscr {S}}^{i}_{k}-\beta ^{i}\,{\mathscr {F}}_{k}\,\big ) \,\Big \}\,\partial _{i}{}\big (Wv_{j}\big ), \end{aligned}$$
(121)

while the second term on the right-hand side of Eq. (119) contains the expression in Eq. (117). Finally, the expression inside the energy derivative term on the left-hand side of Eq. (118) can be written as

$$\begin{aligned}&\frac{h_{j\mu }}{\varepsilon }\,{\mathscr {Q}}^{\mu \nu \rho }\,\nabla _{\nu }u_{\rho } \nonumber \\&\quad =\big (\,\frac{1}{\varepsilon }\,{\mathscr {Q}}_{j}^{\nu \rho }-Wv_{j}\,{\mathscr {T}}^{\nu \rho }\,\big )\,\nabla _{\nu }u_{\rho } \nonumber \\&\quad = \Big \{\, \big ({\mathscr {Y}}_{j}-Wv_{j}\,{\mathscr {E}}\big )\,n^{\nu }\,n^{\rho } +\big ({\mathscr {Z}}_{j}^{\nu }-Wv_{j}\,{\mathscr {F}}^{\nu }\big )\,n^{\rho } \nonumber \\&\qquad +\big ({\mathscr {Z}}_{j}^{\rho }-Wv_{j}{\mathscr {F}}^{\rho }\big )\,n^{\nu } +\big ({\mathscr {W}}_{j}^{\nu \rho }-Wv_{j}{\mathscr {S}}^{\nu \rho }\big )\,\Big \}\,\nabla _{\nu }u_{\rho }, \end{aligned}$$
(122)

which is a contraction of Eulerian decompositions of rank two tensors, with components \(\big ({\mathscr {Y}}_{j}-Wv_{j}\,{\mathscr {E}}\big )\), \(\big ({\mathscr {Z}}_{j}^{\nu }-Wv_{j}\,{\mathscr {F}}^{\nu }\big )\), and \(\big ({\mathscr {W}}_{j}^{\nu \rho }-Wv_{j}{\mathscr {S}}^{\nu \rho }\big )\), contracted with the covariant derivative of the fluid four-velocity, and can be written in a form similar to Eq. (112).

The Lagrangian two-moment model presented here [Eqs. (116) and (118)] is the basis for several codes used to model neutrino transport in core-collapse supernovae: Müller et al. (2010) used it in conjunction with the conformal flatness approximation to GR (CFA) and ray-by-ray neutrino transport transport; and Just et al. (2015) and Skinner et al. (2019) used this model in its \({\mathscr {O}}(v/c)\) limit to develop multi-dimensional neutrino transport codes.

Number conservative two-moment model The number conservative model is yet another formulation of two-moment transport, which evolves the spectral number density as measured by the Eulerian observer (with four-velocity \(n_{\mu }\)) and the spectral number flux. The equation for the number density is obtained (1) directly from Eq. (79), (2) by contraction of Eq. (81) with \(-u_{\mu }/\varepsilon \), or (3) by dividing Eq. (116) by \(\varepsilon \). In \(3\,+\,1\) form it is given by

$$\begin{aligned}&\frac{1}{\alpha \sqrt{\gamma }} \big [\,\partial _{t}{}\big (\,\sqrt{\gamma }\,\big [\,W{\mathscr {D}}+v^{i}{\mathscr {I}}_{i}\,\big ]\,\big ) +\partial _{i}{}\big (\,\sqrt{\gamma }\,\big [\,\alpha \,{\mathscr {I}}^{i}+\big (\,\alpha \,v^{i}-\beta ^{i}\,\big )\,W{\mathscr {D}}\,\big ]\,\big )\,\big ] \nonumber \\&\qquad -\frac{1}{\varepsilon ^{2}}\frac{\partial }{\partial \varepsilon }\big (\,\varepsilon ^{2}\,{\mathscr {T}}^{\mu \nu }\,\nabla _{\mu }u_{\nu }\,\big ) =\frac{1}{4\pi }\int _{{\mathbb {S}}^{2}}{\mathscr {C}}(f)\,\frac{d\omega }{\varepsilon }, \end{aligned}$$
(123)

where the expression inside the energy derivative (last term on the left-hand side) is given by Eq. (117). Equation (123) is conservative in the sense that an integration over energy space gives the balance equation Eq. (80), which in \(3\,+\,1\) form is given by

$$\begin{aligned} \frac{1}{\alpha \sqrt{\gamma }} \big [\,\partial _{t}{}\big (\,\sqrt{\gamma }\,N\,\big ) +\partial _{i}{}\big (\,\sqrt{\gamma }\,\big [\,\alpha \,G^{i}-\beta ^{i}\,N\,\big ]\,\big )\,\big ] =\int _{V_{p}}{\mathscr {C}}[f]\,\pi _{m}, \end{aligned}$$
(124)

expressing exact particle conservation in the absence of particle-converting neutrino–matter interactions (e.g., emission and absorption).

The equation for the number flux density is obtained by contraction of \(h_{j\mu }/\varepsilon \) with Eq. (81) [or by dividing Eq. (118) by \(\varepsilon \)]:

$$\begin{aligned}&\frac{1}{\alpha \sqrt{\gamma }} \big [\,\partial _{t}{}\big (\,\sqrt{\gamma }\,\big [\,W{\mathscr {I}}_{j}+v^{i}\widehat{{\mathscr {K}}}_{ij}\,\big ]\,\big ) +\partial _{i}{}\big (\,\sqrt{\gamma }\,\big [\,\alpha \,\widehat{{\mathscr {K}}}^{i}_{j}+\big (\,\alpha \,v^{i}-\beta ^{i}\,\big )\,W{\mathscr {I}}_{j}\,\big ]\,\big )\,\big ] \nonumber \\&\qquad -\frac{1}{\varepsilon ^{2}}\frac{\partial }{\partial \varepsilon }\Big (\,\varepsilon ^{2}\,h_{j\mu }\,\widehat{{\mathscr {Q}}}^{\mu \nu \rho }\,\nabla _{\nu }u_{\rho }\,\Big ) =\frac{1}{2}\,\widehat{{\mathscr {T}}}^{\mu \nu }\,\partial _{j}{g_{\mu \nu }} +\frac{1}{\varepsilon }\,\widehat{{\mathscr {Q}}}_{j}^{\mu \nu }\,\nabla _{\nu }u_{\mu } -{\mathscr {N}}^{\nu }\,\partial _{\nu }{}u_{j} \nonumber \\&\qquad +\frac{1}{4\pi }\int _{{\mathbb {S}}^{2}}{\mathscr {C}}(f)\,\ell _{j}\,\frac{d\omega }{\varepsilon }. \end{aligned}$$
(125)

Here, we use the “hat” to denote previously-defined moments divided by \(\varepsilon \); e.g.,

$$\begin{aligned} \big \{\,\widehat{{\mathscr {T}}}^{\mu \nu },\,\widehat{{\mathscr {Q}}}^{\mu \nu \rho }\,\big \} = \frac{1}{\varepsilon }\big \{\,{\mathscr {T}}^{\mu \nu },\,{\mathscr {Q}}^{\mu \nu \rho }\,\big \}. \end{aligned}$$
(126)

The expression in the energy derivative in Eq. (125) is given by Eq. (122). The first term on the right-hand side of Eq. (125) can be written as [cf. Eq. (118)]

$$\begin{aligned} \frac{1}{2}\,\widehat{{\mathscr {T}}}^{\mu \nu }\,\partial _{j}{g_{\mu \nu }} =\frac{1}{\alpha }\,\Big \{\,\widehat{{\mathscr {F}}}_{i}\,\partial _{j}{\beta ^{i}}+\frac{1}{2}\,\alpha \,\widehat{{\mathscr {S}}}^{ik}\,\partial _{j}{\gamma _{ik}}-\widehat{{\mathscr {E}}}\,\partial _{j}{\alpha }\,\Big \}, \end{aligned}$$
(127)

while the third term on the right-hand side of Eq. (125) can be written as

$$\begin{aligned} {\mathscr {N}}^{\nu }\partial _{\nu }{}u_{j} =\frac{1}{\alpha }\,\Big \{\,{\mathscr {N}}\,\partial _{t}{}\,\big (Wv_{j}\big )+\big (\alpha \,{\mathscr {G}}^{i}-\beta ^{i}{\mathscr {N}}\big )\,\partial _{i}{}\big (Wv_{j}\big )\,\Big \}, \end{aligned}$$
(128)

where \({\mathscr {N}}\) and \({\mathscr {G}}^{i}\) are written in terms of Lagrangian moments in Eqs. (97) and (98). The second term on the right-hand side of Eq. (125) can be written as

$$\begin{aligned} \frac{1}{\varepsilon }\,\widehat{{\mathscr {Q}}}_{j}^{\mu \nu }\,\nabla _{\nu }u_{\mu }&= \Big \{\, \widehat{{\mathscr {Y}}}_{j}\,n^{\mu }\,n^{\nu } +\widehat{{\mathscr {Z}}}_{j}^{\mu }\,n^{\nu } +n^{\mu }\,\widehat{{\mathscr {Z}}}_{j}^{\nu } +\widehat{{\mathscr {W}}}_{j}^{\mu \nu } \,\Big \}\,\nabla _{\nu }u_{\mu }, \end{aligned}$$
(129)

which is in the same form as Eq. (117), but where \(\widehat{{\mathscr {Y}}}_{j}\), \(\widehat{{\mathscr {Z}}}_{j}^{\mu }\), and \(\widehat{{\mathscr {W}}}_{j}^{\mu \nu }\), replace \({\mathscr {E}}\), \({\mathscr {F}}^{\mu }\), and \({\mathscr {S}}^{\mu \nu }\), respectively.

This number conservative two-moment model was presented in spherical symmetry, assuming the conformal flatness approximation (CFA) to general relativity, by Müller et al. (2010), and was also presented in the \({\mathscr {O}}(v/c)\) limit by Just et al. (2015), but it was not explicitly used in the numerical techniques developed by either of these authors. The model presented here is the 3 + 1 general relativistic version of that model, without approximation. It should also be mentioned that Rampp and Janka (2002) developed a two-moment, variable Eddington factor method based on solving both the Lagrangian two-moment model and the number conservative two-moment model simultaneously, in spherical symmetry and in the \({\mathscr {O}}(v/c)\) limit, treating the radiation energy density, momentum density, number density and number flux density as independent variables. However, resulting from inconsistency between the energy and number equations, in this approach the mean energy in an energy group, \({\mathscr {J}}/{\mathscr {D}}\), is not constrained to the group boundaries, and can even move outside the group (Müller et al. 2010).

4.7.4 The closure problem

The two-moment models discussed above are not closed. The rank-two tensor \({\mathscr {K}}^{\mu \nu }\) defined in Eq. (89) and the rank-three tensor \({\mathscr {L}}^{\mu \nu \rho }\) defined in Eq. (93) are present in various terms in the two-moment model: components of \({\mathscr {K}}^{\mu \nu }\) are present in spacetime derivative terms, while components of \({\mathscr {K}}^{\mu \nu }\) and \({\mathscr {L}}^{\mu \nu \rho }\) are present in energy derivative terms and source terms. These tensor components must be expressed in terms of the evolved moments to close the system of equations. For the Eulerian and the Lagrangian two-moment models, the evolved quantities are ultimately the energy density and momentum density measured by a comoving observer; \({\mathscr {J}}\) and \({\mathscr {H}}_{j}\), respectively. (For the number conservative two-moment model, the evolved quantities are the number density and number flux density measured by a comoving observer; \({\mathscr {D}}\) and \({\mathscr {I}}_{j}\), respectively.)

Following Levermore (1984) and Anile et al. (1992), the general symmetric, rank-two tensor \({\mathscr {K}}^{\mu \nu }\), depending on \({\mathscr {J}}\) and \({\mathscr {H}}^{\mu }\), that is orthogonal to the fluid four-velocity \(u_{\mu }\) and that satisfies the trace condition \({\mathscr {K}}^{\mu }_{\mu }={\mathscr {J}}\) takes the form

$$\begin{aligned} {\mathscr {K}}^{\mu \nu } =\frac{1}{2}\,\Big [\,\big (\,1-{\mathfrak {k}}\,\big )\,h^{\mu \nu }+\big (\,3\,{\mathfrak {k}}-1\,\big )\,\widehat{{\mathsf {h}}}^{\mu }\,\widehat{{\mathsf {h}}}^{\nu }\,\Big ]\,{\mathscr {J}}, \end{aligned}$$
(130)

where \({\mathfrak {k}}({\mathscr {J}},{\mathfrak {h}})\) is the Eddington factor, \({\mathfrak {h}}={\mathscr {H}}/{\mathscr {J}}\) is the flux factor, \({\mathscr {H}}=\sqrt{{\mathscr {H}}_{\mu }{\mathscr {H}}^{\mu }}\), and \(\widehat{{\mathsf {h}}}^{\mu }={\mathscr {H}}^{\mu }/{\mathscr {H}}\) is a unit four-vector parallel to \({\mathscr {H}}^{\mu }\). It is straightforward to show that the Eddington factor can be written as

$$\begin{aligned} {\mathfrak {k}}=\frac{\widehat{{\mathsf {h}}}_{\mu }\widehat{{\mathsf {h}}}_{\nu }\,{\mathscr {K}}^{\mu \nu }}{{\mathscr {J}}} =\frac{\frac{1}{4\pi }\int _{{\mathbb {S}}^{2}}f(\omega )\,(\widehat{{\mathsf {h}}}_{\mu }\ell ^{\mu })^{2}\,d\omega }{\frac{1}{4\pi }\int _{{\mathbb {S}}^{2}}f(\omega )\,d\omega } =\frac{\frac{1}{2}\int _{-1}^{1}{\mathfrak {f}}(\mu )\,\mu ^{2}\,d\mu }{\frac{1}{2}\int _{-1}^{1}{\mathfrak {f}}(\mu )\,d\mu }, \end{aligned}$$
(131)

where we have defined

$$\begin{aligned} {\mathfrak {f}}(\mu )=\frac{1}{2\pi }\int _{0}^{2\pi }f(\mu ,\varphi )\,d\varphi . \end{aligned}$$
(132)

In the last step in Eq. (131) we have aligned the momentum-space coordinate system in the comoving frame so that \(\widehat{{\mathsf {h}}}_{\mu }\ell ^{\mu }=\widehat{{\mathsf {h}}}_{{\hat{\mu }}}\ell ^{{\hat{\mu }}}=\cos \vartheta =\mu \). (Note, this is not the same \(\mu \) that will be defined later, in Sect. 6.1.1. The angle here is defined in terms of the direction specified by \(\widehat{{\mathsf {h}}}_{{\hat{\mu }}}\), whereas in Sect. 6.1.1 it will be defined in terms of \({\hat{r}}\).) The two-moment closure for \({\mathscr {K}}^{\mu \nu }\) requires the Eddington factor to be specified in terms of \({\mathscr {J}}\) and \({\mathfrak {h}}\) (or equivalently \({\mathscr {D}}\) and \({\mathfrak {h}}\)). We will discuss some specific approaches further below.

In a similar way, we can construct the third-order moment, \({\mathscr {L}}^{\mu \nu \rho }\), depending on \({\mathscr {J}}\) and \({\mathscr {H}}^{\mu }\), as the symmetric rank-three tensor that is orthogonal to \(u_{\mu }\) and that satisfies the trace conditions \({\mathscr {L}}^{\mu \nu }_{\nu }={\mathscr {H}}^{\mu }\). From (e.g., Pennisi 1992; Cardall et al. 2013b; Just et al. 2015),

$$\begin{aligned} {\mathscr {L}}^{\mu \nu \rho } =\frac{1}{2}\, \Big [\, \big (\,{\mathfrak {h}}-{\mathfrak {q}}\,\big )\, \Big (\,\widehat{{\mathsf {h}}}^{\mu }\,h^{\nu \rho }+\widehat{{\mathsf {h}}}^{\nu }\,h^{\mu \rho }+\widehat{{\mathsf {h}}}^{\rho }\,h^{\mu \nu }\,\Big ) +\big (\,5\,{\mathfrak {q}}-3\,{\mathfrak {h}}\,\big )\,\widehat{{\mathsf {h}}}^{\mu }\,\widehat{{\mathsf {h}}}^{\nu }\,\widehat{{\mathsf {h}}}^{\rho } \,\Big ]\,{\mathscr {J}}, \end{aligned}$$
(133)

where we have defined the “heat flux” factor \({\mathfrak {q}}({\mathscr {J}},{\mathfrak {h}})\):

$$\begin{aligned} {\mathfrak {q}} = \frac{{\widehat{h}}_{\mu }\,{\widehat{h}}_{\nu }\,{\widehat{h}}_{\rho }\,{\mathscr {L}}^{\mu \nu \rho }}{{\mathscr {J}}} =\frac{\frac{1}{4\pi }\int _{{\mathbb {S}}^{2}}f(\omega )\,(\widehat{{\mathsf {h}}}_{\mu }\ell ^{\mu })^{3}\,d\omega }{\frac{1}{4\pi }\int _{{\mathbb {S}}^{2}}f(\omega )\,d\omega } =\frac{\frac{1}{2}\int _{-1}^{1}{\mathfrak {f}}(\mu )\,\mu ^{3}\,d\mu }{\frac{1}{2}\int _{-1}^{1}{\mathfrak {f}}(\mu )\,d\mu }. \end{aligned}$$
(134)

The two-moment closure for \({\mathscr {L}}^{\mu \nu \rho }\) requires that we specify the heat flux factor in terms of \({\mathscr {J}}\) and \({\mathfrak {h}}\) (or \({\mathscr {D}}\) and \({\mathfrak {h}}\)).

To complete the specification of the two-moment closure, the Eddington and heat flux factors must be specified in terms of the zeroth and first moments. To this end, several approaches have been proposed for the Eddington factor, including maximum entropy closure (e.g., Minerbo 1978), Kershaw-type closure (e.g., Kershaw 1976), and closures derived from fits to results obtained with higher-fidelity models (e.g., Janka 1991). In the context of spherically symmetric proto-neutron star models, Murchikova et al. (2017) carried out a comprehensive comparison of results obtained with two-moment neutrino transport, using analytic Eddington factors, to results obtained with Monte Carlo transport calculations. Murchikova et al. (2017) included Eddington factors from Wilson et al. (1975), Kershaw (1976), Levermore (1984), Minerbo (1978), Cernohorsky and Bludman (1994) and Janka (1991, 1992), and found no closure to perform consistently better than the others in the test cases considered. Because the maximum entropy closures of Minerbo (1978) and Cernohorsky and Bludman (1994) gave practically identical results and never yielded the worst results, and given the simplicity of the closure by Minerbo (1978) relative to the closure by Cernohorsky and Bludman (1994) and Murchikova et al. (2017) concluded that the Minerbo (1978) closure is the most attractive choice for neutrino transport around proto-neutron stars. The closures provided by Minerbo (1978) and Levermore (1984) are probably the most widely used in core-collapse supernova simulations employing two-moment neutrino transport. Recently, Just et al. (2015), comparing the closures of Minerbo (1978), Cernohorsky and Bludman (1994) and Levermore (1984) in the context of a simulation of collapse and post-bounce evolution of a 13 \(M_{\odot }\) star in spherical symmetry, showed that the differences in shock radii, neutrino luminosities, and mean energies are practically indistinguishable. This may be because the closures are very similar for the values of \({\mathscr {J}}\) and \({\mathfrak {h}}\) encountered. Chu et al. (2019) considered Eddington factors by Minerbo (1978), Cernohorsky and Bludman (1994), Larecki and Banach (2011) and Banach and Larecki (2017) and found that, under certain conditions, results obtained with closures based on Fermi–Dirac statistics can differ significantly from results obtained with the Minerbo (1978) closure, which is based on Boltzmann statistics.

We discuss the closures due to Minerbo (1978), Levermore (1984) and Kershaw (1976) in further detail and give explicit expressions for Eddingon and heat flux factors, which are also plotted in Fig. 5 (see figure caption for details).

Fig. 5
figure 5

Plot of Eddington factors \({\mathfrak {k}}\) (solid lines) and heat flux factors \({\mathfrak {q}}\) (dotted lines) versus flux factor \({\mathfrak {h}}\) for the closures due to Minerbo (1978) (black), Levermore (1984) (magenta), and Kershaw (1976) (blue)

Maximum entropy closure The maximum entropy approach to specifying the Eddington and heat flux factors comes from statistical mechanics, and has been used extensively in moment models for radiation transport (e.g., Minerbo 1978; Cernohorsky and Bludman 1994). In this approach, the “most probable” values of \({\mathfrak {k}}\) and \({\mathfrak {q}}\) are determined by finding the distribution function \({\mathfrak {f}}_{{\textsc {Me}}}\) that maximizes the entropy functional \(s[{\mathfrak {f}}_{{\textsc {Me}}}]\), subject to the constraints that \({\mathfrak {f}}_{{\textsc {Me}}}\) reproduces the known moments (e.g., \({\mathscr {D}}\) and \({\mathfrak {h}}\)). The unknowns can then be computed from Eqs. (131) and (134) by setting \({\mathfrak {f}}={\mathfrak {f}}_{{\textsc {Me}}}\). For the two-moment model, the maximum entropy distribution is obtained by extremizing

$$\begin{aligned} S = \int _{-1}^{1}s[{\mathfrak {f}}_{{\textsc {Me}}}]\,d\mu + \alpha _{0}\int _{-1}^{1}{\mathfrak {f}}_{{\textsc {Me}}}\,d\mu + \alpha _{1}\int _{-1}^{1}{\mathfrak {f}}_{{\textsc {Me}}}\,\mu \,d\mu \end{aligned}$$
(135)

with respect to \({\mathfrak {f}}_{{\textsc {Me}}}\), where the Lagrange multipliers \(\alpha _{0}\) and \(\alpha _{1}\) are introduced for the constraints. A particularly simple closure is obtained by considering the case of Boltzmann statistics, where \(s[{\mathfrak {f}}_{{\textsc {Me}}}]={\mathfrak {f}}_{{\textsc {Me}}}\ln {\mathfrak {f}}_{{\textsc {Me}}}-{\mathfrak {f}}_{{\textsc {Me}}}\). This case was considered in detail by Minerbo (1978), and is the low-occupancy limit (\({\mathscr {D}}\ll 1\)) of the more appropriate case (for neutrino transport) of Fermi–Dirac statistics considered by Cernohorsky and Bludman (1994). For the case of Boltzmann statistics, the maximum entropy distribution is easily found to be given by

$$\begin{aligned} {\mathfrak {f}}_{{\textsc {Me}}}(\mu ) = \exp \big (\,\alpha _{0}+\alpha _{1}\,\mu \,\big ), \end{aligned}$$
(136)

where \(\alpha _{0}\) and \(\alpha _{1}\) are found from the known moments. Direct integration of Eq. (136) gives (Minerbo 1978)

$$\begin{aligned} {\mathscr {D}} = \frac{1}{2}\int _{-1}^{1}{\mathfrak {f}}_{{\textsc {Me}}}(\mu )\,d\mu&=e^{\alpha _{0}}\,\sinh (\alpha _{1})/\alpha _{1}, \end{aligned}$$
(137)
$$\begin{aligned} {\mathscr {I}} = \frac{1}{2}\int _{-1}^{1}{\mathfrak {f}}_{{\textsc {Me}}}(\mu )\,\mu \,d\mu&=e^{\alpha _{0}}\,\big (\,\alpha _{1}\,\cosh (\alpha _{1})-\sinh (\alpha _{1})\,\big )/\alpha _{1}^{2}, \end{aligned}$$
(138)

which can be solved for \(\alpha _{0}\) and \(\alpha _{1}\). In particular, the flux factor is given by the Langevin function \(L(\alpha _{1})\),

$$\begin{aligned} {\mathfrak {h}} ={\mathscr {I}}/{\mathscr {D}} =\coth (\alpha _{1})-1/\alpha _{1} \equiv L(\alpha _{1}), \end{aligned}$$
(139)

and is independent of \(\alpha _{0}\). Thus, \(\alpha _{1}({\mathfrak {h}})=L^{-1}({\mathfrak {h}})\). (The inversion of the Langevin function must be done numerically.) Once \(\alpha _{1}\) is obtained, \(\alpha _{0}\) can be obtained directly from either Eq. (137) or Eq. (138), which completes the specification of \({\mathfrak {f}}_{{\textsc {Me}}}\). Then the Eddington factor and heat flux factor can be computed by direct integration

$$\begin{aligned} {\mathfrak {k}}({\mathfrak {h}}) = 1-2\,{\mathfrak {h}}/\alpha _{1}({\mathfrak {h}}) \quad \text{ and }\quad {\mathfrak {q}}({\mathfrak {h}}) = \coth \big (\alpha _{1}({\mathfrak {h}})\big ) - 3\,{\mathfrak {k}}({\mathfrak {h}})/\alpha _{1}({\mathfrak {h}}), \end{aligned}$$
(140)

which closes the two-moment model under the simplifying assumption of Boltzmann statistics, which is a reasonable approximation for neutrinos only when the occupation density is low; i.e., when \({\mathscr {D}}\ll 1\). This closure is referred to as the Minerbo closure, and is a commonly used closure in simulations employing spectral two-moment neutrino transport (e.g., Kuroda et al. 2016; Just et al. 2018; O’Connor and Couch 2018). In practice, to avoid inverting the Langevin function for \(\alpha _{1}\), the Eddington and heat flux factors can be approximated as polynomials in the flux factor. This leads to algebraic expressions, which are computationally more efficient. The algebraic form of the Eddingon factor, which approximates the one in Eq. (140) to better than one percent, is given by (Cernohorsky and Bludman 1994)

$$\begin{aligned} {\mathfrak {k}}_{\text{ Alg }}({\mathfrak {h}}) =\frac{1}{3} + \frac{2}{15}\,\big (\,3\,{\mathfrak {h}}^{2} - {\mathfrak {h}}^{3} + 3\,{\mathfrak {h}}^{4}\,\big ). \end{aligned}$$
(141)

Similarly, the algebraic form of the heat flux factor, which approximates the one in Eq. (140) to within a few percent, is given by (Just et al. 2015)

$$\begin{aligned} {\mathfrak {q}}_{\text{ Alg }}({\mathfrak {h}}) ={\mathfrak {h}}\,\big (\,45 + 10\,{\mathfrak {h}} - 12\,{\mathfrak {h}}^{2} - 12\,{\mathfrak {h}}^{3} + 38\,{\mathfrak {h}}^{4} - 12\,{\mathfrak {h}}^{5} + 18\,{\mathfrak {h}}^{6}\,\big ) / 75. \end{aligned}$$
(142)

In Fig. 5, the Eddington and heat flux factors \({\mathfrak {k}}_{\text{ Alg }}\) and \({\mathfrak {q}}_{\text{ Alg }}\) are plotted versus the flux factor \({\mathfrak {h}}\) (denoted Minerbo in the legend, using solid and dotted black lines, respectively).

Another two-moment closure based on the maximum entropy principle is the so-called M1 closure (e.g., Levermore 1984; Dubroca and Fuegas 1999). The M1 closure is thus based on the same principle as the Minerbo closure, but a different entropy functional is considered; namely the entropy functional for Bose–Einstein statistics \(s[{\mathfrak {f}}_{{\textsc {Me}}}]=(1+{\mathfrak {f}}_{{\textsc {Me}}})\ln (1+{\mathfrak {f}}_{{\textsc {Me}}})-{\mathfrak {f}}_{{\textsc {Me}}}\,\ln {\mathfrak {f}}_{{\textsc {Me}}}\). For the M1 closure the Eddington factor is given by

$$\begin{aligned} {\mathfrak {k}}_{\text{ M1 }}({\mathfrak {h}}) = \frac{3+4\,{\mathfrak {h}}^{2}}{5+2\sqrt{4-3\,{\mathfrak {h}}^{2}}}. \end{aligned}$$
(143)

It should be noted that Levermore (1984) derived this result without the maximum entropy principle. More recently, Vaytet et al. (2011) proposed a numerical method for multi-group radiation hydrodynamics in the \({\mathscr {O}}(v/c)\) limit, and provided an expression for the heat flux factor in the M1 model:

$$\begin{aligned} {\mathfrak {q}}_{\text{ M1 }}({\mathfrak {h}}) = 3\,\varphi _{1}({\mathfrak {h}})\,{\mathfrak {h}} + \varphi _{2}({\mathfrak {h}})\,{\mathfrak {h}}^{3}, \end{aligned}$$
(144)

where

$$\begin{aligned} \varphi _{1}({\mathfrak {h}})&=\frac{({\mathfrak {h}}-2+a)({\mathfrak {h}}+2-a)}{4{\mathfrak {h}}(a-2)^{5}} \left[ \, 12\ln \Big (\frac{{\mathfrak {h}}-2+a}{{\mathfrak {h}}+2-a}\Big )\big ({\mathfrak {h}}^{4}+2a{\mathfrak {h}}^{2}-7{\mathfrak {h}}^{2}-4a+8\big )\right. \nonumber \\&\quad \left. +48{\mathfrak {h}}^{3}-9a{\mathfrak {h}}^{3}-80{\mathfrak {h}}+40a{\mathfrak {h}} \,\right] , \end{aligned}$$
(145)
$$\begin{aligned} \varphi _{2}({\mathfrak {h}})&=\frac{1}{{\mathfrak {h}}^{3}(a-2)^{5}} \left[ \, 60\ln \Big (\frac{{\mathfrak {h}}-2+a}{{\mathfrak {h}}+2-a}\Big )\big (-{\mathfrak {h}}^{6}+15{\mathfrak {h}}^{4}-3a{\mathfrak {h}}^{4}\right. \nonumber \\&\quad +15a{\mathfrak {h}}^{2}-42{\mathfrak {h}}^{2}-16a+32\big ) \nonumber \\&\quad \left. +54a{\mathfrak {h}}^{5}-465{\mathfrak {h}}^{5}-674a{\mathfrak {h}}^{3}+2140{\mathfrak {h}}^{3}+1056a{\mathfrak {h}}-2112{\mathfrak {h}} \,\right] , \end{aligned}$$
(146)

and \(a=\sqrt{4-3{\mathfrak {h}}^{2}}\). The M1 closure is another commonly used closure in simulations employing spectral two-moment neutrino transport (e.g., Skinner et al. 2019). In Fig. 5, the Eddington and heat flux factors \({\mathfrak {k}}_{\text{ M1 }}\) and \({\mathfrak {q}}_{\text{ M1 }}\) are plotted versus the flux factor \({\mathfrak {h}}\) (denoted “Levermore” in the legend, using solid and dotted magenta lines, respectively). When plotting the heat flux factor, we found \(\varphi _{1}\) and \(\varphi _{2}\) to exhibit oscillatory behavior as \({\mathfrak {h}}\rightarrow 0\). To avoid these oscillations in \({\mathfrak {q}}_{\text{ M1 }}\), we used Taylor expansions of \(\varphi _{1}\) (around \({\mathfrak {h}}=0.1\)) and \(\varphi _{2}\) (around \({\mathfrak {h}}=0.2\)) to plot \({\mathfrak {q}}_{\text{ M1 }}\) for smaller values of \({\mathfrak {h}}\).

The low occupancy assumption used for the Minerbo closure does not hold everywhere in a supernova simulation, but may be a reasonable approximation in the neutrino heating region. The M1 closure based on Bose–Einstein statistics is also not a good approximation when the phase space occupation is high. In this case, a more realistic treatment for neutrinos must consider the entropy functional for Fermi–Dirac statistics, where \(s[{\mathfrak {f}}_{{\textsc {Me}}}]={\mathfrak {f}}_{{\textsc {Me}}}\,\ln {\mathfrak {f}}_{{\textsc {Me}}}+(1-{\mathfrak {f}}_{{\textsc {Me}}})\ln (1-{\mathfrak {f}}_{{\textsc {Me}}})\), and follow the procedure as outlined above, as was done by Cernohorsky and Bludman (1994), and more recently in further detail by Larecki and Banach (2011). For the maximum entropy closure derived by Cernohorsky and Bludman (1994), the Eddington factor is

$$\begin{aligned} {\mathfrak {k}}_{\text{ CB }}({\mathscr {D}},{\mathfrak {h}}) = \frac{1}{3} + \frac{2\,(1-{\mathscr {D}})\,(1-2{\mathscr {D}})}{3}\,\varTheta \Big (\frac{{\mathfrak {h}}}{1-{\mathscr {D}}}\Big ), \end{aligned}$$
(147)

where \(\varTheta (x)=x^{2}(3-x+3x^{2})/5\). To account for Fermi–Dirac statistics, the Eddington factor in Eq. (147) depends on both the number density \({\mathscr {D}}\) and the flux factor \({\mathfrak {h}}\). In the low-occupancy limit when, \({\mathscr {D}}\ll 1\), this Eddington factor reduces to the Eddington factor due to Minerbo in Eq. (141). Cernohorsky and Bludman (1994) did not provide an expression for the heat flux factor.

It should be noted that the term “M1 closure,” used here to refer to the closure in Eqs. (143) and (144), derives from the more general term “MN closure,” which is used in transport theory to refer to maximum entropy closures applied to N-moment hierarchies. As such, all the closures discussed in this section are M1 closures, but they differ in the entropy functional that is maximized.

Kershaw closure A different approach to the closure problem was proposed by Kershaw (1976). The key idea behind the Kershaw closure is to consider the bounds on the moments generated by the underlying distribution function. For a nonnegative distribution function (\({\mathfrak {f}}\ge 0\)), the set generated by the normalized moments \(\{\,1,\,{\mathfrak {h}},\,{\mathfrak {k}},\,{\mathfrak {q}}\,\}\) is convex and bounded, which in turn allows one to construct any sequence of moments in this set by a convex combination of moment vectors on the boundary of this domain. The moments constructed by this procedure are then “good” in the sense that they can be obtained from a nonnegative distribution function.

For the two-moment model, the Kershaw closure procedure can be used to specify \({\mathfrak {k}}\) and \({\mathfrak {q}}\) in terms of \({\mathfrak {h}}\). For \({\mathfrak {f}}\ge 0\), it is straightforward to show that \(-1\le {\mathfrak {h}}\le 1\), while the bounds on the Eddington factor are given by

$$\begin{aligned} {\mathfrak {h}}^{2}\equiv {\mathfrak {k}}_{{\textsc {L}}}({\mathfrak {h}})\le {\mathfrak {k}}\le {\mathfrak {k}}_{{\textsc {H}}}({\mathfrak {h}})\equiv 1. \end{aligned}$$
(148)

For \(\zeta \in [0,1]\), the Eddington factor can be written as the convex combination

$$\begin{aligned} {\mathfrak {k}}_{{\textsc {K}}}({\mathfrak {h}}) = \zeta \,{\mathfrak {k}}_{{\textsc {L}}}({\mathfrak {h}})+(1-\zeta )\,{\mathfrak {k}}_{{\textsc {H}}}({\mathfrak {h}}). \end{aligned}$$
(149)

Demanding that this expression be correct in the limit when \({\mathfrak {h}}=0\), i.e., \({\mathfrak {k}}(0)=1/3\), gives \(\zeta =2/3\), so that

$$\begin{aligned} {\mathfrak {k}}_{{\textsc {K}}}({\mathfrak {h}}) = \frac{1}{3} + \frac{2}{3}\,{\mathfrak {h}}^{2}. \end{aligned}$$
(150)

Similarly, for the heat flux factor, it can be shown that the following bounds hold (e.g., Schneider 2016):

$$\begin{aligned} -{\mathfrak {k}}+\frac{({\mathfrak {h}}+{\mathfrak {k}})^{2}}{1+{\mathfrak {h}}} \equiv {\mathfrak {q}}_{{\textsc {L}}}({\mathfrak {h}},{\mathfrak {k}})\le {\mathfrak {q}}\le {\mathfrak {q}}_{{\textsc {H}}}({\mathfrak {h}},{\mathfrak {k}})\equiv {\mathfrak {k}}-\frac{({\mathfrak {h}}-{\mathfrak {k}})^{2}}{1-{\mathfrak {h}}}. \end{aligned}$$
(151)

Constructing the heat flux factor from a convex combination of these bounds, and using \({\mathfrak {k}}_{{\textsc {K}}}({\mathfrak {h}})\), gives

$$\begin{aligned} {\mathfrak {q}}_{{\textsc {K}}}({\mathfrak {h}}) =\zeta \,{\mathfrak {q}}_{{\textsc {L}}}({\mathfrak {h}},{\mathfrak {k}}_{{\textsc {K}}}({\mathfrak {h}})) +(1-\zeta )\,{\mathfrak {q}}_{{\textsc {H}}}({\mathfrak {h}},{\mathfrak {k}}_{{\textsc {K}}}({\mathfrak {h}})). \end{aligned}$$
(152)

Demanding that \({\mathfrak {q}}_{{\textsc {K}}}(0)=0\) (isotropic limit) gives \(\zeta =1/2\), so that

$$\begin{aligned} {\mathfrak {q}}_{{\textsc {K}}}({\mathfrak {h}}) =\frac{{\mathfrak {h}}\,\big (\,{\mathfrak {h}}^{2}+{\mathfrak {k}}_{{\textsc {K}}}({\mathfrak {h}})^{2}-2\,{\mathfrak {k}}_{{\textsc {K}}}({\mathfrak {h}})\,\big )}{({\mathfrak {h}}^{2}-1)}. \end{aligned}$$
(153)

In Fig. 5, the Eddington and heat flux factors \({\mathfrak {k}}_{{\textsc {K}}}\) and \({\mathfrak {q}}_{{\textsc {K}}}\) are plotted versus the flux factor \({\mathfrak {h}}\) (denoted “Kershaw” in the legend; solid and dotted blue lines, respectively). The Kershaw closure considered here only assumes \({\mathfrak {f}}\ge 0\), which holds for Bose–Einstein and Boltzmann statistics. Kershaw-type closures for Fermi–Dirac statistics, which is appropriate for neutrinos where \({\mathfrak {f}}\in [0,1]\), was recently considered by Banach and Larecki (2017).

4.7.5 One-moment kinetics

One-moment models [commonly referred to as flux-limited diffusion models (Levermore and Pomraning 1981)] are among the earliest transport models adopted for neutrino transport in core-collapse supernova simulations (Bruenn 1975), and are still in use today (e.g., Bruenn et al. 2020; Rahman et al. 2019). Essentially, one-moment models evolve only the zeroth moment of the distribution function, while higher-order moments are specified through a closure procedure. Specifically, the radiation flux is specified in terms of the zeroth moment in a way that it is correct in both the diffusion and streaming regimes. In order to be correct in the streaming regime, a flux limiter is applied to transition the model from parabolic (diffusion) to hyperbolic (streaming). Here we consider the 3 + 1 general relativistic formulation presented by Rahman et al. (2019), which was derived using the formalisms from Shibata et al. (2011), Endeve et al. (2012) and Cardall et al. (2013b). We start from a slightly different perspective, since we already have presented the main evolution equation in Eq. (116). Rahman et al. (2019) define their angular moments with an additional factor of \(\varepsilon ^{2}\) relative to our definitions in Eq. (89) and absorb \(\sqrt{\gamma }\) into the variables; hence, we make the following definitions

$$\begin{aligned} \big \{\,\hat{{\mathscr {J}}},\,\hat{{\mathscr {H}}}^{\mu },\,\hat{{\mathscr {T}}}^{\mu \nu },\ldots \big \} =\sqrt{\gamma }\,\varepsilon ^{2}\,\big \{\,{\mathscr {J}},\,{\mathscr {H}}^{\mu },\,{\mathscr {T}}^{\mu \nu },\ldots \big \}; \end{aligned}$$
(154)

i.e., similar definitions hold for other moments appearing in the equations. (They also do not normalize their moments by the factor of \(4\pi \), but that should not cause confusion in the presentation here.) We can then write Eq. (116) as

$$\begin{aligned}&\frac{1}{\alpha } \big [\,\partial _{t}{}\big (\,\big [\,W\hat{{\mathscr {J}}}+v^{i}\hat{{\mathscr {H}}}_{i}\,\big ]\,\big ) +\partial _{i}{}\big (\,\big [\,\alpha \,\hat{{\mathscr {H}}}^{i}+\big (\,\alpha \,v^{i}-\beta ^{i}\,\big )\,W\hat{{\mathscr {J}}}\,\big ]\,\big )\,\big ] \nonumber \\&\qquad +\hat{{\mathscr {R}}}_{\varepsilon } - \partial _{\varepsilon }{}\big (\,\varepsilon \,\hat{{\mathscr {R}}}_{\varepsilon }\,\big ) =\frac{1}{4\pi }\int _{{\mathbb {S}}^{2}}\hat{{\mathscr {C}}}(f)\,d\omega , \end{aligned}$$
(155)

where we have defined

$$\begin{aligned} \hat{{\mathscr {R}}}_{\varepsilon }&= \hat{{\mathscr {T}}}^{\mu \nu }\nabla _{\mu }u_{\nu } \nonumber \\&=W\,\Big [\,\hat{{\mathscr {F}}}_{k}\,\partial _{\tau }{v^{k}} + \hat{{\mathscr {S}}}_{k}^{i}\partial _{i}{v^{k}} + \big (\hat{{\mathscr {F}}}^{i}-\hat{{\mathscr {E}}}v^{i}\big )\,\partial _{i}{\ln \alpha } + \alpha ^{-1}\hat{{\mathscr {F}}}_{k}v^{i}\partial _{i}{\beta ^{k}} \nonumber \\&\quad +\hat{{\mathscr {S}}}^{ik}\big (\,\frac{1}{2}v^{m}\partial _{m}{\gamma _{ik}}-{\mathsf {K}}_{ik}\,\big )\,\Big ] -\big (\,\hat{{\mathscr {E}}}-v^{k}\hat{{\mathscr {F}}}_{k}\,\big )\,\partial _{\tau }{W} - \big (\,\hat{{\mathscr {F}}}^{i}-\hat{{\mathscr {S}}}_{k}^{i}v^{k}\,\big )\,\partial _{i}{W}, \end{aligned}$$
(156)

where in the second step, we used Eq. (117), re-expressed in the form given by Rahman et al. (2019) [cf. their Eq. (A14)], and we have defined \(\partial _{\tau }{}=n^{\mu }\partial _{\mu }{}\).

Rahman et al. (2019) solve for moments defined in an orthonormal comoving frame, and write

$$\begin{aligned} \hat{{\mathscr {H}}}^{i}&=L^{i}_{{\hat{\mu }}}\hat{{\mathscr {H}}}^{{\hat{\mu }}} = e^{i}_{{\bar{\mu }}}\,\varLambda ^{{\bar{\mu }}}_{{\hat{\mu }}}\hat{{\mathscr {H}}}^{{\hat{\mu }}} \nonumber \\&=e^{i}_{{\hat{i}}}\,\hat{{\mathscr {H}}}^{{\hat{i}}} + W\,\Big (\,\frac{W}{W+1}\,v^{i}-\frac{\beta ^{i}}{\alpha }\,\Big )\,{\bar{v}}_{{\hat{i}}}\,\hat{{\mathscr {H}}}^{{\hat{i}}}, \end{aligned}$$
(157)

where

$$\begin{aligned} \hat{{\mathscr {H}}}^{{\hat{i}}}(\varepsilon )=\sqrt{\gamma }\,\frac{\varepsilon ^{3}}{4\pi }\int _{{\mathbb {S}}}f(\omega ,\varepsilon )\,\ell ^{{\hat{i}}}(\omega )\,d\omega . \end{aligned}$$
(158)

(Similar expressions can be made for higher-order moments; see Endeve et al. 2012; Rahman et al. 2019.) In Eq. (157), we remind the reader that \(\varLambda ^{{\bar{\mu }}}_{{\hat{\mu }}}\) is the Lorentz transformation between the orthonormal comoving frame basis and the orthonormal laboratory frame basis, while \(e^{i}_{{\bar{\mu }}}\) is a transformation between the orthonormal laboratory frame basis and the coordinate basis. We have made the choice \(e^{\mu }_{{\bar{0}}}=n^{\mu }\), and \({\bar{v}}_{{\hat{i}}}\) are three-velocity components in the orthonormal laboratory frame basis (\({\bar{v}}_{{\bar{i}}}={\bar{v}}^{{\bar{i}}}={\bar{v}}_{{\hat{i}}}={\bar{v}}_{{\hat{i}}}\)), so that \(v^{i}=e^{i}_{{\hat{i}}}{\bar{v}}^{{\hat{i}}}\), where the notation \(e^{i}_{{\hat{i}}}=e^{i}_{{\bar{i}}}\delta ^{{\bar{i}}}_{{\hat{i}}}\) is used.

To close the one-moment (MGFLD) model, Rahman et al. (2019) replace the momentum density by the gradient of the energy density:

$$\begin{aligned} {\mathscr {H}}^{{{\hat{i}}}}\longrightarrow - D \frac{e^{k {{\hat{i}}}}}{\alpha ^3} \partial _k (\alpha ^3 {\mathscr {J}}) , \end{aligned}$$
(159)

where D is the diffusion coefficient, which they express in terms of the flux-limiter \(\lambda \in [0,1/3]\) and the total opacity \(\kappa _t\) as

$$\begin{aligned} D \equiv \frac{\lambda }{\kappa _\mathrm {t}}. \end{aligned}$$
(160)

For Levermore–Pomraning and Wilson flux-limiting,

$$\begin{aligned} \lambda _\mathrm {LP}\equiv & {} \frac{2+R}{6+3R+R^2}, \nonumber \\ \lambda _\mathrm {Wilson}\equiv & {} \frac{1}{3+R}, \end{aligned}$$
(161)

respectively, where Rahman et al. (2019) define the generalized Knudsen number as

$$\begin{aligned} R\equiv & {} \frac{|e^{k {{\hat{i}}}}\partial _{k} (\alpha ^3{\mathscr {J}})|}{\kappa _\mathrm {t}\alpha ^3{\mathscr {J}}}. \end{aligned}$$
(162)

Thus, when the opacity is high, \(R\rightarrow 0\) and \(\lambda \rightarrow 1/3\). On the other hand, when the opacity is low, \(\lambda \rightarrow 1/R\) and

$$\begin{aligned} {\mathscr {H}}^{{{\hat{i}}}}\rightarrow -\frac{e^{k {{\hat{i}}}}\partial _{k} (\alpha ^3{\mathscr {J}})}{|e^{k {{\hat{i}}}}\partial _{k} (\alpha ^3{\mathscr {J}})|}\,{\mathscr {J}}. \end{aligned}$$
(163)

The Eddington tensor is related to the neutrino radiation stress tensor:

$$\begin{aligned} \chi ^{{{\hat{i}}} {{\hat{j}}}}= & {} \frac{{\mathscr {K}}^{{{\hat{i}}} {{\hat{j}}}}}{{\mathscr {J}}} . \end{aligned}$$
(164)

In the MGFLD approximation, the Eddington tensor, which appears in the expression for \(\hat{{\mathscr {R}}}_{\varepsilon }\), takes a form analogous to Eq. (130):

$$\begin{aligned} \chi ^{{{\hat{i}}} {{\hat{j}}}} = \frac{1}{2} [(1-\chi )\delta ^{{{\hat{i}}} {{\hat{j}}}} + (3\chi -1)h^{{{\hat{i}}}}h^{{{\hat{j}}}}] . \end{aligned}$$
(165)

In Eq. (165), \(h^{{\hat{i}}}\) is a unit vector in the direction of the neutrino flux, \({\mathscr {H}}^{{\hat{i}}}\), and \(\chi \) is the Eddington factor, which is given by

$$\begin{aligned} \chi = \lambda + (\lambda R)^2 . \end{aligned}$$
(166)

5 Neutrino interactions

The phenomenon of core-collapse supernovae is a magnificent juxtaposition of the macroscopic physics of neutrino radiation hydrodynamics and the microscopic physics of neutrino weak interactions and the nuclear equation of state. In particular, the weak interactions between the neutrinos and the matter are what make neutrinos important to this phenomenon. Thus, any review of neutrino transport in core-collapse supernovae must include a discussion of such interactions. In the history of core-collapse supernova modeling, there have been many important examples of studies that have demonstrated the impact of additional weak interaction physics and/or improved treatments of such physics in supernova models. Here we select a subset of these studies, each selected to investigate one of the dimensions of this component of supernova modeling: (1) The impact of the addition of new weak-interaction channels. (2) The impact of improved treatments of channels that have already been included in the models. (3) The interplay between different weak-interaction channels and the impact of adding/changing more than one weak-interaction channel at a time in a model. (4) The uncertainties in the weak-interaction rates currently used in core-collapse supernova models and their ramifications for core-collapse supernova modeling.

5.1 An intertwined history

Looking back at the history of the development of the theory of weak interactions and of core-collapse supernovae, especially during the time frame after the discovery and publication of the electroweak theory, it becomes obvious that (1) the first period of what can be called modern core-collapse supernova theory, after the publication of the seminal work of Colgate and White, was greatly influenced and greatly accelerated by the new electroweak theory, for more than a decade, and (2) the interplay between advancing descriptions of neutrino weak interactions and core-collapse supernovae continued well beyond this period, even to this day.

A year after the publication of the Colgate and White work, the electroweak theory was published (Weinberg 1967; Salam 1968). It was specifically the advent of weak neutral currents that would turn out to be a game changer for core-collapse supernova theory. Seven years after the publication of the electroweak theory, Freedman (1974) showed that, owing to weak neutral currents, neutrinos could scatter coherently off the nucleons in a nucleus, introducing an \(A^2\) dependence in the cross section, where A is the atomic number. During stellar core collapse, the core is neutronized through the emission and escape of electron neutrinos. As a result, the core nuclei become large—i.e., have large A—given that the nuclear size is a competition between Coulomb repulsion and surface tension, the former favoring smaller nuclei, the latter favoring larger nuclei, and the latter winning out. In turn, coherent nuclear scattering cross sections become large. Following Freedman’s discovery and publication, Tubbs and Schramm provided an electroweak-theory-based set of cross sections for problems of astrophysical interest (Tubbs and Schramm 1975). Subsequently, these were implemented in the pioneering work of Arnett (1977), wherein he showed that coherent nuclear scattering led to the trapping of electron neutrinos during stellar core collapse and to the development of a trapped Fermi sea of them in the core. This provided the foundation for the discovery 5 years later by Wilson that the stalled core-collapse supernova shock wave could be revived by charged-current mediated electron neutrino and antineutrino absorption on the shock-liberated nucleons behind it (Wilson 1985; Bethe and Wilson 1985), which marked the beginning of contemporary core-collapse supernova theory, which has largely operated within the framework of the delayed-shock or, equivalently, the neutrino-reheating mechanism. The 15 years between 1966 and 1982 saw the fundamental and significant advance from the first models of core-collapse supernovae to the establishment of the framework within which all core-collapse supernova modelers operate today. The developments in core-collapse supernova theory during these first 15 years were very tightly intertwined with the development of weak interaction physics. While this period was certainly unique in this regard, additional milestones, owing to further development in the theory of neutrino interactions in the environments of interest here, occurred since.

Bruenn (1985) published a landmark paper on the physics of stellar core collapse. Bruenn included the following electron neutrino emissivities and opacities in his models, which have come to be known as the “Bruenn 85” opacity set. Subsequent to Bruenn’s publication and prior to the publications discussed below, this set was frozen in as the canonical neutrino opacity set. It is still used today in code tests and comparisons. Bruenn included electron capture on (free) protons and nuclei and the inverse interactions of electron neutrino absorption, as well as scattering on (free) nucleons and electrons and coherent scattering on nuclei in his models. For electron antineutrino and heavy-flavor neutrino production, electron–positron pair annihilation served as the dominant source after core bounce and shock formation.

Hannestad and Raffelt (1998) computed the production of neutrino–antineutrino pairs from nucleon–nucleon bremsstrahlung. Prior to the recognition that such bremsstrahlung could lead to, and perhaps dominate, neutrino pair production, pair production occurred only through electron–positron pair annihilation. Thus, particularly for the muon and tau neutrino flavors, which have only pair production as sources, bremsstrahlung production introduced a fundamental change. Figures 6 and 7 show the relative importance of nucleon–nucleon bremsstrahlung for the production of electron neutrino–antineutrino pairs of all three flavors, relative to the production by electron–positron annihilation. The results shown are for two times after core bounce, at 4 and 100 ms, in a core-collapse supernova model performed with the Chimera code, initiated from an \(18\,M_\odot \) progenitor.

Fig. 6
figure 6

The neutrino number production rate due to neutrino–antineutrino pair production via electron–positron annihilation and nucleon–nucleon bremsstrahlung are plotted. Also shown are the thermalization surfaces for the different neutrino and antineutrino flavors, as well as the shock location at this time after bounce: 4 ms. The data used to generate the plot are taken from a Chimera model using an \(18\,M_\odot \) progenitor. At the high densities present at radii below \(\sim 10\,\hbox {km}\) in the core, pair production from bremsstrahlung dominates. On the other hand, between the heavy-flavor thermalization surfaces and the shock, production by electron–positron pair annihilation is consistently larger

Fig. 7
figure 7

The same as in Fig. 6 but at a time of 100 ms after bounce, during the critical shock reheating epoch. At this time, at radii above \(\sim 25\,\hbox {km}\), which is well below the thermalization spheres where the neutrino spectra set in, neutrino pair production is dominated by electron–positron pair annihilation. At high densities, below \(\sim 10\,\hbox {km}\), production due to bremsstrahlung continues to dominate

In the same year, Burrows and Sawyer (1998) and Reddy et al. (1998) took on the long-term challenge to understand neutrino interactions in dense, interacting, nuclear matter, taking into account nucleon recoil, degeneracy, relativity, thermal motions, and correlations. In particular, these authors computed new differential scattering rates (and new charged-current absorption and emission rates), which were no longer iso-energetic, as had been assumed before (e.g., in the Bruenn 85 opacity set), but resulted in small energy transfer between the neutrinos and the nucleons. Per scattering, the amount of energy transferred would be of little consequence, but taken over all of the scattering events in the dense environment in the vicinity of the neutrinospheres, such small-energy scattering has a notable impact. Müller et al. (2012) were the first to demonstrate this. In particular, they showed that small-energy scattering of heavy flavor neutrinos by nucleons at the electron neutrino- and antineutrino-spheres led to heating of these neutrinospheres and, consequently, an increase in the electron neutrino flavor luminosities. Their results are shown in Fig. 8. This in turn impacted neutrino shock reheating. In the absence of small-energy scattering on nucleons, shock revival was delayed by 50–100 ms relative to their baseline model.

Fig. 8
figure 8

Image reproduced with permission, copyright by AAS

Plotted are the neutrino and antineutrino luminosities for all three flavors of neutrinos, as a function of density, at 400 ms after bounce in the general relativistic model of Müller et al. (2012) initiated from a \(15\,M_\odot \) progenitor. Solid lines show data from the model that includes neutrino–nucleon small-energy scattering. Evident in the plots is the \(\sim 20\%\) increase in both the electron neutrino and antineutrino luminosities, at a density of \(10^{11}\,{\text {g}}\,{\text {cm}}^{-3}\), due to the heating of the electron neutrinospheres resulting from the scattering of higher-energy heavy flavor neutrinos, emanating from deeper regions, on nucleons in the neutrinospheric region

In 2003, yet another source of heavy-flavor neutrino pair production was introduced. Buras et al. (2003) examined the production of heavy-flavor neutrino pairs through the annihilation of electron-neutrino pairs. They too found that heavy-flavor pair production by electron-flavor pair annihilation dominated the production of such pairs through electron–positron pair annihilation. Moreover, they found that the inclusion of this mode of heavy-flavor production in their model boosted the heavy-flavor luminosities during the first \(\sim \)150 ms after bounce and decreased the electron-flavor luminosities after \(\sim 200\) ms. They found too that their shock was weaker and reached a smaller peak radius when electron-flavor pair annihilation was included. While the differences were not “dramatic,” they concluded they were also not “negligible.”

And once again in the same year, progress was made on a different front. The rates for electron capture on nuclei in the Bruenn 85 opacity set are based on the Independent Particle Model for the nucleons in the nucleus. That is, the IPM assumes that the nucleons are noninteracting. Under this assumption, the final neutron states are filled for nuclei with \(N>40\), which is true of the stellar core nuclei, and electron capture is blocked, relying in turn solely on capture on protons. This assumption was finally removed, as nuclear structure models developed. In 2003, rates for electron capture on nuclei using a “hybrid” model, wherein thermal excitation and nucleon–nucleon correlations were both accounted for, were recomputed (Langanke et al. 2003). Owing to the improved description, capture in nuclei was no longer blocked and in fact dominates capture on protons during core collapse, resulting in a more neutronized/deleptonized core, a smaller inner core, and a deeper shock formation mass (Hix et al. 2003).

The importance of the above additions and modifications to the neutrino opacities in core collapse supernova models were reinforced in the context of later two-dimensional models developed by other groups (Burrows et al. 2018; Just et al. 2018; Kotake et al. 2018) (Fig. 9).

Fig. 9
figure 9

Image reproduced with permission from Hix et al. (2003), copyright by APS

Plots of the density, entropy per baryon, electron fraction, and fluid velocity at core bounce using data from two models: one implementing the “Bruenn 85” electron capture rates (Bruenn 1985), based on the Independent Particle Model of nucleons, and one implementing the rates of Langanke et al. (2003), based on the “Hybrid” model, which includes correlations between the nucleons in RPA and finite-temperature effects. The former (latter) data correspond to the thin (thick) black lines in the plots. Given the inclusion of the hybrid model electron capture rates, electron capture is unblocked and proceeds, leading to increased electron capture in the core in this model and, consequently, to a significant (inward) change in the location of the bounce shock in mass

Earlier in this section, we have seen the impact of adding new weak interaction channels and improving the treatment of those already included in core-collapse supernova model. Here we explore yet another dimension of this important sector of core-collapse supernova physics: the interplay of neutrino weak interaction channels (new and/or modified). Lentz et al. (2012a) conducted an in-depth analysis focused largely on the neutrino production and interaction channels discussed above (i.e., nucleon–nucleon bremsstrahlung, non-isoenergetic scattering, and electron capture on nuclei). They demonstrated several important points: (1) While the addition of a single interaction channel may impact the dynamics of stellar core collapse and the post-bounce evolution, the addition of two interaction channels may not be additive—in fact, it may render one of the additional channels irrelevant. (2) When two or more interaction channels are included and are instead additive, the additive impact may be nonlinear. As an example, Lentz et al. considered the interplay of electron capture on nuclei and neutrino–electron scattering during stellar core collapse. If we consider the nucleons as independent particles [Independent Particle Model (IPM)], electron capture on nuclei is blocked for \(N>40\), where N is the neutron number. In this case, the nuclear electron capture rates are given by Bruenn (1985) are appropriate. In this instance, neutrino–electron scattering, which scatters neutrinos to lower energies given the core’s electron degeneracy, leads to a significant increase in core deleptonization and a concommitant decrease in the inner core mass. On the other hand, if the improved nuclear electron capture rates of Langanke et al. are used, which factor in nucleon interactions and correlations, nuclear electron capture is no longer blocked. In turn, low-energy neutrino states are filled, and neutrino–electron scattering is no longer able to down scatter neutrinos in energy (and contributes very little to the total neutrino opacity) and becomes rather unimportant. This is captured in Fig. 10. Comparing, for example, the velocity at bounce in the upper left panel of Fig. 10 for the cases “Base,” which includes the full set of neutrino weak interactions with “Base–noNES,” which leaves out neutrino–electron scattering, it is obvious there is no difference. This is also true of all of the other quantities plotted. On the other hand, a comparison between “Base” and “IPA,” which includes nuclear electron capture in the independent particle approximation, it is evident that neutrino–electron scattering had a significant impact during collapse and on the final shock formation location.

Fig. 10
figure 10

Image reproduced with permission from Lentz et al. (2012a), copyright by AAS

Plots of velocity, density, entropy, temperature, electron and lepton fraction, and pressure at core bounce across five models with different input physics. The model “Base” includes all weak interactions and uses the modern, hybrid-model electron capture rates. Model Base-NoNES includes the same weak interaction physics, with one exception: neutrino-electron scattering (NES) is not included. Similarly, model “IPA” includes all weak interaction channels, as does model Base, but uses the Independent Particle Approximation (IPA) rates for nuclear electron capture. And model “IPA-NoNES” includes the same weak interaction physics except neutrino–electron scattering. Comparing models Base and Base-NoNES, no significant changes result when NES is excluded. On the other hand, comparing models IPA and IPA-NoNES, we reach a different conclusion: in this case, the inclusion of NES has a significant impact on core deleptonization and, consequently, on the mass of the inner core at bounce. These comparisons demonstrate there is an interplay between different neutrino opacities. An improvement in one opacity may render an otherwise important second opacity relatively unimportant

That the search for all core collapse-supernova relevant neutrino weak interactions is an ongoing activity is no better illustrated than by the very recent example provided by Bollig et al. (2017), whose work illuminated the importance of including muons and neutrino–muon weak interactions in core-collapse supernova models. Past models assumed that the population of muons in the stellar core during collapse, bounce, and the post-bounce neutrino shock reheating epoch would remain low given the large rest mass of the muon. Bollig et al. point out that such arguments are not well motivated. The electron chemical potential in the proto-neutron star at this time exceeds the muon rest mass, and the core temperature is large, as well. In the context of two-dimensional supernova models using the Vertex code and initiated from a \(20\,M_\odot \) progenitor, they demonstrated that significant populations of muons are in fact produced and, more importantly, that the inclusion of muons in their supernova models impacted the outcomes quantitatively in all cases and even qualitatively in some cases, depending on the nuclear equation of state used. For the SFHo equation of state, models with muons exhibited explosion whereas counterpart models without them did not. For models with the LS220 equation of state, models with muons exhibited earlier explosions, indicating that explosion was facilitated in these models. Bollig et al.’s results are encapsulated in Fig. 11.

Fig. 11
figure 11

Image reproduced with permission from Bollig et al. (2017), copyright by APS

In the upper left panel, the angle-averaged shock trajectories for several models, excluding and including muons, using the Steiner–Fischer–Hempel (SFHo) equation of state, are plotted. In the upper right panel, plotted are the results from the models that instead use the Lattimer–Swesty equation of state with bulk compression modulus \(K=220\,\hbox {MeV}\) (LS220). Here “Standard” indicates models without muons

Fig. 12
figure 12

Image reproduced with permission, copyright by AAS

Results are shown here from the core-collapse supernova studies of Melson et al. (2015a). In particular, in the uppermost left panel, the angle-averaged shock radius is plotted for two pairs of models, one for the two-dimensional case and one for the three-dimensional case. All cases are launched from a \(20\,M_\odot \) progenitor and were performed with the Vertex code. Within each case, two simulations were performed, one using the standard weak interaction cross section for neutrino–nucleon scattering and the other including a correction to the strangeness content of the nucleon, which results in a correction to the coupling constants. In two dimensions, both models explode, with some quantitative differences observed in the shock trajectories. In the more important three-dimensional case, the outcomes with and without the correction are qualitatively different. Specifically, explosion is not obtained in their model unless the opacity correction is included

We close this section with an emphasis on one final important point: Like all weak interaction cross sections, those the community has found to be important to core-collapse supernova evolution and has included in its supernova models have uncertainties associated with them, which can arise from experimental uncertainties in the few cases where the cross sections have been measured directly, or from uncertainties in the theory used to predict them, which in the end the supernova modeling community must rely on given it is impossible to measure all relevant cross sections under all relevant thermodynamic conditions and at all relevant neutrino energies found in a supernova environment. Thus, it is important to explore the potential impact of such uncertainties on the quantitative and qualitative core-collapse supernova model outcomes.

Case in point: The exploration of the impact of the uncertainty in the neutrino–nucleon cross section. Melson et al. (2015a) performed two state-of-the-art three-dimensional simulations of the core-collapse supernova explosion of a \(20\,M_\odot \) progenitor. In one case, they included what the modeling community regarded at the time as the state-of-the-art neutrino weak interaction set, with no modification to any of the cross sections. In the other, they varied one of the cross sections, albeit a critical one: the cross section for neutrino scattering on nucleons. This cross section is one of the most important for neutrino transport below the neutrinospheres, as the leading opacity source and, as we saw above, as an additional heating source for matter within the proto-neutron star. Uncertainty in the cross section for neutrino–nucleon scattering arises from, among other things, uncertainty in the strangeness content of the nucleon, which can alter the coupling constants. In particular, Melson et al. varied the cross section by \(\sim \)10%, consistent with the experimental uncertainties, and in so doing found they could qualitatively alter the outcome of the model. When the standard weak interaction set was used, they did not obtain an explosion in the model. When they varied the neutrino–nucleon cross section, they did. The results are shown in Fig. 12. Of course, we have already seen that variations in a particular cross section can interact with variations in another. The only way the supernova modeling community can accurately assess the impact of variations in a single cross section is to vary all of them, in a statistically meaningful way—i.e., perform a sensitivity study. And, obviously, this should be performed, at least ultimately, in the context of three-dimensional models. Unfortunately, the last requirement cannot be met at this time. Such a study would require that many three-dimensional models be performed, which at the moment, even with the significant computing power afforded the modeling community by today’s supercomputers, is prohibitive. Such studies should be conducted, but they will have to wait for future supercomputing capabilities.

5.2 The relevant neutrino interactions

The previous section makes clear that the effort to ascertain which neutrino weak interactions are important to core-collapse supernovae theory is an ongoing activity. To date, the list included in Table 1 is what is deemed to be the essential list. Most, if not all, of the weak interactions in the list have been included in the state-of-the-art simulations whose underlying numerical methods have been the focus of this review. Motivated by the recent example documented in the previous section, in Table 2 we also include a list of the relevant neutrino weak interactions involving muons. At present, these have been included by only one group (Bollig et al. 2017) and, as discussed, have been found to be important by this group. In light of this, adoption of these weak interactions by other groups is certainly warranted.

Table 1 Relevant modern neutrino emissivities and opacities, most or all of which have been adopted in three-dimensional core-collapse supernova models

5.2.1 Boltzmann collision term

We write the collision term as the sum of terms corresponding to the main processes—emission and absorption, scattering, and pair creation and annihilation—listed in Table 1:

$$\begin{aligned} {\mathscr {C}}[f_{s}](p) = {\mathscr {C}}_{{\textsc {AbEm}}}[f_{s}](p) + {\mathscr {C}}_{{\textsc {Scat}}}[f_{s}](p) + {\mathscr {C}}_{{\textsc {Pair}}}[f_{s}](p). \end{aligned}$$
(167)

For each of the terms, we focus on its functional form, which is closely related to the computational complexity of including a particular weak interaction in a core-collapse supernova model. Each term—hence, each added interaction—warrants tailored consideration.

Table 2 Relevant neutrino–muon weak interactions

The term due to neutrino emission and absorption is written as

$$\begin{aligned} \frac{1}{\varepsilon }\,{\mathscr {C}}_{{\textsc {AbEm}}}[f_{s}](p) = [1-f_{s}(p)]\eta _{s} - \chi _{s}\,f_{s}(p), \end{aligned}$$
(168)

where \(\eta _{s}\) and \(\chi _{s}\) are the emissivity and absorption opacity of neutrino species s and are assumed to be isotropic in the momentum-space angle (independent of \(\omega \)), but depend on the neutrino energy \(\varepsilon \). The blocking factor, \(1-f_{s}(p)\), is included to account for the Fermi–Dirac statistics of neutrinos, and suppresses neutrino emission when the phase-space occupancy is high (i.e., when \(f_{s}\lesssim 1\)). It is common to introduce \({\tilde{\chi }}_{s}=(\eta _{s}+\chi _{s})\), associated in this case with “stimulated absorption” (as opposed to the stimulated emission of photons), and to define \(f_{0,s}=\eta _{s}/{\tilde{\chi }}_{s}\), in which case Eq. (168) can be written in relaxation form:

$$\begin{aligned} \frac{1}{\varepsilon }\,{\mathscr {C}}_{{\textsc {AbEm}}}[f_{s}](p) = {\tilde{\chi }}\,\big (\,f_{0,s}-f_{s}\,\big ). \end{aligned}$$
(169)

In this form it is easy to see that the collision term drives the distribution function towards the equilibrium distribution, \(f_{0,s}\). Also note, this interaction is local in momentum-space; i.e., there is no coupling across momentum-space.

Neutrino–matter scattering (the second and third category in Table 1) is described by

$$\begin{aligned} \frac{1}{\varepsilon }\,{\mathscr {C}}_{{\textsc {Scat}}}[f_{s}](p)&=\big (1-f_{s}(p)\big )\int _{V_{p}}{\mathscr {R}}_{{\textsc {Scat}}}^{{\textsc {In}}}(p,p^{\prime })\,f_{s}(p^{\prime })\,d^{3}p^{\prime } \nonumber \\&\quad -f_{s}(p)\int _{V_{p}}{\mathscr {R}}_{{\textsc {Scat}}}^{{\textsc {Out}}}(p,p^{\prime })\,(1-f_{s}(p^{\prime }))\,d^{3}p^{\prime }, \end{aligned}$$
(170)

where \({\mathscr {R}}_{{\textsc {Scat}}}^{{\textsc {In}}}(p,p^{\prime })\) is the scattering rate from momentum \(p^{\prime }\) into p, and \({\mathscr {R}}_{{\textsc {Scat}}}^{{\textsc {Out}}}(p,p^{\prime })\) is the scattering rate out of momentum p into \(p^{\prime }\). When compared with the collision term in Eq. (169), the coupling in momentum-space (due to the integral operators) increases the computational complexity of evaluating the collision operator. If momentum-space is discretized into \(N_{p}\) bins, a brute force evaluation of Eq. (170) for all p requires \({\mathscr {O}}(N_{p}^{2})\) operations. Note also the blocking factors in Eq. (170), which suppress scattering to high-occupancy regions of momentum-space. The second category in Table 1 (coherent, isoenergetic scattering) is obtained as a simplification of Eq. (170) by letting

$$\begin{aligned} {\mathscr {R}}_{{\textsc {Scat}}}^{{\textsc {In}}/{\textsc {Out}}}(p,p^{\prime }) \rightarrow {\mathscr {R}}_{{\textsc {Iso}}}(|p|,\cos \alpha )\,\delta (|p|-|p^{\prime }|), \end{aligned}$$
(171)

where \(\cos \alpha =p \cdot p^{\prime }/(|p||p^{\prime }|)\). For this type of interaction, with \(d^{3}p^{\prime }=|p^{\prime }|^{2}\,d|p^{\prime }|\,d\omega '\), the collision term is given by

$$\begin{aligned} \frac{1}{\varepsilon }\,{\mathscr {C}}_{{\textsc {Iso}}}[f_{s}](|p|,\omega )&=\int _{{\mathbb {S}}^{2}}{\mathscr {R}}_{{\textsc {Iso}}}(|p|,\cos \alpha )\,|p|^{2}\,f_{s}(|p|,\omega ')\,d\omega ' \nonumber \\&\quad -f_{s}(|p|,\omega )\int _{{\mathbb {S}}^{2}}{\mathscr {R}}_{{\textsc {Iso}}}(|p|,\cos \alpha )\,|p|^{2}\,d\omega ', \end{aligned}$$
(172)

which is considerably simplified relative to the scattering operator in Eq. (170).

Finally, neutrino-antineutrino pair creation and annihilation (e.g., from electron-positron pairs; the fourth category in Table 1) is described by

$$\begin{aligned} \frac{1}{\varepsilon }\,{\mathscr {C}}_{{\textsc {Pair}}}[f_{s}](p)&=(1-f_{s}(p))\int _{V_{p}}{\mathscr {R}}_{{\textsc {Pair}}}^{{\textsc {In}}}(p,p^{\prime })\,(1-{\bar{f}}_{s}(p^{\prime }))\,d^{3}p^{\prime } \nonumber \\&\quad -f_{s}(p)\int _{V_{p}}{\mathscr {R}}_{{\textsc {Pair}}}^{{\textsc {Out}}}(p,p^{\prime })\,{\bar{f}}_{s}(p^{\prime })\,dp^{\prime }, \end{aligned}$$
(173)

where \({\mathscr {R}}_{{\textsc {Pair}}}^{{\textsc {In}}}(p,p^{\prime })\) and \({\mathscr {R}}_{{\textsc {Pair}}}^{{\textsc {Out}}}\) are the neutrino-antineutrino pair production and annihilation rates, respectively, and \({\bar{f}}_{s}\) is the antineutrino distribution function. We note that the functional form of the collision term for the last of the pair processes included in Table 1 is not represented by the functional form for pair creation and annihilation presented here. In this particular case, both in-states and both out-states correspond to neutrinos, which, when treated without approximation, results in a collision term involving four distribution functions. This non-approximate treatment of the process has yet to be implemented in core-collapse supernova models. As a result, we do not include its functional form here.

All of the above rates \(\eta _{s}\), \(\chi _{s}\), \({\mathscr {R}}_{{\textsc {Scat}}}^{{\textsc {In}}/{\textsc {Out}}}\), and \({\mathscr {R}}_{{\textsc {Pair}}}^{{\textsc {In}}/{\textsc {Out}}}\) depend on the thermodynamic state of the stellar core fluid (e.g., \(\rho \), T, and \(Y_{e}\)).

Symmetries in some of the collision kernels exist (e.g., Bruenn 1985; Cernohorsky 1994), which should be leveraged in computations. First, because the total number of neutrinos is conserved in neutrino–matter scattering,

$$\begin{aligned} \int _{V_{p}}{\mathscr {C}}_{{\textsc {Scat}}}[f_{s}](p)\,\frac{d^{3}p}{\varepsilon }=0, \end{aligned}$$
(174)

and the following in–out invariance holds:

$$\begin{aligned} {\mathscr {R}}_{{\textsc {Scat}}}^{{\textsc {In}}}(p,p^{\prime }) = {\mathscr {R}}_{{\textsc {Scat}}}^{{\textsc {Out}}}(p^{\prime },p). \end{aligned}$$
(175)

Second, when the neutrino distribution function equals the local Fermi–Dirac distribution, \(f_{s}=f_{0,s}=1/[e^{(\varepsilon -\mu _{\nu ,s})/T}+1]\), where T is the matter temperature and \(\mu _{\nu }\) is the equilibrium neutrino chemical potential, the net energy and momentum transfer between neutrinos and matter due to scattering must vanish. Thus, requiring

$$\begin{aligned} \int _{V_{p}}{\mathscr {C}}_{{\textsc {Scat}}}[f_{0,s}](p)\,g(p)\,\frac{d^{3}p}{\varepsilon }=0 \end{aligned}$$
(176)

for an arbitrary function g(p), gives

$$\begin{aligned} {\mathscr {R}}_{{\textsc {Scat}}}^{{\textsc {In}}}(p,p^{\prime }) = {\mathscr {R}}_{{\textsc {Scat}}}^{{\textsc {Out}}}(p,p^{\prime })\,e^{-(\varepsilon -\varepsilon ^{\prime })/T}={\mathscr {R}}_{{\textsc {Scat}}}^{{\textsc {In}}}(p^{\prime },p)\,e^{-(\varepsilon -\varepsilon ^{\prime })/T}, \end{aligned}$$
(177)

where Eq. (175) is used in the rightmost expression.

5.2.2 Two-moment collision terms

Collision terms for the two-moment model are derived by taking angular moments of the collision term in Eq. (167). Such terms have been discussed in the context of multidimensional two-moment models by, e.g., Shibata et al. (2011). For completeness, we list two-moment collision terms corresponding to angular moments of Eqs. (169), (170) and (173) here. Considering the two-moment models delineated in Sect. 4.7.3, the relevant angular moments of the Boltzmann collision term are

$$\begin{aligned} \frac{1}{4\pi }\int _{{\mathbb {S}}^{2}}{\mathscr {C}}[f_{s}]\,\frac{d\omega }{\varepsilon } \quad \text {and}\quad \frac{1}{4\pi }\int _{{\mathbb {S}}^{2}}{\mathscr {C}}[f_{s}]\,\ell _{j}\,\frac{d\omega }{\varepsilon }. \end{aligned}$$
(178)

[The first of these terms also appears in the one-moment model discussed in Sect. 4.7.5; cf. Eq. (155)].

Emission/absorption For emission and absorption, the evaluation is straightforward since the emissivity and opacity are isotropic in momentum space angle. The zeroth moment gives

$$\begin{aligned} \frac{1}{4\pi }\int _{{\mathbb {S}}^{2}}{\mathscr {C}}_{{\textsc {AbEm}}}[f_{s}]\,\frac{d\omega }{\varepsilon } =\big (1-{\mathscr {D}}_{s}\big )\,\eta _{s} - \chi _{s}\,{\mathscr {D}}_{s} ={\tilde{\chi }}_{s}\,\big (\,{\mathscr {D}}_{0,s}-{\mathscr {D}}_{s}\,\big ), \end{aligned}$$
(179)

where the zeroth moment of the equilibrium distribution is defined as

$$\begin{aligned} {\mathscr {D}}_{0,s}=\frac{1}{4\pi }\int _{{\mathbb {S}}^{2}}f_{0,s}\,d\omega . \end{aligned}$$
(180)

The first moment gives

$$\begin{aligned} \frac{1}{4\pi }\int _{{\mathbb {S}}^{2}}{\mathscr {C}}_{{\textsc {AbEm}}}[f_{s}]\,\ell _{j}\,\frac{d\omega }{\varepsilon } =-{\tilde{\chi }}_{s}\,{\mathscr {I}}_{s,j}, \end{aligned}$$
(181)

since the angular moment of \(\ell _{j}\) vanishes.

Angular kernel approximations To incorporate scattering and pair processes in the two-moment model, following Bruenn (1985), the kernels are expanded in a Legendre series up to linear order; e.g.,

$$\begin{aligned} {\mathscr {R}}_{{\textsc {Scat}}}^{{\textsc {In}}}(p,p^{\prime }) = {\mathscr {R}}_{{\textsc {Scat}}}^{{\textsc {In}}}(\varepsilon ,\varepsilon ^{\prime },\varOmega ) \approx \varPhi _{{\textsc {Scat}},0}^{{\textsc {In}}}(\varepsilon ,\varepsilon ^{\prime }) + \varPhi _{{\textsc {Scat}},1}^{{\textsc {In}}}(\varepsilon ,\varepsilon ^{\prime })\,\varOmega (\omega ,\omega '), \end{aligned}$$
(182)

where \(\varOmega =\ell _{\mu }(\omega )\ell ^{\mu }(\omega ')\) is the cosine of the scattering angle. From the orthogonality of the Legendre polynomials, the scattering coefficients are then evaluated form the kernels as

$$\begin{aligned} \big \{\,\varPhi _{{\textsc {Scat}},0}^{{\textsc {In}}}(\varepsilon ,\varepsilon ^{\prime }),\varPhi _{{\textsc {Scat}},1}^{{\textsc {In}}}(\varepsilon ,\varepsilon ^{\prime })\,\big \} =\frac{1}{2}\int _{-1}^{1}{\mathscr {R}}_{{\textsc {Scat}}}^{{\textsc {In}}}(\varepsilon ,\varepsilon ^{\prime },\varOmega )\,\big \{\,1,\,3\,\varOmega \,\big \}\,d\varOmega . \end{aligned}$$
(183)

Terms beyond linear can be included in the expansion of the kernel in Eq. (182) at the expense of a more complicated collision operator for the two-moment model. Smit and Cernohorsky (1996) investigated the effect of including the quadratic term for neutrino-electron scattering in a configuration during the infall phase of stellar core collapse. They found that including the quadratic term results in a better fit to the scattering kernel, but when comparing stationary state transport solutions with and without the quadratic term, they found no significant difference in relevant quantities such as the neutrino number density, flux, and transfer rates of lepton number, energy, or momentum to the stellar fluid. We also note that Just et al. (2018), in their Appendix A, provide expressions for pair processes that include the quadratic term in the Legendre expansion of the kernels.

Scattering Employing the expansion in Eq. (182) for the scattering operator gives

$$\begin{aligned}&\frac{1}{4\pi }\int _{{\mathbb {S}}^{2}}{\mathscr {C}}_{{\textsc {Scat}}}[f_{s}](p)\,\frac{d\omega }{\varepsilon } \nonumber \\&\quad =\big (1-{\mathscr {D}}(\varepsilon )\big )\int _{0}^{\infty }\varPhi _{{\textsc {Scat}},0}^{{\textsc {In}}}(\varepsilon ,\varepsilon ^{\prime })\,{\mathscr {D}}(\varepsilon ^{\prime })\,dV_{\varepsilon ^{\prime }} -{\mathscr {I}}_{\mu }(\varepsilon )\int _{0}^{\infty }\varPhi _{{\textsc {Scat}},1}^{{\textsc {In}}}(\varepsilon ,\varepsilon ^{\prime })\,{\mathscr {I}}^{\mu }(\varepsilon ^{\prime })\,dV_{\varepsilon ^{\prime }} \nonumber \\&\qquad -{\mathscr {D}}(\varepsilon )\int _{0}^{\infty }\varPhi _{{\textsc {Scat}},0}^{{\textsc {Out}}}(\varepsilon ,\varepsilon ^{\prime })\,\big (1-{\mathscr {D}}(\varepsilon ^{\prime })\big )\,dV_{\varepsilon ^{\prime }} +{\mathscr {I}}_{\mu }(\varepsilon )\int _{0}^{\infty }\varPhi _{{\textsc {Scat}},1}^{{\textsc {Out}}}(\varepsilon ,\varepsilon ^{\prime })\,{\mathscr {I}}^{\mu }(\varepsilon ^{\prime })\,dV_{\varepsilon ^{\prime }} \end{aligned}$$
(184)

for the zeroth moment (recall that \(dV_{\varepsilon }=4\pi \varepsilon ^{2}d\varepsilon \)), and

$$\begin{aligned}&\frac{1}{4\pi }\int _{{\mathbb {S}}^{2}}{\mathscr {C}}_{{\textsc {Scat}}}[f_{s}]\,\ell _{j}\,\frac{d\omega }{\varepsilon } \nonumber \\&\quad =-{\mathscr {I}}_{j}(\varepsilon )\int _{0}^{\infty }\varPhi _{{\textsc {Scat}},0}^{{\textsc {In}}}(\varepsilon ,\varepsilon ^{\prime })\,{\mathscr {D}}(\varepsilon ^{\prime })\,dV_{\varepsilon ^{\prime }} \nonumber \\&\qquad +\left( \frac{1}{3}\,g_{j\mu }-\widehat{{\mathscr {K}}}_{j\mu }(\varepsilon )\right) \int _{0}^{\infty }\varPhi _{{\textsc {Scat}},1}^{{\textsc {In}}}(\varepsilon ,\varepsilon ^{\prime })\,{\mathscr {I}}^{\mu }(\varepsilon ^{\prime })\,dV_{\varepsilon ^{\prime }} \nonumber \\&\qquad -{\mathscr {I}}_{j}(\varepsilon )\int _{0}^{\infty }\varPhi _{{\textsc {Scat}},0}^{{\textsc {Out}}}(\varepsilon ,\varepsilon ^{\prime })\,\big (1-{\mathscr {D}}(\varepsilon ^{\prime })\big )\,dV_{\varepsilon ^{\prime }} +\widehat{{\mathscr {K}}}_{j\mu }\int _{0}^{\infty }\varPhi _{{\textsc {Scat}},1}^{{\textsc {Out}}}(\varepsilon ,\varepsilon ^{\prime })\,{\mathscr {I}}^{\mu }(\varepsilon ^{\prime })\,dV_{\varepsilon ^{\prime }} \end{aligned}$$
(185)

for the first moment. Here we have used

$$\begin{aligned} \frac{1}{4\pi }\int _{{\mathbb {S}}^{2}}\ell _{\mu }(\omega )\,\ell _{\nu }(\omega )\,d\omega =\frac{1}{3}\,g_{\mu \nu }. \end{aligned}$$
(186)

Pair processes Employing the kernel expansion in Eq. (182) for the neutrino-antineutrino pair creation and annihilation operator in Eq. (173) gives

$$\begin{aligned}&\frac{1}{4\pi }\int _{{\mathbb {S}}^{2}}{\mathscr {C}}_{{\textsc {Pair}}}[f_{s}](p)\,\frac{d\omega }{\varepsilon } \nonumber \\&\quad =\big (1-{\mathscr {D}}(\varepsilon )\big )\int _{0}^{\infty }\varPhi _{{\textsc {Pair}},0}^{{\textsc {In}}}(\varepsilon ,\varepsilon ^{\prime })\,\big (1-\bar{{\mathscr {D}}}(\varepsilon ^{\prime })\big )\,dV_{\varepsilon ^{\prime }} +{\mathscr {I}}_{\mu }(\varepsilon )\int _{0}^{\infty }\varPhi _{{\textsc {Pair}},1}^{{\textsc {In}}}(\varepsilon ,\varepsilon ^{\prime })\,\bar{{\mathscr {I}}}^{\mu }(\varepsilon ^{\prime })\,dV_{\varepsilon ^{\prime }} \nonumber \\&\qquad -{\mathscr {D}}(\varepsilon )\int _{0}^{\infty }\varPhi _{{\textsc {Pair}},0}^{{\textsc {Out}}}(\varepsilon ,\varepsilon ^{\prime })\,\bar{{\mathscr {D}}}(\varepsilon ^{\prime })\,dV_{\varepsilon ^{\prime }} -{\mathscr {I}}_{\mu }(\varepsilon )\int _{0}^{\infty }\varPhi _{{\textsc {Pair}},1}^{{\textsc {Out}}}(\varepsilon ,\varepsilon ^{\prime })\,\bar{{\mathscr {I}}}^{\mu }(\varepsilon ^{\prime })\,dV_{\varepsilon ^{\prime }} \end{aligned}$$
(187)

for the zeroth moment of the collision operator, and

$$\begin{aligned}&\frac{1}{4\pi }\int _{{\mathbb {S}}^{2}}{\mathscr {C}}_{{\textsc {Pair}}}[f_{s}](p)\,\ell _{j}\,\frac{d\omega }{\varepsilon } \nonumber \\&\quad =-{\mathscr {I}}_{j}(\varepsilon )\int _{0}^{\infty }\varPhi _{{\textsc {Pair}},0}^{{\textsc {In}}}(\varepsilon ,\varepsilon ^{\prime })\,\big (1-\bar{{\mathscr {D}}}(\varepsilon ^{\prime })\big )\,dV_{\varepsilon ^{\prime }}\nonumber \\&\qquad -\left( \frac{1}{3}\,g_{j\mu }-\widehat{{\mathscr {K}}}_{j\mu }(\varepsilon )\right) \int _{0}^{\infty }\varPhi _{{\textsc {Pair}},1}^{{\textsc {In}}}(\varepsilon ,\varepsilon ^{\prime })\,\bar{{\mathscr {I}}}^{\mu }(\varepsilon ^{\prime })\,dV_{\varepsilon ^{\prime }} \nonumber \\&\qquad -{\mathscr {I}}_{j}(\varepsilon )\int _{0}^{\infty }\varPhi _{{\textsc {Pair}},0}^{{\textsc {Out}}}(\varepsilon ,\varepsilon ^{\prime })\,\bar{{\mathscr {D}}}(\varepsilon ^{\prime })\,dV_{\varepsilon ^{\prime }} -\widehat{{\mathscr {K}}}_{j\mu }(\varepsilon )\int _{0}^{\infty }\varPhi _{{\textsc {Pair}},1}^{{\textsc {Out}}}(\varepsilon ,\varepsilon ^{\prime })\,\bar{{\mathscr {I}}}^{\mu }(\varepsilon ^{\prime })\,dV_{\varepsilon ^{\prime }} \end{aligned}$$
(188)

for the first moment. Here, \(\bar{{\mathscr {D}}}\) and \(\bar{{\mathscr {I}}}^{\mu }\) are the zeroth and first moments of the antineutrino distribution function \({\bar{f}}\).

5.3 Neutrino-matter coupling

In coupling neutrinos and matter, we are primarily concerned with lepton and four-momentum exchange. The neutrino lepton current density is

$$\begin{aligned} J_{\text{ neutrino }}^{\nu } = \sum _{s=\nu _{e},{\bar{\nu }}_{e}}{\mathsf {g}}_{s}\,N_{s}^{\nu }, \end{aligned}$$
(189)

where \(N_{s}^{\nu }\) is the neutrino four-current density for neutrino species s, defined as in Eq. (70) with distribution function \(f_{s}\), and \({\mathsf {g}}_{s}\) is the lepton number of neutrino species s (\({\mathsf {g}}_{s}=+1\) for neutrinos, and \({\mathsf {g}}_{s}=-1\) for antineutrinos). From the electron number conservation equation, Eq. (6), and the neutrino number conservation equation, Eq. (80) (one for each neutrino species), we obtain

$$\begin{aligned} \nabla _{\nu }\big (\,J_{\text{ neutrino }}^{\nu } + J_{e}^{\nu }/m_{\text{ B }}\,\big ) = \sum _{s=\nu _{e},{\bar{\nu }}_{e}}{\mathsf {g}}_{s}\,\int _{V_{p}}{\mathscr {C}}[f_{s}]\,\pi _{m} - L, \end{aligned}$$
(190)

Lepton number conservation demands that the source term of the right-hand side of Eq. (6) takes the form

$$\begin{aligned} L = \sum _{s=\nu _{e},{\bar{\nu }}_{e}}{\mathsf {g}}_{s}\int _{V_{p}}{\mathscr {C}}[f_{s}]\,\pi _{m}. \end{aligned}$$
(191)

Note that, for simplicity of this exposition, we have assumed that only electron neutrinos and antineutrinos are involved in lepton exchange with the fluid, but see Bollig et al. (2017) for a discussion of additional lepton exchange channels when muons are included as a fluid component. When muons are included, an additional equation for the muon number density, similar to Eq. (6), must be evolved, and the definition of the neutrino lepton current density in Eq. (189) must be extended to include contributions from muon neutrinos. [Technically, similar extensions should be done to accommodate tauons, but, because of their large rest mass, they can be neglected as an agent for lepton number exchange with the fluid (Bollig et al. 2017).]

The total neutrino stress-energy tensor is defined as

$$\begin{aligned} T_{\text{ neutrino }}^{\mu \nu } =\sum _{s=1}^{N_{{\textsc {Sp}}}}T_{s}^{\mu \nu }, \end{aligned}$$
(192)

where the stress-energy tensor for neutrino species s, \(T_{s}^{\mu \nu }\), is defined as in Eq. (71) with distribution function \(f_{s}\) and \(N_{{\textsc {Sp}}}\) is the total number of neutrino species. Using Eqs. (5) and (82), the divergence of the total (fluid plus neutrino) stress-energy is

$$\begin{aligned} \nabla _{\nu }\big (\,T_{\text{ neutrino }}^{\mu \nu }+T_{\text{ fluid }}^{\mu \nu }\,\big ) =\sum _{s=1}^{N_{{\textsc {Sp}}}}\int _{V_{p}}{\mathscr {C}}[f_{s}]\,p^{\mu }\,\pi _{m} - G^{\mu }. \end{aligned}$$
(193)

Then, four-momentum conservation in neutrino–matter interactions demands the right-hand side of Eq. (5) takes the form

$$\begin{aligned} G^{\mu } = \sum _{s=1}^{N_{{\textsc {Sp}}}}\int _{V_{p}}{\mathscr {C}}[f_{s}]\,p^{\mu }\,\pi _{m}. \end{aligned}$$
(194)

To illustrate the complexity of the neutrino–matter coupling problem further, we consider the neutrino–matter coupling problem in the space-homogeneous case using the number conservative two-moment model discussed in Sect. 4.7.3. The angular moments of the neutrino distribution function of species s evolve according to

$$\begin{aligned} d_{t}\big (\,\sqrt{\gamma }\,\big [\,W\,{\mathscr {D}}_{s}+v^{i}\,{\mathscr {I}}_{s,i}\,\big ]\,\big )&= \frac{\alpha \,\sqrt{\gamma }}{4\pi }\int _{{\mathbb {S}}^{2}}{\mathscr {C}}[f_{s}]\,\frac{d\omega }{\varepsilon }, \end{aligned}$$
(195)
$$\begin{aligned} d_{t}\big (\,\sqrt{\gamma }\,\big [\,W\,{\mathscr {I}}_{s,j}+v^{i}\,\widehat{{\mathscr {K}}}_{s,ij}\,\big ]\,\big )&= \frac{\alpha \,\sqrt{\gamma }}{4\pi }\int _{{\mathbb {S}}^{2}}{\mathscr {C}}[f_{s}]\,\ell _{j}\,\frac{d\omega }{\varepsilon }, \end{aligned}$$
(196)

where we use the ordinary derivative \(d_{t}=d/dt\) to indicate that we consider the space-homogeneous case where physical variables are considered functions of time only. The right-hand sides of Eqs. (195) and (196) will include the contributions from emission and absorption, scattering, pair processes (as discussed above), and other processes. This sub-problem is typically considered in numerical implementations where neutrino–matter interactions are solved for in a time-implicit fashion, e.g., as is done within an implicit-explicit framework for integrating the full neutrino-radiation hydrodynamics system forward in time, which we will discuss in more details later (see, e.g., Sect. 6.5). Coupled to the transport equations, are the fluid evolution equations, which are combined with the transport equations and formulated as constraints due to mass, four-momentum, and lepton number conservation:

$$\begin{aligned} d_{t}\big (\,\sqrt{\gamma }\,D\,\big )&=0, \end{aligned}$$
(197)
$$\begin{aligned} d_{t}\big (\,\sqrt{\gamma }\,\big [\,S_{j}+S_{j,\text{ neutrino }}\,\big ]\,\big )&=0, \end{aligned}$$
(198)
$$\begin{aligned} d_{t}\big (\,\sqrt{\gamma }\big [\,\tau _{\text{ fluid }}+E_{\text{ neutrino }}\,\big ]\,\big )&=0, \end{aligned}$$
(199)
$$\begin{aligned} d_{t}\big (\,\sqrt{\gamma }\,\big [\,N_{e}+N_{\text{ neutrino }}\,\big ]\,\big )&=0, \end{aligned}$$
(200)

where \(N_{e}=D\,Y_{e}/m_{\text{ B }}\), and

$$\begin{aligned} S_{j,\text{ neutrino }}&=\sum _{s=1}^{N_{{\textsc {Sp}}}}\int _{0}^{\infty }{\mathscr {F}}_{j,s}\,dV_{\varepsilon }, \end{aligned}$$
(201)
$$\begin{aligned} E_{\text{ neutrino }}&=\sum _{s=1}^{N_{{\textsc {Sp}}}}\int _{0}^{\infty }{\mathscr {E}}_{s}\,dV_{\varepsilon }, \end{aligned}$$
(202)
$$\begin{aligned} N_{\text{ neutrino }}&=\sum _{s=\nu _{e},{\bar{\nu }}_{e}}{\mathsf {g}}_{s}\int _{0}^{\infty }{\mathscr {N}}_{s}\,dV_{\varepsilon }, \end{aligned}$$
(203)

and where the Eulerian angular moments \({\mathscr {F}}_{s,j}\), \({\mathscr {E}}_{s}\), and \({\mathscr {N}}_{s}\) are defined in Sect. 4.7.2. The Eulerian neutrino number density \({\mathscr {N}}_{s}\) is expressed in terms of the Lagrangian moments in Eq. (97), which is also the expression inside the time-derivative on the left-hand side of Eq. (195). The Eulerian momentum and energy, can also be written as combinations of the quantities in the time-derivatives on the left-hand side of Eqs. (195) and (196):

$$\begin{aligned} {\mathscr {F}}_{j,s}&=\varepsilon \, \big \{\, W\,v_{j}\,\big [\,W\,{\mathscr {D}}_{s}+v^{i}\,{\mathscr {I}}_{s,i}\,\big ] +\big [\,W\,{\mathscr {I}}_{s,j}+v^{i}\,\widehat{{\mathscr {K}}}_{s,ij}\,\big ] \,\big \}, \end{aligned}$$
(204)
$$\begin{aligned} {\mathscr {E}}_{s}&=\varepsilon \, \big \{\, W\,\big [\,W\,{\mathscr {D}}_{s}+v^{i}\,{\mathscr {I}}_{s,i}\,\big ] +v^{j}\,\big [\,W\,{\mathscr {I}}_{s,j}+v^{i}\,\widehat{{\mathscr {K}}}_{s,ij}\,\big ] \,\big \}. \end{aligned}$$
(205)

Thus, adopting a closure for the radiation moments, writing \(\widehat{{\mathscr {K}}}_{s,ij}\) in terms of \({\mathscr {D}}_{s}\) and \({\mathscr {I}}_{s,j}\) as discussed in Sect. 4.7.4, and an equation of state for the fluid \(p=p(\rho ,e,Y_{e})\), the system given by Eqs. (195)–(196) and (197)–(200) can be solved for the radiation moments \({\mathscr {D}}_{s}\) and \({\mathscr {I}}_{s,j}\), and the fluid states \(\rho \), \(v^{i}\), e, and \(Y_{e}\). This is a nonlinear system of equations, where nonlinearities are due to the radiation moment closure, the fluid equation of state, the dependence of D, \(S_{j}\), and \(\tau _{\text{ fluid }}\) on \(\rho \), \(v^{i}\), e, and \(Y_{e}\), and the nonlinear dependence of the neutrino opacities discussed in Sect. 5.2.2 on the thermodynamic state \(\rho \), e, and \(Y_{e}\). Modeling this four-momentum and lepton exchange between neutrinos and the fluid—with all the relevant neutrino–matter interactions included—constitutes the major computational cost of core-collapse supernova models.

6 Phase-space discretizations and implementations

6.1 Boltzmann kinetics: spatial and energy finite differencing plus discrete ordinates

6.1.1 Phase-space coordinates

In a spherical spatial coordinate system, the neutrino’s direction of propagation is specified relative to the basis vectors \(\{{\mathbf {e}}_{r,\theta ,\phi }\}\) as (see Fig. 13)

$$\begin{aligned} {\mathbf {n}}=(n^{r},n^{\theta },n^{\phi }), \end{aligned}$$
(206)

where

$$\begin{aligned} n^{r}= & {} \cos \theta _{p}, \end{aligned}$$
(207)
$$\begin{aligned} n^{\theta }= & {} \sin \theta _{p}\cos \phi _{p}, \end{aligned}$$
(208)
$$\begin{aligned} n^{\phi }= & {} \sin \theta _{p}\sin \phi _{p}. \end{aligned}$$
(209)

This can be rexpressed as

$$\begin{aligned} n^{r}= & {} \mu , \end{aligned}$$
(210)
$$\begin{aligned} n^{\theta }= & {} (1 - \mu ^2)^{\frac{1}{2}}\cos \phi _{p}, \end{aligned}$$
(211)
$$\begin{aligned} n^{\phi }= & {} (1 - \mu ^2)^{\frac{1}{2}}\sin \phi _{p}, \end{aligned}$$
(212)

where \(\mu \equiv \cos \theta _{p}\). When spherical spatial and momentum-space coordinates are used, as defined above, the neutrino distribution function has the following dependencies for no imposed symmetry, axisymmetry, and spherical symmetry,

$$\begin{aligned} f= & {} f(r,\theta ,\phi ,{\mathbf {n}},E,t)=f(r,\theta ,\phi ,\mu ,\phi _{p},E,t), \end{aligned}$$
(213)
$$\begin{aligned} f= & {} f(r,\theta ,{\mathbf {n}},E,t)=f(r,\theta ,\mu ,\phi _{p},E,t), \end{aligned}$$
(214)
$$\begin{aligned} f= & {} f(r,{\mathbf {n}},E,t)=f(r,\mu ,E,t), \end{aligned}$$
(215)

respectively, where in all three cases E is the neutrino energy.

Fig. 13
figure 13

Diagram illustrating the spherical momentum-space coordinates used in most neutrino radiation hydrodynamics implementations. The angle \(\theta _p\) is the angle between the outgoing radial direction and the neutrino propagation direction, at the neutrino’s location. The neutrino direction cosine, \(\mu \equiv \cos \theta _p\), is defined in terms of it. \(\phi _p\) is the associated momentum-space azimuthal angle. In spherical symmetry, the distribution function is only a function of \(\mu \), not \(\phi _p\)

6.1.2 Spherical symmetry

We illustrate the approach used by Mezzacappa and Bruenn (1993a), Mezzacappa and Messer (1999), Liebendörfer et al. (2004) and Mezzacappa et al. (2004, 2005) in the context of a model that assumes Newtonian gravity and is valid to \({\mathscr {O}}(v/c)\). The fully general relativistic case is detailed in Liebendörfer et al. (2004). In the Newtonian gravity, \({\mathscr {O}}(v/c)\) case, the conservative neutrino Boltzmann equation reads

$$\begin{aligned}&\frac{1}{c}\frac{\partial F}{\partial t} + 4\pi \mu \frac{\partial (r^{2}\rho F)}{\partial m} + \frac{1}{r}\frac{\partial [(1-\mu ^{2})F]}{\partial \mu } \nonumber \\&\qquad + \frac{1}{c}\left( \frac{\partial {\mathrm{ln}}\rho }{\partial t}+\frac{3v}{r}\right) \frac{\partial [\mu (1-\mu ^{2})F]}{\partial \mu } + \frac{1}{c}\left[ \mu ^{2}\left( \frac{\partial {\mathrm{ln}}\rho }{\partial t}+\frac{3v}{r}\right) -\frac{v}{r}\right] \frac{1}{E^{2}}\frac{\partial (E^{3}F)}{\partial E} \nonumber \\&\quad = \frac{j}{\rho }-{\tilde{\chi }}F + \frac{1}{c}\frac{1}{h^{3}c^{3}}E^{2}\int d\mu ^{\prime }R_{{\mathrm{IS}}}F - \frac{1}{c}\frac{1}{h^{3}c^{3}}E^{2}F\int d\mu ^{\prime }R_{{\mathrm{IS}}} \nonumber \\&\qquad + \frac{1}{h^{3}c^{4}} \left( \frac{1}{\rho }-F\right) \int dE^{\prime }E^{{\prime }{2}}d\mu ^{\prime } {\tilde{R}}_{{\mathrm{NIS}}}^{{\mathrm{in}}} F - \frac{1}{h^{3}c^{4}} F \int dE^{\prime }E^{{\prime }{2}}d\mu ^{\prime } {\tilde{R}}_{{\mathrm{NIS}}}^{\mathrm{out}} \left( \frac{1}{\rho }-F\right) \nonumber \\&\qquad + \frac{1}{h^{3}c^{4}} \left( \frac{1}{\rho }-F\right) \int dE^{\prime }E^{{\prime }{2}}d\mu ^{\prime } {\tilde{R}}_{{\mathrm{PAIR}}}^{{\mathrm{em}}} \left( \frac{1}{\rho }-{\bar{F}}\right) - \frac{1}{h^{3}c^{4}} F \int dE^{\prime }E^{{\prime }{2}}d\mu ^{\prime } {\tilde{R}}_{{\mathrm{PAIR}}}^{\mathrm{abs}} {\bar{F}} ,\nonumber \\ \end{aligned}$$
(216)

where \(F\equiv f/\rho \), m is the Lagrangian mass coordinate, \(\mu \) is the neutrino direction cosine, as defined above, and E is the neutrino energy. In spherical symmetry, \(F=F(t,m,\mu ,E)\). After the time derivative term on the left-hand side of the Boltzmann equation, the remaining terms correspond to the transport of neutrinos in all three dimensions of phase space: \((m,\mu ,E)\). The first term corresponds to spatial transport of neutrinos through the stellar core layers. As a neutrino propagates through the core, its direction cosine, defined in spherical coordinates with respect to the outward radial vector at its position, changes. This is captured by the second term. The third and fourth terms capture the transport of neutrinos in angle and energy due to relativistic (in this case to \({\mathscr {O}}(v/c)\)) angular aberration and frequency shift, respectively. On the right-hand side, the collision term includes (1) thermal emission, with emissivity, j, (2) absorption, with absorption opacity \({\tilde{\chi }}\equiv j+\chi \), which accounts for stimulated absorption, (3) iso-energetic scattering, with scattering kernel \(R_{\mathrm{IS}}\), (4) non-isoenergetic scattering, with scattering kernel, \(R_{\mathrm{NIS}}\), and (5) neutrino pair creation and annihilation, with pair-production kernel, \(R_{{\mathrm{PAIR}}}\). The distribution function for antineutrinos are designated by \({\bar{F}}\). While the left-hand side of the Boltzmann equation is linear in the distribution functions, it is important to note that the right-hand side is not. The nonlinearity on the right-hand side is evident due to the blocking factors corresponding to the boundedness of the neutrino distribution functions: f lies in the range [0, 1]. There is an additional nonlinearity that is implicit in the equation. The distribution functions are updated together with the matter internal energy and electron fraction, due to energy and lepton number exchange between the neutrinos and the matter as a result of the above processes. In turn, the neutrino emissivity, opacity, and scattering kernels depend on the thermodynamic state of the matter, which depends on the matter’s density, internal energy, and electron fraction. Thus, a simultaneous linearization of the discretized equations of neutrino radiation hydrodynamics in the neutrino distribution functions, the matter internal energy, and the matter electron fraction is required.

The finite differencing of the time derivative of the neutrino distribution function in Eq. (216) is straightforward:

$$\begin{aligned} \frac{\partial F}{\partial t}=\frac{F_{i^{\prime },j^{\prime },k^{\prime }}-{F}_{i^{\prime },j^{\prime },k^{\prime }}^{n}}{dt}. \end{aligned}$$
(217)

For simplicity, we define the zone-center indices for each of the phase space dimensions with primed indices: \(i^{\prime }\equiv i+1/2\), \(j^{\prime }\equiv j+1/2\), and \(k^{\prime }\equiv k+1/2\). Focusing now on the spatial advection term, the first of the \({\mathscr {O}}(1)\) terms: In the free streaming limit, the advected neutrino number in a time step (as measured by a comoving observer) can be large relative to the neutrino number in a zone (mass shell). Upwind differencing of the advection term is appropriate to limit destabilizing errors in the fluxes. For discrete direction cosines, \( \mu _{j^{\prime }} \), the direction of the neutrino “wind” is given by the sign of \( \mu _{j^{\prime }} \). On the other hand, in diffusive conditions, the neutrino flux may be orders of magnitude smaller than the nearly isotropic neutrino density in a zone. In this situation, an asymmetric differencing can lead to an overestimation of the first angular moment because of improper cancellations among the contributions of the nearly isotropic neutrino radiation field. As a result, Mezzacappa et al. interpolate between upwind differencing in free streaming regimes and centered differencing in diffusive regimes. Specifically, using the coefficients, \( \beta _{i,k^{\prime }} \), defined as

$$\begin{aligned} \beta _{i,k^{\prime }}=\left\{ \begin{array}{ll} 1/2 &{}\quad {\text {if}}\; 2dr_{i}>\lambda _{i,k^{\prime }},\\ \left( 2dr_{i}/\lambda _{i,k^{\prime }}+1\right) ^{-1} &{}\quad {\mathrm{otherwise}}, \end{array}\right. \end{aligned}$$
(218)

where \(\lambda _{i,k}\) is the angle-averaged neutrino mean free path, the spatial advection term is discretized as

$$\begin{aligned} \mu \frac{\partial r^{2}\rho F}{\partial m}=\frac{\mu _{j^{\prime }}}{dm_{i^{\prime }}}\left[ 4\pi r^{2}_{i+1}\rho _{i+1}F_{i+1,j^{\prime },k^{\prime }} -4\pi r^{2}_{i}\rho _{i}F_{i,j^{\prime },k^{\prime }}\right] \end{aligned}$$
(219)

with

$$\begin{aligned} \rho _{i}F_{i,j^{\prime },k^{\prime }}=\beta _{i,k^{\prime }}\rho _{i^{\prime }-1}F_{i^{\prime }-1,j^{\prime },k^{\prime }}+\left( 1-\beta _{i,k^{\prime }}\right) \rho _{i^{\prime }}F_{i^{\prime },j^{\prime },k^{\prime }} \end{aligned}$$
(220)

for outward propagating neutrinos \( \left( \mu _{j^{\prime }}>0\right) \) and

$$\begin{aligned} \rho _{i}F_{i,j^{\prime },k^{\prime }}=\left( 1-\beta _{i,k^{\prime }}\right) \rho _{i^{\prime }-1}F_{i^{\prime }-1,j^{\prime },k^{\prime }}+\beta _{i,k^{\prime }}\rho _{i^{\prime }}F_{i^{\prime },j^{\prime },k^{\prime }} \end{aligned}$$
(221)

for inward propagating neutrinos \( \left( \mu _{j^{\prime }}<0\right) \).

Next, focusing on the angular advection term, Mezzacappa et al. use the following discretization:

$$\begin{aligned} \frac{\partial [(1-\mu ^{2})F]}{r\partial \mu } =\frac{3\left[ r^{2}_{i+1}-r^{2}_{i}\right] }{2\left[ r^{3}_{i+1}-r^{3}_{i}\right] } \frac{1}{w_{j^{\prime }}}\left( \zeta _{j+1}F_{i^{\prime },j+1,k^{\prime }}-\zeta _{j}F_{i^{\prime },j,k^{\prime }}\right) . \end{aligned}$$
(222)

The differencing of the coefficients, \( \zeta =1-\mu ^{2} \), is defined by

$$\begin{aligned} \zeta _{j+1}-\zeta _{j}=-2\mu _{j^{\prime }}w_{j^{\prime }}, \end{aligned}$$
(223)

where the \(w_{j^{\prime }}\) are the weights corresponding to the Gaussian quadrature values used for \(\mu _{j^{\prime }}\). The discretization of the coefficient, 1/r, of the angular advection term is set such that in an infinite homogenous medium in thermal equilibrium, \(\rho F= f_{\mathrm{eq}} =\) constant is a solution (Mezzacappa and Bruenn 1993a). The angular integration of the term \(\partial [(1-\mu ^{2})F]/r\partial \mu \) produces the zeroth and second angular moments of the neutrino distribution function. Its finite difference representation is therefore not as sensitive to cancellations in the diffusive limit as the differencing of the spatial advection term. Upwind differencing is justified. The angular “wind” always points towards \( \mu =1 \). However, for reasons of completeness and consistency, Mezzacappa et al. use centered differencing in the diffusive regime here as well, with angular coefficients, \( \gamma _{i^{\prime },k^{\prime }}\equiv \beta _{i^{\prime },k^{\prime }} \), and

$$\begin{aligned} F_{i^{\prime },j,k^{\prime }}=\gamma _{i^{\prime },k^{\prime }}F_{i^{\prime },j^{\prime }-1,k^{\prime }}+\left( 1-\gamma _{i^{\prime },k^{\prime }}\right) F_{i^{\prime },j^{\prime },k^{\prime }}. \end{aligned}$$
(224)

Finally, Mezzacappa et al. discretize the last of the \({\mathscr {O}}(1)\) terms in the Boltzmann equation, the collision term, as

$$\begin{aligned}&\frac{j^{n+1}_{i^{\prime },k^{\prime }}}{\rho ^{n+1}_{i^{\prime }}}-\tilde{\chi }^{n+1}_{i^{\prime },k^{\prime }} \, F_{i^{\prime },j^{\prime },k^{\prime }} \nonumber \\&\qquad + \frac{1}{ch^{3}c^{3}}\, E_{k^{\prime }}^{2}\sum _{l=1}^{jmax}w_{l^{\prime }}\, (R_{\mathrm{IS}})^{n+1}_{i^{\prime },j^{\prime },l^{\prime },k^{\prime }} \, F_{i^{\prime },l^{\prime },k^{\prime }} - \frac{1}{ch^{3}c^{3}}\, E_{k^{\prime }}^{2}\, F_{i^{\prime },j^{\prime },k^{\prime }} \sum _{l=1}^{jmax}w_{l^{\prime }}\, (R_{\mathrm{IS}})^{n+1}_{i^{\prime },j^{\prime },l^{\prime },k^{\prime }} \nonumber \\&\qquad + \frac{1}{ch^{3}c^{3}}\,\left( 1/\rho ^{n+1}_{i^{\prime }}-F_{i^{\prime },j^{\prime },k^{\prime }}\right) \sum _{m=1}^{kmax} \varDelta E_{m^{\prime }} E_{m^{\prime }}^{2} \sum _{l=1}^{jmax}w_{l^{\prime }}\, \times ({\tilde{R}}^{{\mathrm{in}}}_{\mathrm{NIS}})^{n+1}_{i^{\prime },j^{\prime },l^{\prime },k^{\prime },m^{\prime }}\, F_{i^{\prime },l^{\prime },m^{\prime }} \nonumber \\&\qquad - \frac{1}{ch^{3}c^{3}}\,F_{i^{\prime },j^{\prime },k^{\prime }} \sum _{m=1}^{kmax} \varDelta E_{m^{\prime }} E_{m^{\prime }}^{2} \sum _{l=1}^{jmax}w_{l^{\prime }}\, \times ({\tilde{R}}^{\mathrm{out}}_{\mathrm{NIS}})^{n+1}_{i^{\prime },j^{\prime },l^{\prime },k^{\prime },m^{\prime }}\, \left( 1/\rho ^{n+1}_{i^{\prime }}-F_{i^{\prime },l^{\prime },m^{\prime }}\right) \nonumber \\&\qquad + \frac{1}{ch^{3}c^{3}}\,\left( 1/\rho ^{n+1}_{i^{\prime }}-F_{i^{\prime },j^{\prime },k^{\prime }}\right) \sum _{m=1}^{kmax} \varDelta E_{m^{\prime }} E_{m^{\prime }}^{2} \sum _{l=1}^{jmax}w_{l^{\prime }}\, \nonumber \\&\qquad \times ({\tilde{R}}^{\mathrm{em}}_{{\mathrm{PAIR}}})^{n+1}_{i^{\prime },j^{\prime },l^{\prime },k^{\prime },m^{\prime }}\, \left( 1/\rho ^{n+1}_{i^{\prime }}-{\bar{F}}_{i^{\prime },l^{\prime },m^{\prime }}\right) \nonumber \\&\qquad - \frac{1}{ch^{3}c^{3}}\,F_{i^{\prime },j^{\prime },k^{\prime }} \sum _{m=1}^{kmax} \varDelta E_{m^{\prime }} E_{m^{\prime }}^{2} \sum _{l=1}^{jmax}w_{l^{\prime }}\, \times ({\tilde{R}}^{\mathrm{abs}}_{\mathrm{PAIR}})^{n+1}_{i^{\prime },j^{\prime },l^{\prime },k^{\prime },m^{\prime }}\, {\bar{F}}_{i^{\prime },l^{\prime },m^{\prime }} \end{aligned}$$
(225)

It is important to note that the collision term is differenced implicitly with respect to time. All of the neutrino and antineutrino distribution functions in Eq. (225) are evaluated at the new time step. Given the implementation of discrete ordinates in angle, the angular integrals in the collision term are evaluated with Gaussian quadrature, using the same quadrature set used for the angular discretizations of the distribution function and terms on the left-hand side of the Boltzmann equation.

6.1.3 Challenges: relativistic effects and the simultaneous conservation of lepton number and energy

Define

$$\begin{aligned} J^{N}= & {} \int ^{1}_{-1}\int ^{\infty }_{0}FE^{2}dEd\mu , \end{aligned}$$
(226)
$$\begin{aligned} H^{N}= & {} \int ^{1}_{-1}\int ^{\infty }_{0}FE^{2}dE\mu d\mu . \end{aligned}$$
(227)

\( J^{N} \) and \( H^{N} \) are the zeroth and first angular number moments of the distribution function. Integration of Eq. (216) over \(\mu \) and E with \(E^{2}\) as the measure of integration gives the following evolution equation for \(J^{N}\):

$$\begin{aligned} \frac{\partial J^{N}}{\partial t}+\frac{\partial }{\partial m}\left[ 4\pi r^{2}\rho H^{N}\right] -\int \frac{j}{\rho }E^{2}dEd\mu + \int \chi FE^{2}dEd\mu =0. \end{aligned}$$
(228)

One more integration over rest mass \( m \) from the center of the star to its surface gives the evolution equation for the total neutrino (lepton) number. It is clear from Eq. (228) that the total neutrino (lepton) number in the computational domain will change only as a result of an inflow or an outflow of neutrinos at the boundary of the domain and/or as a result of the exchange of lepton number between the neutrinos and the matter. Now, in the same way, define the energy moments:

$$\begin{aligned} J^{E}= & {} \int FE^{3}dEd\mu , \end{aligned}$$
(229)
$$\begin{aligned} H^{E}= & {} \int FE^{3}dE\mu d\mu , \end{aligned}$$
(230)
$$\begin{aligned} K^{E}= & {} \int FE^{3}dE\mu ^{2}d\mu . \end{aligned}$$
(231)

By taking the zeroth and first angular moments of the energy moment (\(\int E^{3}dE\{ \partial F/\partial t = O[F]\}\)) of the Boltzmann equation, the latter weighted by the fluid velocity, v,—i.e., \(\int E^{3}dEd\mu \{ \partial F/\partial t = O[F]\}\) and \(v\int E^{3}dEd\mu \mu \{ \partial F/\partial t = O[F]\}\)—one obtains two equations:

$$\begin{aligned}&\frac{\partial J^{E}}{\partial t} + \frac{\partial }{\partial m}\left[ 4\pi r^{2}\rho H^{E}\right] -\left( \frac{\partial {\mathrm{ln}} \rho }{\partial t} +\frac{2v}{r}\right) K^{E}+\frac{v}{r}\left( J^{E}-K^{E}\right) \nonumber \\&\qquad - \int \frac{j}{\rho }E^{3}dEd\mu +\int \chi FE^{3}dEd\mu =0, \end{aligned}$$
(232)

and

$$\begin{aligned}&v\frac{\partial H^{E}}{\partial t} + \frac{\partial }{\partial m}\left[ 4\pi r^{2}v\rho K^{E}\right] -4\pi r^{2}\rho \frac{dv}{dm}K^{E}-\frac{v}{r}\left( J^{E}-K^{E}\right) \nonumber \\&\qquad - v\left( \frac{\partial {\mathrm{ln}} \rho }{\partial t}+\frac{2v}{r}\right) H^{E} + v\int \chi FE^{3}dE\mu d\mu =0. \end{aligned}$$
(233)

Equation (232) is the evolution equation for the comoving-frame neutrino energy per gram. Equation (233) is the evolution equation for the comoving-frame neutrino momentum per gram. Combining the two, to \({\mathscr {O}}(v/c)\), one obtains the laboratory-frame neutrino energy conservation equation:

$$\begin{aligned} 0= & {} \frac{\partial }{\partial t}\left( J^{E}+vH^{E}\right) +\frac{\partial }{\partial m}\left[ 4\pi r^{2}\rho \left( vK^{E}+H^{E}\right) \right] \nonumber \\&- \int \frac{j}{\rho }E^{3}dEd\mu +\int \chi FE^{3}dEd\mu +v\int \chi FE^{3}dE\mu d\mu . \end{aligned}$$
(234)

Note that \(J^{E}+vH^{E}\) is the laboratory-frame neutrino energy per gram as expressed in terms of the comoving-frame moments \(J^{E}\) and \(H^{E}\). Similarly, \(vK^{E}+H^{E}\) is the laboratory-frame flux per gram expressed in terms of comoving-frame moments. Integration of Eq. (234) over enclosed mass leads to an equation for total neutrino energy conservation. It is clear that, with the exception again of fluxes at the boundary of the computational domain and energy exchange with the matter due to collisions (the terms involving j and \(\chi \)) and neutrino stress (the term involving \(v\chi \)), the total neutrino energy as defined in the laboratory frame (where one can speak of conservation of energy) is conserved.

In arriving at Eq. (234), the expressions \((\partial {\mathrm{ln}}\rho /\partial {\mathrm{t}} +2{\mathrm{v/r}}){\mathrm{K}}^{{\mathrm{E}}}\) and \(K^{E}4\pi r^{2}\rho \partial v/\partial m\) in Eqs. (232) and (233) cancel given the continuity equation

$$\begin{aligned} \frac{\partial {\mathrm{ln}} \rho }{\partial t}+\frac{2v}{r}=-4\pi r^{2}\rho \frac{\partial v}{\partial m}. \end{aligned}$$
(235)

To achieve global energy conservation in the discrete limit, one must ensure, these cancellations occur in the finite differencing as well. Identifying the origin of the terms \((\partial {\mathrm{ln}}\rho /\partial {\mathrm{t}} +2{\mathrm{v/r}}){\mathrm{K}}^{{\mathrm{E}}}\) and \( K^{E}4\pi r^{2}\rho \partial v/\partial m \), we find that \((\partial {\mathrm{ln}}\rho /\partial {\mathrm{t}} +2{\mathrm{v/r}}){\mathrm{K}}^{{\mathrm{E}}}\) originates from the zeroth moment of the energy advection term,

$$\begin{aligned} \left[ \mu ^{2}\left( \frac{\partial {\mathrm{ln}} \rho }{\partial t}+\frac{2v}{r}\right) -\left( 1-\mu ^{2}\right) \frac{v}{r}\right] \frac{1}{E^{2}}\frac{\partial }{\partial E}\left( E^{3}F\right) , \end{aligned}$$
(236)

in the Boltzmann equation (216), and \( K^{E}4\pi r^{2}\rho \partial v/\partial m \) originates from the first moment of the spatial advection term,

$$\begin{aligned} \mu \frac{\partial \left( 4\pi r^{2}\rho F\right) }{\partial m}, \end{aligned}$$
(237)

in the same equation. The terms \( \left( J^{E}-K^{E}\right) v/r \) also stem from the zeroth moment of the energy advection term, Eq. (236), and the first moment of the angular advection term

$$\begin{aligned} \frac{1}{r}\frac{\partial \left[ \left( 1-\mu ^{2}\right) F\right] }{\partial \mu } \end{aligned}$$
(238)

in the Boltzmann equation (216). The requirement of global energy conservation in the laboratory frame therefore imposes interdependencies on the finite differencing of the \({\mathscr {O}}(1)\) spatial and angular advection terms, Eqs. (237) and (238), and the \({\mathscr {O}}(v/c)\) energy advection term, Eq. (236) (Liebendörfer et al. 2004). In particular, given a choice of finite differencing of the \({\mathscr {O}}(1)\) terms on the left-hand side of the Boltzmann equation (216), conservation of energy requires “matched” finite differencing for the coefficients

$$\begin{aligned} A \equiv \frac{\partial {\mathrm{ln}} \rho }{\partial t}+\frac{2v}{r} \end{aligned}$$
(239)

and

$$\begin{aligned} B \equiv (1-\mu ^{2})\frac{v}{r} \end{aligned}$$
(240)

of the \({\mathscr {O}}(v/c)\) advection terms in the same equation.

Mezzacappa et al. begin by multiplying the discrete representation of the \({\mathscr {O}}(1)\) terms on the left-hand side of the Boltzmann equation (216) by \(1+\mu {\bar{v}}_{i+1}\) (in what follows, unless otherwise specified the indices are \(i^{\prime }\), \(j^{\prime }\), and \(k^{\prime }\)):

$$\begin{aligned}&(1+\mu {\bar{v}}_{i+1})E\frac{F-{\bar{F}}}{cdt} +(1+\mu {\bar{v}}_{i+1})E\frac{4\pi \mu }{dm}[{\bar{r}}^{2}_{i+1}{\bar{\rho }}_{i+1}F_{i+1}-{\bar{r}}^{2}_{i}{\bar{\rho }}_{i}F_{i}] \nonumber \\&\qquad +(1+\mu {\bar{v}}_{i+1})E\frac{3({\bar{r}}^{2}_{i+1}-{\bar{r}}^{2}_{i})}{2({\bar{r}}^{3}_{i+1}-{\bar{r}}^{2}_{3})}\frac{1}{w}[\zeta _{j+1}F_{j+1}-\zeta _{j}F_{j}] \nonumber \\&\quad = \frac{(1+\mu v_{i+1})EF-(1+\mu {\bar{v}}_{i+1})E{\bar{F}}}{cdt} -\frac{\mu v_{i+1}EF-\mu {\bar{v}}_{i+1}EF}{cdt} \nonumber \\&\qquad +\frac{4\pi \mu }{dm}[(1+\mu {\bar{v}}_{i+1})E{\bar{r}}^{2}_{i+1}{\bar{\rho }}_{i+1}F_{i+1}-(1+\mu {\bar{v}}_{i})E{\bar{r}}^{2}_{i}{\bar{\rho }}_{i}F_{i}] \nonumber \\&\qquad - \frac{4\pi \mu ^{2}}{dm}[{\bar{v}}_{i+1}{\bar{r}}^{2}_{i}{\bar{\rho }}_{i}EF_{i}-{\bar{v}}_{i}{\bar{r}}^{2}_{i}{\bar{\rho }}_{i}EF_{i}] +\frac{3({\bar{r}}^{2}_{i+1}-{\bar{r}}^{2}_{i})}{2({\bar{r}}^{3}_{i+1}-{\bar{r}}^{2}_{3})}\frac{1}{w}[\zeta _{j+1}EF_{j+1}-\zeta _{j}EF_{j}] \nonumber \\&\qquad + \frac{3({\bar{r}}^{2}_{i+1}-{\bar{r}}^{2}_{i})}{2({\bar{r}}^{3}_{i+1}-{\bar{r}}^{2}_{3})}{\bar{v}}_{i+1}\frac{1}{w}[\mu \zeta _{j+1}EF_{j+1}-\mu \zeta _{j}EF_{j}] \nonumber \\&\quad =\frac{(1+\mu v_{i+1})EF-(1+\mu {\bar{v}}_{i+1})E{\bar{F}}}{cdt} - EF\frac{\mu v_{i+1}-\mu {\bar{v}}_{i+1}}{cdt} \nonumber \\&\qquad +\frac{4\pi \mu }{dm}[(1+\mu {\bar{v}}_{i+1})E{\bar{r}}^{2}_{i+1}{\bar{\rho }}_{i+1}F_{i+1}-(1+\mu {\bar{v}}_{i})E{\bar{r}}^{2}_{i}{\bar{\rho }}_{i}F_{i}] -\frac{4\pi \mu ^{2}}{dm}{\bar{r}}^{2}_{i}{\bar{\rho }}_{i}EF_{i}[{\bar{v}}_{i+1}-{\bar{v}}_{i}] \nonumber \\&\qquad + \frac{3({\bar{r}}^{2}_{i+1}-{\bar{r}}^{2}_{i})}{2({\bar{r}}^{3}_{i+1}-{\bar{r}}^{2}_{3})}\frac{1}{w}[\zeta _{j+1}EF_{j+1}-\zeta _{j}EF_{j}]\nonumber \\&\qquad + +\frac{3({\bar{r}}^{2}_{i+1}-{\bar{r}}^{2}_{i})}{2({\bar{r}}^{3}_{i+1}-{\bar{r}}^{2}_{3})} {\bar{v}}_{i+1}\frac{1}{w}[\mu \zeta _{j+1}EF_{j+1}-\mu \zeta _{j}EF_{j}]. \end{aligned}$$
(241)

A bar over a variable indicates its value is to be taken at time step \(t^n\). As noted, the total energy equation is obtained when summing Eqs. (232) and (233) and then integrating over m (the integration in \(\mu \) and E has already taken place). In this sequence of integrations (over \(\mu \), E, and then m), the term involving A in Eq. (232) cancels with the term \(-4\pi r^{2}\rho K^{E}dv/dm\) in Eq. (233).

Identifying the appropriate velocity gradient term in Eq. (241) and focusing on the appropriate integration (in this case, over m), Mezzacappa et al. require that [below, the term involving A comes from the zeroth moment of first term in the observer correction (236) after an integration by parts in energy, E; the term involving the velocity gradient is the next to last term in Eq. (241), corresponding to the first moment of the spatial propagation term in the Boltzmann equation (216)]:

$$\begin{aligned}&\sum _{i=1,imax-1}\mu ^2 A_{i^{\prime }}F_{i^{\prime }}dm_{i^{\prime }} \nonumber \\&\qquad -\sum _{i=1,imax-1}4\pi \mu ^{2}{\bar{r}}_{i}^{2}{\bar{\rho }}_{i}F_{i}({\bar{v}}_{i+1}-{\bar{v}}_{i}) \nonumber \\&\quad =\sum _{i=1,imax-1}\mu ^2 A_{i^{\prime }}F_{i^{\prime }}dm_{i^{\prime }} \nonumber \\&\qquad -\sum _{i=1,imax-1,j\le jmax/2}4\pi \mu ^{2}{\bar{r}}_{i}^{2}(\beta _{i}{\bar{\rho }}_{i^{\prime }}F_{i^{\prime }}+(1-\beta _{i}){\bar{\rho }}_{i^{\prime }-1}F_{i^{\prime }-1})({\bar{v}}_{i+1}-{\bar{v}}_{i}) \nonumber \\&\qquad -\sum _{i=1,imax-1,j\ge jmax/2+1}4\pi \mu ^{2}{\bar{r}}_{i}^{2}(\beta _{i}{\bar{\rho }}_{i^{\prime }-1}F_{i^{\prime }-1}+(1-\beta _{i}){\bar{\rho }}_{i^{\prime }}F_{i^{\prime }})({\bar{v}}_{i+1}-{\bar{v}}_{i}) \nonumber \\&\quad = \sum _{i=1,imax-1}\mu ^2 A_{i^{\prime }}F_{i^{\prime }}dm_{i^{\prime }} \nonumber \\&\qquad - \sum _{i=1,imax-1,j\le jmax/2}4\pi \mu ^{2}{\bar{r}}_{i}^{2}({\bar{v}}_{i+1}-{\bar{v}}_{i})\beta _{i}{\bar{\rho }}_{i^{\prime }}F_{i^{\prime }} \nonumber \\&\qquad - \sum _{i=1,imax-2,j\le jmax/2}4\pi \mu ^{2}{\bar{r}}_{i+1}^{2}({\bar{v}}_{i+2}-{\bar{v}}_{i+1})(1-\beta _{i+1}){\bar{\rho }}_{i^{\prime }}F_{i^{\prime }} \nonumber \\&\qquad - \sum _{i=1,imax-1,j\ge jmax/2+1}4\pi \mu ^{2}{\bar{r}}_{i}^{2}({\bar{v}}_{i+1}-{\bar{v}}_{i})(1-\beta _{i}){\bar{\rho }}_{i^{\prime }}F_{i^{\prime }} \nonumber \\&\qquad - \sum _{i=1,imax-2,j\ge jmax/2+1}4\pi \mu ^{2}{\bar{r}}_{i+1}^{2}({\bar{v}}_{i+2}-{\bar{v}}_{i+1})\beta _{i+1}){\bar{\rho }}_{i^{\prime }}F_{i^{\prime }}\nonumber \\&\quad = 0, \end{aligned}$$
(242)

which gives

$$\begin{aligned} A_{i^{\prime },k^{\prime }}=4\pi \frac{{\bar{\rho }}_{i^{\prime }}}{dm_{i^{\prime }}}\left( {\bar{r}}_{i}^{2}({\bar{v}}_{i+1}-{\bar{v}}_{i})\beta _{i,k^{\prime }} +{\bar{r}}_{i+1}^{2}({\bar{v}}_{i+2}-{\bar{v}}_{i+1})(1-\beta _{i+1,k^{\prime }})\right) \end{aligned}$$
(243)

for \(j\le jmax/2\) and

$$\begin{aligned} A_{i^{\prime },k^{\prime }}=4\pi \frac{{\bar{\rho }}_{i^{\prime }}}{dm_{i^{\prime }}}\left( {\bar{r}}_{i}^{2}({\bar{v}}_{i+1}-{\bar{v}}_{i})(1-\beta _{i,k^{\prime }}) +{\bar{r}}_{i+1}^{2}({\bar{v}}_{i+2}-{\bar{v}}_{i+1})\beta _{i+1,k^{\prime }}\right) \end{aligned}$$
(244)

for \(j\ge jmax/2 + 1\). (The case \(i=imax-1\) is a boundary case, the details of which are not important for the present discussion.)

Similarly, defining \(B^{\prime }\) according to

$$\begin{aligned} B_{i^{\prime },j^{\prime },k^{\prime }}\equiv \frac{3}{2}\frac{{\bar{r}}_{i+1}^{2}-{\bar{r}}_{i}^{2}}{{\bar{r}}_{i+1}^{3}-{\bar{r}}_{i}^{3}}{\bar{v}}_{i+1}B^{\prime }_{j^{\prime },k^{\prime }}, \end{aligned}$$
(245)

and again focusing on the appropriate integration (in this case, over \(\mu \)), Mezzacappa et al. require that [below, the term involving \(B'\) comes from the zeroth moment of the second term in brackets in the energy advection term (236), after an integration by parts in angle, \(\mu \); the second term is the last term in Eq. (241), corresponding to the first moment of the angular advection term]:

$$\begin{aligned} 0= & {} \sum _{j=1,jmax}B^{\prime }_{j^{\prime }}F_{j^{\prime }}w_{j^{\prime }} +\sum _{j=1,jmax}\frac{2}{w_{j^{\prime }}}[\mu _{j^{\prime }}\alpha _{j+1}F_{j+1}-\mu _{j^{\prime }}\alpha _{j}F_{j}]w_{j^{\prime }} \nonumber \\= & {} \sum _{j=1,jmax}B^{\prime }_{j^{\prime }}F_{j^{\prime }}w_{j^{\prime }} \nonumber \\&+ \sum _{j=1,jmax}2[\mu _{j^{\prime }}\alpha _{j+1}(\gamma F_{j^{\prime }}+(1-\gamma )F_{j^{\prime }+1}) -\mu _{j^{\prime }}\alpha _{j} (\gamma F_{j^{\prime }-1}+(1-\gamma )F_{j^{\prime }})] \nonumber \\= & {} \sum _{j=1,jmax}B^{\prime }_{j^{\prime }}F_{j^{\prime }}w_{j^{\prime }} +\sum _{j=1,jmax}2[\mu _{j^{\prime }}\alpha _{j+1}\gamma -\mu _{j^{\prime }}\alpha _{j}(1-\gamma )]F_{j^{\prime }} \nonumber \\&+ \sum _{j=2,jmax}2\mu _{j^{\prime }-1}\alpha _{j}(1-\gamma )F_{j^{\prime }} +\sum _{j=1,jmax-1}(-2)\mu _{j^{\prime }+1}\alpha _{j+1}\gamma F_{j^{\prime }} \nonumber \\= & {} \sum _{j=1,jmax}B^{\prime }_{j^{\prime }}F_{j^{\prime }}w_{j^{\prime }}+ \sum _{j=1,jmax}2\gamma \alpha _{j+1}(\mu _{j^{\prime }}-\mu _{j^{\prime }+1})F_{j^{\prime }} \nonumber \\&+\sum _{j=1,jmax}2(1-\gamma )\alpha _{j} (\mu _{j^{\prime }-1}-\mu _{j^{\prime }})F_{j^{\prime }}, \end{aligned}$$
(246)

which gives

$$\begin{aligned} B_{i^{\prime },j^{\prime },k^{\prime }} =\frac{3}{2}\frac{{\bar{r}}_{i+1}^{2}-{\bar{r}}_{i}^{2}}{{\bar{r}}_{i+1}^{3}-{\bar{r}}_{i}^{3}}{\bar{v}}_{i+1} \left[ 2\gamma _{i^{\prime },k^{\prime }} \alpha _{j+1}\frac{\mu _{j^{\prime }+1}-\mu _{j^{\prime }}}{w_{j^{\prime }}} +2(1-\gamma _{i^{\prime },k^{\prime }} )\alpha _{j} \frac{\mu _{j^{\prime }}-\mu _{j^{\prime }-1}}{w_{j^{\prime }}}\right] . \end{aligned}$$
(247)

Given the necessary matched finite differencing for A and B, Mezzacappa et al. then consider the finite difference representation of the energy advection term (236). Using the definitions (239) and (240), they rewrite the equation corresponding to the change in the distribution function due to relativistic energy advection as

$$\begin{aligned} 0=E^{3}\left( \frac{\partial F}{\partial t}\right) _{E}+ \left( \mu ^{2}A-B\right) E\frac{\partial }{\partial E}\left[ E^{3}F\right] , \end{aligned}$$
(248)

and then solve it analytically. To solve Eq. (248), Mezzacappa et al. write the prefactor of the energy derivative as the time derivative of the quantity

$$\begin{aligned} R_{f}=r^{3\mu ^{2}-1}\rho ^{\mu ^2}; \end{aligned}$$
(249)

i.e.,

$$\begin{aligned} \frac{\partial {\mathrm{ln}} R_{f}}{\partial t}=\mu ^{2}A-B. \end{aligned}$$
(250)

They then transform from the “Eulerian” variable \( x=E \) to the “Lagrangian” variable \( y=E/R_{f} \), and in so doing they transform Eq. (248):

$$\begin{aligned} 0= & {} \left( \frac{\partial }{\partial t}\left[ E^{3}F\right] \right) _{E}+\frac{\partial R_{f}}{R^{2}_{f}\partial t}E\times R_{f}\frac{\partial }{\partial E}\left[ E^{3}F\right] \nonumber \\= & {} \left( \frac{\partial }{\partial t}\left[ E^{3}F\right] \right) _{E}-\left( \frac{\partial \left[ E/R_{f}\right] }{\partial t}\right) _{E}\frac{\partial \left[ E^{3}F\right] }{\partial \left[ E/R_{f}\right] } \nonumber \\= & {} \left( \frac{\partial }{\partial t}\left[ E^{3}F\right] \right) _{E/R_{f}}. \end{aligned}$$
(251)

For a small section of energy phase space \( E^{2}\varDelta E=\left( E^{3}_{2}-E^{3}_{1}\right) /3 \), this relationship leads to

$$\begin{aligned} \left( \frac{\partial }{\partial t}\left[ E^{2}F\varDelta E\right] \right) _{E/R_{f}}=0, \end{aligned}$$
(252)

which has the following interpretation: The neutrinos in the energy interval \( E^{2}\varDelta E \) move along constant \( E/R_{f} \) in the phase space of a comoving observer. Given this, Mezzacappa et al. are able to evolve any neutrino quantity in this phase-space interval—in particular, the neutrino specific energy, \( d\varepsilon =E^{3}F\varDelta E \):

$$\begin{aligned} \left( \frac{\partial }{\partial t}\left[ E^{3}F\varDelta E\right] \right) _{E/R_{f}} =E^{2}F\varDelta E\left( \frac{\partial E}{\partial t}\right) _{E/R_{f}}=\frac{\partial {\mathrm{ln}} R_{f}}{\partial t}d\varepsilon . \end{aligned}$$
(253)

They then consider a neutrino energy group \( k^{\prime } \), with neighboring groups \( k^{\prime }+dk \), \( dk=\pm 1 \). From Eq. (252), the number of neutrinos before energy advection, \( F_{i^{\prime },j^{\prime },k^{\prime }}E^{2}_{k^{\prime }}dE_{k^{\prime }} \), is equal to the number of neutrinos after advection. The distribution of these neutrinos in energy after the advection yields a diminished number of neutrinos \( F_{i^{\prime },j^{\prime },k^{\prime }}E_{k^{\prime }}^{2}dE_{k^{\prime }}-n_{i^{\prime },j^{\prime },k^{\prime }}^{-} \) in group \( k^{\prime } \) and an additional number of neutrinos \( n_{i^{\prime },j^{\prime },k^{\prime }+dk}^{+} \) in the neighboring group \( k^{\prime }+dk \) such that

$$\begin{aligned} F_{i^{\prime },j^{\prime },k^{\prime }}E_{k^{\prime }}^{2}dE_{k^{\prime }}-\left[ \left( F_{i^{\prime },j^{\prime },k^{\prime }}E_{k^{\prime }}^{2}dE_{k^{\prime }}-n^{-}_{i^{\prime },j^{\prime },k^{\prime }}\right) +n^{+}_{i^{\prime },j^{\prime },k^{\prime }+dk}\right] =0. \end{aligned}$$
(254)

Equation (253) defines a similar correction for the specific neutrino energy in group \(k^{\prime }\):

$$\begin{aligned}&F_{i^{\prime },j^{\prime },k^{\prime }}E_{k^{\prime }}^{3}dE_{k^{\prime }} - \left[ \left( F_{i^{\prime },j^{\prime },k^{\prime }}E_{k^{\prime }}^{3}dE_{k^{\prime }}-E_{k^{\prime }}n^{-}_{i^{\prime },j^{\prime },k^{\prime }}\right) +E_{k^{\prime }+dk}n^{+}_{i^{\prime },j^{\prime },k^{\prime }+dk}\right] \nonumber \\&\quad = -\left( \mu ^{2}_{j^{\prime }}A_{i^{\prime },k^{\prime }}-B_{i^{\prime },j^{\prime },k^{\prime }}\right) F_{i^{\prime },j^{\prime },k^{\prime }}E^{3}_{k^{\prime }}dE_{k^{\prime }}dt, \end{aligned}$$
(255)

where \(A_{i^{\prime },k^{\prime }}\) and \(B_{i^{\prime },j^{\prime },k^{\prime }}\) are given by Eqs. (243), (244) and (247). Equations (254) and (255) can be solved for \(n^{-}_{i^{\prime },j^{\prime },k^{\prime }}\) and \(n^{+}_{i^{\prime },j^{\prime },k^{\prime }}\):

$$\begin{aligned} n^{-}_{i^{\prime },j^{\prime },k^{\prime }}= & {} \left( \mu _{j^{\prime }}^{2}A_{i^{\prime },k^{\prime }}-B_{i^{\prime },j^{\prime },k^{\prime }}\right) \frac{dE_{k^{\prime }}}{E_{k^{\prime }+dk}-E_{k^{\prime }}}E^{3}_{k^{\prime }}F_{i^{\prime },j^{\prime },k^{\prime }} \, dt , \nonumber \\ n^{+}_{i^{\prime },j^{\prime },k^{\prime }}= & {} n_{i^{\prime },j^{\prime },k^{\prime }-dk}^{-}, \end{aligned}$$
(256)

which leads, given the change in the neutrino distribution function in group \(k^{\prime }\) due to energy advection can be expressed as

$$\begin{aligned} F_{i^{\prime },j^{\prime },k^{\prime }}=F_{i^{\prime },j^{\prime },k^{\prime }}^{n}+\left( n^{+}_{i^{\prime },j^{\prime },k^{\prime }}-n^{-}_{i^{\prime },j^{\prime },k^{\prime }}\right) /\left( E_{k^{\prime }}^{2}dE_{k^{\prime }}\right) , \end{aligned}$$
(257)

to the following finite difference representation of the energy advection term in the Boltzmann equation (216):

$$\begin{aligned}&\frac{1}{E^{2}_{k^{\prime }}dE_{k^{\prime }}} \left[ \left( \mu _{j^{\prime }}^{2}A_{i^{\prime },k^{\prime }-dk}-B_{i^{\prime },j^{\prime },k^{\prime }}\right) \frac{dE_{k^{\prime }-dk}}{E_{k^{\prime }}-E_{k^{\prime }-dk}}E_{k^{\prime }-dk}^{3}F_{i^{\prime },j^{\prime },k^{\prime }-dk}\right. \nonumber \\&\quad \quad - \left. \left( \mu _{j^{\prime }}^{2}A_{i^{\prime },k^{\prime }}-B_{i^{\prime },j^{\prime },k^{\prime }}\right) \frac{dE_{k^{\prime }}}{E_{k^{\prime }+dk}-E_{k^{\prime }}}E_{k^{\prime }}^{3}F_{i^{\prime },j^{\prime },k^{\prime }}\right] . \end{aligned}$$
(258)

Mezzacappa et al. are then left with the task of finding a finite difference representation for the angular advection term in Eq. (216). Their finite differencing of the energy advection term conserved specific neutrino energy. Their finite differencing of the angular advection term is designed to conserve specific neutrino luminosity. With \( \zeta =1-\mu ^{2} \), the angular aberration term can be rewritten as

$$\begin{aligned} \left( \frac{\partial F}{\partial t}\right) _{\mu }=\left( A+B/\zeta \right) \frac{\partial }{\partial \mu }\left[ \zeta \mu F\right] . \end{aligned}$$
(259)

As before, Mezzacappa et al. seek an analytic solution to Eq. (259). To do so, they convert the prefactor of the angular derivative to a time derivative. For the quantity \( R_{a}=r^{3}\rho \), they find

$$\begin{aligned} \frac{d{\mathrm{ln}} R_{a}}{dt}=A+B/\zeta . \end{aligned}$$
(260)

They then rewrite Eq. (259) in terms of the “Lagrangian” variable \( y=\zeta ^{-1/2}\mu /R_{a} \) instead of the “Eulerian” variable \( x=\mu \). After multiplication by \( \zeta \mu \), Eq. (259) becomes:

$$\begin{aligned} 0= & {} \zeta \mu \left[ \left( \frac{\partial F}{\partial t}\right) _{\mu }+\alpha \left( A+B/\zeta \right) \frac{\partial }{\partial \mu }\left[ \zeta \mu F\right] \right] \nonumber \\= & {} \left( \frac{\partial }{\partial t}\left[ \zeta \mu F\right] \right) _{\mu }+\zeta ^{-1/2}\mu \frac{\partial R_{a}}{R^{2}_{a}\partial t}\times \zeta ^{3/2}R_{a}\frac{\partial }{\partial \mu }\left[ \zeta \mu F\right] \nonumber \\= & {} \left( \frac{\partial }{\partial t}\left[ \zeta \mu F\right] \right) _{\mu } - \left( \frac{\partial \left[ \zeta ^{-1/2}\mu /R_{a}\right] }{\partial t}\right) _{\mu }\frac{\partial \left[ \zeta \mu F\right] }{\partial \left[ \zeta ^{-1/2}\mu /R_{a}\right] }\nonumber \\= & {} \left( \frac{\partial }{\partial t}\left[ \zeta \mu F\right] \right) _{\zeta ^{-1/2}\mu /R_{a}}. \end{aligned}$$
(261)

As before, the interpretation is clear: The neutrinos initially residing in the interval \( \left( 1-3\mu ^{2}\right) \varDelta \mu =\zeta _{2}\mu _{2}-\zeta _{1}\mu _{1} \) are shifted by angular aberration along constant \( \mu /\left( \sqrt{\zeta }R_{a}\right) \) in the phase space of a comoving observer:

$$\begin{aligned} \left( \frac{\partial }{\partial t}\left[ \left( 1-3\mu ^{2}\right) F\varDelta \mu \right] \right) _{\zeta ^{-1/2}\mu /R_{a}}=0. \end{aligned}$$
(262)

Given Eq. (262), Mezzacappa et al. are in turn able to evaluate the change in other neutrino quantities—in particular, the specific neutrino luminosity, \( d\ell =\left( 1-3\mu ^{2}\right) \mu F\varDelta \mu \):

$$\begin{aligned} \left( \frac{\partial }{\partial t}\left[ \left( 1-3\mu ^{2}\right) \mu F\varDelta \mu \right] \right) _{\zeta ^{-1/2}\mu /R_{a}}= & {} \left( 1-3\mu ^{2}\right) F\varDelta \mu \left( \frac{\partial \mu }{\partial t}\right) _{\zeta ^{-1/2}/R_{a}}\nonumber \\= & {} \zeta \frac{\partial {\mathrm{ln}} R_{a}}{\partial t}d\ell . \end{aligned}$$
(263)

Identifying their bin size \( \left( 1-3\mu _{j^{\prime }}^{2}\right) \varDelta \mu _{j^{\prime }}=w_{j^{\prime }} \) with their Gaussian quadrature weights, Eq. (262) leads to their condition for neutrino number conservation,

$$\begin{aligned} F_{i^{\prime },j^{\prime },k^{\prime }}w_{j^{\prime }}-\left[ \left( F_{i^{\prime },j^{\prime },k^{\prime }}w_{j^{\prime }}-n^{-}_{i^{\prime },j^{\prime },k^{\prime }}\right) +n^{+}_{i^{\prime },j^{\prime }+dj,k^{\prime }}\right] =0, \end{aligned}$$
(264)

and Eq. (263) leads to their prescription for the numerical evolution of the specific luminosity,

$$\begin{aligned}&F_{i^{\prime },j^{\prime },k^{\prime }}\mu _{j^{\prime }}w_{j^{\prime }} - \left[ \left( F_{i^{\prime },j^{\prime },k^{\prime }}\mu _{j^{\prime }}w_{j^{\prime }}-\mu _{j^{\prime }}n^{-}_{i^{\prime },j^{\prime },k^{\prime }}\right) +\mu _{j^{\prime }+dj}n^{+}_{i^{\prime },j^{\prime }+dj,k^{\prime }}\right] \nonumber \\&\quad = -\left( \zeta _{j^{\prime }}A_{i^{\prime },k^{\prime }}+B_{i^{\prime },j^{\prime },k^{\prime }}\right) F_{i^{\prime },j^{\prime },k^{\prime }}\mu _{j^{\prime }}w_{j^{\prime }} \, dt, \end{aligned}$$
(265)

where \( dj=\pm 1 \). The change in the neutrino distribution from angular aberration is then

$$\begin{aligned} F_{i^{\prime },j^{\prime },k^{\prime }}=F_{i^{\prime },j^{\prime },k^{\prime }}^{n}+\left( n_{i^{\prime },j^{\prime },k^{\prime }}^{+}-n_{i^{\prime },j^{\prime },k^{\prime }}^{-}\right) /w_{j^{\prime }}, \end{aligned}$$
(266)

with

$$\begin{aligned} n^{-}_{i^{\prime },j^{\prime },k^{\prime }}= & {} \left( A_{i^{\prime },k^{\prime }}+B_{i^{\prime },j^{\prime },k^{\prime }}/\zeta _{j^{\prime }}\right) \frac{w_{j^{\prime }}}{\mu _{j^{\prime }+dj}-\mu _{j^{\prime }}}\zeta _{j^{\prime }}\mu _{j^{\prime }}F_{i^{\prime },j^{\prime },k^{\prime }} \, dt, \nonumber \\ n^{+}_{i^{\prime },j^{\prime },k^{\prime }}= & {} n^{-}_{i^{\prime },j^{\prime }-dj,k^{\prime }}. \end{aligned}$$
(267)

This leads to the following finite difference representation of the angular aberration term in the Boltzmann equation (216):

$$\begin{aligned}&\frac{1}{w_{j^{\prime }}}\left[ \left( A_{i^{\prime },k^{\prime }}+B_{i^{\prime },j^{\prime }-dj,k^{\prime }}/\zeta _{j^{\prime }-dj}\right) \frac{w_{j^{\prime }-dj}}{\mu _{j^{\prime }}-\mu _{j^{\prime }-dj}}\zeta _{j^{\prime }-dj}\mu _{j^{\prime }-dj}F_{i^{\prime },j^{\prime }-dj,k^{\prime }}\right. \nonumber \\&\quad \quad - \left. \left( A_{i^{\prime },k^{\prime }}+B_{i^{\prime },j^{\prime },k^{\prime }}/\zeta _{j^{\prime }}\right) \frac{w_{j^{\prime }}}{\mu _{j^{\prime }+dj}-\mu _{j^{\prime }}}\zeta _{j^{\prime }}\mu _{j^{\prime }}F_{i^{\prime },j^{\prime },k^{\prime }}\right] , \end{aligned}$$
(268)

where \( dj=+1 \) for \( \mu _{j^{\prime }} \le 0 \) and \( dj=-1 \) for \( \mu _{j^{\prime }} >0 \).

Given the finite differencing for all of the terms in the Boltzmann equation (216)—i.e., Eqs. (217), (219), (222), (268), (258) and (225)—Mezzacappa et al. solve the discretized equation as follows. With the exception of the discretized time derivative, which is a finite difference of the values of the distribution function at time step \(t^{n+1}\) and \(t^n\), the distribution function in all of the remaining terms is defined at time step \(t^{n+1}\)—i.e., Mezzacappa et al. employ a fully implicit approach, including phase-space advection and collisions. Given the presence of blocking factors in the collision term and the presence of products of the distribution functions and the neutrino opacities, which are functions of the specific internal energy and electron fraction of the matter, which are updated together with the distribution functions given lepton number and energy exchange with the matter through collisions [see Eqs. (5), (6), (191) and (194)], linearization is necessary. Specifically, Mezzacappa et al. introduce the linearizations

$$\begin{aligned} F_{i^{\prime },j^{\prime },k^{\prime }}= & {} F^{0}_{i^{\prime },j^{\prime },k^{\prime }}+\delta F_{i^{\prime },j^{\prime },k^{\prime }}, \end{aligned}$$
(269)
$$\begin{aligned} \varepsilon _{i^{\prime }}= & {} \varepsilon ^{0}_{i^{\prime }}+\delta \varepsilon _{i^{\prime }}, \end{aligned}$$
(270)
$$\begin{aligned} (Y_e)_{i^{\prime }}= & {} (Y_e)^{0}_{i^{\prime }}+\delta (Y_e)_{i^{\prime }}, \end{aligned}$$
(271)

where a 0 superscript denotes the value of the variable at the current iterate in an outer Newton iteration of the solution algorithm. Given the dependence of j, \({\tilde{\chi }}\), \(R_{\mathrm{IS}}\), \(R_{\mathrm{NIS}}\), and \(R_{{\mathrm{PAIR}}}\) on \(\rho \), T, and \(Y_e\), the above linearizations lead to linearizations in all of these quantities. For example:

$$\begin{aligned} j_{i^{\prime },k^{\prime }}=j^{0}_{i^{\prime },k^{\prime }}+\left[ \left( \frac{\partial j}{\partial T}\right) _{\rho ,Y_e}\right] ^{0}_{i^{\prime },k^{\prime }}+\left[ \left( \frac{\partial j}{\partial Y_e}\right) _{\rho ,T}\right] ^{0}_{i^{\prime },k^{\prime }}. \end{aligned}$$
(272)

Insertion of these linearizations into the finite differenced Boltzmann equation leads to a block-tridiagonal linear system of equations for the quantities \(\delta F_{i^{\prime },j^{\prime },k^{\prime }}\), \(\delta \varepsilon _{i^{\prime }}\), and \(\delta (Y_e)_{i^{\prime }}\), which is solved for each outer iteration until a prescribed tolerance is reached for all of the variables. The block tridiagonal system has the form

$$\begin{aligned} -{\mathbf {C}}_{i}{\mathbf {V}}_{i-1}+{\mathbf {A}}_{i}{\mathbf {V}}_{i}-{\mathbf {B}}_{i+1}{\mathbf {V}}_{i+1}={\mathbf {U}}_{i}, \end{aligned}$$
(273)

where \({\mathbf {B}}_{i}\) and \({\mathbf {C}}_{i}\) are diagnoal, reflecting the fact that spatial advection couples nearest neighbors only, and where \({\mathbf {A}}_{i}\) has the form

$$\begin{aligned} {\mathbf {A}}_{i}= \left( \begin{array}{cc} A_{1} &{} A_{2} \\ A_{3} &{} A_{4} \end{array} \right) . \end{aligned}$$
(274)

\({\mathbf {A}}_{i}\) is an \(M\times M\) matrix, where \(M=jmax \times kmax +2\). jmax corresponds to the number of angular quadratures used in the discrete ordinates implementation for angle, and kmax corresponds to the number of energy groups. The submatrices \({\mathbf {A}}_{2}\) and \({\mathbf {A}}_{3}\) are of dimension \(2\times (M-2)\) and \((M-2)\times 2\), respectively. \({\mathbf {A}}_{4}\) is a \(2\times 2\) matrix. The 2 rightmost columns of \({\mathbf {A}}_{i}\) and the 2 bottom-most rows correspond to the coupling of the Boltzmann equation to the equations for the specific internal energy and electron fraction of the matter, accounting for energy and lepton number exchange. The solution vector, \({\mathbf {V}}_{i}\), comprising the quantities \(\delta F_{i^{\prime },j^{\prime },k^{\prime }}\), \(\delta \varepsilon _{i^{\prime }}\), and \(\delta (Y_e)_{i^{\prime }}\), has the form

$$\begin{aligned} \left( \begin{array}{c} \delta F_{i^{\prime },1',1'} \\ \delta F_{i^{\prime },2',1'} \\ . \\ . \\ . \\ \delta F_{i^{\prime },1',2'} \\ \delta F_{i^{\prime },2',2'} \\ . \\ . \\ . \\ \delta \varepsilon _{i^{\prime }} \\ \delta (Y_e)_{i^{\prime }} \end{array} \right) . \end{aligned}$$
(275)

D’Azevedo et al. (2005) developed a physics-based preconditioner for the above system. This “ADI-like” preconditioner treats the diagonal dense blocks, which correspond to coupling in momentum space, and the tridiagonal bands, which correspond to coupling in space, separately, and has proven very effective.

For Mezzacappa et al., neutrino momentum exchange with the matter is handled separately, during the hydrodynamics update, and is differenced explicitly in time.

6.1.4 Challenges: neutrino–nucleon (small-energy) scattering

In the case of neutrino–electron scattering, for example, where the energy transfer is not small in comparison with the widths of the zones of our energy grid, Eq. (216) is differenced using zone-centered values of energy in both the neutrino distribution function and the scattering kernels. However, for small-energy transfers compared with our energy zone widths, the scattering kernel \(R_{{\mathrm{NNS}}}^{\mathrm{in/out}}(\varepsilon _{k}, \varepsilon _{k^{\prime }}, \cos \theta )\) will be effectively zero if \(\varepsilon _{k} \ne \varepsilon _{k^{\prime }}\), and the scattering will become essentially isoenergetic, with negligible energy transfer. As already discussed, while the transfer of energy between neutrinos and nucleons during a scattering event is small, there are many such scatterings, and the overall impact of the energy exchange between the neutrinos and nucleons in these events is nonnegligible. Thus, a numerical treatment of this scattering contribution that reflects the fact that the energy exchange between neutrinos and matter is important and, more important, captures this exchange accurately, must be developed.

Focusing on this term in the collision term, we have

$$\begin{aligned}&{\displaystyle \frac{\partial f(\mu , \varepsilon ) }{\partial t } = [ 1 - f(\mu , \varepsilon ) ] \frac{1}{(hc)^{3}} \int _{0}^{\infty } \varepsilon ^{{\prime }{2}} d\varepsilon ^{\prime } \int _{-1}^{1} d\mu ^{\prime } f(\mu ^{\prime }, \varepsilon ^{\prime }) \int _{0}^{2\pi } d\beta ^{\prime } R_{{\mathrm{NNS}}}^{{\mathrm{in}}}(\varepsilon , \varepsilon ^{\prime }, \cos \theta ) } \nonumber \\&\qquad {\displaystyle - f(\mu , \varepsilon ) \frac{1}{(hc)^{3}} \int _{0}^{\infty } \varepsilon ^{{\prime }{2}} d\varepsilon ^{\prime } \int _{-1}^{1} d\mu ^{\prime } [1 - f(\mu ^{\prime }, \varepsilon ^{\prime }) ] \int _{0}^{2\pi } d\beta ^{\prime } R_{{\mathrm{NNS}}}^{\mathrm{out}}(\varepsilon , \varepsilon ^{\prime }, \cos \theta ), } \nonumber \\ \end{aligned}$$
(276)

where, for simplicity, we have suppressed the radial and temporal dimensions. With the energy zone centers, \(\varepsilon _{k+1/2}\), defined in terms of the energy zone edges, \(\varepsilon _{k}\), as

$$\begin{aligned} \varepsilon _{k+\frac{1}{2}} = \frac{1}{3} [ \varepsilon _{k}^{2} + \varepsilon _{k}\varepsilon _{k+1} + \varepsilon _{k+1}^{2} ], \end{aligned}$$
(277)

the volume of an energy zone is given by

$$\begin{aligned} 4\pi \varepsilon _{k+\frac{1}{2}}^{2} \varDelta \varepsilon _{k+\frac{1}{2}} = \frac{4\pi }{3} [ \varepsilon _{k+1}^{3} - \varepsilon _{k}^{3} ], \end{aligned}$$
(278)

where

$$\begin{aligned} \varDelta \varepsilon _{k+\frac{1}{2}} = \varepsilon _{k+1} - \varepsilon _{k}. \end{aligned}$$
(279)

The integral over energy can now be replaced by

$$\begin{aligned} \int _{0}^{\varepsilon _{N+1}} \varepsilon ^{2} d\varepsilon = \sum _{k=1}^{N} \varepsilon _{k+\frac{1}{2}}^{2} \varDelta \varepsilon _{k+\frac{1}{2}}, \end{aligned}$$
(280)

and Eq. (276) becomes

$$\begin{aligned}&{\displaystyle \left. \frac{\partial f(\mu , \varepsilon ) }{\partial t } \right| _{\mathrm{scat}}} \nonumber \\&\quad {\displaystyle \simeq [ 1 - f(\mu , \varepsilon ) ] \frac{1}{(hc)^{3}} \int _{0}^{\varepsilon _{N+1}} \varepsilon '^{2} d\varepsilon ^{\prime } \int _{-1}^{1} d\mu ^{\prime } f(\mu ^{\prime }, \varepsilon ^{\prime }) \int _{0}^{2\pi } d\beta ^{\prime } R_{{\mathrm{NNS}}}^{{\mathrm{in}}}(\varepsilon , \varepsilon ^{\prime }, \cos \theta ) } \nonumber \\&\qquad {\displaystyle - f(\mu , \varepsilon ) \frac{1}{(hc)^{3}} \int _{0}^{\varepsilon _{N+1}} \varepsilon ^{{\prime }{2}} d\varepsilon ^{\prime } \int _{-1}^{1} d\mu ^{\prime } [1 - f(\mu ^{\prime }, \varepsilon ^{\prime }) ] \int _{0}^{2\pi } d\beta ^{\prime } R_{{\mathrm{NNS}}}^{\mathrm{out}}(\varepsilon , \varepsilon ^{\prime }, \cos \theta ) } \nonumber \\&\quad {\displaystyle = [ 1 - f(\mu , \varepsilon ) ] \frac{1}{(hc)^{3}} \sum _{k^{\prime }=1}^{N} \int _{\varepsilon _{k^{\prime }}}^{\varepsilon _{k^{\prime }+1}} {\varepsilon ^{\prime }}^{2} d\varepsilon ^{\prime } \int _{-1}^{1} d\mu ^{\prime } f(\mu ^{\prime }, \varepsilon ^{\prime }) \int _{0}^{2\pi } d\beta ^{\prime } R_{{\mathrm{NNS}}}^{{\mathrm{in}}}(\varepsilon , \varepsilon ^{\prime }, \cos \theta ) } \nonumber \\&\qquad {\displaystyle - f(\mu , \varepsilon ) \frac{1}{(hc)^{3}} \sum _{k^{\prime }=1}^{N} \int _{\varepsilon _{k^{\prime }}}^{\varepsilon _{k^{\prime }+1}} \varepsilon ^{{\prime }{2}} d\varepsilon ^{\prime } \int _{-1}^{1} d\mu ^{\prime } [1 - f(\mu ^{\prime }, \varepsilon ^{\prime }) ] \int _{0}^{2\pi } d\beta ^{\prime } R_{{\mathrm{NNS}}}^{\mathrm{out}}(\varepsilon , \varepsilon ^{\prime }, \cos \theta ) }\nonumber \\&\quad {\displaystyle = [ 1 - f(\mu , \varepsilon ) ] \frac{1}{(hc)^{3}} \sum _{k^{\prime }=1}^{N} \varepsilon _{k^{\prime }+\frac{1}{2}}^{2} \varDelta \varepsilon _{k^{\prime }+\frac{1}{2}} \frac{1}{ \varepsilon _{k^{\prime }+\frac{1}{2}}^{2} \varDelta \varepsilon _{k^{\prime }+\frac{1}{2}} } \int _{\varepsilon _{k^{\prime }}}^{\varepsilon _{k^{\prime }+1}} {\varepsilon ^{\prime }}^{2} d\varepsilon ^{\prime } } \nonumber \\&\qquad {\displaystyle \times \int _{-1}^{1} d\mu ^{\prime } f(\mu ^{\prime }, \varepsilon ^{\prime }) \int _{0}^{2\pi } d\beta ^{\prime } R_{{\mathrm{NNS}}}^{{\mathrm{in}}}(\varepsilon , \varepsilon ^{\prime }, \cos \theta ) } \nonumber \\&\qquad {\displaystyle - f(\mu , \varepsilon ) \frac{1}{(hc)^{3}} \sum _{k^{\prime }=1}^{N} \varepsilon _{k^{\prime }+\frac{1}{2}}^{2} \varDelta \varepsilon _{k^{\prime }+\frac{1}{2}} \frac{1}{ \varepsilon _{k^{\prime }+\frac{1}{2}}^{2} \varDelta \varepsilon _{k^{\prime }+\frac{1}{2}} } \int _{\varepsilon _{k^{\prime }}}^{\varepsilon _{k^{\prime }+1}}{\varepsilon ^{\prime }}^{2} d\varepsilon ^{\prime } } \nonumber \\&\qquad {\displaystyle \times \int _{-1}^{1} d\mu ^{\prime } [1 - f(\mu ^{\prime }, \varepsilon ^{\prime }) ] \int _{0}^{2\pi } d\beta ^{\prime } R_{{\mathrm{NNS}}}^{\mathrm{out}}(\varepsilon , \varepsilon ^{\prime }, \cos \theta ). } \end{aligned}$$
(281)

In Eq. (281), the first approximation was made by truncating the energy integral at \(\varepsilon _{N+1}\). In the second equality, the integral over the entire energy domain is broken up into segments within the domain, corresponding to the energy zone widths. This is not an approximation. In the last equality, we have inserted a factor of unity inside the summation over energy groups, which, again, is not an approximation. Therefore, no approximations have been made thus far except for truncating the range of the energy integration.

The ultimate goal of an improved treatment of small-energy, neutrino–nucleon scattering is to accurately compute the energy transfer between the neutrinos and the nucleons—i.e., to compute accurately the change in the neutrino energy within each of the groups of our energy grid from such scattering. The change in the neutrino energy within a group is given by

$$\begin{aligned}&{\displaystyle \left. \frac{\partial E_{k+\frac{1}{2}} }{\partial t } \right| _{\mathrm{scat}} = \frac{1}{(hc)^{3}} \int _{\varepsilon _{k}}^{\varepsilon _{k+1}} \varepsilon ^{3} d\varepsilon \left. \frac{\partial f(\mu , \varepsilon ) }{\partial t } \right| _{\mathrm{scat}} } \nonumber \\&\quad {\displaystyle = \frac{1}{(hc)^{3}} \int _{\varepsilon _{k}}^{\varepsilon _{k+1}} \varepsilon ^{3} d\varepsilon [ 1 - f(\mu , \varepsilon ) ] \frac{1}{(hc)^{3}} \sum _{k^{\prime }=1}^{N} \varepsilon _{k^{\prime }+\frac{1}{2}}^{2} \varDelta \varepsilon _{k^{\prime }+\frac{1}{2}} \frac{1}{ \varepsilon _{k^{\prime }+\frac{1}{2}}^{2} \varDelta \varepsilon _{k^{\prime }+\frac{1}{2}} } \int _{\varepsilon _{k^{\prime }}}^{\varepsilon _{k^{\prime }+1}} {\varepsilon ^{\prime }}^{2} d\varepsilon ^{\prime } } \nonumber \\&\qquad {\displaystyle \times \int _{-1}^{1} d\mu ^{\prime } f(\mu ^{\prime }, \varepsilon ^{\prime }) \int _{0}^{2\pi } d\beta ^{\prime } R_{{\mathrm{NNS}}}^{{\mathrm{in}}}(\varepsilon , \varepsilon ^{\prime }, \cos \theta ) } \nonumber \\&\qquad {\displaystyle - \frac{1}{(hc)^{3}} \int _{\varepsilon _{k}}^{\varepsilon _{k+1}} \varepsilon ^{3} d\varepsilon f(\mu , \varepsilon ) \frac{1}{(hc)^{3}} \sum _{k^{\prime }=1}^{N} \varepsilon _{k^{\prime }+\frac{1}{2}}^{2} \varDelta \varepsilon _{k^{\prime }+\frac{1}{2}} \frac{1}{ \varepsilon _{k^{\prime }+\frac{1}{2}}^{2} \varDelta \varepsilon _{k^{\prime }+\frac{1}{2}}} \int _{\varepsilon _{k^{\prime }}}^{\varepsilon _{k^{\prime }+1}} {\varepsilon ^{\prime }}^{2} d\varepsilon ^{\prime } } \nonumber \\&\qquad {\displaystyle \times \int _{-1}^{1} d\mu ^{\prime } [1 - f(\mu ^{\prime }, \varepsilon ^{\prime }) ] \int _{0}^{2\pi } d\beta ^{\prime } R_{{\mathrm{NNS}}}^{\mathrm{out}}(\varepsilon , \varepsilon ^{\prime }, \cos \theta ), } \end{aligned}$$
(282)

where we have inserted Eq. (281) for the time derivative of the neutrino distribution function due to neutrino–nucleon scattering. If we now choose to define the distribution function, \(f(\mu ,\varepsilon )\), at the energy zone centers, Eq. (282) can be expressed as

$$\begin{aligned}&{\displaystyle \frac{1}{(hc)^{3}} \int _{\varepsilon _{k}}^{\varepsilon _{k+1}} \varepsilon ^{3} d\varepsilon \left. \frac{\partial f\left( \mu , \varepsilon _{k+\frac{1}{2}}\right) }{\partial t } \right| _{\mathrm{scat}} = \left. \frac{\partial f\left( \mu , \varepsilon _{k+\frac{1}{2}}\right) }{\partial t } \right| _{\mathrm{scat}} \frac{1}{(hc)^{3}} \int _{\varepsilon _{k}}^{\varepsilon _{k+1}} \varepsilon ^{3} d\varepsilon } \nonumber \\&\quad {\displaystyle = \frac{1}{(hc)^{3}} \int _{\varepsilon _{k}}^{\varepsilon _{k+1}} \varepsilon ^{3} d\varepsilon \left[ 1 - f\left( \mu , \varepsilon _{k+\frac{1}{2}}\right) \right] \frac{1}{(hc)^{3}} \sum _{k^{\prime }=1}^{N} \varepsilon _{k^{\prime }+\frac{1}{2}}^{2} \varDelta \varepsilon _{k^{\prime }+\frac{1}{2}} \frac{1}{ \varepsilon _{k^{\prime }+\frac{1}{2}}^{2} \varDelta \varepsilon _{k^{\prime }+\frac{1}{2}} } \int _{\varepsilon _{k^{\prime }}}^{\varepsilon _{k^{\prime }+1}} {\varepsilon ^{\prime }}^{2} d\varepsilon ^{\prime } } \nonumber \\&\quad \quad {\displaystyle \times \int _{-1}^{1} d\mu ^{\prime } f\left( \mu ^{\prime }, \varepsilon _{k^{\prime }+\frac{1}{2}}\right) \int _{0}^{2\pi } d\beta ^{\prime } R_{{\mathrm{NNS}}}^{\mathrm{in}}(\varepsilon , \varepsilon ^{\prime }, \cos \theta ) } \nonumber \\&\qquad {\displaystyle - \frac{1}{(hc)^{3}} \int _{\varepsilon _{k}}^{\varepsilon _{k+1}} \varepsilon ^{3} d\varepsilon f\left( \mu , \varepsilon _{k+\frac{1}{2}}\right) \frac{1}{(hc)^{3}} \sum _{k^{\prime }=1}^{N} \varepsilon _{k^{\prime }+\frac{1}{2}}^{2} \varDelta \varepsilon _{k^{\prime }+\frac{1}{2}} \frac{1}{ \varepsilon _{k^{\prime }+\frac{1}{2}}^{2} \varDelta \varepsilon _{k^{\prime }+\frac{1}{2}} } \int _{\varepsilon _{k^{\prime }}}^{\varepsilon _{k^{\prime }+1}} {\varepsilon ^{\prime }}^{2} d\varepsilon ^{\prime } } \nonumber \\&\qquad {\displaystyle \times \int _{-1}^{1} d\mu ^{\prime } \left[ 1 - f\left( \mu ^{\prime }, \varepsilon _{k^{\prime }+\frac{1}{2}}\right) \right] \int _{0}^{2\pi } d\beta ^{\prime } R_{{\mathrm{NNS}}}^{\mathrm{out}}(\varepsilon , \varepsilon ^{\prime }, \cos \theta ) } \nonumber \\&\quad {\displaystyle = \left[ 1 - f\left( \mu , \varepsilon _{k+\frac{1}{2}}\right) \right] \frac{1}{(hc)^{3}} \sum _{k^{\prime }=1}^{N} \varepsilon _{k^{\prime }+\frac{1}{2}}^{2} \varDelta \varepsilon _{k^{\prime }+\frac{1}{2}} \int _{-1}^{1} d\mu ^{\prime } f\left( \mu ^{\prime }, \varepsilon _{k^{\prime }+\frac{1}{2}}\right) \int _{0}^{2\pi } d\beta ^{\prime } } \nonumber \\&\qquad {\displaystyle \times \frac{1}{(hc)^{3}} \int _{\varepsilon _{k}}^{\varepsilon _{k+1}} \varepsilon ^{3} d\varepsilon \frac{1}{ \varepsilon _{k^{\prime }+\frac{1}{2}}^{2} \varDelta \varepsilon _{k^{\prime }+\frac{1}{2}} } \int _{\varepsilon _{k^{\prime }}}^{\varepsilon _{k^{\prime }+1}} {\varepsilon ^{\prime }}^{2} d\varepsilon ^{\prime } R_{{\mathrm{NNS}}}^{{\mathrm{in}}}(\varepsilon , \varepsilon ^{\prime }, \cos \theta ) } \nonumber \\&\qquad {\displaystyle - f\left( \mu , \varepsilon _{k+\frac{1}{2}}\right) \frac{1}{(hc)^{3}} \sum _{k^{\prime }=1}^{N} \varepsilon _{k^{\prime }+\frac{1}{2}}^{2} \varDelta \varepsilon _{k^{\prime }+\frac{1}{2}} \int _{-1}^{1} d\mu ^{\prime } \left[ 1 - f\left( \mu ^{\prime }, \varepsilon _{k^{\prime }+\frac{1}{2}}\right) \right] \int _{0}^{2\pi } d\beta ^{\prime } } \nonumber \\&\qquad {\displaystyle \times \frac{1}{(hc)^{3}} \int _{\varepsilon _{k}}^{\varepsilon _{k+1}} \varepsilon ^{3} d\varepsilon \frac{1}{ \varepsilon _{k^{\prime }+\frac{1}{2}}^{2} \varDelta \varepsilon _{k^{\prime }+\frac{1}{2}} } \int _{\varepsilon _{k^{\prime }}}^{\varepsilon _{k^{\prime }+1}} {\varepsilon ^{\prime }}^{2} d\varepsilon ^{\prime } R_{{\mathrm{NNS}}}^{\mathrm{out}}(\varepsilon , \varepsilon ^{\prime }, \cos \theta ). } \end{aligned}$$
(283)

The first equality in Eq. (283) stems from the fact that, once the distribution function is evaluated at the energy zone center and, consequently, its time derivative is evaluated there, the time derivative becomes a constant integrand and can be taken outside of the integral. Dividing both sides of Eq. (283) by

$$\begin{aligned} \frac{1}{(hc)^3}\int _{\varepsilon _{k}}^{\varepsilon _{k+1}} \varepsilon ^{3} d\varepsilon = \frac{1}{(hc)^3} \varepsilon _{k+1/2}^{3} \varDelta \varepsilon _{k+\frac{1}{2}}, \end{aligned}$$
(284)

we obtain

$$\begin{aligned}&{\displaystyle \left. \frac{\partial f\left( \mu , \varepsilon _{k+\frac{1}{2}}\right) }{\partial t } \right| _{\mathrm{scat}} } \nonumber \\&\quad {\displaystyle = \left[ 1 - f\left( \mu , \varepsilon _{k+\frac{1}{2}}\right) \right] \frac{1}{(hc)^{3}} \sum _{k^{\prime }=1}^{N} \varepsilon _{k^{\prime }+\frac{1}{2}}^{2} \varDelta \varepsilon _{k^{\prime }+\frac{1}{2}} \int _{-1}^{1} d\mu ^{\prime } f\left( \mu ^{\prime }, \varepsilon _{k^{\prime }+\frac{1}{2}}\right) \int _{0}^{2\pi } d\beta ^{\prime } } \nonumber \\&\quad \quad {\displaystyle \times \frac{1}{ \varepsilon _{k+\frac{1}{2}}^{3} \varDelta \varepsilon _{k+\frac{1}{2}} } \int _{\varepsilon _{k}}^{\varepsilon _{k+1}} \varepsilon ^{3} d\varepsilon \frac{1}{ \varepsilon _{k^{\prime }+\frac{1}{2}}^{2} \varDelta \varepsilon _{k^{\prime }+\frac{1}{2}} } \int _{\varepsilon _{k^{\prime }}}^{\varepsilon _{k^{\prime }+1}} \varepsilon ^{{\prime }{2}} d\varepsilon ^{\prime } R_{{\mathrm{NNS}}}^{{\mathrm{in}}}(\varepsilon , \varepsilon ^{\prime }, \cos \theta ) } \nonumber \\&\qquad {\displaystyle - f\left( \mu , \varepsilon _{k+\frac{1}{2}}\right) \frac{1}{(hc)^{3}} \sum _{k^{\prime }=1}^{N} \varepsilon _{k^{\prime }+\frac{1}{2}}^{2} \varDelta \varepsilon _{k^{\prime }+\frac{1}{2}} \int _{-1}^{1} d\mu ^{\prime } \left[ 1 - f\left( \mu ^{\prime }, \varepsilon _{k^{\prime }+\frac{1}{2}}\right) \right] \int _{0}^{2\pi } d\beta ^{\prime } } \nonumber \\&\qquad {\displaystyle \times \frac{1}{ \varepsilon _{k+\frac{1}{2}}^{3} \varDelta \varepsilon _{k+\frac{1}{2}} } \int _{\varepsilon _{k}}^{\varepsilon _{k+1}} \varepsilon ^{3} d\varepsilon \frac{1}{ \varepsilon _{k^{\prime }+\frac{1}{2}}^{2} \varDelta \varepsilon _{k^{\prime }+\frac{1}{2}} } \int _{\varepsilon _{k^{\prime }}}^{\varepsilon _{k^{\prime }+1}} \varepsilon ^{{\prime }{2}} d\varepsilon ^{\prime } R_{{\mathrm{NNS}}}^{\mathrm{out}}(\varepsilon , \varepsilon ^{\prime }, \cos \theta ), }\nonumber \\ \end{aligned}$$
(285)

which we rewrite as

$$\begin{aligned} \displaystyle \left. \frac{\partial f\left( \mu , \varepsilon _{k+\frac{1}{2}}\right) }{\partial t } \right| _{\mathrm{scat}}= & {} \left[ 1 - f\left( \mu , \varepsilon _{k+\frac{1}{2}}\right) \right] \frac{1}{(hc)^{3}} \sum _{k^{\prime }=1}^{N} \varepsilon _{k^{\prime }+\frac{1}{2}}^{2} \varDelta \varepsilon _{k^{\prime }+\frac{1}{2}} \nonumber \\&{\displaystyle \times \int _{-1}^{1} d\mu ^{\prime } f\left( \mu ^{\prime }, \varepsilon _{k^{\prime }+\frac{1}{2}}\right) \int _{0}^{2\pi } d\beta ^{\prime } \langle R_{\mathrm{NNS}}^{{\mathrm{in}}}(\varepsilon , \varepsilon ^{\prime }, \cos \theta ) \rangle _{E} } \nonumber \\&{\displaystyle - f\left( \mu , \varepsilon _{k+\frac{1}{2}}\right) \frac{1}{(hc)^{3}} \sum _{k^{\prime }=1}^{N} \varepsilon _{k^{\prime }+\frac{1}{2}}^{2} \varDelta \varepsilon _{k^{\prime }+\frac{1}{2}} } \nonumber \\&{\displaystyle \times \int _{-1}^{1} d\mu ^{\prime } [1 - f(\mu ^{\prime }, \varepsilon _{k^{\prime }+\frac{1}{2}}) ] \int _{0}^{2\pi } d\beta ^{\prime } \langle R_{\mathrm{NNS}}^{\mathrm{out}}(\varepsilon , \varepsilon ^{\prime }, \cos \theta ) \rangle _{E}, }\nonumber \\ \end{aligned}$$
(286)

where

$$\begin{aligned}&{\displaystyle \langle R_{{\mathrm{NNS}}}^{\mathrm{in/out}}(\varepsilon , \varepsilon ^{\prime }, \cos \theta ) \rangle _{E} } \nonumber \\&\quad {\displaystyle \equiv \frac{1}{ \varepsilon _{k+\frac{1}{2}}^{3} \varDelta \varepsilon _{k+\frac{1}{2}} } \int _{\varepsilon _{k}}^{\varepsilon _{k+1}} \varepsilon ^{3} d\varepsilon \frac{1}{ \varepsilon _{k^{\prime }+\frac{1}{2}}^{2} \varDelta \varepsilon _{k^{\prime }+\frac{1}{2}} } \int _{\varepsilon _{k^{\prime }}}^{\varepsilon _{k^{\prime }+1}} \varepsilon ^{{\prime }{2}} d\varepsilon ^{\prime } R_{{\mathrm{NNS}}}^{\mathrm{in/out}}(\varepsilon , \varepsilon ^{\prime }, \cos \theta ). }\nonumber \\ \end{aligned}$$
(287)

With the scattering kernel defined as in Eq. (287) in the collision term of the Boltzmann equation, the energy transfer between the neutrinos and the nucleons resulting from the many neutrino–nucleon scattering events is captured accurately, despite the fact that the energy exchange per scattering is much less than a typical energy zone width.

6.1.5 Axisymmetry

The first implementation of multi-angle, multi-frequency neutrino transport in the context of spatially two-dimensional, axisymmetric core-collapse supernova models was achieved by Ott et al. (2008). Their implementation was based on the neutrino transport solver developed by Livne et al. (2004) for the neutrino specific intensity (I), whose evolution is given by the following equation:

$$\begin{aligned} \frac{D I}{Dt}+\varOmega \cdot \nabla I + \sigma I = S. \end{aligned}$$
(288)

Here, D/Dt is the Lagrangian time derivative, \(\varOmega \) is the unit vector in the direction of neutrino propagation, whose components are \((\cos \theta _{p},\sin \theta _{p}\cos \phi _{p},\sin \theta _{p}\sin \phi _{p})\), where \(\theta _{p}\) and \(\phi _{p}\) are spherical momentum-space coordinates defined relative to the outward radial direction, \(\sigma \) is the total absorption cross section, including absorption and scattering, and S is the total emissivity, including emission and scattering.

Equation (288) is temporally discretized fully implicitly. The phase space discretization is handled as follows. Space—i.e., radius and angle—is discretized using a conservative difference scheme. Momentum space—i.e., the space comprising the two dimensions corresponding to the angles of the neutrino’s direction of propagation, \(\theta _{p}\) and \(\phi _{p}\), and the dimension corresponding to the neutrino’s energy, \(\varepsilon _{\nu }\), is discretized as follows. The discrete ordinates method is used for the momentum-space dimensions. Further details of the discretization of Eq. (288) have not yet been provided.

6.1.6 Three spatial dimensions

The journey down what will no doubt be a long road toward the implementation of general relativistic, three-dimensional Boltzmann neutrino transport in the context of core-collapse supernovae was begun by Sumiyoshi and Yamada (2012). With core-collapse supernovae in mind, they began by solving the conservative form of the Boltzmann equation for three-dimensional, static stellar core configurations:

$$\begin{aligned}&\frac{1}{c}\frac{\partial f}{\partial t} + \frac{\mu }{r^{2}} \frac{\partial }{\partial r} (r^{2} f) + \frac{\sqrt{1-\mu ^{2}}\,\mathrm{cos}\,\phi _{p}}{r \mathrm{sin}\,\theta } \frac{\partial }{\partial \theta } (\mathrm{sin}\,\theta f) + \frac{\sqrt{1-\mu ^{2}}\,\mathrm{sin}\,\phi _{p}}{r \mathrm{sin}\,\theta } \frac{\partial f}{\partial \phi } \nonumber \\&\qquad + \frac{1}{r} \frac{\partial }{\partial \mu } [(1-\mu ^{2}) f] - \frac{\sqrt{1-\mu ^{2}}}{r} \frac{\mathrm{cos}\,\theta }{\mathrm{sin}\,\theta } \frac{\partial }{\partial \phi _{p}} (\mathrm{sin}\,\phi _{p} f) = \left[ \frac{1}{c} \frac{\delta f}{\delta t} \right] _{{\mathrm{collision}}}. \end{aligned}$$
(289)

In light of the use of spherical polar coordinates, there are terms that correspond to advection in momentum space even in a static medium in flat spacetime—i.e., even in the absence of special and general relativistic effects. For example, as a neutrino propagates, its direction cosine, \(\mu \equiv \cos \theta _{p}\), which is defined relative to the outwardly pointing radial basis vector, will necessarily change. This is described by the fourth term on the left-hand side of Eq. (289). This is not a geometric effect, as spacetime is flat in this case. Rather, it is a coordinate effect. The last term on the left-hand side of the same equation has a similar origin and interpretation. Given the assumption of a static medium and flat spacetime, no other terms appear on the left-hand side, which would describe special and general relativistic effects were they considered.

The discretization of Eq. (289) follows and extends that used in Mezzacappa and Bruenn (1993a)—i.e., finite differencing in space and energy, and discrete ordinates in neutrino propagation angles. For the second term on the left-hand side of Eq. (289), corresponding to radial advection of neutrinos, Sumiyoshi and Yamada use the following discretization:

$$\begin{aligned} \left[ \frac{\mu }{r^{2}} \frac{\partial }{\partial r} (r^{2} f) \right] = \left[ \mu \frac{\partial }{\partial (r^3 / 3)} (r^{2} f) \right] = {\mu }_{j} \, \frac{3}{r_{I}^{3} - r_{I-1}^{3}} \, ( r_{I}^{2} \, f_{I} - r_{I-1}^{2} \, f_{I-1} ), \end{aligned}$$
(290)

where, in their notation, \(f_{I-1}\) and \(f_{I}\) are the neutrino distributions at the cell interfaces of the i-th zone. The quantities \({\mu }_{j} f_{I}\) at the cell boundaries are defined by

$$\begin{aligned} {\mu }_{j} f_{I} = \frac{ {\mu }_{j} - | {\mu }_{j} | }{2} \{ ( 1 - \beta _{I} ) f_{i} + \beta _{I} f_{i+1} \} + \frac{ {\mu }_{j} + | {\mu }_{j} | }{2} \{ \beta _{I} f_{i} + ( 1 - \beta _{I} ) f_{i+1} \}, \end{aligned}$$
(291)

and \(\beta _{I}\) is

$$\begin{aligned} \beta _{I} = 1 - \frac{1}{2} \frac{\alpha \varDelta r_{I} / \lambda _{I}}{1 + \alpha \varDelta r_{I} / \lambda _{I}}. \end{aligned}$$
(292)

In the diffusion (free-streaming) limit, \(\beta _{I}=1/2 (1)\). The advection in \(\mu = \cos \theta _{p}\) is discretized as

$$\begin{aligned} \left[ \frac{1}{r} \frac{\partial }{\partial \mu } [\,(1-\mu ^{2}) f\,] \right] = \frac{3}{2} \, \frac{r_{I}^{2} - r_{I-1}^{2}}{r_{I}^{3} - r_{I-1}^{3}} \, \frac{1}{{d \mu }_{j}} \, \left[ (1-{\mu }^2)_{J} f_{J} - (1-{\mu }^2)_{J-1} f_{J-1} \right] . \end{aligned}$$
(293)

Upwind differencing is implemented, and \(f_{J} = f_{j}\). \(\theta \)-advection is first reexpressed and then discretized as

$$\begin{aligned}&\left[ \frac{\sqrt{1-\mu ^{2}}\,\mathrm{cos}\,\phi _{p}}{r \mathrm{sin}\,\theta } \frac{\partial }{\partial \theta } (\mathrm{sin}\,\theta f) \right] = \left[ - \frac{\sqrt{1-\mu ^{2}}\,\mathrm{cos}\,\phi _{p}}{r } \frac{\partial }{\partial \mu } [\,(1-\mu ^{2})^{\frac{1}{2}} f\,] \right] \nonumber \\&\quad = - \frac{3}{2} \, \frac{r_{I_r}^{2} - r_{I_r-1}^{2}}{r_{I_r}^{3} - r_{I_r-1}^{3}} (1-{\mu }_{j_{\theta }}^{2})^{\frac{1}{2}} {\cos \phi _{p}}_{j_{\phi }}\, \frac{1}{d \mu _{i_{\theta }}} \, \left[ (1-{\mu }^2)^{\frac{1}{2}}_{I_{\theta }} f_{I_{\theta }} - (1-{\mu }^2)^{\frac{1}{2}}_{I_{\theta }-1} f_{I_{\theta }-1} \right] . \nonumber \\ \end{aligned}$$
(294)

The factor, \((1-{\mu }_{j_{\theta }}^{2})^{\frac{1}{2}} {\cos \phi _{p}}_{j_{\phi }}\), determines the direction of advection and the evaluation of \(f_{I_{\theta }}\) at the cell interface. Given the sign of \(\cos \phi _{p}\), \(f_{I_{\theta }}\) is determined by

$$\begin{aligned} {\cos \phi _{p}}_{j_{\phi }} f_{I_{\theta }}= & {} \frac{ {\cos \phi _{p}}_{j_{\phi }} + | {\cos \phi _{p}}_{j_{\phi }} | }{2} \{ ( 1 - \beta _{I_{\theta }} ) f_{i_{\theta }} + \beta _{I_{\theta }} f_{i_{\theta }+1} \} \nonumber \\&+ \frac{ {\cos \phi _{p}}_{j_{\phi }} - | {\cos \phi _{p}}_{j_{\phi }} | }{2} \{ \beta _{I_{\theta }} f_{i_{\theta }} + ( 1 - \beta _{I_{\theta }} ) f_{i_{\theta }+1} \}. \end{aligned}$$
(295)

As before, \(\beta _{I_{\theta }}\) takes on values between \(\frac{1}{2}\) (diffusion limit) and 1 (free-streaming limit) and is defined in the same way as \(\beta _{I}\), using instead the angular zone widths and mean free paths. \(\phi _{p}\) advection is discretized as

$$\begin{aligned}&\left[ - \frac{\sqrt{1-\mu ^{2}}}{r} \frac{\mathrm{cos}\,\theta }{\mathrm{sin}\,\theta } \frac{\partial }{\partial \phi _{p}} (\mathrm{sin}\,\phi _{p} f) \right] = \left[ - \frac{\sqrt{1-\mu ^{2}}}{r} \frac{\mu }{\sqrt{1-\mu ^{2}}} \frac{\partial }{\partial \phi _{p}} (\mathrm{sin}\,\phi _{p} f) \right] \nonumber \\&\quad =- \frac{3}{2} \, \frac{r_{I_r}^{2} - r_{I_r-1}^{2}}{r_{I_r}^{3} - r_{I_r-1}^{3}} (1-{\mu }_{j_{\theta }}^{2})^{\frac{1}{2}} \, \frac{\mu _{i_{\theta }}}{(1-{\mu _{i_{\theta }}^{2})^{\frac{1}{2}}}} \, \frac{1}{{d \phi _{p}}_{j_{\phi }}} \left[ (\sin \phi _{p})_{J_{\phi }} f_{J_{\phi }} - (\sin \phi _{p})_{J_{\phi }-1} f_{J_{\phi }-1} \right] . \nonumber \\ \end{aligned}$$
(296)

In this case, the sign of \(\mu _{i_{\theta }} (\sin \phi _{p})_{J_{\phi }}\) determines the direction of advection. Upwind differencing is used to determine \(f_{J_{\phi }}\) at the cell interface. \(f_{J_{\phi }}\) is given by

$$\begin{aligned} \mu _{i_{\theta }} (\sin \phi _{p})_{J_{\phi }} f_{J_{\phi }}= & {} \frac{ \mu _{i_{\theta }} (\sin \phi _{p})_{J_{\phi }} + | \mu _{i_{\theta }} (\sin \phi _{p})_{J_{\phi }} | }{2} f_{j_{\phi }+1} \nonumber \\&+ \frac{ \mu _{i_{\theta }} (\sin \phi _{p})_{J_{\phi }} - | \mu _{i_{\theta }} (\sin \phi _{p})_{J_{\phi }} | }{2} f_{j_{\phi }}. \end{aligned}$$
(297)

Last but not least, \(\phi \) advection is discretized as follows

$$\begin{aligned} \left[ \frac{\sqrt{1-\mu ^{2}}\,\mathrm{sin}\,\phi _{p}}{r \mathrm{sin}\,\theta } \frac{\partial f}{\partial \phi } \right]= & {} \left[ \frac{\sqrt{1-\mu ^{2}}\,\mathrm{sin}\,\phi _{p}}{r \sqrt{1-\mu ^{2}}} \frac{\partial f}{\partial \phi } \right] \nonumber \\= & {} \frac{3}{2} \, \frac{r_{I_r}^{2} - r_{I_r-1}^{2}}{r_{I_r}^{3} - r_{I_r-1}^{3}} (1-{\mu }_{j_{\theta }}^{2})^{\frac{1}{2}} \, \frac{{\sin \phi _{p}}_{j_{\phi }}}{(1-{\mu _{i_{\theta }}^{2})^{\frac{1}{2}}}} \, \frac{1}{{d \phi }_{i_{\phi }}} \left[ f_{I_{\phi }} - f_{I_{\phi }-1} \right] .\nonumber \\ \end{aligned}$$
(298)

Given the sign of \({\sin \phi _{p}}_{j_{\phi }}\) and, therefore, the advection direction, \(f_{I_{\phi }}\) is given by

$$\begin{aligned} {\sin \phi _{p}}_{j_{\phi }} f_{I_{\phi }}= & {} \frac{ {\sin \phi _{p}}_{j_{\phi }} + | {\sin \phi _{p}}_{j_{\phi }} | }{2} \{ \beta _{I_{\phi }} f_{i_{\phi }} + (1 - \beta _{I_{\phi }}) f_{i_{\phi }+1}\} \nonumber \\&+ \frac{ {\sin \phi _{p}}_{j_{\phi }} - | {\sin \phi _{p}}_{j_{\phi }} | }{2} \{ (1 - \beta _{I_{\phi }}) f_{i_{\phi }} + \beta _{I_{\phi }} f_{i_{\phi }+1}\}. \end{aligned}$$
(299)

\(\beta _{I_{\phi }}\) is determined in the same way as its counterparts in the radial and \(\theta \) directions, using the appropriate angular zone widths and mean free paths.

Focusing on the temporal discretization, the phase-space discretizations spelled out in Eqs. (290) through (299) are assembled and evaluated in a fully implicit manner, as shown schematically below (i.e., the phase-space discretizations themselves are not inserted; each term is represented by its continuum counterpart):

$$\begin{aligned}&\frac{1}{c}\frac{f_{i}^{n+1} - f_{i}^{n}}{\varDelta t} + \left[ \frac{\mu }{r^{2}} \frac{\partial }{\partial r} (r^{2} f) \right] ^{n+1} + \left[ \frac{\sqrt{1-\mu ^{2}}\,\mathrm{cos}\,\phi _{p}}{r \mathrm{sin}\,\theta } \frac{\partial }{\partial \theta } (\mathrm{sin}\,\theta f) \right] ^{n+1} \nonumber \\&\qquad + \left[ \frac{\sqrt{1-\mu ^{2}}\,\mathrm{sin}\,\phi _{p}}{r \mathrm{sin}\,\theta } \frac{\partial f}{\partial \phi } \right] ^{n+1} + \left[ \frac{1}{r} \frac{\partial }{\partial \mu } [(1-\mu ^{2}) f] \right] ^{n+1} \nonumber \\&\qquad + \left[ - \frac{\sqrt{1-\mu ^{2}}}{r} \frac{\mathrm{cos}\,\theta }{\mathrm{sin}\,\theta } \frac{\partial }{\partial \phi _{p}} (\mathrm{sin}\,\phi _{p} f) \right] ^{n+1} = \left[ \frac{1}{c} \frac{\delta f}{\delta t} \right] _{{\mathrm{collision}}}^{n+1}, \end{aligned}$$
(300)

where n designates the current time slice. The left-hand side of Eq. (300) is linear in the distribution function, but the right-hand side is not. Consequently, as in the spherically symmetric case, both sides of Eq. (300) are linearized in f. (In this case, Sumiyoshi and Yamada are working with a hydrostatic and thermally frozen stellar core profile. As a result, linearizations in \(\varepsilon \) and \(Y_e\) are not necessary.) This gives rise to a linear system of equations for \(\delta f_i\). To solve the combination of the outer nonlinear system of equations and the corresponding inner linear system of equations, Sumiyoshi and Yamada implement a Newton–Krylov approach—specifically, they implement Newton–BiCGSTAB, with point-Jacobi preconditioning.

The extension of these lepton-number conservative methods to the special relativistic case was documented by Nagakura et al. (2014). They deployed novel momentum-space gridding based on three considerations: (1) The invariant emissivity and opacity, which together define an invariant collision term on the right-hand side of the Boltzmann equation, can be computed in either the inertial, Eulerian frame or the inertial frame of an observer instantaneously comoving with the stellar core fluid. The value obtained in both cases would be the same if the neutrino angles and energies used in either case were related by the Lorentz transformation between the two inertial frames. (2) The Lorentz transformations of angles and energies between the Eulerian and comoving frames decouple—i.e., one is free to define one’s energy grid in either of the two frames independently of one’s angular grids, allowing the choices that would simplify the numerics while respecting the physics. (3) The dominant opacity during stellar core collapse stems from coherent, isoenergetic scattering on nuclei—i.e., any novel gridding should be constructed with this opacity in mind.

In Nagakura et al.’s notation, the invariance of the collision term can be expressed as

$$\begin{aligned} \varepsilon ^{{\mathrm{lb}}}\Big ( \frac{\delta f}{\delta t} \Big )_{\mathrm{col}}^{{\mathrm{lb}}} = \varepsilon ^{{\mathrm{fr}}}\Big ( \frac{\delta f}{\delta {\tilde{t}}} \Big )_{\mathrm{col}}^{{\mathrm{fr}}}. \end{aligned}$$
(301)

where \(t ({\tilde{t}})\) is the Eulerian (comoving) frame time and where the labels \({\mathrm{lb (fr)}}\) correspond to the Eulerian (comoving) frames. The equality in Eq. (301) is to be understood as follows: If one evaluates the left-hand side at a particular neutrino angle and energy as measured by the Eulerian observer, the equality is guaranteed provided the righ-hand side is evaluated at the corresponding Lorentz transformed neutrino angle and energy, which would be the angle and energy measured by the comoving observer. The neutrino energies in the two frames, \(\varepsilon ^{{{\mathrm{lb}}}}\) and \(\varepsilon ^{{\mathrm{fr}}}\), are related by

$$\begin{aligned} \varepsilon ^{{\mathrm{fr}}} = \varepsilon ^{{\mathrm{lb}}} \gamma (1 - {\mathbf {n}}^{{\mathrm{lb}}} \cdot {\mathbf {v}}) , \end{aligned}$$
(302)

where \(\gamma \) is the Lorentz factor, \({\mathbf {n}}^{{\mathrm{lb}}}\) is the neutrino propagation direction as measured in the Eulerian frame, and \({\mathbf {v}}\) is the fluid velocity in the same frame. The unit neutrino propagation direction vectors in the two frames are related by

$$\begin{aligned} \varepsilon ^{{\mathrm{fr}}} {\mathbf {n}}^{{\mathrm{fr}}} = \varepsilon ^{{\mathrm{lb}}} \left[ {\mathbf {n}}^{{\mathrm{lb}}} + \left( - \gamma + (\gamma - 1) \frac{{\mathbf {n}}^{{\mathrm{lb}}} \cdot {\mathbf {v}}}{v^2}\right) {\mathbf {v}}\right] , \end{aligned}$$
(303)

where \({\mathbf {n}}^{{\mathrm{fr}}}\) denotes the unit neutrino propagation direction vector in the comoving frame.

Figure 14 from Nagakura et al. shows two momentum-space grids associated with momentum-space spherical coordinates. The grid on the left corresponds to a choice of uniform gridding in both angle and energy in the Eulerian frame. (Uniform gridding is typically not used for either, but for simplicity Nagakura et al. consider this case to illustrate the essential features of their approach.) The grid on the right corresponds to the Lorentz-transformed Eulerian grid—i.e., the counterpart grid in the comoving frame. This grid is no longer uniform in either angle or energy. On the comoving-frame grid, an isoenergetic scattering event, wherein the neutrino’s angle changes but its energy does not, would necessitate an interpolation in energy given the fact the energy grid is not uniform in angle. The number (typically \(\sim \)20) of energy “groups” used in most core-collapse supernova simulations is low, and to make matters worse, the groups are typically spaced exponentially, with coarser resolution at higher energies. Interpolation on such grids is problematic for these reasons and for the conservation of neutrino (lepton) number. To overcome these difficulties, Nagakura et al. use the independence of the Lorentz transformation for neutrino angles and energy and choose a hybrid-grid approach. They introduce the Lagrangian Remap Grid (LRG) for the Eulerian observer, which is shown on the left-hand side of Fig. 15, which is the primary grid used in their work. On the LRG, the angular grid is uniform but the energy grid is not. The energy grid on the LRG is the Lorentz transform of the energy grid on the right-hand side of the same figure, which corresponds to the comoving-frame observer’s energy grid, which is uniform. Of course, by virtue of the Lorentz transformation and the fact that the angular grid is uniform in the Eulerian frame, the angular grid in the comoving frame cannot be uniform. This presents no difficulties in their approach, so Nagakura et al. opt for the simplicity of the uniform angular grid on the LRG, their primary grid. The evaluation of the collision term on the LRG, which is how the collision term is evaluated in Nagakura et al.’s approach to the discretization and solution of the Boltzmann equation, is the same as its evaluation on the comoving-frame grid, given the invariance of the collision term for such Lorentz-transform-related grids. Since the latter energy grid is uniform across angles, no interpolation in energy is required in evaluating, for example, isoenergetic scattering. The Lorentz transformation between the two grids is spatially and temporally dependent, so the LRG must be continually redefined as the evolution proceeds, but the comoving-frame grid does not change. As the LRG evolves, a conservative remapping procedure is used to remap the neutrino distributions on the previous LRG to the new LRG.

Fig. 14
figure 14

The left panel shows a schematic of uniform momentum-space angular and energy grids in the Laboratory frame. Constant-energy grid lines are represented by concentric circles (Nagakura et al. 2014). Constant angles are indicated by radial lines. The right panel shows the corresponding contours and lines in the comoving frame. Also added (dotted line) is a constant comoving-frame neutrino energy contour

Fig. 15
figure 15

The left panel shows the Lagrangian Remapping Grid (LRG) used by Nagakura et al. (2014) in their Boltzmann transport implementation. On the LRG, the neutrino angular grid is uniform, but the energy grid corresponds to a uniform energy grid in the comoving frame, shown in the right panel by concentric circles. The two energy grids are related by a Lorentz transformation. Given that the angular grid is uniform in the Laboratory frame, the corresponding angular grid in the comoving frame is not uniform. The angular grids, too, are related by a Lorentz transformation between the frames

With all of the above in mind, and focusing on isoenergetic scattering, the right-hand side of the Boltzmann equation, Eq. (289), is evaluated on the LRG as

$$\begin{aligned}&\left( \frac{\delta f}{\delta t}\right) _{{\mathrm{collision}}} \nonumber \\&\quad = \gamma \left( 1-{\mathbf {n}}^{{\mathrm{lb}}}\cdot {\mathbf {v}}\right) \left( \frac{\delta f}{\delta {\tilde{t}}}\right) _{{\mathrm{collision}}} \nonumber \\&\quad = \gamma \left( 1-{\mathbf {n}}^{{\mathrm{lb}}}\cdot {\mathbf {v}}\right) \left[ \frac{-(\varepsilon ^{{\mathrm{fr}}})^2}{(2\pi )^3} \int d{{\varOmega }^{\prime }}^{{\mathrm{fr}}} R^{{\mathrm{fr}}}(\varOmega ^{{\mathrm{fr}}},{\varOmega ^{\prime }}^{{\mathrm{fr}}}) [f^{{\mathrm{fr}}}(\varepsilon ^{{\mathrm{fr}}},\varOmega ^{{\mathrm{fr}}})-f^{\mathrm{fr}}(\varepsilon ^{{\mathrm{fr}}},{\varOmega ^{\prime }}^{{\mathrm{fr}}})]\right] \nonumber \\&\quad = \gamma \left( 1-{\mathbf {n}}^{{\mathrm{lb}}}\cdot {\mathbf {v}}\right) [ \frac{-[\varepsilon ^{{\mathrm{fr}}}(\varepsilon ^{\mathrm{lb}})]^2}{(2\pi )^3}\int d{{\varOmega }^{\prime }}^{\mathrm{lb}}\frac{d{{\varOmega }^{\prime }}^{{\mathrm{fr}}}}{d{{\varOmega }^{\prime }}^{{\mathrm{lb}}}} R^{\mathrm{lb}}[\varOmega ^{{\mathrm{fr}}}(\varOmega ^{{\mathrm{lb}}}),{\varOmega ^{\prime }}^{\mathrm{fr}}({\varOmega ^{\prime }}^{{\mathrm{lb}}})] \nonumber \\&\qquad \times \{f^{{\mathrm{lb}}}[\varepsilon ^{{\mathrm{fr}}}(\varepsilon ^{\mathrm{lb}}),\varOmega ^{{\mathrm{fr}}}(\varOmega ^{{\mathrm{lb}}})]-f^{{\mathrm{lb}}}[\varepsilon ^{\mathrm{fr}}(\varepsilon ^{{\mathrm{lb}}}),{\varOmega ^{\prime }}^{{\mathrm{fr}}}({\varOmega ^{\prime }}^{\mathrm{lb}})]\} ] \nonumber \\&\quad = \gamma \left( 1-{\mathbf {n}}^{{\mathrm{lb}}}\cdot {\mathbf {v}}\right) [ \frac{-[\varepsilon ^{{\mathrm{fr}}}(\varepsilon ^{\mathrm{lb}})]^2}{(2\pi )^3}\int d{{\varOmega }^{\prime }}^{\mathrm{lb}}\frac{d{{\varOmega }^{\prime }}^{{\mathrm{fr}}}}{d{{\varOmega }^{\prime }}^{{\mathrm{lb}}}} R^{\mathrm{lb}}[\varOmega ^{{\mathrm{fr}}}(\varOmega ^{{\mathrm{lb}}}),{\varOmega ^{\prime }}^{\mathrm{fr}}({\varOmega ^{\prime }}^{{\mathrm{lb}}})] \nonumber \\&\qquad \times \{ f^{{\mathrm{lb}}}(\varepsilon ^{{\mathrm{lb}}},\varOmega ^{\mathrm{lb}})-f^{{\mathrm{lb}}}(\varepsilon ^{{\mathrm{lb}}},{\varOmega ^{\prime }}^{{\mathrm{lb}}}) \} ] .\nonumber \\ \end{aligned}$$
(304)

The last equality follows from the invariance of the distribution function.

While the use of the LRG simplifies the evaluation of the collision term and avoids the need to introduce velocity-dependent angle and energy advection on the left-hand side of the Boltzmann equation, there is a cost: It complicates spatial advection. To overcome this inherited difficulty, Nagakura et al. invoke yet another grid, the Laboratory Fixed Grid (LFG). The LFG is like the grid depicted on the left-hand side of Fig. 14. It is the same for all Eulerian observers at different spatial locations and is constant in time. And, in Nagakura et al.’s implementation, it is of higher resolution in energy relative to the LRG. This is evident in Fig. 16.

Fig. 16
figure 16

In the left panel, energy zones on the LRG are shown for adjacent radial or angular grid points, designated here by y (Nagakura et al. 2014). In the right panel, the higher-resolution Laboratory Fixed Grid (LFG) is shown, superimposed on the LRG

Given the LFG, the treatment of spatial and angular advection occurs in the following steps: (1) Using the subgrid energy distribution, \(f_{{\mathrm{subgrid}}}\), the values of the distribution function, f, at the zone centers of the LFG grid are determined by \(f_{\mathrm{subgrid}}(\varepsilon _{{\mathrm{LFG}}_{{\mathrm{A}}^{\prime }},{\mathrm{B}}^{\prime },\ldots })\), where \(\varepsilon _{\mathrm{LFG}_{{\mathrm{A}}^{\prime },{\mathrm{B}}^{\prime },\ldots }}\) are the value of the energies corresponding to the zone centers on the LFG grid for zones \(\hbox {A}^{\prime }\), \(\hbox {B}^{\prime }, \ldots \), respectively. (For the example points selected here, the LFG energies are the same.) (2) Once the values of the distribution function are defined at the zone centers of the LFG, they can be used to define the spatial and angular fluxes on the LRG as follows. Consider Fig. 16. On the left-hand side of the figure, the LRG is shown. On the right, the LFG is overlaid on the LRG. Note, too, here we are considering advection in space and angle, denoted on the vertical axis by y to represent both. Let us consider LFG zones \(\hbox {A}^{\prime }\) and \(\hbox {B}^{\prime }\). The flux at the interface between these two zones is determined from the value of the distribution function there, which is determined by interpolating between the values of the distribution function at the \(\hbox {A}^{\prime }\) and \(\hbox {B}^{\prime }\) zone centers, as outlined by Sumiyoshi and Yamada (2012). (When invoking the LFG, this interpolation involves only two zones, not three as it would in the case of the LRG.) (3) Given the fluxes on the LFG, we are ready to define the fluxes that will be used on the LRG to update the distribution function in each of the LRG’s zones due to advection. Note that advection into (for example) LFG zone \(\hbox {B}^{\prime }\) from \(\hbox {A}^{\prime }\) involves advection into a single zone. However, it is easy to see from Fig. 16 that advection from \(\hbox {A}^{\prime }\) into \(\hbox {B}^{\prime }\) involves advection into two zones of the LRG: A and B. To divide the contribution of the LFG flux into \(\hbox {B}^{\prime }\) into LRG fluxes into zones A and B, we split the flux as follows:

$$\begin{aligned} F_{A^{\prime }|B^{\prime }}=\gamma F_{A^{\prime }|B^{\prime }} + (1-\gamma )F_{A^{\prime }|B^{\prime }}, \end{aligned}$$
(305)

where \(F_{A^{\prime }|B^{\prime }}\) is the LFG flux at the interface between LFG zones \(\hbox {A}^{\prime }\) and \(\hbox {B}^{\prime }\) and where

$$\begin{aligned} \gamma= & {} \frac{N_L}{N_L+N_R}, \end{aligned}$$
(306)
$$\begin{aligned} N_L= & {} |\varepsilon ^{3}_{AB}-\varepsilon ^{3}_{L}|f_A, \end{aligned}$$
(307)
$$\begin{aligned} N_R= & {} |\varepsilon ^{3}_{AB}-\varepsilon ^{3}_{R}|f_B, \end{aligned}$$
(308)

with \(\varepsilon _{AB}\) corresponding to the value of the energy at the interface of the LRG zones A and B and where \(\varepsilon _{L(R)}\) corresponds to the energy value associated with the left (right) boundary of the LFG zone \(\hbox {B}^{\prime }\). \(f_{A(B)}\) corresponds to the value of the distribution function on the LRG in zone A(B). In other words, the LFG flux at the interface of LFG zones \(\hbox {A}^{\prime }\) and \(\hbox {B}^{\prime }\) is split, according to the distribution-weighted energy volume, between LRG fluxes into zones A and B. Note that zone B, for example, has multiple LFG fluxes advecting into it. The total LRG flux for zone B would therefore be the sum of all of the relevant LFG fluxes into it determined in the manner described here. (4) Once the LRG interface fluxes are defined as in step 3, the spatial (or angular) advection on the LRG is carried out as outlined by Sumiyoshi and Yamada (2012).

Nagakura et al.’s novel method has been designed to conserve lepton number. A demonstration that it simultaneously conserves energy at an appropriate level remains to be demonstrated.

With regard to the temporal discretization with special relativistic effects included, Nagakura et al. use a semi-implicit method. This is necessitated by the fact that the methods outlined above for the treatment of advection on the LRG cannot be made fully implicit. With the temporal descretization alone in mind, the Boltzmann equation can be written as

$$\begin{aligned} \frac{f^{n+1} - f^{n}}{\varDelta t} = - F_{{\mathrm{adv}}}(f^{gs},f^{n+1}) + \Big ( \frac{\delta f}{\delta t} \Big )_{\mathrm{col}}^{{\mathrm{lb}}}(f^{n+1}), \end{aligned}$$
(309)

where

$$\begin{aligned}&F_{\mathrm{adv}}(f^{gs},f^{n+1}) = F^{SR}_{{\mathrm{adv}}}(f^{{\mathrm{gs}}}) + \kappa \Big ( F^{\mathrm{NR}}_{{\mathrm{adv}}}(f^{n+1}) - F^{\mathrm{NR}}_{{\mathrm{adv}}}(f^{{\mathrm{gs}}}) \Big ). \end{aligned}$$
(310)

The first term on the right-hand side of Eq. (310) is the advection term for the special relativistic case. It is evaluated explicitly at the value of the current iterate, \(f^{gs}\). The second two terms correspond to what the advection terms would be in the non-relativistic case, evaluated both implicitly and explicitly (at the current iterate), respectively. Together they represent a “correction” to the first term and are introduced for numerical stability. When \(f^{gs}\rightarrow f^{n+1}\), the second two terms cancel, and the right-hand side of Eq. (310) becomes \(F^{SR}_{adv}(f^{n+1})\), as desired. The parameter, \(\kappa \), is a limiter and prevents the correction from becoming larger than the first term, which Nagakura et al. note can happen when the fluid velocities become several tens of percent of the speed of light.

Given the solution of the distribution function and, in particular, the numerical determination of the collision term, the update to the matter electron fraction and stress–energy tensor (including both energy and momentum exchange) are computed as follows [see Eqs. (5), (6), (191) and (194)]:

$$\begin{aligned} T^{\mu \nu }_{ ,\nu }= & {} - G^{\mu }, \end{aligned}$$
(311)
$$\begin{aligned} N_{e ,\nu }^{\nu }= & {} - \varGamma , \end{aligned}$$
(312)

where

$$\begin{aligned} G^{\mu }\equiv & {} \sum _{\mathrm{i}} G_{\mathrm{i}}^{\mu }, \end{aligned}$$
(313)
$$\begin{aligned} G_{\mathrm{i}}^{\mu }\equiv & {} \int p_{\mathrm{i}}^{\mu } \Big ( \frac{\delta f}{\delta \tau } \Big )_{\mathrm{col}(\mathrm{i})} dV_p, \end{aligned}$$
(314)
$$\begin{aligned} \varGamma\equiv & {} \varGamma _{\nu _{e}} - \varGamma _{\bar{\nu _{e}}}, \end{aligned}$$
(315)
$$\begin{aligned} \varGamma _{i}\equiv & {} \int \Big ( \frac{\delta f}{\delta \tau } \Big )_{\mathrm{col}(\mathrm{i})} dV_p, \end{aligned}$$
(316)

and where, for Nagakura et al., \(N_{e}^{\nu }\) (our \(J_{e}^{\nu }\)) is the electron density current, \(dV_p\) (our \(\pi _m\)) is the invariant momentum-space volume element, and i indicates the neutrino species (Fig. 17).

Fig. 17
figure 17

Image reproduced with permission, copyright by AAS

In the top left panel, Nagakura et al. (2018) plot the \(r-\theta \) component of the Eddington tensor, \(k^{r\theta }\), at 190 ms after bounce in a core-collapse supernova simulation they performed with their Boltzmann neutrino transport solver, initiated from a progenitor of \(11.2\,M_\odot \). In the corresponding upper right panel, they plot the (absolute) difference between \(k^{r\theta }\) computed with both Boltzmann neutrino transport and two-moment neutrino transport with M1 closure. In both cases, \(k^{r\theta }\) is evaluated at the mean neutrino energy at each point of the spatial grid shown here. Nagakura et al. classify such absolute differences in the off-diagonal components of the Eddington tensor in their model as substantial, indicating that Boltzmann transport is needed to accurately compute the components of the neutrino Eddington tensor. In their model, \(k^{r\theta }\) was demonstrated to dictate the evolution of the lateral neutrino fluxes, not \(k^{\theta \theta }\), in the critical semitransparent regime

6.2 Boltzmann kinetics: spatial discontinuous Galerkin discretization plus spectral multigroup \(P_{N}\)

A numerical treatment of Boltzmann kinetics that implements a finite-element discretization—specifically, a Discontinuous Galerkin (DG) discretization—for the spatial degrees of freedom together with a spectral decomposition in momentum space was developed by Radice et al. (2013) for the Boltzmann equation:

$$\begin{aligned} p^\mu \frac{\partial F}{\partial x^{\mu }} = {\mathbb {C}}[F]. \end{aligned}$$
(317)

In this scheme, the distribution function, F, is first decomposed in momentum space as

$$\begin{aligned} F(x^\alpha , \nu , \varphi , \theta ) = \sum _{n=0}^{N_\nu } \sum _{\ell = 0}^{N} \sum _{m = -\ell }^\ell F^{n\ell m}(x^\alpha )\, \chi _{n}(\nu )\, Y_{\ell m}(\varphi ,\theta ), \end{aligned}$$
(318)

where the orthonormal basis functions in the energy dimension are defined by

$$\begin{aligned} \chi _n(\nu )&= \left\{ \begin{array}{ll} {1}/{\sqrt{V_n}}, &{}\quad {\text {if}}\;\nu \in [\nu _n, \nu _{n+1}], \\ 0, &{}\quad {\text {otherwise}}, \end{array}\right. ,&V_n&= \int _{\nu _n}^{\nu _{n+1}} h^3 \nu ^2\, \mathrm {d}\nu = \frac{h^3}{3} (\nu _{n+1}^3 - \nu _n^3). \end{aligned}$$
(319)

Using the orthonormality of the spherical harmonics and \(\chi _{n}(\nu )\), the coefficients in the momentum-space expansion of the distribution function, Eq. (318), are given by

$$\begin{aligned} F_{n \ell m}(x^\alpha ) = \int _{0}^{\infty } h^3 \nu ^2\, \mathrm {d}\nu \int _{{\mathscr {S}}_1} \mathrm {d}\varOmega \, F(x^\alpha , \nu , \varphi , \theta )\, Y_{\ell m}(\varphi , \theta ) \, \chi _{n}(\nu ). \end{aligned}$$
(320)

Radice et al. introduce the shorthand notation:

$$\begin{aligned} \varPsi _A(\nu , \varphi , \theta ) \equiv \chi _{n}(\nu )\, Y_{\ell m}(\varphi ,\theta ), \end{aligned}$$
(321)

and reexpress Eq. (318) as

$$\begin{aligned} F(x^\alpha , \varepsilon , \varphi , \theta ) = \sum _A F^A(x^\alpha ) \varPsi _A(\varepsilon , \varphi , \theta ) = F^A \varPsi _A. \end{aligned}$$
(322)

Inserting the expansion (322) into the Boltzmann equation (317) leads to a coupled system of equations for the expansion coefficients that must be solved to determine them as a function of time and space:

$$\begin{aligned} p^0 \frac{\partial F^B}{\partial t} \varPsi _B + p^k \frac{\partial F^B}{\partial x^k} \varPsi _B = {\mathbb {C}}[F]. \end{aligned}$$
(323)

Multiplying Eq. (323) by \(\varPsi ^A\) (in the notation of Radice et al., a superscript A indicates a complex conjugate), integrating over momentum space, and using the orthonormality of the basis functions \(\varPsi _A\) gives

$$\begin{aligned} \frac{\partial F^A}{\partial t} + {{{\mathscr {P}}}^{kA}}_{B} \frac{\partial F^B}{\partial x^k} = {\mathbb {S}}^A[F], \end{aligned}$$
(324)

where

$$\begin{aligned} {{{\mathscr {P}}}^{kA}}_{B} \equiv \int p^k\, \varPsi ^A\, \varPsi _B\, \mathrm {d}\varPi \, \end{aligned}$$
(325)

and

$$\begin{aligned} {\mathbb {S}}^A[F] \equiv \int {\mathbb {C}}[F]\, \varPsi ^A\, \mathrm {d}\varPi . \end{aligned}$$
(326)

In Eqs. (325) and (326), \(\mathrm {d}\varPi \) is the invariant momentum-space volume element. Once the expansion coefficients are obtained by solving Eq. (324), the solution to the original Boltzmann equation is given by Eq. (318).

Radice et al. illustrate their approach to solving Eq. (324) by considering the one-dimensional, collisionless case:

$$\begin{aligned} \frac{\partial F^A}{\partial t} + {{{\mathscr {P}}}^{1A}}_{B} \frac{\partial F^B}{\partial x} = 0. \end{aligned}$$
(327)

In a DG discretization in x, the distribution function is written as an expansion in Lagrange polynomials:

$$\begin{aligned} F^{A}(x,t)=\sum _{A}{F^A}_{i}(t)u(x), \end{aligned}$$
(328)

where

$$\begin{aligned} u(x) = u_{i-1/2} l_{i-1/2}(x) + u_{i+3/2} l_{i+3/2}(x), \end{aligned}$$
(329)

and where the Lagrange polynomials are defined by

$$\begin{aligned} l_{i-1/2}(x)&= 1 - \frac{x - x_{i-1/2}}{x_{i+3/2} - x_{i-1/2}},&l_{i+3/2}(x)&= \frac{x - x_{i-1/2}}{x_{i+3/2} - x_{i-1/2}}. \end{aligned}$$
(330)

Insertion of the expansion (328) in Eq. (327) yields the following set of coupled ordinary differential equations for the coefficients \({F^A}_{i}\):

$$\begin{aligned} \varDelta x \frac{\mathrm {d}{F^A}_i}{\mathrm {d}t} = {{{\mathbb {F}}}^A}_i, \end{aligned}$$
(331)

where the flux factors are given by

$$\begin{aligned} {{{\mathbb {F}}}^A}_i&\equiv \frac{3}{2} {\mathscr {F}}^- - \overline{{\mathscr {F}}} - \frac{1}{2} {\mathscr {F}}^+ ,&{{{\mathbb {F}}}^A}_{i+1}&\equiv \frac{1}{2}{\mathscr {F}}^- + \overline{{\mathscr {F}}} - \frac{3}{2} {\mathscr {F}}^+, \end{aligned}$$
(332)
$$\begin{aligned} \overline{{\mathscr {F}}}\equiv & {} \frac{1}{2} \Big [ \big ({{{\mathscr {P}}}^{1A}}_{B}\big )_i {F^B}_i + \big ({{{\mathscr {P}}}^{1A}}_{B}\big )_{i+1} {F^B}_{i+1}\Big ], \\ {\mathscr {F}}^-\equiv & {} \frac{1}{2} \big [ {{{\mathscr {P}}}^{1A}}_{B} \big ({F^B}_L + {F^B}_R\big ) - {{{\mathscr {R}}}^{1A}}_{C} \mathrm {max}(v, |{\varLambda ^{1C}}_{D}|) {{{\mathscr {L}}}^{1D}}_{B} \big ({F^B}_R - {F^B}_L\big ) \big ], \\ {{{\mathscr {P}}}^{1A}}_{B}= & {} {{{\mathscr {R}}}^{1A}}_{C} {\varLambda ^{1C}}_{D} {{{\mathscr {L}}}^{1D}}_{B}. \end{aligned}$$

In Eq. (332), \(\overline{{\mathscr {F}}}\) is the average flux; \({\mathscr {F}}^-\) is the flux computed at the boundary \(x_{i+1/2}\) of the \(i{\mathrm{th}}\) element through an exact solution of the Riemann problem with left and right states, \({F^B}_{L}\) and \({F^B}_{R}\), respectively; \({\mathscr {F}}^+\) is defined in the same way, at the boundary \(x_{i+3/2}\) ; \({{{\mathscr {R}}}^{1A}}_{C}\) is the matrix of right eigenvectors of \({{{\mathscr {P}}}^{1A}}_{B}\); \({{{\mathscr {L}}}^{1D}}_{B}\) is the matrix of left eigenvectors of \({{{\mathscr {P}}}^{1A}}_{B}\); \({\varLambda ^{1C}}_{D}\) is the matrix of eigenvalues of \({{{\mathscr {P}}}^{1A}}_{B}\); and v is a parameter taken to be the first abscissa of the adopted Legendre quadrature (this parameter is introduced by Radice et al. to dissipate numerically zero-speed modes). The three-dimensional extension of the scheme is given by constructing the flux factors in each of the three dimensions in the same way, which gives

$$\begin{aligned} \frac{\mathrm {d}{F^A}_{i,j,k}}{\mathrm {d}t} = {\mathbb {S}}^A[F] + \frac{1}{\varDelta x} {{{\mathbb {F}}}^A}_{i,j,k} + \frac{1}{\varDelta y} {{{\mathbb {G}}}^A}_{i,j,k} + \frac{1}{\varDelta z} {{{\mathbb {H}}}^A}_{i,j,k} . \end{aligned}$$
(333)

Now, focusing on the temporal discretization of Eq. (333) and using Radice et al.’s rewrite of the equation as

$$\begin{aligned} \frac{\mathrm {d}F^A}{\mathrm {d}t} = {\mathbb {S}}^A[F] + {\mathscr {A}}^A[F], \end{aligned}$$
(334)

the authors evolve the coefficients of the distribution function’s DG–spectral expansion, Eqs. (318) and (328), in a two-step, semi-implicit, asymptotic-preserving scheme (McClarren et al. 2008), staged as a predictor step,

$$\begin{aligned} \frac{{F^A}_{k+1/2} - {F^A}_{k}}{\varDelta t/2} = {{\mathscr {A}}}^{A}[F_k] + {\mathbb {S}}^A[F_{k+1/2}]\,, \end{aligned}$$
(335)

followed by a corrector step,

$$\begin{aligned} \frac{{F^A}_{k+1} - {F^A}_{k}}{\varDelta t} = {{\mathscr {A}}}^{A}[F_{k+1/2}] + {\mathbb {S}}^A[F_{k+1}]. \end{aligned}$$
(336)

Given that Radice et al. choose to use a partially spectral scheme, like all others deploying such schemes they had to contend with the Gibbs phenomenon. To do so, they were informed by the seminal work of McClarren and Hauck (2010), who developed a method, using filtering, to mitigate Gibbs phenomena in \(P_{N}\) schemes. Unfortunately, as pointed out by McLerran and Hauck and by Radice et al., the filtered \(P_{N}\) scheme does not have a unique continuum limit—i.e., it cannot be shown to be a discretization of a system of partial differential equations. In Radice et al.’s approach, the spherical harmonic expansion of the solution is filtered at each time step using a spherical-spline filter:

$$\begin{aligned} \big [{\mathscr {F}}(F)\big ](\varphi , \theta ) = \sum _{\ell =0}^N \sum _{m = -\ell }^\ell \big [\sigma \Big (\frac{\ell }{N+1}\Big )\big ]^s F^{\ell m} Y_{\ell m}(\varphi , \theta ), \end{aligned}$$
(337)

where \(\sigma (\eta )\) is a filter function of order p such that

$$\begin{aligned} \sigma (0)&= 1,&\sigma ^{(k)}(0) = 0, \ \text {for } k = 1,2, \ldots p-1, \end{aligned}$$
(338)

and where s is a strength parameter, which is chosen to be a function of the time step:

$$\begin{aligned} s=\beta \varDelta t, \end{aligned}$$
(339)

where \(\beta \) is a parameter. Radice et al. document success using a modified, second-order Lanczos filter:

$$\begin{aligned} \sigma (\eta )=\frac{\sin \eta }{\eta }. \end{aligned}$$
(340)

With the introduction of filtering, the time stepping algorithm, Eqs. (335) and (336), is modified as follows:

$$\begin{aligned} \frac{{F^A}_{*} - {F^A}_k}{\varDelta t/2}&= {{\mathscr {A}}^A}[F_k] + {{\mathbb {S}}^A}[F_{k+1/2}], \end{aligned}$$
(341)
$$\begin{aligned} {F^A}_{k+1/2}&= {{\mathscr {F}}^A}_{B} {F^B}_{*}, \end{aligned}$$
(342)
$$\begin{aligned} \frac{{F^A}_{**} - {F^A}_{k}}{\varDelta t}&= {{\mathscr {A}}^A}[F_{k+1/2}] + {{\mathbb {S}}^A}[F_{k+1}], \end{aligned}$$
(343)
$$\begin{aligned} {F^A}_{k+1}&= {{\mathscr {F}}^A}_{B} {F^B}_{**}, \end{aligned}$$
(344)

where \({{\mathscr {F}}^A}_{B}\) is a diagonal matrix that instantiates the filtering operation. Moreover, Radice et al. were able to show that their filtering method represents the first-order, operator-split discretization of a term added to the underlying system of partial differential equations, Eq. (324):

$$\begin{aligned} \frac{\partial F^A}{\partial t} + {{{\mathscr {P}}}^{kA}}_{B} \frac{\partial F^B}{\partial x^k} = e^A + {S^A}_{B} F^B + \beta {L^A}_{B} F^B, \end{aligned}$$
(345)

where \({L^A}_{B}\) is a diagonal matrix with coefficients \(\log \sigma (l/(N+1))\). That is, their filtering method is equivalent to the addition of a forward-scattering term [\(\sigma (0)=1\)] to Eq. (324), and their overall method is a unique discretization of an underlying system of coupled partial differential equations, Eq. (345).

While the filtering effectively mitigates the Gibbs phenomenon, the distribution function can still become negative, which is unphysical. To contend with negative distribution functions in the context of the filtered \(P_{N}\) scheme, Laiu and Hauck (2019) developed and analyzed so-called positivity limiters, which can be used to ensure positivity of distribution function in each step of a time integration scheme.

6.3 Boltzmann kinetics: spectral decomposition across phase space

Peres et al. (2014) opt for a fully spectral approach to the solution of the 3 + 1 general relativistic Boltzmann equation in the CFC approximation in non-conservative form:

$$\begin{aligned} \frac{1}{\alpha }\frac{\partial f}{\partial t} + \left( \frac{p^i}{\varPsi ^2 \varepsilon } - \frac{\beta ^i}{\alpha } \right) \frac{\partial f}{\partial x^i} - {\bar{\varGamma }}^j\,\!_{\mu \nu } p^\mu p^\nu J^i\,\!_j \frac{1}{\varepsilon }\frac{\partial f}{\partial p^i} = \frac{1}{\varepsilon }{\mathscr {C}}[f]. \end{aligned}$$
(346)

In this case, the 3 + 1 line element is

$$\begin{aligned} ds^2 = -\alpha ^2 dt^2 + \gamma _{{\tilde{i}}{\tilde{j}}}(dx^{{\tilde{i}}} + \beta ^{{\tilde{i}}} dt)(dx^{{\tilde{j}}} + \beta ^{{\tilde{j}}} dt), \end{aligned}$$
(347)

where the spatial geometry is assumed to be conformally flat—i.e.,

$$\begin{aligned} \gamma _{{\tilde{i}}{\tilde{j}}} = \varPsi ^4 f_{{\tilde{i}}{\tilde{j}}}. \end{aligned}$$
(348)

In Eq. (348), \(f_{{\tilde{i}}{\tilde{j}}}\) is the flat metric and \(\varPsi \) is the conformal factor,

$$\begin{aligned} \varPsi = \left( \frac{\det \gamma _{{\tilde{i}}{\tilde{j}}}}{\det f_{{\tilde{i}}{\tilde{j}}}} \right) ^{1/12}. \end{aligned}$$
(349)

In Eq. (346), \(p^{\mu }\) and \(\varepsilon \) correspond to the neutrino four-momenta and energy, respectively, measured by an Eulerian observer. \({\bar{\varGamma }}^j\,\!_{\mu \nu }\) are the Ricci rotation coefficients. Peres et al.’s choice of phase-space coordinates is motivated by the known challenge time derivatives present for spectral methods. Specifically, were comoving-frame four-momenta chosen instead, the coefficients of the advection terms on the left-hand side of Eq. (346) would contain time derivatives associated with, for example, relativistic Doppler shift. Of course, the collision term is best evaluated in the comoving frame, using comoving-frame four momenta, so the choice of Eulerian frame four-momenta necessitates additional work to treat collisions. Peres et al. leave the detailed treatment of this term to future publication. They also acknowledge the benefits of beginning instead with the conservative form of Eq. (346) and leave that to future publication, as well.

In their approach, the distribution function is written as an expansion in terms of the basis functions across all six dimensions of phase space—in this case, spherical coordinates in both space and momentum space:

$$\begin{aligned}&f \left( t, r, \theta , \phi , \varepsilon , \varTheta , \varPhi \right) \nonumber \\&\quad \simeq \sum _{i=0}^{n_r} \sum _{j=0}^{n_\theta } \sum _{k=0}^{n_\phi } \sum _{l=0}^{n_\varepsilon } \sum _{m=0}^{n_\varTheta } \sum _{p=0}^{n_\varPhi } C_{ijklmp}(t)\, T_i ({\bar{r}})\, F_j(\theta )\, F_k(\phi )\, T_l( {\bar{\varepsilon }} )\, T_m( {\bar{\varTheta }} )\, F_p(\varPhi ).\nonumber \\ \end{aligned}$$
(350)

Chebyshev basis functions are used for r, \(\varepsilon \), and \(\varTheta \)—i.e., for the expected non-periodic nature of the distribution function in these dimensions. Fourier basis functions are used for \(\theta \), \(\phi \), and \(\varPhi \)—i.e., for the expected periodic nature of the distribution function in these dimensions. Barred variables in Eq. (350) are in the range \([-1,1]\) and are related to the standard coordinates by affine transformations. In the case of the radial coordinate, the affine transformation is written explicitly as

$$\begin{aligned} r = \alpha _r {\bar{r}} + \beta _r, \qquad {\bar{r}} \in [-1,1], \end{aligned}$$
(351)

where \(\alpha _r\) and \(\beta _r\) are constants, with \(R_{\mathrm{min}} = \beta _r - \alpha _r\) and \(R_{\mathrm{max}} = \alpha _r + \beta _r\). \(R_\mathrm{min}\) and \(R_{\mathrm{max}}\) are the minimum and maximum radii of the spherical shell considered in the Peres et al. analysis, respectively. (The extension of their method to \(r=0\) is left for future development.) Ignoring the collision term in Eq. (346), it can be written in terms of the Liouville operator, \({\tilde{L}}[f]\), as

$$\begin{aligned} \frac{\partial f}{\partial t} = - {\tilde{L}}[f]. \end{aligned}$$
(352)

Substituting the expansion (350) into Eq. (352) results in a system of coupled ordinary differential equations for the solution vector, \(U_N(t)\), where \(N=n_r\times n_\theta \times n_\phi \times n_\varepsilon \times n_\varTheta \times n_\varPhi \). The elements of the solution vector are the coefficients \(C_{ijklmp}(t)\). Under this substitution, the operator, \({\tilde{L}}[f]\), in Eq. (352) becomes an \(N\times N\) matrix. To solve this system of equations, Peres et al. employ an explicit, third-order, Adams–Bashforth scheme,

$$\begin{aligned} U_N^{n+1} = U_N^n - \varDelta t \left( \frac{23}{12} {\tilde{L}}_N U_N^n - \frac{4}{3} {\tilde{L}}_N U_N^{n-1} + \frac{5}{12} {\tilde{L}}_N U_N^{n-2} \right) , \end{aligned}$$
(353)

though they emphasize they are not restricted to explicit updates but could also deploy semi-implicit and implicit methods.

6.4 Boltzmann kinetics: Monte Carlo methods

Up to now, we have focused on deterministic methods for the solution of the Boltzmann neutrino transport equations in core-collapse supernovae. But nondeterministic—specifically Monte Carlo—methods have also been used. Until recently, they have been confined to “snapshot” studies in a particular slice of an evolving stellar core and have been used most extensively as a gauge of the accuracy of deterministic, but approximate, methods. Although it has yet to be used in the context of a core-collapse supernova simulation as the method of choice for treating time-dependent neutrino transport, a lepton-number and energy conserving Monte Carlo scheme for such transport has been developed by Abdikamalov et al. (2012) for the O(v/c) limit of special relativistic effects and Newtonian gravity.

In their paper, Abdikamalov et al. illustrate their method assuming spherical symmetry. They begin with the equation for the neutrino intensity for each neutrino species, here written generically without a species label:

$$\begin{aligned}&\frac{1}{c}\frac{\partial I (r,\mu ,\varepsilon ,t)}{\partial t} + \frac{\partial I(r,\mu ,\varepsilon ,t)}{\partial r} + \frac{1-\mu ^2}{r} \frac{\partial I (r,\mu ,\varepsilon ,t)}{\partial \mu } \nonumber \\&\quad = \kappa _a(\varepsilon ,T) \left[ B(\varepsilon ,T) - I (x,\mu ,\varepsilon ,t)\right] - \kappa _s(\varepsilon ,T) I (r,\mu ,\varepsilon ,t) \nonumber \\&\qquad + 2 \pi \int _{-1}^{+1} \int _0^\infty \varkappa _s(\varepsilon ^{\prime },\mu ^{\prime } \rightarrow \varepsilon ,\ \mu ) I(x,\mu ^{\prime },\varepsilon ^{\prime },t) d\mu ^{\prime } d\varepsilon ^{\prime } . \end{aligned}$$
(354)

The first term term on the right-hand side is the familiar term for emission and absorption of neutrinos. \(\kappa _a(s)\) is the total absorption (scattering) opacity. The last term describes the additional source of neutrino as a result of inscattering into the neutrino “beam” with direction \(\mu \) and energy \(\varepsilon \). Equation (354) is solved using the boundary conditions:

$$\begin{aligned} I (R,\mu ,\varepsilon ,0) = I_R (\mu ,\varepsilon ,t), \quad -1\le \mu \le 0 . \end{aligned}$$
(355)

In their set of evolution equations, Eq. (354) is coupled to the material energy equation and the equation for the evolution of the electron fraction:

$$\begin{aligned} \rho \frac{d U_m}{d t}= & {} 2 \pi \sum _i \int _{-1}^1 \int _0^\infty \kappa _{ai} (I_i - B_i) \, d\mu d\varepsilon \nonumber \\&+ \sum _i S_i \, , \end{aligned}$$
(356)
$$\begin{aligned} \rho N_A \frac{d Y_e}{d t}= & {} 2\pi \sum _i s_i \int _{-1}^1 \int _0^\infty \frac{\kappa _{ai}}{\varepsilon } (I_i - B_i) \, d\mu d\varepsilon . \end{aligned}$$
(357)

The sum over i is over neutrino species, which will be dropped in what follows. \(s_i=+1, -1, 0\) for electron neutrinos, electron antineutrinos, and heavy-flavor neutrinos, respectively, and will be carried through the remaining presentation of the method. In Eq. (356), S is the contribution to the material energy from energy-exchanging scattering with neutrinos and is given by

$$\begin{aligned} S= & {} (2\pi )^2 \int _0^\infty \!\!\!\! \int _0^\infty \!\!\! \int _{-1}^1 \int _{-1}^1 \!\! \big [\frac{\varepsilon }{\varepsilon ^{\prime }} \varkappa _s(\varepsilon ^{\prime }, \mu ^{\prime } \!\!\rightarrow \varepsilon ,\mu ) I (x,\mu ^{\prime },\varepsilon ^{\prime },t) \nonumber \\&- \varkappa _s(\varepsilon , \mu \rightarrow \varepsilon ^{\prime }, \mu ^{\prime }) I (x,\mu ,\varepsilon ,t) \big ] d\varepsilon d\varepsilon ^{\prime } d\mu d\mu ^{\prime } . \end{aligned}$$
(358)

Abdikamalov et al. introduce the additional quantities:

$$\begin{aligned} {U_r}= & {} \frac{4\pi }{c} \int _0^\infty B d \varepsilon , \end{aligned}$$
(359)
$$\begin{aligned} b= & {} \frac{B}{4\pi \int _0^\infty B d\varepsilon } , \end{aligned}$$
(360)
$$\begin{aligned} \kappa _p= & {} \frac{\int _0^\infty \kappa _a B d\varepsilon }{\int _0^\infty B d\varepsilon } , \end{aligned}$$
(361)
$$\begin{aligned} \chi _a= & {} \frac{\kappa _{a}}{\varepsilon }, \end{aligned}$$
(362)
$$\begin{aligned} \chi _p= & {} \frac{\int _0^\infty \chi _a B d\varepsilon }{\int _0^\infty B d\varepsilon } , \end{aligned}$$
(363)

where \(U_r\) is the equilibrium neutrino energy density and \(\kappa _p\) is the Planck-mean opacity. The evolution equation for \(U_r\) is related to the evolution equations for \(U_m\) and \(Y_e\) by

$$\begin{aligned} \frac{d {U_r}}{d t} = \beta \left( \rho \frac{d U_m}{dt} \right) + \zeta \left( \rho N_A \frac{dY_e}{dt} \right) , \end{aligned}$$
(364)

where

$$\begin{aligned} \beta = \frac{1}{\rho C_V} \left( \frac{\partial {U_r}}{\partial T}\right) _{\rho ,Ye} \, \end{aligned}$$
(365)

and

$$\begin{aligned} \zeta = \frac{1}{\rho N_A} \left[ \left( \frac{\partial {U_r}}{\partial Y_e} \right) _{\rho ,T} -\frac{1}{C_V} \left( \frac{\partial U_m}{\partial Y_e} \right) _{\rho ,T} \left( \frac{\partial {U_r}}{\partial T} \right) _{\rho ,Ye} \right] . \end{aligned}$$
(366)

In Eqs. (364) through (366), \(N_A\) is Avogadro’s Number and \(C_V\) is the material heat capacity.

As with deterministic methods, the first step in the solution of Eqs. (354), (356), and (357) is to linearize them. As part of this linearization procedure, Abdikamalov also ensure that these three evolution equations become decoupled. The first step in the linearization process involves approximating \(\{\kappa _a, \kappa _p,\kappa _s, \varkappa _s, b, \chi _a, \chi _p, \beta , \zeta \}\) with \(\{{{\tilde{\kappa }}}_a,{{\tilde{\kappa }}}_p, {{\tilde{\kappa }}}_s, {{\tilde{\varkappa }}}_s, {{\tilde{b}}}, {{\tilde{\chi }}}_a, {{\tilde{\chi }}}_p, {{\tilde{\beta }}}, {{\tilde{\zeta }}}\}\). Abdikamalov define the latter as the time-centered values of the former within the time interval \(t_n\le t\le t_{n+1}\). In practice, they are chosen at the initial time step: \(t_n\). Given this linearization, Eqs. (354), (356), (357), and (364) become:

$$\begin{aligned}&\frac{1}{c} \frac{\partial I(\mu ,\varepsilon )}{\partial t} + \mu \frac{\partial I(\mu ,\varepsilon )}{\partial r} + \frac{1-\mu ^2}{r} \frac{\partial I(\mu ,\varepsilon )}{\partial \mu } \nonumber \\&\quad = c {{\tilde{\kappa }}}_{a} {{\tilde{b}}} {U_r} - ({{\tilde{\kappa }}}_a+{{\tilde{\kappa }}}_s) I(\mu ,\varepsilon ) \nonumber \\&\qquad + 2 \pi \int _{-1}^{+1} \int _0^\infty {{\tilde{\varkappa }}}_s(\varepsilon ^{\prime },\mu ^{\prime } \rightarrow \varepsilon ,\ \mu ) I(\mu ^{\prime },\varepsilon ^{\prime }) d\mu ^{\prime } d\varepsilon ^{\prime } , \end{aligned}$$
(367)
$$\begin{aligned} \rho \frac{d U_m}{d t}= & {} 2\pi \int _{-1}^1 \int _0^\infty \tilde{\kappa }_{a} I d\mu d\varepsilon - c {{\tilde{\kappa }}}_{p} {U_r} + S, \end{aligned}$$
(368)
$$\begin{aligned} \rho N_A \frac{d Y_e}{d t}= & {} 2\pi s_i \int _{-1}^1 \int _0^\infty {\tilde{\chi }}_{a} I \, d\mu d \varepsilon - c s_i {\tilde{\chi }}_{p} {U_r} , \end{aligned}$$
(369)
$$\begin{aligned} \frac{d {U_r}}{d t}= & {} 2\pi \int _{-1}^1 \int _0^\infty \tilde{\gamma }I d \, \mu d \varepsilon - c {{\tilde{\gamma }}}_{p} {U_r} + {{\tilde{\beta }}} S , \end{aligned}$$
(370)

where

$$\begin{aligned} {{\tilde{\gamma }}}= & {} {{\tilde{\beta }}} {{\tilde{\kappa }}}_{a} + {{\tilde{\zeta }}} s_i {\tilde{\chi }}_{a} , \end{aligned}$$
(371)
$$\begin{aligned} {{\tilde{\gamma }}}_{p}= & {} {{\tilde{\beta }}} {{\tilde{\kappa }}}_{p} + {{\tilde{\zeta }}} s_i {\tilde{\chi }}_{p} . \end{aligned}$$
(372)

It is understood in Eqs. (368) and (370) that \(\varkappa _s\) is replaced by \({{\tilde{\varkappa }}}_s\). Abdikamalov et al. then time average Eq. (370) and use

$$\begin{aligned} {\bar{U}}_r = \alpha U_{r,n+1} + (1 - \alpha ) U_{r,n}^* , \end{aligned}$$
(373)

where, as pointed out by Abdikamalov et al., \(\alpha \) controls the degree to which the method is implicit, and where

$$\begin{aligned} U_{r,n}^* = U_{r,n}+{{\tilde{\beta }}}\varDelta t_n{{\bar{S}}} \end{aligned}$$
(374)

and

$$\begin{aligned} {{\bar{S}}} = \frac{1}{\varDelta t_n} \int _{t_n}^{t_{n+1}} S(t) dt , \end{aligned}$$
(375)

to obtain

$$\begin{aligned} {\bar{U}}_r = f_{n} U_{r,n}^* + 2\pi \frac{1-f_{n}}{c \tilde{\gamma }_{p}} \int _{-1}^1 \int _0^\infty {{\tilde{\gamma }}} {\bar{I}} \, d\mu d\varepsilon , \end{aligned}$$
(376)

where

$$\begin{aligned} U_{r,n}^* = U_{r,n}+{{\tilde{\beta }}}\varDelta t_n{{\bar{S}}} \end{aligned}$$
(377)

and

$$\begin{aligned} f_{n} = \frac{1}{1 + \alpha c \varDelta t_n {{\tilde{\gamma }}}_{p}} . \end{aligned}$$
(378)

Abdikamalov et al. now assume that \({\bar{U}}=U_r(t)\) and \({\bar{I}}=I(t)\) in Eq. (376) and use the resultant equation to substitute for \(U_r\) in Eq. (367), to obtain their final equation for the evolution of the neutrino intensity:

$$\begin{aligned} \frac{1}{c} \frac{\partial I}{\partial t} + \mu \frac{\partial I}{\partial r} + \frac{1-\mu ^2}{r} \frac{\partial I}{\partial \mu }= & {} \tilde{\kappa }_{ea} c \tilde{b} U_{r,n}^* \nonumber \\&- {{\tilde{\kappa }}}_{ea} I + {{\tilde{\kappa }}}_{es,e} I + \tilde{\kappa }_{es,l} I + {{\tilde{\kappa }}}_s I \nonumber \\&+ 2\pi \frac{{{\tilde{\kappa }}}_a {{\tilde{b}}}}{{{\tilde{\kappa }}}_p} \int _{-1}^1 \int _0^\infty {{\tilde{\kappa }}}_{es,e} I \, d\mu d \varepsilon + 2\pi \frac{{{\tilde{\kappa }}}_a {{\tilde{b}}}}{{{\tilde{\chi }}}_p} \int _{-1}^1 \int _0^\infty {{\tilde{\chi }}}_{es,l} I \, d\mu d \varepsilon \nonumber \\&+ 2 \pi \int _{-1}^{+1} \int _0^\infty {{\tilde{\varkappa }}}_s(\varepsilon ^{\prime },\mu ^{\prime } \rightarrow \varepsilon ,\ \mu ) I(\mu ^{\prime },\varepsilon ^{\prime }) d\mu ^{\prime } d\varepsilon ^{\prime } , \end{aligned}$$
(379)

where

$$\begin{aligned} \kappa _{ea}= & {} f_n \kappa _a , \end{aligned}$$
(380)
$$\begin{aligned} \kappa _{es,e}= & {} (1-f_n) \frac{{{\tilde{\beta }}} {{\tilde{\kappa }}}_p}{\tilde{\gamma }_p} \kappa _a , \end{aligned}$$
(381)
$$\begin{aligned} \kappa _{es,l}= & {} (1-f_n) \frac{{{\tilde{\zeta }}} s_i {{\tilde{\chi }}}_p}{\tilde{\gamma }_p} \kappa _a \, , \end{aligned}$$
(382)
$$\begin{aligned} \chi _{es,e}= & {} (1-f_n) \frac{{{\tilde{\beta }}} {{\tilde{\kappa }}}_p}{\tilde{\gamma }_p} \chi _a , \end{aligned}$$
(383)
$$\begin{aligned} \chi _{es,l}= & {} (1-f_n) \frac{{{\tilde{\zeta }}} s_i {{\tilde{\chi }}}_p}{\tilde{\gamma }_p} \chi _a . \end{aligned}$$
(384)

A similar procedure can be used to derive equations for the updates of \(U_m\) and \(Y_e\), as was performed for \(U_r\). Abdikamalov et al. point out that care must be taken to use the same expression for \(U_r\)—specifically, Eq. (376) with \({\bar{U}}_r=U_r(t)\) and \({\bar{I}}=I(t)\)—in the derivation of the equation for \(U_m\) in order to guarantee conservation of energy, to arrive at

$$\begin{aligned} U_{m,n+1}= & {} U_{m,n} + \frac{\varDelta t_n}{\rho } \big \{ 2\pi \int _{-1}^1 \int _0^\infty {{\tilde{\kappa }}}_{ea} {{\bar{I}}} \, d \mu d \varepsilon \nonumber \\&-c f_{n} {{\tilde{\kappa }}}_{p} U_{r,n} + 2\pi \int _{-1}^1 \int _0^\infty {{\tilde{\kappa }}}_{es,l} {{\bar{I}}} \, d \mu d \varepsilon \nonumber \\&- 2\pi \frac{{{\tilde{\kappa }}}_p}{{{\tilde{\chi }}}_p} \int _{-1}^1 \int _0^\infty {{\tilde{\chi }}}_{es,l} {{\bar{I}}} \, d \mu d \varepsilon + {{\bar{S}}} \big \} \, \end{aligned}$$
(385)

and

$$\begin{aligned} Y_{e, n+1}= & {} Y_{e, n} + \frac{\varDelta t_n}{\rho N_A} \big \{ 2\pi s_i \int _{-1}^1 \int _0^\infty {{\tilde{\chi }}}_{ea} {{\bar{I}}} \, d \mu d \varepsilon \nonumber \\&-c s_i f_{n} {{\tilde{\chi }}}_{p} U_{r,n} + 2\pi s_i \int _{-1}^1 \int _0^\infty {{\tilde{\chi }}}_{es,e} {{\bar{I}}} \, d \mu d \varepsilon \nonumber \\&- 2\pi s_i \frac{{{\tilde{\chi }}}_p}{{{\tilde{\kappa }}}_p} \int _{-1}^1 \int _0^\infty {{\tilde{\kappa }}}_{es,e} {{\bar{I}}} \, d \mu d \varepsilon \big \}. \end{aligned}$$
(386)

Having linearized and decoupled the equations of motion, the evolution in Abdikamalov et al.’s Monte Carlo approach proceeds as follows: The weight associated with each Monte Carlo paricle (MCP) is the number of particles associated with it and is assumed to be \(N_0\). The number of particles emitted by the matter in the time interval \([t_n,t_n+1]\) is

$$\begin{aligned} {{{\mathscr {N}}}}_T = 8\pi ^2\int _{t_n}^{t_{n+1}}\int _0^R\int _0^\infty \frac{\kappa _a(\varepsilon , T) B(\varepsilon , T)}{\varepsilon } r^2 \, dt \, dr \, d\varepsilon . \end{aligned}$$
(387)

Then, the number of MCP’s emitted in this time interval is

$$\begin{aligned} N_T = \mathrm {RInt} \left( {{{\mathscr {N}}}}_T / N_0 \right) , \end{aligned}$$
(388)

where \(\mathrm {RInt}(x)\) returns the largest integer less than x. The particle energy in each MCP is chosen according to the functional form of \(\kappa B\). Since thermal emission is isotropic, the angle of propagation of each MCP emitted, \(\mu \), is chosen uniformly on the interval \([-1,+1]\) using

$$\begin{aligned} \mu = 2\xi -1 , \end{aligned}$$
(389)

where \(\xi \) is a random number that takes on values in the interval [0, 1]. Similarly, the emission time is chosen uniformly on the interval \([t_n,t_{n+1}]\) using

$$\begin{aligned} t = t_n + (t_{n+1} - t_n) \xi . \end{aligned}$$
(390)

To choose the zone in which an MCP is emitted, Abdikamalov et al. use the probability that the MCP is emitted in a particular zone, which is given by the total number of particles emitted in that particular zone divided by the total number of particles emitted across all zones. Once an MCP is emitted in a particular zone, its location (assuming spherical symmetry) within that zone is determined using

$$\begin{aligned} r=\left[ r_{j-1/2}^3 + \left( r_{j+1/2} - r_{j-1/2} \right) ^3\xi \right] ^{1/3} . \end{aligned}$$
(391)

where j is the zone index. The number of MCPs entering from the outer boundary of the domain, at radius R, during the interval \([t_n,t_{n+1}]\) is given by

$$\begin{aligned} N_B = \mathrm {RInt} \left[ - \frac{8\pi ^2 R^2}{N_0} \int _{t_n}^{t_{n+1}}\int _0^\infty \int _{-1}^0 \frac{\mu I_R(\mu ,\varepsilon ,t)}{\varepsilon } \, dt \, d\varepsilon \, d\mu \right] . \end{aligned}$$
(392)

The number of MCPs present at the beginning of the interval is

$$\begin{aligned} N_{IC} = \mathrm {RInt} \left[ \frac{8\pi ^2}{cN_0} \int _0^R \int _{-1}^1 \int _0^\infty I_i (r, \mu , \varepsilon ) r^2 \, dr \, d\mu \, d\varepsilon \right] , \end{aligned}$$
(393)

where the spatial zone, propagation angle, and energy of each MCP is chosen randomly using the functional form of I.

During transport, an emitted MCP will either (1) travel within the zone without collision and remain in the zone, (2) encounter a collision within the zone, or (3) exit the zone. These three possibilities correspond to three different distances, given by

$$\begin{aligned} d_b= & {} \left\{ \begin{array}{ll} \left| \left[ r_{j+1/2}^2-r^2(1-\mu ^2)\right] ^{1/2} - r \mu \right| , &{}\quad {\mathrm {if}} \; j=1 \ \mathrm {or} \mu >0, \ \sin \theta \ge \frac{R_{j-1/2}}{r}\\ \\ \left| \left[ r_{j-1/2}^2-r^2(1-\mu ^2)\right] ^{1/2} + r \mu \right| , &{} \mathrm {if} \ \mu< 0 , \ \sin \theta < \frac{R_{j-1/2}}{r} , \\ \end{array} \right. \end{aligned}$$
(394)
$$\begin{aligned} d_t= & {} c (t_{n+1} - t) , \end{aligned}$$
(395)

and

$$\begin{aligned} d_c = - \frac{\ln \xi }{\kappa _a + \kappa _s} . \end{aligned}$$
(396)

In Eqs. (394), (395), and (396), \(d_b\), \(d_t\), and \(d_c\) are the distance to the boundary of the zone, the distance the particle can travel in the time interval if it does not encounter a collision, and the distance between collisions, respectively. Once these distances are known, the MCP is moved to the location corresponding to the smallest of the three distances, and to the associated time, according to

$$\begin{aligned} r\rightarrow & {} \sqrt{r^2 - 2 r d \mu + d^2} , \end{aligned}$$
(397)
$$\begin{aligned} t\rightarrow & {} t + d / c . \end{aligned}$$
(398)

If \(d=d_c\), the MCP is either absorbed or scattered. To determine which, Abdikamalov et al. use the following probabilities corresponding to the absorption and scattering coefficients appearing in Eq. (379), the equation governing the MCP transport:

$$\begin{aligned} P_{ea}= & {} \kappa _{ea}/(\kappa _e+\kappa _s) , \end{aligned}$$
(399)
$$\begin{aligned} P_s= & {} \kappa _s/(\kappa _e+\kappa _s) , \end{aligned}$$
(400)
$$\begin{aligned} P_{es,e}= & {} \kappa _{es,e}/(\kappa _e+\kappa _s) , \end{aligned}$$
(401)
$$\begin{aligned} P_{es,l}= & {} \kappa _{es,l}/(\kappa _e+\kappa _s) . \end{aligned}$$
(402)

The sum of all of these probabilities is, of course, equal to 1. As a result, to determine which of the above interactions takes place, Abdikamalov et al. sample a random number \(\xi \) in the range [0, 1]. Based on the value of \(\xi \): (1) if \(\xi < P_{ea}\), the MCP undergoes effective absorption, (2) if \(P_{ea}< \xi < P_{ea} + P_s\), the MCP is scattered, (3) if \(P_{ea} + P_s< \xi < P_{ea} + P_s + P_{es,e}\), the MCP undergoes effective scattering in which its total energy is conserved, and (4) if \(\xi > P_{ea} + P_s + P_{es,e}\), the MCP undergoes effective scattering in which its total lepton number is conserved. Within the domain [0, 1], the subdomain corresponding to each of the above possibilities is proportional to the probability for each possibility to occur, which ensures that the selection procedure yields a statistically correct result. And their result does not depend on the order in which they consider the possibilities. If the MCP is absorbed, its energy and lepton number are deposited in the zone and it is removed from the population of MCPs. If the MCP undergoes real scattering, it is moved to the location where the scattering occurs. For iso-energetic scattering, its angle is determined randomly using Eq. (389). If its energy changes as well, its new energy is determined by randomly sampling the functional form of the scattering kernel in energy. If the MCP undergoes effective scattering, which is isotropic, the MCP’s angle is again determined randomly using Eq. (389) and its energy is determined by randomly sampling the local emissivity spectrum since effective scattering mimics absorption and reemission. If \(d=d_b\) and the boundary is the zone boundary, the transport sampling process begins again, using the values of the opacities in the new zone. If the boundary is the outer boundary, the MCP is removed from the population of MCPs. Finally, if \(d=d_t\), the MCP is stored for the next time step. The above procedure is conducted for all of the MCP’s in the computational domain (i.e., in all zones) at the beginning of a time step.

For the case of a non-static medium, the comoving and Eulerian frames are no longer coincident and an extension of the Monte Carlo procedure outlined above is necessary. Abdikamalov et al. extend their approach as follows: The emissivities and opacities are naturally computed in the comoving frame. Once calculated, the number of MCPs emitted in this frame in each cell is determined. Assuming spherical symmetry for simplicity, the location, \(r_0\), direction of propagation, \(\mu _0\), and energy, \(\varepsilon _0\) of each MCP emitted at \(t_0\) is sampled based on the comoving frame emissivities. Each of these quantities is then transformed to the Eulerian frame using the well-known transformations (reproduced here for the spherically symmetric case):

$$\begin{aligned} \varepsilon _0= & {} \gamma \varepsilon \left( 1 - \frac{V_r \mu }{c}\right) , \end{aligned}$$
(403)
$$\begin{aligned} \mu _0= & {} \frac{\mu -V_r/c}{1-\mu V_r/c} , \end{aligned}$$
(404)
$$\begin{aligned} \varphi _0= & {} \varphi , \end{aligned}$$
(405)
$$\begin{aligned} \kappa (\mu , \varepsilon )= & {} \frac{\varepsilon _0}{\varepsilon } \kappa _0 (\varepsilon _0) , \end{aligned}$$
(406)
$$\begin{aligned} r= & {} \gamma _j \left[ r_0 + V_{r,j} (t_0-t_n)\right] , \end{aligned}$$
(407)
$$\begin{aligned} t= & {} \gamma _j \left( t_0-t_n+ \frac{V_{r,j} r_0}{c^2}\right) . \end{aligned}$$
(408)

The index j in the last two equations is the index of the comoving-frame cell in which the MCP is emitted. (Of course, \(V_{r,j}\) is measured in the Eulerian frame.) Once these transformations are made, the MCP is transported in the Eulerian frame, as described in the static case. Note, however, the distance to collision must be determined using the Eulerian-frame values of the opacities. Most of the steps in the static case proceed in the same way, with the exception of scattering, which requires additional care. If the MCP scatters, Abdikamalov et al. transform the angle of propagation and the energy of the MCP into the comoving frame, determine a new comoving-frame angle and energy due to the scattering event, then transform this new set of momentum-space variables back into the Eulerian frame before the transport of the MCP proceeds. The amount of energy and momentum exchanged between the MCP and the matter during the scattering, determined in the comoving frame, is recorded.

One further addition to the method presented by Abdikamalov et al. that should be noted is the computational efficiency they gain by coupling their method to a Discrete Diffusion Monte Carlo (DDMC) method, first developed by Densmore et al. (2007) for photon transport and extended by Abdikamalov to neutrino transport. The latter method is used in diffusive regimes, where the original Monte Carlo method is plagued by the short distances between collisions: MCP paths between collisions become very short and the number of such paths that have to be simulated becomes prohibitively large. However, even with the coupling to DDMC, the Monte Carlo approach described here remains expensive and awaits future computing architectures that are more capable and well-suited to such an approach in order to be used for core-collapse supernova simulations.

6.5 Two-moment kinetics

Numerical methods for solving equations for two-moment kinetics in core-collapse supernovae have now been developed by multiple groups (Müller et al. 2010; O’Connor 2015; Just et al. 2015; Kuroda et al. 2016; Roberts et al. 2016; Skinner et al. 2019). There are as many variations in approach as there are groups. Here we focus on common features and highlight specific solutions. For example, some authors have adopted fully relativistic descriptions (e.g., Müller et al. 2010; O’Connor 2015; Kuroda et al. 2016; Roberts et al. 2016), while others have resorted to approximations that seek to capture relativistic effects (e.g., Just et al. 2015; Skinner et al. 2019). Current methods for solving the equations for neutrino-radiation hydrodynamics using the two-moment approach employ finite-volume or finite-difference type methods. To this end, the system of equations can be written in the compact form [cf. Eqs. (43)–(46), and Eqs. (109) and (111)]

$$\begin{aligned} \partial _{t}{}{\mathbf {U}} + \partial _{i}{{\mathbf {F}}^{i}({\mathbf {U}})} + \partial _{\varepsilon }{}\big (\,\varepsilon \,{\mathbf {F}}^{\varepsilon }({\mathbf {U}})\,\big ) ={\mathbf {S}}({\mathbf {U}}) + {\mathbf {C}}({\mathbf {U}}) , \end{aligned}$$
(409)

where the vector of evolved quantities is given by

$$\begin{aligned} {\mathbf {U}} =\sqrt{\gamma }\,\big (\,D,\,S_{j},\,\tau ,\,D\,Y_{e},\,\varepsilon ^{2}{\mathscr {E}}_{1},\,\varepsilon ^{2}{\mathscr {F}}_{1,j},\ldots ,\,\varepsilon ^{2}{\mathscr {E}}_{N_{{\textsc {Sp}}}},\,\varepsilon ^{2}{\mathscr {F}}_{N_{{\textsc {Sp}}},j}\,\big )^{T}. \end{aligned}$$
(410)

The spatial flux vectors \({\mathbf {F}}^{i}\), energy-space flux vector \({\mathbf {F}}^{\varepsilon }\) (zero for fluid variables), “geometry” sources \({\mathbf {S}}\), and the “collision” source due to neutrino–matter interactions \({\mathbf {C}}\) can be inferred from equations given in Sects. 4.5, 4.7, and 5.2. Here, as an example, we consider the Eulerian two-moment model described in Sect. 4.7.3 with \(N_{{\textsc {Sp}}}\) neutrino species. Note that for each neutrino species, each radiation moment is represented by \(N_{\varepsilon }\) degrees of freedom to represent the energy distribution of neutrinos, giving a total of \(4\times N_{\varepsilon }\times N_{{\textsc {Sp}}}\) radiation degrees of freedom (compared to 6 fluid degrees of freedom) per point in spacetime. In core-collapse supernova models, \(N_{\varepsilon }={\mathscr {O}}(20)\), while \(N_{{\textsc {Sp}}}=3-6\), resulting in 240–480 degrees of freedom per spacetime point.

Among the approaches to solve the system of equations given by Eq. (409) numerically, high-resolution shock-capturing (HRSC) methods (e.g., finite-volume or finite-difference), initially developed for compressible hydrodynamics with shocks, have attracted much attention recently. (For simplicity of presentation, we proceed to discuss the case of one spatial dimension.) In the HRSC approach, the spacetime is discretized into spacelike foliations of spacetime with discrete time coordinates \(\{\,t^{n}\,\}_{n=0}^{N_{t}}\), where the time step \(\varDelta t=t^{n+1}-t^{n}\) is the separation between foliations. On each foliation, spatial positions are assigned coordinates \(\{\,x_{j-\frac{1}{2}}\,\}_{j=1}^{N_{x}+1}\), separating \(N_{x}\) “cells” with width \(\varDelta x_{j}=(x_{j+\frac{1}{2}}-x_{j-\frac{1}{2}})\). In addition, for radiation quantities, momentum (energy) space is discretized into \(N_{\varepsilon }\) “energy bins” with edges \(\{\,\varepsilon _{i-\frac{1}{2}}\,\}_{i=1}^{N_{\varepsilon }+1}\) and bin widths \(\varDelta \varepsilon _{i}=(\varepsilon _{i+\frac{1}{2}}-\varepsilon _{i-\frac{1}{2}})\). Integration of Eq. (409) over the phase-space cell \(I_{ij}=I_{i}^{\varepsilon }\times I_{j}^{\varepsilon }\), where \(I_{i}^{\varepsilon }=(\varepsilon _{i-\frac{1}{2}},\varepsilon _{i+\frac{1}{2}})\) and \(I_{j}^{x}=(x_{j-\frac{1}{2}},x_{j+\frac{1}{2}})\), gives the semi-discretized system

$$\begin{aligned} \frac{d {\mathbf {U}}_{ij}}{d t} =-\frac{1}{\varDelta V_{ij}}\big (\,{\mathbf {F}}_{ij+\frac{1}{2}}^{x}-{\mathbf {F}}_{ij-\frac{1}{2}}^{x}\,\big ) -\frac{1}{\varDelta V_{ij}}\big (\,\varepsilon _{i+\frac{1}{2}}{\mathbf {F}}_{i+\frac{1}{2}j}^{\varepsilon }-\varepsilon _{i-\frac{1}{2}}{\mathbf {F}}_{i-\frac{1}{2}j}^{\varepsilon }\,\big ) +{\mathbf {S}}_{ij}+{\mathbf {C}}_{ij}, \end{aligned}$$
(411)

where the evolved quantities are the cell averages defined as

$$\begin{aligned} {\mathbf {U}}_{ij}(t) = \frac{1}{\varDelta V_{ij}}\int _{I_{ij}}{\mathbf {U}}(\varepsilon ,x,t)\,d\varepsilon \,dx \quad \text {and}\quad \varDelta V_{ij} = \int _{I_{ij}}\sqrt{\gamma }\,\varepsilon ^{2}d\varepsilon dx, \end{aligned}$$
(412)

with \({\mathbf {S}}_{ij}\) and \({\mathbf {C}}_{ij}\) defined analogously, and the fluxes defined as

$$\begin{aligned} {\mathbf {F}}_{ij\pm \frac{1}{2}}^{x}(t)&= \int _{I_{i}^{\varepsilon }}{\mathbf {F}}^{x}(\varepsilon ,x_{j\pm \frac{1}{2}},t)\,d\varepsilon , \end{aligned}$$
(413)
$$\begin{aligned} {\mathbf {F}}_{i\pm \frac{1}{2},j}^{\varepsilon }(t)&= \int _{I_{j}^{x}}{\mathbf {F}}^{\varepsilon }(\varepsilon _{i\pm \frac{1}{2}},x,t)\,dx. \end{aligned}$$
(414)

In Eq. (411), the temporal dimension has been left continuous (semi-discrete). Moreover, the equation is still exact. Approximations enter with the specification of the fluxes in Eqs. (413) and (414), and the integrals to evaluate the sources \({\mathbf {S}}_{ij}\) and \({\mathbf {C}}_{ij}\). These approximations ultimately result in phase-space discretization errors. With these specifications, the approximate system in Eq. (411) can be viewed as a system of ordinary differential equations (ODEs), which can be integrated forward in time with an ODE solver, which introduces temporal discretization errors. This discretization approach is called the method of lines (MOL).

6.5.1 Spatial discretization

The spatial fluxes in Eq. (413) can be approximated with an appropriate numerical flux:

$$\begin{aligned} {\mathbf {F}}_{ij+\frac{1}{2}}^{x}(t) \approx \varDelta \varepsilon _{i}\,\widehat{{\mathbf {F}}^{x}}\big ({\mathbf {U}}(\varepsilon _{i},x_{j+\frac{1}{2}}^{-},t),{\mathbf {U}}(\varepsilon _{i},x_{j+\frac{1}{2}}^{+},t)\big ), \end{aligned}$$
(415)

where \({\mathbf {U}}(\varepsilon _{i},x_{j+\frac{1}{2}}^{\pm },t)\) is an approximation of \({\mathbf {U}}\) to the immediate left and right of the cell interface located at \(x_{j+\frac{1}{2}}\) (\(x_{j+\frac{1}{2}}^{\pm }=\lim _{\delta \rightarrow 0^{+}}x_{j+\frac{1}{2}}\pm \delta \)). [In Eq. (415), the midpoint rule is used to approximate the integral, but a more accurate quadrature rule can be used if desired.] Two things must be defined when computing the interface fluxes: (1) the procedure to reconstruct the “left” and “right” states, and (2) the numerical flux function \(\widehat{{\mathbf {F}}^{x}}\). The reconstruction step for radiation variables is essentially identical to that used for hydrodynamics schemes: a polynomial of degree k is reconstructed from the evolved quantities (cell averages). To this end, the accuracy of the numerical method depends in part on the degree of the reconstructed polynomial, and the desired polynomial degree impacts the width of the computational stencil, since values in \(k+1\) cells are needed to reconstruct a polynomial of degree k. The most commonly used methods are monotonized piecewise linear (van Leer 1974; LeVeque 1992) and piecewise parabolic methods (Colella and Woodward 1984), as well as higher order monotonicity preserving (MP) (Suresh and Huynh 1997) and weighted essentially nonoscillatory (WENO) reconstruction methods (Liu et al. 1994; Shu 1998). Monotonicity constraints are placed on the reconstruction polynomial to ensure nonoscillatory solutions around discontinuities. For fluid variables, the numerical flux function can be computed with a standard Riemann solver; e.g., HLL (Harten et al. 1983) or HLLC (Toro et al. 1994). However, when using finite-volume or finite-difference methods to solve for the radiation moments, specification of the numerical flux requires additional care. As elucidated by the analysis in Audit et al. (2002) in the context of the \({\mathscr {O}}(v/c)\) limit of the energy integrated (gray) Lagrangian two-moment model presented in Sect. 4.7.3, in the asymptotic diffusion limit (characterized by a short neutrino mean free path) the inherent numerical dissipation associated with the numerical flux used for hyperbolic systems overwhelms the physical radiative diffusive flux and leads to spurious evolution unless the mean free path is resolved by the spatial grid. We discuss this important issue further below (see also Jin and Levermore 1996; Lowrie and Morel 2001, for discussions on this topic). Since it is not practical to resolve the neutrino mean free path in core-collapse supernova simulations, the numerical fluxes for the radiation moment equations are modified to better capture the evolution in diffusive regimes. Following Audit et al. (2002) and O’Connor and Ott (2013) propose the following modified HLL numerical fluxes for the two-moment model for neutrino transport (see also Kuroda et al. 2016):

$$\begin{aligned} \widehat{F_{{\mathscr {E}}_{s}}^{x}}\big ({\mathbf {U}}_{{\textsc {L}}},{\mathbf {U}}_{{\textsc {R}}}\big )&=\frac{\lambda ^{+}F_{{\mathscr {E}}_{s}}^{x}({\mathbf {U}}_{{\textsc {L}}})+\lambda ^{-}F_{{\mathscr {E}}_{s}}^{x}({\mathbf {U}}_{{\textsc {R}}})-\xi \lambda ^{-}\lambda ^{+}\big (({\mathscr {E}}_{s})_{{\textsc {R}}}-({\mathscr {E}}_{s})_{{\textsc {L}}}\big )}{\lambda ^{-}+\lambda ^{+}} , \end{aligned}$$
(416)
$$\begin{aligned} \widehat{F_{{\mathscr {S}}_{s,j}}^{x}}\big ({\mathbf {U}}_{{\textsc {L}}},{\mathbf {U}}_{{\textsc {R}}}\big )&=\frac{\xi ^{2}\big (\lambda ^{+}F_{{\mathscr {S}}_{s,j}}^{x}({\mathbf {U}}_{{\textsc {L}}})+\lambda ^{-}F_{{\mathscr {S}}_{s,j}}^{x}({\mathbf {U}}_{{\textsc {R}}})\big )-\xi \lambda ^{-}\lambda ^{+}\big (({\mathscr {S}}_{s,j})_{{\textsc {R}}}-({\mathscr {S}}_{s,j})_{{\textsc {L}}}\big )}{\lambda ^{-}+\lambda ^{+}} \nonumber \\&\quad +(1-\xi ^{2})\,\frac{1}{2}\,\big (\,F_{{\mathscr {S}}_{s,j}}^{x}({\mathbf {U}}_{{\textsc {L}}})+F_{{\mathscr {S}}_{s,j}}^{x}({\mathbf {U}}_{{\textsc {R}}})\,\big ), \end{aligned}$$
(417)

where \(F_{{\mathscr {E}}_{s}}^{x}\) and \(F_{{\mathscr {S}}_{s,j}}^{x}\) are the radiation energy and momentum spatial fluxes, respectively, and \(\lambda ^{-}\) and \(\lambda ^{+}\) are estimates of the largest (absolute) eigenvalues for left-going and right-going waves, respectively (see, e.g., Shibata et al. 2011, for explicit expressions of estimates). In the modified numerical fluxes in Eqs. (416) and (417), \(\xi \) is a local parameter depending on the ratio of the neutrino mean free path to the local grid size:

$$\begin{aligned} \xi = \min \big (1,\lambda _{ij}/\varDelta x_{j}\big ), \end{aligned}$$
(418)

where \(\lambda _{ij}\) is a local, energy-dependent neutrino mean free path (computed from the neutrino opacities). Thus, when the mean free path is much smaller than a grid cell (\(\xi \rightarrow 0\)), the numerical dissipation term (proportional to the jump in the conserved variables across the interface) vanishes, and the numerical flux switches to an average of the fluxes evaluated with the left and right states (a similar approach is also taken in Just et al. 2015; Skinner et al. 2019). It should be noted that the average flux is appropriate for solving parabolic equations, but is in general unstable for hyperbolic equations (e.g., LeVeque 1992).

To further illustrate the issue with the numerical flux, and to see how the modifications in Eqs. (416)–(417) help, it is easiest to consider the reduced system

$$\begin{aligned} \partial _{t}{{\mathscr {J}}}+\partial _{x}{{\mathscr {H}}}&= 0, \end{aligned}$$
(419)
$$\begin{aligned} \partial _{t}{{\mathscr {H}}}+\partial _{x}{{\mathscr {K}}}&=-\frac{1}{\lambda }\,{\mathscr {H}} , \end{aligned}$$
(420)

where

$$\begin{aligned} \big \{\,{\mathscr {J}},{\mathscr {H}},{\mathscr {K}}\,\big \}(x,t) = \frac{1}{2}\int _{-1}^{1}f(\mu ,x,t)\,\mu ^{\{0,1,2\}}\,d\mu , \end{aligned}$$
(421)

and \(\lambda \) is the scattering mean free path. When scattering events are frequent (\(\lambda \rightarrow 0\)), the system in Eqs. (419)–(420) limits to parabolic behavior governed by

$$\begin{aligned} \partial _{t}{{\mathscr {J}}}+\partial _{x}{{\mathscr {H}}}=0 \quad \text {and}\quad {\mathscr {H}}=-\frac{\lambda }{3}\partial _{x}{{\mathscr {J}}} \quad \Rightarrow \quad \partial _{t}{{\mathscr {J}}}-\frac{\lambda }{3}\,\partial _{xx}{\mathscr {J}}=0, \end{aligned}$$
(422)

which is referred to as the diffusion limit. The semi-discrete form of Eqs. (419)–(420) can be written as

$$\begin{aligned} d_{t}{\mathscr {J}}_{i}+\frac{1}{\varDelta x}\Big (\,\widehat{{\mathscr {H}}}_{i+\frac{1}{2}}-\widehat{{\mathscr {H}}}_{i-\frac{1}{2}}\,\Big )&=0, \end{aligned}$$
(423)
$$\begin{aligned} d_{t}{\mathscr {H}}_{i}+\frac{1}{\varDelta x}\Big (\,\widehat{{\mathscr {K}}}_{i+\frac{1}{2}}-\widehat{{\mathscr {K}}}_{i-\frac{1}{2}}\,\Big )&=-\frac{1}{\lambda }\,{\mathscr {H}}_{i}. \end{aligned}$$
(424)

With constant reconstruction, which results in first-order spatial accuracy, the numerical fluxes in Eqs. (416)–(417) at the \(x_{i+\frac{1}{2}}\) interface become

$$\begin{aligned} \widehat{{\mathscr {H}}}_{i+\frac{1}{2}}&=\frac{1}{2}\Big (\,{\mathscr {H}}_{i+1}+{\mathscr {H}}_{i}-\xi \,\big (\,{\mathscr {J}}_{i+1}-{\mathscr {J}}_{i}\,\big )\,\Big ), \end{aligned}$$
(425)
$$\begin{aligned} \widehat{{\mathscr {K}}}_{i+\frac{1}{2}}&=\frac{1}{2}\Big (\,{\mathscr {K}}_{i+1}+{\mathscr {K}}_{i}-\xi \,\big (\,{\mathscr {H}}_{i+1}-{\mathscr {H}}_{i}\,\big )\,\Big ), \end{aligned}$$
(426)

where for simplicity we set \(\lambda ^{+}=\lambda ^{-}=1\) (i.e., the global Lax-Friedrichs flux). By ignoring the time derivative term in Eq. (424) and using the numerical flux in Eq. (426) with \({\mathscr {K}}={\mathscr {J}}/3\), one can write

$$\begin{aligned} {\mathscr {H}}_{i}&= - \mathrm {Kn}\,\frac{1}{2}\,\Big (\,\frac{1}{3}\,\big (\,{\mathscr {J}}_{i+1}-{\mathscr {J}}_{i-1}\,\big )-\xi \,\big (\,{\mathscr {H}}_{i-1}-2\,{\mathscr {H}}_{i}+{\mathscr {H}}_{i+1}\,\big )\,\Big ), \nonumber \\&\approx - \mathrm {Kn}\,\frac{1}{2}\,\frac{1}{3}\,\big (\,{\mathscr {J}}_{i+1}-{\mathscr {J}}_{i-1}\,\big ), \end{aligned}$$
(427)

where we have introduced the Knudsen number \(\mathrm {Kn}=\lambda /\varDelta x\), the ratio of the mean free path to the spatial grid size. In Eq. (427), we ignored the numerical dissipation term because in the diffusion limit \(|{\mathscr {H}}|\ll {\mathscr {J}}\). Then, inserting the numerical flux, Eq. (425), using Eq. (427), into Eq. (423) gives the approximate semi-discrete form of Eq. (419) in the diffusion limit:

$$\begin{aligned}&d_{t}{\mathscr {J}}_{i} -\frac{1}{(2\varDelta x)^{2}} \Big [\, \frac{\lambda }{3}\Big ({\mathscr {J}}_{i-2}-2\,{\mathscr {J}}_{i}+{\mathscr {J}}_{i+2}\Big ) \nonumber \\&\qquad +\min (\lambda ,\varDelta x)\Big ({\mathscr {J}}_{i-1}-2\,{\mathscr {J}}_{i}+{\mathscr {J}}_{i+1}\Big ) \,\Big ]=0, \end{aligned}$$
(428)

which is an approximation to the diffusion equation in Eq. (422). Note that the last term on the left-hand side of Eq. (428) is due to the numerical dissipation term (proportional to \(\xi \)) in Eq. (425). Because of the introduction of \(\xi \) in Eq. (425), Eq. (428) remains a reasonable approximation to a diffusion equation with the correct diffusion coefficient \(\lambda /3\), even as \(\lambda \ll \varDelta x\). Without the modification to the numerical flux (i.e., \(\xi =1\) independent of \(\lambda \)), we would obtain Eq. (428) with \(\min (\lambda ,\varDelta x)\rightarrow \varDelta x\). In this case the numerical diffusion term would overwhelm the physical diffusion term when \(\lambda \ll \varDelta x\), and result in spurious evolution. Note, in this simplified discussion, where we assumed constant spatial reconstruction, the numerical dissipation term is of the same order of magnitude as the physical dissipation term, and still contributes to the diffusive evolution. With higher-order accurate spatial reconstruction, the relative contribution of this term decreases. Also note that in arriving at Eq. (428), we only relied on the modification to the numerical flux in the energy equation, as is done by Skinner et al. (2019). Finally, note that in the physical diffusion term in Eq. (428), the second derivative is approximated with a wide stencil, which supports a mode with odd-even point decoupling (Lowrie and Morel 2001).

6.5.2 Energy discretization

Next we consider the approximation of the energy fluxes in Eq. (414), which contribute to shifts in the neutrino energy spectrum due to gravitational and moving fluid effects. Müller et al. (2010), who solved the Lagrangian two-moment model in Sect. 4.7.3, developed a method to compute the energy fluxes that is inherently number conservative; i.e., with this discretization of the energy derivative, the energy equation in the Lagrangian two-moment model in Eq. (116) is consistent with the equation for number conservation in Eq. (123) at the discrete level. A key observation in achieving this is that the number conservation equation is obtained by multiplying the Lagrangian energy equation with a factor \(1/\varepsilon \). At the continuum level, when this factor is brought inside the energy derivative, the remainder cancels with the first term on the right-hand side of Eq. (116), resulting in the conservative number equation in Eq. (123). The relevant equation is given by considering only the energy derivative and the (non-collisional) source term in Eq. (116) [cf. Eq. (B1) in Müller et al. 2010]:

$$\begin{aligned} \partial _{t}{J}+\partial _{\varepsilon }{}\big (\,\varepsilon \,F_{J}\,\big )=F_{J}, \end{aligned}$$
(429)

where we introduce the shorthand notation

$$\begin{aligned} J = \sqrt{\gamma }\,\varepsilon ^{2}\,\big (\,W{\mathscr {J}}+v^{i}{\mathscr {H}}_{i}\,\big ) \quad \text {and}\quad F_{J} = - \alpha \,\sqrt{\gamma }\,\varepsilon ^{2}\,{\mathscr {T}}^{\mu \nu }\nabla _{\mu }u_{\nu }. \end{aligned}$$
(430)

Dividing Eq. (429) by \(\varepsilon \) gives the conservation equation:

$$\begin{aligned} \partial _{t}{N}+\partial _{\varepsilon }{}\big (\,F_{J}\,\big )=0, \end{aligned}$$
(431)

where \(N=J/\varepsilon \) is the spectral Eulerian number density [cf. Eq. (123)].

Similar to Eq. (411), the semi-discrete form of Eq. (429) can be written as

$$\begin{aligned} \frac{d J_{i}}{d t} =-\frac{1}{\varDelta \varepsilon _{i}}\big (\,\varepsilon _{i+\frac{1}{2}}{\widehat{F_{J}}}_{i+\frac{1}{2}}-\varepsilon _{i-\frac{1}{2}}{\widehat{F_{J}}}_{i-\frac{1}{2}}\,\big ) + {F_{J}}_{i}, \end{aligned}$$
(432)

where \({\widehat{F_{J}}}_{i\pm \frac{1}{2}}\) are the numerical flux functions to be determined. (Here we drop the spatial index j to simplify the notation.) Dividing Eq. (432) by \(\varepsilon _{i}\) and defining \(N_{i}=J_{i}/\varepsilon _{i}\) gives a provisionary semi-discrete form of Eq. (431):

$$\begin{aligned} \frac{d N_{i}}{d t}&=-\frac{1}{\varDelta \varepsilon _{i}} \big (\, \frac{\varepsilon _{i+\frac{1}{2}}}{\varepsilon _{i}}{\widehat{F_{J}}}_{i+\frac{1}{2}} -\frac{\varepsilon _{i-\frac{1}{2}}}{\varepsilon _{i}}{\widehat{F_{J}}}_{i-\frac{1}{2}} \,\big ) + \frac{{F_{J}}_{i}}{\varepsilon _{i}} \nonumber \\&=-\frac{1}{\varDelta \varepsilon _{i}} \big (\, {\widehat{F_{J}}}_{i+\frac{1}{2}} - {\widehat{F_{J}}}_{i+\frac{1}{2}} \,\big ) -\frac{(\varepsilon _{i+\frac{1}{2}}-\varepsilon _{i})}{\varDelta \varepsilon _{i}}\frac{{\widehat{F_{J}}}_{i+\frac{1}{2}}}{\varepsilon _{i}} -\frac{(\varepsilon _{i}-\varepsilon _{i-\frac{1}{2}})}{\varDelta \varepsilon _{i}}\frac{{\widehat{F_{J}}}_{i-\frac{1}{2}}}{\varepsilon _{i}} + \frac{{F_{J}}_{i}}{\varepsilon _{i}}. \end{aligned}$$
(433)

Without specifying the numerical fluxes \(\widehat{F_{J}}_{i\pm \frac{1}{2}}\), the last three terms in the second line of Eq. (433) do in general not cancel, and the neutrino number density is not conserved in the energy advection step, which is contrary to what is suggested by Eq. (431). However, there is some freedom in choosing the numerical fluxes. To determine the numerical fluxes, Müller et al. (2010) demand total number conservation upon integration of Eq. (433) over all energy bins; i.e.,

$$\begin{aligned} 0=d_{t}N_{{\textsc {Tot}}}&\equiv \sum _{i=1}^{N_{\varepsilon }}\frac{d N_{i}}{d t}\,\varDelta \varepsilon _{i} =-\sum _{i=1}^{N_{\varepsilon }} \left\{ \,\frac{\varepsilon _{i+\frac{1}{2}}}{\varepsilon _{i}}{\widehat{F_{J}}}_{i+\frac{1}{2}} -\frac{\varepsilon _{i-\frac{1}{2}}}{\varepsilon _{i}}{\widehat{F_{J}}}_{i-\frac{1}{2}} -\frac{\varDelta \varepsilon _{i}}{\varepsilon _{i}}{F_{J}}_{i}\,\right\} \nonumber \\&=-\sum _{i=1}^{N_{\varepsilon }}\left\{ \,\Big (\,\frac{1}{\varepsilon _{i}}-\frac{1}{\varepsilon _{i+1}}\,\Big )\,\varepsilon _{i+\frac{1}{2}}\,{\widehat{F_{J}}}_{i+\frac{1}{2}}-\frac{\varDelta \varepsilon _{i}}{\varepsilon _{i}}\,{F_{J}}_{i}\,\right\} , \end{aligned}$$
(434)

where zero flux energy space boundaries are assumed (i.e., \({\widehat{F_{J}}}_{\frac{1}{2}}={\widehat{F_{J}}}_{N_{\varepsilon }+\frac{1}{2}}=0\)). Next, the numerical flux is split into “left” and “right” contributions

$$\begin{aligned} {\widehat{F_{J}}}_{i+\frac{1}{2}} = {F_{J}^{{\textsc {L}}}}_{i} + {F_{J}^{{\textsc {R}}}}_{i+1}, \end{aligned}$$
(435)

so that the change in the total number density can be written as (assuming \(\varepsilon _{\frac{1}{2}}=0\) and setting \({F_{J}^{{\textsc {R}}}}_{N_{\varepsilon }+1}=0\))

$$\begin{aligned} d_{t}N_{{\textsc {Tot}}}=-\sum _{i=1}^{N_{\varepsilon }} \left\{ \, \Big (\frac{1}{\varepsilon _{i}}-\frac{1}{\varepsilon _{i+1}}\Big )\,\varepsilon _{i+\frac{1}{2}}\,{F_{J}^{{\textsc {L}}}}_{i} +\Big (\frac{1}{\varepsilon _{i-1}}-\frac{1}{\varepsilon _{i}}\Big )\,\varepsilon _{i-\frac{1}{2}}\,{F_{J}^{{\textsc {R}}}}_{i} -\frac{\varDelta \varepsilon _{i}}{\varepsilon _{i}}\,{F_{J}}_{i} \,\right\} . \end{aligned}$$
(436)

Number conservation is then obtained by demanding

$$\begin{aligned} \left( \frac{1}{\varepsilon _{i}}-\frac{1}{\varepsilon _{i+1}}\right) \,\varepsilon _{i+\frac{1}{2}}\,{F_{J}^{{\textsc {L}}}}_{i} +\left( \frac{1}{\varepsilon _{i-1}}-\frac{1}{\varepsilon _{i}}\right) \,\varepsilon _{i-\frac{1}{2}}\,{F_{J}^{{\textsc {R}}}}_{i} =\frac{\varDelta \varepsilon _{i}}{\varepsilon _{i}}\,{F_{J}}_{i}. \end{aligned}$$
(437)

Furthermore, Müller et al. (2010) introduce

$$\begin{aligned} \varepsilon _{i+\frac{1}{2}}\,{F_{J}^{{\textsc {L}}}}_{i}&=\frac{\varDelta \varepsilon _{i}}{1-\varepsilon _{i}\varepsilon _{i+1}^{-1}}\,{F_{J}}_{i}\,\xi _{i}, \end{aligned}$$
(438)
$$\begin{aligned} \varepsilon _{i-\frac{1}{2}}\,{F_{J}^{{\textsc {R}}}}_{i}&=\frac{\varDelta \varepsilon _{i}}{\varepsilon _{i}\varepsilon _{i-1}^{-1}-1}\,{F_{J}}_{i}\,(1-\xi _{i}), \end{aligned}$$
(439)

where \(\xi _{i}\) is a local weighting factor

$$\begin{aligned} \xi _{i}=\frac{j_{i+\frac{1}{2}}^{\sigma }}{j_{i-\frac{1}{2}}^{\sigma }+j_{i+\frac{1}{2}}^{\sigma }} \quad \text {and}\quad 1-\xi _{i}=\frac{j_{i-\frac{1}{2}}^{\sigma }}{j_{i-\frac{1}{2}}^{\sigma }+j_{i+\frac{1}{2}}^{\sigma }} , \end{aligned}$$
(440)

depending on the zeroth moment (j) of the distribution function at cell interfaces, \(j_{i-\frac{1}{2}}^{\sigma }\) and \(j_{i+\frac{1}{2}}^{\sigma }\), which are computed as weighted geometric means of j using values from adjacent energy bins. In regions where \(J_{i}\) varies modestly with i, \(\xi _{i}\) is close to 1/2, while in the high-energy tail of the neutrino spectrum, where \(J_{i}\) decreases rapidly with increasing i, \(\xi _{i}\ll 1\) (see Appendix B in Müller et al. 2010, for further details). Then, using the split in Eq. (435), the numerical flux, e.g., at interface \(\varepsilon _{i+\frac{1}{2}}\), to be used in Eq. (432) is given by

$$\begin{aligned} \varepsilon _{i+\frac{1}{2}}{\widehat{F_{J}}}_{i+\frac{1}{2}}&=\varepsilon _{i+\frac{1}{2}}\,{F_{J}^{{\textsc {L}}}}_{i} + \varepsilon _{i+\frac{1}{2}}\,{F_{J}^{{\textsc {R}}}}_{i+1} \nonumber \\&=\frac{\varDelta \varepsilon _{i}}{1-\varepsilon _{i}\varepsilon _{i+1}^{-1}}\,{F_{J}}_{i}\,\xi _{i} + \frac{\varDelta \varepsilon _{i+1}}{\varepsilon _{i+1}\varepsilon _{i}^{-1}-1}\,{F_{J}}_{i+1}\,(1-\xi _{i+1}). \end{aligned}$$
(441)

For a commonly used geometrically progressing grid where \(\varepsilon _{i+\frac{1}{2}}=\varDelta \varepsilon _{1}\,\lambda ^{i-1}\) (where \(\lambda >1\) and \(i=1,\ldots ,N_{\varepsilon }\)), it can be shown that \(\varDelta \varepsilon _{i}/(1-\varepsilon _{i}\varepsilon _{i+1}^{-1})=\varDelta \varepsilon _{i+1}/(\varepsilon _{i+1}\varepsilon _{i}^{-1}-1)=\varepsilon _{i+\frac{1}{2}}\), so that the numerical flux can be written as

$$\begin{aligned} {\widehat{F_{J}}}_{i+\frac{1}{2}}\big ({F_{J}}_{i},{F_{J}}_{i+1}\big ) = {F_{J}}_{i}\,\xi _{i} + {F_{J}}_{i+1}\,(1-\xi _{i+1}), \end{aligned}$$
(442)

which is simply a weighted average with nonlinear weights \(\xi _{i}\) and \((1-\xi _{i+1})\). If \(\xi _{i},\xi _{i+1}>0\) and \(\xi _{i}+\xi _{i+1}=1\), the numerical flux is a convex combination of \({F_{J}}_{i}\) and \({F_{J}}_{i+1}\), but this is not guaranteed. Although the numerical flux in Eq. (441) was developed by Müller et al. (2010) to ensure neutrino number conservation in the context of the Lagrangian two-moment model, the same approach has also been applied to the Eulerian two-moment model by O’Connor (2015) and Kuroda et al. (2016). [It is not at all clear that the approach developed by Müller et al. (2010) in the context of the Lagrangian two-moment model results in a number conservative scheme when applied to the Eulerian two-moment model. In the Lagrangian two-moment model, the spectral neutrino number and energy equations are related simply by a factor of \(1/\varepsilon \), whereas in an Eulerian two-moment model, the relationship is more complex, involving both the spectral neutrino energy and momentum equations (cf. Endeve et al. 2012; Cardall et al. 2013b).] We also note that the numerical flux in Eq. (441) is also used by Just et al. (2015), who solve the Lagrangian two-moment model in the \({\mathscr {O}}(v/c)\) limit.

A few remarks should be made about the numerical flux in Eq. (442). First, a numerical flux is said to be consistent if, when the two arguments are set to be equal, it reduces to the common value; i.e., when \({F_{J}}_{i}={F_{J}}_{i+1}={F_{J}}\) the following holds:

$$\begin{aligned} {\widehat{F_{J}}}_{i+\frac{1}{2}}\big ({F_{J}},{F_{J}}\big )={F_{J}}. \end{aligned}$$
(443)

Consistency of the numerical flux is generally required for a numerical method to be convergent (Crandall and Majda 1980; LeVeque 2002). Since it is not guaranteed that \(\xi _{i}+\xi _{i+1}=1\), the numerical flux in Eq. (442) is not consistent. Second, if one sets \(\xi _{i}=1/2\,\forall i\) (which makes it consistent), the numerical flux in Eq. (442) reduces to a simple arithmetic average, which is known to be notoriously unstable when combined with explicit time integration (e.g., LeVeque 2002).

Skinner et al. (2019), who also solve the Lagrangian two-moment model in the \({\mathscr {O}}(v/c)\) limit, follow a different approach adapted from Vaytet et al. (2011). In this case, assuming Cartesian coordinates for simplicity, the evolved quantity and the flux in energy space in the neutrino energy equation [cf. Eq. (429)] are given by

$$\begin{aligned} J = \varepsilon ^{2}{\mathscr {J}} \quad \text {and}\quad F_{J} = - \varepsilon ^{2}{\mathscr {K}}^{i}_{j}\,\partial _{i}{v^{j}}, \end{aligned}$$
(444)

where \({\mathscr {K}}^{i}_{j}\) is the radiation stress tensor [cf. Eq. (130)] and \(v^{i}\) are components of the fluid three-velocity. Similarly, the evolved quantity and flux in energy space from the neutrino momentum equation are given by

$$\begin{aligned} H_{k} = \varepsilon ^{2}{\mathscr {H}}_{k} \quad \text {and}\quad F_{H_{k}} = - \varepsilon ^{2}{\mathscr {L}}^{i}_{jk}\,\partial _{i}{v^{j}}, \end{aligned}$$
(445)

where \({\mathscr {L}}^{i}_{jk}\) is the heat flux tensor in Eq. (133). With \({\mathbf {u}}=\big (\,J,H_{k}\,\big )^{T}\) and \({\mathbf {f}}^{\varepsilon }({\mathbf {u}})=\big (\,H_{k},F_{H_{k}}\,\big )^{T}\), the subsystem to be solved is then given by

$$\begin{aligned} \partial _{t}{{\mathbf {u}}}+\partial _{\varepsilon }{\big (\,\varepsilon \,{\mathbf {f}}^{\varepsilon }({\mathbf {u}})\big )} = 0, \end{aligned}$$
(446)

which is a familiar advection-type equation. For the energy equation, the numerical flux in energy space is then given by

$$\begin{aligned} {\widehat{F_{J}}}_{i+\frac{1}{2}} = - \varepsilon _{i+\frac{1}{2}}^{2}\,\widehat{{\mathscr {K}}}^{i}_{j}(\varepsilon _{i+\frac{1}{2}})\,\partial _{i}{v^{j}}, \end{aligned}$$
(447)

where an upwind approach is used to compute

$$\begin{aligned} \widehat{{\mathscr {K}}}^{i}_{j}(\varepsilon _{i+\frac{1}{2}}) =\left\{ \begin{array}{ll} {\mathscr {K}}^{i}_{j}(\varepsilon _{i+\frac{1}{2}}^{-}), &{}\quad {\text {if}}\; \partial _{i}{v^{j}} < 0\\ {\mathscr {K}}^{i}_{j}(\varepsilon _{i+\frac{1}{2}}^{+}), &{}\quad {\text {if}}\; \partial _{i}{v^{j}} \ge 0. \end{array} \right. \end{aligned}$$
(448)

A similar expression is used for the energy-space fluxes in the radiation momentum equation. The eigenvalues of the flux Jacobian \(\partial {\mathbf {f}}^{\varepsilon }/\partial {\mathbf {u}}\) associated with the reduced system of equations governing the “advection” in energy space are always of the same sign (Vaytet et al. 2011). This is one motivation for using the upwind flux. Although the numerical flux in Eq. (447) does not necessarily lead to exact number conservation (as is the case for the corresponding numerical flux developed by Müller et al. 2010), the upwind flux has desirable properties that can improve numerical stability [e.g., the upwind flux is consistent and can be used to design monotone numerical schemes (cf. Crandall and Majda 1980; LeVeque 1992)].

6.5.3 Time integration approaches

After the specification of approximations to the terms on the right-hand side of Eq. (411), the system is evolved in time with an ODE solver. When solving the general relativistic radiation hydrodynamics system, Kuroda et al. (2016) write the resulting ODE system in the following form:

$$\begin{aligned} \frac{d {\mathbf {U}}}{d t} + {\mathbf {S}}_{\text{ adv,s }} + {\mathbf {S}}_{\text{ avd,e }} + {\mathbf {S}}_{\text{ grv }} + {\mathbf {S}}_{\nu \text{ m }} = 0, \end{aligned}$$
(449)

where the spatial advection term \({\mathbf {S}}_{\text{ adv,s }}\), the energy advection term \({\mathbf {S}}_{\text{ avd,e }}\), the gravitational source term \({\mathbf {S}}_{\text{ grv }}\), and the neutrino–matter interaction term \({\mathbf {S}}_{\nu \text{ m }}\) correspond to the terms on the right-hand side of Eq. (411). (Here we omit phase-space indices for brevity.) In their time integration scheme, Kuroda et al. (2016) evaluate the spatial advection and gravitational source terms explicitly, while the energy advection and neutrino–matter interaction terms are evaluated implicitly:

$$\begin{aligned} \frac{{\mathbf {U}}^{*}-{\mathbf {U}}^{n}}{\varDelta t}&+ {\mathbf {S}}_{\text{ adv,s }}^{n} + {\mathbf {S}}_{\text{ grv }}^{n} = 0, \end{aligned}$$
(450)
$$\begin{aligned} \frac{{\mathbf {U}}^{n+1}-{\mathbf {U}}^{*}}{\varDelta t}&+ {\mathbf {S}}_{\text{ avd,e }}^{n+1} + {\mathbf {S}}_{\nu \text{ m }}^{n+1} = 0. \end{aligned}$$
(451)

This splitting is a special case of a more general class of time integration methods referred to as implicit-explicit (IMEX) schemes (Ascher et al. 1997; Pareschi and Russo 2005). The splitting in Eqs. (450)–(451) is first-order accurate in time, while high-order accurate methods have been developed. The main benefit of introducing this split is to avoid a distributed implicit solve, since the spatial advection term couples neighboring cells in space, which can reside on different processing units. On the downside, the time step is restricted by the speed of light, but this is acceptable for relativistic systems. In general, the neutrino–matter interaction term cannot be integrated efficiently in time with explicit methods because the stable time step needed to resolve the governing time scale is many orders of magnitude shorter than that governing the spatial advection term. There is another benefit of integrating the neutrino–matter interaction term separately with implicit methods. These terms are local in space, which makes them easier to parallelize. The energy advection term can be integrated with explicit or implicit methods. Using explicit methods for this term, an additional time step restriction is needed, but this is usually less severe than that introduced by the spatial advection term (e.g., O’Connor 2015; Just et al. 2015). On the other hand, since the neutrino–matter interaction term couple the entire momentum space, including the energy advection term in the implicit update (as is also done by, e.g., Müller et al. 2010), which only couples nearest neighbors in energy, does not add significantly to the computational complexity. One should note that in their Appendix B, Kuroda et al. (2016) report significantly different electron fraction profiles when comparing explicit versus implicit integration of \({\mathbf {S}}_{\text{ avd,e }}\), but the reason for this is not clear.

The implicit solve in Eq. (451) requires the solution of a nonlinear system of equations. To this end, Kuroda et al. (2016), write the system as

$$\begin{aligned} {\mathbf {f}}({\mathbf {P}}^{n+1}) \equiv \frac{{\mathbf {U}}({\mathbf {P}}^{n+1})-{\mathbf {U}}^{*}}{\varDelta t} + {\mathbf {S}}_{\text{ avd,e }}({\mathbf {P}}^{n+1}) + {\mathbf {S}}_{\nu \text{ m }}({\mathbf {P}}^{n+1}) = 0, \end{aligned}$$
(452)

where the unknowns are given by the vector of “primitive” variables:

$$\begin{aligned} {\mathbf {P}} = \big (\,\rho ,\,v_{j},\,s,\,Y_{e},\,{\mathscr {E}}_{1},{\mathscr {F}}_{1,j},\ldots ,{\mathscr {E}}_{N_{{\textsc {Sp}}}},{\mathscr {F}}_{N_{{\textsc {Sp}}},j}\,\big )^{T}. \end{aligned}$$
(453)

To solve the nonlinear system in Eq. (452), Kuroda et al. (2016) employ a Newton-Raphson scheme:

$$\begin{aligned} \frac{\partial {\mathbf {f}}({\mathbf {P}}^{k})}{\partial {\mathbf {P}}}\delta {\mathbf {P}}^{k} = - {\mathbf {P}}^{k} \quad \rightarrow \quad {\mathbf {P}}^{k+1} = {\mathbf {P}}^{k} + \delta {\mathbf {P}}^{k} \end{aligned}$$
(454)

for \(k=0,1,2,\ldots \), with \({\mathbf {P}}^{0}={\mathbf {P}}^{*}\). The iteration is continued until \(|\delta {\mathbf {P}}^{k}|<\text{ tol }\,|{\mathbf {P}}^{k}|\), where the tolerance is typically set to \(\text{ tol }=10^{-8}\). Kuroda et al. (2016) treat the problem fully implicitly, evaluating the neutrino–matter interactions at \(t^{n+1}\), and thus include derivatives of opacities in \({\mathbf {S}}_{\nu \text{ m }}\) with respect to \({\mathbf {P}}\) in the Jacobian \((\partial {\mathbf {f}}/\partial {\mathbf {P}})\). To help convergence in the Newton-Raphson procedure, Kuroda et al. (2016) also monitor the change in total lepton number during iterations (see their Sect. 3.3 for details), which improves the robustness of the method. Note that in the primitive vector in Eq. (453) the radiation quantities are the Eulerian moments \(\big ({\mathscr {E}},{\mathscr {F}}_{j}\big )\), while the closure and the neutrino–matter interaction terms are most naturally expressed in terms of the Lagrangian moments \(\big ({\mathscr {J}},{\mathscr {H}}_{j}\big )\). To evaluate the closure and collision terms during the Newton-Raphson iterations, the Lagrangian moments are kept consistent with the Eulerian moments through the relations:

$$\begin{aligned} {\mathscr {J}}&= u_{\mu }u_{\nu }{\mathscr {T}}^{\mu \nu } = W^{2}\,{\mathscr {E}} - 2\,W\,u_{i}\,{\mathscr {F}}^{i} + u_{i}u_{j}{\mathscr {S}}^{ij}, \end{aligned}$$
(455)
$$\begin{aligned} {\mathscr {H}}_{j}&= - u_{\nu }\,h_{j\mu }{\mathscr {T}}^{\mu \nu } =\big (\,W\,{\mathscr {E}}-u_{k}\,{\mathscr {F}}^{k}\,\big )\,h_{j\mu }\,n^{\mu }+W\,h_{jk}\,{\mathscr {F}}^{k}-u_{i}\,h_{jk}\,{\mathscr {S}}^{ik}. \end{aligned}$$
(456)

The number of iterations needed to reach convergence varies during a simulation. It is at its maximum in the center around core bounce (several tens), but settles down to \(\sim 10\) after the shock stalls.

Just et al. (2015), employing the \({\mathscr {O}}(v/c)\) limit of the Lagrangian two-moment model in Sect. 4.7.3 coupled to non-relativistic hydrodynamics, also use a combination of explicit and implicit methods to integrate the coupled equations in time, but ease the computational cost by treating some interaction terms explicitly. They split the solution vector into radiation variables \({\mathbf {X}}=({\mathscr {J}},{\mathscr {H}}_{j})\) and fluid variables \({\mathbf {U}}=(\rho ,\rho Y_{e},\rho {\mathbf {v}},e_{\mathrm {t}})\), where the total fluid energy density is \(e_{\mathrm {t}}=e_{\mathrm {i}}+\rho v^{2}/2\), and \(e_{\mathrm {i}}\) is the internal energy density. They write the radiation hydrodynamics system as

$$\begin{aligned}&\partial _{t}{{\mathbf {X}}} + \big (\delta _{t}{\mathbf {X}}\big )_{\mathrm {hyp}} + \big (\delta _{t}{\mathbf {X}}\big )_{\mathrm {vel}} = \big (\delta _{t}{\mathbf {X}}\big )_{\mathrm {src}}, \end{aligned}$$
(457)
$$\begin{aligned}&\partial _{t}{{\mathbf {U}}} + \big (\delta _{t}{\mathbf {U}}\big )_{\mathrm {hyd}} = \big (\delta _{t}{\mathbf {U}}\big )_{\mathrm {src}}, \end{aligned}$$
(458)

where in the transport equations, \(\big (\delta _{t}{\mathbf {X}}\big )_{\mathrm {hyp}}\) represents the velocity-independent hyperbolic terms, \(\big (\delta _{t}{\mathbf {X}}\big )_{\mathrm {vel}}\) represents all the velocity-dependent terms in the transport equations, and \(\big (\delta _{t}{\mathbf {X}}\big )_{\mathrm {src}}\) represent neutrino–matter interactions. The phase-space advection terms combine to \(\big (\delta _{t}{\mathbf {X}}\big )_{\mathrm {adv}}=\big (\delta _{t}{\mathbf {X}}\big )_{\mathrm {hyp}}+\big (\delta _{t}{\mathbf {X}}\big )_{\mathrm {vel}}\). In the hydrodynamics equations, \(\big (\delta _{t}{\mathbf {U}}\big )_{\mathrm {hyd}}\) represents the non-radiative physics, while \(\big (\delta _{t}{\mathbf {U}}\big )_{\mathrm {src}}\) the radiative source terms. For a given time step \(\varDelta t\), when advancing the system from \(t^{n}\) to \(t^{n+1}=t^{n}+\varDelta t\), a ‘predictor’ step to \(t^{n+1/2}=t^{n}+\varDelta t/2\) is performed first:

$$\begin{aligned} {\mathbf {X}}^{n+\frac{1}{2}}&={\mathbf {X}}^{n} + \frac{\varDelta t}{2}\,\Big [\,-\big (\delta _{t}{\mathbf {X}}\big )_{\mathrm {hyp}}^{n}+\big (\delta _{t}{\mathbf {X}}\big )_{\mathrm {src}}^{n,n+\frac{1}{2}}\,\Big ], \end{aligned}$$
(459)
$$\begin{aligned} {\mathbf {U}}^{n+\frac{1}{2}}&={\mathbf {U}}^{n} + \frac{\varDelta t}{2}\,\Big [\,-\big (\delta _{t}{\mathbf {U}}\big )_{\mathrm {hyd}}^{n}+\big (\delta _{t}{\mathbf {U}}\big )_{\mathrm {src}}^{n,n+\frac{1}{2}}\,\Big ], \end{aligned}$$
(460)

followed by the ‘corrector’ step:

$$\begin{aligned} {\mathbf {X}}^{n+1}&={\mathbf {X}}^{n} + \varDelta t\,\Big [\,-\big (\delta _{t}{\mathbf {X}}\big )_{\mathrm {hyp}}^{n+\frac{1}{2}}+\big (\delta _{t}{\mathbf {X}}\big )_{\mathrm {src}}^{n+\frac{1}{2},n+1}\,\Big ], \end{aligned}$$
(461)
$$\begin{aligned} {\mathbf {U}}^{n+1}&={\mathbf {U}}^{n} + \varDelta t\,\Big [\,-\big (\delta _{t}{\mathbf {U}}\big )_{\mathrm {hyd}}^{n+\frac{1}{2}}+\big (\delta _{t}{\mathbf {U}}\big )_{\mathrm {src}}^{n+\frac{1}{2},n+1}\,\Big ], \end{aligned}$$
(462)

where double superscripts indicate that the source terms can be evaluated using radiation and hydrodynamics variables in the old and the new state. (The implicit neutrino–matter solve can be simplified considerably by time-lagging some terms. See the discussion below.) When comparing with the scheme of Kuroda et al. (2016) in Eqs. (450)–(451), the scheme used by Just et al. (2015) uses two explicit evaluations and two implicit evaluations, instead of one of each. Also note that Kuroda et al. (2016) treat the velocity-dependent terms implicitly in time, while these terms are treated explicitly by Just et al. (2015). While being formally first-order accurate in time, it can be shown that the scheme in Eqs. (459)–(462) is second-order accurate with respect to the explicit part. Except for the use of both old and new variables in the implicit part, it is equivalent to the scheme presented by McClarren et al. (2008).

The prospect of evaluating some variables in the implicit neutrino–matter solve in the old state is potentially rewarding, since this part of the solve usually accounts for the majority of the computational cost in simulations. When doing this, stability and accuracy concerns are important to consider, and this could be investigated with rigorous analysis. Methods with time lagging can be considered unconverged or partially converged implicit methods, and can be quite accurate, but this depends on the chosen time step and the degree of nonlinearity of the problem (see, e.g., Knoll et al. 2001; Lowrie 2004). For stability of the explicit part of the IMEX scheme in Eqs. (459)–(462), an upper bound on the time step is given by the advection time scale \(\tau _{\mathrm {adv}}=\varDelta x/c\approx 3\,\mu \text{ s }\times (\varDelta x/1\,\text{ km})\). On the other hand, the neutrino–matter interaction time scale can be estimated as \(\tau _{\mathrm {int}}=\lambda _{\nu }/c\approx 10\,\text{ ns }\times (\lambda _{\nu }/3\times 10^{-3}\,\text{ km})\), where \(\lambda _{\nu }\) is the neutrino mean-free path (cf. Fig. 4 in Sect. 4.1). In the core of a core-collapse supernova, \(\lambda _{\nu }\approx 3\times 10^{-3}\) km, so that \(\tau _{\mathrm {int}}\ll \tau _{\mathrm {adv}}\), which implies that the neutrino–matter interactions terms should be integrated with implicit methods in order to keep \(\varDelta t/\tau _{\mathrm {adv}}={\mathscr {O}}(1)\). However, \(\tau _{\mathrm {int}}\) should be viewed as the time scale for neutrino–matter equilibration, and neutrinos have practically equilibrated with the matter for densities above \(10^{12}\mathrm {\ g\ cm}^{-3}\). Since in near equilibrium, the matter quantities (i.e., \(\rho \), \(e_{\mathrm {i}}\), and the electron density \(n_{e}\)) evolve on time scales that typically exceed \(\tau _{\mathrm {adv}}\), it is reasonable to ask whether some neutrino opacities, which depend nonlinearly on \(\rho \), \(e_{\mathrm {i}}\), and \(n_{e}\), can be evaluated in a lagged fashion in order to avoid costly reevaluations during an iterative implicit solve. Numerical experiments can give valuable insights into this question. To this end, Just et al. (2015) considered three cases for comparison

  1. (a)

    The radiation moments \({\mathbf {X}}\) and the fluid variables \(e_{\mathrm {i}}\) and \(n_{e}\) appearing in the source terms \(\big (\delta _{t}{\mathbf {X}}\big )_{\mathrm {src}}\) and \(\big (\delta _{t}{\mathbf {U}}\big )_{\mathrm {src}}\) are defined at \(t^{n+1}\). Only the Eddington and heat flux factors (\({\mathfrak {k}}\) and \({\mathfrak {q}}\)) and the coefficients of the Legendre expansion of energy-coupling interactions [e.g., scattering; cf. Eq. (182)] are evaluated at \(t^{n}\).

  2. (b)

    Like case (a), but \(e_{\mathrm {i}}\) and \(n_{e}\) in the source terms are evaluated at \(t^{n}\) for all the opacities. This alleviates the computational cost of recomputing the opacities within the iteration procedure. Iterations are still performed in this case because the radiation moments appearing in the blocking factors are treated implicitly.

  3. (c)

    Like case (b), but all the energy-coupling interactions are treated explicitly in time. This renders the matrix to be inverted in the implicit solve diagonal.

Using case (b) where \(\rho >10^{11}\mathrm {\ g\ cm}^{-3}\) and case (c) for \(\rho \le 10^{11}\mathrm {\ g\ cm}^{-3}\), Just et al. (2015) performed a detailed comparison of their scheme in spherical symmetry with results from Liebendörfer et al. (2005) (obtained with Boltzmann-based codes) for a \(13\,M_{\odot }\) star, and found good agreement. In addition, they computed an additional run with the same physical specifications, but where case (b) was replaced with case (a) for \(\rho >10^{11}\mathrm {\ g\ cm}^{-3}\), and found the results essentially unaltered (see their Fig. 11). See also Just et al. (2018) for an extensive comparison of the two-moment method of Just et al. (2015) with the Prometheus-Vertex code (Rampp and Janka 2002; Buras et al. 2006), and on the impact of various approximate treatments of relevant physics. We also note that O’Connor (2015), who also used explicit treatment of the matter quantities in evaluation the neutrino–matter sources, reported good agreement with Liebendörfer et al. (2005) across many quantities.

After obtaining expressions for the radiation moments, the changes to the fluid momentum and kinetic energy densities due to neutrino–matter interactions are computed as

$$\begin{aligned} \big (\delta _{t}\rho v_{j}\big )_{\mathrm {src}}&= -\sum _{\nu ,\xi }\big (\delta _{t}{\mathscr {H}}_{j,\nu ,\xi }\big )_{\mathrm {src}}, \end{aligned}$$
(463)
$$\begin{aligned} \big (\delta _{t}e_{\mathrm {k}}\big )_{\mathrm {src}}&=-v^{j}\sum _{\nu ,\xi }\big (\delta _{t}{\mathscr {H}}_{j,\nu ,\xi }\big )_{\mathrm {src}}, \end{aligned}$$
(464)

where the sums extend over all neutrino frequencies \(\nu \) and species \(\xi \), and the repeated index on the fluid velocity components \(v^{j}\) imply summation over all spatial dimensions.

Skinner et al. (2019), employing a very similar \({\mathscr {O}}(v/c)\) two-moment model as Just et al. (2015) coupled to non-relativistic hydrodynamics, also use explicit and implicit methods to integrate the coupled equations in time. They only describe their time integration scheme in the context of emission, absorption, and isotropic, isoenergetic scattering. Skinner et al. (2019) write the radiation hydrodynamics system as

$$\begin{aligned} \partial _{t}{Q}+\big (\,{\mathscr {F}}_{Q}^{i}\,\big )_{;i} = S_{\text{ non-stiff }} + S_{\text{ stiff }}, \end{aligned}$$
(465)

where the evolved quantities are \(Q=\big (\rho ,\rho v_{j},\rho e, \rho Y_{e},{\mathscr {J}},{\mathscr {H}}_{j}\big )\), where e is the total specific energy of the gas, and \({\mathscr {J}}\) and \({\mathscr {H}}_{j}\) are respectively the comoving frame spectral radiation energy density and momentum density, representing all species and groups. Components of \({\mathscr {J}}\) and \({\mathscr {H}}_{j}\) are denoted \({\mathscr {J}}_{sg}\) and \({\mathscr {H}}_{j,sg}\), where s denotes neutrino species and g denotes frequency group. In Eq. (465), \(\big (\,{\mathscr {F}}_{Q}^{i}\,\big )_{;i}\) and \(S_{\text{ non-stiff }}\) represent terms from the phase-space advection operator, while \(S_{\text{ stiff }}\) represents neutrino–matter interactions.

Skinner et al. (2019) use operator splitting to integrate the coupled system of equations. The phase-space advection terms are integrated with the optimal second-order SSP-RK scheme of Shu and Osher (1988), while the update due to neutrino–matter interactions is followed by a backward Euler solve. This scheme applied to Eq. (465) can be written as

$$\begin{aligned} Q^{(1)}&= Q^{n} + \varDelta t\,\Big \{\,-\big (\,{\mathscr {F}}_{Q}^{i}\,\big )_{;i}^{n} + S_{\text{ non-stiff }}^{n}\,\Big \}, \end{aligned}$$
(466)
$$\begin{aligned} Q^{-}&=\frac{1}{2}\,Q^{n} + \frac{1}{2}\,\Big [\,Q^{(1)} + \varDelta t\,\Big \{\,-\big (\,{\mathscr {F}}_{Q}^{i}\,\big )_{;i}^{(1)} + S_{\text{ non-stiff }}^{(1)}\,\Big \}\,\Big ], \end{aligned}$$
(467)
$$\begin{aligned} Q^{n+1}&=Q^{-} + \varDelta t\,S_{\text{ stiff }}^{n+1}, \end{aligned}$$
(468)

which requires two evaluations of \(\big (\,{\mathscr {F}}_{Q}^{i}\,\big )_{;i}\) and \(S_{\text{ non-stiff }}\) and one implicit solve to evaluate \(S_{\text{ stiff }}\) per time step.

After the explicit update in Eqs. (466)–(467), a nested iteration scheme is employed, where for each spatial point, the coupled system

$$\begin{aligned} \frac{u^{n+1}-u^{-}}{\varDelta t}&=-\sum _{s}\sum _{g}\big (j_{sg}^{n+1}-\kappa _{sg}^{n+1}{\mathscr {J}}_{sg}^{n+1}\big ), \end{aligned}$$
(469)
$$\begin{aligned} \rho \frac{\big (Y_{e}^{n+1}-Y_{e}^{-}\big )}{\varDelta t}&=\sum _{s}\sum _{g}\xi _{sg}\big (j_{sg}^{n+1}-\kappa _{sg}^{n+1}{\mathscr {J}}_{sg}^{n+1}\big ), \end{aligned}$$
(470)
$$\begin{aligned} \frac{{\mathscr {J}}_{sg}^{n+1}-{\mathscr {J}}_{sg}^{-}}{\varDelta t}&=j_{sg}^{n+1}-\kappa _{sg}^{n+1}{\mathscr {J}}_{sg}^{n+1}, \end{aligned}$$
(471)

is solved for the material internal energy density, u and electron fraction, \(Y_e\)—or equivalently, the temperature, T, and \(Y_{e}\)—and the spectral radiation energy density, \({\mathscr {J}}\). In Eqs. (469)–(471), \(j_{sg}\) and \(\kappa _{sg}\) are the emission and absorption coefficients (depending on \(\rho \), which is fixed in this step, T, and \(Y_{e}\)), and

$$\begin{aligned} \xi _{sg} =\left\{ \begin{array}{ll} -(N_{A}\,\nu )^{-1}, &{}\quad S=\nu _{e} \\ +(N_{A}\,\nu )^{-1}, &{}\quad s={\bar{\nu }}_{e} \\ 0, &{}\quad s=\nu _{x} \end{array} \right. , \end{aligned}$$
(472)

where \(N_{A}\) is Avogadro’s number and \(\nu \) is the neutrino frequency. In the nested iteration scheme, the updates are separated into “inner” and “outer” parts. In the k-th outer iteration, the radiation energy density is updated implicitly in the inner iteration as

$$\begin{aligned} \frac{{\mathscr {J}}_{sg}^{k}-{\mathscr {J}}_{sg}^{-}}{\varDelta t} = j_{sg}^{k-1}-\kappa _{sg}^{k-1}{\mathscr {J}}_{sg}^{k} \quad \Rightarrow \quad {\mathscr {J}}_{sg}^{k} = \frac{{\mathscr {J}}_{sg}^{-}+\varDelta t\,j_{sg}^{k-1}}{1+\varDelta t\,\kappa _{sg}^{k-1}}, \end{aligned}$$
(473)

where the opacities and emissivities are evaluated using \(T^{k-1}\) and \(Y_{e}^{k-1}\) (as an initial guess in the first iteration \(\{T^{0},Y_{e}^{0}\}=\{T^{-},Y_{e}^{-}\}\)). The changes in energy and electron fraction are then computed as

$$\begin{aligned} \varDelta E^{k}&=\sum _{s}\sum _{g}\big ({\mathscr {J}}_{sg}^{k}-{\mathscr {J}}_{sg}^{-}\big ), \end{aligned}$$
(474)
$$\begin{aligned} \varDelta Y_{e}^{k}&=\sum _{s}\sum _{g}\xi _{sg}\big ({\mathscr {J}}_{sg}^{k}-{\mathscr {J}}_{sg}^{-}\big ), \end{aligned}$$
(475)

and the residuals as

$$\begin{aligned} r_{E}^{k}&= u^{k}-u^{-} + \varDelta E^{k}, \end{aligned}$$
(476)
$$\begin{aligned} r_{Y_{e}}^{k}&= \rho \,\big (Y_{e}^{k}-Y_{e}^{-}\big ) - \varDelta Y_{e}^{k}, \end{aligned}$$
(477)

where \(u^{k}=u(T^{k},Y_{e}^{k})\) (the internal energy also depends on \(\rho \), which is fixed in this part of the solve). Then, using a Newton-Raphson technique, the temperature \(T^{k}\) and electron fraction \(Y_{e}^{k}\) are found such that \(r_{E}^{k}=r_{Y_{e}}^{k}=0\). The iteration scheme is terminated when the relative change in temperature and electron fraction, \(\delta T^{k}=|T^{k}-T^{k-1}|/T^{k-1}\) and \(\delta Y_{e}^{k}=|Y_{e}^{k}-Y_{e}^{k-1}|/Y_{e}^{k-1}\), are below a specified tolerance (e.g., \(10^{-6}\)). In this sense, the converged solutions satisfy Eq. (469)–(471). Skinner et al. (2019) report that in practice their iteration procedure converges in a few iterations for a wide range of conditions. An obvious benefit of this nested approach is that nonlinear iterations are performed on a smaller system with only two unknowns (T and \(Y_{e}\)). Note however that modifications to this algorithm are needed if energy coupling interactions such as scattering and pair processes are to be included in an implicit fashion as in cases (a) and (b) from Just et al. (2015) discussed above.

After obtaining \(u^{n+1}\), \(Y_{e}^{n+1}\), \(T^{n+1}\), and \({\mathscr {J}}_{sg}^{n+1}\) by solving Eqs. (469)–(471), the radiation momentum density is updated implicitly as

$$\begin{aligned} \frac{{\mathscr {H}}_{j,sg}^{n+1}-{\mathscr {H}}_{j,sg}^{-}}{\varDelta t}&= -\big (\kappa _{sg}^{n+1}+\sigma _{sg}^{n+1}\big )\,{\mathscr {H}}_{j,sg}^{n+1} \nonumber \\&\Rightarrow {\mathscr {H}}_{j,sg}^{n+1} = \frac{{\mathscr {H}}_{j,sg}^{-}}{1+\varDelta t\,\big (\,\kappa _{sg}^{n+1}+\sigma _{sg}^{n+1}\,\big )}, \end{aligned}$$
(478)

where \(\sigma _{sg}\) is the scattering coefficient. Finally, the fluid momentum and kinetic energy densities (\((\rho v_{j})\) and \((\rho e_{\mathrm {k}})\), respectively) are updated as

$$\begin{aligned} (\rho v_{j})^{n+1}&=(\rho v_{j})^{-} - \sum _{s}\sum _{g}\big ({\mathscr {H}}_{j,sg}^{n+1}-{\mathscr {H}}_{j,sg}^{-}\big ), \end{aligned}$$
(479)
$$\begin{aligned} (\rho e_{\mathrm {k}})^{n+1}&=(\rho e_{\mathrm {k}})^{-} - \sum _{s}\sum _{g}(v^{j})^{-}\big ({\mathscr {H}}_{j,sg}^{n+1}-{\mathscr {H}}_{j,sg}^{-}\big ), \end{aligned}$$
(480)

where, in the last equation, the repeated index j implies summation over spatial dimensions. The total energy density of the gas at \(t^{n+1}\) is then obtained from

$$\begin{aligned} (\rho e)^{n+1}&=u^{n+1} + (\rho e_{\mathrm {k}})^{n+1} \nonumber \\&=\underbrace{\big (u^{-}+(\rho e_{\mathrm {k}})^{-}\big )}_{(\rho e)^{-}} -\sum _{s}\sum _{g}\big ({\mathscr {J}}_{sg}^{n+1}-{\mathscr {J}}_{sg}^{-}\big ) - \sum _{s}\sum _{g}(v^{j})^{-}\big ({\mathscr {H}}_{j,sg}^{n+1}-{\mathscr {H}}_{j,sg}^{-}\big ), \end{aligned}$$
(481)

where Eq. (469), with Eq. (471) inserted, and Eq. (480) are used. Note that Eq. (481) differs from the total energy update listed in Skinner et al. (2019); see their Eq. (32), which is equivalent to Eq. (480), but with \(\rho e_{\mathrm {k}}\rightarrow \rho e\). We believe Eq. (481) is correct in this context since it accounts for changes in internal and kinetic energy due to neutrino–matter interactions.

6.5.4 Lepton number and energy conservation

We end this section on discretization techniques for two-moment models with a discussion on the topic of lepton number and energy conservation. These are conservation laws inherit in the system of equations evolved, and provide a crucial consistency check on the numerical solution. The challenges discussed here in the context of the two-moment model mirror the challenges discussed in Sect. (6.1.3) for Boltzmann transport. The concept of lepton number conservation is easily understood by considering Eqs. (44) and (124), which are evolution equations for the electron density and neutrino number density, respectively. The Eulerian electron number is given by \(N_{e}=D\,Y_{e}/m_{\text{ B }}=W\,n_{e}\), and the Eulerian neutrino lepton number density and lepton number flux density are

$$\begin{aligned} N_{\nu } = \sum _{s=1}^{N_{{\textsc {Sp}}}}{\mathsf {g}}_{s}\,N_{s} \quad \text {and}\quad G_{\nu }^{i} = \sum _{s=1}^{N_{{\textsc {Sp}}}}{\mathsf {g}}_{s}\,G_{s}^{i}, \end{aligned}$$
(482)

respectively. Then, combining Eq. (44), using the source term in Eq. (191), with Eq. (124) results in the conservation law for the total lepton number \(N_{\text{ Lep }}=N_{e}+N_{\nu }\)

$$\begin{aligned} \frac{1}{\alpha \sqrt{\gamma }} \big [\, \partial _{t}{}\big (\,\sqrt{\gamma }\,N_{\text{ Lep }}\,\big ) +\partial _{i}{}\big (\,\sqrt{\gamma }\,\big [\,\alpha \,G_{\text{ Lep }}^{i}-\beta ^{i}\,N_{\text{ Lep }}\,\big ]\,\big ) \,\big ] =0, \end{aligned}$$
(483)

where \(G_{\text{ Lep }}^{i} = N_{e}\,v^{i}+G_{\nu }^{i}\). A similar conservation statement for the total energy is not available in the relativistic case because the matter and neutrino equations governing the evolution of the four-momentum—Eqs. (45), (46), (113), and (114)—are not local conservation laws. Instead, the so-called ADM mass, \(M_{\text{ ADM }}\), (Baumgarte and Shapiro 2010) (defined as a global quantity) is conserved. (See, e.g., Kuroda et al. (2016), their Eq. (71), for a definition applicable to the CCSN context.) In this case, conservation of the ADM mass can be monitored as a consistency check. Kuroda et al. (2016), see their Figure 7, report violations of ADM mass conservation, \(\varDelta M_{\text{ ADM }}\), (i.e., deviations from the initial value) of order \(\varDelta M_{\text{ ADM }}\approx 8\times 10^{50}\) erg early after core bounce. Müller et al. (2010), see their Figure 12, report violations of ADM mass conservation of similar magnitude in a simulation extending beyond 500 ms after core bounce. In their simulation, \(\varDelta M_{\text{ ADM }}\) jumps by about \(5\times 10^{50}\) erg at bounce, and keeps increasing more gradually to \(\varDelta M_{\text{ ADM }}\approx 2\times 10^{51}\) erg at the end of the simulation. This change in the ADM mass is only about 0.5% relative to the initial value.

Müller et al. (2010) argue that the velocity-dependent terms in the transport equations are the most critical terms responsible for the violation of energy (or ADM mass) conservation. To see this, it is illustrative to consider the equations they solve in the special relativistic limit with Cartesian coordinates and no neutrino–matter interactions. Neutrino–matter interactions are entirely local, and lepton number and four-momentum conservation in this sector can be enforced by constraints as in Eqs. (198)–(200). The challenge stems from the discretization of the phase-space advection operators; i.e., the left-hand side of the moment equations. In the special relativistic limit with Cartesian coordinates and no neutrino–matter interactions, the Lagrangian two-moment model corresponding to the one used by Müller et al. (2010) is given by the energy equation [cf. Eq. (116)]

$$\begin{aligned} \partial _{\nu }{}\big (\,\hat{{\mathscr {J}}}u^{\nu }+\hat{{\mathscr {H}}}^{\nu }\,\big ) -\partial _{\varepsilon }{}\big (\,\varepsilon \,\hat{{\mathscr {T}}}^{\mu \nu }\,\partial _{\nu }{u_{\mu }}\,\big ) =-\hat{{\mathscr {T}}}^{\mu \nu }\,\partial _{\nu }{u_{\mu }} \end{aligned}$$
(484)

and the momentum equation [cf. Eq. (118)]

$$\begin{aligned} \partial _{\nu }{}\big (\,\hat{{\mathscr {H}}}_{j}\,u^{\nu }+\hat{{\mathscr {K}}}_{j}^{\nu }\,\big ) -\partial _{\varepsilon }{}\big (\,h_{j\rho }\,\hat{{\mathscr {Q}}}^{\rho \mu \nu }\,\partial _{\nu }{u_{\mu }}\,\big ) =\hat{{\mathscr {T}}}^{\mu \nu }\,\partial _{\nu }{h_{j\mu }}, \end{aligned}$$
(485)

where the “hat” is used to denote that a factor \(\varepsilon ^{2}\) has been absorbed into the definition of the moments; i.e.,

$$\begin{aligned} \big \{\,\hat{{\mathscr {J}}},\hat{{\mathscr {H}}}^{\nu },\hat{{\mathscr {K}}}^{\mu \nu },\ldots \,\big \} =\varepsilon ^{2}\,\big \{\,{\mathscr {J}},{\mathscr {H}}^{\nu },{\mathscr {K}}^{\mu \nu },\ldots \,\big \}. \end{aligned}$$
(486)

Note that neither Eq. (484) nor Eq. (485) are local conservation laws. Therefore, a numerical method based on these equations requires care in the discretization process to achieve neutrino number, energy, and momentum conservation. (Neutrino energy and momentum contribute to the ADM mass.)

First, note that by dividing Eq. (484) by \(\varepsilon \) results in

$$\begin{aligned} \partial _{\nu }{\hat{{\mathscr {N}}}^{\nu }} - \partial _{\varepsilon }{}\big (\,\hat{{\mathscr {T}}}^{\mu \nu }\,\partial _{\nu }{u_{\mu }}\,\big ) = 0, \end{aligned}$$
(487)

which is a local phase-space conservation law for the spectral number density. In arriving at Eq. (487), the remainder after bringing \(\varepsilon ^{-1}\) inside the energy derivative in Eq. (484) cancels with the right-hand side. This is exactly what the discretization of the energy derivative term developed by Müller et al. (2010) (discussed in Sect. 6.5.2) is designed to do in order to achieve lepton number conservation.

On the other hand,

$$\begin{aligned} -n_{\mu }\hat{{\mathscr {T}}}^{\mu \nu } = \big (\,\hat{{\mathscr {E}}}n^{\nu }+\hat{{\mathscr {F}}}^{\nu }\,\big ) = W\,\big (\,\hat{{\mathscr {J}}}u^{\nu }+\hat{{\mathscr {H}}}^{\nu }\,\big ) + v^{j}\,\big (\,\hat{{\mathscr {H}}}_{j}u^{\nu }+\hat{{\mathscr {K}}}_{j}^{\nu }\,\big ), \end{aligned}$$
(488)

where both the Eulerian and Lagrangian decompositions of \(\hat{{\mathscr {T}}}^{\mu \nu }\) are used; cf. Eqs. (88) and (99), respectively. Thus, by adding W times Eq. (484) and the contraction of \(v^{j}\) with Eq. (485) gives

$$\begin{aligned} \partial _{\nu }{}\big (\,\hat{{\mathscr {E}}}n^{\nu }+\hat{{\mathscr {F}}}^{\nu }\,\big ) -\partial _{\varepsilon }{}\big (\,(-n_{\rho })\,\hat{{\mathscr {Q}}}^{\rho \mu \nu }\,\partial _{\nu }{u_{\mu }}\,\big ) = 0, \end{aligned}$$
(489)

which is a local phase-space conservation law for the spectral energy density. When arriving at Eq. (489), the remainders after bringing W inside the spacetime derivative in Eq. (484) and \(v^{j}\) inside the spacetime derivative of Eq. (485) cancel with the terms due to the sources on the right-hand sides of Eqs. (484) and (485) in a nontrivial way:

$$\begin{aligned}&\big (\,\hat{{\mathscr {J}}}u^{\nu }+\hat{{\mathscr {H}}}^{\nu }\,\big )\,\partial _{\nu }{W} +\big (\,\hat{{\mathscr {H}}}_{j}\,u^{\nu }+\hat{{\mathscr {K}}}_{j}^{\nu }\,\big )\,\partial _{\nu }{v^{j}} -\hat{{\mathscr {T}}}^{\mu \nu }\,\big (\,W\partial _{\nu }{u_{\mu }}-v^{j}\partial _{\nu }{h_{j\mu }}\,\big ) \nonumber \\&\quad =-\big (\,u_{\mu }\,\partial _{\nu }{W}-h_{j\mu }\,\partial _{\nu }{v^{j}}+W\partial _{\nu }{u_{\mu }}-v^{j}\partial _{\nu }{h_{j\mu }}\,\big )\,\hat{{\mathscr {T}}}^{\mu \nu } \nonumber \\&\quad =-\partial _{\nu }{}\big (\,W\,u_{\mu }-h_{j\mu }\,v^{j}\,\big )\,\hat{{\mathscr {T}}}^{\mu \nu } = - \hat{{\mathscr {T}}}^{\mu \nu }\,\partial _{\nu }{n_{\mu }} = 0, \end{aligned}$$
(490)

since, in special relativity, \(n_{\mu }=(-1,0,0,0)\). Similarly,

$$\begin{aligned} \gamma _{j\mu }\,\hat{{\mathscr {T}}}^{\mu \nu } = \big (\,\hat{{\mathscr {F}}}_{j}\,n^{\nu }+\hat{{\mathscr {S}}}_{j}^{\nu }\,\big ) = Wv_{j}\,\big (\,\hat{{\mathscr {J}}}u^{\nu }+\hat{{\mathscr {H}}}^{\nu }\,\big ) + \big (\,\hat{{\mathscr {H}}}_{j}u^{\nu }+\hat{{\mathscr {K}}}_{j}^{\nu }\,\big ). \end{aligned}$$
(491)

Then, by adding \(Wv_{j}\) times Eqs. (484) and (485) one obtains

$$\begin{aligned} \partial _{\nu }{}\big (\,\hat{{\mathscr {F}}}_{j}\,n^{\nu }+\hat{{\mathscr {S}}}_{j}^{\nu }\,\big ) -\partial _{\varepsilon }{}\big (\,\gamma _{j\rho }\,\hat{{\mathscr {Q}}}^{\rho \mu \nu }\,\partial _{\nu }{u_{\mu }}\,\big ) = 0, \end{aligned}$$
(492)

which is a local conservation law for the spectral momentum density. Again, in arriving at Eq. (492), the remainder after bringing \(Wv_{j}\) inside the spacetime derivative in Eq. (484) cancels with the sources in Eqs. (484) and (485) in a nontrivial way:

$$\begin{aligned}&\big (\,\hat{{\mathscr {J}}}u^{\nu }+\hat{{\mathscr {H}}}^{\nu }\,\big )\,\partial _{\nu }{}\big (Wv_{j}\big ) -Wv_{j}\,\hat{{\mathscr {T}}}^{\mu \nu }\,\partial _{\nu }{u_{\mu }} + \hat{{\mathscr {T}}}^{\mu \nu }\,\partial _{\nu }{h_{j\mu }} \nonumber \\&\quad =-\big (\,u_{\mu }\,\partial _{\nu }{}\big (Wv_{j}\big ) + Wv_{j}\,\partial _{\nu }{u_{\mu }} - \partial _{\nu }{h_{j\mu }}\,\big )\,\hat{{\mathscr {T}}}^{\mu \nu } \nonumber \\&\quad =-\partial _{\nu }{}\big (\,Wv_{j}u_{\mu }-h_{j\mu }\,\big )\,\hat{{\mathscr {T}}}^{\mu \nu } =\hat{{\mathscr {T}}}^{\mu \nu }\,\partial _{\nu }{g_{j\mu }}=0, \end{aligned}$$
(493)

since, in special relativity and with Cartesian coordinates, \(\partial _{\nu }{g_{j\mu }}=0\).

Equations (490) and (493) can be viewed as constraints. Since the discretizations of Eqs. (484) and (485) are unlikely to satisfy these constraints, they are inconsistent with energy conservation in the sense of Eq. (489) and momentum conservation in the sense of Eq. (492). In the fully relativistic case, one is faced with the same issue, namely that the discretization of the Lagrangian two-moment model [Eqs. (116) and (118)] is to a certain degree inconsistent with the discretization of the Eulerian two-moment model [Eqs. (109) and (114)]. Since it is the Eulerian moments that enter into the definition of the ADM mass, this inconsistency can propagate and manifest itself as violations of ADM mass conservation. On the other hand, by using the Eulerian two-moment model as the starting point for a numerical method—e.g., as in Kuroda et al. (2016)—it may be easier to control \(\varDelta M_{\text{ ADM }}\). (The time evolution of the ADM mass reported by Kuroda et al. (2016) and Müller et al. (2010) are indeed quite different.) However, while the use of the Eulerian two-moment model may provide an advantage with regard to controlling energy conservation, one is still left with the equally challenging task of maintaining consistency with the number equation [Eq. (123)] and controlling lepton number conservation, as discussed in detail by Cardall et al. (2013b), and in this case violations of lepton number conservation in the sense of Eq. (483) may still result.

We conclude this section by discussing number, energy, and momentum conservation in the context of the \({\mathscr {O}}(v/c)\) limit of the relativistic Lagrangian two-moment model discussed above, implemented by Just et al. (2015) and Skinner et al. (2019). (Note, we use units in which \(c=1\).) The energy equation, Eq. (484), is then given by

$$\begin{aligned} \partial _{t}{}\big (\,\hat{{\mathscr {J}}}+\varTheta \,v^{i}\hat{{\mathscr {H}}}_{i}\,\big ) + \partial _{i}{}\big (\,\hat{{\mathscr {H}}}^{i}+v^{i}\hat{{\mathscr {J}}}\,\big ) - \partial _{\varepsilon }{}\big (\,\varepsilon \,\hat{{\mathscr {K}}}^{i}_{k}\,\partial _{i}{v^{k}}\,\big ) = - \hat{{\mathscr {K}}}^{i}_{k}\,\partial _{i}{v^{k}}, \end{aligned}$$
(494)

while the momentum equation, Eq. (485), is given by

$$\begin{aligned} \partial _{t}{}\big (\,\hat{{\mathscr {H}}}_{j}+\varTheta \,v^{i}\hat{{\mathscr {K}}}_{ij}\,\big ) + \partial _{i}{}\big (\,\hat{{\mathscr {K}}}^{i}_{j}+v^{i}\hat{{\mathscr {H}}}_{j}\,\big ) - \partial _{\varepsilon }{}\big (\,\varepsilon \,\hat{{\mathscr {L}}}^{i}_{kj}\,\partial _{i}{v^{k}}\,\big ) = - \hat{{\mathscr {H}}}^{i}\,\partial _{i}{v_{j}}. \end{aligned}$$
(495)

For simplicity, we ignore terms proportional to the time derivative of the fluid three-velocity, which is a reasonable approximation. In Eqs. (494) and (495), we introduced a constant parameter, \(\varTheta \), that is either zero or one. For \(\varTheta =0\), the two-moment model reduces to the one solved by Just et al. (2015) and by Skinner et al. (2019). However, when \(\varTheta =1\), as we will show below, the two-moment model is better aligned with number, energy, and momentum conservation.

First, by dividing Eq. (494) with the particle energy \(\varepsilon \) and rearranging, one obtains

$$\begin{aligned} \partial _{t}{}\big (\,\hat{{\mathscr {D}}}+\varTheta \,v^{i}\hat{{\mathscr {I}}}_{i}\,\big ) + \partial _{i}{}\big (\,\hat{{\mathscr {I}}}^{i}+v^{i}\hat{{\mathscr {D}}}\,\big ) - \partial _{\varepsilon }{}\big (\,\hat{{\mathscr {K}}}^{i}_{k}\,\partial _{i}{v^{k}}\,\big ) = 0, \end{aligned}$$
(496)

which is a local conservation law for the spectral number density \(\hat{{\mathscr {D}}}+\varTheta \,v^{i}\hat{{\mathscr {I}}}_{i}\). Note that, when \(\varTheta =0\), it is the Lagrangian number density defined in Eq. (87) that is conserved, which is incorrect in the \({\mathscr {O}}(v/c)\) limit. On the other hand, when \(\varTheta =1\), Eq. (496) is a conservation law for the \({\mathscr {O}}(v/c)\) approximation of the Eulerian number density defined in Eq. (97), which is conserved.

Next, we consider energy and momentum conservation. Following the approach in the relativistic case, by adding Eq. (494) and the contraction of \(v^{j}\) with Eq. (495) one obtains

$$\begin{aligned}&\partial _{t}{}\big (\hat{{\mathscr {J}}}+(1+\varTheta )\,v^{i}\hat{{\mathscr {H}}}_{i}\big ) +\partial _{i}{}\big (\hat{{\mathscr {H}}}^{i}+v^{i}\hat{{\mathscr {J}}}+v^{j}\hat{{\mathscr {K}}}^{i}_{j}\big ) \nonumber \\&\qquad - \partial _{\varepsilon }{}\big (\,\varepsilon \,\hat{{\mathscr {K}}}^{i}_{k}\,\partial _{i}{v^{k}}\,\big ) ={\mathscr {O}}(v^{2}), \end{aligned}$$
(497)

which, to \({\mathscr {O}}(v/c)\), is a local conservation law for the Eulerian spectral energy density \(\hat{{\mathscr {J}}}+(1+\varTheta )\,v^{i}\hat{{\mathscr {H}}}_{i}\). With \(\varTheta =1\), this is the correct \({\mathscr {O}}(v/c)\) limit of the Eulerian energy density in Eq. (101). Terms of higher order in the fluid velocity have been moved to the right-hand side of Eq. (497), which must remain small for the \({\mathscr {O}}(v/c)\) limit to be valid. Also note that, with \(\varTheta =0\), energy conservation breaks down to leading order in the fluid three-velocity (a factor of 2 should appear in the coefficient of the second term inside the parentheses of the time derivative). Similarly, by adding \(v_{j}\) times Eqs. (494) and (495) one obtains

$$\begin{aligned}&\partial _{t}{}\big (\hat{{\mathscr {H}}}_{j}+v_{j}\hat{{\mathscr {J}}}+\varTheta \,v^{i}\hat{{\mathscr {K}}}_{ij}\big ) + \partial _{i}{}\big (\,\hat{{\mathscr {K}}}^{i}_{j}+\hat{{\mathscr {H}}}^{i}v_{j}+v^{i}\hat{{\mathscr {H}}}_{j}\,\big ) \end{aligned}$$
(498)
$$\begin{aligned}&\qquad - \partial _{\varepsilon }{}\big (\varepsilon \,\hat{{\mathscr {L}}}^{i}_{kj}\,\partial _{i}{v^{k}}\big ) ={\mathscr {O}}(v^{2}), \end{aligned}$$
(499)

which, to \({\mathscr {O}}(v/c)\), is a local conservation law for the Eulerian spectral momentum density \(\hat{{\mathscr {H}}}_{j}+v_{j}\hat{{\mathscr {J}}}+\varTheta \,v^{i}\hat{{\mathscr {K}}}_{ij}\). Again, with \(\varTheta =1\), this is the correct \({\mathscr {O}}(v/c)\) limit of the Eulerian momentum density equation, Eq. (102).

Thus, at the expense of some additional computational complexity, by letting \(\varTheta =1\) in Eqs. (494) and (495), the two-moment model becomes consistent with number, energy, and momentum conservation in the \({\mathscr {O}}(v/c)\) limit.

6.6 One-moment kinetics

6.6.1 Newtonian-gravity, O(v/c), finite-difference implementation

One moment kinetics is typically deployed in the context of neutrino transport in core-collapse supernovae using the multigroup (i.e., multi-frequency) flux-limited diffusion approximation (MGFLD). Such MGFLD approaches solve the neutrino and antineutrino moment equations for the zeroth moment of the distribution function, the multigroup neutrino/antineutrino energy density, with closure provided at the level of the first moment, the neutrino/antineutrino energy flux, via a diffusion-like equation, modified in such a way that the flux cannot be come superluminal (flux limiting). Swesty and Myra (2009) were the first to implement such an approach in axisymmetric simulations of core-collapse supernovae. The equations for the neutrino/antineutrino multigroup energy densities used by Swesty and Myra are expressed as

$$\begin{aligned} \frac{\partial E_{\varepsilon }}{\partial t} + {\nabla } \cdot \left( E_{\varepsilon } \mathbf{v} \right) + {\nabla } \cdot \mathbf{F}_{\varepsilon } - \varepsilon \frac{\partial }{\partial \varepsilon } \left( {\mathsf P}_{\varepsilon }: {\nabla } \mathbf{v} \right)= & {} {{\mathbb {S}}}_{\varepsilon }, \end{aligned}$$
(500)
$$\begin{aligned} \frac{\partial {\bar{E}}_{\varepsilon }}{\partial t} + {\nabla } \cdot \left( {\bar{E}}_{\varepsilon } \mathbf{v} \right) + {\nabla } \cdot \bar{\mathbf{F}}_{\varepsilon } - \varepsilon \frac{\partial }{\partial \varepsilon } \left( \bar{{{\mathsf {P}}}}_{\varepsilon }: {\nabla } \mathbf{v} \right)= & {} \bar{{{\mathbb {S}}}}_{\varepsilon }, \end{aligned}$$
(501)

where \(E_\varepsilon \) and \({\bar{E}}_\varepsilon \) are the neutrino and antineutrino energy densities per group, \(P_\varepsilon \) and \({\bar{P}}_\varepsilon \) are the neutrino and antineutrino stress tensors, and \({{\mathbb {S}}}_\varepsilon \) and \(\bar{{{\mathbb {S}}}}_\varepsilon \) are the neutrino and antineutrino matter couplings, respectively. The energy flux in both equations is given by a Fick’s-like relation of the form

$$\begin{aligned} \mathbf{F}_\varepsilon \equiv -D_\varepsilon {\nabla } E_\varepsilon , \end{aligned}$$
(502)

where

$$\begin{aligned} D_\varepsilon = \frac{c}{3\kappa ^T_\varepsilon } \end{aligned}$$
(503)

is the diffusion coefficient, and \(\kappa _\varepsilon ^T\) is the total opacity. In flux-limited diffusion schemes, the diffusion coefficient \(D_\varepsilon \) is modified. A general form for such a modified diffusion coefficient is given by

$$\begin{aligned} D_\varepsilon \equiv \frac{c \lambda _\varepsilon ({R_\epsilon })}{\kappa ^T_\varepsilon }. \end{aligned}$$
(504)

In particular, the so-called Levermore–Pomraning flux limiter (Levermore and Pomraning 1981) is given by

$$\begin{aligned} \lambda _\varepsilon ({R_\epsilon }) \equiv \frac{2 + {R_\epsilon }}{6 + 3{R_\epsilon }+ {R_\epsilon }^2}, \end{aligned}$$
(505)

where \(R_\varepsilon \) is the radiation Knudsen number, which is the ratio of the mean free path to some characteristic length scale in the problem. The Knudsen number is written as

$$\begin{aligned} {R_\epsilon }\equiv \frac{\left| {\nabla } E_\varepsilon \right| }{\kappa ^T_\varepsilon E_\varepsilon }. \end{aligned}$$
(506)

Note that the Knudsen number is different for different energy groups given that the opacities are typically (and for neutrinos in core-collapse supernovae, definitely) energy dependent. The radiation stress tensor takes the typical form

$$\begin{aligned} {{\mathsf {P}}}_\varepsilon \equiv {{\mathsf {X}}}_\varepsilon E_\varepsilon , \end{aligned}$$
(507)

where

$$\begin{aligned} {{\mathsf {X}}}_\varepsilon \equiv \frac{1}{2} \left( 1-\chi _\varepsilon \right) {{\mathsf {I}}} + \frac{1}{2} \left( 3 \chi _\varepsilon - 1 \right) \mathbf{n}{} \mathbf{n}, \end{aligned}$$
(508)

where \(\chi _\varepsilon \) is the scalar Eddington factor, which in the case of the Levermore–Pomraning flux-limiting scheme becomes

$$\begin{aligned} \chi _\varepsilon = \lambda _\varepsilon ({R_\epsilon }) + \left\{ \lambda _\varepsilon ({R_\epsilon })\right\} ^2 \, {R_\epsilon }^2. \end{aligned}$$
(509)

Given the choice of Levermore-Pomraning flux limiting, the evolution equations (500) and (501) become

$$\begin{aligned} \frac{\partial E_{\varepsilon }}{\partial t} + {\nabla } \cdot \left( E_{\varepsilon } \mathbf{v} \right) - {\nabla } \cdot (D_\varepsilon {\nabla } E_{\varepsilon }) - \varepsilon \frac{\partial }{\partial \varepsilon } \left\{ ({{\mathsf {X}}}_{\varepsilon } E_\varepsilon ): {\nabla } \mathbf{v} \right\}= & {} {{\mathbb {S}}}_{\varepsilon }, \end{aligned}$$
(510)
$$\begin{aligned} \frac{\partial {\bar{E}}_{\varepsilon }}{\partial t} + {\nabla } \cdot \left( {\bar{E}}_{\varepsilon } \mathbf{v} \right) - {\nabla } \cdot ({\bar{D}}_\varepsilon {\nabla } {\bar{E}}_{\varepsilon }) - \varepsilon \frac{\partial }{\partial \varepsilon } \left\{ (\bar{{{\mathsf {X}}}}_{\varepsilon } {\bar{E}}_\varepsilon ): {\nabla } \mathbf{v} \right\}= & {} \bar{{{\mathbb {S}}}}_{\varepsilon }. \end{aligned}$$
(511)

Swesty and Myra note, these equations are not in conservative form. They opt to monitor conservation of lepton number and energy after the fact. The degree to which they achieve either was not documented. Their equations are operator split as follows (written here for just the neutrinos, not the antineutrinos):

$$\begin{aligned} \left[\!\!\left[\frac{ \partial E_\varepsilon }{\partial t}\right]\!\!\right]_{\mathrm{total}} = \left[\!\!\left[\frac{ \partial E_\varepsilon }{\partial t}\right]\!\!\right]_{\mathrm{advection}} + \left[\!\!\left[\frac{ \partial E_\varepsilon }{\partial t}\right]\!\!\right]_{\text {diff-coll}}, \end{aligned}$$
(512)

where

$$\begin{aligned} \left[\!\!\left[\frac{ \partial E_\varepsilon }{\partial t}\right]\!\!\right]_{\mathrm{advection}}= & {} -{\nabla } \cdot (E_\varepsilon \mathbf{v}), \end{aligned}$$
(513)
$$\begin{aligned} \left[\!\!\left[\frac{ \partial E_\varepsilon }{\partial t}\right]\!\!\right]_\text {diff-coll}= & {} {\nabla } \cdot (D_\varepsilon {\nabla } E_{\varepsilon }) + \varepsilon \frac{\partial }{\partial \varepsilon } \left\{ ({{\mathsf {X}}}_{\varepsilon } E_\varepsilon ): {\nabla } \mathbf{v} \right\} + {{\mathbb {S}}}_{\varepsilon }. \end{aligned}$$
(514)

For the purpose of describing their numerical method used to treat each of the operator split equations shown above, Swesty and Myra note, first, that the advection equations take the general form

$$\begin{aligned} \left[\!\!\left[\frac{ \partial \psi }{\partial t} \right]\!\!\right]_\mathrm{advection} + {\nabla } \cdot \left( \psi \mathbf{v} \right) = 0, \end{aligned}$$
(515)

where \(\psi \) is the scalar field (\(E_\varepsilon \) and \({\bar{E}}_\varepsilon \)) being advected. They then deploy the ZEUS consistent advection scheme of Stone and Norman (1992) in a directionally-split manner to each dimension (in their case, \(x_1\) and \(x_2\)) of the problem. For the \(x_1\) update, Eq. (515) is discretized as follows:

$$\begin{aligned}&{ \frac{ \left[ \varDelta V \right] _{i+(1/2),j+(1/2)}}{\varDelta t} \left( \left[ \psi \right] ^{n+\beta }_{i+(1/2),j+(1/2)} - \left[ \psi \right] ^{n+\alpha }_{i+(1/2),j+(1/2)} \right) } \nonumber \\&\quad = - \left( \left[ F_1(\psi ) \right] _{i+1,j+(1/2)}^{n+\alpha } \left[ \varDelta A_1 \right] _{i+1,j+(1/2)} - \left[ F_1(\psi ) \right] _{i,j+(1/2)}^{n+\alpha } \left[ \varDelta A_1 \right] _{i,j+(1/2)} \right) .\nonumber \\ \end{aligned}$$
(516)

The fluxes in Eq. (516) are given by

$$\begin{aligned} \left[ F_1(\psi ) \right] _{i,j+(1/2)} = \left[ \mathcal{I}_1\left( \frac{\psi }{\rho }\right) \right] _{i,j+(1/2)} \left[ F_1(\rho ) \right] _{i,j+(1/2)}, \end{aligned}$$
(517)

where

$$\begin{aligned} \left[ F_{1}(\rho ) \right] _{i,j+(1/2)} = \left[ {{{\mathscr {I}}}}_1(\rho ) \right] _{i,j+(1/2)} \left[ {\upsilon }_{1} \right] _{i,j+(1/2)}, \end{aligned}$$
(518)

and where

$$\begin{aligned}&\left[ {{{\mathscr {I}}}}_1(\psi ) \right] _{i,j+(1/2)} \nonumber \\&\quad = \left\{ \begin{array}{ll} \displaystyle { \left[ \psi \right] _{i-(1/2),j+(1/2)} + \left[ \delta _1(\psi ) \right] _{i-(1/2),j+(1/2)} \left( 1 - \frac{\left[ {\upsilon }_{1}\right] _{i,j+(1/2)} \varDelta t }{\left[ x_1\right] _{i} - \left[ x_1\right] _{i-1} } \right) } &{} \text {if} \; \left[ {\upsilon }_1 \right] _{i,j+(1/2)} > 0, \\ \displaystyle { \left[ \psi \right] _{i+(1/2),j+(1/2)} - \left[ \delta _1(\psi ) \right] _{i+(1/2),j+(1/2)} \left( 1 + \frac{ \left[ {\upsilon }_{1} \right] _{i,j+(1/2)} \varDelta t}{\left[ x_1\right] _{i+1} - \left[ x_1\right] _{i} } \right) } &{} \text {if} \; \left[ {\upsilon }_1 \right] _{i,j+(1/2)} < 0. \\ \end{array}\right. \nonumber \\ \end{aligned}$$
(519)

In Eq. (518), \(\rho \) is the fluid mass density. \(\mathcal{I}_1(\psi )\) is the van Leer monotonic upwind advection function (Van Leer 1977), given by

$$\begin{aligned}&\left[ \delta _1(\psi ) \right] _{i+(1/2),j+(1/2)}\nonumber \\&\quad = \left\{ \begin{array}{ll} \displaystyle { \frac{ \left[ \varDelta \psi \right] _{i,j+(1/2)} \left[ \varDelta \psi \right] _{i+1,j+(1/2)} }{\left[ \psi \right] _{i+(3/2),j+(1/2)} - \left[ \psi \right] _{i-(1/2),j+(1/2)}}} &{}\quad {\text {if}} \; \left[ \varDelta \psi \right] _{i,j+(1/2)} \left[ \varDelta \psi \right] _{i+1,j+(1/2)} > 0, \\ 0 &{}\quad {\text {otherwise}}, \end{array}\right. \nonumber \\ \end{aligned}$$
(520)

where

$$\begin{aligned} \left[ \varDelta \psi \right] _{i,j+(1/2)} = \left[ \psi \right] _{i+(1/2),j+(1/2)} - \left[ \psi \right] _{i-(1/2),j+(1/2)}. \end{aligned}$$
(521)

The \(x_2\) update is computed in the same way, with the obvious substitutions.

The remaining term, due to neutrino diffusion, relativistic effects, and collisions:

$$\begin{aligned} \left[\!\!\left[\frac{\partial (^eE_{\varepsilon })}{\partial t} \right]\!\!\right]_{\mathrm{diff-coll}} - {\nabla } \cdot \left( ^eD_\varepsilon ^eE_{\varepsilon } \right) - \varepsilon \frac{\partial }{\partial \varepsilon } \left( ^e{{\mathsf {P}}}_{\varepsilon }: {\nabla } \mathbf{v} \right) - ^e{{\mathbb {S}}}_{\varepsilon } = 0 \end{aligned}$$
(522)

is differenced implicitly in time and as follows in phase space:

$$\begin{aligned}&\frac{ \left[ E_\varepsilon \right] ^{n+1}_{k+(1/2),i+(1/2),j+(1/2)} - \left[ E_\varepsilon \right] ^n_{k+(1/2),i+(1/2),j+(1/2)}}{\varDelta t}- \left[ {\nabla } \cdot D_\varepsilon \nabla E_{\varepsilon } \right] ^{n+1}_{k+(1/2),i+(1/2),j+(1/2)} \nonumber \\&\qquad -\left[ \varepsilon \frac{\partial \left( {\mathsf P}_{\varepsilon }: {\nabla } \mathbf{v}\right) }{\partial \varepsilon } \right] ^{n+1}_{k+(1/2),i+(1/2),j+(1/2)} -\left[ {\mathbb S}_{\varepsilon } \right] ^{n+1}_{k+(1/2),i+(1/2),j+)1/2)} = 0 , \nonumber \\ \end{aligned}$$
(523)

where

$$\begin{aligned}&{ \left[ {\nabla } \cdot D_\varepsilon \nabla E_{\varepsilon } \right] ^{n+1}_{k+(1/2),i+(1/2),j+(1/2)} } \nonumber \\&\quad \equiv \frac{1}{ \left[ g_2 \right] _{i+(1/2)} \left[ g_{31} \right] _{i+(1/2)} \left[ g_{32} \right] _{j+(1/2)}} \left\{ \frac{1}{ \left[ x_1 \right] _{i+(3/2)} - \left[ x_1 \right] _{i+(1/2)}} \right. \Big . \nonumber \\&\quad \quad \Big . \times \left( \left[ g_2 \right] _{i+1} \left[ g_{31} \right] _{i+1} \left[ g_{32} \right] _{j+(1/2)} \left[ D_\varepsilon (x_1) \right] ^{n+t}_{k+(1/2),i+1,j+(1/2)} \Big . \right. \nonumber \\&\quad \quad \Big . \left. \times \frac{ \left[ E_\varepsilon \right] ^{n+1}_{k+(1/2),i+(3/2),j+(1/2)} - \left[ E_\varepsilon \right] ^{n+1}_{k+(1/2),i+(1/2),j+(1/2)}}{ \left[ x_1 \right] _{i+(3/2)} - \left[ x_1 \right] _{i+(1/2)}} \Big . \right. \nonumber \\&\quad \quad - \Big . \left. \left[ g_2 \right] _{i} \left[ g_{31} \right] _{i} \left[ g_{32} \right] _{j+(1/2)} \left[ D_\varepsilon (x_1) \right] ^{n+t}_{k+(1/2),i,j+(1/2)} \Big . \right. \nonumber \\&\quad \quad \Big . \left. \times \frac{ \left[ E_\varepsilon \right] ^{n+1}_{k+(1/2),i+(1/2),j+(1/2)} - \left[ E_\varepsilon \right] ^{n+1}_{k+(1/2),i-(1/2),j+(1/2)}}{ \left[ x_1 \right] _{i+(1/2)} - \left[ x_1 \right] _{i-(1/2)}} \right) \Big . \nonumber \\&\quad \quad \Big . + \frac{1}{ \left[ x_2 \right] _{j+(3/2)} - \left[ x_2 \right] _{j+(1/2)}} \Big . \nonumber \\&\quad \quad \Big . \times \left( \frac{ \left[ g_{31} \right] _{i+(1/2)} \left[ g_{32} \right] _{j+1}}{ \left[ g_2 \right] _{i+(1/2)}} \left[ D_\varepsilon (x_2) \right] ^{n+t}_{k+(1/2),i+(1/2),j+1} \Big . \right. \nonumber \\&\quad \quad \Big . \left. \times \frac{ \left[ E_\varepsilon \right] ^{n+1}_{k+(1/2),i+(1/2),j+(3/2)} - \left[ E_\varepsilon \right] ^{n+1}_{k+(1/2),i+(1/2),j+(1/2)}}{\left[ x_2 \right] _{j+(3/2)} - \left[ x_2 \right] _{j+(1/2)}} \Big . \right. \nonumber \\&\quad \quad \Big . \left. - \frac{ \left[ g_{31} \right] _{i+(1/2)} \left[ g_{32} \right] _{j}}{ \left[ g_2 \right] _{i+(1/2)}} \left[ D_\varepsilon (x_2) \right] ^{n+t}_{k+(1/2),i+(1/2),j} \Big . \right. \nonumber \\&\quad \quad \left. \Big . \left. \times \frac{ \left[ E_\varepsilon \right] ^{n+1}_{k+(1/2),i+(1/2),j+(1/2)} - \left[ E_\varepsilon \right] ^{n+1}_{k+(1/2),i+(1/2),j-(1/2)}}{ \left[ x_2 \right] _{j+(1/2)} - \left[ x_2 \right] _{j-(1/2)}} \right) \right\} \end{aligned}$$
(524)

and

$$\begin{aligned}&{ \left[ \varepsilon \frac{\partial \left( {\mathsf P}_{\varepsilon }: {\nabla } \mathbf{v}\right) }{\partial \varepsilon } \right] ^{n+1}_{k+(1/2),i+(1/2),j+(1/2)} \equiv \frac{\left[ \varepsilon \right] _{k+(1/2)}}{\left[ \varepsilon \right] _{k+1} - \left[ \varepsilon \right] _k} } \nonumber \\&\qquad \times \big ( \left[ {\mathsf {X}}_{\varepsilon }:{\nabla } \mathbf{v} \right] ^{n+t}_{k+(3/2),i+(1/2),j+(1/2)} \left[ E_\varepsilon \right] ^{n+1}_{k+(3/2),i+(1/2),j+(1/2)} \nonumber \\&\qquad - \left[ {\mathsf {X}}_{\varepsilon }:{\nabla } \mathbf{v} \right] ^{n+t}_{k-(1/2),i+(1/2),j+(1/2)} \left[ E_\varepsilon \right] ^{n+1}_{k-(1/2),i+(1/2),j+(1/2)} \big ) \end{aligned}$$
(525)

and

$$\begin{aligned}&{ \left[ {{\mathbb {S}}}_{\varepsilon } \right] ^{n+1}_{k+(1/2),i+(1/2),j+(1/2)} } \nonumber \\&\quad \equiv -\left[ S_\varepsilon \right] ^{n+t}_{k+(1/2),i+(1/2),j+(1/2)} \left( 1 + \frac{\eta \alpha }{\left( \left[ \varepsilon \right] _{k+(1/2)}\right) ^3} \left[ E_\varepsilon \right] ^{n+1}_{k+(1/2),i+(1/2),j+(1/2)} \right) \nonumber \\&\qquad + c \left[ \kappa ^a_\varepsilon \right] ^{n+t}_{k+(1/2),i+(1/2),j+(1/2)} \left[ E_\varepsilon \right] ^{n+1}_{k+(1/2),i+(1/2),j+(1/2)} \nonumber \\&\qquad - \left( 1 + \frac{\eta \alpha }{\left( \left[ \varepsilon \right] _{k+(1/2)}\right) ^3} \left[ E_\varepsilon \right] ^{n+1}_{k+(1/2),i+(1/2),j+(1/2)} \right) \left[ \varepsilon \right] _{k+(1/2)} \nonumber \\&\qquad \times \sum _{\ell =0}^{N_g-1} \left[ \varDelta \varepsilon \right] _{\ell +(1/2)} \left[ G\right] ^{n+t}_{k+(1/2),\ell +(1/2),i+(1/2),j+(1/2)} \nonumber \\&\qquad \left( 1 + \frac{\eta \alpha }{\left( \left[ \varepsilon \right] _{\ell +(1/2)}\right) ^3} \left[ {\bar{E}}_\varepsilon \right] ^{n+1}_{\ell +(1/2),i+(1/2),j+(1/2)} \right) \nonumber \\&\qquad - c\left( 1 + \frac{\eta \alpha }{\left( \left[ \varepsilon \right] _{k+(1/2)}\right) ^3} \left[ E_\varepsilon \right] ^{n+1}_{k+(1/2),i+(1/2),j+(1/2)} \right) \nonumber \\&\qquad \times \sum _{\ell =0}^{N_g-1} \left[ \varDelta \varepsilon \right] _{\ell +(1/2)} \left[ \kappa ^s\right] ^{n+t}_{k+(1/2),\ell +(1/2),i+(1/2),j+(1/2)} \left[ E_\varepsilon \right] ^{n+1}_{\ell +(1/2),i+(1/2),j+(1/2)} \nonumber \\&\qquad + c\left[ E_\varepsilon \right] ^{n+1}_{k+(1/2),i+(1/2),j+(1/2)} \nonumber \\&\qquad \times \sum _{\ell =0}^{N_g-1} \left[ \varDelta \varepsilon \right] _{\ell +(1/2)} \left[ \kappa ^s\right] ^{n+t}_{\ell +(1/2),k+(1/2),i+(1/2),j+(1/2)}\nonumber \\&\qquad \left( 1 + \frac{\eta \alpha }{\left( \left[ \varepsilon \right] _{\ell +(1/2)}\right) ^3} \left[ E_\varepsilon \right] ^{n+1}_{\ell +(1/2),i+(1/2),j+(1/2)} \right) . \nonumber \\ \end{aligned}$$
(526)

In Eq. (524), the factors \(g_2\), \(g_{31}\), and \(g_{32}\) derive from the 3-covariant form of the spatial metric used by Swesty and Myra, which is given by

$$\begin{aligned} ds^2 = (g_1)^2 dx_1^2 + (g_2)^2 dx_2^2 + (g_{31} g_{32})^2 dx_3^2 \end{aligned}$$
(527)

and is written to accommodate Cartesian, cylindrical, and spherical coordinates. In Eq. (526), \(\kappa ^a\) and \(\kappa ^s\) are the absorption and scattering opacities, respectively, and \(G(\varepsilon ,\varepsilon ^{\prime })\) is the pair annihilation kernel. The factors \(\alpha \) and \(\eta \) are constants. \(N_g\) is the number of energy groups, and the superscript \(n+t\), with t taking on different values, designates the update stages for the electron, muon, and tau neutrino distributions in the overall update scheme used by Swesty and Myra, shown in their Figure 3. To solve Eq. (523) and its counterpart for antineutrinos, simultaneously, given their coupling, Swesty and Myra implement the Newton–BiCGSTAB subclass of Newton–Krylov iterative methods. Equation (523) and its antineutrino counterpart are first linearized. BiCGSTAB is used for a solution to the resultant “inner” linear system of equations for the updates to the iterates of the outer Newton iteration.

Once the quantities \(^l{\mathbb {S}}_\varepsilon \), where \(\ell \) denotes neutrino flavor, are known from the solution of Eq. (523) and its counterparts for heavy-flavor neutrinos, Swesty and Myra then update the fluid electron fraction and energy density using the following operator split equations:

$$\begin{aligned} \left[\!\!\left[\frac{\partial n_e}{\partial t}\right]\!\!\right]_{{\mathrm{collision}}}= & {} - \int \frac{1}{\varepsilon } \left( ^e{\mathbb S}_\varepsilon - ^e\bar{{{\mathbb {S}}}}_\varepsilon \right) d\varepsilon , \end{aligned}$$
(528)
$$\begin{aligned} \left[\!\!\left[\left( \frac{\partial E}{\partial t} \right) \right]\!\!\right]_{\mathrm{collision-\ell }}= & {} - \int \left( { ^\ell {{\mathbb {S}}}_\varepsilon + ^\ell \bar{{{\mathbb {S}}}}_\varepsilon }\right) d\varepsilon , \end{aligned}$$
(529)

where \(n_e\) is the electron number density and E is the matter energy density. Equation (529) is solved in operator split fashion for each flavor. The discretizations for Eqs. (528) and (529) for electron-neutrino flavor neutrinos (where both lepton number and energy are exchanged) are:

$$\begin{aligned} \left[ n_e \right] ^{n+1}_{i+(1/2),j+(1/2)}= & {} \left[ n_e \right] ^{n+b}_{i+(1/2),j+(1/2)} \nonumber \\&- \varDelta t \sum _{\ell =0}^{N_g-1} \left[ \varDelta \varepsilon \right] _{\ell +(1/2)} \left( \frac{ \left[ ^e{{\mathbb {S}}}_\varepsilon \right] ^{n+b}_{i+(1/2),j+(1/2)} - \left[ ^e\bar{{{\mathbb {S}}}}_\varepsilon \right] ^{n+b}_{i+(1/2),j+(1/2)}}{\left[ \varepsilon \right] _{\ell +(1/2)}} \right) , \nonumber \\ \end{aligned}$$
(530)
$$\begin{aligned} \left[ E \right] ^{n+d}_{i+(1/2),j+(1/2)}= & {} \left[ E \right] ^{n+b}_{i+(1/2),j+(1/2)} \nonumber \\&- \varDelta t \sum _{\ell =0}^{N_g-1} \left[ \varDelta \varepsilon \right] _{\ell +(1/2)} \left( { \left[ ^e{\mathbb S}_\varepsilon \right] ^{n+c}_{i+(1/2),j+(1/2)} - \left[ ^e\bar{{\mathbb S}}_\varepsilon \right] ^{n+c}_{i+(1/2),j+(1/2)}} \right) .\nonumber \\ \end{aligned}$$
(531)

In a similar manner, the neutrino–matter momentum exchanged is computed.

6.6.2 General-relativistic, finite-difference implementation

A general relativistic implementation of MGFLD was developed by Rahman et al. (2019). They begin with the 3 + 1 metric:

$$\begin{aligned} \mathrm {d}s^2\equiv & {} g_{ab} \mathrm {d}x^a \mathrm {d}x^b \nonumber \\= & {} - \alpha ^2 \mathrm {d}t^2 + \gamma _{ij} (\mathrm {d}x^i + \beta ^i \mathrm {d}t)(\mathrm {d}x^j + \beta ^j \mathrm {d}t), \end{aligned}$$
(532)

and the following definitions of the comoving-frame spectral neutrino energy density, momentum density, and stress tensor:

$$\begin{aligned} {\mathscr {J}}(x^{\mu },\varepsilon )\equiv & {} \varepsilon ^3 \int f(x^{\mu },p^{{{\hat{\mu }}}})\,\mathrm {d}\varOmega , \nonumber \\ {\mathscr {H}}^{{{\hat{i}}}}(x^{\mu },\varepsilon )\equiv & {} \varepsilon ^3 \int l^{{{\hat{i}}}} f(x^{\mu },p^{{{\hat{\mu }}}})\,\mathrm {d}\varOmega , \nonumber \\ {\mathscr {K}}^{{{\hat{i}}} {{\hat{j}}}}(x^{\mu },\varepsilon )\equiv & {} \varepsilon ^3 \int l^{{{\hat{i}}}} l^{{{\hat{j}}}} f(x^{\mu },p^{{\hat{\mu }}})\,\mathrm {d}\varOmega , \end{aligned}$$
(533)

respectively. \(p^{{{\hat{\mu }}}}\equiv \varepsilon (1,l^{{{\hat{i}}}})\) denotes the comoving-frame, momentum-space coordinates. \(l^{{{\hat{i}}}}\) is a unit comoving-frame, momentum-space three-vector. With these definitions and choice of phase-space coordinates, Rahman et al. express the evolution equation for the comoving-frame neutrino energy density as given by Eq. (155) in Sect. 4.7.5. Given the approximations discussed there, the neutrino energy density equation solved by Rahman et al. becomes

$$\begin{aligned}&\frac{1}{\alpha } \frac{\partial }{\partial t} (W {{\hat{\mathscr {J}}}}) + \frac{1}{\alpha } \frac{\partial }{\partial x^j} [\alpha W (v^j-\beta ^j/\alpha ) {{\hat{\mathscr {J}}}}] \nonumber \\&\qquad - \frac{1}{\alpha } \frac{\partial }{\partial x^j} \Big [\alpha ^{-2} \sqrt{\gamma } \Big \{ \gamma ^{i k} + W \Big (\frac{W}{W+1}v^j-\beta ^j/\alpha \Big ) v^k \Big \} D \partial _k (\alpha ^3 {\mathscr {J}}) \Big ] \nonumber \\&\qquad - \frac{e^{k {{\hat{i}}}}}{\alpha ^4} \frac{\partial }{\partial t} (W \sqrt{\gamma } {{\bar{v}}}_{{{\hat{i}}}}) D \partial _k(\alpha ^3 {\mathscr {J}}) +{\hat{R}}_\varepsilon - \frac{\partial }{\partial \varepsilon } (\varepsilon {\hat{R}}_\varepsilon ) \nonumber \\&\quad = \kappa _\mathrm {a} ({{\hat{\mathscr {J}}}}^{eq}-{{\hat{\mathscr {J}}}}), \end{aligned}$$
(534)

where the relation \(e^{{\hat{j}}}_{{\hat{i}}}e^{k{\hat{i}}}=\gamma ^{jk}\) was used.

Rahman et al. divide the numerical update into three steps, operator splitting Eq. (534) into the source term, the radial and spectral shift terms, and the nonradial terms. In step 1, the focus is on the source term, and the corresponding terms in the matter specific internal energy and electron fraction equations. The set of equations to be solved is given by

$$\begin{aligned} \frac{W}{\alpha } \partial _t {\mathscr {J}}_{\nu ,\xi }= & {} \big [ \kappa _\mathrm {a} ({\mathscr {J}}^{\mathrm {eq}} - {\mathscr {J}}) \big ]_{\nu ,\xi }, \nonumber \\ \frac{W}{\alpha } \rho \partial _t e (T,Y_\mathrm {e})= & {} - \sum _{\nu ,\xi } \big [ \kappa _\mathrm {a} ({\mathscr {J}}^{\mathrm {eq}} - {\mathscr {J}} ) \varDelta \varepsilon _\xi \big ]_{\nu ,\xi }, \nonumber \\ \frac{W}{\alpha } \rho \partial _t Y_\mathrm {e}= & {} - m_u \sum _{\xi } \big [ \big [\kappa _\mathrm {a} ({\mathscr {J}}^{\mathrm {eq}} - {\mathscr {J}}) \varDelta \varepsilon _\xi \big ]_{\nu _\mathrm {e}} \nonumber \\&- \big [\kappa _\mathrm {a} ({\mathscr {J}}^{\mathrm {eq}} - {\mathscr {J}}) \varDelta \varepsilon _\xi \big ]_{{{\bar{\nu }}}_\mathrm {e}} \big ]_{\xi }, \end{aligned}$$
(535)

where \(\nu \) and \(\xi \) indicate the neutrino species and energy bin, respectively, and \(\varDelta \varepsilon _\xi \) the energy bin width. \(m_{u}\) is the atomic mass unit. These equations are differenced fully implicitly in time and solved using Newton–Raphson iteration. Linearization of the equations in \({\mathscr {J}}_{\nu ,\xi }\), e, and \(Y_e\) leads to a system of linear equations that must be solved for each iteration. To do so, Rahman et al. use a direct (LAPACK) solver. The quantities \(\rho \), \(\alpha \), W, and \(\kappa _a\) are all held constant during the Newton–Raphson procedure.

In step 2, the following equation is solved:

$$\begin{aligned}&W\partial _t {{\hat{\mathscr {J}}}} + {\mathscr {R}}_r = 0, \end{aligned}$$
(536)

where

$$\begin{aligned} {\mathscr {R}}_r\equiv & {} \partial _t (W) {{\hat{\mathscr {J}}}} + \partial _r [\alpha W(v^r-\beta ^r \alpha ^{-1}){{\hat{\mathscr {J}}}}] \nonumber \\&- \partial _r \Big [ \alpha ^{-2} \sqrt{\gamma } \Big \{ \gamma ^{rr} + W \Big (\frac{W}{W+1}v^r-\beta ^r \alpha ^{-1}\Big ) v^r \Big \} D_1 \partial _r(\alpha ^3 {\mathscr {J}}) \Big ] \nonumber \\&- \alpha ^{-3} e^{r {{\hat{i}}}} \partial _t (W \sqrt{\gamma } \bar{v}_{{{\hat{i}}}})D_1 \partial _r (\alpha ^3 {\mathscr {J}}) + \alpha \Big [{{\hat{R}}}_\varepsilon - \frac{\partial }{\partial \varepsilon } (\varepsilon {{\hat{R}}}_\varepsilon ) \Big ]\, \end{aligned}$$
(537)

includes radial advection, diffusion, and acceleration, as well as spectral shifts. \(D_1\) denotes the radial diffusion coefficient. Equation (536) is solved using the Crank–Nicolson scheme:

$$\begin{aligned}&(W \sqrt{\gamma }) \frac{{\mathscr {J}}^{n+1}_i-{\mathscr {J}}^{n}_i}{\varDelta t} = -\frac{1}{2} ({\mathscr {R}}^{n+1}_{r,i}+{\mathscr {R}}^{n}_{r,i}) . \end{aligned}$$
(538)

All gravitation and hydrodynamics variables are kept fixed during transport updates. \({\mathscr {R}}^{n}_{r,i}\) on the right-hand side of equation (538) is evaluated at both \(t^n\) and \(t^{n+1}\). For \(t^{n+1}\), Rahman et al. provide the following discretizations. The diffusion term is discretized as

$$\begin{aligned}&\Big [ \partial _r \{A^r D_1 \partial _r(\alpha ^3 {\mathscr {J}})\} \Big ]^{n+1}_{i} \nonumber \\&\quad =\frac{1}{\varDelta r} \Big [A^{r}_{i+1/2}D^{n}_{1,i+1/2} \frac{\alpha ^3_{i+1} {\mathscr {J}}^{n+1}_{i+1}-\alpha ^3_{i} {\mathscr {J}}^{n+1}_{i}}{\varDelta r} \nonumber \\&\qquad - A^{r}_{i-1/2}D^{n}_{1,i-1/2} \frac{\alpha ^3_{i} {\mathscr {J}}^{n+1}_i-\alpha ^3_{i-1} {\mathscr {J}}^{n+1}_{i-1}}{\varDelta r} \Big ], \end{aligned}$$
(539)

where

$$\begin{aligned} A^r\equiv & {} \alpha ^{-2} \sqrt{\gamma } \Big \{ \gamma ^{rr} + W \Big (\frac{W}{W+1}v^r-\beta ^r \alpha ^{-1}\Big ) v^r \Big \} . \end{aligned}$$
(540)

\(i-1/2\) and \(i+1/2\) denote the left and right zone edges for zone i, respectively. Values of the gravity and hydrodynamics variables at zone edges are determined by linear interpolation of their zone-center counterparts. The fluid acceleration term is discretized as

$$\begin{aligned}&\Big [B^r D_{1} \partial _r(\alpha ^3 {\mathscr {J}})\Big ]^{n+1}_{i} \nonumber \\&\quad =\frac{B^{r}_i}{2} \left[ D^{n}_{1,i+1/2} \frac{\alpha ^3_{i+1} {\mathscr {J}}^{n+1}_{i+1}-\alpha ^3_{i}{\mathscr {J}}^{n+1}_{i}}{\varDelta r}\right. \nonumber \\&\qquad \left. + D^{n}_{1,i-1/2} \frac{\alpha ^3_{i}{\mathscr {J}}^{n+1}_i-\alpha ^3_{i-1}{\mathscr {J}}^{n+1}_{i-1}}{\varDelta r} \right] , \end{aligned}$$
(541)

where

$$\begin{aligned} B^r\equiv & {} \alpha ^{-3} e^{r {{\hat{i}}}} \partial _t(W \sqrt{\gamma } {{\bar{v}}}_{{{\hat{i}}}}) . \end{aligned}$$
(542)

The metric and hydrodynamics variables before and after the metric and hydrodynamics updates are used to evaluate the time derivative in Eq. (542). The advection term is discretized in an upwind fashion as

$$\begin{aligned} \Big [ \partial _r (C^r {{\mathscr {J}}}) \Big ]^{n+1}_{i}= & {} \frac{1}{\varDelta r} \Big [C^{r}_{i+1/2} {\mathscr {J}}^{n+1}_{\iota (i+1/2)} - C^{r}_{i-1/2} {\mathscr {J}}^{n+1}_{\iota (i-1/2)} \Big ], \end{aligned}$$
(543)

where

$$\begin{aligned} C^r\equiv & {} \alpha \sqrt{\gamma } W(v^r-\beta ^r \alpha ^{-1}) \end{aligned}$$
(544)

and

$$\begin{aligned} \iota (i+1/2)\equiv & {} \left\{ \begin{array}{ll} i, &{}\quad {\text {if}}\; v^r_{i+1/2} > 0 \, ,\\ i+1, &{}\quad {\text {otherwise}}. \end{array}\right. \end{aligned}$$
(545)

Spectral shifts—the last term in Eq. (537)—are discretized using the number-conservative scheme of Müller et al. (2010) discussed in Sect. 6.5.2. The flux factor, \(f^{{\hat{i}}}\), and the Eddington tensor, \(\chi ^{{\hat{i}}{\hat{j}}}\), are used to replace \({\mathscr {H}}^{{\hat{i}}}\) and \({\mathscr {K}}^{{\hat{i}}{\hat{j}}}\) by \(f^{{\hat{i}}}{\mathscr {I}}\) and \(\chi ^{{\hat{i}}{\hat{j}}}{\mathscr {I}}\), respectively. In evaluating the spectral shift terms, both the flux factor and the Eddington tensor are evaluated at \(t^n\), whereas the energy density, \({\mathscr {I}}\), is evaluated at \(t^{n+1}\). The remaining advection and diffusion terms are included in the last transport step, encapsulated in the equation

$$\begin{aligned}&W \sqrt{\gamma } \partial _t ({\mathscr {J}}) = {\mathscr {R}}({\mathscr {J}}), \end{aligned}$$
(546)

where

$$\begin{aligned} {\mathscr {R}}({\mathscr {J}})\equiv & {} -\partial _\theta [\alpha W (v^\theta -\beta ^\theta \alpha ^{-1}) {{\hat{\mathscr {J}}}}] - \partial _\phi [\alpha W (v^\phi -\beta ^\phi \alpha ^{-1}) {{\hat{\mathscr {J}}}}] \nonumber \\&+ \partial _\theta \Big [ \alpha ^{-2} \sqrt{\gamma } \Big \{ \gamma ^{\theta \theta } + W \Big (\frac{W}{W+1}v^\theta -\beta ^\theta \alpha ^{-1}\Big ) v^\theta \Big \} D_{2} \partial _\theta (\alpha ^3 {\mathscr {J}}) \Big ] \nonumber \\&+ \partial _\phi \Big [ \alpha ^{-2} \sqrt{\gamma } \Big \{ \gamma ^{\phi \phi } + W \Big (\frac{W}{W+1}v^\phi -\beta ^\phi \alpha ^{-1}\Big ) v^\phi \Big \} D_{3} \partial _\phi (\alpha ^3 {\mathscr {J}}) \Big ] \nonumber \\&+ \alpha ^{-3} e^{\theta {{\hat{i}}}} \partial _t(W \sqrt{\gamma } {{\bar{v}}}_{{{\hat{i}}}}) D_2 \partial _\theta (\alpha ^3 {\mathscr {J}}) \nonumber \\&+ \alpha ^{-3} e^{\phi {{\hat{i}}}} \partial _t(W \sqrt{\gamma } {{\bar{v}}}_{{{\hat{i}}}}) D_3 \partial _\phi (\alpha ^3 {\mathscr {J}}). \end{aligned}$$
(547)

\(D_2\) and \(D_3\) are the diffusions coefficients in the \(\theta \) and \(\phi \) directions, respectively. Equation (547) is evolved using one of two explicit methods: Allen–Cheng (Allen and Cheng 1970) and Runge–Kutta–Legendre (RKL2) (Meyer et al. 2012). The latter method is a conditionally stable method expressly designed for the diffusion equation. In the Allen–Cheng method, a predictor step provides the following partial update:

$$\begin{aligned} \frac{(W\sqrt{\gamma })}{\varDelta t}({\mathscr {J}}^{*}_k - {\mathscr {J}}^{n}_k)= & {} - \frac{1}{2 \varDelta y} (F_{k+1} {\mathscr {J}}^n_{k+1} - F_{k-1} {\mathscr {J}}^n_{k-1}) \nonumber \\&+ \frac{1}{\varDelta y^2} {[}E_{k+1/2} (\alpha ^3_{k+1} {\mathscr {J}}^n_{k+1}-\alpha ^3_{k} {\mathscr {J}}^{*}_k) \nonumber \\&- E_{k-1/2} (\alpha ^3_{k} {\mathscr {J}}^{*}_k-\alpha ^3_{k-1} {\mathscr {J}}^n_{k-1})]\, \nonumber \\&+ \frac{G_{k}}{2\varDelta y} \big [ D_{k+1/2} (\alpha ^3_{k+1} {\mathscr {J}}^{n}_{k+1}-\alpha ^3_{k} {\mathscr {J}}^{*}_{k}) \nonumber \\&+ D_{k-1/2} (\alpha ^3_{k} {\mathscr {J}}^{*}_{k}-\alpha ^3_{k-1} {\mathscr {J}}^{n}_{k-1}) \big ], \end{aligned}$$
(548)

which, in turn, is followed by a corrector step that provides the complete update:

$$\begin{aligned} \frac{(W\sqrt{\gamma })}{\varDelta t}({\mathscr {J}}^{n+1}_k - {\mathscr {J}}^{n}_k)= & {} - \frac{1}{2 \varDelta y} (F_{k+1} {\mathscr {J}}^{*}_{k+1} - F_{k-1} {\mathscr {J}}^{*}_{k-1}) \nonumber \\&+ \frac{1}{\varDelta y^2} {[}E_{k+1/2} (\alpha ^3_{k+1} {\mathscr {J}}^{*}_{k+1}-\alpha ^3_{k} {\mathscr {J}}^{n+1}_k) \nonumber \\&- E_{k-1/2} (\alpha ^3_{k} {\mathscr {J}}^{n+1}_k-\alpha ^3_{k-1} {\mathscr {J}}^{*}_{k-1})] \nonumber \\&+ \frac{G_{k}}{2\varDelta y} \big [ D_{k+1/2} (\alpha ^3_{k+1} {\mathscr {J}}^{*}_{k+1}-\alpha ^3_{k} {\mathscr {J}}^{n+1}_{k}) \nonumber \\&+ D_{k-1/2} (\alpha ^3_{k} {\mathscr {J}}^{n+1}_{k}-\alpha ^3_{k-1} {\mathscr {J}}^{*}_{k-1}) \big ], \end{aligned}$$
(549)

where

$$\begin{aligned}&E \equiv \alpha ^{-2} \sqrt{\gamma } \left\{ \gamma ^{jj} + W \Big (\frac{W}{W+1}v^j-\beta ^j \alpha ^{-1}\Big ) v^j \right\} D, \nonumber \\&F \equiv \alpha \sqrt{\gamma } W (v^j-\beta ^j \alpha ^{-1}), \nonumber \\&G \equiv \alpha ^{-3} e^{j {{\hat{i}}}} \partial _t(W \sqrt{\gamma } {{\bar{v}}}_{{{\hat{i}}}}). \end{aligned}$$
(550)

In Eqs. (548) and (549), only one spatial index, k, is explicitly shown and represents a zone index in either the \(\theta \) or the \(\phi \) direction. Moreover, in the discretizations shown, the gridding in the single dimension is assumed to be uniform, with zone width \(\varDelta y\). In the (s-stage) RKL2 method, which Rahman et al. deploy as a 4-stage method, the update in each of the four stages is given by

$$\begin{aligned} {\mathscr {J}}_{0}= & {} {\mathscr {J}}^{n}, \nonumber \\ {\mathscr {J}}_{1}= & {} {\mathscr {J}}_{0} + \frac{2}{27} \frac{\varDelta t}{W\sqrt{\gamma }} {\mathscr {R}}({\mathscr {J}}_{0}), \nonumber \\ {\mathscr {J}}_{2}= & {} \frac{3}{2} {\mathscr {J}}_{1} - \frac{1}{2} {\mathscr {J}}_{0} + \frac{\varDelta t}{W\sqrt{\gamma }} \Big ( \frac{1}{3} {\mathscr {R}}({\mathscr {J}}_{1}) - \frac{2}{9} {\mathscr {R}}({\mathscr {J}}_{0}) \Big ), \nonumber \\ {\mathscr {J}}_{3}= & {} \frac{25}{12} {\mathscr {J}}_{2} - \frac{5}{6} {\mathscr {J}}_{1} - \frac{1}{4} {\mathscr {J}}_{0} + \frac{\varDelta t}{W\sqrt{\gamma }} \Big ( \frac{25}{54} {\mathscr {R}}({\mathscr {J}}_{2}) - \frac{25}{81} {\mathscr {R}}({\mathscr {J}}_{0}) \Big ), \nonumber \\ {\mathscr {J}}_{4}= & {} \frac{189}{100} {\mathscr {J}}_{3} - \frac{81}{80} {\mathscr {J}}_{2} + \frac{49}{400} {\mathscr {J}}_{0} \nonumber \\&+ \frac{\varDelta t}{W\sqrt{\gamma }} \Big ( \frac{21}{50} {\mathscr {R}}({\mathscr {J}}_{3}) - \frac{49}{200} {\mathscr {R}}({\mathscr {J}}_{0}) \Big ), \nonumber \\ {\mathscr {J}}^{n+1}= & {} {\mathscr {J}}_{4} . \end{aligned}$$
(551)

For the s-th stage and zone k, \({\mathscr {R}}({\mathscr {J}})\) is discretized as

$$\begin{aligned} {\mathscr {R}}_k({\mathscr {J}}_s)= & {} - \frac{1}{2 \varDelta y} (F_{k+1} {\mathscr {J}}_{s,k+1} - F_{k-1} {\mathscr {J}}_{s,k-1}) \nonumber \\&+ \frac{1}{\varDelta y^2} (E_{k+1/2} (\alpha ^3_{k+1} {\mathscr {J}}_{s,k+1}-\alpha ^3_{k} {\mathscr {J}}_{s,k}) \nonumber \\&- E_{k-1/2} (\alpha ^3_{k} {\mathscr {J}}_{s,k}-\alpha ^3_{k-1} {\mathscr {J}}_{s,k-1})) \nonumber \\&+ \frac{G_{k}}{2\varDelta y} \big [ D_{k+1/2} (\alpha ^3_{k+1} {\mathscr {J}}_{s,k+1}-\alpha ^3_{k} {\mathscr {J}}_{s,k}) \nonumber \\&+ D_{k-1/2} (\alpha ^3_{k} {\mathscr {J}}_{s,k}-\alpha ^3_{k-1} {\mathscr {J}}_{s,k-1}) \big ]. \end{aligned}$$
(552)

Finally, it is important to note that Rahman et al. go to great lengths to ensure that their definitions of the diffusion coefficients preserve causality for both the individual and the total radiative fluxes. To accomplish this, they compute the gradient of the energy density as

$$\begin{aligned}&|\nabla {\mathscr {J}}|_{i,j,k} \nonumber \\&\quad = \sqrt{\Big (\frac{{\mathscr {J}}_{i+1,j,k}-{\mathscr {J}}_{i-1,j,k}}{r_{i+1}-r_{i-1}}\Big )^2 +\Big (\frac{{\mathscr {J}}_{i,j+1,k}-{\mathscr {J}}_{i,j-1,k}}{r_i(\theta _{j+1}-\theta _{j-1})}\Big )^2 +\Big (\frac{{\mathscr {J}}_{i,j,k+1}-{\mathscr {J}}_{i,j,k-1}}{r_i\sin {\theta _j}(\phi _{k+1}-\phi _{k-1})}\Big )^2}\nonumber \\ \end{aligned}$$
(553)

and the Knudsen number as

$$\begin{aligned} R_{i,j,k} = \frac{|\nabla {\mathscr {J}}|_{i,j,k}}{(\kappa _\mathrm {t})_{i,j,k} {\mathscr {J}}_{i,j,k}}, \end{aligned}$$
(554)

where \(({\kappa _\mathrm {t}})_{i,j,k}\) is the transport opacity at the cell center (ijk). Equation (161) is then used to compute the flux limiter, and the causality-preserving diffusion coefficients are given by

$$\begin{aligned} D_{i,j,k} = \frac{\lambda _{i,j,k}}{(\kappa _\mathrm {t})_{i,j,k}}. \end{aligned}$$
(555)

Rahman et al. do not report on the conservation of lepton number in their code, but given their use of the method developed by Müller et al. (2010), which is specifically designed to conserve lepton number, it should be quite good. They do report on their conservation of energy. They report a change in total energy of \(1.85\times 10^{51}\) erg at 60 ms after bounce, most of which results at bounce, and a much more gradual increase between 60 and 525 ms after bounce to their final value of \(\varDelta E\) of \(2.0\times 10^{51}\) erg. As discussed in Sect. 6.5.4, their use of the Lagrangian two-moment model as the starting point for their MGFLD implementation does not lend itself to conserving energy, nor does their use of flux-limited diffusion, as discussed in Just et al. (2015) and in references cited therein.

6.6.3 Newtonian-gravity, O(v/c), finite-volume implementation

As part of the development of the CASTRO code, Zhang et al. (2013) developed a MGFLD solver using finite-volume methods. They express the equations of multigroup radiation hydrodynamics as

$$\begin{aligned} \frac{\partial \rho }{\partial t} + \nabla \cdot (\rho \mathbf {u}) = { }&0, \end{aligned}$$
(556)
$$\begin{aligned} \frac{\partial (\rho \mathbf {u})}{\partial t} + \nabla \cdot (\rho \mathbf {u} \mathbf {u}) + \nabla p + \sum _{g} \lambda _g \nabla E_g = { }&\mathbf {F}_G, \end{aligned}$$
(557)
$$\begin{aligned} \frac{\partial (\rho E)}{\partial t} + \nabla \cdot (\rho E \mathbf {u} + p \mathbf {u}) + \mathbf {u} \cdot \sum _{g}\lambda _g \nabla E_g = { }&\sum _{g} c (\kappa _gE_{g}-j_g) + \mathbf {u}\cdot \mathbf {F}_G, \end{aligned}$$
(558)
$$\begin{aligned} \frac{\partial (\rho Y_e)}{\partial t} + \nabla \cdot (\rho Y_e \mathbf {u}) = { }&\sum _{g} c \xi _g (\kappa _gE_{g}-j_g) , \end{aligned}$$
(559)
$$\begin{aligned}&\frac{\partial E_g}{\partial t} + \nabla \cdot \left( \frac{3-f_g}{2} E_g \mathbf {u}\right) - \mathbf {u} \cdot \nabla \left( \frac{1-f_g}{2} E_g\right) \nonumber \\&\quad = - c (\kappa _gE_{g}-j_g) + \nabla \cdot \left( \frac{c\lambda _g}{\chi _g} \nabla E_g \right) \nonumber \\&\qquad + \int _g \frac{\partial }{\partial \nu } \Big [\left( \frac{1-f}{2} \nabla \cdot \mathbf {u} + \frac{3f-1}{2} \hat{\mathbf {n}}\hat{\mathbf {n}} : \nabla \mathbf {u} \right) \nu E_\nu \Big ] \mathrm {d}\nu - \frac{3f_g-1}{2} E_g \hat{\mathbf {n}}\hat{\mathbf {n}} : \nabla \mathbf {u}, \end{aligned}$$
(560)

where the group quantities are defined as

$$\begin{aligned} E_g= & {} \int _{\nu _{g-1/2}}^{\nu _{g+1/2}} E_\nu \mathrm {d}\nu , \end{aligned}$$
(561)
$$\begin{aligned} j_g= & {} \frac{4\pi }{c}\eta (\nu _g) \varDelta \nu _g, \end{aligned}$$
(562)

and

$$\begin{aligned} \xi _g = s \frac{m_{\mathrm {B}}}{h \nu _g}. \end{aligned}$$
(563)

In Eq. (561), the neutrino energy density per frequency, \(E_\nu \), is integrated over the frequency group defined by the interval \([\nu _{g}-1/2,\nu _{g}+1/2]\) to yield the energy density per group. Equation (562) defines the group emissivity in terms of the emissivity, \(\eta \), and the group width \(\varDelta \nu _g=\nu _g+1/2-\nu _g-1/2\). In order of appearance in the equations, the remaining quantities are, \(\lambda _g\), \(\kappa _g\), and \(f_g\), and are defined by evaluating the flux limiter, \(\lambda \), the absorption coefficient, \(\kappa \), and the Eddington factor, f, at a representative group frequency, \(\nu _g\)—i.e., they are all group-mean values. Finally, for neutrinos, \(\xi _g\) is given by Eq. (563), with \(s=+1\) for electron neutrinos and \(s=-1\) for electron antineutrinos. Zhang et al. split these equations into three subsets, based on their mathematical characteristics and in an effort to minimize issues arising from operator splitting. There is a hyperbolic subsystem that includes the evolution of the electron fraction (it also includes pieces of the evolution equation for the neutrino energy density, but the neutrino energy density is not evolved using this subsystem, as will be discussed):

$$\begin{aligned} \frac{\partial \rho }{\partial t} + \nabla \cdot (\rho \mathbf {u}) = { }&0, \end{aligned}$$
(564)
$$\begin{aligned} \frac{\partial (\rho \mathbf {u})}{\partial t} + \nabla \cdot (\rho \mathbf {u} \mathbf {u}) + \nabla p + \sum _{g} \lambda _g \nabla E_g = { }&\mathbf {F}_G , \end{aligned}$$
(565)
$$\begin{aligned} \frac{\partial (\rho E)}{\partial t} + \nabla \cdot (\rho E \mathbf {u} + p \mathbf {u}) +\mathbf {u} \cdot \sum _{g} \lambda _g \nabla E_g = { }&\mathbf {u} \cdot \mathbf {F}_G , \end{aligned}$$
(566)
$$\begin{aligned} \frac{\partial (\rho Y_e)}{\partial t} + \nabla \cdot (\rho Y_e \mathbf {u}) = { }&0, \end{aligned}$$
(567)
$$\begin{aligned} \frac{\partial E_g}{\partial t} + \nabla \cdot \left( \frac{3-f_g}{2} E_g \mathbf {u}\right) - \mathbf {u} \cdot \nabla \left( \frac{1-f_g}{2} E_g\right) = { }&0 . \end{aligned}$$
(568)

There is a second set of hyperbolic equations that governs the evolution of the neutrino energy density sans the diffusion term and the term that describes the coupling of neutrinos to the matter:

$$\begin{aligned} \frac{\partial E_g}{\partial t} = { }&-\nabla \cdot (E_g \mathbf {u}), \end{aligned}$$
(569)
$$\begin{aligned} \frac{\partial E_\nu }{\partial t} = { }&\frac{\partial }{\partial \ln {\nu }} \left[ \left( \frac{1-f}{2} \nabla \cdot \mathbf {u} + \frac{3f-1}{2} \hat{\mathbf {n}}\hat{\mathbf {n}} : \nabla \mathbf {u}\right) E_\nu \right] . \end{aligned}$$
(570)

This second set of equations results from a splitting of their equation for the neutrino energy density per frequency, \(E_\nu \), prior to integration over group frequencies:

$$\begin{aligned} \frac{\partial E_{\nu }}{\partial t} + \nabla \cdot (E_{\nu } \mathbf {u}) = { }&\nabla \cdot \left( \frac{c\lambda }{\chi } \nabla E_{\nu } \right) - (c\kappa E_{\nu } - 4\pi \eta ) \nonumber \\ { }&+ \frac{\partial }{\partial \ln {\nu }} \left( \frac{1-f}{2} E_\nu \nabla \cdot \mathbf {u} + \frac{3f-1}{2} E_\nu \hat{\mathbf {n}}\hat{\mathbf {n}} : \nabla \mathbf {u}\right) . \end{aligned}$$
(571)

Finally, there is a parabolic system of equations that describes the evolution of the neutrino energy density due to the diffusion of neutrinos in the stellar core, as well as the evolution of the matter internal energy and electron fraction as a result of neutrino–matter interactions:

$$\begin{aligned} \frac{\partial (\rho e)}{\partial t} = { }&\sum _{g} c (\kappa _gE_{g}-j_g), \end{aligned}$$
(572)
$$\begin{aligned} \frac{\partial (\rho Y_e)}{\partial t} = { }&\sum _{g} c \xi _g (\kappa _gE_{g}-j_g), \end{aligned}$$
(573)
$$\begin{aligned} \frac{\partial E_g}{\partial t} = { }&-c (\kappa _gE_{g}-j_g) + \nabla \cdot \left( \frac{c\lambda _g}{\chi _g} \nabla E_g \right) . \end{aligned}$$
(574)

The equations in the first hyperbolic subsystem, Eqs. (564) through (568), are solved using an explicit, unsplit, PPM method, with characteristic limiting, full corner coupling, and the approximate Riemann solver of Bell et al. (1989). Given the Godunov states computed, the radiation field energy density is in turn updated via Eq. (569). Finally, Eq. (570), which takes the form of an advection equation in neutrino-energy space, is solved using a second, explicit Godunov method, based on the method of lines. In this explicit part of the update scheme, a third-order, TVD, Runge–Kutta scheme developed by Shu and Osher (1988) is used.

The parabolic system, Eqs. (572) through (574), is instead solved implicitly. Zhang et al. reformulate the equations as

$$\begin{aligned} F_e = { }&\rho e - \rho e^{-} - \varDelta t \sum _g c (\kappa _g E_g - j_g) = 0 , \end{aligned}$$
(575)
$$\begin{aligned} F_Y = { }&\rho Y_e - \rho Y_e^{-} - \varDelta t \sum _g c \xi _g (\kappa _g E_g - j_g) = 0 , \end{aligned}$$
(576)
$$\begin{aligned} F_g = { }&E_g - E_g^{-} - \varDelta t\, \nabla \cdot \left( \frac{c\lambda _g}{\chi _g} \nabla E_g \right) + \varDelta t\, c (\kappa _g E_g - j_g) = 0 , \end{aligned}$$
(577)

and linearize in T, \(Y_e\), and \(E_g\) to obtain the (outer) linear system

$$\begin{aligned} \left[ \begin{array}{ccc} ({\partial F_e}/{\partial T})^{(k)} &{} ({\partial F_e}/{\partial Y_e})^{(k)} &{} ({\partial F_e}/{\partial E_g})^{(k)} \\ ({\partial F_Y}/{\partial T})^{(k)} &{} ({\partial F_Y}/{\partial Y_e})^{(k)} &{} ({\partial F_Y}/{\partial E_g})^{(k)} \\ ({\partial F_g}/{\partial T})^{(k)} &{} ({\partial F_g}/{\partial Y_e})^{(k)} &{} ({\partial F_g}/{\partial E_g})^{(k)} \end{array} \right] \left[ \begin{array}{c} \delta T^{(k+1)}\\ \delta Y_e^{(k+1)}\\ \delta E_g^{(k+1)} \end{array}\right] = \left[ \begin{array}{c} - F_e^{(k)}\\ - F_Y^{(k)}\\ - F_g^{(k)} \end{array}\right] . \end{aligned}$$
(578)

They point out that if the derivatives of the diffusion coefficient, \(c\lambda _g/\chi _g\), with respect to T, \(Y_e\), and \(E_g\) are ignored, the linear system of equations collapses to an equation for the \((k+1)^{\mathrm{st}}\) iterate, \(E^{(k+1)}_{g}\):

$$\begin{aligned} \begin{aligned} \left( c\kappa _g + \frac{1}{\varDelta t}\right) E_g^{(k+1)}&- \nabla \cdot \left( \frac{c\lambda _g}{\chi _g} \nabla E_g^{(k+1)} \right) = c j_g + \frac{E_g^{-}}{\varDelta t} \\&+ H_g \left[ c\sum _{g^{\prime }}\left( \kappa _{g^{\prime }} E_{g^{\prime }}^{(k+1)} - j_{g^{\prime }} \right) - \frac{1}{\varDelta t} (\rho e^{(k)} - \rho e^{-}) \right] \\&+ \varTheta _g \left[ c \sum _{g^{\prime }}\xi _{g^{\prime }}\left( \kappa _{g^{\prime }} E_{g^{\prime }}^{(k+1)} - j_{g^{\prime }} \right) - \frac{1}{\varDelta t} (\rho Y_e^{(k)} - \rho Y_e^{-}) \right] , \end{aligned} \end{aligned}$$
(579)

where \(\lambda _g\), \(\kappa _g\), \(\chi _g\), and \(j_g\) are evaluated at the \(k{\mathrm{th}}\) iterate, and where

$$\begin{aligned} H_g = { }&\left( \frac{\partial j_g}{\partial T} - \frac{\partial \kappa _g}{\partial T} E_g^{(k)} \right) \eta _T - \left( \frac{\partial j_g}{\partial Y_e} - \frac{\partial \kappa _g}{\partial Y_e} E_g^{(k)}\right) \eta _Y , \end{aligned}$$
(580)
$$\begin{aligned} \varTheta _g = { }&\left( \frac{\partial j_g}{\partial Y_e} - \frac{\partial \kappa _g}{\partial Y_e} E_g^{(k)}\right) \theta _Y - \left( \frac{\partial j_g}{\partial T} - \frac{\partial \kappa _g}{\partial T} E_g^{(k)} \right) \theta _T , \end{aligned}$$
(581)

and

$$\begin{aligned} \eta _T = { }&\frac{c\varDelta t}{\varOmega } \left[ \rho + c \varDelta t \sum _g \xi _g \left( \frac{\partial j_g}{\partial Y_e} - \frac{\partial \kappa _g}{\partial Y_e} E_g^{(k)}\right) \right] , \end{aligned}$$
(582)
$$\begin{aligned} \eta _Y = { }&\frac{c\varDelta t}{\varOmega } \left[ c \varDelta t \sum _g \xi _g \left( \frac{\partial j_g}{\partial T} - \frac{\partial \kappa _g}{\partial T} E_g^{(k)} \right) \right] , \end{aligned}$$
(583)
$$\begin{aligned} \theta _T = { }&\frac{c\varDelta t}{\varOmega } \left[ \rho \frac{\partial e}{\partial Y_e} + c \varDelta t \sum _g \left( \frac{\partial j_g}{\partial Y_e} - \frac{\partial \kappa _g}{\partial Y_e} E_g^{(k)}\right) \right] , \end{aligned}$$
(584)
$$\begin{aligned} \theta _Y = { }&\frac{c\varDelta t}{\varOmega } \left[ \rho \frac{\partial e}{\partial T} + c\varDelta t \sum _g \left( \frac{\partial j_g}{\partial T} - \frac{\partial \kappa _g}{\partial T} E_g^{(k)} \right) \right] , \end{aligned}$$
(585)
$$\begin{aligned} \varOmega = { }&\left[ \rho \frac{\partial e}{\partial T} + c\varDelta t \sum _g \left( \frac{\partial j_g}{\partial T} - \frac{\partial \kappa _g}{\partial T} E_g^{(k)} \right) \right] \left[ \rho + c \varDelta t \sum _g \xi _g \left( \frac{\partial j_g}{\partial Y_e} - \frac{\partial \kappa _g}{\partial Y_e} E_g^{(k)}\right) \right] \nonumber \\&- \left[ \rho \frac{\partial e}{\partial Y_e} + c \varDelta t \sum _g \left( \frac{\partial j_g}{\partial Y_e} - \frac{\partial \kappa _g}{\partial Y_e} E_g^{(k)}\right) \right] \left[ c \varDelta t \sum _g \xi _g \left( \frac{\partial j_g}{\partial T} - \frac{\partial \kappa _g}{\partial T} E_g^{(k)} \right) \right] , \end{aligned}$$
(586)

all of which are evaluated at the \(k{\mathrm{th}}\) iterate. Equation (579) couples \(E_g\) to its values across all energy groups. To decouple the groups, Zhang et al. choose to use an (inner) iterative procedure by evaluating the right-hand-side at the \(k{\mathrm{th}}\) iterate of \(E_g\) and iterating the solution of Eq. (579) to convergence. Once \(E^{k+1}_g\) is known, the updates for \(\rho e\) and \(Y_e\) are determined by

$$\begin{aligned} \rho e^{(k+1)} = { }&H \rho e^{(k)} + (1-H) \rho e^- + \varTheta (\rho Y_e^{(k)} - \rho Y_e^-) \nonumber \\&+ c \varDelta t \sum _g\left[ (\kappa _gE_g^{\ell +1} - j_g) - (H + \varTheta \xi _g) (\kappa _gE_g^{\ell } - j_g)\right] , \end{aligned}$$
(587)
$$\begin{aligned} \rho Y_e^{(k+1)} = { }&{\bar{\varTheta }} \rho Y_e^{(k)} + (1-{\bar{\varTheta }}) \rho Y_e^- + {\bar{H}} (\rho e^{(k)} - \rho e^-) \nonumber \\&+ c\varDelta t \sum _g\left[ \xi _g(\kappa _gE_g^{\ell +1} - j_g) - ({\bar{H}} + {\bar{\varTheta }} \xi _g) (\kappa _gE_g^{\ell } - j_g)\right] , \end{aligned}$$
(588)

which stem from Eqs. (575) and (576) upon linearization and are conservative for energy and lepton number. In Eqs. (587) and (588), H, \(\varTheta \), \({\bar{H}}\), and \({\bar{\varTheta }}\) are defined by

$$\begin{aligned} H= & {} \sum _g H_g, \end{aligned}$$
(589)
$$\begin{aligned} \varTheta= & {} \sum _g \varTheta _g, \end{aligned}$$
(590)
$$\begin{aligned} {\bar{H}}= & {} \sum _g \xi _g H_g, \end{aligned}$$
(591)
$$\begin{aligned} {\bar{\varTheta }}= & {} \sum _g \xi _g \varTheta _g. \end{aligned}$$
(592)

In turn, T is updated, and the next outer iteration is initiated. Zhang et al. deploy the synthetic acceleration scheme of Morel et al. (1985, 2007), extended in this case by them to neutrino transport, to accelerate convergence of their outer iteration. Note that the system given by Eqs. (572)–(574) does not include energy coupling interactions (e.g., inelastic scattering). Inclusion of these interactions in a fully implicit solve requires modifications to the solution procedure.

The degree to which the approach outlined here conserves lepton number and energy was not documented.

6.7 Structure-preserving methods

Structure-preserving methods are advanced numerical methods that aim to capture key properties of the underlying, continuous PDEs, and include methods that preserve physical bounds on solutions (e.g., positive distribution functions), achieve asymptotic limits of a multi-scale model (e.g., diffusion limit in radiation transport and steady states), preserve constraints (e.g., the divergence-free condition in magnetohydrodynamics), or conserve secondary quantities (e.g., simultaneous conservation of neutrino number and energy). As such, structure-preserving methods are more faithful to the physics, and often improves accuracy and robustness. The energy conserving discretization of the spherically symmetric Boltzmann equation by Liebendörfer et al. (2004) discussed in Sect. 6.1.3, and the number conserving discretization of the energy equation in the Lagrangian two-moment model by Müller et al. (2010) discussed in 6.5 are examples of structure-preserving discretizations already in use in simulations. These aim to preserve secondary quantities that are not evolved directly by the numerical method. Below we discuss discretizations that aim to preserve physical bounds on evolved quantities.

6.7.1 Preamble: discontinuous Galerkin methods

Since the following subsections employ the discontinuous Galerkin (DG) method, which has yet to be adapted to modeling CCSN, we include a short description of key elements here by considering the scalar conservation law,

$$\begin{aligned} \partial _{t}{u}+\partial _{x}{f(u)} = 0, \end{aligned}$$
(593)

with a linear flux \(f(u)=a\,u\), where a is a constant in space and time. We refer to Cockburn and Shu (1989, 1991, 1998) and Cockburn et al. (1989, 1990) for pioneering, in-depth expositions on the early development of DG methods. See also Cockburn and Shu (2001) and Shu (2016) for reviews.

To solve Eq. (593), the computational domain D is divided into a triangulation \({\mathscr {T}}\) of non-overlapping elements \(K=(x_{{\textsc {L}}},x_{{\textsc {H}}})\), so that \(D = \cup _{K \in {\mathscr {T}}}\). On each element, the solution will then be approximated by functions in the approximation space

$$\begin{aligned} {\mathbb {V}}_{h}^{k}=\{\varphi _{h} : \varphi _{h}\big |_{K} \in {\mathbb {P}}^{k}(K), \, \, \forall \ K\in {\mathscr {T}} \}, \end{aligned}$$
(594)

where \({\mathbb {P}}^{k}(K)\) denotes the space of one-dimensional polynomials of maximal degree k (e.g., Legendre polynomials). Functions in \({\mathbb {V}}_{h}^k\) can be discontinuous across element interfaces (hence discontinuous Galerkin). One then writes the approximate solution to Eq. (593) on element K as the expansion

$$\begin{aligned} u_{h}^{K}(x,t) = \sum _{i=1}^{k+1}u_{i}^{K}(t)\,b_{i}^{K}(x), \end{aligned}$$
(595)

where the expansion coefficients \(u_{i}^{K}\) are the unknowns for which we solve the equations, and \(b_{i}^{K}\in {\mathbb {V}}_{h}^{k}\) are the basis functions. Next, one defines in what sense \(u_{h}^{K}\) will approximate u, the solution to Eq. (593). To this end, the residual

$$\begin{aligned} R(u_{h}^{K}) = \partial _{t}{u_{h}^{K}}+\partial _{x}{f(u_{h}^{K})} \end{aligned}$$
(596)

is defined, which is required to be orthogonal to all test functions \(\varphi _{h}\in {\mathbb {V}}_{h}^{k}\); i.e.,

$$\begin{aligned} \int _{K}R(u_{h}^{K})\,\varphi _{h}^{K}\,dx = 0, \quad \forall \varphi _{h}^{K}\in {\mathbb {V}}_{h}^{k}. \end{aligned}$$
(597)

Inserting Eq. (596) into Eq. (597), and performing an integration by parts on the flux term gives

$$\begin{aligned} \int _{K}(\partial _{t}{u_{h}^{K}})\,\varphi _{h}^{K}\,dx + \big [\,f(u_{h}^{K})(x_{{\textsc {H}}}^{-})\,\varphi _{h}^{K}(x_{{\textsc {H}}}^{-})-f(u_{h}^{K})(x_{{\textsc {L}}}^{+})\,\varphi _{h}^{K}(x_{{\textsc {L}}}^{+})\,\big ] - \int _{K}f(u_{h}^{K})\,\partial _{x}{\varphi _{h}^{K}}\,dx = 0, \end{aligned}$$
(598)

where \(x_{{\textsc {L}}/{\textsc {H}}}^{\pm }=\lim _{\delta ^{+}\rightarrow 0}x_{{\textsc {L}}/{\textsc {H}}}\pm \delta \). However, the entirely local formulation in Eq. (598) is problematic because it does not specify how solutions in adjacent elements interact. In addition, a unique flux must be defined on the element interfaces at \(x_{{\textsc {L}}/{\textsc {H}}}\) to recover the conservation statement inherent in Eq. (593). To resolve this, the fluxes on the element interfaces are replaced by a unique value, which then gives the semi-discrete DG method in weak form: Find \(u_{h}^{K} \in {\mathbb {V}}_{h}^{k}\) such that

$$\begin{aligned} \int _{K}(\partial _{t}{u_{h}^{K}})\,\varphi _{h}^{K}\,dx + \big [\,\widehat{f(u_{h}^{K})}(x_{{\textsc {H}}})\,\varphi _{h}^{K}(x_{{\textsc {H}}}^{-})-\widehat{f(u_{h}^{K})}(x_{{\textsc {L}}})\,\varphi _{h}^{K}(x_{{\textsc {L}}}^{+})\,\big ] - \int _{K}f(u_{h}^{K})\,\partial _{x}{\varphi _{h}^{K}}\,dx = 0 \end{aligned}$$
(599)

holds for all \(\varphi _{h}\in {\mathbb {V}}_{h}^{k}\) and all \(K\in {\mathscr {T}}\). In Eq. (599), \(\widehat{f(u_{h}^{K})}(x)\) is a unique numerical flux defined on the interface. For the scalar problem considered here, the familiar upwind flux can be used:

$$\begin{aligned} \widehat{f(u_{h}^{K})}(x) =\frac{1}{2}\,\big (\,f(u_{h}^{K}(x^{-}))+f(u_{h}^{K}(x^{+}))-|a|\,(u_{h}^{K}(x^{+})-u_{h}^{K}(x^{-}))\,\big ), \end{aligned}$$
(600)

which is defined in terms of approximations to the immediate left and right of x, which can be different.

Undoing the integration by parts that resulted in Eq. (599) gives the semi-discrete DG method in strong form: Find \(u_{h}^{K} \in {\mathbb {V}}_{h}^{k}\) such that

$$\begin{aligned}&\int _{K}R(u_{h}^{K})\,\varphi _{h}^{K}\,dx \nonumber \\&\quad = \big [\,\big (f(u_{h}^{K}(x_{{\textsc {H}}}^{-}))-\widehat{f(u_{h}^{K})}(x_{{\textsc {H}}})\big )\,\varphi _{h}^{K}(x_{{\textsc {H}}}^{-})-\big (f(u_{h}^{K}(x_{{\textsc {L}}}^{+}))-\widehat{f(u_{h}^{K})}(x_{{\textsc {L}}})\big )\,\varphi _{h}^{K}(x_{{\textsc {L}}}^{+})\,\big ], \end{aligned}$$
(601)

for all \(\varphi _{h}^{K}\in {\mathbb {V}}_{h}^{k}\) and all \(K\in {\mathscr {T}}\). Here, the weak and the strong formulations [Eqs. (599) and (601), respectively] are equivalent statements. By comparing the strong formulation with Eq. (597), one sees that the residual in the DG solution is orthogonal to \(\varphi _{h}\) only in the convergent limit when \(f(u_{h}^{K}(x^{\pm }))\rightarrow \widehat{f(u_{h}^{K})}(x)\). In Sects. 6.7.2 and 6.7.3, we will only refer to the weak formulation in Eq. (599).

To further illustrate how the weak formulation in Eq. (599) is used in practice, let

$$\begin{aligned} {\mathbf {u}}^{K}(t) =\big (\,u_{1}^{K}(t),\ldots ,u_{k+1}^{K}(t)\,\big )^{T} \quad \text {and}\quad {\mathbf {b}}^{K}(x) =\big (\,b_{1}^{K}(x),\ldots ,b_{k+1}^{K}(x)\,\big )^{T}. \end{aligned}$$
(602)

Then, by inserting Eq. (595) into Eq. (599), and letting \(\varphi _{h}=b_{j}\,(j=1,\ldots ,k+1)\), one obtains an equation for the expansion coefficients:

$$\begin{aligned} \frac{d {\mathbf {u}}^{K}}{d t} =-(M^{K})^{-1}\,\Big \{\,\big [\,\widehat{f(u_{h}^{K})}(x_{{\textsc {H}}})\,{\mathbf {b}}^{K}(x_{{\textsc {H}}}^{-})-\widehat{f(u_{h}^{K})}(x_{{\textsc {L}}})\,{\mathbf {b}}^{K}(x_{{\textsc {L}}}^{+})\,\big ] - S^{K}\,{\mathbf {u}}^{K}\,\Big \}, \end{aligned}$$
(603)

where components of the mass matrix and stiffness matrix are defined as

$$\begin{aligned} M_{ij}^{K} = \int _{K}b_{i}^{K}\,b_{j}^{K}\,dx \quad \text {and}\quad S_{ij}^{K} = a\,\int _{K}(\partial _{x}{b_{i}^{K}})\,b_{j}^{K}\,dx, \end{aligned}$$
(604)

respectively. Since the basis functions are polynomials, the integrals in Eq. (604) can be computed exactly with, e.g., Gaussian quadratures. Equation (603) is now a system of ODEs, which can be integrated in time with an ODE solver. For non-stiff problems, explicit Runge–Kutta methods can be used.

The DG method has been used to develop structure-preserving methods in a range of applications; see for example Zhang and Shu (2010b) and Wu and Tang (2016) for physical-constraint-preserving methods for the non-relativistic and relativistic Euler equations, respectively, Li and Xing (2018) for a steady-state preserving method for the Euler equations with gravitation, and Juno et al. (2018) for an energy-conserving DG method for kinetic plasma simulations. We also mention the work of Heningburg and Hauck (2020), where DG and finite-volume methods are combined to a hybrid transport scheme that captures the diffusion limit and is more efficient in terms of memory usage and computational time than the corresponding DG-only scheme.

6.7.2 Bound-preserving methods

Zhang and Shu (2010a) developed a general framework for “maximum-principle-preserving”, high-order methods for scalar conservation laws (see also Zhang and Shu 2011). Inspired by this work, Endeve et al. (2015) developed bound-preserving methods in the context of neutrino transport, aiming to maintain a distribution function satisfying \(f\in [0,1]\), as dictated by Pauli’s exclusion principle. They considered the (collisionless) phase-space advection problem in curvilinear coordinates, and included a general relativistic example in spherical symmetry with a time-independent spacetime metric given by

$$\begin{aligned} ds^{2} = -\alpha ^{2}\,dt^{2} + \gamma _{ij}\,dx^{i}\,dx^{j}, \quad \text {with}\quad \gamma _{ij}=\psi ^{4}\text{ diag }\big [\,1,r^{2},r^{2}\sin ^{2}\theta \,\big ], \end{aligned}$$
(605)

where \(\alpha \) is the lapse function and \(\psi \) the conformal factor. Under these assumptions, the Boltzmann takes the form

$$\begin{aligned}&\frac{1}{\alpha }\frac{\partial f}{\partial t} +\frac{1}{\alpha \,\psi ^{6}\,r^{2}}\frac{\partial }{\partial r}\Big (\,\alpha \,\psi ^{4}\,r^{2}\,\mu \,f\,\Big ) -\frac{1}{\varepsilon ^{2}}\frac{\partial }{\partial \varepsilon } \Big (\,\varepsilon ^{3}\,\frac{1}{\psi ^{2}\,\alpha }\frac{\partial \alpha }{\partial r}\,\mu \,f\,\Big ) \nonumber \\&\qquad +\frac{\partial }{\partial \mu } \Big (\,\big (1-\mu ^{2}\big )\,\psi ^{-2}\, \Big \{\, \frac{1}{r} +\frac{1}{\psi ^{2}}\frac{\partial \psi ^{2}}{\partial r} -\frac{1}{\alpha }\frac{\partial \alpha }{\partial r} \,\Big \}\,f \,\Big ) =0, \end{aligned}$$
(606)

where \(r\ge 0\) is the radius, \(\mu \in [-1,1]\) the momentum space angle cosine, and \(\varepsilon \ge 0\) is the neutrino energy. By defining phase-space coordinates \(z^{1}=r\), \(z^{2}=\mu \), and \(z^{3}=\varepsilon \), the phase space volume Jacobian \(\tau =\psi ^{6}\,r^{2}\,\varepsilon ^{2}\), and

$$\begin{aligned} H^{1} = H^{(r)}= \frac{\alpha }{\psi ^{2}}\mu ,\quad H^{2} = H^{(\mu )} = \frac{\alpha \big (1-\mu ^{2}\big )}{\psi ^{2}r}\,\varPsi ,\quad \text {and}\quad H^{3} = H^{(\varepsilon )} = - \frac{\varepsilon }{\psi ^{2}}\frac{\partial \alpha }{\partial r}\mu , \end{aligned}$$
(607)

where

$$\begin{aligned} \varPsi = 1+r\,\partial _{r}{\ln \psi ^{2}}-r\,\partial _{r}{\ln \alpha }, \end{aligned}$$
(608)

Equation (606) can be written in the compact form

$$\begin{aligned} \frac{\partial f}{\partial t}+\frac{1}{\tau }\sum _{i=1}^{3}\frac{\partial }{\partial z^{i}}\Big (\,\tau \,H^{i}f\,\Big ) = 0. \end{aligned}$$
(609)

It is straightforward to show that

$$\begin{aligned} \frac{1}{\tau }\sum _{i=1}^{3}\frac{\partial }{\partial z^{i}}\Big (\,\tau \,H^{i}\,\Big ) = 0 \end{aligned}$$
(610)

holds. The divergence-free condition on the phase-space flow in Eq. (610) plays an important role in maintaining \(f\le 1\).

Endeve et al. (2015) employed the discontinuous Galerkin (DG) method (see, e.g., Cockburn and Shu 2001; Shu 2016, and references therein) to solve Eq. (606). To this end, the phase space domain D is divided into a triangulation \({\mathscr {T}}\) of elements \({\mathbf {K}}\), so that \(D = \cup _{{\mathbf {K}} \in {\mathscr {T}}}\). Each element is a logically Cartesian box

$$\begin{aligned} {\mathbf {K}}=\{(r,\mu ,\varepsilon )\in {\mathbb {R}}^{3} : r\in K^{(r)}:=(r_{{\textsc {L}}},r_{{\textsc {H}}}),\, \mu \in K^{(\mu )}:=(\mu _{{\textsc {L}}},\mu _{{\textsc {H}}}),\, \varepsilon \in K^{(\varepsilon )}:=(\varepsilon _{{\textsc {L}}},\varepsilon _{{\textsc {H}}})\}, \end{aligned}$$
(611)

where \(z_{{\textsc {L}}}^{i}\) and \(z_{{\textsc {H}}}^{i}\) are, respectively, the coordinates of the lower and higher boundaries of \({\mathbf {K}}\) in the ith dimension. On each element, the approximation space for the DG method, \({\mathbb {V}}_{h}^k\), is

$$\begin{aligned} {\mathbb {V}}_{h}^{k}=\{\varphi _{h} : \varphi _{h}\big |_{{\mathbf {K}}} \in {\mathbb {Q}}^{k}({\mathbf {K}}), \, \, \forall \ {\mathbf {K}}\in {\mathscr {T}} \}, \end{aligned}$$
(612)

where \({\mathbb {Q}}^{k}\) is the space of tensor products of one-dimensional polynomials of maximal degree k. The approximation to the distribution function, \(f_{h}\), is then expressed as

$$\begin{aligned} f_{h}({\mathbf {z}},t)=\sum _{i=1}^{(k+1)^{3}}C_{i}(t)\,P_{i}({\mathbf {z}}), \end{aligned}$$
(613)

where each \(P_{i}\in {\mathbb {V}}_{h}^{k}\). Note that functions in \({\mathbb {V}}_{h}^k\) can be discontinuous across element interfaces. Then, for any \((r,\mu ,\varepsilon ) \in D\) and any \(\varphi _{h} \in {\mathbb {V}}_{h}^{k}\), the DG method is as follows: Find \(f_{h} \in {\mathbb {V}}_{h}^{k}\) such that

$$\begin{aligned}&\int _{{\mathbf {K}}}\partial _{t}{}f_{h}\,\varphi _{h}\,dV -\int _{{\mathbf {K}}}H^{(r)}f_{h}\partial _{r}{\varphi _{h}}\,dV -\int _{{\mathbf {K}}}H^{(\mu )}f_{h}\partial _{\mu }{\varphi _{h}}\,dV -\int _{{\mathbf {K}}}H^{(\varepsilon )}f_{h}\,\partial _{\varepsilon }{\varphi _{h}}\,dV \nonumber \\&\qquad + \int _{{\tilde{K}}^{(r)}}\widehat{H^{(r)}f_{h}}(r_{{\textsc {H}}},\mu ,\varepsilon )\,\varphi _{h}(r_{{\textsc {H}}}^{-},\mu ,\varepsilon )\,\tau (r_{{\textsc {H}}},\varepsilon )\,d{\tilde{V}}^{(r)} \nonumber \\&\qquad - \int _{{\tilde{K}}^{(r)}}\widehat{H^{(r)}f_{h}}(r_{{\textsc {L}}},\mu ,\varepsilon )\,\varphi _{h}(r_{{\textsc {L}}}^{+},\mu ,\varepsilon )\,\tau (r_{{\textsc {L}}},\varepsilon )\,d{\tilde{V}}^{(r)} \nonumber \\&\qquad + \int _{{\tilde{K}}^{(\mu )}}\widehat{H^{(\mu )}f_{h}}(r,\mu _{{\textsc {H}}},\varepsilon )\,\varphi _{h}(r,\mu _{{\textsc {H}}}^{-},\varepsilon )\,\tau (r,\varepsilon )\,d{\tilde{V}}^{(\mu )} \nonumber \\&\qquad - \int _{{\tilde{K}}^{(\mu )}}\widehat{H^{(\mu )}f_{h}}(r,\mu _{{\textsc {L}}},\varepsilon )\,\varphi _{h}(r,\mu _{{\textsc {L}}}^{+},\varepsilon )\,\tau (r,\varepsilon )\,d{\tilde{V}}^{(\mu )} \nonumber \\&\qquad + \int _{{\tilde{K}}^{(\varepsilon )}}\widehat{H^{(\varepsilon )}f_{h}}(r,\mu ,\varepsilon _{{\textsc {H}}})\,\varphi _{h}(r,\mu ,\varepsilon _{{\textsc {H}}}^{-})\,\tau (r,\varepsilon _{{\textsc {H}}})\,d{\tilde{V}}^{(\varepsilon )} \nonumber \\&\qquad - \int _{{\tilde{K}}^{(\varepsilon )}}\widehat{H^{(\varepsilon )}f_{h}}(r,\mu ,\varepsilon _{{\textsc {L}}})\,\varphi _{h}(r,\mu ,\varepsilon _{{\textsc {L}}}^{+})\,\tau (r,\varepsilon _{{\textsc {L}}})\,d{\tilde{V}}^{(\varepsilon )}=0, \end{aligned}$$
(614)

where the infinitesimal phase-space volume and “area” elements are

$$\begin{aligned} dV=\tau \,dr\,d\mu \,d\varepsilon ,\quad d{\tilde{V}}^{(r)}=d\mu \,d\varepsilon ,\quad d{\tilde{V}}^{(\mu )}=dr\,d\varepsilon ,\quad d{\tilde{V}}^{(\varepsilon )}=dr\,d\mu , \end{aligned}$$
(615)

and the subelements are

$$\begin{aligned} {\tilde{K}}^{(r)}=K^{(\mu )}\times K^{(\varepsilon )},\quad {\tilde{K}}^{(\mu )}=K^{(r)}\times K^{(\varepsilon )},\quad {\tilde{K}}^{(\varepsilon )}=K^{(r)}\times K^{(\mu )}. \end{aligned}$$
(616)

In Eq. (614), upwind fluxes are used for the numerical fluxes on element interfaces:

$$\begin{aligned} \widehat{H^{(r)}f_{h}}(r_{{\textsc {H}}/{\textsc {L}}},\mu ,\varepsilon )&={\mathscr {H}}^{(r)}\big (f_{h}(r_{{\textsc {H}}/{\textsc {L}}}^{-},\mu ,\varepsilon ),f_{h}(r_{{\textsc {H}}/{\textsc {L}}}^{+},\mu ,\varepsilon ); r_{{\textsc {H}}/{\textsc {L}}},\mu ,\varepsilon \big ) \nonumber \\&=\frac{\alpha _{{\textsc {H}}/{\textsc {L}}}}{\psi _{{\textsc {H}}/{\textsc {L}}}^{2}} \Big \{\, \frac{1}{2}\big (\mu +|\mu |\big )\,f_{h}(r_{{\textsc {H}}/{\textsc {L}}}^{-},\mu ,\varepsilon )+\frac{1}{2}\big (\mu -|\mu |\big )\,f_{h}(r_{{\textsc {H}}/{\textsc {L}}}^{+},\mu ,\varepsilon ) \,\Big \}, \end{aligned}$$
(617)
$$\begin{aligned} \widehat{H^{(\mu )}f_{h}}(r,\mu _{{\textsc {H}}/{\textsc {L}}},\varepsilon )&={\mathscr {H}}^{(\mu )}\big (f_{h}(r,\mu _{{\textsc {H}}/{\textsc {L}}}^{-},\varepsilon ),f_{h}(r,\mu _{{\textsc {H}}/{\textsc {L}}}^{+},\varepsilon ); r,\mu _{{\textsc {H}}/{\textsc {L}}},\varepsilon \big ) \nonumber \\&=\frac{\alpha }{\psi ^{2}\,r}\,(1-\mu _{{\textsc {H}}/{\textsc {L}}}^{2})\, \Big \{\, \frac{1}{2}\big (\varPsi +|\varPsi |\big )\,f_{h}(r,\mu _{{\textsc {H}}/{\textsc {L}}}^{-},\varepsilon ) \nonumber \\&\quad +\frac{1}{2}\big (\varPsi -|\varPsi |\big )\,f_{h}(r,\mu _{{\textsc {H}}/{\textsc {L}}}^{+},\varepsilon ) \,\Big \}, \end{aligned}$$
(618)
$$\begin{aligned} \widehat{H^{(\varepsilon )}f_{h}}(r,\mu ,\varepsilon _{{\textsc {H}}/{\textsc {L}}})&={\mathscr {H}}^{(\varepsilon )}\big (f_{h}(r,\mu ,\varepsilon _{{\textsc {H}}/{\textsc {L}}}^{-}),f_{h}(r,\mu ,\varepsilon _{{\textsc {H}}/{\textsc {L}}}^{+}); r,\mu ,\varepsilon _{{\textsc {H}}/{\textsc {L}}}\big ) \nonumber \\&=-\frac{\varepsilon _{{\textsc {H}}/{\textsc {L}}}}{\psi ^{2}} \Big \{\, \frac{1}{2}\big (\partial _{r}{}\alpha \,\mu -|\partial _{r}{}\alpha \,\mu |\big )\,f_{h}(r,\mu ,\varepsilon _{{\textsc {H}}/{\textsc {L}}}^{-}) \nonumber \\&\quad +\frac{1}{2}\big (\partial _{r}{}\alpha \,\mu +|\partial _{r}{}\alpha \,\mu |\big )\,f_{h}(r,\mu ,\varepsilon _{{\textsc {H}}/{\textsc {L}}}^{+}) \,\Big \}. \end{aligned}$$
(619)

Key to the design of a time-explicit, bound-preserving method for Eq. (606) is to find conditions such that, after the update from \(f_{h}^{n}\) to \(f_{h}^{n+1}\) with time step \(\varDelta t=t^{n+1}-t^{n}\), the cell-averaged distribution function, defined as

$$\begin{aligned} f_{{\mathbf {K}}}=\frac{1}{V_{{\mathbf {K}}}}\int _{{\mathbf {K}}}f_{h}\,dV, \quad \text {where}\quad V_{{\mathbf {K}}} =\int _{{\mathbf {K}}}dV, \end{aligned}$$
(620)

satisfies the bounds; i.e., \(f_{{\mathbf {K}}}^{n+1}\in [0,1]\). The standard approach is to find sufficient conditions such that these bounds hold with the first-order forward Euler method, while the extension to higher-order accuracy in time relies on the use of a strong stability-preserving (SSP) time stepping method, which can be expressed as convex combinations of forward Euler operators (Gottlieb et al. 2001). The conditions that are sought include a time step restriction. Then, if the bounds on the cell average at \(t^{n+1}\) hold with the forward Euler method provided \(\varDelta t\le \varDelta t_{\mathrm {FE}}\) (where \(\varDelta t_{\mathrm {FE}}\) is to be determined), the bounds will also hold when an SSP method is used, provided \(\varDelta t\le C_{\mathrm {SSP}}\times \varDelta t_{\mathrm {FE}}\), where \(0<C_{\mathrm {SSP}}\le 1\). For the optimal second- and third-order SSP Runge–Kutta (SSP-RK) methods from Shu and Osher (1988), \(C_{\mathrm {SSP}}=1\).

The equation for the cell-average is obtained from Eq. (614) with \(\varphi _{h}=1\) (the lowest possible degree polynomial in the approximation space \({\mathbb {V}}_{h}^{k}\)). With forward Euler time stepping, we then have

$$\begin{aligned} f_{{\mathbf {K}}}^{n+1}&=f_{{\mathbf {K}}}^{n} -\frac{\varDelta t}{V_{{\mathbf {K}}}} \Big \{\, \psi ^{6}(r_{{\textsc {H}}})\,r_{{\textsc {H}}}^{2}\int _{{\tilde{K}}^{(r)}}\widehat{H^{(r)}f_{h}^{n}}(r_{{\textsc {H}}},\mu ,\varepsilon )\,\varepsilon ^{2}\,d{\tilde{V}}^{(r)} \nonumber \\&\quad -\psi ^{6}(r_{{\textsc {L}}})\,r_{{\textsc {L}}}^{2}\int _{{\tilde{K}}^{(r)}}\widehat{H^{(r)}f_{h}^{n}}(r_{{\textsc {L}}},\mu ,\varepsilon )\,\varepsilon ^{2}\,d{\tilde{V}}^{(r)} \nonumber \\&\quad +\int _{{\tilde{K}}^{(\mu )}}\widehat{H^{(\mu )}f_{h}^{n}}(r,\mu _{{\textsc {H}}},\varepsilon )\,\psi ^{6}(r)\,r^{2}\,\varepsilon ^{2}\,d{\tilde{V}}^{(\mu )} \nonumber \\&\quad -\int _{{\tilde{K}}^{(\mu )}}\widehat{H^{(\mu )}f_{h}^{n}}(r,\mu _{{\textsc {L}}},\varepsilon )\,\psi ^{6}(r)\,r^{2}\,\varepsilon ^{2}\,d{\tilde{V}}^{(\mu )} \nonumber \\&\quad +\varepsilon _{{\textsc {H}}}^{2}\int _{{\tilde{K}}^{(\varepsilon )}}\widehat{H^{(\varepsilon )}f_{h}^{n}}(r,\mu ,\varepsilon _{{\textsc {H}}})\,\psi ^{6}(r)\,r^{2}\,d{\tilde{V}}^{(\varepsilon )} \nonumber \\&\quad -\varepsilon _{{\textsc {L}}}^{2}\int _{{\tilde{K}}^{(\varepsilon )}}\widehat{H^{(\varepsilon )}f_{h}^{n}}(r,\mu ,\varepsilon _{{\textsc {L}}})\,\psi ^{6}(r)\,r^{2}\,d{\tilde{V}}^{(\varepsilon )} \,\Big \}. \end{aligned}$$
(621)

Assuming that \(f_{{\mathbf {K}}}^{n}\in [0,1]\), the flux terms (which can be positive or negative) can bring \(f_{{\mathbf {K}}}^{n+1}\) outside the bounds. The contributions from these terms vanish as \(\varDelta t\rightarrow 0\), and this is where restrictions on the time step comes in. To find these restriction, \(f_{{\mathbf {K}}}^{n}\) is split into three parts and combined with the flux terms arising from the three phase-space dimensions in the current setting. To this end, we define positive constants \(s_{1},s_{2},s_{3}>0\), satisfying \(s_{1}+s_{2}+s_{3}=1\), and write the cell-average as

$$\begin{aligned} f_{{\mathbf {K}}}^{n}&= \frac{s_{1}}{V_{{\mathbf {K}}}}\int _{{\tilde{K}}^{(r)}}\int _{K^{(r)}}f_{h}^{n}\,\tau \,dr\,d{\tilde{V}}^{(r)} + \frac{s_{2}}{V_{{\mathbf {K}}}}\int _{{\tilde{K}}^{(\mu )}}\int _{K^{(\mu )}}f_{h}^{n}\,\tau \,d\mu \,d{\tilde{V}}^{(\mu )} \nonumber \\&\quad + \frac{s_{3}}{V_{{\mathbf {K}}}}\int _{{\tilde{K}}^{(\varepsilon )}}\int _{K^{(\varepsilon )}}f_{h}^{n}\,\tau \,d\varepsilon \,d{\tilde{V}}^{(\varepsilon )}. \end{aligned}$$
(622)

Inserting this into Eq. (621) gives

$$\begin{aligned} f_{{\mathbf {K}}}^{n+1} = \frac{s_{1}}{V_{{\mathbf {K}}}}\int _{{\tilde{K}}^{(r)}}\varGamma ^{(r)}[f_{h}^{n}]d{\tilde{V}}^{(r)} + \frac{s_{2}}{V_{{\mathbf {K}}}}\int _{{\tilde{K}}^{(\mu )}}\varGamma ^{(\mu )}[f_{h}^{n}]d{\tilde{V}}^{(\mu )} + \frac{s_{3}}{V_{{\mathbf {K}}}}\int _{{\tilde{K}}^{(\varepsilon )}}\varGamma ^{(\varepsilon )}[f_{h}^{n}]d{\tilde{V}}^{(\varepsilon )}, \end{aligned}$$
(623)

where

$$\begin{aligned}&\varGamma ^{(r)}[f_{h}^{n}](\mu ,\varepsilon ) \nonumber \\&\quad =\int _{K^{(r)}}f_{h}^{n}\,\tau \,dr - \frac{\varDelta t\,\varepsilon ^{2}}{s_{1}}\Big \{\,\psi ^{6}(r_{{\textsc {H}}})\,r_{{\textsc {H}}}^{2}\,\widehat{H^{(r)}f_{h}^{n}}(r_{{\textsc {H}}},\mu ,\varepsilon )-\psi ^{6}(r_{{\textsc {L}}})\,r_{{\textsc {L}}}^{2}\,\widehat{H^{(r)}f_{h}^{n}}(r_{{\textsc {L}}},\mu ,\varepsilon )\,\Big \}, \end{aligned}$$
(624)
$$\begin{aligned}&\varGamma ^{(\mu )}[f_{h}^{n}](r,\varepsilon ) \nonumber \\&\quad =\int _{K^{(\mu )}}f_{h}^{n}\,\tau \,d\mu -\frac{\varDelta t\tau }{s_{2}}\Big \{\,\widehat{H^{(\mu )}f_{h}^{n}}(r,\mu _{{\textsc {H}}},\varepsilon ) - \widehat{H^{(\mu )}f_{h}^{n}}(r,\mu _{{\textsc {L}}},\varepsilon ) \,\Big \}, \end{aligned}$$
(625)
$$\begin{aligned}&\varGamma ^{(\varepsilon )}[f_{h}^{n}](r,\mu ) \nonumber \\&\quad =\int _{K^{(\varepsilon )}}f_{h}^{n}\,\tau \,d\varepsilon -\frac{\varDelta t\psi ^{6}(r)r^{2}}{s_{3}}\Big \{\,\varepsilon _{{\textsc {H}}}^{2}\,\widehat{H^{(\varepsilon )}f_{h}}(r,\mu ,\varepsilon _{{\textsc {H}}})-\varepsilon _{{\textsc {L}}}^{2}\,\widehat{H^{(\varepsilon )}f_{h}}(r,\mu ,\varepsilon _{{\textsc {L}}})\,\Big \}. \end{aligned}$$
(626)

With the cell-average expressed as in Eq. (623), in order to ensure \(f_{{\mathbf {K}}}^{n+1}\ge 0\), it is sufficient to find conditions for which each of the right-hand sides in Eqs. (624), (625), (626) are nonnegative. We illustrate the details of this for Eq. (624) (see Endeve et al. 2015, for full details). In the DG method, the integrals over the faces \({\tilde{K}}^{(r)}\), \({\tilde{K}}^{(\mu )}\), and \({\tilde{K}}^{(\varepsilon )}\) in Eq. (623) are typically evaluated with a quadrature rule. In this case, it is sufficient that \(\varGamma ^{(r)},\varGamma ^{(\mu )},\varGamma ^{(\varepsilon )}\ge 0\) holds in the respective quadrature points. As an example, we let \(\tilde{{\mathbf {S}}}^{(r)}(\in {\tilde{K}}^{(r)})\) denote the set of quadrature points used to integrate over \({\tilde{K}}^{(r)}\) in Eq. (623).

To evaluate the integral on the right-hand side of Eq. (624), an \(N^{(r)}\)-point Gauss-Lobatto quadrature rule is used on the interval \(K^{(r)}\), with points

$$\begin{aligned} {\hat{S}}^{(r)} = \big \{\,r_{{\textsc {L}}}={\hat{r}}_{1},\ldots ,{\hat{r}}_{N^{(r)}}=r_{{\textsc {H}}}\,\big \}, \end{aligned}$$
(627)

and weights \({\hat{w}}_{q}\in (0,1]\), normalized such that \(\sum _{q=1}^{N^{(r)}}{\hat{w}}_{q}=1\). This quadrature integrates polynomials in r of degree \(2N^{(r)}-3\) exactly. We can then write

$$\begin{aligned} \int _{K^{(r)}}f_{h}^{n}\,\tau \,dr = \varDelta r\sum _{q=1}^{N^{(r)}}{\hat{w}}_{q}\,f_{h}^{n}({\hat{r}}_{q},\mu ,\varepsilon )\,\tau ({\hat{r}}_{q},\mu ,\varepsilon ). \end{aligned}$$
(628)

If the distribution function is approximated with a polynomial of degree k in r, and \(\psi ^{6}\) is approximated by a polynomial of degree \(k_{\psi }\), the quadrature is exact if \(N^{(r)}\ge (k+k_{\psi }+5)/2\). The reason for using the Gauss-Lobatto quadrature for the integral over \(K^{(r)}\) is because it includes the end points of the interval (\(r_{{\textsc {L}}},r_{{\textsc {H}}}\)). These end points are used to balance the flux terms in the radial dimension. Inserting Eq. (628) into Eq. (624) gives

$$\begin{aligned} \frac{1}{\varDelta r}\varGamma ^{(r)}[f_{h}^{n}]&=\sum _{q=1}^{N^{(r)}}{\hat{w}}_{q}\,f_{h}^{n}({\hat{r}}_{q})\,\tau ({\hat{r}}_{q}) \nonumber \\&\quad - \frac{\varDelta t\,\varepsilon ^{2}}{s_{1}}\Big \{\,\psi ^{6}(r_{{\textsc {H}}})\,r_{{\textsc {H}}}^{2}\,{\mathscr {H}}^{(r)}\big (f_{h}(r_{{\textsc {H}}}^{-}),f_{h}(r_{{\textsc {H}}}^{+}); r_{{\textsc {H}}}\big ) \nonumber \\&\quad -\psi ^{6}(r_{{\textsc {L}}})\,r_{{\textsc {L}}}^{2}\,{\mathscr {H}}^{(r)}\big (f_{h}(r_{{\textsc {L}}}^{-}),f_{h}(r_{{\textsc {L}}}^{+}); r_{{\textsc {L}}}\big )\,\Big \} \nonumber \\&=\sum _{q=2}^{N^{(r)}-1}{\hat{w}}_{q}\,f_{h}^{n}({\hat{r}}_{q})\,\tau ({\hat{r}}_{q}) + {\hat{w}}_{1}\,\varPhi _{1}^{(r)}\big [f_{h}^{n}(r_{{\textsc {L}}}^{-}),f_{h}^{n}(r_{{\textsc {L}}}^{+})\big ]\,\tau (r_{{\textsc {L}}}) \nonumber \\&\quad +{\hat{w}}_{N^{(r)}}\,\varPhi _{N^{(r)}}^{(r)}\big [f_{h}^{n}(r_{{\textsc {H}}}^{-}),f_{h}^{n}(r_{{\textsc {H}}}^{+})\big ]\,\tau (r_{{\textsc {H}}}), \end{aligned}$$
(629)

where

$$\begin{aligned} \varPhi _{1}^{(r)}\big [f_{h}^{n}(r_{{\textsc {L}}}^{-}),f_{h}^{n}(r_{{\textsc {L}}}^{+})\big ]&=f_{h}^{n}(r_{{\textsc {L}}}^{+})+\frac{\varDelta t}{s_{1}{\hat{w}}_{1}\varDelta r}\,{\mathscr {H}}^{(r)}\big (f_{h}^{n}(r_{{\textsc {L}}}^{-}),f_{h}^{n}(r_{{\textsc {L}}}^{+}); r_{{\textsc {L}}}\big ), \end{aligned}$$
(630)
$$\begin{aligned} \varPhi _{N^{(r)}}^{(r)}\big [f_{h}^{n}(r_{{\textsc {H}}}^{-}),f_{h}^{n}(r_{{\textsc {H}}}^{+})\big ]&=f_{h}^{n}(r_{{\textsc {H}}}^{-})-\frac{\varDelta t}{s_{1}{\hat{w}}_{N^{(r)}}\varDelta r}\,{\mathscr {H}}^{(r)}\big (f_{h}^{n}(r_{{\textsc {H}}}^{-}),f_{h}^{n}(r_{{\textsc {H}}}^{+}); r_{{\textsc {H}}}\big ). \end{aligned}$$
(631)

(For notational brevity, we have suppressed the \((\mu ,\varepsilon )\)-dependence.) Using the numerical flux function in Eq. (617), one can write

$$\begin{aligned}&\varPhi _{1}^{(r)}\big [f_{h}^{n}(r_{{\textsc {L}}}^{-}),f_{h}^{n}(r_{{\textsc {L}}}^{+})\big ] \nonumber \\&\quad = f_{h}^{n}(r_{{\textsc {L}}}^{+}) +\frac{\varDelta t}{s_{1}{\hat{w}}_{1}\varDelta r}\,\frac{\alpha (r_{{\textsc {L}}})}{\psi ^{2}(r_{{\textsc {L}}})} \Big \{\, \frac{1}{2}\big (\mu +|\mu |\big )\,f_{h}^{n}(r_{{\textsc {L}}}^{-})+\frac{1}{2}\big (\mu -|\mu |\big )\,f_{h}^{n}(r_{{\textsc {L}}}^{+}) \,\Big \} \nonumber \\&\quad =\frac{\varDelta t}{s_{1}{\hat{w}}_{1}\varDelta r}\,\frac{\alpha (r_{{\textsc {L}}})}{\psi ^{2}(r_{{\textsc {L}}})}\,\frac{1}{2}\big (\mu +|\mu |\big )\,f_{h}^{n}(r_{{\textsc {L}}}^{-}) +\Big \{\,1+\frac{\varDelta t}{s_{1}{\hat{w}}_{1}\varDelta r}\,\frac{\alpha (r_{{\textsc {L}}})}{\psi ^{2}(r_{{\textsc {L}}})}\,\frac{1}{2}\big (\mu -|\mu |\big )\,\Big \}\,f_{h}^{n}(r_{{\textsc {L}}}^{+}). \end{aligned}$$
(632)

On the right-hand side of Eq. (632) (last line), the coefficient in front of \(f_{h}^{n}(r_{{\textsc {L}}}^{-})\) is nonnegative since \(\alpha (r_{{\textsc {L}}}),\psi ^{2}(r_{{\textsc {L}}})>0\) and \(\big (\mu +|\mu |\big )\ge 0\). Only the coefficient in front of \(f_{h}^{n}(r_{{\textsc {L}}}^{+})\) can become negative since \(\big (\mu -|\mu |\big )\le 0\). Assuming \(f_{h}^{n}(r_{{\textsc {L}}}^{-}),f_{h}^{n}(r_{{\textsc {L}}}^{+})\ge 0\), it is easy to show that \(\varPhi _{1}^{(r)}\big [f_{h}^{n}(r_{{\textsc {L}}}^{-}),f_{h}^{n}(r_{{\textsc {L}}}^{+})\big ]\ge 0\), if

$$\begin{aligned} \varDelta t\le \frac{s_{1}{\hat{w}}_{1}\varDelta r}{|\mu |}\,\frac{\psi ^{2}(r_{{\textsc {L}}})}{\alpha (r_{{\textsc {L}}})}. \end{aligned}$$
(633)

Similarly, for \(f_{h}^{n}(r_{{\textsc {H}}}^{-}),f_{h}^{n}(r_{{\textsc {H}}}^{+})\ge 0\), one finds that \(\varPhi _{N^{(r)}}^{(r)}\big [f_{h}^{n}(r_{{\textsc {H}}}^{-}),f_{h}^{n}(r_{{\textsc {H}}}^{+})\big ]\ge 0\), if

$$\begin{aligned} \varDelta t\le \frac{s_{1}{\hat{w}}_{N^{(r)}}\varDelta r}{|\mu |}\,\frac{\psi ^{2}(r_{{\textsc {H}}})}{\alpha (r_{{\textsc {H}}})}. \end{aligned}$$
(634)

Therefore, assuming \(f_{h}^{n}\ge 0\) in the combined quadrature set \({\mathbf {S}}^{(r)}={\hat{S}}^{(r)}\otimes \tilde{{\mathbf {S}}}^{(r)}\), where the points in \({\hat{S}}^{(r)}\) are used to evaluate the integral over \(K^{(r)}\) in Eq. (624) and the points in \(\tilde{{\mathbf {S}}}^{(r)}\) are used to evaluate the integral over \({\tilde{K}}^{(r)}\) in Eq. (623), a sufficient condition on the time step to guarantee \(\int _{{\tilde{K}}^{(r)}}\varGamma ^{(r)}[f_{h}^{n}]d{\tilde{V}}^{(r)}\ge 0\) is given by

$$\begin{aligned} \varDelta t&\le \min \big (\psi ^{2}(r_{{\textsc {L}}})/\alpha (r_{{\textsc {L}}}),\psi ^{2}(r_{{\textsc {H}}})/\alpha (r_{{\textsc {H}}})\big )\,{\hat{w}}_{N^{(r)}}\,s_{1}\,\varDelta r. \end{aligned}$$
(635)

(Here, \({\hat{w}}_{1}={\hat{w}}_{N^{(r)}}\) is used.) Sufficient conditions on \(\varDelta t\) for \(\int _{{\tilde{K}}^{(\mu )}}\varGamma ^{(\mu )}[f_{h}^{n}]d{\tilde{V}}^{(\mu )}\ge 0\) and \(\int _{{\tilde{K}}^{(\varepsilon )}}\varGamma ^{(\varepsilon )}[f_{h}^{n}]d{\tilde{V}}^{(\varepsilon )}\ge 0\) can be derived in a similar way [we refer the interested reader to Endeve et al. (2015) for details]. Together, these restrictions on the time step ensures \(f_{{\mathbf {K}}}^{n+1}\ge 0\). It should be noted that the time step restrictions derived here are sufficient, not necessary, conditions. They are typically more restrictive than the time step required for numerical stability. Thus, in practical calculations, larger time steps may be taken. If violations of the physical bounds are detected after a time step, \(\varDelta t\) can be reduced to the sufficient conditions before the time step is redone.

The proof for \(f_{{\mathbf {K}}}^{n+1}\le 1\) relies on the divergence-free condition in Eq. (610), which can be written as

$$\begin{aligned}&\frac{1}{V_{{\mathbf {K}}}} \Big \{\, \psi ^{6}(r_{{\textsc {H}}})\,r_{{\textsc {H}}}^{2}\int _{{\tilde{K}}^{(r)}}H^{(r)}(r_{{\textsc {H}}},\mu ,\varepsilon )\,\varepsilon ^{2}\,d{\tilde{V}}^{(r)} \nonumber \\&\quad \quad -\psi ^{6}(r_{{\textsc {L}}})\,r_{{\textsc {L}}}^{2}\int _{{\tilde{K}}^{(r)}}H^{(r)}(r_{{\textsc {L}}},\mu ,\varepsilon )\,\varepsilon ^{2}\,d{\tilde{V}}^{(r)} \nonumber \\&\qquad +\int _{{\tilde{K}}^{(\mu )}}H^{(\mu )}(r,\mu _{{\textsc {H}}},\varepsilon )\,\psi ^{6}(r)\,r^{2}\,\varepsilon ^{2}\,d{\tilde{V}}^{(\mu )} \nonumber \\&\qquad -\int _{{\tilde{K}}^{(\mu )}}H^{(\mu )}(r,\mu _{{\textsc {L}}},\varepsilon )\,\psi ^{6}(r)\,r^{2}\,\varepsilon ^{2}\,d{\tilde{V}}^{(\mu )} \nonumber \\&\qquad +\varepsilon _{{\textsc {H}}}^{2}\int _{{\tilde{K}}^{(\varepsilon )}}H^{(\varepsilon )}(r,\mu ,\varepsilon _{{\textsc {H}}})\,\psi ^{6}(r)\,r^{2}\,d{\tilde{V}}^{(\varepsilon )} \nonumber \\&\qquad -\varepsilon _{{\textsc {L}}}^{2}\int _{{\tilde{K}}^{(\varepsilon )}}H^{(\varepsilon )}(r,\mu ,\varepsilon _{{\textsc {L}}})\,\psi ^{6}(r)\,r^{2}\,d{\tilde{V}}^{(\varepsilon )} \,\Big \} = 0. \end{aligned}$$
(636)

In Eq. (614), we approximate the derivatives \(\partial _{r}\alpha \) and \(\partial _{r}\psi ^{4}\) in \(K^{(r)}\) [appearing in \(H^{(\mu )}\) and \(H^{(\varepsilon )}\); cf. Eq. (607)] with polynomials and compute \(\alpha \) and \(\psi ^{4}\) from

$$\begin{aligned} \alpha (r)=\alpha (r_{{\textsc {L}}})+\int _{r_{{\textsc {L}}}}^{r}\partial _{r}{}\alpha (r')\,dr'\quad \text{ and }\quad \psi ^{4}(r)=\psi ^{4}(r_{{\textsc {L}}})+\int _{r_{{\textsc {L}}}}^{r}\partial _{r}{}\psi ^{4}(r')\,dr', \end{aligned}$$
(637)

where the Gaussian quadrature rule is used to evaluate the integrals exactly. Two-dimensional Gaussian quadrature rules are also used to evaluate the integrals over \({\tilde{K}}^{(r)}\), \({\tilde{K}}^{(\mu )}\), and \({\tilde{K}}^{(\varepsilon )}\), using \(L^{(r)}\), \(L^{(\mu )}\), and \(L^{(\varepsilon )}\) points in the r, \(\mu \), and \(\varepsilon \) dimensions, respectively. With this choice, it is straightforward to show that the discretization satisfies the divergence-free condition (636), provided \(L^{(\mu )}\ge 1\), \(L^{(\varepsilon )}\ge 2\), while \(L^{(r)}\) depends on the degree of the polynomials approximating \(\partial _{r}{}\alpha \) and \(\partial _{r}{}\psi ^{4}\).

Using the definitions in Eqs. (624)–(626), a direct calculation shows that

$$\begin{aligned}&\frac{s_{1}}{V_{{\mathbf {K}}}}\int _{{\tilde{K}}^{(r)}}\varGamma ^{(r)}[1]d{\tilde{V}}^{(r)} + \frac{s_{2}}{V_{{\mathbf {K}}}}\int _{{\tilde{K}}^{(\mu )}}\varGamma ^{(\mu )}[1]d{\tilde{V}}^{(\mu )} + \frac{s_{3}}{V_{{\mathbf {K}}}}\int _{{\tilde{K}}^{(\varepsilon )}}\varGamma ^{(\varepsilon )}[1]d{\tilde{V}}^{(\varepsilon )} \nonumber \\&\quad =\frac{s_{1}}{V_{{\mathbf {K}}}}\int _{{\tilde{K}}^{(r)}}\int _{K^{(r)}}\tau \,dr\,d{\tilde{V}}^{(r)} + \frac{s_{2}}{V_{{\mathbf {K}}}}\int _{{\tilde{K}}^{(\mu )}}\int _{K^{(\mu )}}\tau \,d\mu \,d{\tilde{V}}^{(\mu )} + \frac{s_{3}}{V_{{\mathbf {K}}}}\int _{{\tilde{K}}^{(\varepsilon )}}\int _{K^{(\varepsilon )}}\tau \,d\varepsilon \,d{\tilde{V}}^{(\varepsilon )} \nonumber \\&\qquad -\frac{\varDelta t}{V_{{\mathbf {K}}}} \Big \{\, \psi ^{6}(r_{{\textsc {H}}})\,r_{{\textsc {H}}}^{2}\int _{{\tilde{K}}^{(r)}}H^{(r)}(r_{{\textsc {H}}},\mu ,\varepsilon )\,\varepsilon ^{2}\,d{\tilde{V}}^{(r)} -\psi ^{6}(r_{{\textsc {L}}})\,r_{{\textsc {L}}}^{2}\int _{{\tilde{K}}^{(r)}}H^{(r)}(r_{{\textsc {L}}},\mu ,\varepsilon )\,\varepsilon ^{2}\,d{\tilde{V}}^{(r)} \nonumber \\&\qquad +\int _{{\tilde{K}}^{(\mu )}}H^{(\mu )}(r,\mu _{{\textsc {H}}},\varepsilon )\,\psi ^{6}(r)\,r^{2}\,\varepsilon ^{2}\,d{\tilde{V}}^{(\mu )} - \int _{{\tilde{K}}^{(\mu )}}H^{(\mu )}(r,\mu _{{\textsc {L}}},\varepsilon )\,\psi ^{6}(r)\,r^{2}\,\varepsilon ^{2}\,d{\tilde{V}}^{(\mu )} \nonumber \\&\qquad +\varepsilon _{{\textsc {H}}}^{2}\int _{{\tilde{K}}^{(\varepsilon )}}H^{(\varepsilon )}(r,\mu ,\varepsilon _{{\textsc {H}}})\,\psi ^{6}(r)\,r^{2}\,d{\tilde{V}}^{(\varepsilon )}-\varepsilon _{{\textsc {L}}}^{2}\int _{{\tilde{K}}^{(\varepsilon )}}H^{(\varepsilon )}(r,\mu ,\varepsilon _{{\textsc {L}}})\,\psi ^{6}(r)\,r^{2}\,d{\tilde{V}}^{(\varepsilon )}\,\Big \} \nonumber \\&\quad =s_{1}+s_{2}+s_{3}=1, \end{aligned}$$
(638)

where the divergence-free condition in Eq. (636) is used. Since the divergence-free condition holds, it is then straightforward to show that the cell-average of \(g_{h}=1-f_{h}\) satisfies [cf. Eq. (623)]

$$\begin{aligned} g_{{\mathbf {K}}}^{n+1}&=1 - f_{{\mathbf {K}}}^{n+1} \nonumber \\&= \frac{s_{1}}{V_{{\mathbf {K}}}}\int _{{\tilde{K}}^{(r)}}\big (\varGamma ^{(r)}[1]-\varGamma ^{(r)}[f_{h}^{n}]\big )d{\tilde{V}}^{(r)} + \frac{s_{2}}{V_{{\mathbf {K}}}}\int _{{\tilde{K}}^{(\mu )}}\big (\varGamma ^{(\mu )}[1]-\varGamma ^{(\mu )}[f_{h}^{n}]\big )d{\tilde{V}}^{(\mu )} \nonumber \\&\quad +\frac{s_{3}}{V_{{\mathbf {K}}}}\int _{{\tilde{K}}^{(\varepsilon )}}\big (\varGamma ^{(\varepsilon )}[1]-\varGamma ^{(\varepsilon )}[f_{h}^{n}]\big )d{\tilde{V}}^{(\varepsilon )} \nonumber \\&= \frac{s_{1}}{V_{{\mathbf {K}}}}\int _{{\tilde{K}}^{(r)}}\varGamma ^{(r)}[g_{h}^{n}]d{\tilde{V}}^{(r)} + \frac{s_{2}}{V_{{\mathbf {K}}}}\int _{{\tilde{K}}^{(\mu )}}\varGamma ^{(\mu )}[g_{h}^{n}]d{\tilde{V}}^{(\mu )} + \frac{s_{3}}{V_{{\mathbf {K}}}}\int _{{\tilde{K}}^{(\varepsilon )}}\varGamma ^{(\varepsilon )}[g_{h}^{n}]d{\tilde{V}}^{(\varepsilon )}, \end{aligned}$$
(639)

where the linearity property of the operators in Eq. (624)–(626) is used; e.g., \(\varGamma ^{(r)}[1]-\varGamma ^{(r)}[f_{h}^{n}]=\varGamma ^{(r)}[1-f_{h}^{n}]=\varGamma ^{(r)}[g_{h}^{n}]\). Thus, provided Eq. (636) and the restrictions on \(\varDelta t\) hold, and the conditions on \(f_{h}^{n}\) also hold for \(g_{h}^{n}\), it follows that \(g_{{\mathbf {K}}}^{n+1}\ge 0\) (or \(f_{{\mathbf {K}}}^{n+1}\le 1\)).

The numerical method developed by Endeve et al. (2015), and outlined above, is designed to preserve the physical bounds of the cell averaged distribution function (i.e., \(0\le f_{{\mathbf {K}}}\le 1\)), provided sufficiently accurate quadratures are used, specific time step restrictions are satisfied, and that the polynomial approximating the distribution function inside each phase space element \({\mathbf {K}}\) at time \(t^{n}\) is bounded in a set of quadrature points, which we denote S. After one time step, it is possible that \(f_{h}^{n+1}\) violates the bounds for some points in the set S. In the DG method, the limiter proposed by Zhang and Shu (2010a) is used to reenforce the bounds. That is, the polynomial obtained after a time step \(\varDelta t\), \(f_{h}^{n+1}({\mathbf {z}})\), is replaced by with the “limited” polynomial

$$\begin{aligned} {\tilde{f}}_{h}^{n+1}({\mathbf {z}})=\vartheta \,f_{h}^{n+1}({\mathbf {z}})+(\,1-\vartheta \,)\,f_{{\mathbf {K}}}^{n+1}, \end{aligned}$$
(640)

where the limiter parameter \(\vartheta \in [0,1]\) is given by

$$\begin{aligned} \vartheta =\min \Big \{\Big |\frac{M-f_{{\mathbf {K}}}^{n+1}}{M_{S}-f_{{\mathbf {K}}}^{n+1}}\Big |,\Big |\frac{m-f_{{\mathbf {K}}}^{n+1}}{m_{S}-f_{{\mathbf {K}}}^{n+1}}\Big |,1\Big \}, \end{aligned}$$
(641)

with \(m=0\) and \(M=1\), and

$$\begin{aligned} M_{S}=\max _{{\mathbf {z}} \in S}f_{h}^{n+1}({\mathbf {z}}), \qquad m_{S}=\min _{{\mathbf {z}} \in S}f_{h}^{n+1}({\mathbf {z}}), \end{aligned}$$
(642)

and S represents the finite set of quadrature points in \({\mathbf {K}}\) where the bounds must hold. For \(\vartheta =0\), the entire solution is limited to the cell-average, while for \(\vartheta =1\) \({\tilde{f}}_{h}^{n+1}=f_{h}^{n+1}\). It is thus absolutely necessary to maintain the bounds on the cell-average, otherwise the limiting procedure will be futile. In practice, \(\vartheta \) remains close to unity, and the limiting is a small correction. It has been shown (Zhang and Shu 2010a) that this “linear scaling limiter” maintains high order of accuracy. Also, note that the limiting procedure is conservative for particle number since it preserves the cell averaged distribution function; i.e., by inserting Eq. (640) into the definition of the cell average in Eq. (620):

$$\begin{aligned} \frac{1}{V_{{\mathbf {K}}}}\int _{{\mathbf {K}}}{\tilde{f}}_{h}^{n+1}\,dV =\frac{1}{V_{{\mathbf {K}}}}\int _{{\mathbf {K}}}\big (\,\vartheta \,f_{h}^{n+1}+(\,1-\vartheta \,)\,f_{{\mathbf {K}}}^{n+1}\,\big )\,dV =f_{{\mathbf {K}}}^{n+1}. \end{aligned}$$
(643)

In the discussion above, forward Euler time stepping is used, which is only first-order accurate. For explicit time integration, the bound-preserving scheme can easily be extended to higher-order accuracy in time by using high-order SSP time stepping methods (Shu and Osher 1988; Gottlieb et al. 2001), which are multi-stage methods that can be formulated as convex combinations of forward Euler operators. Provided limiting is applied at each stage, the bound-preserving property follows from convexity arguments. For neutrino transport problems where neutrino–matter interactions are treated with implicit methods, it is difficult to achieve both high-order accuracy and bounded solutions, and this topic remains open for further research. We will discuss this issue further below in the context of a two-moment model. Another open issue is the challenge of simultaneous number and energy conservation in the phase space advection problem discussed here: The limiter in Eq. (640) preserves the particle number, but not higher moments of the distribution function. In the present model, the space time is stationary, which implies that the so-called Komar mass (\(\alpha \,\varepsilon \,f\)) is conserved. Thus, if bounded solutions and exact conservation of the Komar mass is desired, modifications to the limiter is needed.

6.7.3 Realizability-preserving moment methods

Chu et al. (2019) developed a numerical method for a two-moment model based on DG spatial discretization and IMEX time stepping. The method is specifically designed to preserve bounds on the moments as dictated by Pauli’s exclusion principle. As such, it is an extension of the bound-preserving method discussed above, but for a nonlinear system of hyperbolic balance laws with stiff sources. As is reasonable for an initial investigation, the model adopted by Chu et al. (2019) is rather simple, when compared to the two-moment models used to model neutrino transport in contemporary core-collapse supernova simulations. However, the work highlighted the role of the moment closure in the design of robust two-moment methods for neutrino transport, and developed an IMEX scheme with a reasonable time step restriction that is compatible with bounded solutions. As such, the work put down the foundations for a framework that may help future developments of robust methods for models with improved physical fidelity. To simplify the discussion, we consider the model in Chu et al. (2019) for one spatial dimension and define moments of the distribution function as

$$\begin{aligned} \big \{\,{\mathscr {J}},{\mathscr {H}},{\mathscr {K}}\,\big \}(x,t)=\frac{1}{2}\int _{-1}^{1}f(\mu ,x,t)\,\mu ^{\{0,1,2\}}\,d\mu . \end{aligned}$$
(644)

The two-moment model can be written as a system of hyperbolic balance laws as

$$\begin{aligned} \partial _{t}{{\mathbf {u}}} + \partial _{x}{{\mathbf {f}}({\mathbf {u}})} = \mathbf {\eta } - R\,{\mathbf {u}} \equiv {\mathbf {c}}({\mathbf {u}}), \end{aligned}$$
(645)

where the evolved moment vector is \({\mathbf {u}}=({\mathscr {J}},{\mathscr {H}})^{T}\), the flux vector is \({\mathbf {f}}=({\mathscr {H}},{\mathscr {K}})^{T}\), the emissivity is \(\mathbf {\eta }=(\sigma _{A}\,{\mathscr {J}}_{0},0)^{T}\), and \(R=\text{ diag }(\sigma _{A},\sigma _{T})\). Here, \({\mathscr {J}}_{0}\) is the zeroth moment of an equilibrium distribution function, \(f_{0}\), satisfying \(f_{0}\in [0,1]\) (i.e., Fermi–Dirac statistics), \(\sigma _{A}\ge 0\) is the absorption opacity, and \(\sigma _{T}=\sigma _{A}+\sigma _{S}\), where \(\sigma _{S}\ge 0\) is the scattering opacity (assuming isotropic and isoenergetic scattering). In Eq. (645), a closure is assumed so that \({\mathscr {K}}={\mathscr {K}}({\mathbf {u}})\).

For fermions, the Pauli exclusion principle requires the distribution function to satisfy the condition \(0 \le f \le 1\). This puts corresponding restrictions on realizable values for the moments of f. It is then interesting to study the design of a numerical method for solving the system of moment equations given by Eq. (645) that preserves realizability of the moments; i.e., the moments evolve within the set of admissible values as dictated by Pauli’s exclusion principle. If we let

$$\begin{aligned} {\mathfrak {R}} := \left\{ \,f\,|\,0\le f \le 1 \,\text {and}\,0<\frac{1}{2}\int _{-1}^{1}f\,d\mu <1\,\right\} , \end{aligned}$$
(646)

the moments \({\mathbf {u}}=({\mathscr {J}},{\mathscr {H}})^{T}\) are realizable if they can be obtained from a distribution function \(f(\mu )\in {\mathfrak {R}}\). The set of all realizable moments \({\mathscr {R}}\) is (e.g., Larecki and Banach 2011)

$$\begin{aligned} {\mathscr {R}}:=\big \{\,{\mathbf {u}}=\big ({\mathscr {J}},{\mathscr {H}}\big )^{T}\,|\,{\mathscr {J}}\in (0,1)\,\text {and}\,(1-{\mathscr {J}})\,{\mathscr {J}}-|{\mathscr {H}}| > 0\,\big \}. \end{aligned}$$
(647)

The geometry of the set \({\mathscr {R}}\) in the \(({\mathscr {H}},{\mathscr {J}})\)-plane is illustrated in Fig. 18 (light blue region). For comparison, the realizable domain \({\mathscr {R}}^{+}\) of positive distribution functions (no upper bound on f), which is a cone defined by \({\mathscr {J}}>0\) and \({\mathscr {J}}-|{\mathscr {H}}|>0\) (light red region), is also shown. The realizable set \({\mathscr {R}}\) is a bounded subset of \({\mathscr {R}}^{+}\). Importantly, the set \({\mathscr {R}}\) is convex. This means that for two arbitrary elements \({\mathbf {u}}_{a},{\mathbf {u}}_{b}\in {\mathscr {R}}\), the convex combination \({\mathbf {u}}_{c} = \vartheta \,{\mathbf {u}}_{a} + (1-\vartheta )\,{\mathbf {u}}_{b}\in {\mathscr {R}}\), where \(0\le \vartheta \le 1\). This property is used repeatedly (sometimes in a nested fashion) to design the numerical method.

Fig. 18
figure 18

Illustration of the realizable set \({\mathscr {R}}\) (light blue region) defined in Eq. (647). The black lines define the boundary \(\partial {\mathscr {R}}\). The red lines indicate the boundary of the realizable set of positive distributions \({\mathscr {R}}^{+}\) (light red region)

The DG method for the two-moment model is in many ways very similar to that discussed in Sect. 6.7.2. The computational domain D is divided into elements \(K=(x_{{\textsc {L}}},x_{{\textsc {H}}})\). One each element, the approximation space is

$$\begin{aligned} {\mathbb {V}}_{h}^{k}=\{\varphi _{h} : \varphi _{h}\big |_{K} \in {\mathbb {P}}^{k}(K), \, \, \forall \ K\in D \}, \end{aligned}$$
(648)

where \({\mathbb {P}}^{k}\) is the space of polynomials in x of maximal degree k. The approximation to the moments, \({\mathbf {u}}_{h}\), is then expressed as

$$\begin{aligned} {\mathbf {u}}_{h}(x,t)=\sum _{i=1}^{k+1}{\mathbf {u}}_{i}(t)\,P_{i}(x), \end{aligned}$$
(649)

where each \(P_{i}\in {\mathbb {V}}_{h}^{k}\) and each \({\mathbf {u}}_{i}\) is a two-component vector representing the unknowns per element in the DG method. Then, for any \(x \in D\) and any \(\varphi _{h} \in {\mathbb {V}}_{h}^{k}\), the semi-discrete DG method is as follows: Find \({\mathbf {u}}_{h} \in {\mathbb {V}}_{h}^{k}\) such that

$$\begin{aligned} \int _{K}\partial _{t}{{\mathbf {u}}_{h}}\,\varphi _{h}\,dx&+\big [\,\widehat{{\mathbf {f}}({\mathbf {u}}_{h})}(x_{{\textsc {H}}})\,\varphi _{h}(x_{{\textsc {H}}}^{-})-\widehat{{\mathbf {f}}({\mathbf {u}}_{h})}(x_{{\textsc {L}}})\,\varphi _{h}(x_{{\textsc {L}}}^{+})\,\big ] \nonumber \\&-\int _{K}{\mathbf {f}}({\mathbf {u}}_{h})\,\partial _{x}{\varphi _{h}}\,dx =\int _{K}{\mathbf {c}}({\mathbf {u}}_{h})\,\varphi _{h}\,dx \end{aligned}$$
(650)

holds for all \(\varphi _{h}\in {\mathbb {V}}_{h}^{k}\) and all \(K\in D\). In Eq. (650),

$$\begin{aligned} \widehat{{\mathbf {f}}({\mathbf {u}}_{h})}(x_{{\textsc {H}}/{\textsc {L}}}) ={\mathbf {h}}\big ({\mathbf {u}}_{h}(x_{{\textsc {H}}/{\textsc {L}}}^{-}),{\mathbf {u}}_{h}(x_{{\textsc {H}}/{\textsc {L}}}^{+})\big ) \end{aligned}$$
(651)

is a numerical flux, where \({\mathbf {h}}\) is a numerical flux function. In the DG method, any standard numerical flux designed for hyperbolic conservation laws can be used. However, Chu et al. (2019) used the global Lax-Friedrichs flux, where

$$\begin{aligned} {\mathbf {h}}\big ({\mathbf {u}}_{h}(x_{{\textsc {H}}/{\textsc {L}}}^{-}),{\mathbf {u}}_{h}(x_{{\textsc {H}}/{\textsc {L}}}^{+})\big ) =\frac{1}{2} \Big [ {\mathbf {f}}\big ({\mathbf {u}}_{h}(x_{{\textsc {H}}/{\textsc {L}}}^{-})\big )+{\mathbf {f}}\big ({\mathbf {u}}_{h}(x_{{\textsc {H}}/{\textsc {L}}}^{+})\big ) -\big ({\mathbf {u}}_{h}(x_{{\textsc {H}}/{\textsc {L}}}^{+})-{\mathbf {u}}_{h}(x_{{\textsc {H}}/{\textsc {L}}}^{-})\big ) \Big ]. \end{aligned}$$
(652)

It should be noted that when using the DG method for radiation transport, as long as the approximation space includes at least linear elements, it is not necessary to switch between centered and upwind-type fluxes [e.g., as is done in Eqs. (416)–(417) for finite-volume and finite-difference methods to capture both the streaming and diffusive regimes]. As such, the DG spatial discretization is naturally structure-preserving with respect to the diffusion limit, and well-suited for radiation transport (e.g., Larsen and Morel 1989; Adams 2001). In fact, the dissipation term in the numerical flux in Eq. (652), which is not present in the diffusive regime when employing switching between centered and upwind fluxes, plays an important role in the proof of the realizability-preserving property of the two-moment method presented here. It may therefore be difficult, if not impossible, to design realizability-preserving methods for the two-moment model without this term. Note that in the diffusion limit, \(|{\mathscr {H}}|\ll {\mathscr {J}}\), the moment vector \({\mathbf {u}}\) is close to the line connecting (0, 0) and (0, 1) in Fig. 18. Then, if the particle density is low (\({\mathscr {J}}\ll 1\)) the moment vector is safely inside \({\mathscr {R}}\). On the other hand, if the particle density is high (\({\mathscr {J}}\lesssim 1\)), which, e.g., is the case for electron neutrinos in the supernova core, the moment vector is dangerously close to the boundary of \({\mathscr {R}}\), and care is needed in order to maintain \({\mathbf {u}}\in {\mathscr {R}}\). Further away from the supernova core, where neutrinos transition to streaming conditions, \(|{\mathscr {H}}|\lesssim (1-{\mathscr {J}})\,{\mathscr {J}}\) (\(\approx {\mathscr {J}}\) when \({\mathscr {J}}\ll 1\)), the moment vector is again close to the boundary of \({\mathscr {R}}\), and care in the numerics is again warranted. Maintaining \({\mathbf {u}}\in {\mathscr {R}}\) is necessary to ensure the well-posedness of the moment closure procedure (Levermore 1996; Junk 1998; Hauck et al. 2008). Realizability-preserving methods maintain \({\mathbf {u}}\in {\mathscr {R}}\) and thus improve robustness.

The semi-discretization of the two-moment model in Eq. (650) results in a system of ODEs of the form

$$\begin{aligned} \frac{d {\mathbf {U}}}{d t} = {\mathbf {T}}({\mathbf {U}}) + {\mathbf {C}}({\mathbf {U}}), \end{aligned}$$
(653)

where \({\mathbf {U}}\) represents all the degrees of freedom evolved with the DG method,

$$\begin{aligned} {\mathbf {U}} =\Big \{\, \int _{K}\partial _{t}{{\mathbf {u}}_{h}}\,\varphi _{h}\,dx \,\Big \}_{K\in D, \varphi _{h}\in {\mathbb {V}}_{h}^{k}}, \end{aligned}$$
(654)

which includes the cell-average of \({\mathbf {u}}_{h}\) in each element:

$$\begin{aligned} {\mathbf {u}}_{K} = \frac{1}{\varDelta x}\int _{K}{\mathbf {u}}_{h}\,dx. \end{aligned}$$
(655)

In Eq. (653), the transport operator \({\mathbf {T}}({\mathbf {U}})\) corresponds to the second (surface) and third (volume) terms on the left-hand side of Eq. (650), while the collision operator \({\mathbf {C}}({\mathbf {U}})\) corresponds to the right-hand side of Eq. (650). To evolve Eq. (653) forward in time, Chu et al. (2019) developed IMEX schemes, where the transport operator is treated explicitly and the collision operator is treated implicitly. As discussed in Sect. 6.7.2, the extension of the bound-preserving property to high-order methods relies on the strong-stability-preserving (SSP) property of the ODE solver. Explicit SSP Runge–Kutta (RK) methods of moderate order (\(\le 3\)) are relatively easy to construct. Unfortunately, high-order (second- or higher-order temporal accuracy) SSP-IMEX methods with time step restrictions solely due to the explicit transport operator do not exist (see for example Proposition 6.2 in Gottlieb et al. (2001), which rules out the existence of implicit SSP-RK methods of order higher than one). Because of this, Chu et al. (2019) resorted to develop formally first-order accurate IMEX schemes with the following properties: (i) second-order accurate in the streaming limit, (ii) SSP (called convex-invariant in Chu et al. 2019), with a time step restriction solely due to the explicit part, and (iii) well-behaved in the diffusion limit in the sense that the flux density remains proportional to the gradient of the number density with the correct constant of proportionality. The optimal scheme, in the sense that it is SSP with the same timestep as the forward Euler scheme applied to the explicit part, is given by

$$\begin{aligned} {\mathbf {U}}^{(1)}&= \varLambda _{{\mathscr {R}}}\Big \{{\mathbf {U}}^{n} + \varDelta t\,{\mathbf {T}}({\mathbf {U}}^{n})\Big \}, \end{aligned}$$
(656)
$$\begin{aligned} \widetilde{{\mathbf {U}}}^{(2)}&={\mathbf {U}}^{(1)} + \varDelta t\,{\mathbf {C}}(\widetilde{{\mathbf {U}}}^{(2)}); \quad {\mathbf {U}}^{(2)}=\varLambda _{{\mathscr {R}}}\Big \{\widetilde{{\mathbf {U}}}^{(2)}\Big \}, \end{aligned}$$
(657)
$$\begin{aligned} {\mathbf {U}}^{(3)}&= \varLambda _{{\mathscr {R}}}\Big \{{\mathbf {U}}^{(2)} + \varDelta t\,{\mathbf {T}}({\mathbf {U}}^{(2)})\Big \}, \end{aligned}$$
(658)
$$\begin{aligned} \widetilde{{\mathbf {U}}}^{n+1}&= \frac{1}{2}\big (\,{\mathbf {U}}^{n} + {\mathbf {U}}^{(3)}\,\big ) + \frac{1}{2}\varDelta t\,{\mathbf {C}}(\widetilde{{\mathbf {U}}}^{n+1}); \quad {\mathbf {U}}^{n+1}=\varLambda _{{\mathscr {R}}}\Big \{\widetilde{{\mathbf {U}}}^{n+1}\Big \}. \end{aligned}$$
(659)

This IMEX scheme involves two explicit evaluations of the transport operator and two implicit solves to evaluate the collision operator. The explicit stages, Eqs. (656) and (658), are forward Euler steps, while the implicit stages, Eqs. (657) and (659), can be viewed as backward Euler steps. Without collisions (\({\mathbf {C}}=0\)), the scheme reduces to the optimal second-order accurate SSP-RK scheme of Shu and Osher (1988) (also referred to as Heun’s method). Although the scheme is formally only first-order accurate in time when collisions are frequent, quantities evolve on a diffusive time scale in this case, which is much longer than the time step restriction required for stability of the explicit part. Therefore, temporal discretization errors remain small. On the other hand, second-order accuracy in the streaming limit is essential in maintaining non-oscillatory radiation solutions with the DG method in the streaming regime. In Eqs. (656)–(659), \(\varLambda _{{\mathscr {R}}}\) is a realizability-enforcing limiter used to enforce point-wise realizability within each element. The limiter, which we describe in more detail below, assumes that the cell-average is realizable after each step. We begin by finding sufficient conditions for realizability-preservation of the cell-average in each step. For this purpose, since the remaining steps are equivalent, we consider only the explicit step in Eq. (656) and the implicit step in Eq. (657).

For an explicit forward Euler update, as in Eq. (656), the equation for the cell-averaged moments [obtained from Eq. (650) with \(\varphi _{h}=1\)] is given by

$$\begin{aligned} {\mathbf {u}}_{{\mathbf {K}}}^{(1)} = {\mathbf {u}}_{{\mathbf {K}}}^{n} - \frac{\varDelta t}{\varDelta x}\big [\,\widehat{{\mathbf {f}}({\mathbf {u}}_{h}^{n})}(x_{{\textsc {H}}})-\widehat{{\mathbf {f}}({\mathbf {u}}_{h}^{n})}(x_{{\textsc {L}}})\,\big ]. \end{aligned}$$
(660)

To construct a realizability-preserving explicit update for the two-moment model, one seeks to find sufficient conditions such that \({\mathbf {u}}_{{\mathbf {K}}}^{(1)}\in {\mathscr {R}}\). The strategy is very similar to that taken for the bound-preserving scheme discussed in Sect. 6.7.2. To evaluate the integral on the right-hand side of Eq. (660) [cf. Eq. (655)], an N-point Gauss-Lobatto quadrature rule is used on the interval K, with points

$$\begin{aligned} {\hat{S}} = \big \{\,x_{{\textsc {L}}}={\hat{x}}_{1},\ldots ,{\hat{x}}_{N}=x_{{\textsc {H}}}\,\big \}, \end{aligned}$$
(661)

and weights \({\hat{w}}_{q}\in (0,1]\), normalized such that \(\sum _{q=1}^{N}{\hat{w}}_{q}=1\). Using this quadrature and the numerical flux function in Eq. (651), one can write Eq. (660) as

$$\begin{aligned} {\mathbf {u}}_{{\mathbf {K}}}^{(1)}&= \sum _{q=1}^{N}{\hat{w}}_{q}\,{\mathbf {u}}_{h}^{n}({\hat{x}}_{q}) -\frac{\varDelta t}{\varDelta x}\big [\,{\mathbf {h}}\big ({\mathbf {u}}_{h}^{n}(x_{{\textsc {H}}}^{-}),{\mathbf {u}}_{h}^{n}(x_{{\textsc {H}}}^{+})\big )-{\mathbf {h}}\big ({\mathbf {u}}_{h}^{n}(x_{{\textsc {L}}}^{-}),{\mathbf {u}}_{h}^{n}(x_{{\textsc {L}}}^{+})\big )\,\big ] \nonumber \\&= \sum _{q=2}^{N-1}{\hat{w}}_{q}\,{\mathbf {u}}_{h}^{n}({\hat{x}}_{q}) + ({\hat{w}}_{1}+{\hat{w}}_{N})\,\varPhi \big ({\mathbf {u}}_{h}^{n}(x_{{\textsc {L}}}^{-}),{\mathbf {u}}_{h}^{n}(x_{{\textsc {L}}}^{+}),{\mathbf {u}}_{h}^{n}(x_{{\textsc {H}}}^{-}),{\mathbf {u}}_{h}^{n}(x_{{\textsc {H}}}^{+})\big ), \end{aligned}$$
(662)

which is a convex combination of \(\{{\mathbf {u}}_{h}^{n}({\hat{x}}_{q})\}_{q=2}^{N-1}\) and \(\varPhi \). (Note that \({\hat{w}}_{1}={\hat{w}}_{N}\), so that \(2\,{\hat{w}}_{1}=2\,{\hat{w}}_{N}={\hat{w}}_{1}+{\hat{w}}_{N}\).) Thus, if, for each element K, \({\mathbf {u}}_{h}^{n}({\hat{x}}_{q})\in {\mathscr {R}},\forall q=2,\ldots ,N-1\) and \(\varPhi \in {\mathscr {R}}\), since the set \({\mathscr {R}}\) is convex it follows that \({\mathbf {u}}_{{\mathbf {K}}}^{(1)}\in {\mathscr {R}}\). In Eq. (662),

$$\begin{aligned}&\varPhi \big ({\mathbf {u}}_{h}^{n}(x_{{\textsc {L}}}^{-}),{\mathbf {u}}_{h}^{n}(x_{{\textsc {L}}}^{+}),{\mathbf {u}}_{h}^{n}(x_{{\textsc {H}}}^{-}),{\mathbf {u}}_{h}^{n}(x_{{\textsc {H}}}^{+})\big ) \nonumber \\&\quad = \frac{1}{2}\big [\,{\mathbf {u}}_{h}^{n}(x_{{\textsc {L}}}^{+})+\lambda \,{\mathbf {h}}\big ({\mathbf {u}}_{h}^{n}(x_{{\textsc {L}}}^{-}),{\mathbf {u}}_{h}^{n}(x_{{\textsc {L}}}^{+})\big )\,\big ] + \frac{1}{2}\big [\,{\mathbf {u}}_{h}^{n}(x_{{\textsc {H}}}^{-})-\lambda \,{\mathbf {h}}\big ({\mathbf {u}}_{h}^{n}(x_{{\textsc {H}}}^{-}),{\mathbf {u}}_{h}^{n}(x_{{\textsc {H}}}^{+})\big )\,\big ] \nonumber \\&\quad =(1-\lambda )\,\varPhi _{0} + \frac{1}{2}\,\lambda \,\varPhi _{1} + \frac{1}{2}\,\lambda \,\varPhi _{2}, \end{aligned}$$
(663)

where \(\lambda =\varDelta t/(\varDelta x\,{\hat{w}}_{1})=\varDelta t/(\varDelta x\,{\hat{w}}_{N})\) and

$$\begin{aligned} \varPhi _{0}&=\frac{1}{2}\,\Big [\,{\mathbf {u}}_{h}^{n}(x_{{\textsc {L}}}^{+})+{\mathbf {u}}_{h}^{n}(x_{{\textsc {H}}}^{-})\,\Big ], \end{aligned}$$
(664)
$$\begin{aligned} \varPhi _{1}&=\frac{1}{2}\,\Big [\,{\mathbf {u}}_{h}^{n}(x_{{\textsc {L}}}^{-})+{\mathbf {f}}\big ({\mathbf {u}}_{h}^{n}(x_{{\textsc {L}}}^{-})\big )\,\Big ] + \frac{1}{2}\,\Big [\,{\mathbf {u}}_{h}^{n}(x_{{\textsc {H}}}^{-})-{\mathbf {f}}\big ({\mathbf {u}}_{h}^{n}(x_{{\textsc {H}}}^{-})\big )\,\Big ] , \end{aligned}$$
(665)
$$\begin{aligned} \varPhi _{2}&=\frac{1}{2}\,\Big [\,{\mathbf {u}}_{h}^{n}(x_{{\textsc {L}}}^{+})+{\mathbf {f}}\big ({\mathbf {u}}_{h}^{n}(x_{{\textsc {L}}}^{+})\big )\,\Big ] +\frac{1}{2}\,\Big [\,{\mathbf {u}}_{h}^{n}(x_{{\textsc {H}}}^{+})-{\mathbf {f}}\big ({\mathbf {u}}_{h}^{n}(x_{{\textsc {H}}}^{+})\big )\,\Big ]. \end{aligned}$$
(666)

In the last line in Eq. (663), if \(\lambda \le 1\), \(\varPhi \) is expressed as a convex combination of \(\varPhi _{0}\), \(\varPhi _{1}\), and \(\varPhi _{2}\). Thus, if \(\varPhi _{0},\varPhi _{1},\varPhi _{2}\in {\mathscr {R}}\), the time step restriction

$$\begin{aligned} \varDelta t\le {\hat{w}}_{N}\,\varDelta x\end{aligned}$$
(667)

is sufficient to guarantee \({\mathbf {u}}_{{\mathbf {K}}}^{(1)}\in {\mathscr {R}}\). The condition \(\varPhi _{0}\in {\mathscr {R}}\) follows from the assumption \({\mathbf {u}}_{h}^{n}(x_{{\textsc {L}}}^{+}),{\mathbf {u}}_{h}^{n}(x_{{\textsc {H}}}^{-})\in {\mathscr {R}}\), while the conditions \(\varPhi _{1},\varPhi _{2}\in {\mathscr {R}}\) follow from the additional assumptions \({\mathbf {u}}_{h}^{n}(x_{{\textsc {L}}}^{-}),{\mathbf {u}}_{h}^{n}(x_{{\textsc {H}}}^{+})\in {\mathscr {R}}\) and Lemma 2 in Chu et al. (2019), which proves \(\varPhi _{1},\varPhi _{2}\in {\mathscr {R}}\) provided these expressions can be generated from distributions \(f\in {\mathfrak {R}}\). We note that for Lemma 2 in Chu et al. (2019) to hold in the current setting, the moments must be consistent with a distribution function satisfying \(0\le f\le 1\), which demands a two-moment closure based on Fermi–Dirac statistics (the second component of \(\varPhi _{1}\) and \(\varPhi _{2}\) involves the Eddington factor). The maximum entropy closures of Cernohorsky and Bludman (1994) and Larecki and Banach (2011) and the Kershaw-type closure of Banach and Larecki (2017) are suitable. On the other hand, the Minerbo, M1, and Kershaw closures discussed in Sect. 4.7.4 are based on positive distribution functions (with no upper bound), and are therefore not suitable if \({\mathbf {u}}\in {\mathscr {R}}\) is desired. These closures are only compatible with the relaxed condition \({\mathbf {u}}\in {\mathscr {R}}^{+}\). (In this case the approach discussed here, with minor modifications, is still applicable; e.g., see Olbrant et al. (2012) for a method with explicit time stepping.)

For the implicit solve in Eq. (657), the cell-average with backward Euler gives

$$\begin{aligned} {\mathbf {u}}_{{\mathbf {K}}}^{(2)} = \big (\,I+\varDelta t\,R\,\big )^{-1}\big (\,{\mathbf {u}}_{{\mathbf {K}}}^{(1)}+\varDelta t\,\mathbf {\eta }\,\big ). \end{aligned}$$
(668)

Here it is assumed that the opacity is constant within each element. The first component of Eq. (668) is then

$$\begin{aligned} {\mathscr {J}}_{{\mathbf {K}}}^{(2)} = \frac{{\mathscr {J}}_{{\mathbf {K}}}^{(1)} + \varDelta t\,\sigma _{A}\,{\mathscr {J}}_{0,{\mathbf {K}}}}{1+\varDelta t\,\sigma _{A}}. \end{aligned}$$
(669)

Since \({\mathscr {J}}_{{\mathbf {K}}}^{(1)},{\mathscr {J}}_{0,{\mathbf {K}}}\in (0,1)\), it follows that \({\mathscr {J}}_{{\mathbf {K}}}^{(2)}\in (0,1)\). The second component of Eq. (668) is

$$\begin{aligned} {\mathscr {H}}_{{\mathbf {K}}}^{(2)} = \frac{{\mathscr {H}}_{{\mathbf {K}}}^{(1)}}{1+\varDelta t\,\sigma _{T}}. \end{aligned}$$
(670)

Then, Lemma 3 in Chu et al. (2019), which considers the moments in Eqs. (669) and (670), shows that \(|{\mathscr {H}}_{{\mathbf {K}}}^{(2)}|<(1-{\mathscr {J}}_{{\mathbf {K}}}^{(2)})\,{\mathscr {J}}_{{\mathbf {K}}}^{(2)}\), so that \({\mathbf {u}}_{{\mathbf {K}}}^{(2)}\in {\mathscr {R}}\). Note that this assumes a very simple form of the collision operator (i.e., emission, absorption, and isotropic and isoenergetic scattering). For more complicated collision operators with anisotropic kernels, energy coupling interactions, and Pauli blocking factors it can become very difficult to prove that realizability of the cell-average is preserved in the implicit solve, and this must be investigated separately for each neutrino–matter interaction type. Moreover, the ability to prove results rigorously may then depend on the implicit solver used.

The update in Eq. (660) requires that for each element the polynomial approximation is realizable in each point in the quadrature set \({\hat{S}}\) in Eq. (661). Thus, after each stage in the time stepping algorithm in Eqs. (656)–(659), a limiter is applied in preparation for the next. Let the unlimited solution after any of the stages be \(\widetilde{{\mathbf {u}}}_{h}=\big (\widetilde{{\mathscr {J}}}_{h},\widetilde{{\mathscr {H}}}_{h}\big )^{T}\). Following Zhang and Shu (2010a), a limiter from Liu and Osher (1996) is first used to enforce the bounds on the zeroth moment \(\widetilde{{\mathscr {J}}}_{h}\). We replace the polynomial \(\widetilde{{\mathscr {J}}}_{h}(x)\), the first component of \(\widetilde{{\mathbf {u}}}_{h}\), with the limited polynomial

$$\begin{aligned} \widehat{{\mathscr {J}}}_{h}(x) =\vartheta _{1}\,\widetilde{{\mathscr {J}}}_{h}(x)+(1-\vartheta _{1})\,{\mathscr {J}}_{{\mathbf {K}}}, \end{aligned}$$
(671)

where the limiter parameter \(\vartheta _{1}\) is given by

$$\begin{aligned} \vartheta _{1} =\min \Big \{\,\Big |\frac{M-{\mathscr {J}}_{{\mathbf {K}}}}{M_{{\hat{S}}}-{\mathscr {J}}_{{\mathbf {K}}}}\Big |,\Big |\frac{m-{\mathscr {J}}_{{\mathbf {K}}}}{m_{{\hat{S}}}-{\mathscr {J}}_{{\mathbf {K}}}}\Big |,1\,\Big \}, \end{aligned}$$
(672)

with \(m=\delta \) and \(M=1-\delta \), where \(\delta \) is some small number (e.g., \(10^{-16}\)), and

$$\begin{aligned} M_{{\hat{S}}}=\max _{x\in {\hat{S}}}{\mathscr {J}}_{h}(x) \quad \text {and}\quad m_{{\hat{S}}}=\min _{x\in {\hat{S}}}{\mathscr {J}}_{h}(x). \end{aligned}$$
(673)

This step, which ensures \(\widehat{{\mathscr {J}}}_{h}\in (0,1)\), corresponds to the bound-enforcing limiter described in Sect. 6.7.2. After this step, we denote \(\widehat{{\mathbf {u}}}_{h}=\big (\widehat{{\mathscr {J}}}_{h},\widetilde{{\mathscr {H}}}_{h}\big )^{T}\).

The next step is to enforce \(\gamma (\widehat{{\mathbf {u}}}_{h})\equiv (1-\widehat{{\mathscr {J}}}_{h})\,\widehat{{\mathscr {J}}}_{h}-|\widetilde{{\mathscr {H}}}_{h}|>0\) for all \(x\in {\hat{S}}\), which follows a procedure similar to that developed by Zhang and Shu (2010b) to ensure positivity of the pressure when solving the Euler equations of gas dynamics. If \(\widehat{{\mathbf {u}}}_{h}\) is outside \({\mathscr {R}}\) for any quadrature point \(x\in {\hat{S}}\), i.e., \(\gamma (\widehat{{\mathbf {u}}}_{h})<0\), since \({\mathbf {u}}_{{\mathbf {K}}}\in {\mathscr {R}}\), there exists an intersection point of the straight line \({\mathbf {s}}_{q}(\psi )\), connecting \({\mathbf {u}}_{{\mathbf {K}}}\) and \(\widehat{{\mathbf {u}}}_{h}\) evaluated in the troubled quadrature point \(x_{q}\) (denoted \(\widehat{{\mathbf {u}}}_{q}\)), and the boundary of \({\mathscr {R}}\). This line is parameterized by

$$\begin{aligned} {\mathbf {s}}_{q}(\psi )=\psi \,\widehat{{\mathbf {u}}}_{q}+(1-\psi )\,{\mathbf {u}}_{{\mathbf {K}}}, \end{aligned}$$
(674)

where \(\psi \in [0,1]\). The intersection point \(\psi _{q}\) is obtained by solving \(\gamma ({\mathbf {s}}_{q}(\psi ))=0\) for \(\psi \). (In practice, \(\psi \) needs not be accurate to many significant digits, and a bisection algorithm terminated after a few iterations is sufficient.) This completes the description of major steps in the scheme presented in Chu et al. (2019).

6.8 Hybrid methods

From the preceding sections, it is clear that the landscape of approaches to neutrino transport, and the associated numerical methods, is growing rapidly. One- and two-moment models have reached a level of maturity where general relativistic core-collapse supernova modeling is feasible (e.g., Kuroda et al. 2016; Rahman et al. 2019). Multidimensional models with Boltzmann neutrino transport—e.g., using discrete ordinate or Monte Carlo methods—are also under development and results in axial symmetry have already been published (Nagakura et al. 2017), but more work is needed to reach the same level of maturity as found in moments-based models. One primary reason is, of course, the computational cost associated with transport models that provide better resolution of the angular dimensions of momentum space, such as Boltzmann models. In particular, the computational cost of the neutrino–matter coupling problem increases dramatically with increased fidelity in this sector. However, the multiscale nature of the neutrino transport problem implies that Boltzmann neutrino transport is probably not necessary everywhere in a simulation. On the one hand, the radiation field is well captured by the low-order moment models in the collision dominated region below the neutrinospheres. On the other hand, higher-fidelity models may be warranted in the gain region since heating rates are sensitive to the angular shape of the neutrino distributions. (There is already some evidence that two-moment closures are unable to capture certain details in the radiation field; e.g., Harada et al. 2019.) This motivates the use of hybrid methods, which, for example, aim to combine low- and high-fidelity approaches in order to provide sufficient resolution where needed, but at a reduced computational cost. Hybrid approaches are used in many areas of computational physics, but are not widely adopted to model neutrino transport in core-collapse supernovae. We note that the variable Eddington factor (VEF) method of Rampp and Janka (2002), which has been shown to compare well with Boltzmann neutrino transport in spherical symmetry (Liebendörfer et al. 2005), can be regarded as a hybrid method, where a simplified (and less computationally expensive) Boltzmann solver is used in the context of a two-moment model to provide the moment closure. Adopting hybrid methods to model neutrino transport in multidimensional models is a potentially rewarding direction for near-future research, and some approaches may even be able to leverage investments in capabilities that have already been developed. Since these methods have not fully found their way into the core-collapse supernova modeling community, we will not go into details, but rather briefly mention some existing work, which in most cases will require further development to account for relativity and domain-specific microphysics details. We hope to report more on this interesting field in the future.

So-called high-order–low-order (HOLO) approaches (see, e.g., review by Chacon et al. 2017) are one type of hybrid method gaining popularity for use in radiation transport (and related) applications, and combine, as the name suggests, high-fidelity solvers for the (Boltzmann) transport equation with lower-fidelity solvers (typically based on one- or two-moment models, and commonly in a gray formulation) to accelerate the process of solving the high-fidelity model—in particular, the nonlinear coupling between radiation and a material background. In these applications, the radiation field is governed by a kinetic model, while the material is governed by a fluid-like model (as in the core-collapse supernova problem). The basic idea is that, in the collisional regime, the interaction between the kinetic and fluid components occurs in a low-dimensional subspace where only a few moments of the particle distribution function are needed to accurately capture the coupling. Thus, HOLO methods are effective primarily in regions where the particle mean free path is small and the problem is stiff, and one challenge is to ensure consistency between the two model components. Recent work on HOLO methods applied to the problem of thermal radiative transfer include applications where the high-order model is solved with continuum methods such as discrete ordinates (e.g., Park et al. 2012, 2013; Lou et al. 2019) or Monte Carlo methods (e.g., Park et al. 2014; Bolding et al. 2017). We also point out related work on solving the linear transport equation (i.e., without nonlinear coupling to the material) with HOLO (or hybrid) methods by Hauck and McClarren (2013), Willert et al. (2013, 2015) and Crockatt et al. (2017, 2019, 2020).

7 Solution methods

When ultimately expressed in computer code, all of the previously discussed deterministic methods require the use of implicit numerical methods. When discretized, the transport equations produce a set of nonlinear algebraic equations. When linearized, these equations in turn lead to linear systems of equations that relate the values of the change in the distribution functions (or moments of the distribution functions) to the neutrino–matter and neutrino–neutrino interactions encoded in the terms on the right-hand side of the equations: the collision term. These source terms depend on the changes in the neutrino radiation field, as well, giving rise to the need for implicit methods.

The solution of these linear systems is associated with the dominant computational cost for any deterministic method for neutrino transport. The remainder of the panoply of physics that complete a core-collapse supernova model—hydrodynamics, nuclear kinetics, and even the global solution of the gravitational field—are typically associated with much less computational intensity and often require significantly less memory capacity and bandwidth. Because the solution techniques for the transport linear system solve depend on almost every important dimension of modern computer platforms—floating point performance, memory bandwidth, communication bandwidth and latency—the particulars of individual platforms become an important consideration when a practitioner looks to instantiate a real implementation in the form of a production code. Therefore, the structural components of modern computers and the quantitative requirements for realistic modeling of transport are inextricably linked together when one looks to build a neutrino radiation hydrodynamics code.

7.1 Simulation requirements

Regardless of the particulars of the architecture enlisted to solve the requisite equations, the computational demands of neutrino radiation hydrodynamics are prodigious. Some of these demands are imposed directly by the high dimensionality of the transport equation itself. The need to discretize the neutrino phase space with adequate resolution to capture the particulars of the neutrino–matter interactions (cf. Sect. 5) results in energy resolutions that are typically on the order of dozens of groups. This requirement is amplified by the need to spatially resolve matter features in the flow that are of roughly the size of the neutrino mean free path at various points in the computational domain. Adaptive mesh refinement (AMR) can help ameliorate the need to refine the grid everywhere to resolve the shortest mean free paths, but this reduction is typically only partially effective. Indeed, the time-dependent nature of the core-collapse supernova problem often leads to much of the grid having to be refined as the reheating and explosion epochs evolve. These resolution requirements directly impact the size of the linear systems that must be solved via deterministic methods, typically resulting in quadratic growth in the size of the system for increases in any given phase space dimension.

Therefore, the product of required energy resolution, spatial resolution, number of neutrino flavors and their distribution functions or their angular moments directly translates into a need for scalable implementations of the solution algorithms. Any implementation needs to be able to effectively take advantage of any future platform. This type of scalability is typically termed weak scaling. The figure of merit for weak scaling is how close to a constant runtime can be achieved as the computational load is increased commensurately with the amount of resources. For example, as problem size is increased along with the number of MPI ranks used in a simulation, good weak scalability is achieved if the runtime remains constant. Weak scalability is often highly dependent on effective distributed-memory parallelism, including possibly overlapping slow inter-node communication with on-node computation (Fig. 19).

Fig. 19
figure 19

A schematic of the structure of a typical neutrino transport linear system that must be solved at each time step. The diagonal, dense blocks are generally non-symmetric and have characteristic substructure arising from the coupling in angle, energy, isospin (i.e. between neutrinos and antineutrinos), and neutrino flavor, though the particulars of that structure are dependent on the lexical ordering of the solution vector. Fully implicit methods also couple individual spatial zones to one another, producing a linear system that contains a series of outlying bands in addition to the diagonally dominant dense block structure. This global linear system typically requires considerable communication on parallel platforms, where domain decomposition is often used to spread the spatial extent of the problem across the distributed memory space. IMEX methods do not require solution of this global system, but the inversion of a similarly structured set of dense blocks is required at each spatial index. However, this reduction of the implicit problem to a purely local operation can result in considerable performance advantages

However, this is a necessary, but not sufficient, condition for effective investigation. The resultant simulations must also be capable of execution in reasonable amounts of wall-clock time. Runtimes of several months are untenable if one wishes to explore a more-or-less complete set of supernova progenitors. Therefore, reducing the wall-clock time for transport computations is equally important. This so-called strong scalability is achievable if node-level execution is made faster. On modern platforms, this has very much become a question of the effective use of hybrid-node architectures.

7.2 Implementation on heterogeneous architectures

Currently, the most widely available and performant microarchitectures are based on graphical processing units (GPUs). As suggested by their name, GPUs were originally designed to handle computer graphics-intensive tasks in applications ranging from scientific visualization to video games. However, the very high intensity with which they compute and their relatively low power-consumption traits (as compared to modern CPUs) led to their adoption as engines for a variety of scientific computing tasks. Indeed, at this writing, GPU-based architectures dominate much of the highest-end HPC platforms, and all planned near-future exascale platforms will employ GPUs as the primary source of compute power.

The primary characteristic that provides the compute power of modern GPUs is the large number of compute cores, as compared to traditional CPUs. Modern GPUs (e.g. the NVIDIA V100) contain more than 5000 cores, compared to the few dozen that are present on contemporary CPUs. Each core may have a relatively low clock speed compared to a CPU, but the sheer number of processors available on a GPU leads to a much higher intensity of computation.

The architecture of the GPUs is wholly shaped by the single-instruction, multiple-data (SIMD) execution model. In this execution model, each execution unit takes as input two vectors, performs identical operations on both sets of operands (one operand from each vector), and produces a resultant vector. Modern CPUs also typically contain SIMD units: MMX, SSE, and AVX instructions are available on Intel architectures, and POWER and ARM architectures have similar extensions to execution sets to take advantage of similar units. In the case of GPUs, however, these instructions are essentially the only ones available, restricting the amount of branching and conditional execution that can be effectively carried out by the device.

All modern GPU architectures make use of a similar set of hardware components and associated software abstractions. Here, we will primarily make use of the nomenclature used by NVIDIA to describe their GPU devices, but other vendors make use of virtually identical concepts and constructions, albeit with slightly different naming. In all cases, kernels are launched on the device as a set of threads. Each of these threads executes a single SIMD pipeline. Within the kernel launch of threads, threads are grouped into a number of blocks. These thread blocks are mapped to individual streaming multiprocessors (SMs). Each SM executes threads in groups of parallel threads termed a warp (the number of threads in a warp, or wave, is typically some multiple of 32). Inside each warp, a single, common instruction is executed during a clock cycle. This lockstep execution can be broken by conditionals (e.g. if-then-else instructions). When this occurs, the effect of this thread divergence within a warp breaks the parallelization of the warp. The execution on the conditional thread continues in a serial fashion, and all the other threads are stalled.

This execution model is further complicated by the hierarchical memory on GPUs. Global memory is accessible by all cores. This global memory is typically several GBs on each device. The bandwidth of this memory is often termed high-bandwidth, as it typically has bandwidths several times that for DRAM that might be attached to the CPU host. Closer to each multiprocessor there is a shared memory that offers a space accessible to all cores inside the multiprocessor. It is typically used as a user-managed cache of the global memory. The bandwidth to this cache is typically much faster than fetching addresses from the global memory for each core. Ultimately, each core has a certain number of registers that provide the greatest memory bandwidth, but, concomitantly, have the smallest capacities.

Programming GPUs relies on providing as many operands as possible at the maximum possible rate to all of the SMs on a device. The complexity of the memory hierarchy, the execution model, and the possibility of thread divergences can make this a formidable programming task.

Several programming models have been introduced to program GPUs. These include:

  1. 1.

    CUDA: a minor extension of C/C++ for GPU thread programming. CUDA is a proprietary programming model created and supported by NVIDIA.

  2. 2.

    ROCm: an extension of C/C++, much like CUDA in purpose and syntax. ROCm was created by AMD and is Open Source.

  3. 3.

    OpenCL: a multi-vendor standard. OpenCL is designed to work on a wide variety of platforms, not just GPUs. This makes the model very powerful, but also introduces a measure of irreducible complexity to accommodate this power.

  4. 4.

    OpenACC: A directive-based approach to GPU programming, OpenACC uses code decoration much like OpenMP or other directives-based models. OpenACC provides a straightforward path for GPU programming in Fortran.

  5. 5.

    OpenMP (with offload): Modern OpenMP standards include a set of extensions to provide facilities for thread-level programming on GPU devices.

The choice for any programmer between these options depends on the code to be produced and the relative agility of the development team. For neutrino radiation transport, the Oak Ridge group, for example, has chosen to work primarily in Fortran, with OpenMP directives to marshal the GPUs. This approach allows them to extend legacy code (in Fortran) in a straightforward and performant manner. Using OpenMP provides them with a measure of platform independence, as it is the only programming model currently envisaged to be supported on all major GPU hardware (i.e. NVIDIA, AMD, and Intel devices). The partial loss of thread-level control ceded by not using a more low-level model like CUDA or ROCm is not so important for radiation transport, as the vectorized computational kernels produced in evaluation of both the left and right-hand sides of the transport equation provide plenty of floating-point operations to saturate any modern GPU streaming multiprocessor. Therefore, decorating the multi-level loop nests that contain these vectorized operations at their deepest levels with directives is an effective model. In addition, this programming model can be effectively and easily extended with GPU-enabled scientific libraries (e.g., the GPU-accelerated version of BLAS), regardless of the model used by those libraries internally.

Many computational radiation transport practitioners have moved to Monte Carlo (MC) approaches in recent years, driven to this choice by the relative abundance of compute power available on GPUs. However, these approaches are not without complexities on GPUs, as the widely disparate sizes of the memory spaces described above (i.e., GBs to kBs to bytes as one moves from global memory to shared memory to registers) mean that MC histories are not so simply preserved. These complications mean that the relative expense of Monte Carlo methods (cf. Sect. 6.4) cannot be fully ameliorated by porting to GPUs. Because the dense linear algebra underpinning their implementations do make effective use of GPU compute architectures, IMEX and discrete ordinates approaches have the potential to compete with MC approaches with reduced memory footprint. But, this strong reliance on a single class of numeric operations means that the success of these approaches is almost wholly dependent on the performance of linear algebra subprograms on GPUs. This is especially true for so-called batched execution of the solution of linear systems of equations, wherein several matrices and right-hand sides are computed by a single kernel invocation and the solver effectively divides the work among SMs.

8 Summary and outlook

The last decade has seen considerable, and accelerated, progress made on multiple fronts: (1) Ascertaining the explosion mechanism of core-collapse supernovae. (2) The development of the theory of general relativistic neutrino radiation hydrodynamics. (3) The development of robust numerical methods for the solution of the neutrino radiation hydrodynamics equations in core-collapse supernova environments. (4) And the application of these methods in increasingly sophisticated three-dimensional core-collapse supernova models. At this point, it is fair to say that we are theory and methods rich and that the frontier lies more in the application of these methods in three-dimensional core-collapse supernova models, although further method development is certainly needed. Three-dimensional, fully general relativistic models with all of the relevant neutrino physics in multi-frequency one- or two-moment approaches are on the horizon, the leading examples of which are documented in the work of Kuroda et al. (2016), Roberts et al. (2016) and Rahman et al. (2019). But counterpart models in three dimensions using Boltzmann neutrino transport are farther off, though here too there is a leading example in the work of Nagakura et al. (2017). Adding a new dimension to the discussion, three-dimensional Boltzmann-based models are limited right now more by supercomputing capabilities than anything else. We have documented both moments and Boltzmann approaches here that have been developed and used by multiple research groups. Boltzmann approaches have been used in core-collapse supernova models with reduced spatial dimensionality and have served to gauge moments approaches in multidimensional models for some time. Recent developments emphasize even more the need for Boltzmann-based models. The history of core-collapse supernova theory has seen quantum leaps on a number of occasions over the past more than fifty years, often associated with an increased glimpse of the rich physics that drive such supernovae. In the past 5 years, evidence has mounted that neutrino quantum effects—specifically, due to neutrino–neutrino coupling in the proto-neutron star surface region—may impact the electron-flavor neutrino luminosities and spectra responsible for neutrino shock reheating and, consequently, may play a role in the supernova mechanism itself. These early conclusions will require the same extensive development to supplant them as has been documented here for the classical neutrino transport problem. We are far from the equivalent three-dimensional, general relativistic, full-physics models that deploy neutrino quantum kinetics. Early serious work on the implementation of neutrino quantum kinetics in supernova-like environments (e.g., see Richers et al. 2019) has illuminated yet new numerical challenges that will in turn require augmented methods, to handle both the classical and the quantum mechanical evolution of the three-flavor neutrino radiation field. In this context, then, it is very clear that a Boltzmann kinetic approach, which is a component of a complete quantum kinetics approach, must be a major step toward instantiating full neutrino quantum kinetics. We look forward to watching progress on this front and reporting on these developments as well, as they mature. The core-collapse supernova problem continues to manifest itself as a generational problem, one that will continue to serve as a fertile testbed for the development of transport and radiation hydrodynamics methods.