Monte Carlo Radiative Transfer

The theory and numerical modelling of radiation processes and radiative transfer play a key role in astrophysics: they provide the link between the physical properties of an object and the radiation it emits. In the modern era of increasingly high-quality observational data and sophisticated physical theories, development and exploitation of a variety of approaches to the modelling of radiative transfer is needed. In this article, we focus on one remarkably versatile approach: Monte Carlo Radiative Transfer (MCRT). We describe the principles behind this approach, and highlight the relative ease with which they can (and have) been implemented for application to a range of astrophysical problems. All MCRT methods have in common a need to consider the adverse consequences of Monte Carlo noise in simulation results. We overview a range of methods used to suppress this noise and comment on their relative merits for a variety of applications. We conclude with a brief review of specific applications for which MCRT methods are currently popular and comment on the prospects for future developments.


The role of radiative transfer in astrophysics
Much of astrophysics is at a disadvantage compared to other fields of physics. While normally theories can be tested and phenomena studied by performing repeatable experiments in the controlled environment of a lab, astrophysics generally lacks this luxury. Instead, researchers have to mainly rely on observations of very distant objects and phenomena over which they have no control. The vast majority of information about astrophysical systems is gathered by observing their emitted radiation over the electro-magnetic spectrum. Other messengers, such as neutrinos, charged particles and recently gravitational waves, are also used but typically restricted to specific astrophysical phenomena. Given that the observation and interpretation of electro-magnetic radiation is therefore the cornerstone of astrophysical research, a firm understanding of how the observed signal forms and propagates is crucial. The framework of radiative transfer (RT) builds the theoretical foundation for this problem. It combines concepts from kinetic theory, atomic physics, special relativity and quantum mechanics, and provides a formalism to describe how the radiation field is shaped by the interactions with the ambient medium.
Finding an analytic solution for RT problems is usually very challenging, a process that typically requires approximations and quickly reaches its limits as the complexity of the problem increases. Thus, numerical methods are normally employed instead. In such cases, one considers a discretized version of the transfer equation, e.g. by replacing differentials with finite differences, and uses sophisticated solution schemes to minimize the inevitably introduced numerical errors. While being an established approach, this often leads to very complex numerical schemes and faces some particular challenges when scattering interactions have to be included or when problems without internal symmetries require a fully multidimensional treatment.
MC methods offer a completely different approach to RT problems. Instead of discretizing the RT equations, the underlying RT process is "simulated" by introducing a large number of "test particles" (later referred to as "packets" in this article). These test particles behave in a manner similar to their physical counterparts, namely real photons. In particular, particles move, can scatter or be absorbed during a MC calculation. In the simulations, decisions about the propagation behaviour of a particular test particle, e.g. when, where and how it interacts, are taken stochastically. Seemingly, this leads to a random propagation behaviour of the individual particles. However, as an ensemble, the particle population can provide an accurate representation of the transfer process and the evolution of the radiation field, provided that the sample size is chosen sufficiently large.
Given its design, the MC approach to RT offers a number of compelling benefits. Due to its inspiration from the microphysical interpretation of the RT process, devising a MC RT scheme is very intuitive and conceptually simple. This often leads to comparably simple computer programs and involves moderate coding efforts: basic MCRT routines to solve simple RT problems can be coded in only a few lines that combine a random number generator with a number of basic loops (we provide a number of simple examples of how this can be done as part of our discussions later in this article). From a physical standpoint, a significant advantage of MC methods is the ease with which scattering processes are incorporated, a task which proves much more challenging for traditional, deterministic solution approaches to RT problems. In addition, MCRT calculations can be generalised with little effort from problems with internal symmetries to problems with arbitrarily complex geometrical characteristics. This feature makes MCRT techniques often the preferred choice for multidimensional RT calculations. Finally, the MCRT treatment is often referred to as "embarrassingly parallel" to describe its ideal suitability for modern high performance computing in which the workload is distributed on a multitude of processing units. Just as the photons they represent, the individual MC particles are completely decoupled and propagate independently of each other. Thus, each processing unit can simply treat a subset of the entire particle population without the need for much communication. 1 Of course, the MC approach is not without its downsides. The most severe disadvantage is a direct consequence of the probabilistic nature of MC techniques: inevitably, any physical quantity extracted from MC calculations will be subject to stochastic fluctuations. This MC noise can be decreased by increasing the number of particles, which naturally requires more computational resources. Consequently, MC calculations are often computationally expensive. These costs further increase if the optical thickness of the simulated environment is high. Since the propagation of each particle has to be followed explicitly, the efficiency of conventional MCRT schemes suffers greatly if the number of interactions the particles experience increases. Consequently, MCRT approaches are typically ill-suited for RT problems in the diffusion regime. Furthermore, as pointed out by Camps and Baes (2018), care has to be taken when interpreting results of MCRT simulations applied to environments with intermediate to high optical depth. The need to explicitly follow the propagation of the individual MC particles is the cause for yet another drawback of MCRT approaches. In deterministic solution strategies to RT, implicit time-stepping is often used to improve numerical stability in situations with short characteristic time scales. By design, conventional MCRT schemes require following the propagation of the individual particles in a time-explicit fashion. It is thus very challenging to devise truly implicit MCRT approaches to overcome numerical stability problems. In the course of this review, we will highlight a variety of different techniques which have been devised to address and alleviate each of these drawbacks.

Scope of this review
MC techniques have become a popular and widely-used approach to address RT problems in many disciplines of physical and engineering research. Covering all the different aspects and applications of MCRT is beyond the scope of this article and we refer readers to existing surveys of the respective fields. Among these, we highlight the recent overview of MCRT in atmospheric physics by Mayer (2009), the seminal report by Carter and Cashwell (1975) and the book by Dupree and Fraley (2002), which both discuss MCRT techniques to solving neutron transport problems, and to the article by Rogers (2006), who reviews MCRT methods in the field of medical physics. In this article, we aim to provide an introduction to MC techniques used in astrophysics to mainly address photon transport problems. While attempting to provide a general and comprehensive overview, we take the liberty to put some emphasis on specific techniques used in our own field of research, namely RT in fast outflows, i.e. supernova (SN) ejecta, accretion disc and stellar winds. We feel that this approach is appropriate given that dedicated overviews of MCRT methods for specific fields of astrophysical research already exist. In particular, we refer the reader to the reviews by Whitney (2011) and Steinacker et al. (2013) on MCRT for astrophysical dust RT problems.

Structure of this review
We have structured this review as follows: in Sect. 2 we briefly review some fundamentals of radiative transfer theory that are relevant for our presentation. We begin the actual discussion of MCRT methods with a brief look at their history and review of their astrophysical applications in Sect. 3, and by introducing the basic concepts of a random number generator and random sampling in Sect. 4. In the following part, Sect. 5, the basic discretization into MC quanta or packets will be introduced before their propagation procedure is explained in Sect. 6. In Sect. 7, we discuss how emissivity by thermal and/or fluorescent processes can be incorporated in MCRT simulations.
Having introduced the basic MCRT principles, the complications arising in moving media, in particular the need to distinguish reference frames, are discussed in Sect. 8. In Sect. 9 we detail various techniques to reconstruct important radiation field quantities from the ensemble of MC packet trajectories and interaction histories. Here, particular emphasis is put on methods that reduce the inherent stochastic fluctuations in the reconstructed quantities, such as biasing and volume-based estimators. In Sect. 10 advanced MC techniques, such as Implicit Monte Carlo (IMC) and Discrete Diffusion Monte Carlo (DDMC), are described which can be used to improve the numerical stability of MCRT calculations and their efficiency in optically thick environments. We conclude this review by touching upon the challenge of coupling MCRT to hydrodynamical calculations in Sect. 11 and by presenting a hands-on example of applying MCRT to SN ejecta in Sect. 12.

Radiative transfer background
Before turning to the main focus of this review, a brief overview of the fundamentals of RT is in order to introduce the necessary nomenclature and to define the basic physical concepts underlying MCRT calculations. We assume the reader is already familiar with the principles of RT and so will not present a complete derivation. More rigorous presentation of these principles are available in the literature, for example in the books by Chandrasekhar (1960), Mihalas (1978), Rybicki and Lightman (1979), Mihalas and Mihalas (1984) and Hubeny and Mihalas (2014).
From a macroscopic perspective, RT calculations rest on the transfer equation 2 1 c ∂ ∂t + ∇ · n I (x, t; n, ν) = η(x, t; n, ν) − χ(x, t; n, ν)I (x, t; n, ν), (1) which encodes how the radiation field, expressed in terms of the specific intensity I , varies with time, t and in space, x. The specific intensity is defined in terms of the monochromatic energy dE in the frequency range [ν, ν + dν] streaming through a surface element dA during the time dt into the solid angle dΩ about the direction n: dE(x, t; n, ν) = I (x, t; n, ν) dν dt dΩ dA · n. (2) The transfer equation can be interpreted as capturing the changes in the radiation field induced by an imbalance of in-and outflows (left hand side) and by interactions with the ambient material (source and sink terms on the right hand side). This coupling to the surrounding material is described by two material functions. The emissivity η encodes how much energy is added to the radiation field due to emission processes. The second term on the right hand side of Eq. (1), which involves the opacity χ , captures the opposite effect, namely how much radiation energy is removed by absorptions. Emissivity and opacity are often combined into the source function The opaqueness of a medium along a given ray is usually quantified in terms of the optical depth τ (l) = l 0 dsχ(x(s), t; n, ν), which essentially measures the mean number of interactions a photon would undergo along a trajectory s from x(s = 0) to x(s = l). Scatterings can be incorporated into this description by formally splitting the scattering process into an absorption which is immediately followed by an emission. It should be noted, however, that the RT problem is often significantly complicated by the presence of scattering interactions since these processes redistribute radiation in both frequency and direction and introduce a non-local coupling to the ambient material.
It is often insightful to describe the radiation field not only in terms of the specific intensity but also consider its moments. These involve a solid angle average over the specific intensity and different powers of the propagation direction. From the zeroth to the second moment, these quantities have a clear physical interpretation. In particular, the zeroth moment is identical to the mean intensity J (x, t; n, ν) = 1 4π dΩ I (x, t; n, ν), which in turn is closely related to the energy density of the radiation field E(x, t; n, ν) = 4π c J (x, t; n, ν).
Analogously, the second moment becomes a tensor K i j (x, t; n, ν) = 1 4π dΩ I (x, t; n, ν)n i n j (9) and relates to the radiation pressure with each entry describing the flux of the radiation field momentum component i into the direction j. We conclude this brief overview of basic RT concepts by introducing two important reference frames. As the name suggests, the laboratory frame (LF) is defined such that the laboratory is at rest. Consequently, it lends itself naturally for convenient measurements of space and time. However, from the perspective of interaction physics, a more natural frame is one in which the interaction partner, i.e. the ambient material, is at rest. This frame is typically referred to as the comoving frame (CMF). In general, it cannot be defined globally whenever gradients in the fluid velocity are encountered. However, a local definition of the CMF, which is advected by the fluid flow, can be performed. 3 Throughout this work, we adopt the nomenclature that quantities defined or measured in the CMF are designated with a subscribed zero. When changing between these reference frames, certain transformation rules have to be obeyed. Most importantly, these transformations lead to the Doppler effect ν 0 = γ ν(1 − β·n) (11) and induce aberration where β = v/c is the ratio of the local velocity to the speed of light and γ = (1 − β 2 ) −1/2 (with β = |β|). Transformation rules for the other quantities introduced in this section, such as the specific intensity, the opacity and emissivity η 0 (x 0 , t 0 ; n 0 , ν 0 ) = η(x, t; n, ν) have been first derived by Thomas (1930) and are also discussed by Mihalas and Mihalas (1984), for example.

Historical sketch of the Monte Carlo method
When Nicholas Metropolis suggested a name for the statistical method just invented to study neutron transport through fissionable material (Metropolis 1987), he clearly drew inspiration from the game of chance which is always played at the heart of MC calculations. From a historical perspective, Georges-Louis Leclerc, Comte de Buffon, is commonly credited as having devised the first MC experiment (cf. House and Avery 1968;Dupree and Fraley 2002;Kalos and Whitlock 2008). He considered a plane with a superimposed grid of parallel lines and was interested in the probability that a needle which is tossed onto the plane intersects one of the lines (Buffon 1777). It was later suggested, by Laplace, that such a scenario may be used to experimentally determine the value of π (Laplace 1812). In 1873, the astronomer Asaph Hall reports in a short note to the Messenger of Mathematics the successful execution of this experiment, carried out in 1864 by his friend Captain O. C. Fox (Hall 1873). A detailed description of what is known today as "Buffon's needle problem" is for example provided by Dupree and Fraley (2002) or Kalos and Whitlock (2008). Notwithstanding these early rudimentary applications, the MC method in its modern form to solve transport problems has been established and shaped in the 1940s, mostly by Stanisław Ulam and John von Neumann (see e.g. Metropolis 1987). Recognising the immense potential and utility of the first large-scale electronic computers, which became operational at the time, they harnessed the mathematical concept of "statistical sampling" to solve the neutron transport problems in fissionable material, thus launching the MC method. 4 With the growing availability of computational resources, which accompanied the rapid development of computers, MC methods became increasingly popular and their application spread across many different scientific disciplines. In the late 1960s, MC calculations finally entered the astrophysics stage, for example with the works by Auer (1968), Avery and House (1968) and Magnan (1968Magnan ( , 1970. House and Avery (1968) review the status of these early MC-based RT investigations. In the time since, MC methods have become established, successful and reliable tools for the study of a variety of astrophysical RT phenomena. These range all the way from planetary atmospheres (e.g. Lee et al. 2017) to cosmological simulations of reionization (e.g. Ciardi et al. 2001;Baek et al. 2009;Maselli et al. 2009;Graziani et al. 2013). The wide range of applications indicates the broad utility of MC methods for astrophysical applications. Many of these fields have in common needs that involve a sophisticated treatment of scattering, complex (i.e. non-spherical) geometries and/or complicated opacities. For example, many astrophysical MCRT studies involve stellar winds (e.g. Lucy 1983Lucy , 2007Lucy , 2010Lucy , 2012aLucy , b, 2015Abbott and Lucy 1985;Lucy and Perinotto 1987;Hillier 1991;Lucy and Abbott 1993;Schmutz 1997;Vink et al. 1999Vink et al. , 2000Vink et al. , 2011Harries 2000;Sim 2004;Watanabe et al. 2006;Vink 2008, 2014;Muijres et al. 2012a, b;Šurlan et al. 2012Noebauer and Sim 2015;Vink 2018), mass outflows from disks (e.g. Knigge et al. 1995;Knigge and Drew 1997;Long and Knigge 2002;Sim 2005;Sim et al. 2005Sim et al. , 2008Sim et al. , 2010Sim et al. , 2012Noebauer et al. 2010;Odaka et al. 2011;Higginbottom et al. 2013;Kusterer et al. 2014;Hagino et al. 2015;Matthews et al. 2015Matthews et al. , 2016Matthews et al. , 2017Tomaru et al. 2018), or supernovae (e.g. Lucy 1987Lucy , 1999bLucy , 2005Janka and Hillebrandt 1989;Mazzali and Lucy 1993;Mazzali 2000;Stehle et al. 2005;Kasen et al. 2006;Sim 2007;Kromer and Sim 2009;Jerkstrand et al. 2011Jerkstrand et al. , 2012Jerkstrand et al. , 2015Abdikamalov et al. 2012;Wollaeger et al. 2013;Kerzendorf and Sim 2014;Wollaeger and van Rossum 2014;Bulla et al. 2015;Fransson and Jerkstrand 2015;Roth and Kasen 2015;Botyánszki et al. 2018;Ergon et al. 2018;Sand et al. 2018). In these environments a treatment of multiply overlapping spectral lines in high-velocity gradient flows are crucial. Others depend on accurate simulations of scattering, be it for high-energy processes (e.g. Pozdnyakov et al. 1983;Stern et al. 1995;Molnar and Birkinshaw 1999;Cullen 2001;Yao et al. 2005;Dolence et al. 2009;Ghosh et al. 2009Ghosh et al. , 2010Schnittman and Krolik 2010;Tamborra et al. 2018) or from dust-rich structures (e.g. Witt 1977;Yusef-Zadeh et al. 1984;Dullemond and Turolla 2000;Bjorkman and Wood 2001;Gordon et al. 2001;Misselt et al. 2001;Juvela and Padoan 2003;Niccolini et al. 2003;Jonsson 2006;Pinte et al. 2006Pinte et al. , 2009Bianchi 2008;Jonsson et al. 2010;Baes et al. 2011;Robitaille 2011;Whitney 2011;Lunttila and Juvela 2012;Camps et al. 2013;Camps and Baes 2015;Gordon et al. 2017;Verstocken et al. 2017). Many of the applications primarily aim to calculate synthetic observables but MCRT methods are also used to determine physical and/or dynamical conditions in complex multidimensional geometries, such as star forming environments, disc-like structures, nebulae or circumstellar material configurations (e.g. Wood et al. 1996;Och et al. 1998;Bjorkman and Wood 2001;Ercolano et al. 2003Ercolano et al. , 2005Ercolano et al. , 2008Kurosawa et al. 2004;Bjorkman 2006, 2008;Altay et al. 2008;Pinte et al. 2009;Harries 2011Harries , 2015Haworth and Harries 2012;Hubber et al. 2016;Lomax and Whitworth 2016;Harries et al. 2017). MCRT schemes have also found use in astrophysical problems that require a general relativistic treatment of radiative transfer processes (e.g. Zink 2008;Dolence et al. 2009;Ryan et al. 2015).

Monte Carlo basics
At the heart of MCRT techniques lies a large sequence of decisions about the fate of the simulated quanta. These decisions reflect the underlying physical processes and, as an ensemble, provide a representative description of the transport process. On an individual level, this is realised by selecting from the pool of possible outcomes based on a set of probabilities that encode the underlying physics. This procedure is typically referred to as "random sampling" and will be discussed below.

Random number generation
Critical to the outline above is that some form of randomness is required to perform the sampling, and thus the MCRT calculation, on a computer. True randomness is difficult to achieve on a machine which is inherently deterministic, but for many purposes "pseudo-randomness" is sufficient which can be obtained via a so-called (pseudo) Random Number Generator (RNG). Based on a starting value (referred to as the seed), these algorithms provide sequences of numbers, ξ , typically uniformly distributed over the interval [0, 1[. Such sequences are referred to as "pseudo" random since they share statistical properties with true randomness but are still generated by relying on deterministic prescriptions. A well-known example of such algorithms is the family of linear congruential methods. Based on a previous draw, ξ i , and a set of large numbers, a, c and M, a new random number is generated by 5 For the purpose of MCRT applications, the "pseudo"-randomness is not problematic as long as the RNG period, i.e. the lengths after which repetitions occur, 6 is large and as long as the RNG exhibits a good lattice structure. The latter implies that s-tuples of successive RNG draws, (ξ n , ξ n+1 , . . . , ξ n+s−1 ), are evenly distributed within the s-dimensional hypercube (see e.g. Kalos and Whitlock 2008), a property which a number of early multiplicative congruential methods-algorithms of the family Eq. (16) but with c = 0-lacked (first pointed out by Marsaglia 1968). Figure 1 illustrates some possible shortcomings of poorly performing RNGs. Popular examples for RNGs, which fulfil the above requirements and are well-suited for MCRT applications include for example the Mersenne Twister (Matsumoto and Nishimura 1998) or members of the xorshift family (Marsaglia 2003).

Random sampling
With the help of RNGs, random numbers 7 can be rapidly produced on the computer and used for sampling physical processes by mapping them onto the target probability distribution. To illustrate the different sampling schemes, we first introduce some 5 The resulting numbers can be mapped onto the unit interval [0, 1[ by dividing by M. 6 For the linear congruential methods as defined by Eq. (16), the period can at most reach M. 7 For the sake of brevity we will omit the attribute "pseudo".  16). The upper left panel shows pairs, (x, y) = (ξ i , ξ i+1 ), of (normalised) sequential draws from an RNG with a deliberately short period (a = 3, c = 1, M = 257 and ξ 0 = 11). The lower panels contain draws from an RNG with a much longer period of M = 2 31 based on sequential pairs (left) and triples ((x, y, z) = (ξ i , ξ i+1 , ξ i+2 ), right) . Due to the short period, the first RNG performs poorly, exhibits strong correlations between successive draws and does not fill the unit square uniformly as seen in the upper left panel. The RNG with the significantly longer period seems to perform much better: the unit square is filled evenly and no obvious correlations stand out when considering two successive draws (lower left panel). However, if three successive draws from this RNG are examined, strong correlations become apparent as seen in the lower right panel. The infamous RANDU (a = 65,539, c = 0, M = 2 31 ) was used to generate the data for this demonstration and illustrates an RNG that fails lattice tests (e.g. Fishman and Moore 1982) basic concepts from statistics. For the sake of brevity, we again refrain from a rigorous mathematical presentation and instead refer the interested reader to the standard literature on the topic, e.g. Kalos and Whitlock (2008).

Sampling from an inverse transformation
Consider a physical process with outcomes described by the random variable X . We further assume, that X is continuous and can take values from [0, ∞[. In this case, the probability that a certain event occurs, i.e. that X takes a value within [x, x + dx], is given by the so-called Probability Density Function (PDF) ρ X (x)dx, which fulfils the normalisation With the density, the probability that X takes any value less or equal to x can be calculated, resulting in the Cumulative Distribution Function (CDF) Unlike the probability density, which is always positive but not necessarily monotonous, the cumulative distributed function (by definition) is always monotonous. Consequently, it can be used to establish a mapping between two probability distributions via This is the fundamental concept of sampling one probability distribution ρ X (x) using draws from another, ρ Y (y). Using the random numbers ξ provided by the RNG, which are uniformly distributed between 0 and 1 and thus have f ξ (ξ ) = ξ , this simplifies to which, after inversion, results in the sampling rule Below, we illustrate this sampling process via examples of relevance to a number of physical process in MCRT applications.
Example 1: Selecting directions Consider the situation of isotropic scattering of a photon (using a spherical polar coordinate system). In this case, no propagation direction after the interaction is preferred and the probability that the photon escapes into a specific solid angle element dΩ = dφdθ sin θ (with φ ∈ [0, 2π ] and θ ∈ [0, π]) is constant ρ(φ, θ)dφdθ sin θ = const.
Since the two angles are independent, and after the introduction of the directional parameter μ = cos θ, this reduces to and finally results in the sampling rules Thus to randomly select a direction of propagation, we draw two independent random numbers (ξ 1 , ξ 2 ) from the RNG sequence and use them to determine the direction via Eqs. (26) and (27). Example 2: Selecting interaction points A critically important application of random sampling is the decision when photons interact. The probability that a photon interacts within dl after having covered a distance l along its trajectory is given by the probability density We omit a rigorous deviation of this result and refer to the literature instead, in particular to Kalos and Whitlock (2008, Sec. 6.3). However, the physics of this result can be quickly appreciated by recognising it as the product of the probability of having travelled distance l with no interaction (given by the exponential term) and the probability per unit length (χ ) of undergoing an interaction at the position reached after travelling l. The inverse transformation technique can be combined with Eq. (28) to determine the distance a photon will travel to the next interaction event leading to Here, we used the fact that 1−ξ is equally distributed as ξ . In the case of a homogeneous medium, χ is constant and the sampling rule simplifies to

Alternative sampling techniques
In the examples above, the inverse transformation technique was used to sample the involved physical processes since the underlying cumulative distribution function could easily and analytically be inverted. Naturally, this is not always feasible and in such cases, one has to rely on other sampling methods. However, even if determining the cumulative distribution function is analytically challenging, it can be done by means of numerical integration and values for f X (x) pre-calculated for a number of monotonically increasing x i . Once these values, f X (x i ), are available, the distribution can be sampled by first selecting a grid interval Since ξ now lies between f X (x i ) and f X (x i+1 ), the final sampling is performed by linear interpolation (see e.g. Carter and Cashwell 1975) x Naturally, this approach only approximates the underlying probability distribution and the accuracy increases with the number of grid points at which f X is evaluated. Another popular sampling technique which is applicable also to complex distributions is the so-called rejection sampling method (see, e.g. Carter and Cashwell 1975;Kalos and Whitlock 2008, for a detailed description). This approach is closely related to MC-based integration. We briefly illustrate its basic principles for the example of one-dimensional distributions. In this case, pairs of random numbers (x, y) = (ξ 1 , ξ 2 ) are generated. 9 If the trial ξ 1 is accepted as a valid sample of ρ X (x), otherwise it is rejected and the procedure repeated until the desired number of samples is obtained. This technique involves a certain level of unpredictability since not every trial draw produces an accepted sample. In addition to the general sampling techniques outlined above, a number of specific schemes tailored to probability distributions of particular interest are available. In the context of RT, a prominent example is the sampling of frequencies for a thermal radiation field from the normalized Planck distribution, For this problem, Barnett and Canfield 10 have proposed an efficient sampling technique based on the series expansion of the Planck function. This technique, which has been reviewed numerous times in the literature (for example Fleck and Cummings 1971;Carter and Cashwell 1975;Bjorkman and Wood 2001), relies on five uniform random numbers ξ 0 , . . . , ξ 4 . The first one is used in the minimization process providing L, which in turn is combined with the remaining random numbers to give the final non-dimensional frequency According to Fleck and Cummings (1971), 1.1 trials [in terms of elements in the summation in Eq. (36)] per calculated frequency are on average required, resulting in an efficient and accurate algorithm for sampling a thermal radiation field.

Monte Carlo quanta
Unlike traditional approaches to RT problems, MCRT calculations do not attempt to solve the RT equation directly. Instead, a simulation of the RT process is performed. Specifically, the radiation is discretized so that it may be represented by a large number of MC quanta. During the initialization of such a simulation, each quantum is assigned a position, an initial propagation direction, an energy and frequency, if desired, a polarization vector, and some measure of importance or weight. This last property essentially determines the contribution of the individual quanta to the final results. After the discretization and initialization, the quanta are propagated through the computational domain to simulate the RT process. In the following sections, we highlight two discretization paradigms, namely the photon packet and the energy packet scheme. These derive from different interpretations of what the quanta represent and provide different prescriptions for the choice and treatment of their weights. We then discuss packet initialization. The process of propagating packets during the simulation is described in Sect. 6.

Discretization into photon packets
Historically, MCRT applications drew inspiration from nature's inherent discretization of radiation and thus interpreted the fundamental MC quanta as photons. Indeed, in many early MCRT studies performed in astrophysics, such as Auer (1968), Avery and House (1968) and Caroff et al. (1972), the quanta are simply referred to as "photons". Although the number of MC photons that are introduced and considered is usually large in a statistical sense, it is completely insignificant compared to the actual number of real photons constituting the physical radiation field. Thus, it is inherent to this discretization scheme that the MC photons, or machine photons as they are sometimes called (cf. House and Avery 1968), actually represent a large number of physical photons instead of individual ones. As a consequence, the MC quanta are typically referred to as photon packets or simply packets. From this discretization perspective, packet weights can be interpreted as encoding that the individual MC packets represent many physical photons. However, the weights are practically never assigned uniformly or held constant during the simulation in MCRT schemes that rely on the photon packet discretization approach. These manipulations of packet weights lead to an often dramatic reduction in variance (i.e. increase of statistics and reduction of MC noise) and belong to the more generic class of biasing techniques (see Sect. 9.4). Considering MCRT applications in astrophysics, the majority relies on the photon packet discretization scheme with non-uniform and variable packet weights. Prominent examples certainly include the many MCRT simulations performed in dust RT problems (see, e.g. reviews by Whitney 2011;Steinacker et al. 2013).

Energy packets and indivisibility of packets
The energy packet discretization approach has been mainly developed and shaped by L. Lucy. The basic interpretation was already given by Abbott and Lucy (1985), but it was only after extending the approach and applying it to radiative equilibrium (RE) calculations (Lucy 1999a), that its full potential and benefits were explored. The scheme was further generalized to include the possibility of non-resonant interactions and of realising statistical equilibrium (Lucy 1999b for further details).
Compared to the photon packet scheme introduced above, the energy packet approach rests on a different interpretation of what MC quanta fundamentally represent: packets are now primarily thought of as parcels of radiant energy and the packet energy also acts as its weight. Again, these parcels of radiant energy represent many physical photons. At this point, the difference between the photon and energy packet schemes seems very subtle, almost semantic. However, the distinctiveness of this discretization scheme becomes apparent once the treatment of packet weights is included into the consideration.
The primary attraction of viewing the quanta as packets of radiation energy, rather than bundles of photons, is the ease (and accuracy) which with energy flows can be tracked during a simulation. For example, in RE problems, the combination of an energy packet discretization and an indivisible packet algorithm allow strict energy conservation to be imposed (Lucy 1999b). While all other packet properties, in particular its frequency, can change during the simulation, the packet energy, i.e. its weight, is strictly held constant after the initial assignment (i.e. the packets are indivisible, and also indestructible, excepting that they can exit through the boundaries of the computational domain). The indivisibility property can readily be applied to all interactions, even those that on first sight seem to require the splitting of packets or adjustment of weights. Instead of splitting, such events are handled by probabilistically sampling the different outcome channels (see the downbranching scheme by Lucy 1999b or the macro atom approach by Lucy 2002Lucy , 2003 which will both be described in detail in Sect. 7.4). In this process, a change in frequency assigned to the packets may occur, but the CMF energy will always stay constant. A noteworthy property of indivisible energy packet schemes is that a MC packet may represent a varying number of physical photons during its lifetime: the scheme does not enforce conservation of photon number (and nor should it: many physical radiation-matter processes e.g. recombination cascades or fluorescence do not conserve the number of photons).
Relying on this indivisible energy packet formalism offers a number of advantages as pointed out by Abbott and Lucy (1985) and Lucy (1999a). Most importantly, it enforces strict local energy conservation in RE applications by construction. However, we note that this energy conserving property does not restrict the scheme to RE problems: well posed sources and sinks of radiative energy can be readily incorporated while maintaining strict energy conservation (see Sect. 11). In addition, the packet indivisibility naturally controls the number of quanta tracked in an MCRT calculation and avoids the need to incorporate an elimination scheme for quanta with small weights which may otherwise accumulate and slow-down the calculation. The indivisible energy packet scheme has been widely used in MCRT calculations of RT in mass outflows (e.g. Abbott and Lucy 1985;Vink et al. 1999;Long and Knigge 2002;Sim 2004Sim , 2005Bjorkman 2006, 2008;Noebauer et al. 2010) andin SN ejecta (e.g. Lucy 2005;Kasen et al. 2006;Sim 2007;Kromer and Sim 2009;Noebauer et al. 2012;Kerzendorf and Sim 2014).
We note that many of the advantages of indivisible energy packet schemes can still be retained when strict indivisibility is relaxed. In particular, splitting of energy packets can be introduced in attempts to improve statistics (e.g. Harries 2015;Ergon et al. 2018) where strict energy conservation is retained (i.e. the algorithm is free to split an energy packet at any point, provided that the sum of the energies of the newly created packets matches that of the original unsplit packet). Similarly, there is no requirement of the scheme that all packets have the same energy as each other: the only rule is that the combined packet energies correctly sum to the total energy / energy flow of the process under consideration.
Example: Packet scheme applied to Compton scattering To illustrate the manner in which physical processes are described in the different packet approaches, we use the example of Compton scattering, following Lucy (2005). Specifically, we consider Compton scattering of an ensemble of high-energy photons by a population of lowtemperature free electrons (i.e. near-stationary in the CMF). Each single Compton scattering process can be roughly described (in the CMF) as where the electron initial state (e − i ) has close to zero kinetic energy but the final state (e − f ) has non-zero kinetic energy (associated with the recoil). Conversely, the post-scattering photon (γ f ) will have less energy (E γ f < E γ i ) and lower frequency (ν γ f < ν γ i ) than the initial photon state (γ i ). In a photon packet scheme, the manner in which this process can be simulated is obvious: whenever one of the MC photon packets undergoes such a Compton scattering event, the number of photons it represents remains fixed but the frequency of the photons represented by the packet is reduced (accordingly, the packet then represents less energy).
For an indivisibly energy packet scheme, the treatment is more subtle (Lucy 2005). Here we consider how energy flows through the problem: from the initial energy of the incoming photon population (γ i ) to the combination of final photon population (γ f ) and final electron kinetic energy (e − f ). In particular, a fraction F γ = E γ f /E γ i of the incident photon energy is passed to the outgoing photon while F e = E e f /E γ i = 1 − F γ goes to the electron. Thus, adopting the indivisible energy packet principle, an initial MC (γ i ) packet is converted to an outgoing γ f packet with probability F γ or to a packet representing the electron kinetic energy with probability F e . In either case, the energy represented by the packet remains fixed (i.e. strict energy conservation), but the nature of the energy has changed: in the first case the energy is still being carried by photons, but now of lower photon frequency (in accordance with the γ f state); in the second case, the energy has been passed to the electron kinetic pool from where its role in powering further emission of the material can be followed using e.g. the k-packet formalism of Lucy (2002, see also Sect. 7.4).
This example primarily serves to illustrate the subtle difference between photonpacket and energy-packet schemes but it is natural to wonder which scheme is better. In general, there is no absolute statement to be made: both approaches are valid and which is better suited will depend on the problem in question. However, the relative merits are clear and can be stated fairly simply for our example: the photon packet scheme will rigorously conserve photon number (as does the physical Compton scattering process) and is well suited if the aim of the simulation is to calculate the Comptonized photon spectrum (e.g. Pozdnyakov et al. 1983;Laurent and Titarchuk 1999), potentially following many scattering events. On the other hand, multiple scattering in the photon packet approach may lead to a proliferation of low-frequency photon packets that carry very little energy, but still require the same computational effort per scattering to simulate. This may not be ideal for e.g. applications in which the primary interest in high-energy Compton scattering lies in its role as a heating process (such as the modelling of SN ejecta powered by radioactive decay, as discussed by Lucy 2005). For such a problem, the indivisible energy packet scheme provides a simple means to determine the rate at which energy flows into the electron pool with the computational effort being invested in proportion to the energy carried by the photons, rather than to the photon number.

Initialisation of packets
Closely related to the fundamental discretization of the radiation field into discrete MC packets is the initialization process. Here, the initial packet properties are assigned by drawing from the spatial, directional and spectral distribution of the radiation field by relying on the sampling techniques presented in Sect. 4.2. The instantaneous values of these properties, 11 i.e. the position, direction, frequency, 12 energy and weight, 13 fully describe the packet state during the entire MC simulation. If the effect of polarization is included in MCRT simulations, packets are additionally assigned appropriate values for the Stokes vectors (see, e.g. Kasen et al. 2006;Whitney 2011;Bulla et al. 2015).
In the following, we briefly sketch the initialization process within the indivisible energy packet scheme. Note that the corresponding procedure is not fundamentally different within the photon packet scheme. In the following presentation, we distinguish between the initial assignment of properties for packets that represent the radiation field in the domain at the onset of the MC simulation and for packets that represent the inflow of radiation into the domain through the boundaries.
We substantiate these concepts by highlighting the initialization of N packets representing an initial thermal radiation field, at temperature T , which is assumed to be 11 For the moment, we neglect polarization.
12 Throughout this review we generally assume monochromatic packets for simplicity. Some of the techniques presented here can also be generalized to polychromatic packets (see, e.g. Steinacker et al. 2013, for more information on polychromatism). 13 We note that packet energies and weights are somewhat interchangeable concepts. Thus, we will make use of both terminologies in this review. uniform within a cuboid volume ΔV . Despite its simplicity, this situation is often encountered in MCRT calculations. In the energy packet scheme, a commonly used practise involves assigning a uniform packet energy. Thus, if N packets are initialized, each packet carries an energy of where a R is the radiation constant. The thermal radiation field is isotropic and as a consequence the initial propagation direction of a packet can be assigned using previously presented sampling rules, namely Eqs. (26) and (27). Since we have assumed that the initial radiation field is uniform over the volume ΔV , locating the packets is trivially done by where (x 0 , y 0 , z 0 ) and (x 1 , y 1 , z 1 ) are opposite corners of the cuboid volume. Finally, the packet frequency is obtained from sampling the Planck function, for example by relying on the technique given by Eqs. (36) and (37). The initialization process has been implicitly performed in the CMF. Their LF properties are obtained by applying the appropriate frame transformations (see Sect. 2). This procedure will be revisited when discussing MCRT in expanding media (see Sect. 8).
In applications for which radiation is streaming into the domain, MC packets are continuously created to represent the inflow of energy. A frequently encountered example is that of a photosphere being located at the inner boundary of a spherical domain which emits as a black body with temperature T phot (used for example in the MCRT approaches of Mazzali and Lucy 1993;Kerzendorf and Sim 2014 for studying SN ejecta). In this, case packets with energy ε are initialized during the time interval Δt (here, σ R is the Stefan-Boltzmann constant). Since these packets are launched from the photosphere, their initial position is simply If limb-darkening can be neglected, packets leave the photosphere along directions drawn by 14 μ = ξ.
Finally, the initial packet frequency is again drawn from the Planck function. We conclude this description by noting that in MCRT applications packets may also be initialized to represent the continuous radiative cooling of the ambient material or to represent the emission from other sources within the domain (e.g. in radiative nonequilibrium applications). The basic initialization principles highlighted remain the same and can be simply transferred to these applications.

Propagation of quanta
The discretization paradigms and the initialization principles outlined above (see Sect. 5) provide rules for the creation and launching of MC packets. The bulk of the computational effort involved in a MCRT calculation is spent in tracing the movement of these packets through the ambient material to simulate the RT process. During the propagation, their trajectories are interrupted as the packets experience radiationmatter interactions. Depending on the nature of these interactions, the packet properties may change or the propagation may even be terminated. The MC packets are thus followed until certain termination conditions are met, e.g. the packet leaves the computational domain or has been active for a pre-defined time (this aspect will be treated in detail in Sect. 6.6). The propagation procedure of a MCRT simulation is complete when all packets representing the initial radiation field and the effects of sources of radiative energy (e.g. inflows through boundaries, internal sources in radiative nonequilibrium applications, etc.) have been processed this way. In the following, we first outline the fundamental propagation principles (Sect. 6.1) and then detail how basic absorption and scattering interactions are handled (Sect. 6.2-6.5) before finally turning to the inclusion of time dependence (Sect. 6.6).

Basic propagation principle
In the absence of general relativistic effects (which we neglect in this review), a MC packet propagates on a straight path in its propagation direction n. In the simplest version of a MC packet propagation algorithm, the packet properties do not change along these straight-flight elements of its path: interactions with the medium are treated as discrete interaction events, and the primary aim of the MC algorithm is to determine where those interaction events occur.
Finding the interaction points depends on the opacity in the medium. Along its trajectory, measured by l, a packet (p), continuously accumulates optical depth Here the specific functional form of the opacity depends on the physical interaction processes that are taken into consideration and can, in principle, be very complicated. When the accumulated optical depth surpasses a threshold value, τ , the packet will undergo an interaction at the corresponding location l. As anticipated in Sect. 4.2, this threshold value is determined for each packet individually and probabilistically. In particular, at the beginning of each packet trajectory segment, the packet is assigned a randomly sampled optical depth distance to the next interaction by Whenever a packet experiences an interaction, its properties may change depending on the nature of the interaction process. In general, these interactions can be broadly divided into scatterings and absorptions. In the former, the packet is deflected and continues its propagation in a different direction, possibly with a different energy and/or frequency. Depending on the nature of the scattering process, the assignment of emergent packet properties may become quite complex (e.g. in dust scattering applications). Alternatively, if absorption occurs, the propagation is terminated and the packet removed from the active pool. 15 For locating packet interactions using Eq. (45), we highlight an important property of the exponential distribution, namely its infinite divisibility (see for example Bose et al. 2002). Probability distributions with this property can be replaced by the "distribution of the sum of an arbitrary number of independent and identically distributed random variables". 16 For the purpose of MCRT, this implies that, as long as the packet has not interacted, one is at liberty to reset the comparison between accumulated and interaction optical depth arbitrarily often. I.e. one can opt to redraw the optical depth distance to the next interaction with Eq. (45) and reset the tracking of accumulated optical depth, Eq. (44), at the current packet location. This property is often used when performing MCRT simulations on numerical grids (see Sect. 6.3).
Following the generic propagation principles outlined above, each packet is moved through the domain until a termination condition is reached. Depending on the problem, this may be an absorption interaction, or the packet intercepting a transparent domain surface through which it escapes to infinity, or simply that a pre-defined amount of physical time has elapsed. The propagation process, which is visually summarized in Fig. 2, is complete after all MC packets have been processed in this manner. During the propagation process, various events may be recorded or the change of certain packet properties tracked. These may then be used to reconstruct physical properties of the radiation field (see Sect. 9).

Absorption as continuous weight degradation
The general scheme outlined in the previous section can be applied to find discrete interaction events for any physical contribution to the opacity. It is particularly important (and widely used) for addressing scattering problems: an accurate treatment of scattering is most easily formulated as an ensemble of discrete interactions where the properties of the packets are changed at the interaction point in accordance with the physics of the scattering process. The scheme is also widely applied to true absorption processes, and this is particularly attractive in applications that aim to exploit the energy-conserving qualities of radiative equilibrium problems (see Sect. 7).
However, in some applications (see, e.g. the treatment of continuum absorption used by Knigge 2002 andreferences in Steinacker et al. 2013) Fig. 2 Illustration of the packet propagation process. At its core, the main processing loop works through the entire packet population, propagating each packet individually. In this loop, each packet is moved through the domain until a termination event occurs. Depending on the specific MCRT application, this may for example be an absorption interaction, escaping through one of the domain boundaries or reaching the end of a pre-defined time interval. The instruction "update estimators" refers to all activities related to recording packet properties that are used to reconstruct physical information about the radiation field from the packet ensemble. This procedure will be discussed in more detail in Sect. 9 treatment of absorption is used. Specifically, rather than treating absorption via discrete interaction events it is simulated by continuous reduction of the energy carried by a MC packet as it propagates along its flight path. Specifically, whenever a packet travels along a trajectory of length l, the energy it carries (weight, w) is reduced according to where is the optical depth associated with the absorption component of the opacity (χ a ). Conceptually, the energy removed from the MC packet in this way is being transferred out of the radiation field by whichever physical process(es) contribute to χ a . For example, this might represent energy invested in heating or ionising the ambient medium.
There are several advantages for this approach to absorption compared to the discrete scheme outlined above. First, it can reduce the MC noise since it replaces the stochastic identification of specific interaction points with a smooth degradation of packet weight. This seems especially attractive if considering small contributions to the opacity (e.g. background continua): using the discrete event algorithm to simulate such interactions would be noisy since the number of events associated with a low opacity will be small. 17 Second, it can greatly simplify the MC algorithm for applications in which pure absorption is dominant: in such cases, pure weight attenuation of packets on straight trajectories may be sufficient to solve the problem. 18 However, there are some limitations associated with continuous weight degradation. In particular, if the interaction processes is associated (at the microphysical level) with some radiative re-emission process, such as effective scattering/fluorescence in atomic or molecular line transitions, this approach loses a direct connection between the absorption and re-emission process. If important, the re-emission must be simulated by injecting new MC packets to represent it (see Sect. 7.1). For this reason, MCRT applications that depend on simulating e.g. atomic line interactions have found it more practical to use a discrete interaction approach for this process (similar to e.g. Abbott and Lucy 1985). We note, however, that hybrid schemes have been successfully employed where the continuous attenuation approach is used for smooth continuum absorption opacity while a discrete interaction algorithm is applied for atomic line absorption and electron scattering (e.g. Long and Knigge 2002). Throughout most of this review we will focus on methods that adopt the discrete interaction approach for treating both scattering and absorption but note that many of the principles discussed in later chapters can be applied to simulations that employ a weight-degradation approach to absorption, or a combination of both.

Material properties and numerical discretization
To perform the packet propagation process, the local material state, such as velocity, density and temperature, has to be accessible. It sets the local opacity and thus determines the rate at which optical depth is accumulated along the propagation path [cf. Eq. (44)]. Moreover, the material state dictates the re-emission characteristics in scattering interactions. Ideally, the material state is directly accessible in closed analytic form such that the optical depth integration can be performed analytically. In practise, however, the complex local dependence of the material properties calls for a numerical integration. In addition, the continuous material state is often not available but instead only a discrete representation. This could, for example, be the snapshot of a hydrodynamical simulation. Consequently, the packet propagation is typically realised by introducing a computational grid, dividing the domain into cells on which the matter state is discretely represented. Often, the material properties are approximated as constant throughout the grid cells (although interpolation can be used).
The packet propagation process can then proceed on the numerical grid along the basic principles outlined above. If the material state is assumed to be constant throughout the individual grid cells, determining the rate of accumulation of τ is generally simple. 19 However, one does need to track when packets cross over grid cell boundaries: at such points, quantities that depend on the material state, such as opacities, have to be recalculated. Some codes, for algorithmic convenience, also exploit the infinite indivisibility property of the exponential distribution and re-draw the random optical depth from the usual sampling law, Eq. (45), when cell crossing occurs.

Absorption and scattering
Having outlined the principles of how packets can be propagated through the simulation domain, we now discuss how interactions are handled. In any real application, the details of how interaction events are to be processed will depend on the particulars of the radiation-matter physics being simulated. To illustrate the general principles here, we adopt a number of simplifications, namely that the medium is static and that all material functions are frequency independent and isotropic. We also restrict the presentation to basic absorption 20 and coherent and isotropic scattering interactions. Lifting these simplifications, in particular, including more complex interaction processes, naturally complicates the individual steps of the propagation process but the basic structure of the procedure remains the same.
We start by considering the packet propagation process in the presence of only true absorptions, described by the opacity χ a . As detailed above, a packet interacts after accumulating the optical depth τ , assigned according to Eq. (45). This decision in optical depth space has to be translated into a physical location within the com-putational domain by inverting Eq. (44). This is a critical part of the MC algorithm that can be very challenging when the material functions are complicated functions of location, frequency and direction. For our example, however, the translation is trivial since optical depth and physical distance only differ by the opacity coefficient, which is constant within a grid cell. Thus, a packet interacts after covering the distance For pure absorption, the propagation would end at this point with the packet being discarded and the next packet of the active pool being treated. Note, however, that many implementations, including those that enforce RE based on the indivisible energy packet formalism of Lucy (1999a) do not terminate the packet at the absorption point: instead absorption is immediately followed be re-emission, as will be elaborated in Sect. 7.4. Unlike in deterministic RT approaches, where scattering generally introduces complex non-local couplings (see, e.g. Hubeny and Mihalas 2014, sec. 11.1), including scattering does not pose any conceptual difficulties for MCRT approaches. In this case, the total opacity is a sum of absorption and scattering terms: The exercise of locating packet interaction points is as trivial as before: it follows from Eq. (48) adopting the total opacity Once the interaction point is found, an additional random number experiment has to be performed to decide the nature of the interaction. In particular, the packet scatters Otherwise it is absorbed and is treated as detailed above. We note that this procedure can easily be extended to situations in which more than two outcomes are possible (e.g. if multiple mechanisms with different scattering properties were relevant) by subdividing the interval [0, 1[ into bins with sizes proportional to the relative strengths of the different processes and sampling from this discrete set. If the packet scatters it continues its propagation along a new direction with any relevant properties (e.g. photon frequency) set by rules that describe the scattering process. For example, in the simple case of isotropic coherent scattering, a new propagation direction is assigned by the sampling rule while the photon frequency would remain unchanged. Of course, realistic scattering processes may be neither isotropic nor coherent: but this merely requires that the sampling procedures be altered to reflect the true directional dependence and that the frequency be changed in the interaction. After the scattering event, the packet continues its propagation along the new direction. A new distance to the next interaction event is drawn from Eq. (45) and the tracking of the accumulated optical depth of Eq. (44) is re-initialized. In this manner, the flow of packets can be followed including multiple scatterings in arbitrary media. We emphasise that the ease with which scattering interactions can be treated is a major benefit of MCRT approaches.

Propagation example
Using the principles described so far, simple MCRT simulations can be built using only a few lines of code. As one example, consider calculating the escape probability of photons from a homogeneous sphere (uniform density and uniform emissivity) in which radiation may be absorbed or scattered (for simplicity we assume opacities and emissivities that are independent of frequency and direction). For the analytic solution to this test problem, see Appendix A.1. A simple Python implementation of the MCRT simulation for this problem is part of our code repository distributed on GitHub (see Appendix B). In the outline of the MCRT procedure below, we specifically refer to the respective parts of the Python code by providing the corresponding line numbers. The elements of this calculation are: -The MCRT simulation begins by initialising N packets and uniformly distributing them throughout the sphere. As this is a one-dimensional problem and since we are only interested in the escape probability, the packet state is essentially described by r and μ. The initial location in the uniform sphere is assigned by where R is the outer radius of the sphere (code line 90), and the initial direction is chosen isotropically (code line 92) -After initialisation, the pool of packets is processed with each being propagated through the sphere following the principles outlined above. This includes drawing the random interaction τ -values (code line 143) and calculating distances to boundary crossing (code line 145). Packet interaction occurs when the randomly drawn optical depth is reached before the boundary of the simulation (code line 149). Whenever a packet interacts, Eq. (51) is used to decide whether the packet is absorbed (destroyed in this case) or scattered. Following scattering, a new direction (code line 172) is drawn with Eq. (52) and the propagation continues. -Each packet is followed until it either is absorbed or escapes through the surface of the sphere, and the entire MCRT simulation ends when all packets are processed and different scattering to absorption strengths. In the case of χ s = 0, the MCRT results agree very well with the analytic predictions from Eq. (A.11). As the scattering opacity increases, the escape probability grows since the absorption probability is smaller for a given optical thickness of the sphere. Figure 4 further illustrates the packet propagation process by showing a small number of packet trajectories for two MCRT simulations at τ = 2, with χ s = 0 and χ s = χ a .

MCRT: time-dependent applications
The presentation so far has been centred on time-independent MCRT applications, i.e. on finding a steady-state solution. For many applications, however, obtaining the detailed time-dependent evolution of the radiation field is required. Conceptually, only few changes in the MCRT techniques outlined so far have to be made to include time dependence. Fundamentally, each MC quanta is assigned a time stamp, t, which tracks the current simulation time. During the initialisation process, the internal clock of each packet is set depending on the physical process responsible for the packet emission. If the packet represents the initial radiation field, t is simply set to the starting time of the current phase of the MCRT simulation. If the packet models the effect of a particular source of radiative energy, the time is sampled from the temporal emission profile of the source. In the simplest case of a continuous emitter, t is uniformly drawn from the duration of the current MCRT simulation step. After initialisation, as the packet propagates, this time stamp is continuously updated. In particular, after covering the distance l, the internal clock of the packet is advanced by In time-dependent problems, a discretization of time is commonly introduced, similar to the spatial grid covering the computational domain. On the one hand, this defines time intervals over which packet properties will be tallied to discretely reconstruct the time evolution of radiation field properties (see Sect. 9 for more details). On the other hand, the time discretization is used to represent changes in the material state, which in general will vary in time-dependent problems. In analogy to the spatial discretization procedure, the material state in the different grid cell is typically assumed to be constant during a time step. Before the start of the next time step, the material state is then updated. This topic will be revisited in more detail in Sect. 11, when describing the coupling of MCRT schemes with full hydrodynamic calculations. Also in a later part of this review, namely in Sect. 10, we discuss problems arising from very rapid changes in the material state and how to best address these within MCRT framework.
In addition to tracking the current time for each packet the basic propagation scheme as outlined in Sects. 6.1-6.4 has to be further extended to account for the subdivision of the MCRT simulation into a series of time steps. Whenever the internal clock of a MC packet has progressed to the end of the current simulation time step, t n+1 , the packet's propagation is interrupted. The instantaneous state of the packet, i.e. its position, current frequency, energy, propagation direction and any further properties, is stored and the next packet of the active population is treated. At the end of the propagation process, all packets stored are transferred to the next simulation cycle and the packets continue their propagation at t = t n+1 with the saved properties.

Thermal and line emission in MCRT
The treatment of absorption and pure scattering processes as outlined above are relatively standard and the principles used are very well established. In contrast, the manner in which emission is handled in MCRT schemes is relatively varied and much of the sophistication and ongoing developments in the MCRT field relate to the manner in which this is done.
In this section we aim to review some of the approaches to treating emissions within a computational domain. To be clear, this is distinct from questions of how MC quanta might be injected at some computational boundary: in Sect. 5.3 we already reviewed how e.g. a population of packets might be injected to represent a seed blackbody radiation field as might be appropriate as an initial condition in a time-dependent simulation. Likewise, we described how packets might be injected at some boundary surface to represent a radiation source external to the simulation domain, for example a photospheric boundary condition. Here, instead, our focus is on how emissivity within the computational domain can be taken into account.

Known emissivity
The most obvious case to handle is any for which the emissivity is externally known (or can be easily estimated) without prior knowledge of the RT process within the domain. One such example might be a non-equilibrium plasma that is predominantly heated and ionized by non-radiative processes.
In this case, the emissivity can simply be sampled using standard sampling techniques (Sect. 4.2) to create a population of packets with properties that represent the emission process (i.e. photon frequency, weights/energies, propagation directions etc.). This population is simply injected alongside any packets due to external radiation field boundary conditions, and their subsequent propagation followed in the same manner. 21

Radiative equilibrium (RE)
For several of the applications to which MCRT has been applied, the emissivity is not known a prior. Indeed, for many astronomical RT problems (e.g. stellar/disk atmospheres, winds, SNe etc.), RE is a good approximation and the emissivity is effectively determined by the radiation field itself (i.e. near-equilibrium is achieved between absorption and emission of radiation). In such cases, the emissivity usually cannot be anticipated independently of a radiation transport simulation, which poses a challenge for consistent modelling. In the following sections we review methods that can be applied to problems with this requirement.

Radiative equilibrium in MCRT by iteration
One approach for RE problems is to use an iteration scheme to determine the conditions of the medium (temperature, ionization state, level populations etc.) on which the emissivity depends. Here, an iterative sequence of RT simulations would be performed: in each iteration the current best estimate of the conditions in the medium would be adopted to calculate the emissivity, and the outcome of the RT calculations 22 used to make an improved estimate for those conditions in the next cycle. This approach can be applied to schemes based on photon packets and/or energy packets and it has been used for modelling at least some part of the emissivity (e.g. the part associated with radiative cooling by Long and Knigge 2002) and works well provided that the complexity of the problem is not too severe.
However, as is well known from the history of modelling stellar atmospheres (cf. Hubeny and Mihalas 2014), the non-local character of RT problems can lead to significant convergence problems for this type of iteration scheme, especially when considering regions associated with high optical depth. In particular, in its pure form, this scheme suffers from the issue that energy conservation is only achieved asymptotically (i.e. once/if a converged equilibrium solution is found). As a result, during the iteration process, over-or under-estimated emissivities (due e.g. to unconverged temperatures) will lead to spurious sources and sinks of radiation that might slow or inhibit convergence.

"On-the-fly" radiative equilibrium in MCRT via indivisible energy packets
As described/developed in the works of Lucy (Abbott and Lucy 1985;Lucy 1999aLucy , 2005, it is actually very simple to rigorously enforce the conservation of energy required by RE via an indivisible 23 energy packet MCRT scheme. The principle is straight forward: RE implies that at all points there is (local) balance between the rates of absorption and emission of energy from/to the radiation field. Since, in an energy packet discretizaion the MC quanta already represent (local) bundles of radiative energy, the condition of RE is trivially enforced by insisting that the MC quanta are never destroyed, or otherwise degraded in weight, in the course of the simulation. Thus, all interactions between MC packets and the medium-even those representing pure absorption processes-become effective scatterings controlled by rules devised for the MCRT simulation. The rules for how packets should be altered in these effective scattering events depend on the physical process(es) being simulated (Lucy 2002(Lucy , 2003(Lucy , 2005: commonly, considerations of statistical equilibrium and/or thermal equilibrium (TE) will form the basis for formulating these rules. In the following sections we will elaborate these principles more generally, but for concreteness we first discuss one of the simplest specific examples.

Example: effective resonant scattering in a two-level atom
To illustrate the principle, we consider one of the simplest possible radiation-matter interactions that might be of relevance to a radiative/statistical equilibrium problem, that of absorption by a spectral line in a two-level atom (see Fig. 5). For further simplicity we restrict our discussion to the regime in which radiation dominates both the rate of excitation and de-excitation and we assume that the emissivity is isotropic.
If stimulated emission is treated as negative absorption, the emissivity of the spectral line (photon frequency ν) can simply be written (cf. Hubeny and Mihalas 2014, p. 118) where the rate R ul R ul = n u A ul (58) depends on the population of atoms in the upper level (n u ) and the Einstein-A coefficient for the transition ( A ul ). ψ ul (ν) is the emission profile function that determines the line shape (details of this function are not pertinent to our discussion here except to note that this function is normalised over all frequencies). Thus to evaluate η ul directly we need to know the upper level population (n u ). If the radiation field itself controls excitation to u, however, we cannot reliably determine n u until after the radiation field has been simulated. Provided that statistical equilibrium applies, however, we do know that the rate of absorption of energy from the radiation field (R lu ) is equal to the rate of emission: This statement can effectively be used to replace a direct evaluation of Eq. (58) by imposing it as a rule of an indivisible energy packet MCRT simulation: whenever a MC packet is absorbed by the line, it is immediately (in situ) re-radiated by the line. In essence, this is interpreting Eq. (59) as a traffic flow problem in which absorption of MC packets is the realisation of the R lu term and re-emission is described by R ul . This particular example is almost trivial but, as will be elaborated below, the basic idea of combining RE (indivisible packets) with statistical equilibrium (traffic flow rules to process packet interactions) can be extended to much more sophisticated cases. Before discussing more general cases, however, we pause to comment on some of the key features that this simple example already highlights.

Fluorescence and thermal emissivity via redistribution parameters
Early MCRT implementations, such as Abbott and Lucy (1985), applied the twolevel effective resonance line treatment of opacity, essentially as outlined above. The two-level approximation is relatively well-justified for many of the strong metal lines in the ultraviolet (UV, such as C iv 1550 Å or N v 1242 Å) that were relevant to studies of stellar winds (Abbott and Lucy 1985) and also later studies of accretion disk winds (Long and Knigge 2002;Kusterer et al. 2014). However, the two-level atom approximation has limited utility and is not realistic for problems in which flux redistribution via fluorescence and/or thermal reprocessing of radiation is important.
Various methods, with differing degrees of sophistication, can be employed to simulate flux redistribution in indivisible packet MCRT. One approach is to assume that the radiation-matter interactions can be modelled as a combination of resonance scattering and some form of complete flux redistribution across the spectrum. In this approach, a redistribution parameter, Λ, is introduced and used to determine the outcome of each packet interaction by drawing a random number (ξ ) and comparing: if ξ > Λ then the MC packet is assumed to undergo coherent scattering (i.e. a new direction is assigned but the CMF frequency is conserved as it would be in electron scattering or resonance line scattering); otherwise an incoherent (effective) scattering is executed in which a new random direction of propagation is assigned along with a new frequency. When the incoherent case is selected, the new frequency must be drawn from some suitable normalised emissivity distribution. One simple possibility is the local thermodynamic equilibrium (LTE) thermal emissivity (χ ν B ν ), but alternative choices could be made. The redistribution parameter (Λ) can be set globally or made a function of the interaction process (e.g. for line-scattering problems it might be estimated by comparing the relative importance of collisional and radiative deexcitation, similar to the considerations by Long and Knigge 2002). The effectiveness of this approach naturally depends on the problem under consideration. However, at least for some applications studied with MCRT it has been shown that this scheme is effective. In particular, as demonstrated by Baron et al. (1996), Pinto and Eastman (2000a, b) and Kasen et al. (2006), flux redistribution in Type Ia Supernova (SN Ia) modelling can be quite effectively approximated via a simple (thermal) redistribution parameter, achieving good agreement with more detailed treatments without too much sensitivity to the particular value of Λ adopted (see also Magee et al. 2018). We note that, in the limit Λ → 1, it may appear that this type of approach seems very similar to that outlined in Sect. 7.1: selecting post-interaction properties of the MC packets depends on knowing the material state sufficiently well to estimate an emissivity distribution from which to draw e.g. photon frequencies. However, the notable difference is that here the absolute normalisation of the emissivity is not used: i.e. although the emissivity distribution is used to select most properties, the packet energies remain fixed by the indivisible packet principle. As a consequence, strict RE is still enforced in the radiation/matter interactions.

Fluorescence and redistribution: macro atom method
Approaches similar to those outlined above (i.e. that treat interactions as either coherent or fully redistributive) are easy to implement, fast to run and, with appropriate parameter choices, can capture many of the essential features of scattering and redistribution. However, not all physical processes are readily captured this way: for example fluorescence (and cascades) between energy levels in an atom or ion certainly leads to a coupling of emission in different parts of the spectrum, but it cannot be well described via a single "redistribution emissivity" that can be sampled for all interactions. In general, we must acknowledge that every distinct radiation-matter interaction can lead to its own distinct set of outcomes. For example, consider a three-level atom in a problem for which statistical and RE are assumed (cf. Fig. 6): if a MCRT packet is absorbed in the transition from the lowest level (1) to the highest (3), it is expected that the range of outcomes following that event should at least include a combination of re-emission in the 3 → 1 transition plus the cascade 3 → 2 and 2 → 1. It is therefore desirable to construct sets of rules for processing packet interactions in MCRT simulations that can accurately describe this physics.
One way to handle the three-level atom example would simply be to split the original interacting packet, whether a photon or energy packet, in proportion to the number of emitted photons or corresponding energy flow for each of the outgoing channels and continue the simulation. For the three-level case, this is quite feasible but the drawback of such an approach in general is that, for interactions with very many possible outcomes (e.g. atomic models with large numbers of levels) this can lead to a proliferation of packets that is computationally too expensive to follow. Moreover, it is non-trivial to generalise that approach to handle e.g. the inverse: our three-level atom ought to also be able to absorb 1 → 2 and 2 → 3, and then emit 3 → 1 (inverse fluorescence). How can this process be captured in such a redistribution scheme?
Following Lucy (1999b), key steps towards finding a general solution stem from the procedural approach outlined for the two-level atom above (see Sect. 7.4.1). In particular, a first generalisation of the effective resonance scattering treatment to multilevel atoms is to extend the traffic flow interpretation to include all possible transitions out from an excited atomic state. Specifically, when an energy packet is absorbed by a transition to some upper level u of an atom, our MC rule is to re-emit that energy packet in any of the transitions out of that level to a lower level l with probability q l proportional to the rate of energy flow in the transition: where ε uk is the difference in the energies of the levels (ε uk = ε u − ε k ). Note that we use the energy flow rates (rather than the pure transition rates) since our quanta are considered parcels of fixed energy (not photons). As shown by Lucy (1999b) and Mazzali (2000), this downbranching scheme already provides a huge improvement to the modelling of flux redistribution in SN spectra over a resonance scattering approximation. Indeed, as demonstrated by later studies (e.g. Kerzendorf and Sim 2014), this scheme performs extremely well even when compared to more complete treatments such as the full macro atom approach described below. Nevertheless, the Lucy (1999b) downbranching scheme is still not complete and does not address all the issues raised even by our simple three-level atom example (e.g. while it will reproduce flux redistribution between the 1 → 3 and 2 → 3 transitionsbecause they share an upper level-it does not connect the 1 → 2 transition to the cascade). A more complete solution that can incorporate all transitions in multi-level atoms was later devised by Lucy (2002Lucy ( , 2003 via what he called the macro atom method. This approach provides a generalised approach to formulating rules to process interactions of MC energy packets in accordance with the requirements of radiative and statistical equilibrium. In essence, we can view all of the possible excited levels of the matter as energy pools. Energy flows into/out of each pool via the set of transitions into that energy level and the equilibrium condition (namely that the energy associated with each such pool is stationary) is satisfied by imposing a traffic flow set of rules to process interactions for each possible energy level. The extra sophistication compared to the downbranching scheme is that we include the fact that physical processes represent not only energy flow to and from the radiation field, but also between the various energy pools associated with the different available levels of the atoms/ions/molecules in the medium. Expressions for the general macro atom transition probabilities and their interpretation are derived by Lucy (2002). We will not repeat the general case here but, in the example below, illustrate its application to our example three-level atom in order to clarify the principles.

Example and discussion: macro atom scheme for a three-level atom
Here we illustrate how the macro atom transition probabilities can be obtained for a three-level atom assuming radiative and statistical equilibrium (see Fig. 6). For simplicity we consider only bound-bound processes here and assume that all rates are dominated by radiation processes (i.e. neglect collisions and the associated coupling to the thermal energy pool; see Lucy 2003 and Sect. 7.4.5 for the more general case). Our aim is to formulate a set of rules to use during an indivisible energy packet MCRT simulation whenever the random walk experiment determines that a packet is absorbed by one of the three spectral lines (which have frequencies ν 12 , ν 13 and ν 23 , We start from the equations of statistical equilibrium applied to levels 1, 2 and 3 of the three-level atom. These can be expressed as: For each of the transitions we can also specify the rate at which energy is being transferred into or out of the population of atoms in each of the energy states (i.e. the rates of energy flow into and out of the pool of excitation energy associated with that Energy level diagram for the simple three-level atom used as an example for the formulation of macro atom rules state). Specifically, we identifẏ as the rates of energy flow from the radiation energy pool to the corresponding excitation energy pool for each of the three transitions. Similarly, the rates at which energy is converted back from excitation to radiation can be writteṅ Multiplying Eqs. (61)-(63) by ε 1 , ε 2 and ε 3 respectively, and using the definitions from Eqs. (64) and (65) leads, following some rearrangement, 24 to We now make a traffic flow interpretation of these equations to formulate rules for the interaction of packets in a MCRT simulation. In particular, by definition,Ȧ 12 ,Ȧ 13 andȦ 23 are the rates of energy flow from the radiation field to the atom levels via the specific transitions: thus absorption of radiation MC packets in our simulation is interpreted as the realisation of these terms. Similarly, each of theĖ terms represents the flow of energy to the radiation field in the corresponding transition: i.e. in the MCRT simulation, these terms correspond to injecting radiation packets. The interpretation of the remaining terms (all of form "energy × rate") can be made by recognising that each such term appears twice: always once on the left hand side (LHS) of one equation and once on the right hand side (RHS) of another. I.e. each of these terms can be regarded as a source term for energy in one level of the macro atom (appearing as an "incoming" energy term on the LHS, alongside the absorption from the radiation field) but simultaneously a sink term for another energy level. Accordingly, these terms are viewed as driving internal (radiationless) transitions between levels of the macro atom-they facilitate the energy flow between states such that the underlying equations of statistical equilibrium are conserved. These equations can thus be embedded in our MCRT simulation via the following algorithm: (A) Whenever an active radiation packet is absorbed by any of the three transitions we view this as a discrete realisation of the correspondingȦ term in the macro atom equation. We say that this process has activated a macro atom in the corresponding energy level. (B) We then inspect the sink terms (i.e. RHS terms) for the activated level of the macro atom and use a random number to select an outcome with probabilities proportional to the energy flows implied by the system of macro atom equations.
Thus, for example, if the macro atom is activated to level 2, with probabilitẏ E 21 /D 2 we select emission in the 2 → 1 transition, and with probabilities of ε 2 R 23 /D 2 and ε 1 R 21 /D 2 we select internal macro atom transitions 2 → 3 and 2 → 1, respectively (D 2 =Ė 21 + ε 2 R 23 + ε 1 R 21 is selected to normalise the probabilities correctly). (C) (i) If the selection corresponds to an emissionĖ term, the macro atom deactivates and the radiation packet is returned to the main MC simulation with new properties (photon frequency, direction etc.) set in accordance with the properties of the corresponding emission process. The total energy carried by the packet (in the CMF) remains equal to that when the packet was absorbed (in accordance with the requirements of RE). (ii) Alternatively, if an internal transition term is selected, the macro atom remains active but is switched from its current state to a new state in accordance with the selected term [e.g. selecting the ε 2 R 23 /D 2 term results in a transition from macro atom state 2 to state 3, conceptually representing the "sink" on the RHS of Eq. (67) into the matching "source" term on the LHS of Eq. (68)]. The algorithm then returns to step B and processes the activated macro atom again. This continues until deactivation occurs.
By repeated application of these rule for packet interactions, the activation and deactivation of macro atoms will encode both radiative and statistical equilibrium on the effective emissivity in the simulation. Several specific features of this macro atom algorithm are noteworthy of consideration for its appropriateness to a range of applications. We comment on some of these in the following. First, we note that all rates R i j are directly proportional to the level population n i , which would imply that, like the normal line emissivity, Eq. (58), determining the terms in the macro atom equations depends on already knowing the level populations. However, because of the normalisation process in step (B), this leading dependence cancels out from the transition probabilities. Of course, additional effects (e.g. corrections for stimulated emission to absorption rates or introduction of Sobolev escape probabilities; see Sect. 8.2) can still lead to dependencies on the populations. Nevertheless, cancelling of the leading-order effect means that the macro atom transition rates can be relatively well determined even in the absence of a converged set of level populations. This property can be rather powerful when treating complex systems for which exact calculations of excited state level populations (and therefore a direct evaluation of absolute emissivities) is challenging: as shown by Lucy (2002, fig.  5), even for a complicated ion such as Fe ii the macro atom scheme produces fairly accurate excited state effective emissivities without any iteration to determine level populations. 25 Second, we note that the first of the set of macro atom traffic flow equations [Eq. (66)] involves no activation (Ȧ) terms and no deactivation (Ė) terms: it is a balance only between internal transition rates. This makes sense because it follows from the equation of statistical equilibrium for the lowest energy state: there are no channels for absorption of energy directly to that state nor emission of energy directly from it. Moreover, we note that the choice ε 1 = 0 26 trivially satisfies Eq. (66) and also eliminates the corresponding internal transition terms from Eqs. (67) and (68). Making use of this definition will therefore (slightly) simplify the macro atom algorithm by effectively removing the need to explicitly consider the ground state.
Third, it can be seen that both of the simpler treatments introduced earlier for handling atomic line interactions are special cases of the full macro atom. Specifically, the effective resonance scattering approach used in several early studies (example in Sect. 7.4.1) is a two-level macro atom with ε 1 = 0. The downbranching scheme by Lucy (1999b) outlined in Sect. 7.4.3 is the macro atom scheme with all internal transition terms suppressed (formally, this can be derived from the general macro atom algorithm by assuming (i) downwards transition rate coefficients dominate and (ii) for all transitions between upper level u and lower level l, ε u ε l ). Repeated cycling through steps (B) and (C)(ii) in the algorithm above can make it computationally inefficient, particularly when the scheme is extended to also include coupling to the thermal pool. This can be addressed in several ways, however. As noted by Lucy (2002), the macro atom algorithm can be viewed as recursive application of the set of transition/deactivation probabilities and recently, Ergon et al. (2018) have presented a Markov-chain approach to the macro-atom machinery. This method effectively solves the problem without the need to follow internal macro-atom state transitions which can be a substantial advantage in terms of computational efficiency.

The thermal energy pool
The macro atom scheme is readily generalisable to include additional energy pools relevant to the simulation. In particular, the thermal pool of particle kinetic energies. In the nomenclature of Lucy (2002), interactions with the thermal pool are described as kinetic packet (k-packet) events, and the processing rules are derived by considering energy flow into and out of the k-packet pool. Here, the relevant "transition" processes are all heating and cooling rates corresponding to a flow of energy into and out of the thermal pool. These include, in particular, direct radiative heating rate (H R ), which includes processes such as free-free absorption, heating by collisional de-excitation of e.g. atomic states (H C ), and their inverse cooling processes (rates C R and C C ).
Energy flow through the thermal pool is governed by an assumption of TE (analogous to the assumption of statistical equilibrium for the atomic energy levels descried in the macro atom scheme): Thus, whenever a physical process representing the flow of energy into the thermal pool (e.g. absorption of a MC radiation energy packet via a heating process which is a realisation of the H R term), the process governing the fate of that energy is determined by randomly sampling all available cooling processes with probabilities proportional to their respective rates. This can lead either to direct remission of a radiation energy packet with photon-frequency, direction etc. randomly assigned by one of the radiative cooling processes (i.e. simulating part of the C R term) or by activation of a macro atom (associated with the collisional cooling process, C C ). We note that the scheme can also be applied to physical processes that involve cross-talk between multiple energy pools: for example, bound-free processes, which involve both changes in the populations of atomic states and heating/cooling (see Lucy 2003).

Indivisible energy packets beyond radiative equilibrium
In the preceding sections we have discussed how equilibrium assumptions (radiative, statistical, thermal) can be used to devise rules for MCRT algorithms that can handle complicated non-resonance scattering/fluorescent processes without sacrificing rigorous conservation of energy. Formulated in this way, however, such approaches will not be immediately suitable for problems in which the corresponding equilibrium assumption is not satisfied. Nevertheless, the scheme can be generalised to include appropriate source or sink terms. Consider, for example, material in a stellar outflow that is irradiated (e.g. by the stellar photosphere) but is also heated by external non-radiative (H E ) processes (examples might include hydrodynamical or magnetohydrodynamical heating) and cooled by expansion (C E ). In such a case, radiation processes may still be important but RE is no longer well justified. If cooling time scales are sufficiently short that TE can be sustained, the heating and cooling balance might be expressed at every point in the medium: In an indivisible energy packet MCRT scheme, the H R and H C (local radiative and collisional heating) terms are realised via the absorption of radiation energy packets via radiation-matter processes that directly transfer radiative energy to the kinetic particle pool (H R ) and collisional deexcitation of atomic states (H C ). The H E term can then be added to the problem as an external source of packets injected in the course of the simulation-the rate of injection being determined by knowledge of the external process responsible for H E . In this particular case, because the external source is a heating term, the energy of those new packets is initially injected directly to the pool of kinetic energies (i.e. k-packet pool) and the subsequent properties of those packets followed via the usual traffic flow interpretation of the equation.
As before, the three terms on the RHS of Eq. (70) are the sink terms for the k-packet pool and so they are sampled to determine the manner in which the energy flow out of the k-packet pool behaves. C R and C C can be simulated just as before: packets are fed back to the radiation field (C R ) or to the excitation energy of macro atom pools via collisional excitation (C C ). The additional term, C E , can be treated as a true external sink term: i.e. with probability C E /(C R + C C + C E ) energy packets that flow into the thermal pool are terminated. Alternatively, this term could be treated via a reduction in packet energies: i.e. one could opt to sample the RHS of Eq. (70) only considering C R and C C but reduce the energy of all packets processed through this channel by a factor of (C R + C E )/(C R + C E + C C ). We note that, in the limit where the external sources and sink term (H E and C E above) become dominant, an indivisible energy packet simulation performed with this machinery will essentially reproduce the elementary scheme explained in Sect. 7.1.
The example above illustrates how the indivisible packet scheme can be altered to take account of specific departures from RE, and this is done in many of the existing implementations of this method, particularly to account for adiabatic cooling (e.g. Long and Knigge 2002;Kasen et al. 2006;Kromer 2009;Vogl et al. 2019). In principle a similar logic could be employed to deal with departures from statistical equilibrium (affecting the macro atom transition rules) or TE (further affecting the k-packet transition rules). Specifically, if the inflow and outflow rates are not in balance such that there is a net rate-of-change of the energy reservoir, then terms representing the ongoing accumulation (or loss) of energy from the pool could be built into the formulation (i.e. retain terms including the derivatives of the level populations and/or the kinetic temperature). Provided that values for those derivatives are known they could also then be included in the packet flow. To our knowledge, however, extensions of the macro atom/k-packet schemes that consider such terms have not yet been implemented.

MCRT: application in outflows and explosions
A prominent field in astrophysics, in which MCRT methods are very popular and successful, is the study of fast mass outflows. For example, MCRT schemes can be used to calculate mass-loss rates and the structure of hot-star winds (see e.g. Abbott and Lucy 1985;Lucy and Abbott 1993;Schaerer and Schmutz 1994;Schmutz 1997;Vink et al. 1999Vink et al. , 2000Sim 2004;Müller and Vink 2008;Muijres et al. 2012b;Lucy 2012b;Noebauer and Sim 2015;Vink 2018), to determine synthetic light curves and spectra for SNe (see e.g. Mazzali and Lucy 1993;Lucy 1999aLucy , 2005Mazzali 2000;Kasen et al. 2006;Sim 2007;Kromer and Sim 2009;Wollaeger et al. 2013;Wollaeger and van Rossum 2014;Kerzendorf and Sim 2014;Bulla et al. 2015;Magee et al. 2018) or to treat RT in winds emanating from accretion discs of cataclysmic variables (Knigge et al. 1995;Long and Knigge 2002;Noebauer et al. 2010;Kusterer et al. 2014;Matthews et al. 2015) or active galactic nuclei (Sim 2005;Sim et al. 2010Sim et al. , 2012Higginbottom et al. 2013;Matthews et al. 2016Matthews et al. , 2017Tomaru et al. 2018). In applications such as these, our current implicit assumption, namely that RT occurs in static media or in environments with material velocities low enough to be safely ignored, can no longer be maintained. Instead, special relativistic effects play an important role and have to be taken into account. In the following, we outline some important aspects of performing MCRT in moving media. While many of the described concepts are generic, the treatment of line interactions using the Sobolev approximation (see Sect. 8.2) is specific to MCRT in expanding media, such as SN ejecta or winds.

The mixed-frame approach
As introduced in Sect. 2, there are two fundamental frames of reference for RT, namely the LF and the local rest frame (CMF). Until this point, we have largely ignored the distinction between these frames since RT was assumed to occur in static media or in low-velocity environments. When the material velocities become large, however, this simplification is no longer justified. In these situations, MCRT schemes often rely on a so-called mixed-frame approach (see for example Lucy 2005). This exploits the fact that the handling of different tasks involved in MCRT simulations is easier in one or other of the two frames. Specifically, the spatial and temporal mesh is usually defined in the lab frame, making it most convenient for measuring distances and thus for tracking packets and simulating their propagation. Radiation-matter interactions, on the other hand, are more easily described in the local rest frame of the material. Here, the material functions take their simplest form. Consequently, MCRT schemes adopting the mixed-frame approach propagate packets in the LF but treat all interactions in the CMF.
The inclusion of relativistic effects during the LF-CMF transformation can be performed to varying degrees of accuracy. We first focus on arguably the most important effect in the presence of matter flows, namely the Doppler effect. For this illustration, we again assume a simplified situation of coherent and isotropic scattering. In moving media, these assumptions will only ever hold in the CMF. Whenever a MC packet interacts, its properties are transformed into the CMF, where the interaction is simulated and post-interaction properties are assigned. Adopting the convention introduced in Sect. 2 and denoting quantities measured in the CMF by a 0 subscript, the incident CMF frequency of an interacting packet is Likewise, it carries an energy in the CMF. Here, the additional subscripts (i and e) are used to denote incident and emergent packet properties with respect to the interaction. Performing interactions in the CMF has the benefit that, for our example of coherent scattering, energy is conserved in this frame and thus ε 0,e = ε 0,i .
Similarly, the packet frequency remains constant during the interaction in the CMF 27 ν 0,e = ν 0,e .
In many applications, the frequency shifts due to the Doppler effect will be the most significant consequence of the material motion. However, for a detailed relativistic treatment, achieving at least O(v/c) accuracy, additional effects have to be taken into account. In particular, aberration affects the propagation direction of packets and has to be taken into account when transforming the incident direction into the CMF and the emergent direction back into the LF (cf. Mihalas and Auer 2001) Also, care has to be taken when calculating the accumulated optical depth. Since the propagation is carried out in the LF, the integration is best performed in this frame as well. For this, opacities have to be properly transformed into the LF using Eq. (14). The optical depth integration is further complicated in the presence of matter flows by the continuous change in CMF frequency as packets propagate through material with varying velocities. Thus, Eq. (44) becomes which accounts for the CMF frequency change along the trajectory by dl/dν 0 and includes a Doppler factor (ν 0 /ν) due to the transformation of the opacity into the CMF. Performing this integration for non-trivial opacity laws and general flow patterns is very challenging and computationally demanding since it has to be carried out frequently during the propagation process of each packet. However, for an important class of astrophysical applications of MCRT, the Sobolev approximation can be adopted and drastically reduces the complexity of the integration by turning it into a purely local problem. We elaborate on this approach in Sect. 8.2. Finally, we note that one also has to decide in which frame the MC packets are launched during the initialisation. Often, the CMF is the natural choice for this process, e.g. when representing a thermal radiation field. In such cases, packet properties are drawn in the CMF and then transformed into the LF using the rules given here and in Sect. 2 before starting the propagation.

Line interactions in outflows
As described above, treating frequency-dependent opacities in the presence of large material velocities is challenging. However, in the case of bound-bound processes, the situation can be significantly simplified with the so-called Sobolev approximation (Sobolev 1960). Indeed, RT through fast expanding mass outflows is the classical example for the use of the Sobolev approximation. We refrain from a detailed description of Sobolev theory since it is a widely used technique in astrophysical RT problems (see e.g. Castor 2007 for a general overview of the approximation and Hummer 1978, 1983;Hummer and Rybicki 1985;Jeffery 1993Jeffery , 1995 for various extensions of the original formulation) but highlight some key aspects and describe how a Sobolev line interaction scheme can be easily incorporated into MCRT simulations for fast outflows. An illustrative overview of this approximation can be found in Lamers and Cassinelli (1999).
Bound-bound processes are fundamentally resonant processes in the sense that only photons, and thus MC packets, with CMF frequencies in a small window can perform these interactions. The frequency dependence of the bound-bound opacity is encoded in the line profile function, ψ, which is narrowly peaked about the natural line frequency corresponding to the energy separation of the two atomic levels, ε l and ε u , connected by the line transition. The width of the line transition is mainly a result of the turbulent and thermal motion of the atoms. Together with the local gradient of the velocity field it can be translated into a characteristic length scale, the so-called Sobolev length, over which the photon shifts into and out of resonance with the line in the CMF. This scale is compared with the typical length over which the plasma state, and thus the frequencyindependent parts of the line opacity change. If the Sobolev length is much smaller, the line profile can be approximated by a delta-distribution around ν lu . As a consequence, the optical depth integration in Eq. (79) collapses and turns into an expression that only depends on the material state at the so-called Sobolev point. At this location, the CMF frequency of the photon exactly equals ν lu . We note that in addition to the condition concerning plasma state changes, the traditional Sobolev approximation can only be applied to environments with a monotonous velocity gradient. 28 Whenever the Sobolev approximation can be adopted, MCRT simulations that include line interactions are dramatically simplified. In addition to reducing the calculation of the line optical depth to a local problem, line overlaps are eliminated within the Sobolev approximation. Since photons continuously red shift in monotonously expanding flows (or blue shift in compression flows), they successively scan over the possible line transitions in the CMF one-by-one. This reduces the book-keeping effort in MCRT simulations and suggests storing the line transitions in a frequency-ordered list (Lucy 1999b). In outlining the basic MCRT propagation routine for such cases, which has been developed by Abbott and Lucy (1985) and Mazzali and Lucy (1993), we assume that in addition to line interactions, only frequency-independent continuum processes, such as electron scattering in the Thomson limit, contribute to the total opacity. As the packet propagates, it continuously accumulates optical depth due to continuum processes (see Sect. 6.1). Whenever the packet reaches the Sobolev point of the next line transition, i.e. when its local CMF frequency equals the natural frequency of the line, the accumulated optical depth is instantaneously incremented by the full line opacity of the transition m e c f lu n l 1 − n u n l g l g u r s dl dν 0 r s .
Here, e and m e denote the elementary charge and the mass of the electron, f lu is the absorption oscillator strength of the transition from the lower to the upper energy lever, l → u, and n l,u and g l,u are the population numbers and statistical weights of these levels. The subscribed r s denotes that the plasma state entering the optical depth calculation is evaluated locally at the Sobolev point. In one-dimensional geometries, the derivative in Eq. (81) simplifies to The optical depth accumulation procedure is further illustrated in Fig. 7. The decision about the nature of the next interaction a MC packet experiences is based on whether the assigned optical depth value is surpassed between Sobolev points or during the instantaneous increments at one of these resonance points. In the former, the packet is simply moved by the distance l given by Eq. (44), which now only involves continuum opacity. At the interaction site, the packet properties are changed according to the specific continuum process. If the optical depth value is reached at a Sobolev point, however, the packet is moved to this location and performs the corresponding line interaction. It should be noted that the re-emission direction for the packet should be assigned according to the Sobolev escape probability For the particular case of an homologous flow, v r = dv dr = const., the Sobolev optical depth becomes independent of direction and the escape probability is isotropic. However, in more general velocity fields the Sobolev escape probability Fig. 7 Illustration of the optical depth accumulation process devised by Abbott and Lucy (1985) and Mazzali and Lucy (1993) for MCRT simulations relying on the Sobolev approximation. At the Sobolev points, S 1 and S 2 , the accumulated optical depth is instantaneously incremented by the full line optical depth. In addition four possible outcomes of the decision process about the next interactions are shown. For τ 1 and τ 3 , the packet would experience a continuum interaction at l 1 and l 3 respectively. In the remaining cases, i.e. for τ 2 and τ 4 , the packet undergoes a line interaction at the respective Sobolev points l S,1 and l S,2 . This figure is loosely adapted from Mazzali and Lucy (1993, Fig. 1) will vary with direction and must be sampled whenever directions for re-emission of packets are needed.
Many MCRT applications in outflows adopt the Sobolev approximation and follow a line interaction scheme similar to the one just outlined. Examples include the studies by Abbott and Lucy (1985), Lucy and Abbott (1993), Vink et al. (1999), Sim (2004), Noebauer and Sim (2015) dealing with hot star winds, or the works by Long and Knigge (2002) performing MCRT in disc winds and Mazzali and Lucy (1993), Mazzali (2000), Kasen et al. (2006), Sim (2007), Kromer and Sim (2009), Kerzendorf and Sim (2014) who use MCRT in SN ejecta. There are several studies that treat line interactions without relying on the Sobolev approximation such as Knigge et al. (1995) and Kusterer et al. (2014). Here, the conceptual and computational effort is, however, significantly higher.
To demonstrate the use of the Sobolev approximation in MCRT, we describe a simple test simulation to calculate the H Lyman α line profile formed in a homologous flow composed of only neutral hydrogen in Appendix A.2. This leads to the line profile shown in Fig. 8. For comparison, the analytic solution for this test problem is included. 29 The collection of tools available as part of this review contains a simple Python implementation of this MCRT simulation (see Appendix B).  Jeffery and Branch (1990). Note that the emergent spectra have been normalized by dividing by the incident thermal radiation field F cont ν

MCRT and expansion work
In RE, packet energy is conserved in the CMF during interactions, which partly motivates the introduction of the mixed-frame approach for MCRT in moving media. We emphasize, however, that packet energy conservation does not necessarily hold in the LF. In fact, depending on the flow of radiation relative to the moving ambient material, photons may either lose or gain energy in interactions. This is a crucial process in astrophysical applications involving strong mass outflows, for example hot-star winds. Here, photons collectively lose energy in interactions by performing expansion work, ultimately driving and maintaining the outflow (cf. Lamers and Cassinelli 1999;Puls et al. 2008). In the following, we briefly demonstrate that MCRT schemes adopting the mixed-frame approach readily capture this work term (indeed this was one of the original motivating factors for the approach; Abbott and Lucy 1985).
When a packet interacts, the mixed-frame MCRT approach switches to the local CMF to perform the interaction and update the packet properties before returning into the LF and continuing the propagation. Considering again isotropic and resonant scatterings for illustrative purposes, the LF energy of a packet changes during the interaction from ε i to ε e according to Depending on the orientation of the propagation direction prior and after the scattering relative to the material flow, the packet thus gains or loses energy in the LF. In astrophysical mass outflows, radiation is typically emitted from a source at the base of the flow, for example by a central star or an accretion disc. As a consequence, photons predominantly propagate in the direction of the flow before they interact. Electron scatterings in the Thomson limit or resonant line interactions, two processes playing important roles in mass outflows, are either approximately isotropic or exhibit at least a re-emission profile that is forwards-backwards symmetric. Thus, a radiation field that is initially predominantly aligned with the expanding flow will be partly diffused by the interactions and packets, on average, lose energy in the LF. This process can be further illustrated by considering the mean packet energy,ε e , after the first interaction, which is obtained by averaging over all re-emission directions. Specifically, in terms of the incident and emergent direction cosines (μ i and μ e ), where, for simplicity, we have assumed spherical symmetry and have neglected aberration. Performing the integration gives For small values of β, this reduces tō but in general,ε e ε i < 1 for β > 0 holds for incident photons propagating in the direction of the flow, i.e. with μ i = 1. As a final illustration for the energy loss experienced by MC packets in MCRT calculations in mass outflows, we present the model SN calculation described by Lucy (2005). This test, which is described in detail in Appendix A.3, constitutes a simplified and idealised view of RT in SN Ia ejecta. We use the code Mcrh (see Noebauer et al. 2012) to perform the MCRT simulation for this test problem as described in Appendix A.3. In Fig. 9, the synthetic bolometric light curve from this test calculation is shown. Following Lucy (2005), we illustrate the different energy flow terms in the simulation in Fig. 10. In particular, the energy currently stored in the radiation field, E R (t), is shown. In addition, the total energy that has escaped through the ejecta surface, E ∞ (t), the total energy generated in radioactive decays, E γ (t), and the total work performed by the radiation field, W (t), are given. These three quantities represent cumulative measures over the time interval from 0 to t. All quantities are simply reconstructed from the MCRT simulation by counting appropriate packet energies. In particular, the work term is obtained by summing up the difference between incident  Lucy (2005). In addition to the evolution of the radiation emerging from the model SN (blue), the rate of energy released in the radioactive decay chain of 56 Ni (orange) and the rate at which γ packets deposit their energy in the ultraviolet-optical-infrared radiation field (green) are shown. See Appendix A.3 for more details on the test problem setup and the MCRT calculation and emergent packet energies in the LF during each scattering. In addition, Fig. 10 contains the imbalance between the source and sink terms of radiation energy, i.e. between E γ (t) on the one hand and E ∞ (t) and E R (t) on the other hand. This quantity perfectly follows the reconstructed work term, W . For more technical details about this test, we refer to the original works by Lucy (2005)

Extracting information from MCRT simulations
With the algorithms outlined above, the flight paths of MC quanta can be determined and tracked during MCRT simulations. In general, the individual trajectories are not of primary interest. Rather, meaningful information that effectively represents the radiation field needs to be extracted from them. In some cases, only radiation escaping from the simulation box may be of interest to construct synthetic spectra, light curves or images. For other applications, the most important outcome may be a characterisation of the radiation field internal to the system. In this part of the review, we present a number of common approaches that can be used to extract physical information from MCRT simulations. We preface this by a brief discussion of MC noise, which is a fundamental, often undesired property of MCRT simulations that motivates the design of the extraction techniques described below.

MC noise
MCRT simulations are probabilistic by nature. Consequently, results obtained with these approaches will generally be subject to stochastic fluctuations. This fundamental Fig. 10 Energy flows in the test problem devised by Lucy (2005). E γ shows the energy that has been released by radioactive decays, E R indicated the energy currently stored in the radiation field, and E ∞ is the energy that has escaped to infinity. In addition to the work term, W (t) obtained directly from the MCRT simulation by balancing the emergent and incident LF packet energies in the interactions (red), the imbalance between E γ and E ∞ and E R is shown (dashed black). These two quantities agree very well, demonstrating the mixed-frame MCRT approach captures the expansion work term and conserves energy and inherent property of MC calculations is often referred to as Monte Carlo noise or simply noise. Here, we briefly present the basic behaviour of this noise component and discuss the implications for devising techniques to extract or reconstruct physical information from a MC simulation. More details about this subject may be found in the standard literature, e.g. in Kalos and Whitlock (2008).
In general, one exploits the law of large numbers when reconstructing physical information from MCRT calculations. To illustrate this, we consider extracting a specific physical property from the simulation (e.g. the escape probability from a homogeneous sphere: see Appendix A.1). This quantity has to be related to a particular behaviour or property of the MC quanta (e.g. a packet emerges from the sphere or not). We now assume that the process of the quanta performing this behaviour or its property taking a specific value is expressed by the random variable X with a probability density of ρ X (x). Considering the fate or measuring a property of an individual packet will not result in conclusive statements about X (i.e. we cannot make meaningful statements about the escape probability by considering whether one particular packet emerged from the sphere or not). However, if this process is repeated and performed for the entire packet population, the law of large numbers ensures that the resulting average converges towards the expectation value of X . 30 Consequently, the extraction of physical information from a MCRT calculation can typically be described mathematically by To estimate the uncertainty associated with such an MC estimate, we rely on the central limit theorem. 31 According to this statement, the MC estimator process G N will be governed by a normal distribution in the limit of infinite contributions (N → ∞) and its standard deviation or standard error will follow Here, σ X denotes the standard error of the individual process X i . The first natural implication of this fundamental MC error behaviour is that the accuracy improves when the number of contributions to the estimator given by Eq. (90) increases. This is typically achieved by increasing the number of MC quanta in the simulation and is illustrated in Fig. 11 by the example of determining the escape probability for the homogeneous sphere problem (see Appendix A.1). However, since this improvement scales only with N −1/2 , efficient and effective reconstruction techniques strive towards maximising the number of contributions to the estimator procedure for a given computational effort, which is often equivalent to considering a certain number of MC quanta in a simulation. These methods are also often referred to as acceleration techniques since they achieve a certain signal-to-noise level with fewer quanta and thus by spending less computational effort. In the following, a variety of such strategies for the extraction of physical information from MCRT calculations and reducing the stochastic fluctuations are presented. We focus first on simple counting techniques, then turn to the volume-based estimator approaches introduced by Lucy (1999a) and finally introduce the widely-used acceleration concept of biasing.

Direct counting of packets
Undoubtedly the most obvious and straight-forward approach to reconstruct a physical property of the radiation field (or any associated RT process) from the ensemble of packet trajectories is to simply count the relevant packet properties or packet interaction events. For example, a simple synthetic image of an astrophysical system can be produced by recording all packets emerging from the domain through the surface element ΔS i during a time interval Δt then binning them according to their propagation directions Here, the summation only involves packets that escape through the surface element ΔS i into the solid angle element ΔΩ i . Likewise, internal properties can be reconstructed by querying the positions of all packets at a given instant during a time-dependent simulation and then summing up  91), in case of determining the escape probability from a homogeneous sphere (cf. Appendix A.1). In particular a sphere with optical depth τ sp = 1 was considered and the escape probability determined for different numbers of MC quanta. The standard error σ 1000 was determined after repeating each experiment 1000 times and follows the expected behaviour almost perfectly the relevant properties of all packets that are currently located within a certain control volume ΔV . In particular, the radiation field energy density in a grid cell i with the volume ΔV i can be determined by following this strategy. Here, the summation over j involves all packets that at t = t n are within cell i. By analogy, any quantity that involves radiation-matter interactions, such as the amount of absorbed radiant energy, can be determined by counting all absorptions packets perform during a certain time interval (e.g. the duration of a simulation time step). While this reconstruction approach is very intuitive, it is also the least sophisticated and does not optimally use the information contained in MCRT simulations. In general, a large number of packets will be needed to achieve acceptable results since the approach requires that a sufficient number of packets propagate into the desired direction, are at a certain location or have performed a particular interaction (or combination of all these). Fulfilling this requirement becomes even more challenging when fully three-dimensional and/or frequency-dependent calculations are performed. As a consequence, the utility of the direct packet counting technique is often limited due to strong noise in the reconstructed quantities. Still, this approach can be of use as a reference when testing and verifying more complex reconstruction techniques. Moreover, the quality of direct counting estimates can be vastly improved when combined with biasing techniques, which will be introduced below.

Volume-based estimators
Lucy (1999a) introduced a technique to reconstruct properties of the internal radiation field that is less vulnerable to noise than direct counting approaches since information along the entire packet trajectory is used instead of only a momentary snapshot of the packet distribution. These techniques have been refined by Lucy (1999bLucy ( , 2003Lucy ( , 2005 and are often referred to as volume-based estimators. 32 The effective use of such estimators has been a key consideration for many MCRT studies relying on Lucy's approach (e.g. Sim 2004Sim , 2007Kasen et al. 2006;Kromer and Sim 2009;Harries 2011Harries , 2015Noebauer et al. 2012;Kerzendorf and Sim 2014). The volume-based estimator approach rests on the idea that instead of considering packets at certain discrete instances, time-averaged estimates of radiation field properties can be constructed by incorporating information from the full packet propagation path. The fundamental notion is that the packet flight histories form an ensemble of trajectory elements that statistically represent the radiation field. To better illustrate this principle, we follow Lucy (1999a) and repeat the formulation of a volume-based estimator for the radiation field energy density.

Example: formulation of volume-based estimator for the radiation energy density
We consider the trajectory of a packet with energy ε propagating during a simulation time interval of Δt. 33 In general, this trajectory will consist of multiple separate segments that correspond to the flight of the packet between events in the simulation (i.e. neglecting general relativistic light bending, the photon packet trajectory will be a sequence of straight line segments between scattering/interaction points, cell boundary crossing events etc.). We denote the time the energy packet spends on any particular segment j by δt j . Each packet trajectory segment contributes to the total radiation energy content with its packet energy, weighted by the relative time spent on that trajectory: i.e. ΔE = δt j Δt ε j . Thus the implied total energy density for a grid cell i of volume ΔV i in a simulation may be constructed from a volume-based estimator obtained by summing over all trajectory elements of all packets that were active in the cell: As packets propagate at the speed of light, the estimator can be expressed in terms of trajectory segment length l = cδt Here and in Eq. (94), the summation includes the trajectory segments j of all packets that lie within the cell i. We stress that the ensemble of segments contributing to these sums includes all packet trajectories between events, both physical interactions, like scatterings, and numerical events, such as grid cell boundary crossings. We also note that, although our presentation derives from a time-based formulation, the resulting estimator depends only on the ratio ε j /Δt and so can be applied without adjustment to steady-state RT problems (i.e. where time steps need not explicitly appear in the algorithm). In such cases the problem will involve a fixed luminosity, and the packet energies will be normalised to correspond to a pre-determined or arbitrary time interval. In Eq. (94), the choice of this normalisation time interval will be inconsequential: the value of the estimator will depend only on the ratio of the packet energies to the duration of the simulation time interval (i.e. sensitive to the luminosity but neither to the absolute packet energies nor absolute time interval). The advantage of the volume-based estimator scheme compared to simple direct counting measurements is two-fold. First, a single packet can contribute to the estimators in multiple cells, provided that its trajectory intersects these cells during the time step. Second, the same packet can in principle contribute repeatedly to the estimator in a specific cell, if it is scattered in the cell or backscattered from a different cell. Both features of the volume-based estimator scheme drastically increase statistics and thus reduce the amount of MC noise in the reconstructed quantity. Also, this technique reduces the risk of obtaining undetermined results. In the direct counting approach, at least one packet must reside in the cell at the instant considered to obtain a non-zero result. This condition is mitigated to the much less restrictive requirement that at least one packet has resided in the cell at any point during the time step.

Constructing volume-based estimators: radiation field quantities
Having established a volume-based estimator reconstruction scheme for E and having appreciated the benefits such an approach offers, other radiation field properties can be determined in a similar manner. For this purpose, the relationship can be used together with Eq. (95) to obtain an estimator for the mean intensity (cf. Lucy 1999a) Once the summation is restricted to contributions of trajectory segments which point into a certain solid angle element ΔΩ k , the specific intensity can be reconstructed by means of a volume-based estimator (see e.g. Lucy 2005). Similarly, a further restriction to segments of packets with a frequency in the interval [ν, ν + Δν], allows monochromatic radiation field properties to be determined. For example the monochromatic specific intensity Volume-based estimators for moments of the specific intensity (J , H and K ) can now be easily formulated by including powers of the propagation direction and summing over all directions. Following this procedure, for example the total radiation flux can be estimated with (see e.g. Noebauer et al. 2012) To demonstrate the capabilities of the volume-based estimator approach to accurately track the characteristics of the radiation field, we perform a MCRT test simulation of the homogeneous sphere problem presented in Appendix A.1. Adopting the parameters suggested by Abdikamalov et al. (2012) and listed in Appendix A.1, we perform a simple time-independent MCRT simulation in spherical symmetry, injecting packets according to the local emissivity and following them until they either escape from the computational domain or are absorbed. During their propagation paths, volume-based estimators for J , H , and K are continuously incremented. These first three moments of the specific intensity are shown in Fig. 12. Within the inherent MC noise (indicated by uncertainty bands 34 ), the estimators agree very well with the analytic solution as outlined in Appendix A.1. An example implementation of how the reconstruction of the moments may be achieved within the volume-based estimator formalism can be found in the Python program designed for this test problem and distributed with the tools repository (cf. Appendix B). Specifically, this task is performed by the routine update_estimators in the mcrt_hom_sphere.py program. Note that due to spherical geometry, not the instantaneous value of the direction cosine but its mean value along the trajectory segment (mu_mean) appears in the estimator increments.

Constructing volume-based estimators: extracting physical rates
For many problems, simulation of the radiation field serves not only to predict synthetic observables but also to determine thermodynamic conditions of the astrophysical Fig. 12 Volume-based estimators for J , H , and K in the homogeneous sphere test (see Appendix A.1). These are shown relative to the adopted source function, S, which is assumed to be uniform throughout the sphere (see Appendix A.1). The 1 σ and 2 σ confidence bands (dark and lightly-shaded regions) have been constructed by performing ten MCRT simulations with 10 5 packets each and different RNG seeds. The open symbols represent the mean values for the radiation field moments in the different shells. Additionally, the analytic solution according to Eqs. (A.5)-(A.10) is included in dashed black. The Python tool with which these MCRT simulations have been performed is part of the source code repository distributed with this work (see Appendix B) plasma: e.g. often the radiation field is crucial for determining the ionization/excitation state and heating (e.g. Mazzali and Lucy 1993;Bjorkman and Wood 2001;Long and Knigge 2002;Ercolano et al. 2003Ercolano et al. , 2005Ercolano et al. , 2008. In such cases, we therefore wish to extract information on the relevant rates of physical processes in the simulations. Following the principle outlined in Sect. 9.2, this could be achieved simply by counting the rate at which individual packet events corresponding to the process in question occur during the simulation. However, such an approach relies on a sufficient number of such interactions happening to achieve acceptable statistics and an accurate result. This becomes very challenging in optically thin regions since very few packets or even none interact. Again, the volume-based estimator approach offers a significant improvement since it takes a broader view and includes the information encoded in the entire packet propagation paths instead of only considering a series of isolated snapshots. In particular, a volume-based estimator can be formulated for any quantity that depends on the radiation field by applying constructions similar to those outlined in Sect. 9.3. The general principle will be that the rate of energy extracted from the radiation field by some process can be described in terms of a sum over packet trajectories weighted by the appropriate absorption coefficient. These energy flow rates can then be recast in other forms (e.g. transitions rates), as required. Fig. 13 Illustration of the benefits of the volume-based estimator approach (right panel) compared to direct counting techniques (left panel). The trajectory of a single packet is shown (blue), which is absorbed at the location denoted by a filled circle after passing trough a number of cells. While in the direct counting approach this packet only contributes to the heating rate estimator in the final grid cell, it adds to the local heating rate in all cells it passed through in the volume-based estimator scheme. The relative contribution to the heating rate in each cell is encoded by the color transparency in this sketch

Example: photoionization rate estimators
As a concrete example, we illustrate the case of extracting a particular photoionization rate from a simulation. In particular, the photoionization rate coefficient 35 can be written (101) where ν th is the threshold frequency and σ ν the cross section for photoionization at photon frequency ν. Into this expression we substitute our expression for the MC volume-based estimator for the relevant moment of the radiation field, in this case Eq. (97), and immediately obtain our estimator for γ We note that the estimator runs over all packet trajectories on which the packet frequency is above the photoionization edge (ν > ν th ) and each contribution to the sum is multiplied by a factor that depends on the cross-section of the process for the contributing packet.
With the possibility that all packet propagation segments can in principle contribute to the estimator, the reconstruction of interaction-based radiation field properties yields non-zero results as long as at least one packet entered the grid cell of interest. This significant advantage of the volume-based estimator approach (compared to estimating rates of processes by direct counting) is illustrated in Fig. 13.

Volume-based estimators for energy and momentum flow
The principle outlined above, and illustrated by the photoionization example is readily generalised to provide estimators for the rate of any physical process of interest (see e.g. Lucy 2003) that can be cast in terms of energy transfer from the radiation field. The principle is always the same: the optical depth of each segment is calculated which can be interpreted as the expected number of interactions a packet would on average experience when propagating this distance. This information is then used to scale the contribution of each segment to the estimator and determine the amount of radiant energy that is absorbed by the process. Of particular relevance to many problems, including radiation hydrodynamics, is the total rate at which energy is transferred from the radiation field into the ambient material, i.e. the heating rate. If the heating process is for example described by a pure absorption coefficient χ , then the amount of radiant energy absorbed is (see Lucy 1999a) We note that the form of this estimator is very similar to that used to reconstruct J itself, Eq. (97), which is to be expected given the close relationship of J to the rate of any radiative heating processes. This idea also readily generalises to provide volume-based estimators for momentum transfer (see e.g. Noebauer et al. 2012;Roth and Kasen 2015), which are instrumental for MC-based Radiation Hydrodynamics (RH) calculations (see Sect. 11). For continuum driving, the form of these estimators is very similar to the F estimator, Eq. (100). Lucy (1999b) used similar considerations for the formulation of estimators in applications which are dominated by atomic line interactions that are treated in the Sobolev limit, such as stellar winds or the ejecta of thermonuclear SNe. Here, the formulation is slightly more complicated and the form of the energy/momentum flow rate estimators is different from those outlined above: they are formed as summations over all packets that have come into Sobolev resonance within a grid cell (see also Sim 2004;Noebauer and Sim 2015). The principal advantage compared to direct counting still applies since all resonances contribute, regardless of whether the packet actually undergoes interaction. Given the potential importance of forests of weak lines to heating/driving of outflows, as is for example the case in hot star winds where many weak iron lines drive the outflow (Vink et al. 1999), this is a critical advantage.

Biasing
In many MCRT applications, only a subset of the packet population is of interest. For example, when creating a synthetic image, only packets that escape towards the virtual observer are relevant. It is therefore desirable to selectively invest more effort into propagating packets that are crucial for the determination of the quantity or process of interest instead of treating packets that do not contribute. This selective increase in statistics can be achieved with the help of so-called biasing techniques. The underlying basic principle is known as importance sampling in the field of MC integration.
The key concept of biasing techniques is to increase the occurrence of desired packet properties by introducing a new, biasing PDF, q(x), that emphasises the correspond-ing regions in the parameter space. We then sample from this PDF rather than from the physical one, ρ(x), in the random processes governing the MCRT procedure. In order to ensure physical consistency, the packet weights have to be adjusted to counterbalance the artificial over-(and under-) representation of samples from particular parameter space regions. In particular, the packet weight in the absence of biasing, w nb , is replaced by Biasing techniques are among the most popular and widely-used variance reduction methods (see e.g. Carter and Cashwell 1975;Dupree and Fraley 2002). In the following, some of the popular biasing techniques used in astrophysical applications are briefly described. Most of the presented schemes are designed to address challenges commonly encountered in dust RT, since biasing techniques are heavily used in this branch of astrophysical research (see e.g. the overview by Steinacker et al. 2013).

Biased emission
Biased emission is a simple but powerful illustration of a biasing scheme. This approach helps in problems where we wish to accurately describe the emission from sources with very different emissivities. These can be external sources, such as stars irradiating some environment, or simply the internal emissivity of the ambient material occupying grid cells of a computational mesh. Biased emission is frequently used in dust RT, for example by Yusef-Zadeh et al. (1984) and Juvela (2005). A detailed account of the technique is given by Baes et al. (2016).
For illustrative purposes, we consider a problem that only involves two sources, with luminosities L 1 and L 2 , and wish to study the case where L 1 L 2 . One approach to simulate the emission, is to spawn N MC packets, each with equal energy (i.e. weight) Here, the total luminosity (L = L 1 + L 2 ) and the physical duration corresponding to the MCRT simulation (Δt) appear. Each of these packets is now associated with one of the two sources according to the discrete probabilities For L 1 L 2 , this leads to a very uneven distribution of packets, with the weaker source being represented by very few packets. In the biased emission approach, an alternative PDF is introduced that increases the association with the weaker source. One possibility would be to choose the uniform distribution In this case, equal numbers of packets represent the emissions from both sources. This has to be balanced by adjusting the packet weights according to Eq. (104), leading to packets from the first source representing less energy and packets from the second source more energy than in the unbiased case. In addition to addressing imbalances in source luminosities, biased emission often involves preferentially launching packets into directions of particular interest (cf. Baes et al. 2016).

Forced scattering
A well-established biasing technique, already discussed in the context of neutron transport by Cashwell et al. (1957), is the forced scattering scheme. This method is often used in dust RT applications (see e.g. Mattila 1970;Witt 1977;Steinacker et al. 2013;Baes et al. 2011Baes et al. , 2016 to increase the efficiency in optically thin regions. Here, packets would otherwise escape without interacting leading to a low dust-scattering efficiency and challenges in determining heating rates. To circumvent these difficulties, the interaction probability for packets is biased such that they are guaranteed to interact before reaching the edge of the computational domain. Denoting the optical depth from the current packet position to the point of escape as τ edge , the interaction location is drawn from the biased PDF (cf. Steinacker et al. 2013) instead of using Eq. (45). This ensures that the interaction location lies between 0 and τ edge . Following Eq. (104), the packet weight is modified by If continuously applied in time-independent applications without an absorption component, this scheme allows packets to propagate indefinitely, with a continuously decreasing weight. Thus, a termination mechanism (e.g. Russian Roulette, see below) has to be introduced to stop the propagation at a certain point, typically once the packet weight has dropped below some pre-defined threshold. Alternatively, forced scattering can also only be applied once for each packet thus ensuring at least one interaction but leaving the normal propagation termination mechanism (escape through domain edge) intact. We note, however, that particular care has to be taken when information is extracted from MCRT simulations employing this technique that also require the contribution from free-streaming packets.

Peel-off
Constructing properties of the emerging radiation field by simply examining the properties of escaping packets often yields unsatisfactory results, particularly in multidimensional simulations: typically only a small fraction of the packet population escapes towards the observer meaning that the reconstruction will suffer from strong noise. Here, the so-called peel-off technique (sometimes also referred to as next event estimate) helps (e.g. Yusef-Zadeh et al. 1984;Wood and Reynolds 1999;Baes et al. 2011;Steinacker et al. 2013;Lee et al. 2017). In the context of MCRT in fast mass outflows, this method is sometimes referred to as viewpoint technique or virtual packet scheme (Woods 1991;Knigge et al. 1995;Long and Knigge 2002;Kerzendorf and Sim 2014;Bulla et al. 2015).
The peel-off approach introduces ray tracing concepts into the MC simulation. At every interaction point in the simulation, the probability is calculated that the interaction could have given rise to a packet that propagated in the direction of the observer and successfully emerged from the simulation (and so could contribute to the synthetic observables). Specifically, the weight contributed to the synthetic observables associated with the interaction of a packet with weight w can be written where p(n obs ) is the probability that the interaction led to reemission of the packet in the direction of the observer (n obs ) and exp(−τ obs ) describes the attenuation of the packet as it travels through the total optical depth from the interaction point to the observer (τ obs ). The optical depth is obtained by casting a ray towards the observer and integrating the opacity along this path. Since every interaction any packet performs contributes to the reconstruction, the improvement in statistics in peel-off methods is substantial. However, the ray tracing exercise of the peel-off technique adds significantly to the overall computational effort of the MCRT calculation, sometimes even dominating the computational costs.
We note that variants of methods similar to peel-off have been used in specific applications. Lucy (1991Lucy ( , 1999b introduced a ray tracing technique for variance reduction specifically designed for applications in which a photosphere approximation can be adopted and in which the medium is freely expanding, e.g. SN ejecta. During the MCRT simulation the source function is reconstructed from the packet interaction histories and then used in a formal integration step to calculate the emergent radiation field along cast rays. By relying on this technique, virtually noise-free spectra can be determined. 36 Also, as described by Bulla et al. (2015), the peel-off technique can be applied not only to interaction events but instead to all MC packet trajectory elements. Here, packets can contribute to the synthetic observation even when no interactions occur: the synthetic observables are obtained by a sum over contributions from all packet trajectories weighted similar to Eq. (112) but including an additional multiplicative term that gives the probability that an interaction event could have happened along the trajectory element.

Further biasing techniques
In addition to the schemes outlined so far, a variety of other biasing techniques have been developed and are actively used. Among them are, for example, techniques called path length stretching (Baes et al. 2016), continuous absorption (known also as packet splitting or survival biasing Carter and Cashwell 1975;Steinacker et al. 2013;Lee et al. 2017) or polychromatism (Jonsson 2006;Steinacker et al. 2013). We refer the reader to the literature for example to the review by Steinacker et al. (2013) and the book by Dupree and Fraley (2002) for detailed accounts.

Limitations-Russian Roulette and composite biasing
Naturally, biasing techniques are not a universal remedy and are also afflicted by drawbacks. Here, we highlight some of the more severe limitations and discuss techniques that have been proposed and developed to alleviate them.
By design, the increase in statistics in some regions of the parameter space comes at a cost, namely the decrease of statistics in other regions. Specifically, the loss of statistics occurs in regions where the biased PDF q(x) is smaller than the original one, ρ(x). This has the immediate consequence that biasing techniques should be only used if the loss in statistics happens for packets that are not relevant for the result one is interested in. In addition, the packets associated with draws from these regions experience an increase in their weight, which in principle is unbound. This poses numerical problems, since a few high-weight packets may then dominate the MC noise. To alleviate this deficit of biasing approaches, a technique called composite biasing has been proposed (Baes et al. 2016). Here, samples are drawn from a linear combination of the biased and the original PDF: As a consequence, the weight adjustment If, for example ζ = 1/2, is chosen (as suggested by Baes et al. 2016), the weight increase can at most be a factor of two. Note, however, that in some biasing techniques (e.g. packet splitting) weights are potentially modified repeatedly. Then, composite biasing limits each consecutive weight change but as a consequence of the continuous application of biasing, the weights can in principle still become very large. When applying biasing techniques that can act multiple times on the same packet, also small packet weights can become a hindrance. Packets with very small weights only contribute insignificantly to the reconstructed property but roughly the same computational effort has to be invested to follow their propagation as for important packets.
Based on this cost-benefit argument, it is advisable to terminate the propagation once the weight and thus importance of a packet has decreased beyond some predefined threshold. In this context, the so-called Russian Roulette method provides a stochastic framework to remove low-weight packets from the simulation, while still retaining energy conservation in a statistical sense (see e.g. Carter and Cashwell 1975;Dupree and Fraley 2002). In its simplest form, a termination probability p T is defined. Whenever a packet enters the roulette, the termination probability is sampled and the packet propagation is terminated if the sampling outcome is positive. Otherwise, the packet survives and its weight increases to w/ p T . This way, the weights of the terminated packets are distributed probabilistically onto the surviving ones and energy/weight conservation is ensured statistically. A detailed description of the Russian Roulette technique, and more sophisticated realisations, is given by Dupree and Fraley (2002).

Implicit and diffusion Monte Carlo techniques
Conventional MCRT methods, built upon the techniques outlined so far, inherently rely on explicitly tracking packet flight paths. Although this has a range of compelling benefits, not least the conceptual ease with which it can be developed, it has limitations particularly in regard to efficiency for many applications. For example, MCRT calculations become prohibitively slow when applied in optically thick media since the number of physical and numerical events that has to be explicitly tracked increase drastically. Another challenge is posed in Thermal Radiative Transfer (TRT) applications where successive absorptions and re-emissions occur frequently. Achieving a stable and accurate solution of the evolution of the ambient medium and of the radiation field typically requires a drastic reduction of the size of the physical time step. In the following, we outline a number of developments and techniques that have been proposed and are actively used to alleviate these shortcomings.

Implicit Monte Carlo
Standard explicit MC techniques face challenges when dealing with TRT problems since these involve a rapid succession of absorption and emission processes. In this situation sufficiently short time steps have to be used so that the ambient conditions (temperature etc.) can properly react to absorption-emission imbalances. Otherwise, the radiation source term may deplete the internal energy reservoir of the ambient material between successive temperature updates and lead to unphysical conditions (e.g. negative temperatures).
These difficulties are addressed by the so-called Implicit Monte Carlo (IMC) method, introduced in the seminal work by Fleck and Cummings (1971). Here, the sequence of absorption and emission events is replaced by an effective scattering prescription and only the net imbalance remains as a true absorption and emission contribution. Despite the name, the IMC method does not constitute a truly implicit solution approach, comparable to techniques encountered in the field of solving differential equations. Instead, a semi-implicit recasting of the discretized RT equation is performed. This procedure leads to the main advantage of the IMC approach, namely the introduction of unconditional stability. In the following, we briefly outline the formulation of the IMC technique and discuss some important properties of this approach. For an in-depth discussion of the method, we refer to the original work by Fleck and Cummings (1971) and to the recent detailed review by Wollaber (2016) on the subject.
Following Fleck and Cummings (1971), we introduce the IMC approach for the example of the one-dimensional grey TRT problem in the absence of scattering interactions. The derived equations can be easily modified to account for scatterings and a generalisation to frequency-dependent and multidimensional problems is possible. 37 With these simplifications, the governing equations are describing the change in specific intensity and internal energy density U in the presence of a grey, pure-absorption opacity χ and of an additional generic source term S. Note that due to the assumption of a one-dimensional slab geometry, the angle-integrated specific intensity 2π 0 dφ I is denoted as I for simplicity. In a first step, this system is slightly modified by introducing the equilibrium radiation energy, E, and its relation to the internal energy of the ambient material by resulting in At the heart of the IMC method lies a semi-implicit recasting of Eq. (121) to approximate E and thus the emission term in Eq. (120). For this purpose, a discrete version of Eq. (121) is considered, in which all time-continuous quantities are replaced by appropriate time averages, which we denote by bars 38 : Here, the time discretization Δt n = t n+1 − t n has been introduced. As proposed by Fleck and Cummings (1971), this equation is solved for E n+1 by invoking the specific time-centering schemeĒ for the radiation energy with the time-centering parameter 0 ≤ α ≤ 1. Here, α = 0 would revert back to the traditional, fully explicit MC scheme. Finally, all remaining time-averaged values are re-interpreted as their time-continuous counterparts and reinserted into Eq. (120), yielding the modified RT equation after introducing the so-called Fleck factor Compared to the original RT equation, Eq. (116), the IMC semi-implicit recasting reduces the importance of physical absorption interactions by the factor f . At the same time, this reduction is compensated by the introduction of terms that formally behave as a scattering contribution whose strength is governed by (1 − f )χ . From the definition of the Fleck factor it is apparent that, as the time steps become large or as the coupling between the internal and radiation energy pool becomes strong (i.e. β and/or χ become large), the contribution of the true absorption-emission interactions to the transfer process are reduced and replaced by effective scatterings. This behaviour of the IMC scheme is illustrated in Fig. 14 and leads to unconditional stability (i.e. also holds for Δt → ∞) as long as a time-centering parameter 0.5 ≤ α ≤ 1 is chosen.
This unconditional stability constitutes the main advantage of IMC and a substantial improvement over conventional MCRT approaches. This beneficial property has led to widespread adoption of the IMC scheme. In the astrophysics community, IMC schemes are predominantly applied in the field of RT in SN ejecta. Abdikamalov et al. (2012) have incorporated the method in a MCRT scheme for neutrino transport, Wollaeger et al. (2013), Wollaeger and van Rossum (2014) have developed a MC tool for RT in SNe based on IMC and recently Roth and Kasen (2015) have included IMC into the MCRT code Sedona (Kasen et al. 2006) and demonstrated its utility in one-dimensional radiation hydrodynamical calculations.
The stability benefit of IMC does, however, come at a cost and some of the less desirable features of this technique should not go unmentioned. In general, the construction of the governing IMC equations introduces a time discretization error which is formally of O(Δt). As a consequence, the scheme becomes increasingly inaccurate as the time step becomes larger. Moreover, Wollaber (2016) cite the so-called maximum principle violation which can occur within IMC calculations as its main weakness. Here, temperatures within a computational domain can non-physically exceed the imposed boundary temperatures in the absence of internal sources. Larsen and Mercer (1987) formulate a time step constraint under which these violations may be avoided. (1 − f )χ can become orders of magnitudes larger than the remaining "net" absorption term, χ abs eff = f χ . The illustration shows the behaviour for two choices for the time-centering parameter α However, these conditions are very restrictive and limit the applicability of IMC. More information about the maximum principle violation, and about efforts to alleviate it within the IMC framework as well as other drawbacks, such as accurately reproducing the diffusion limit, the introduction of damped oscillations or teleportation errors, are summarized by Wollaber (2016).
Finally, we note that the linearisation, semi-implicit recasting and discretisation proposed by Fleck and Cummings (1971) and reviewed here constitutes only one possibility to improve numerical stability. The recent review by Wollaber (2016) provides a comprehensive overview of a number of alternative approaches. In particular, we draw attention to the family of techniques, mainly shaped by Brooks and collaborators (e.g. Brooks 1989;Brooks et al. 2005), denoted Symbolic Implicit Monte Carlo (SIMC), which leave the thermal emission term formally unknown by introducing unknown symbolic packet weights. This technique may be denoted as a truly implicit MC method in the same sense as applied in the field of solving differential equations (see Wollaber 2016).

Efficient Monte Carlo techniques in optically thick media
While conventional MC techniques are well suited for problems with a moderate or low optical depth, their efficiency decreases dramatically in optically thick applications. In a pure scattering environment, packets are frequently deflected by collisions and their propagation effectively becomes a random walk. Explicitly following and treating the multitude of interactions as is required in conventional MC approaches becomes very inefficient and computationally expensive. The situation is similar in problems with high absorption opacities. At first glance the short packet trajectories due to rapid truncation by frequent absorption events seem to argue for a efficient application of MC techniques in this regime. However, in equilibrium/steady-state problems this would need to be countered by very large numbers of quanta to describe the propagation while in explicit time-dependent MC treatments, small time steps are required to ensure numerical stability (see discussion in Sect. 10.1). As detailed above, the IMC approach offers a solution to the time-step problem since it ensures unconditional stability. However, the IMC approach suffers equally in efficiency in the optically thick regime since the Fleck factor is very small in such situations and the vast majority of interactions proceed as effective scatterings.
A number of authors have developed techniques that improve the efficiency of MC calculations in optically thick regimes. These acceleration techniques replace the conventional MC transport process by a diffusion treatment that efficiently transports MC quanta through regions of high optical depth. The appropriate probabilities for these transport processes are found by a stochastic interpretation of the diffusion equation that constitutes the correct physical limit for RT processes in the presence of high opacities. Typically, these MC diffusion techniques are interfaced with a conventional, often IMC transport approach to yield a hybrid scheme that efficiently solves RT in problems with varying optical thickness. In the following, we briefly outline two popular flavours of these diffusion techniques, which predominantly differ in how the diffusion regions, in which the normal transport simulation is switched off, are treated. These are the so-called random walk or Modified Random Walk (MRW) techniques originally developed by Fleck and Canfield (1984) and the Discrete Diffusion MC (DDMC) methods (see e.g. Densmore et al. 2007, and references therein).

Modified random walk
The Random Walk (RW, or MRW as coined by ) was developed by Fleck and Canfield (1984) as an extension to their IMC method (see Sect. 10.1) to improve the computational efficiency in applications with regions of high optical depth. The main idea underlying this approach is the introduction of spherical diffusion regions whenever the optical depth is high. Instead of following the multitude of effective scatterings in these regions with IMC, the conventional packet transport process is switched off and replaced by a diffusion procedure. Here, packets are able to traverse the diffusion regions in one-step processes. The probabilities governing this propagation mode are derived by Fleck and Canfield (1984) by examining the statistical properties of the random walk process and the solution to the diffusion equation. While the original MRW scheme has been derived for IMC applications, it naturally applies to explicit MC approaches as well after setting the Fleck factor to 1.
We briefly outline the MRW procedure and refer to the original work by Fleck and Canfield (1984) for a detailed derivation. Diffusion spheres, originating from the current location of all packets are constructed. The radii R 0 of the spheres are chosen such that they entirely lie within their host grid cells but occupy the largest possible volume (see illustration in Fig. 15). In these homogeneous and isothermal spheres, the explicit MC packet transport may be replaced by a diffusion solution. However, this replacement is only accurate if the packets are expected to perform a multitude of interactions as they propagate in the sphere so that the diffusion approximation becomes valid. Fleck and Canfield (1984) translate this requirement into the following criteria for the activation of the diffusion process: Here, λ R denotes the Rosseland mean free path 39 and l col the physical distance the packet has to cover to the next interaction as determined in the standard MC transport propagation procedure (see Sect. 6). These conditions ensure that packets are expected to perform multiple interactions 40 within the sphere and are guaranteed to interact at least once. As soon as these conditions apply, the normal IMC transport of packets is stopped in the sphere and a diffusion procedure is started (see Fig. 15). This process is govern by transport rules obtained from considering how far packets could propagate under diffusion conditions as a function of time. Specifically, the probability of finding a packet at distance r from its initial position (which is by construction at r = 0) after time t is given by Here, D is the diffusion constant for the process (see Fleck and Canfield 1984, Eq. 21). This result can be used to determine the probability that the packet still resides within the sphere after time t, which is Consequently, at any given time (e.g. the end of a time step), the fate of the packet can be decided by the random number experiment with the escape probability If the packet escapes, its position is updated to a location uniformly drawn from the surface of the diffusion sphere. A new direction is assigned by sampling the cosine Fig. 15 Illustration of the construction of diffusion spheres in the MRW approach. The sketch illustrates the current packet location as a thick black dot. With this location as the origin, a sphere with radius R 0 that fully fits into the current grid cell (defined in the plane-parallel case by the boundaries x i and x i+1 ) and has maximum volume is constructed. Two possible packet propagation paths in the normal transport scheme are shown. In the first case (a), the packet leaves the sphere without interacting while in the second case (b) the packet scatters inside the sphere after covering the distance l col . Since λ R is smaller than R 0 in the illustrated situation, the MRW diffusion scheme would be switched on in case (b) while the packet would be transported normally in case (a). This illustration is adapted from Fleck and Canfield (1984, Fig. 3) distribution about the normal to the surface of the sphere and finally the propagation time is advanced to the time of escape which is found by This identity is solved using the same random number that determined the decision about the escape from the diffusion sphere, i.e. in the experiment described by Eq. (130). If the packet remains in the sphere after time t has elapsed, it is moved within the diffusion sphere. A new direction is drawn uniformly and its location is updated by determining a new sphere with radius R 1 4π and placing the packet randomly onto its surface. Once the packet resumes its propagation (either in a new time step or after escaping from a diffusion sphere), the criteria in Eqs. (126) and (127) are again used to determine whether the packet is transported via normal (I)MC procedures or by defining a new diffusion sphere. This technique is schematically illustrated in Fig. 16. Following such a MRW scheme can significantly increase the efficiency of MCRT calculations in optically thick regimes (Fleck and Canfield 1984). However, it still faces efficiency problems if packets are close to grid cell boundaries. In this region, the diffusion spheres can become too small to allow the criteria in Eqs. (126) and (127) to activate the a b Fig. 16 Illustration of the MRW propagation process. On the left hand side, the situation in an optically thick cell is shown if only a conventional MC transport scheme (in this case IMC) is shown. The packet will scatter multiple times which is time consuming to simulate directly. Instead, when the IMC approach is coupled with a MRW scheme, packets can much more efficiently step through optically thick regions, as illustrated in the right panel. As long as the applicability criteria for MRW, Eqs. (126), (127), are met the explicit packet transport procedure (denoted by thick solid lines) is switched off and the packet can traverse the diffusion spheres in simple one-step processes (denoted by the thin dotted lines). In the illustration shown here, it is assumed that the packet escapes the diffusion spheres by virtue of the MC experiment in Eq. (130) during the time step. At this point, it is placed randomly (according to a uniform distribution) onto the surface of the sphere (black dots). Only close to the cell boundaries (x i and x i+1 ), the diffusion spheres are often too small to activate MRW and the normal MC transport scheme has to be used (cf. last segment of the depicted packet trajectory) diffusion procedure. Thus, the inefficient transport scheme has to be used to propagate packets in these situations (see comments by Densmore et al. 2012).
Recently, the MRW approach has been applied in astrophysical RT problems by Robitaille (2010). There, the scheme is incorporated into MC approaches to dust RT and specifically helps to transport packets through optically thick parts of dusty discs. However, it seems very challenging to adapt this scheme to applications in which complex opacities, particularly Sobolev-type line opacities, have to be taken into accounted.

Discrete diffusion Monte Carlo
In the MRW scheme, only spherical subregions of grid cells are designated diffusion zones. As outlined above, constraints imposed on the size of the sphere lead to efficiency problems when packets are located close to grid cell boundaries. This drawback is eliminated in other MC diffusion approaches. In so-called Discrete Diffusion Monte Carlo (DDMC) techniques, entire grid cells are treated as diffusion regions. Within, DDMC packets are generated that can traverse these cells efficiently in one-step processes. The propagation rules for this procedure are again extracted from a probabilistic interpretation of the discretized diffusion equation. In analogy to the MRW method, DDMC schemes are commonly used in hybrid approaches in combination with IMC transport techniques to ensure an efficient applicability to problems with regions of different optical thickness (see e.g. Gentile 2001;Densmore et al. 2007).
DDMC techniques have their origin in neutron transport problems (see overview by Densmore et al. 2007) but a popular variant designed for photon RT has been presented by Densmore et al. (2007). Another flavour of the diffusion technique has been developed by Gentile (2001) and is often referred to as Implicit Monte Carlo Diffusion (IMD). 41 The main difference with respect to the DDMC approach by Densmore et al. (2007) lies in the treatment of how time is tracked by the DDMC packets. While both DDMC and IMD have originally been presented for grey problems, multi-group extensions appropriate for frequency-dependent applications have already been devised, in particular by Densmore et al. (2012) and Cleveland et al. (2010) respectively.
Of the DDMC schemes, the variant of Densmore et al. (2007Densmore et al. ( , 2012 seems to currently have experienced the most attention in the astrophysical community. Abdikamalov et al. (2012) have developed a hybrid DDMC-IMC approach for neutrino transport in core-collapse SNe and Wollaeger et al. (2013) and Wollaeger and van Rossum (2014) have introduced a MC method for RT in SN ejecta, constructed around a DDMC-IMC core. Consequently, we only focus on the DDMC scheme of Densmore et al. (2007) in the following, where we briefly highlight the guiding principles of discrete diffusion techniques. We refer the reader to Gentile (2001), Cleveland et al. (2010) and Cleveland and Gentile (2015) for details about the closely related IMD approach.
The DDMC scheme as presented by Densmore et al. (2007) is formulated for grey and static diffusion problems. Extensions for frequency dependent problems and moving media are introduced by Densmore et al. (2012) and by Abdikamalov et al. (2012) respectively. The DDMC approach begins with the designation of a subset of grid cells in the computational domain, which are considered sufficiently optically thick, as DDMC diffusion zones. For simplicity in presenting the governing principles, we assume that there is one continuous subregion in the domain in which DDMC is active and which consists of N cells. The interfaces of these cells lie at x j−1/2 and x j+1/2 for j = 1 . . . N . The diffusion cells are logically separated into interior cells (i.e. cells with neighbouring DDMC cells at both sides) and interface cells (i.e. cells which lie at the interface between DDMC and transport regions or lie at the domain boundary). In the diffusion cells, DDMC particles are generated at the beginning of each simulation (time) step based on the active radiation field from the previous time step, internal emission due to source terms or, in the interface cells, due to influx of normal IMC packets or due to inflow imposed by the domain boundary conditions. The magnitude of all these processes and rules for how the DDMC particles propagate are obtained from a discretized diffusion equation. For this purpose, the basic IMC transport equation, Eq. (124), is considered as the starting point in the DDMC, and its zeroth moment is discretized in space, yielding 1 c 41 Densmore et al. (2007) still classifies the IMD approach as a member of the class of DDMC techniques.
Here, cell-averaged scalar intensities and cell-edge fluxes are used. Equation (135) is closed by using Fick's diffusion law (Fick 1855) An appropriate representation of the spatial derivative is found by finite-differencing. This leads to a time-continuous diffusion equation, discretized in space, which takes the form for interior cells. Here, left and right leakage opacities are defined in terms of the face-averaged opacities σ + j−1/2 and σ − j+1/2 . In this nomenclature, the superscripts denote which neighbouring cell is used for the opacity calculation. In particular, σ − j+1/2 is based on the material properties of cell j and conversely, σ + j+1/2 uses the information from cell j + 1. According to Densmore et al. (2007), this procedure is necessary to avoid propagation problems in cases where one of the opacities becomes very large. As the authors point out, in addition to relying on the material properties of the appropriate neighbouring cell, the use of a common cell-edge temperature is vital for overcoming these problems (see discussion by Densmore et al. 2007). Equation (139) builds the foundation of the DDMC scheme and sets the propagation behaviour of DDMC particles after a probabilistic interpretation has been performed. It describes a time-dependent 42 evolution equation for the scalar intensity and thus for the population of DDMC particles. In this view, the terms on the RHS of Eq. (139) describe processes that increase φ j , and thus the population of DDMC particles in cell j. This can either be by emission (first term on RHS) or by leakage of DDMC particles from neighbouring cells (second and third terms on RHS). Conversely, the term on the LHS captures all processes which reduce φ j and thus remove DDMC particles from cell j. Again, this can occur via leakage into one of the neighbouring cells or through an absorption event. In this interpretation, DDMC packets are not associated with explicit location or direction information but only with their current host grid cell. They are propagated by considering the time to the end of the time step and the time to the next collision, which can be determined analogously to the corresponding procedure in conventional MC transport by Here, collisions can either refer to an absorption or leakage event. The exact nature of the collision can be established in a random number experiment similar to the decision between scattering and absorption in conventional MC transport (see Sect. 6.4) based on the relative magnitudes of the terms appearing in the denominator of Eq. (142). In the case of absorption, the propagation of the DDMC particle is terminated, otherwise its internal clock is advanced by t col and it continues its propagation in the new cell until the end of the time step is reached or it is absorbed. An equation similar to Eq. (139) is found for DDMC interface cells, which are at the edge of the diffusion regions, after imposing appropriate boundary conditions. Instead of relying on the Marshak boundary condition, Densmore et al. (2007) propose a condition inspired by the asymptotic diffusion-limit. This ensures an accurate behaviour of the DDMC scheme in situations in which the incoming transport packet population has a very anisotropic angular distribution too (see Densmore et al. 2007). The resulting space-discretized diffusion equation has the same structure as for interior cells apart from an additional source term that describes the influx of radiation from the transport region (or from outside of the computational domain if the interface is at the domain edge). This source term can be converted into a probability which is sampled every time a MC packet from the transport region or from the domain boundary condition impinges onto the diffusion region to decide whether the packet is converted into a DDMC particle or reflected back. The complementary process of DDMC packets leaking out of the diffusion region is handled by placing them isotropically onto the interface. Such packets then continue propagating according to the conventional MC transport scheme.

MCRT and dynamics
In Sect. 9 we reviewed how estimators can be constructed to determine the rate of transfer of energy and momentum from the radiation field to the ambient medium. This transfer can become dynamically important and drastically affect the evolution of a system. In the astrophysical realm, prominent examples for such circumstances include radiatively driven mass outflows from hot stars (see review by Puls et al. 2008) or accretion discs (e.g. Proga et al. 1998Proga et al. , 2000Proga and Kallman 2004), the star formation process (see review by McKee and Ostriker 2007, and references therein) or the shock outbreak phase in SNe (see e.g. overview in Mihalas and Mihalas 1984). In situations such as these, a decoupled treatment of hydrodynamics and RT is no longer accurate but a coupled RH solution approach has to be followed.
Historically, RH studies have been typically performed with deterministic solution techniques. But particularly in the field of line-driven winds from hot stars, there is a substantial literature based on MC studies by Abbott and Lucy (1985) and, among others, Lucy and Abbott (1993), Vink et al. (1999Vink et al. ( , 2000 and Müller and Vink (2008). The main motivation for relying on MC schemes certainly lies in their ease of treating the Sobolev-type line opacities encountered in these winds. Specifically, a MC calculation is used to determine the momentum deposition in the outflow material according to which a steady-state wind structure is calculated. In addition, fully dynamic RH approaches which rely on MC methods have been developed and applied. For example, Nayakshin et al. (2009) andAcreman et al. (2010) coupled Smoothed Particle Hydrodynamics (SPH) approaches with MCRT calculations. Haworth and Harries (2012) investigated triggered star formation with a RH approach in which the gas temperature is adjusted by a MC-based photo-ionization calculation. Harries (2015) and Harries et al. (2017) continued the development of MC-based RH methods for star formation problems. Noebauer et al. (2012) and Roth and Kasen (2015) introduced MC-based RH techniques with a general-purpose scope, with a particular focus on IMC techniques in the latter. Implicit MC diffusion methods were coupled with hydrodynamics calculations by Cleveland and Gentile (2015). This limited list of examples illustrates that the possibility of using MCRT techniques in fully dynamic applications is actively researched and developed. In the following, we briefly sketch how energy and momentum transfer terms may be reconstructed from MCRT calculations and included in fluid dynamics calculations.

Reconstructing energy and momentum transfer terms
The full RH problem can be formulated in terms of the normal fluid dynamical equations describing mass, momentum and energy conservation but modified by terms capturing the energy and momentum exchange mediated by radiation-matter interactions. Following Mihalas and Auer (2001), we present the equations in the LF and refer the reader to the standard literature, in particular to Mihalas and Mihalas (1984), for more details and full derivations: Here, the material density ρ, pressure p and specific internal energy e appear. Also, the Kronecker delta δ i j is used. On the right-hand side of the energy and momentum equation, the components of the so-called radiation force appear, G 0 and G i , encoding energy and momentum exchange between the radiation field and the ambient material. These are defined by Relying on similar techniques as outlined in Sect. 9.3, the radiation force components can be reconstructed from the ensemble of MC packet histories. This procedure is particularly simple if we adopt a thermal equilibrium emission coefficient 43 and treat the emission and scattering processes as isotropic in the CMF. In this case, the radiation force in the CMF is given by where χ tot,0 (ν 0 ) and χ a,0 (ν 0 ) are the total and absorption parts of the opacity (see Sect. 6.4) in the CMF. E 0ν and F i 0ν are, respectively the CMF specific radiation energy density and flux, and B ν 0 is the Planck function.
For grey opacities, this simplifies further to where a R is the radiation constant and T 0 the temperature. In this case, the radiation force may be reconstructed using the efficient volume-based estimators for the radiation energy density and flux E 0 and F 0 which already have been presented in Sect. 9.3. For frequency-dependent material functions, one may introduce appropriate frequency averages of the opacities, as for example done by Roth and Kasen (2015) and retain the analogous equations as above. Alternatively, the opacities may be included in the volume-based averaging process.
This may be interpreted as summing packet energies and momenta, and weighting the individual contributions by the probability that a packet interacts along the trajectory element l i (see also the discussion about reconstructing heating rates with volume-based estimators in Sect. 9.3). In the above formulation, we evaluate the frequency-dependent opacity at the beginning of each individual packet trajectory element according to the instantaneous packet frequency. This naturally assumes that the opacity varies only mildly along the trajectory segment. In situations, in which this is not fulfilled, alternative formulations have to be devised. In astrophysical applications, this occurs for example whenever bound-bound processes, i.e. interactions with atomic line transitions, are important, as in line-driven mass outflows from hot stars or in SN Ia ejecta (cf. Sect. 8.2). Here, the opacity varies strongly whenever photons resonate with a line transition. For such applications, the energy and momentum transfer terms may be reconstructed as proposed by Lucy (1999b) and Noebauer and Sim (2015). 44 Reconstructing the momentum deposition based on Eq. (151) as sketched above relies on the radiation flux in the CMF. As pointed out by Roth and Kasen (2015), volume-based estimators as derived previously involve the cancellation of contributions from packets propagating in opposite directions. In particular in the diffusion regime, in which the net flux is expected to be very small, such estimators suffer from high MC shot noise. Thus, Roth and Kasen (2015) proposed an alternative reconstruction scheme for this regime, based on the first moment of the transfer equation, which reduces to under diffusive conditions (Mihalas and Auer 2001). Now, the momentum deposition depends on the radiation pressure tensor which can be easily reconstructed without relying on cancellation effects. As an alternative to the CMF-based reconstruction approaches detailed above, the radiation force components can also be determined in the LF. A corresponding reconstruction procedure within the volume-based estimator approach was outlined by Noebauer et al. (2012).

Coupling to fluid dynamics
Once the energy and momentum transfer terms are available via the radiation force components they can be coupled to a fluid dynamical calculation. Typically, an operator-splitting approach (see e.g. LeVeque 2002, for a detailed explanation of the operator splitting principle) is used to tackle the RH problem. This is a widely used technique to deal with source terms in hydrodynamical equations (e.g. gravity, nuclear energy release, etc.) and is part of many MC-based RH approaches (e.g. Noebauer et al. 2012;Roth and Kasen 2015). Implementing the simplest incarnation of this technique, the so-called Godunov splitting (cf. LeVeque 2002), a RH time step would then pro- Fig. 17 Illustration of a simple Godunov-splitting approach to MC-based RH ceed as outlined in Fig. 17. It begins with a pure hydrodynamical solver call, assuming the absence of any source terms on the right hand side of Eqs. (144), (145) due to RT. The new fluid state thus determined is then used to solve the RT problem using MC techniques. From the ensemble of packet trajectories, energy and momentum transfer between the ambient material and the radiation field can be reconstructed using the concepts detailed above. According to these transfer terms, the fluid momentum and energy are updated and the time step is complete.

Example application
As originally suggested by Ensman (1994), solving the structure of radiative shocks has become a standard test problem for RH solution techniques. In these shocks, a radiative precursor emerging from the shocked domain penetrates the upstream material preheating and compressing it (for a detailed overview of these phenomena, we refer the reader to Zel'dovich and Raizer 1969). Depending on the strength of the pre-heating, sub-and super-critical shocks are distinguished. The temperature in the precursor region remains below that of the shocked material in the sub-critical case but reaches it in super-critical shocks. Thanks to the seminal works by Lowrie and Rauenzahn (2007) and Lowrie and Edwards (2008), analytic steady-state solutions are available for these shocks.
As a test of the methods, Noebauer et al. (2012) and Roth and Kasen (2015) have used operator-splitting techniques to successfully calculate the structure of radiative shocks with MC-based RH approaches. Here, we discuss the success of these testsfurther details about the physical and numerical setup of these simulations are given in Appendix A.4. Figure 18 shows the time evolution of the structure of sub-and supercritical nonsteady radiative shocks solved with the MC-based approach Mcrh (Noebauer et al. 2012), compared with the results of calculations performed with the finite-difference approach Zeus-Mp2 (Hayes and Norman 2003;Hayes et al. 2006). In addition, Fig. 19 shows the structure of a steady radiative shock with Mach number M = 5 obtained with Mcrh in comparison with the analytic predictions following the solution strategy developed by Lowrie and Edwards (2008). 45 In both cases, the results of the MC simu- Fig. 18 The temperature structure of sub-critical (top panel) and supercritical radiative shocks (bottom panel), calculated with Mcrh (orange) and Zeus-Mp2 (blue). The gas (solid lines) and radiation temperature (dashed and dotted lines) are shown for 4 different snapshots. These are 5.5 × 10 3 , 1.7 × 10 4 , 2.8 × 10 4 and 3.8 × 10 4 s for the subcritical and 8.6 × 10 2 , 4.0 × 10 3 , 7.5 × 10 3 and 1.3 × 10 4 s for the supercritical case. This illustration is adapted from Noebauer et al. (2012, figs. 5 and 6). More details about the setup are provided in Appendix A.4 lation agree very well with the reference calculation and the semi-analytic predictions respectively.

Challenges and limitations
Despite being conceptually simple and easily implemented, MC-based radiation hydrodynamical approaches relying on the operator splitting techniques suffer from limitations. For a successful application of operator-splitting, strict limits have to be set on the duration of the time step. These restrictions are imposed by the characteristic time scales of the source terms, most notably the heating and cooling terms in the energy equations. The difficulties arising from these time-scale limits are best illustrated at the example of TRT (see also Sect. 10.1). If thermal emission is stronger than the corresponding absorption of radiation, a characteristic cooling time can be Fig. 19 Comparison between the semi-analytic solution (blue solid) for the steady radiative shock with M = 5 according to Lowrie and Edwards (2008) If a global time step larger than this value is chosen, thermal emission during the RT sub-step will completely deplete the internal energy content of the ambient material and unphysical states with negative internal energy are induced in the final step. This restriction renders the simple time-explicit operator splitting MC RH approach inefficient in the stiff source term regime, i.e. in situations in which the characteristic radiative time scales are much shorter than the typical fluid-flow time scales, which in explicit schemes are given by the Courant criterion (Courant et al. 1928). 47 The stiff source term problem is not unique to the MC RH problem but a general challenge when dealing with source terms in hydrodynamical calculations (cf. LeVeque 2002). A common approach to address this problem is to rely on implicit solution techniques. In this context, the IMC techniques outlined in Sect. 10.1 seem very promising. In fact, Roth and Kasen (2015) coupled an IMC RT scheme with a fluid dynamical calculation and successfully applied it to test problems in which the radiative time scales are smaller than the fluid-flow time scales. Nevertheless, as stressed in Sect. 10.1, IMC methods are not truly implicit in the traditional sense and also suffer from other potential downsides, e.g. maximum principle violation (cf. Wollaber 2016).
A completely different approach to the stiff source term problem was suggested by Miniati and Colella (2007). An unsplit Godunov scheme was developed, consisting of a modified predictor and a semi-implicit corrector step which incorporates the effects of the source term. This method was adapted to RH by Sekora and Stone (2010) and Jiang et al. (2012). In principle, the hybrid Godunov approach could also be utilised in MC-based RH calculations, but a successful application of this scheme in conjunction with MCRT methods has yet to be demonstrated.
Notwithstanding the challenges, MC-based techniques constitute a valuable alternative approach to RH. Such methods offer the possibility to benefit from the same advantages that MC techniques already bring to pure RT calculations, namely a straightforward generalization to multidimensional geometries and the ease with which complex interaction processes are incorporated.

Example astrophysical application
We conclude this article by presenting a concrete example from our own experience of how MCRT methods can be used to solve RT problems in astrophysics. In this first version of our Living Review, we will focus on a discussion of calculating synthetic spectra for SNe Ia. This example, makes use of many techniques outlined in this review, particularly, the indivisible energy packet scheme (cf. Sect. 5.2), a variant of the macro-atom scheme (cf. Sect. 7), volume-based estimators (cf. Sect. 9.3) and the peeling-off technique for variance reduction (cf. Sect. 9.4). Throughout the discussion we make use of the open source code Tardis (Kerzendorf and Sim 2014;Kerzendorf et al. 2018), which is readily available 48 for inspection (or use) by the interested reader.
In future verstions of this Review we will plan to gradually extend our discussion of examples. In particular, we aim to summarise closely related work on the modelling of fast outflows for other classes of astrophysical sources such as hot stars and accretion disk winds (see references in Sect. 3). Such applications also make use of many of the techniques outlined in this review and are generally quite closely related to the methods used in the SN Ia example discussed here. The most important difference, arguably, is that the SN problem often requires only an homologous velocity law, which leads to a number of simplifications (see Sect. 8.2). In contrast, more general stellar/disk wind applications require that more complicated velocity fields are considered.

Type Ia supernovae
SNe Ia are transient events that have been instrumental in establishing our currently accepted cosmological standard model and are still widely used in precision cosmology (see e.g. Goobar and Leibundgut 2011). In particular, Riess et al. (1998) and Perlmutter et al. (1999) pioneered the use of SNe Ia as standardisable distance indicators to map out the recent expansion history of our Universe, finding an accelerated expansion. Apart from their relevance in cosmological studies, SNe Ia play an important role in many other branches of astrophysics as well, for example in galactic chemical evolution (e.g. Kobayashi et al. 1998;Seitenzahl and Townsley 2017). Notwithstanding the importance of SNe Ia, a full understanding of the exact nature of these transients still remains elusive and a range of proposed models remain under study (see e.g. Hillebrandt and Niemeyer 2000;Hillebrandt et al. 2013;Röpke 2017;Röpke and Sim 2018). One important strategy to study SNe Ia is to model their observed spectra with the aim of inferring the ejecta composition and structure as a means to understand the explosion itself. Tardis, which we use for this demonstration, is a tool aimed at this problem in which highly parameterized and flexible RT simulations are used to interpret observations.
MCRT methods are well-suited for calculating synthetic observables in SNe Ia. Due to the absence of hydrogen and helium and the dominance of heavy elements in the ejecta of SNe Ia, RT is mainly driven by bound-bound interactions. As a consequence, SN Ia spectra show no true continuum but rather a pseudo-continuum, generated by the flux redistribution achieved in a multitude of non-resonant line interactions. This property in combination with the fact that many models predict anisotropies in the overall morphology and chemical structure of SN Ia ejecta make MCRT an attractive choice for treating RT. Popular numerical approaches relying on MCRT for SN Ia studies include Artis (Kromer and Sim 2009), Sedona (Kasen et al. 2006), SuperNu (Wollaeger et al. 2013;Wollaeger and van Rossum 2014), Tardis (Kerzendorf and Sim 2014;Kerzendorf et al. 2018;Vogl et al. 2019), the scheme developed by Mazzali and Lucy (1993) and Mazzali (2000) and Sumo (Jerkstrand et al. 2011(Jerkstrand et al. , 2012.

Model type Ia supernova
Since Tardis was specifically designed as a highly parameterized MCRT approach for spectral synthesis in SNe Ia, it adopts a number of simplifications. For a detailed overview we refer to the original publication by Kerzendorf and Sim (2014) and the publicly available documentation. 49 Here, we only highlight some of the key aspects of the MCRT machinery of Tardis.
Similar to the approach by Mazzali and Lucy (1993), Tardis adopts the elementary SN model of Jeffery and Branch (1990). Here, the SN ejecta are approximated as spherically symmetric and divided into two domains, the continuum-forming region and the atmosphere. A photosphere separates both regions. It is assumed that thermalization processes are only relevant below the photosphere and that interactions in the atmosphere are either electron scatterings in the Thomson limit or line interactions. Tardis follows the spectral synthesis process in the atmosphere with a time-independent, frequency-dependent MCRT approach. Packets are launched from the photosphere at the inner computational boundary from a thermal distribution according to the photospheric temperature and followed as they propagate through the envelope until escaping through either boundary. An important aspect of the Tardis approach is the determination of a self-consistent plasma state and photospheric temperature, which is achieved using volume-based estimator techniques akin to those outlined in Sect. 9.3 49 http://tardis.readthedocs.io/en/latest/.
in an iterative process. Only after a converged plasma state has been found, the final synthetic spectrum is calculated. Tardis includes electron scattering and bound-bound interactions relying on the Sobolev-approximation (see Sect. 8.2). Fluorescence can be treated either using the downbranching scheme by Lucy (1999b) or a simplified version of the macro atom scheme by Lucy (2002. To reduce the MC noise in the synthetic spectra, a variant of the peel-off technique can be used, referred to as virtual packet scheme (see Sect. 9.4). Different assumptions about the ioniziation and excitation state can be adopted but for the Tardis simulations presented below, a modified nebular approximation (see Mazzali and Lucy 1993) was used together with a dilute-Boltzmann excitation treatment. Finally, Tardis relies on a discrete representation of the ejecta state in terms of density and velocity on a spherical grid. For each grid cell, the mass density, the velocity at the cell interfaces and the chemical composition have to be specified. Internally, perfect homology is assumed, for example when progressing through the Sobolev line interaction scheme (see Sect. 8.2).

Spectral synthesis with MCRT
As an illustration, we use Tardis to calculate a synthetic spectrum for the SN Ia SN 2005bl. 50 We consider the epoch three days before maximum light in the Bband which corresponds to 14 d after explosion. We adopt the stratified chemical composition derived by Hachinger et al. (2009) and use a density profile similar to the famous W7 explosion model by Nomoto et al. (1984). This setup, which is shown in Fig. 20, has been previously used by Barbosa (2016) to establish the suitability of Tardis for abundance tomography studies. All necessary configuration and data files to repeat the Tardis calculations are included in the repository published as part of this review (see Appendix B). Figure 21 shows the main product of a Tardis calculation, namely the synthetic spectrum for the model setup. Since the optical depth of the constructed SN atmosphere is rather high, many MC packets injected at the lower boundary are back-scattered onto the photosphere and lost for the spectral synthesis process. Only a small fraction of the launched packets reach the ejecta surface and contribute to the emergent spectrum, leading to a substantial amount of MC noise. This situation can be significantly improved by using the implemented virtual packet scheme. Whenever a MC packet is launched or interacts, a pre-defined number of virtual packets (ten in the current Tardis simulation) are spawned and propagated towards the ejecta surface along rays that are cast in directions drawn from the emission profile of the corresponding process. The optical depth to the surface is calculated along these rays and the energy of the virtual packet decreased by a corresponding attenuation factor (see Sect. 9.4 or Kerzendorf and Sim 2014 for more details). Figure 21 also includes the synthetic . This setup has been adapted by Barbosa (2016) from Hachinger et al. (2009) spectrum generated from the virtual packets which has a much lower noise level than the spectrum that is based on the real packet population.
One advantage of MCRT lies in the diagnostic possibilities this approach offers. Details about the interactions packets experienced can be easily recorded and used to examine the radiation-matter coupling or to investigate the origin of particular features in the SED of the emergent radiation field. In the following, we highlight only some possible applications of these capabilities. For simplicity, we will only focus on the last interaction MC packets performed before escaping through the outer boundary. 51 Figure 22 illustrates the importance of non-resonant line interactions when calculating synthetic spectra for SNe Ia. All emergent packets have been binned according to their incident and emergent wavelengths in their last line interaction. In the Tardis simulations shown here, the macro atom scheme is used to treat nonresonant interactions within the indivisible energy packet paradigm. While Fig. 22 shows that many packets have interacted resonantly (cf. diagonal where λ in = λ out ), the fluorescence and inverse-fluorescence regions above and below the diagonal are also densely populated.
In analogy to extracting information about the wavelength redistribution, details about the interaction process can be recorded just as easily. Figure 23 shows which ions predominantly contribute to the last line interactions MC packets experience in the Tardis simulation of SN 2005bl. It clearly illustrates, that singly-and doublyionized iron-group elements are the dominant interaction partners, followed by the intermediate-mass elements in the same ionization state.

Fig. 21
Tardis synthetic spectra for the SN model constructed for SN 2005bl at 3 d before B-band maximum. The spectrum constructed from the escaping MC packets is shown in blue and exhibits high MC noise. In addition, the synthetic spectrum which is generated from the virtual packets and which suffers from much less MC noise, is shown in orange Fig. 22 Histogram of the incident and emergent wavelength of all escaping MC packets in the last line interaction from the Tardis simulation of SN 2005bl. Resonance interactions can be found on the diagonal, fluorescence above it and inverse-fluorescence process below it. The macro atom scheme of Lucy (2002Lucy ( , 2003 was used in this simulation (see also Sect. 7) Finally, we combine the information about the interaction partner and the wavelength change into a visualization proposed by M. Kromer (see e.g. Kromer and Sim 2009). This provides detailed information about the spectrum formation process. The contribution of each escaping packet to the emergent spectrum is colour-coded according to the atomic number of the last interaction partner and plotted at the location of the emergent wavelength. This procedure can be performed on the level of individual elements, or as we chose to do here for simplicity, by elemental groups. Figure 24 shows the synthetic spectrum calculated with Tardis and how the different elemental groups, fuel (C, N, O, Ne), intermediate-mass elements (Na through Sc) and iron-peak elements, contribute.

Summary and conclusions
In this work, we provide an overview of some of the MCRT techniques used in astrophysics. We have presented a variety of evidence that this approach has evolved into a competitive and very successful method to solve radiative transfer problems. With its probabilistic approach, MCRT offers a number of compelling advantages that make this technique ideal for a variety of astrophysical applications. Whenever irregular multidimensional geometries are encountered or complex interaction processes, particularly scatterings, have to be accounted for, MCRT methods are typically a good choice for addressing radiative transfer problems. For this reason, the MCRT framework finds wide-spread application in astrophysics, from modelling mass-outflows from stars and accretion discs, to simulating radiative transfer through dusty environments or studying ionization on cosmological scales. Recently, MCRT schemes have even been included in fully dynamic radiation hydrodynamics calculations.

Fig. 24
Illustration of the contributions of the various elemental groups to the final emergent spectrum in the Tardis simulation of SN 2005bl. In particular, the contribution of each escaping (virtual) MC packet to the final spectrum is colour-coded according to the last interaction partner. In addition to the contributions of the different elemental groups, packets that escaped without interacting straight from the inner boundary are shown as well ("photosphere"), together with packets that performed electron scatterings as their last interactions Relying on MCRT approaches, however, always comes at the cost of introducing statistical fluctuations into the solution process. Nevertheless, a variety of variance reduction techniques have been developed over the years to keep this noise component under control-many of these methods have been reviewed in this work. Also, conventional MCRT approaches are ill-suited for the application to optically thick environments and to problems with short cooling time scales. Extensions and modifications, particularly MC diffusion schemes and the IMC approach, have been developed to alleviate these deficiencies and have already found their application in astrophysical MCRT calculations.
Finally, we want to emphasize an important aspect of MCRT methods, the value of which should not be under-rated: the MCRT approach of performing a simulation of radiative transfer by following the propagation of packets is very intuitive since it closely resembles the microphysical processes realised in nature. Furthermore, the fundamental MCRT concepts are quite simple and basic computer programs can be developed quickly with only a handful of instructions. The directness of the physics and simplicity of the algorithms also mean that it is typically fairly easy to develop codes by gradually upgrading the physics: incorporating new physical processes rarely requires any fundamental overhaul. All this, together with the fact that many stateof-the-art MCRT simulation codes for astrophysical applications are open source and freely available, makes the entrance barrier quite low for the adoption of MCRT. As the continuous increase in the availability of computational resources seems to hold and since MC calculations can easily be distributed over multiple computation units, it seems more than likely that the success MCRT will continue. contributions, had a profound influence on the development of Monte Carlo radiative transfer techniques in astrophysics. One of us (SAS) had the privilege of working with LBL during his time at Imperial College London and wishes to acknowledge the large number of insightful discussions with LBL. We both (UMN/SAS) wish to thank Knox S. Long who has, for over a decade, been a collaborator and sounding board for many projects. UMN first came into contact with Monte Carlo radiative transfer when working with KSL, an experience that instilled a fascination for the subject that ultimately lead to the development of this work. Nick Higginbottom, Wolfgang Kerzendorf, Christian Knigge, James Matthews, Jorick Vink and Christian Vogl are thanked for many fruitful discussions on topics included in this review. We are also very grateful to Rémi Kazeroni and Jérôme Guilet for their help with the original publications in French that were used for the historical sketch of the Monte Carlo approach. Finally, we would like to express our sincere thanks to Markus Kromer and Wolfgang Hillebrandt for their thoughtful comments and suggestions in the prepration of this review, and for many interesting and productive collaborations over the years.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Test problems
For this review, we use a number of simple test problems to illustrate various techniques relevant for MCRT calculations. In the following, we introduce these test problems.

Homogeneous sphere
A frequently used test setup to verify and validate numerical approaches to RT is that of radiation emerging from a homogeneous sphere (see e.g. Smit et al. 1997;Abdikamalov et al. 2012). In particular, we consider a homogeneous sphere of radius R composed of a material with constant opacity χ and emissivity η (and thus with a constant source function S). The sphere is assumed to be surrounded by vacuum. The structure of the steady-state radiation field inside and outside of the sphere can be obtained from the formal solution to the transfer equation (cf. Smit et al. 1997 After performing the appropriate angle averaging (cf. Sect. 2), the moments of the specific intensity are obtained. Inside the sphere (r < R), they follow (cf. Smit et al. 1997) J (r ) = S 1 − For the test calculations presented in this work, specifically in Sect. 9, we adopt the parameters suggested by Abdikamalov et al. (2012). In particular, a homogeneous sphere with radius R = 10 km and the constant absorption opacity χ = 2.5 × 10 −4 cm −1 and source function S = 10 erg.cm −2 s −1 on the inside is considered. In the MCRT test simulations the sphere is embedded in a computational domain that extends out to r = 50 km and is divided into 100 equidistant shells. Figure 25 shows the analytic solution for this homogeneous sphere test problem in terms of J , H , and K according to Eqs. (A.5)-(A.10).
The homogeneous sphere problem is often discussed from a slightly different viewpoint, namely when the probability of photons escaping from such a sphere is of interest. This question has been discussed in detail by Osterbrock (1974, Appendix 2) and Osterbrock and Ferland (2006, Section 4.5) in the context of gaseous nebulae and an analytic expression is derived and presented there. By considering the emergent flux at the surface of the sphere [i.e. Eq. (A.9), evaluated at r = R] and relating it to the expected flux in the absence of absorption (i.e. χ = 0), the escape probability as a function of optical depth (τ ) is obtained (see also Fig. 26). Throughout Sects. 6 and 9, we determine the escape probability from the homogeneous sphere with MCRT simulations and compare the results to the predictions according to Eq. (A.11).

Line profile for a sphere in homologous expansion
To test the line formation process implemented in Sobolev-based MCRT schemes in Sect. 8.2 we use a simple setup, which aims at predicting the H Lyman α line profile emanating from a homologous flow composed of neutral hydrogen. Specifically, we consider a spherical domain with an inner and outer boundary at R inner and R outer .
The material is assumed to be in perfect homologous expansion, i.e. v = r /t, and we set the extent of the domain by choosing the time t = 13.5 d and the minimum and maximum material velocities v min = 10 −4 c and v max = 10 −2 c. We neglect timedependence and follow the propagation of N = 10 5 packets that are emitted from the lower boundary at R inner . Their initial frequency is uniformly drawn from the interval corresponding to the wavelength range λ min = 1185 Å to λ max = 1255 Å. We assume that line interactions proceed resonantly at the natural wavelength λ line = 1215 Å and that their strength throughout the flow is given by a constant Sobolev optical depth τ s = 1. All packets are followed until they either escape through the outer domain edge at R outer and contribute to the line profile or are back-scattered onto the inner boundary and are discarded. For this illustration, we only include the Doppler effect and further simplify the transformations by working in the weakly-relativistic limit, thus setting γ = 1. The MCRT simulation starts by launching the packets at the inner boundary. These packets are initialized with r ini = R inner , a frequency sampled from ν = ν min + ξ(ν max − ν min ), (A.12) a distance to the next interaction, τ , given by Eq. (45) and an initial propagation direction drawn from Eq. (43). At the beginning of the packet propagation, the distance to the outer domain edge is calculated Here, μ i is the incident propagation direction determined by Eq. (A.16) and μ e denotes the direction into which the packet emerges after scattering. In homologous flows, the Sobolev escape probability is (approximately) direction independent. Thus, a new propagation direction can be drawn isotropically and the packet propagation then continues. In this simple illustration, the ensuing propagation process is trivial. Since the packet only redshifts in a homologous flow, it cannot come into resonance and interact with the line transition again. What remains is to determine whether the packet has been backscattered or if it propagates towards the outer domain boundary. If the packet intersects the inner boundary and is discarded. Otherwise, it escapes and contributes with ν esc = ν to the line profile. The final line profile is determined after all packets have been processed by binning the frequencies of the escaping ones.

Model supernova
Lucy (2005) presented a simple but powerful test problem to verify the performance of RT schemes for the calculation of SN Ia light curves. In this spherically symmetric setup, radiative energy is non-uniformly generated throughout the calculation. This simulates the energy liberated in the decay of non-uniformly distributed 56 Ni. In particular, spherically symmetric ejecta in homologous expansion with a total mass of M tot = 1.39 M and uniform density are considered. The maximum material velocity is assumed to be v max = 10 4 km s −1 and the composition is chosen as follows: the inner ejecta regions up to M r = 0.5 M are entirely made up from 56 Ni. Its abundance then drops linearly until it reaches zero at M r = 0.75 M . The remaining mass is composed of carbon and oxygen in equal parts. The energy released in the decay of radioactive material is first distributed into MC packets representing γ -rays, which are propagated through the model SN. For these γ -packets, a purely absorptive specific cross section, κ = 0.03 cm 2 g −1 is assumed. When γ -packets are absorbed, they are instantly converted into MC packets representing the ultravioletoptical-infrared radiation field. 52 Interactions for these packets are treated as isotropic resonant scatterings and their strength is given by a uniform specific cross section σ = 0.1 cm 2 g −1 . A time-dependent MCRT simulation is performed by following the packets which represent the energy release from the decay until they leave through the ejecta surface. We note that we do not start the MCRT simulation promptly after explosion but at t = 3 d. This is advised since the ejecta are initially very optically thick and virtually opaque, rendering an application of a conventional MCRT approach (without the techniques outlined in Sect. 11) to determine the emergent radiation field inefficient and unnecessary. We simply expand the ejecta in accordance with the homologous velocity law to the start time of the simulation and keep track of the radioactive energy release up to this point. After accounting for adiabatic cooling, this energy constitutes the seed radiation field at the start of the MCRT simulations. We refer the reader to Lucy (2005) and Noebauer et al. (2012) for more details on the physical and numerical setup of this test problem.

Radiative shocks
Determining the structure of radiation-dominated shocks has become a standard test problem to verify and validate the performance of numerical schemes for RH. The theoretical foundations for an understanding and description of these phenomena have been laid out by Raizer (1957a) and Zel'dovich (1957a). 53 Their findings are summarized in the text book by Zel'dovich and Raizer (1967). In contrast to their hydrodynamical counterparts, radiation flows from the hot shocked domain into the cold unshocked region and pre-heats and pre-compresses the flow there. As a consequence, the sharp shock front becomes washed out. Depending on the amount of pre-heating one distinguishes between sub-and supercritical radiative shocks. In the latter, the material right in front of the hydrodynamic shock is heated to the temperature of the shocked material beyond the relaxation region. For more details, we refer to the standard literature on this subject, for example Zel'dovich and Raizer (1967), Sincell et al. (1999), and Lowrie and Edwards (2008). Non-steady radiative shocks: Ensman (1994) examined a number of different test problems to verify and validate numerical approaches to RT and RH. In particular, it was proposed to solve the time-dependent structure of non-steady radiative shocks. Over the years, this original suggestion has become a standard test to validate new numerical RH approaches (e.g. Turner and Stone 2001;Hayes and Norman 2003;Hayes et al. 2006;Commerçon et al. 2011;Noebauer et al. 2012;Kolb et al. 2013;Roth and Kasen 2015;Sijoy and Chaturvedi 2015). The non-steady radiative shock calculations presented in this work follow closely the suggestion by Ensman (1994). Specific details about the setup are given by Noebauer et al. (2012). The shock calculations are carried out in a plane-parallel computational domain of size L = 7 × 10 10 cm. The shock is generated by directing the flow, which has a uniform grey absorption opacity of χ = 3.1 × 10 −10 cm −1 and initial uniform density of ρ = 7.78 × 10 −8 g cm −3 , towards the left, reflecting boundary. Typically, two realisations of the test problem are considered that differ in the bulk velocity of the flow. With v = −6 × 10 5 cm s −1 , a subcritical shock emerges, while a super-critical shock is generated with v = −2 × 10 6 cm s −1 . Finally, an initial temperature structure with a slight gradient is chosen 54 : T (x) = 10 K + L − x L 75 K. (A.20) 53 The original publications in Russian are Raizer (1957b) and Zel'dovich (1957b). 54 Ensman (1994) found it necessary to include this slight gradient to avoid numerical problems in their calculations.
Following the suggestion by Hayes and Norman (2003), the shock structure calculated is displayed as a function of a pseudo-Lagrangian coordinate: This allows us to easily compare with the results of Ensman (1994) who used a Lagrangian code. Steady radiative shocks: While the calculation of the evolving structure of non-steady radiative shocks has become a standard test problem for radiation hydrodynamics, its value is somewhat limited by the lack of an analytic solution to the test problem. Thanks to the developments by Lowrie and Rauenzahn (2007) and Lowrie and Edwards (2008), this is not the case for steady radiative shocks: an analytic solution technique, first based on the equilibrium diffusion description of radiative transfer (Lowrie and Rauenzahn 2007) and later generalized by relying on non-equilibrium diffusion (Lowrie and Edwards 2008), has been presented. Following the original work, solving the structure of steady radiative shocks and comparing it to the predictions obtained with the solution technique of Lowrie and Edwards (2008) has quickly become an integral part of the standard test suite for RH approaches (e.g. Sekora and Stone 2010;van der Holst et al. 2011;Zhang et al. 2011;Davis et al. 2012;Jiang et al. 2012;Ramsey and Dullemond 2015;González et al. 2015;Roth and Kasen 2015). Lowrie and Rauenzahn (2007) and Lowrie and Edwards (2008) based their solution strategy on the non-dimensional form of the radiation hydrodynamical equations. In particular, reference length, mass density, material temperature and material sound speeds are used to convert the relevant physical quantities into their non-dimensional counterparts, which we denote with the tilde symbol: where reference values for each quantity are denoted by a hat symbol. While not required for the applicability of the solution approach, the steady radiation shock test problem is typically set up assuming a uniform absorption cross section. In this case, a particular shock realisation can be specified by choosing values for the dimensionless quantityP = a RT 4 ρâ 2 (A.29) the radiation diffusivityκ, an adiabatic index γ , an absorption cross sectionσ , a value for the shock Mach number M and a reference length. By convention, the pre-shock state is given byθ = 1,T = 1,ρ = 1 andṽ = M. In Sect. 11, we present the results of a M = 5 test calculation, withP = 10 −4 , κ = 1,σ = 10 6 andx = 1, obtained with the MC-based RH approach Mcrh (Noebauer et al. 2012). For the numerical setup in Mcrh, which relies on the dimensional form of physical quantities, the reference valuesx = 1 cm,â = 1.7310 × 10 7 cm s −1 , ρ = 0.3449 g cm −3 andT = 1.0810 × 10 6 K were used. A plane-parallel computational domain was chosen in which a central discontinuity separates two regions with constant fluid states. The jump fulfils the jump conditions of the radiation-plusmatter system as given by Lowrie and Rauenzahn (2007). Simple outflow boundary conditions are used for the hydrodynamic sub-system and MC packets reaching the boundaries freely escape. To counteract this outflow, inflow boundary conditions are set up for the radiative subsystem. Specifically, inflowing radiation is imposed with a rate corresponding to that of a thermal radiation field at the temperature given by the jump conditions and integrated over one hemisphere. The system is then followed until a steady-state has established which is then compared to the analytic predictions according to Lowrie and Edwards (2008). 55

Software collection
As part of this review, we freely distribute a number of MCRT Python programs used in the test calculations together with the configuration files for the Tardis simulation presented in Sect. 12. The interested reader can find all these in the GitHub repository at https://github.com/unoebauer/mcrtreview-tools.git. It contains the following tools and data files: -Python tool mcrt_hom_sphere.py to perform simple, time-independent MCRT simulations for the homogeneous sphere problem (cf. Appendix A.1) -Python script mcrt_escape_prop.py with which a simple MCRT simulation to determine the escape probability from a homogeneous sphere can be performed (cf. Appendix A.1 and Sect. 6.4) -Python script mcrt_pcyngi.py to determine the P-Cygni line profile formed in a homologously expanding spherical flow (cf. Appendix A.2 and Sect. 8.2) -Tardis setup files for the spectral synthesis calculations presented in Sect. 12. The setup consists of -the configuration file tardis_sn2005bl_m3_config.yml -the density structure file tardis_sn2005bl_m3_density.dat -the chemical composition file tardis_sn2005bl_m3_abundances.dat -text file containing information about the Tardis simulation facilitating reproduction; info.rst.