1 Introduction

The neutron transport equation (NTE) describes the flux of neutrons across a directional planar cross-section in an inhomogeneous fissile medium (typically measured as the number of neutrons per cm\(^2\) per second). As such, flux is described as a function of time, t, Euclidean location, \(r\in \mathbb {R}^3\), direction of travel, \(\Omega \in \mathbb {S}_2\), speed \(c>0\) (and hence velocity \(\upsilon = c\Omega \)), and neutron energy, \(E\in \mathbb {R}\). It is not uncommon in the physics literature, as indeed we shall do here, to assume that energy is a function of velocity (\(E = m|\upsilon |^2/2\)), thereby reducing the number of variables by one. This allows us to describe the dependency of flux more simply in terms of time and what we call the configuration variables \( (r, \upsilon ) \in D \times V\), where \(D\subseteq \mathbb {R}^3\) is a smooth, open, connected and bounded domain of concern such that \(\partial D\) has zero Lebesgue measure and V is the velocity space, which can now be taken to be \(V = \{\upsilon \in \mathbb {R}^3: \upsilon _{\texttt {min}}<|\upsilon |<\upsilon _{\texttt {max}}\}\), where \(0<\upsilon _{\texttt {min}}<\upsilon _{\texttt {max}}<\infty \).

Before stating the NTE, let us remind the reader of some elementary nuclear physics, which is required to describe the evolution of neutron flux. In the most basic of flux models, there are essentially only four processes at the level of the atomic nuclei which contribute to the evolution of neutron flux.

The first is spontaneous neutron emission from unstable nuclei. This comes from radioactive isotopes whose nuclei are excited. They cause what is known as non-transmutation emissions, in which a neutron is ejected with an escape velocity (neutron emission), or, conversely, what are called transmutation emissions in which the nucleus instantaneously fragments into two or more nuclei (spontaneous fission) with a range of possible masses, emitting one or more neutrons with escape velocities in the process.

The second process pertains to neutron scattering. This is where a neutron travelling with a given velocity passes in close proximity to an atomic nucleus, which, in our model, results in an instantaneous change of velocity.

The third process is neutron-induced fission. This is the classical setting in which a neutron travelling with a given velocity strikes an atomic nucleus sending it into an excited state, from which it instantaneously fragments into two or more nuclei, simultaneously releasing one or more neutrons.

The fourth and final process is neutron capture. In this setting, a neutron travelling with a given velocity strikes an atomic nucleus, but instead of causing nuclear fission, it is absorbed into the nucleus. It can also be the case that neutrons decay into other subatomic particles, and thus disappear from the system. To all intents and purposes, we can treat this as neutron capture.

When modelling the transmission of neutrons in a fissile material, those neutrons which have been released from nuclei are known as prompt neutrons.

With more advanced modelling, one can also take account of the fact that some of the processes described above can also involve other types of nuclear emissions, often in addition to neutrons. These include alpha and beta particles and gamma radiation. Whilst the former two are not sufficiently energetic to cause fission, sufficiently energetic gamma rays are able to induce fission.

Spontaneous fission and neutron-induced fission can also produce what are known as delayed neutrons. These are neutrons released from a fission product (isotope) some time after fission has occurred. In terms of modelling, they are spontaneous neutron emissions which occur at the site of neutron-induced fission but at a moment later in time. Delayed neutrons are only in a delayed state until they are released after which they are considered as prompt neutrons.

We refer to models which take account of the full range of flux profiles as multi-species models.

2 Neutron Transport Equation

Let us now write down the basic neutron transport equation (prompt neutrons only), which has been widely considered in a variety of physics and engineering literature (cf. [8, 28], to name but two classical references), and somewhat more sporadically studied in the mathematical literature. See [6, 17, 25] for the three most authoritative mathematical texts in more recent times, as well as e.g. [3, 4, 13, 22, 26] for some of the rarer examples of the probabilistic treatment of the NTE.

Neutron flux at time \(t\ge 0\) is henceforth identified as \(\Psi _t: {D}\times V\rightarrow [0,\infty )\), and the classical presentation of its evolution in time is given by the integro-differential equation, also known as the forward neutron transport equation,

$$\begin{aligned} \frac{\partial }{\partial t}\Psi _t(r, \upsilon )&=-\, \upsilon \cdot \nabla \Psi _t(r, \upsilon ) -\sigma (r, \upsilon )\Psi _t(r, \upsilon ) +Q(r,\upsilon , t)\nonumber \\&\quad +\,\int _{V}\Psi _t(r, \upsilon ') \sigma _{\texttt {s}}(r, \upsilon ') \pi _{\texttt {s}}(r, \upsilon ', \upsilon ){ d }\upsilon '\nonumber \\&\quad +\,\int _{V}\Psi _t(r, \upsilon ')\sigma _{\texttt {f}}(r, \upsilon ') \pi _{\texttt {f}}(r, \upsilon ', \upsilon ){ d }\upsilon ', \end{aligned}$$
(2.1)

where the different components (or cross-sections as they are known in the physics literature) are all uniformly bounded and measurable with the following interpretation:

$$\begin{aligned} \sigma _{\texttt {s}}(r, \upsilon ')&: \text { the rate at which scattering occurs from incoming velocity }\upsilon ',\\ \sigma _{\texttt {f}}(r, \upsilon ')&: \text { the rate at which fission occurs from incoming velocity }\upsilon ',\\ \sigma (r, \upsilon )&: \text { the sum of the rates } \sigma _{\texttt {f}}+ \sigma _{\texttt {s}}, \text { also known as the } { total} \text { cross section} \\ \pi _{\texttt {s}}(r, \upsilon ', \upsilon ){ d }\upsilon '&: \text { the scattering yield at velocity}\, \upsilon \text { from incoming velocity } \upsilon ', \\&\quad \text { satisfying }\textstyle {\int _{V}}\pi _{\texttt {s}}(r, \upsilon , \upsilon '){ d }\upsilon '=1,\\ \pi _{\texttt {f}}(r, \upsilon ', \upsilon ){ d }\upsilon '&: \text { the neutron yield at velocity } \upsilon \text { from fission with incoming velocity } \upsilon ',\\&\quad \text { satisfying }\textstyle {\int _{V}}\pi _{\texttt {f}}(r, \upsilon , \upsilon '){ d }\upsilon '<\infty , \text { and }\\ Q(r,\upsilon , t)&: \text { non-negative source term. } \end{aligned}$$

It is normal to assume that all quantities are uniformly bounded away from infinity. It is also usual to assume the additional boundary conditions

$$\begin{aligned} \left\{ \begin{array}{ll} \Psi _0(r, \upsilon ) = g(r, \upsilon ) &{}\text { for }r\in D, \upsilon \in {V},\\ &{}\\ \Psi _t(r, \upsilon )= 0&{} \text { for } t\ge 0 \text { and } r\in \partial D \text { if }\upsilon \cdot \mathbf{n}_r<0, \end{array} \right. \end{aligned}$$
(2.2)

where \(\mathbf{n}_r\) is the outward facing normal of D at \(r\in \partial D\) and \(g: D\times {V}\rightarrow [0,\infty )\) is a bounded, measurable function which we will later assume has some additional properties. Roughly speaking, as the forward equation describes where particles could have evolved from in order to contribute to the current configuration, this boundary condition means that particles from outside the domain with incoming velocity are not taken into account. The second of the above two boundary conditions is sometimes written \(\Psi _t|_{\partial D^-} =0\), where \(\partial D^- = \{(r,\upsilon )\in \partial D\times V:\upsilon \cdot \mathbf{n}_r<0 \}\). It is also usual to set \(Q =0\) when considering a reactor with a multiplying medium, as the resulting fission will overwhelm the radioactive source term.

Seeking a solution to (2.1) in the classical, pointwise sense turns out to be too strong a requirement to make mathematical sense. This is predominantly due to the non-diffusive nature of the equation, in particular the non-local nature of the scattering and fission operators, as well as regularity issues on the domain \(D\times V\) in relation to continuity properties of e.g. the operator \(\upsilon \cdot \nabla \). It is much more natural to look for solutions that belong to e.g. an appropriate \(L_2\) space. This is, moreover, helpful when looking to understand (2.1) as a backwards equation, rather than a forwards equation.

With some rearrangements, the components of (2.1) separate into transport, scattering and fission. Specifically,

$$\begin{aligned} \left\{ \begin{array}{rll} {\overset{_\rightarrow }{\texttt {T}}}{g}(r, \upsilon ) &{}:= - \upsilon \cdot \nabla {g}(r, \upsilon ) - \sigma (r,\upsilon ){g}(r, \upsilon ) &{}\text { (forwards transport) }\\ &{}\\ {\overset{_\rightarrow }{\texttt {S}}}{g}(r, \upsilon ) &{}:= \int _{V}{g}(r, \upsilon ') \sigma _{\texttt {s}}(r, \upsilon ')\pi _{\texttt {s}}(r, \upsilon ', \upsilon ){ d }\upsilon ' &{}\text { (forwards scattering) }\\ &{}\\ {\overset{_\rightarrow }{\texttt {F}}}{g}(r, \upsilon ) &{}: = \int _{V}{g}(r, \upsilon ') \sigma _{\texttt {f}}(r, \upsilon ') \pi _{\texttt {f}}(r, \upsilon ', \upsilon ){ d }\upsilon '&{}\text { (forwards fission)} \end{array} \right. \end{aligned}$$
(2.3)

such that all operators are defined on \(D\times V\) and their action is zero otherwise. Let us momentarily consider the operator on the right-hand side of (2.1) as acting on \(L_2(D\times V)\), the space of square integrable functions on \(D\times V\), and write

$$\begin{aligned} \langle f, g\rangle = \int _{ D\times V}f(r,\upsilon )g(r,\upsilon ){ d }r{ d }\upsilon \end{aligned}$$

for the associated inner product. Note that, for \(f,g\in L_2(D\times V)\) such that both \(\upsilon \cdot \nabla f\) and \(\upsilon \cdot \nabla g\) are well defined as distributional derivatives, which are also in the space \( L_2(D\times V)\), with g respecting the second of the boundary conditions in (2.2), we can verify with a simple integration by parts that, for \(\upsilon \in V\),

$$\begin{aligned} \langle f, \upsilon \cdot \nabla g \rangle&= \int _{\partial D\times V} (\upsilon '\cdot \mathbf{n}_r) f(r,\upsilon ')g(r,\upsilon ') { d }r{ d }\upsilon ' -\langle \upsilon \cdot \nabla f,g \rangle = -\langle \upsilon \cdot \nabla f,g \rangle \end{aligned}$$
(2.4)

provided we insist that f respects the boundary condition \(f (r, \upsilon )= 0\) for \(r\in \partial D\) if \(\upsilon \cdot \mathbf{n}_r>0\). Moreover, Fubini’s theorem also tells us that, for example, with \(f, g\in L_2(D\times V)\),

$$\begin{aligned} \left\langle f, \int _{V} g(\cdot , \upsilon ')\sigma _{\texttt {s}}(\cdot ,\upsilon ')\pi _{\texttt {s}}(\cdot , \upsilon ', \cdot ){ d }\upsilon ' \right\rangle&=\int _{{D}\times V\times V} f(r,\upsilon )\sigma _{\texttt {s}}(r,\upsilon ')g(r, \upsilon ')\pi _{\texttt {s}}(r, \upsilon ', \upsilon ){ d }\upsilon ' { d }r { d }\upsilon \\&=\int _{{D}\times V}\sigma _{\texttt {s}}(r,\upsilon ') \int _V f(r,\upsilon )\pi _{\texttt {s}}(r, \upsilon ', \upsilon ) { d }\upsilon \, g(r, \upsilon '){ d }r{ d }\upsilon ' \\&=\langle \sigma _{\texttt {s}}(\cdot , \cdot ) \int _V f(\cdot ,\upsilon )\pi _{\texttt {s}}(\cdot , \cdot , \upsilon ) { d }\upsilon , g \rangle . \end{aligned}$$
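The Fubini computation above can be checked numerically in a toy setting. The following is a minimal sketch only: it fixes the position r, replaces V by a hypothetical one-dimensional grid of speeds, and uses an invented scattering rate and kernel (`SIG`, `P` and all function names are assumptions made for illustration, not quantities from the paper). It compares \(\langle f, \int g\sigma _{\texttt {s}}\pi _{\texttt {s}}\rangle \) with \(\langle \sigma _{\texttt {s}}\int f\pi _{\texttt {s}}, g\rangle \) under a midpoint quadrature rule.

```python
import math

# Hypothetical one-dimensional velocity grid standing in for V, at a fixed r.
N = 40
V_MIN, V_MAX = 1.0, 2.0
H = (V_MAX - V_MIN) / N                      # midpoint-rule cell width
GRID = [V_MIN + (k + 0.5) * H for k in range(N)]

# Illustrative scattering rate sigma_s(v'); not taken from any data library.
SIG = [1.0 + 0.5 * math.sin(v) for v in GRID]


def build_kernel():
    """Return P[i][j] ~ pi_s(v_i, v_j), normalised over the outgoing index j."""
    kernel = []
    for v_in in GRID:
        row = [math.exp(-(v_out - 0.8 * v_in) ** 2) for v_out in GRID]
        total = sum(row) * H
        kernel.append([x / total for x in row])
    return kernel


P = build_kernel()


def forward_scatter(g):
    """(v) -> int g(v') sigma_s(v') pi_s(v', v) dv', the forwards scattering integral."""
    return [sum(g[i] * SIG[i] * P[i][j] for i in range(N)) * H for j in range(N)]


def backward_scatter_gain(f):
    """(v) -> sigma_s(v) int f(v') pi_s(v, v') dv', the gain part of backwards scattering."""
    return [SIG[i] * sum(f[j] * P[i][j] for j in range(N)) * H for i in range(N)]


def inner(f, g):
    """Discrete version of the L_2 inner product <f, g> on the velocity grid."""
    return sum(a * b for a, b in zip(f, g)) * H


f = [math.exp(-v) for v in GRID]
g = [v ** 2 for v in GRID]
lhs = inner(f, forward_scatter(g))
rhs = inner(backward_scatter_gain(f), g)
print(lhs, rhs)  # the two pairings agree up to floating-point rounding
```

In this discrete setting the identity is simply the statement that the matrix of the forwards integral is the transpose of the matrix of the backwards gain term, which is what the exchange of integration order expresses.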

These computations tell us that, for \(f, g \in L_2({D}\times V)\) such that \(\upsilon \cdot \nabla g\) and \(\upsilon \cdot \nabla f\) are well defined in the distributional sense and, moreover, that \(g(r, \upsilon )= 0\) for \(r\in \partial D\) if \(\upsilon \cdot \mathbf{n}_r<0\), and for \(f \in L_2({D}\times V)\) such that \(f (r, \upsilon )= 0\) for \(r\in \partial D\) if \(\upsilon \cdot \mathbf{n}_r>0\),

$$\begin{aligned} \langle f, (\overset{_\rightarrow }{\texttt {T}}+ \overset{_\rightarrow }{\texttt {S}}+ \overset{_\rightarrow }{\texttt {F}})g\rangle = \langle (\overset{_\leftarrow }{\texttt {T}}+ \overset{_\leftarrow }{\texttt {S}}+ \overset{_\leftarrow }{\texttt {F}}) f, g\rangle , \end{aligned}$$

where now we identify the transport, scattering and fission operators as

$$\begin{aligned} \left\{ \begin{array}{rll} {\overset{_\leftarrow }{\texttt {T}}}{f}(r, \upsilon ) &{}:= \upsilon \cdot \nabla {f}(r, \upsilon ) &{}\text { (backwards transport) }\\ &{}\\ {\overset{_\leftarrow }{\texttt {S}}}{f}(r, \upsilon ) &{}:= \sigma _{\texttt {s}}(r, \upsilon )\int _{V}{f}(r, \upsilon ') \pi _{\texttt {s}}(r, \upsilon , \upsilon ') { d }\upsilon ' - \sigma _{\texttt {s}}(r, \upsilon ){f}(r, \upsilon ) &{}\text { (backwards scattering) }\\ &{}\\ {\overset{_\leftarrow }{\texttt {F}}}{f}(r, \upsilon ) &{}: = \sigma _{\texttt {f}}(r, \upsilon ) \int _{V}{f}(r, \upsilon ') \pi _{\texttt {f}}(r, \upsilon , \upsilon '){ d }\upsilon ' -\sigma _{\texttt {f}}(r, \upsilon )f (r,\upsilon )&{}\text { (backwards fission)} \end{array} \right. \end{aligned}$$
(2.5)

such that all operators are defined on \(D\times V\) with zero action otherwise. The reader will immediately note that, although the sum \({\overset{_\leftarrow }{\texttt {T}}} + {\overset{_\leftarrow }{\texttt {S}}} + {\overset{_\leftarrow }{\texttt {F}}}\) is identifiable as the adjoint of the sum \({\overset{_\rightarrow }{\texttt {T}}} + {\overset{_\rightarrow }{\texttt {S}}} + {\overset{_\rightarrow }{\texttt {F}}}\), the same cannot be said for the individual ‘T’, ‘S’ and ‘F’ operators. That is to say, the way we have grouped the terms does not allow us to say that e.g. \({\overset{_\leftarrow }{\texttt {T}}}\) is the adjoint operator to \({\overset{_\rightarrow }{\texttt {T}}}\) and so on.

The reason for this difference in grouping of terms lies with how one reads the operators in terms of infinitesimal generators as a probabilist. Although this will not make any difference in the analysis of this paper, we keep to this notation for the sake of consistency with further related articles which offer a probabilistic perspective on the backwards NTE; see [5, 12, 14].

Roughly speaking, \({\overset{_\leftarrow }{\texttt {T}}}\), with an appropriately defined domain, is the generator of the rather simple Markov process consisting of a deterministic motion with velocity \(\upsilon \), i.e. transport due to pure advection, with killing on exiting the domain D. Similarly, with an appropriately defined domain, the operator \({\overset{_\leftarrow }{\texttt {S}}}\) is the generator corresponding to scattering, in which a particle travelling with velocity \(\upsilon \) at position r is removed at rate \(\sigma _{\texttt {s}}\) and replaced by a new particle at r with velocity \(\upsilon '\) chosen with probability \(\pi _{\texttt {s}}(r,\upsilon ,\upsilon '){ d }\upsilon '\). Taking advantage of the fact that \(\textstyle { \int _V\pi _{\texttt {s}}(r,\upsilon , \upsilon '){ d }\upsilon '=1}\), we can also write

$$\begin{aligned}&\sigma _{\texttt {s}}(r, \upsilon )\int _{V}{f}(r, \upsilon ')\pi _{\texttt {s}}(r, \upsilon , \upsilon ') { d }\upsilon ' -\sigma _{\texttt {s}}(r, \upsilon ) {f}(r, \upsilon )\\&\quad =\sigma _{\texttt {s}}(r, \upsilon )\int _{V}[{f}(r, \upsilon ') - {f}(r, \upsilon ) ]\pi _{\texttt {s}}(r, \upsilon , \upsilon ') { d }\upsilon ' \end{aligned}$$

and also note that it takes the classical form of a difference operator. Finally, \({\overset{_\leftarrow }{\texttt {F}}}\) is the generator action of a fission event in which a particle travelling with velocity \(\upsilon \) at position r is removed at rate \(\sigma _{\texttt {f}}\) and replaced by an average number of particles \(\pi _{\texttt {f}}(r,\upsilon ,\upsilon '){ d }\upsilon '\) moving onwards from r with velocity \(\upsilon '\).
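The three generator actions just described (advection with killing at the boundary, scattering, fission) can be pictured with a small simulation. What follows is a rough sketch under strong simplifying assumptions that are not made in the text: the domain is a one-dimensional rod \((0, \texttt {length})\), velocities are \(\pm 1\), the cross sections are constants, scattered and fission particles pick a new direction uniformly, and each fission releases 2 or 3 particles with equal probability. The names `count_alive` and `estimate_mean` are hypothetical.

```python
import random


def count_alive(t_end, r0, v0, length, sigma_s, sigma_f, rng):
    """Simulate one branching tree on the rod (0, length); return the number alive at t_end."""
    total = sigma_s + sigma_f
    stack = [(0.0, r0, v0)]  # entries are (current time, position, velocity)
    alive = 0
    while stack:
        t, r, v = stack.pop()
        remaining = t_end - t
        # Time to the next collision (scatter or fission), exponential with rate total.
        tau = rng.expovariate(total) if total > 0 else float("inf")
        # Time at which pure advection would carry the particle out of the rod.
        exit_time = (length - r) / v if v > 0 else r / -v
        if exit_time <= min(tau, remaining):
            continue  # killed on exiting the domain
        if remaining <= tau:
            alive += 1  # no further collision before t_end
            continue
        r_new = r + v * tau
        if rng.random() * total < sigma_s:
            # Scattering: the particle is replaced by one with a new velocity.
            stack.append((t + tau, r_new, rng.choice((-1.0, 1.0))))
        else:
            # Fission: the particle is replaced by 2 or 3 new particles.
            for _ in range(rng.choice((2, 3))):
                stack.append((t + tau, r_new, rng.choice((-1.0, 1.0))))
    return alive


def estimate_mean(t_end, r0, v0, length, sigma_s, sigma_f, n_runs, seed=0):
    """Average the particle count over independent trees (a crude Monte Carlo estimate)."""
    rng = random.Random(seed)
    runs = (count_alive(t_end, r0, v0, length, sigma_s, sigma_f, rng) for _ in range(n_runs))
    return sum(runs) / n_runs


print(estimate_mean(1.0, 0.5, 1.0, 2.0, 1.0, 0.5, 2000))
```

Averaging a test function over the surviving particles, rather than simply counting them, gives the kind of expectation that the backwards equation is meant to describe; the constant test function is used here only to keep the sketch short.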

This leads us to the so called backwards neutron transport equation (which is also known as the adjoint neutron transport equation) given by

$$\begin{aligned} \frac{\partial }{\partial t}\psi _t(r, \upsilon )&= \upsilon \cdot \nabla \psi _t(r, \upsilon ) -\sigma (r, \upsilon )\psi _t(r, \upsilon )\nonumber \\&\quad +\, \sigma _{\texttt {s}}(r, \upsilon )\int _{V}\psi _t(r, \upsilon ') \pi _{\texttt {s}}(r, \upsilon , \upsilon '){ d }\upsilon ' \nonumber \\&\quad +\,\sigma _{\texttt {f}}(r, \upsilon ) \int _{V}\psi _t(r, \upsilon ') \pi _{\texttt {f}}(r, \upsilon , \upsilon '){ d }\upsilon ', \end{aligned}$$
(2.6)

with additional boundary conditions

$$\begin{aligned} \left\{ \begin{array}{ll} \psi _0(r, \upsilon ) = g(r, \upsilon ) &{}\text { for }r\in D, \upsilon \in {V}, \\ &{} \\ \psi _t(r, \upsilon ) = 0&{} \text { for } t \ge 0 \text { and } r\in \partial D \text { if }\upsilon \cdot \mathbf{n}_r>0. \end{array} \right. \end{aligned}$$
(2.7)

As before, the second of these two conditions is often written \(\psi _t|_{\partial D^+} = 0\), where \(\partial D^+ : = \{(r,\upsilon )\in \partial D\times V : \upsilon \cdot \mathbf{n}_r>0 \}\).
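To make the two boundary sets concrete, here is a small sketch that classifies a boundary configuration \((r,\upsilon )\) by the sign of \(\upsilon \cdot \mathbf{n}_r\). It assumes, purely for illustration, that D is a ball centred at the origin, so that the outward normal at \(r\in \partial D\) is \(r/|r|\); the function names are hypothetical.

```python
import math


def outward_normal(r):
    """Outward unit normal at a boundary point r of a ball centred at the origin."""
    norm = math.sqrt(sum(x * x for x in r))
    return tuple(x / norm for x in r)


def boundary_part(r, v):
    """Classify (r, v) by the sign of v . n_r."""
    flow = sum(a * b for a, b in zip(v, outward_normal(r)))
    if flow < 0:
        return "inflow"    # (r, v) lies in dD^-, where the forward flux is set to zero
    if flow > 0:
        return "outflow"   # (r, v) lies in dD^+, where the backward solution is set to zero
    return "tangential"    # v . n_r = 0 belongs to neither set


print(boundary_part((1.0, 0.0, 0.0), (-2.0, 1.0, 0.0)))  # inflow
print(boundary_part((0.0, 0.0, 2.0), (0.5, 0.0, 1.0)))   # outflow
```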

The NTE has played a prominent role in real-world modelling and, for many years, has found a home in commercial software which is used in the nuclear safety industry. In particular, this is most prominent in the modelling and design of environments which are exposed to radioactive material, from nuclear reactor cores and hospital equipment, through to equipment used to irradiate produce that is sold in supermarkets, thereby prolonging its shelf-life. More recently, with the notion of human interplanetary space exploration becoming less of a sci-fi fantasy and more of a fast approaching reality, an understanding of long-lasting and compact nuclear power sources, e.g. for Moon or Mars bases, has become increasingly important.

Figure 1 illustrates a typical geometrical model of a reactor core rod, cladding and outer shielding. The structural design of such a reactor can easily be stored as a virtual environment (i.e. storing the coordinates of the different geometrical domains and the material properties in each domain) with around 150 MB of data, on to which extensive data libraries of numerical values for the respective quantities \( \sigma _{\texttt {s}}, \sigma _{\texttt {f}}, \pi _{\texttt {s}}, \pi _{\texttt {f}}\) can be mapped. (It is an otherwise little known fact that countries which are heavily invested in nuclear power, such as the UK, USA, France, China, etc., are all in possession of such numerical libraries of cross sections, which have been carefully built up over decades.)

Fig. 1 A virtual model of a nuclear reactor core with colour indicating the respective fissile properties of the virtual materials used. Uranium rods are arranged into hexagonal cells which are arranged within a larger containment casing (Color figure online)

One of the principal ways in which neutron flux is understood is to look for the leading eigenvalue and associated ground state eigenfunction. Roughly speaking, this means looking for an associated triple of eigenvalue \(\lambda \in \mathbb {R}\), non-negative right eigenfunction \(\varphi : {D}\times V\rightarrow [0,\infty )\) in \(L_2(D\times V)\) satisfying \(\varphi |_{\partial D^+} =0\) and a non-negative left eigenfunction \({\tilde{\varphi }}\) on \({D}\times V\) in \(L_2(D\times V)\) satisfying \({\tilde{\varphi }}|_{\partial D^-} = 0\) such that

$$\begin{aligned} \lambda \langle \varphi , f \rangle =\langle ( {\overset{_\leftarrow }{\texttt {T}}}+ {\overset{_\leftarrow }{\texttt {S}}} + {\overset{_\leftarrow }{\texttt {F}}})\varphi , f \rangle \quad \text { and } \quad \lambda \langle f, \tilde{\varphi } \rangle =\langle ( {\overset{_\leftarrow }{\texttt {T}}}+ {\overset{_\leftarrow }{\texttt {S}}} + {\overset{_\leftarrow }{\texttt {F}}})f, \tilde{\varphi } \rangle . \end{aligned}$$

As such, this introduces the notion of fissile stability, in particular in the case that \(\lambda = 0\). This is naturally the desired scenario for a nuclear reactor.

In the physics literature, it is thus often understood that, to leading order, the NTE (2.6) is solved in the approximate sense

$$\begin{aligned} \psi _t(r,\upsilon ) = \mathrm{e}^{\lambda t}\langle g, \tilde{\varphi }\rangle \varphi (r, \upsilon ) + o(\mathrm{e}^{\lambda t}), \quad t\ge 0. \end{aligned}$$
(2.8)

Note that the scenario that \(\lambda >0\) is obviously to be avoided in practice as this would correspond to a set-up that could result in exponential growth in fission.

The approximation (2.8) can be seen as a functional version of the Perron–Frobenius Theorem and has given rise to a number of different numerical methods for estimating the value of the eigenvalue \(\lambda \) as well as the eigenfunctions \(\varphi \) and \(\tilde{\varphi }\). One approach pertains to the discretisation of (2.1) followed by the use of numerical analytic methods; see [31]. Another pertains to the previously alluded to identification of the solution to the NTE as the linear semigroup of a Markov branching process, which in turn implies Monte Carlo methods involving the simulation of the aforesaid branching process. Such methods are computationally expensive, as branching processes, being tree-like structures, are complex to simulate, e.g. from the point of view of parallelisation. In related papers to this one, we will discuss a new Monte Carlo approach to the NTE based on some of the stochastic analysis we deal with in this article as well as in related work undertaken by the authors of this paper; see [5, 12, 14].
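The Perron–Frobenius picture behind (2.8) is easiest to see in a finite-state toy problem. The sketch below is an illustration only and is not one of the numerical methods cited above: the \(3\times 3\) matrix `A`, with non-negative off-diagonal entries, is an invented stand-in for the backwards operator, the leading eigentriple is found by plain power iteration on a shifted matrix, and the asymptotics are checked for the discrete-time analogue \(\psi _{n} = (I + hA)^n g\), whose leading behaviour is \((1+h\lambda )^n\langle g, \tilde{\varphi }\rangle \varphi \) when \(\langle \varphi , \tilde{\varphi }\rangle = 1\).

```python
def mat_vec(a, x):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def dot(x, y):
    return sum(p * q for p, q in zip(x, y))


def power_iteration(b, n_iter=500):
    """Leading eigenvalue and eigenvector (entries summing to 1) of a primitive non-negative matrix."""
    x = [1.0 / len(b)] * len(b)
    mu = 0.0
    for _ in range(n_iter):
        y = mat_vec(b, x)
        total = sum(y)
        mu = total / sum(x)
        x = [yi / total for yi in y]
    return mu, x


# Invented generator-like matrix: non-negative off-diagonals, negative diagonal.
A = [[-1.0, 0.6, 0.2],
     [0.3, -0.8, 0.4],
     [0.5, 0.1, -0.9]]
SHIFT = 1.0  # A + SHIFT * I is entrywise non-negative, so power iteration applies
B = [[a + (SHIFT if i == j else 0.0) for j, a in enumerate(row)] for i, row in enumerate(A)]

mu, phi = power_iteration(B)                    # right eigenvector: A phi = lam phi
_, phi_tilde = power_iteration(transpose(B))    # left eigenvector: phi_tilde A = lam phi_tilde
lam = mu - SHIFT
scale = dot(phi, phi_tilde)
phi_tilde = [p / scale for p in phi_tilde]      # normalise so that <phi, phi_tilde> = 1

# Discrete-time analogue of (2.8): iterate psi <- (I + h A) psi and compare.
H_STEP, N_STEPS = 0.1, 300
M = [[H_STEP * a + (1.0 if i == j else 0.0) for j, a in enumerate(row)] for i, row in enumerate(A)]
g = [1.0, 2.0, 0.5]
psi = list(g)
for _ in range(N_STEPS):
    psi = mat_vec(M, psi)
predicted = [(1.0 + H_STEP * lam) ** N_STEPS * dot(g, phi_tilde) * p for p in phi]
print(lam, psi, predicted)
```

Here the leading eigenvalue is negative, so the toy system is subcritical; the point of the comparison is only that, after many steps, the state is proportional to the right eigenvector with a constant fixed by pairing the initial data with the left eigenvector.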

The aim of this paper is manifold. First and foremost, we aim to reposition the theory of the NTE into a contemporary probabilistic setting. We will do this by explaining a precise relationship between the NTE and two different families of Markov processes via Feynman–Kac type formulae. Indeed, this article is one of a cluster of forthcoming pieces of work, which take a new and predominantly probabilistic point of view of the NTE; cf. [5, 12, 14]. Next we want to introduce the notion of the (multi-species) NTE into the literature, which generalises (2.1) by simultaneously modelling the flux of all species of particles and radiation involved in the process of nuclear fission. In doing so we will show that, just as in the classical setting, one may develop the notion of a lead eigenvalue and eigenfunction, which is an important part of describing fissile stability. As such, the current article is part review of existing theory and part presentation of new research results based on generalisation of existing results.

Together with the accompanying papers [5, 12, 14], we believe that the probabilistic perspective presented here, i.e. coupling the solutions to the NTE with averaging procedures of certain Markov processes, opens up the possibility of many questions that can be considered at depth in the arena of stochastic analysis and Monte Carlo algorithms, which are currently missing from the literature. Indeed, returning to the kind of environments seen in Fig. 1, there are many questions concerning how to analyse and numerically generate the leading eigenfunctions and eigenvalue to a reasonable degree of precision. Such questions might include: What is the connection of the eigendecomposition discussed in this paper and e.g. R-theory or the theory of general Harris recurrence for stochastic processes (cf. [9, 23, 24])? How do different stochastic representations lead to different Monte Carlo simulations? Based on stochastic representation, how does one measure convergence of Monte Carlo algorithms? How strong can they be predicted to be? What kind of variance reduction techniques does stochastic representation suggest? Does the inclusion of multi-species models make estimation of the leading eigenvalue more accurate?

3 Organisation of the Paper

In the next section, we give a brief overview of the key mathematical literature for the NTE. (Note we do not stray beyond mathematical literature, as the physics and engineering literature is significantly more expansive.) Thereafter in Sect. 5, we introduce the multi-species NTE (MNTE) and its rigorous formulation, existence, uniqueness and asymptotics in the setting of an abstract Cauchy problem. In particular, we show how the unique solution is identified as a \(c_0\)-semigroup in the appropriate \(L_2\) space. In Sect. 6, we introduce a spatial branching process that is constructed using the cross sections that appear in the NTE to describe its stochastic evolution. Here we introduce its expectation semigroup. In Sect. 7, we provide a second stochastic representation of the expectation semigroup introduced in the previous section via the classical method of the many-to-one formula.

Ideally, we would like to claim that the expectation semigroup discussed in Sects. 6 and 7 agrees with the \(c_0\)-semigroup introduced in Sect. 5 (its formal definition appearing just above Theorem 5.2). This is particularly desired as it forms the foundations of how Monte Carlo simulation of the physical process can be used to develop a numerical solution to the MNTE. In Sect. 8, we consolidate the two notions of semigroup and show that there is partial agreement in an appropriate sense. As far as we are aware, this is a point which is currently not clearly discussed in the literature. Finally we end the paper with a proof of one of the main theorems in Sect. 6 which provides the asymptotic behaviour of the solution to the MNTE in terms of the lead eigenfunction. This is a new result in the multi-species setting in the sense that we have allowed for multiple types of prompt emissions (both particles and radioactive emissions) rather than the case of only one type of prompt emission dealt with in [25]; we also allow for multiple types of delayed emissions (that is, emissions that are pre-emptively held in an unstable radioactive isotope product from an earlier fission event). Our proof nonetheless takes inspiration from the classical approach of [6, 25], and remains loyal to the techniques there.

4 Historical Remarks on the Mathematical Treatment of the NTE

Classical texts such as Davison and Sykes [8] were once hailed as a bible of mathematical knowledge during the 1950s post-Manhattan Project era when rapid technological advances led to the construction of the very first nuclear reactors driving commercial power stations. Around this time, there was an understanding of how to treat the NTE in special geometries and also by imposing isotropic scattering and fission, see for example Lehner [18] and Lehner and Wing [19, 20]. It was also understood quite early on that the natural way to state the NTE is via the linear differential transport equation associated to a suitably defined operator on a Banach space. Moreover, it was understood that in this formulation, a spectral decomposition should play a key role in representing solutions, see e.g. Jörgens [15], Pazy and Rabinowitz [29]. This notion was promoted by the work of R. Dautray and collaborators, who showed how \(c_0\)-semigroups form a natural framework within which one may analyse the existence and uniqueness of solutions to the NTE; see [6, 7]. Moreover, a similar approach has also been pioneered by Mokhtar-Kharroubi [25].

The probabilistic interpretation of the NTE was appreciated from the very first treatments of the NTE (see e.g. [8] and references therein, as well as Bell [2]). Indeed, the physical description of nuclear fission, when governed by basic principles, allowing for additional randomness, is nothing more than a branching Markov process. Numerous derivations of the NTE from this perspective can be found in the literature to various degrees of rigour; see e.g. Bell [2], Mori et al. [26], Pazy and Rabinowitz [30], Lewins [21] and Pázsit and Pál [28].

A more modern treatment of the probabilistic representation through Feynman–Kac expectation semigroups and the connection to the theory of Markov diffusions is found in Dautray et al. [7]. A purely probabilistic treatment can be found in Lapeyre et al. [17]. See also the accompanying papers to this one [5, 12, 14].

We finish this section by noting that there is a body of literature that pertains to the numerical analysis of the NTE. Recent work in this field, including the notion of uncertainty quantification, can be found in e.g. [22, 27, 31]. See also references therein.

5 Multi-species (Backwards) Neutron Transport Equation

In the following discussion, rather than talk about typed particles, we prefer to say typed ‘emissions’ as the different types correspond to particles, electromagnetic rays (e.g. gamma rays) and isotopes (which are considered to be carriers for delayed emissions).

Let us now introduce an advanced version of the NTE, which takes account of both non-transmutation emissions as well as transmutation emissions, in particular, allowing for the inclusion of all types of emissions, prompt neutrons, delayed neutrons, alpha, beta and gamma emissions etc. An important feature (and arguably a restriction) of our model is that only prompt neutrons can produce delayed emissions.

In order to keep track of the various emission types, we define the type space \(I :=\{1, \dots , m\}\) for some \(m \in \mathbb {N}\), ordered such that

$$\begin{aligned} \text {type } 1\text { emissions:}&\text { prompt neutrons (neutrons released immediately after fission)}\\ \text {types }2, \dots ,\ell \text { emissions:}&\text { other prompt emissions (e.g. alpha, beta, gamma emissions)}\\ \text {types }\ell +1, \dots , m\text { emissions:}&\text { isotopes (holding types/precursors) that hold delayed emissions.} \end{aligned}$$

Finally, the set of admissible velocities for each of the types i can be embedded within a common space \(V = \{\upsilon \in \mathbb {R}^3: \upsilon _{\texttt {min}}\le |\upsilon |\le \upsilon _{\texttt {max}}\}\), with \(0< \upsilon _{\texttt {min}} \le \upsilon _{\texttt {max}}< \infty \). We now consider the flux, \(\psi _t(i,r, \upsilon ) \), of type i emissions through a given region \(r\in D\) with velocity \(\upsilon \in V\) at time \(t\ge 0\). We are interested in the so-called multi-species neutron transport equation (MNTE) which takes the form

$$\begin{aligned} \frac{\partial }{\partial t}\psi _t(i,r, \upsilon )&= \upsilon \cdot \nabla \psi _t(i,r, \upsilon ) -\sigma ^i(r, \upsilon )\psi _t(i,r, \upsilon ) \nonumber \\&\quad +\, \sigma ^i_{\texttt {s}}(r, \upsilon ) \int _{V}\psi _t(i,r, \upsilon ')\pi ^i_{\texttt {s}}(r, \upsilon , \upsilon '){ d }\upsilon '\nonumber \\&\quad +\, \sigma ^i_{\texttt {f}}(r, \upsilon )\sum _{j = 1}^\ell \int _{V}\psi _t(j,r, \upsilon ') \pi ^{i,j}_{\texttt {f}}(r, \upsilon , \upsilon '){ d }\upsilon ' \nonumber \\&\quad +\,\mathbf {1}_{(i=1)}\sigma ^1_{\texttt {f}}(r, \upsilon )\sum _{j=\ell +1}^mm^j(r, \upsilon )\psi _t(j,r, \upsilon ), \end{aligned}$$
(5.1)

for prompt emissions \(i = 1,\ldots , \ell \), whereas for delayed emissions, \(i =\ell +1, \ldots , m\), the flux satisfies

$$\begin{aligned} \frac{\partial }{\partial t}\psi _t(i,r, \upsilon )&= -\lambda _i\psi _t(i,r, \upsilon ) + \lambda _i\sum _{j=1}^\ell \int _{V}\psi _t(j,r, \upsilon ')\pi _{\texttt {f}}^{i,j}(r, \upsilon , \upsilon '){ d }\upsilon ' , \end{aligned}$$
(5.2)

which is of a simple form because it describes only how these emissions are held in a suspended state (no advection) before being converted back to prompt emissions. Similarly to before, the components have the following interpretation:

$$\begin{aligned} \sigma ^i_{\texttt {s}}(r, \upsilon )&: \text { the rate at which scattering occurs for a type }i \text { emission with incoming}\\&\quad \text {velocity }\upsilon ,\\ \sigma ^i_{\texttt {f}}(r, \upsilon )&: \text { the rate at which fission occurs for a type } i\text { emission with incoming}\\&\quad \text {velocity } \upsilon ,\\ \sigma ^i(r, \upsilon )&: \text { the sum of the rates } \sigma ^i_{\texttt {f}}+ \sigma ^i_{\texttt {s}} \text {, also known as the total cross section for a }\\&\quad \text {type }i\text { emission,}\\ \pi ^i_{\texttt {s}}(r, \upsilon , \upsilon '){ d }\upsilon '&: \hbox { the scattering yield at velocity } \upsilon ' \hbox { from incoming velocity}\, \upsilon \text { for a type }i \\&\quad \text {emission, satisfying }\textstyle {\int _V}\pi ^i_{\texttt {s}}(r, \upsilon , \upsilon '){ d }\upsilon '=1,\\ \pi ^{i,j}_{\texttt {f}}(r, \upsilon , \upsilon '){ d }\upsilon '&: \text { the average type }j\text { yield at velocity }\upsilon '\text { from fission with incoming velocity}\\&\quad \upsilon \text{ for } \text{ a } \text{ type } i \text{ emission } \text{ satisfying } \sum _{j = 1}^\ell \textstyle {\int _V}\pi ^{i,j}_{\texttt {f}}(r, \upsilon , \upsilon '){ d }\upsilon ' <\infty ,\\ m^j(r, \upsilon )&: \text { the average type }j\text { (unstable) isotope yield from a fission event due to a} \\&\quad \hbox { type 1 particle with incoming velocity}\ \upsilon ,\\ \lambda _i&: \text { the decay rate for a type } i\text { isotope.} \end{aligned}$$

There are a number of assumptions about the many cross sections that appear in the above equations that will remain in force throughout the remainder of this text.

Assumption 5.1

All cross sections are non-negative, measurable and uniformly bounded from above. Moreover, all prompt emissions scatter and hence, without loss of generality, we also assume that for each \(i=1,\ldots , \ell \), the terms \(\sigma ^i_{\texttt {s}}\pi ^i_{\texttt {s}}\) are uniformly bounded away from the origin on \(D\times V\). We need not assume that the cross sections \(\sigma ^i_{\texttt {f}}\pi ^{i,j}_{\texttt {f}}\) are uniformly bounded away from the origin for \(1\le i,j\le \ell \), with the exception of \(i = 1\), for which it only makes sense that \(\sigma ^1_{\texttt {f}}m^j\) is uniformly bounded away from 0 for each \(j = \ell +1, \ldots , m.\) Without loss of generality, we can assume that \(0<\lambda _{\ell + 1}<\cdots <\lambda _m\).

We also assume similar boundary conditions to the single-type case in the sense that emissions exiting the physical domain D are killed. That is to say

$$\begin{aligned} \left\{ \begin{array}{ll} \psi _0(i, r, \upsilon ) = g(i, r, \upsilon ) &{}\text { for } 1\le i\le m, r\in D, \upsilon \in {V}, \\ &{} \\ \psi _t(i,r, \upsilon ) = 0&{} \text { for }1\le i\le \ell , r\in \partial D \text { if }\upsilon \cdot \mathbf{n}_r>0. \end{array} \right. \end{aligned}$$
(5.3)

For the second condition, we will write \(\psi _t|_{\partial D^+} = 0\), where \(\partial D^+ = \{(i, r,\upsilon ) \in \{1,\ldots , \ell \}\times \partial D\times V:\upsilon \cdot \mathbf{n}_r>0\}\).

Classical literature suggests that one can integrate delayed neutrons into the setting of the NTE by adding an inhomogeneity corresponding to the integral of incoming delayed neutrons from time \(-\infty \) to the present; see e.g. [8]. A vectorial representation such as the one above can be found, however, in the work of [25]. There, only one category of prompt emissions is considered, together with multiple species of delayed neutrons.

As before, let us define the multi-species backward transport, scattering and fission operators as they appear in MNTE (5.1) and (5.2), acting on \(f\in \prod _{i=1}^m L_2(D\times V)\), so that, for \(i = 1,\ldots , m\),

$$\begin{aligned} \left\{ \begin{array}{rl} {\overset{_\leftarrow }{\texttt {T}}}_i{f}(\cdot , r, \upsilon ) &{}:= \mathbf {1}_{(1\le i \le \ell )} \upsilon \cdot \nabla f(i, r, \upsilon ) \\ &{}\\ {\overset{_\leftarrow }{\texttt {S}}}_i{f}(\cdot , r, \upsilon ) &{}:=\mathbf {1}_{(1\le i \le \ell )}\int _{V} {[}f(i, r, \upsilon ') - f(i, r, \upsilon )]\sigma ^i_{\texttt {s}}(r, \upsilon ) \pi ^i_{\texttt {s}}(r, \upsilon , \upsilon '){ d }\upsilon ' \\ &{}\\ {\overset{_\leftarrow }{\texttt {F}}}_i{f}(\cdot , r, \upsilon ) &{}: = \mathbf {1}_{(1\le i\le \ell )}\left( \sum \nolimits _{j = 1}^\ell \int _{V}f(j, r, \upsilon ')\sigma ^i_{\texttt {f}}(r, \upsilon ) \pi ^{i,j}_{\texttt {f}}(r, \upsilon , \upsilon '){ d }\upsilon ' - \sigma ^i_{\texttt {f}}(r, \upsilon )f(i, r, \upsilon )\right) \\ &{}\quad \, +\,\mathbf {1}_{(i=1)}\sum \nolimits _{j=\ell +1}^m\sigma ^i_{\texttt {f}}(r, \upsilon )m^j(r, \upsilon ) f(j, r, \upsilon )\\ &{} \quad \, +\,\mathbf {1}_{(\ell +1\le i\le m)}\left( \lambda _i\sum \nolimits _{j=1}^\ell \int _{V} f(j, r, \upsilon ')\pi _{\texttt {f}}^{i,j}(r, \upsilon , \upsilon '){ d }\upsilon ' -\lambda _if(i, r, \upsilon ) \right) , \end{array} \right. \end{aligned}$$

with zero action otherwise.

It is not often that MNTE is stated as above in (5.1) and (5.2) in existing literature; see e.g. [25] for presentation of the NTE in a similar vectorial format, which allows for only one category of prompt neutrons.

The requirement that all cross sections are uniformly bounded is far from the weakest assumption we can make (see e.g. Chapter XXI of [6]).

The precise mathematical sense in which we must understand solutions to the coupled system (5.1) and (5.2) needs some discussion before we can proceed. To this end, we shall first introduce some notational conventions.

As alluded to above, we are interested in a vector space of functions, written as the column vector \(g(\cdot ) = (g(1,\cdot ), \dots , g(m,\cdot ))^{\texttt {T}}\), whose entries \(g(i,\cdot ): D\times V\rightarrow [0,\infty )\), for each \(i =1,\ldots , m\). More precisely we are interested in functions \(f\in \prod _{j= 1}^m L_2({D}\times V)\), which is easily verified to be itself an \(L_2\) space with inner product given by

$$\begin{aligned} \langle f, g \rangle = \sum _{i = 1}^m (f, g)_i, \quad \text {where}\ \ (f,g)_i = \int _{D\times V} f(i, r, \upsilon ) g(i, r, \upsilon ) { d }r{ d }\upsilon . \end{aligned}$$
(5.4)

Generally speaking, for a scalar quantity which is indexed by i, say a(i), when written without the index, we will understand it to be a column vector. Sometimes we will want to put \(f\in \prod _{j= 1}^m L_2({D}\times V)\) on the diagonal of an \(m\times m\) matrix, in which case we will write \(\texttt {diag}(f)\). For our transport, scattering and fission operators, we will understand \(\overset{_\leftarrow }{\texttt {T}}= \texttt {diag}( \overset{_\leftarrow }{\texttt {T}}_1, \ldots , \overset{_\leftarrow }{\texttt {T}}_m)\), however, we will understand \(\overset{_\leftarrow }{\texttt {F}}\) to be the matrix acting on vectors \(f\in \prod _{j= 1}^m L_2({D}\times V)\), with ij-th entry given by

$$\begin{aligned}&{\overset{_\leftarrow }{\texttt {F}}}_{i,j}{f}(j, r, \upsilon )\\&\quad : = \mathbf {1}_{(1\le i,j\le \ell )}\left( \displaystyle \int _{V}f(j, r, \upsilon ')\sigma ^i_{\texttt {f}}(r, \upsilon ) \pi ^{i,j}_{\texttt {f}}(r, \upsilon , \upsilon '){ d }\upsilon ' - \mathbf {1}_{(i = j)} \sigma ^i_{\texttt {f}}(r, \upsilon )f(i, r, \upsilon )\right) \\&\qquad +\mathbf {1}_{(i=1, \ell + 1\le j\le m)} \sigma ^i_{\texttt {f}}(r, \upsilon )m^j(r, \upsilon ) f(j, r, \upsilon )\\&\qquad +\mathbf {1}_{(\ell +1\le i\le m, 1\le j\le \ell )}\lambda _i\displaystyle \int _{V} f(j, r, \upsilon ')\pi _{\texttt {f}}^{i,j}(r, \upsilon , \upsilon '){ d }\upsilon '-\mathbf {1}_{(\ell +1\le i = j\le m)} \lambda _if(i, r, \upsilon ) . \end{aligned}$$

The operator \(\overset{_\leftarrow }{\texttt {S}}\) can be handled similarly.

We are fundamentally interested in a classical solution to the so-called (initial-value) abstract Cauchy problem (ACP)

$$\begin{aligned} \left\{ \begin{array}{ll} \dfrac{\partial }{\partial t}u_t &{}= ({\overset{_\leftarrow }{\texttt {T}}} + {\overset{_\leftarrow }{\texttt {S}}}+{\overset{_\leftarrow }{\texttt {F}}})u_t \\ u_0&{} = g \end{array} \right. \end{aligned}$$
(5.5)

where \(u_t\) is treated as a column vector belonging to the space \(\prod _{j= 1}^m L_2({D}\times V)\), for \(t\ge 0\). Specifically, this means that \((u_t, t\ge 0)\) is continuously differentiable in this space. In other words, there exists a \(\dot{u}_t\in \prod _{j= 1}^m L_2({D}\times V)\), which is time-continuous in \(\prod _{j= 1}^m L_2({D}\times V)\) with respect to \(||\cdot ||_2\), such that \(\lim _{h\rightarrow 0}|| h^{-1}(u_{t+h} - u_t) - \dot{u}_t ||_2= 0\) for all \(t\ge 0\).

The theory of \(c_0\)-semigroups gives us a straightforward approach to describing the unique solution to (5.5). Recall that a \(c_0\)-semigroup also goes by the name of a strongly continuous semigroup and, in the present context, this means a family of time-indexed operators, \((\texttt {V}_t, t\ge 0)\), on \(\prod _{j= 1}^m L_2({D}\times V)\) with the properties that

  (i) \(\texttt {V}_0 = \mathrm{Id}\),

  (ii) \(\texttt {V}_{t+s}[g] = \texttt {V}_t[\texttt {V}_s[g]]\), for all \(s, t\ge 0\), \(g\in \prod _{j= 1}^m L_2({D}\times V)\), and

  (iii) for all \(g\in \prod _{j= 1}^m L_2({D}\times V)\), \(\lim _{h\rightarrow 0}||\texttt {V}_h[g] - g ||=0\).

To see how \(c_0\)-semigroups relate to (5.5), let us define \({\overset{_\leftarrow }{\texttt {A}}}: = {\overset{_\leftarrow }{\texttt {T}}} + {\overset{_\leftarrow }{\texttt {S}}}+{\overset{_\leftarrow }{\texttt {F}}}\) and define \((\texttt {V}_t[g], t\ge 0)\) to be the semigroup generated by \({\overset{_\leftarrow }{\texttt {A}}}\) via

$$\begin{aligned} \texttt {V}_t[g] := \exp (t {\overset{_\leftarrow }{\texttt {A}}})g, \quad g\in \prod _{j= 1}^m L_2({D}\times V). \end{aligned}$$
(5.6)

Note that

$$\begin{aligned} \mathrm{Dom}({\overset{_\leftarrow }{\texttt {A}}}): = \left\{ g\in \prod _{j= 1}^m L_2({D}\times V) : \lim _{h\rightarrow 0}h^{-1}(\texttt {V}_h[g] - g) \text { exists in } \prod _{j= 1}^m L_2({D}\times V) \right\} \end{aligned}$$

is the domain of \(\overset{_\leftarrow }{\texttt {A}}\) and standard theory (cf. [11]) tells us that \(\texttt {V}_t[g]\in \mathrm{Dom}({\overset{_\leftarrow }{\texttt {A}}})\) for all \(t\ge 0\), with g as above. Proposition II.6.2 of [11] now gives us the relevance to (5.5).

Theorem 5.2

Let \(({\overset{_\leftarrow }{\texttt {A} }}, \mathrm{Dom}({\overset{_\leftarrow }{\texttt {A} }}))\) be the generator of a \(c_0\)-semigroup \((\texttt {V} _t, t\ge 0)\). If \(g\in \mathrm{Dom}({\overset{_\leftarrow }{\texttt {A} }})\), then \(u_t: = \texttt {V} _t[g]\) is a representation of the unique classical solution of (5.5).
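To fix ideas, the following is a minimal finite-dimensional sketch of the relationship between a generator and its semigroup, not the infinite-dimensional setting of the theorem. The \(2\times 2\) matrix `A`, the vector `g` and the helper names `mat_mul`, `mat_vec`, `expm` and `V` are all hypothetical choices made for illustration; in this toy setting every vector lies in the domain of the generator.

```python
# Finite-dimensional stand-in: a 2x2 matrix A plays the role of the generator
# and V(t, g) = exp(tA) g plays the role of the semigroup in (5.6).

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_vec(a, x):
    return [sum(a[i][k] * x[k] for k in range(len(x))) for i in range(len(a))]

def expm(a, t, squarings=10, terms=20):
    """exp(t*a) via scaling and squaring with a truncated Taylor series."""
    n = len(a)
    scale = t / (2 ** squarings)
    b = [[scale * a[i][j] for j in range(n)] for i in range(n)]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        # term now holds b^k / k!
        term = [[v / k for v in row] for row in mat_mul(term, b)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result

A = [[-1.0, 2.0], [0.5, -3.0]]   # hypothetical generator
g = [1.0, 2.0]                   # hypothetical initial condition

def V(t, x):
    """Semigroup action V_t[x] = exp(tA) x."""
    return mat_vec(expm(A, t), x)

# Semigroup property V_{t+s}[g] = V_t[V_s[g]].
print(V(0.7, g), V(0.3, V(0.4, g)))
```

In this sketch, \(u_t = \texttt{V}_t[g]\) can be checked numerically to satisfy \(\partial u_t/\partial t = \texttt{A}u_t\) by comparing a finite difference in t with `mat_vec(A, V(t, g))`.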

The reader may well have wondered where the second boundary condition in (5.3) has gone in the above formulation. This is a matter of interpretation of \(({\overset{_\leftarrow }{\texttt {T}}}, \mathrm{Dom}({\overset{_\leftarrow }{\texttt {T}}}))\), and hence the generator \(({\overset{_\leftarrow }{\texttt {A}}}, \mathrm{Dom}({\overset{_\leftarrow }{\texttt {A}}}))\), as we now discuss.

We are interested in the advection semigroup with exponential killing and killing on the boundary of D,

$$\begin{aligned} \texttt {U}_t[g](i, r,\upsilon ) = g(i, r+\upsilon t, \upsilon )\mathbf {1}_{(t<\kappa ^D_{r,\upsilon })}, \quad i = 1,\ldots , \ell \quad \text {and}\quad t\ge 0. \end{aligned}$$
(5.7)

where

$$\begin{aligned} \kappa _{r,\upsilon }^{D} := \inf \{t>0 : r+\upsilon t\not \in D\}. \end{aligned}$$
(5.8)

In essence, \((\texttt {U}_s,s\ge 0)\) is the semigroup of the process which moves from a point of issue r in a straight line with velocity \(\upsilon \) and which is killed on hitting \(\partial D\). To see why \(\texttt {U}: = (\texttt {U}_s,s\ge 0)\) has the semigroup property, note that

$$\begin{aligned} \kappa ^D_{r+\upsilon {s}, \upsilon } = \inf \{t>0 : r+\upsilon (t+{s}) \not \in D\} = (\kappa ^D_{r, \upsilon }-{s})\vee 0, \end{aligned}$$

so that \(t<\kappa ^D_{r+\upsilon {s}, \upsilon } \) if and only if \(t+{s} < \kappa ^D_{r, \upsilon }\). Hence for any \(g\in \prod _{i= 1}^m L_2({D}\times V)\) satisfying the boundary conditions (5.3), we have from the definition (5.7), for \(i =1,\ldots , \ell \), \(r\in D\), \(\upsilon \in V\),

$$\begin{aligned} {\texttt {U}}_{s} [{\texttt {U}}_t[g] ](i,r, \upsilon )&= {\texttt {U}}_t[g](i, r+\upsilon {s}, \upsilon )\mathbf {1}_{({s}< \kappa ^D_{r, \upsilon } )}\nonumber \\&=g(i, r+\upsilon (t+{s}), \upsilon )\mathbf {1}_{(t<\kappa ^D_{r+\upsilon {s}, \upsilon })} \mathbf {1}_{({s}< \kappa ^D_{r, \upsilon } )}\nonumber \\&=\texttt {U}_{t+{s}}[g](i,r, \upsilon ) \end{aligned}$$
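The computation above can be checked numerically in a simple case. The sketch below assumes, purely for illustration, that D is the open unit ball in \(\mathbb{R}^3\) (so that the exit time has a closed form) and suppresses the type index; the names `exit_time` and `U` are our own.

```python
import math

def exit_time(r, v, radius=1.0):
    """kappa^D_{r,v} for D the open ball of the given radius (an assumed
    domain); r is assumed to lie in D and v to be non-zero."""
    rv = sum(a * b for a, b in zip(r, v))
    vv = sum(b * b for b in v)
    rr = sum(a * a for a in r)
    # Positive root of |r + v t|^2 = radius^2.
    disc = rv * rv - vv * (rr - radius ** 2)
    return (-rv + math.sqrt(disc)) / vv

def U(t, g):
    """Return the function U_t[g] of (5.7) for a single prompt type."""
    def shifted(r, v):
        if t < exit_time(r, v):
            moved = tuple(a + b * t for a, b in zip(r, v))
            return g(moved, v)
        return 0.0  # killed on leaving D
    return shifted

# Hypothetical test function and configuration.
g = lambda r, v: math.exp(-sum(a * a for a in r)) + v[0]
r0, v0 = (0.1, -0.2, 0.3), (1.0, 0.5, -0.25)

print(U(0.2, U(0.3, g))(r0, v0), U(0.5, g)(r0, v0))   # agree before exit
print(U(0.5, U(0.5, g))(r0, v0), U(1.0, g)(r0, v0))   # both killed
```

For this configuration the exit time is roughly 0.87, so the first pair of values is evaluated before exit and the second pair after it.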

It is a straightforward exercise, see e.g. Theorem 2 in Chapter XXI of [6], to show that \(\texttt {U}\) is a \(c_0\)-semigroup with generator \( {\overset{_\leftarrow }{\texttt {T}}}. \) Its domain satisfies

$$\begin{aligned} \mathrm{Dom}(\overset{_\leftarrow }{\texttt {T}})&= \prod _{i = 1}^\ell \mathrm{Dom}(\overset{_\leftarrow }{\texttt {T}}_i)\times \prod _{i =\ell +1}^m L_2(D\times V), \text { where}\nonumber \\ \mathrm{Dom}({\overset{_\leftarrow }{\texttt {T}}}_i)&=\Bigg \{g\in L_2({D}\times V) : \upsilon \cdot \nabla g \in L_2({D}\times V)\text { and }g|_{\partial D^+} =0 \Bigg \}. \end{aligned}$$
(5.9)

Here, by \(\upsilon \cdot \nabla g\in L_2({D}\times V)\) we mean that \(\upsilon \cdot \nabla g\) exists in the distributional sense and belongs to the space \( L_2({D}\times V)\).

The domain of \({\overset{_\leftarrow }{\texttt {A}}}\) can be no larger than Dom\(({\overset{_\leftarrow }{\texttt {T}}})\). It turns out however that Dom\(({\overset{_\leftarrow }{\texttt {A}}})=\)Dom\(({\overset{_\leftarrow }{\texttt {T}}})\). To see why, we need only consider that the linear operators of the form

$$\begin{aligned} \texttt {K}_i f(i,r,\upsilon ) :=\alpha ^i(r,\upsilon )\sum _{j=1}^m\int _{V} f(j, r, \upsilon ') \pi ^{i,j}(r,\upsilon ,\upsilon '){ d }\upsilon ', \end{aligned}$$

are continuous mappings from \(\prod _{i= 1}^m L_2({D}\times V)\) into itself, where \(\alpha ^i\) and \(\pi ^{i,j}\) are non-negative, measurable and uniformly bounded. The proof is a straightforward exercise which uses the Cauchy-Schwarz inequality; see for example Lemma XXI.1 of [6]. It follows that Dom\(({\overset{_\leftarrow }{\texttt {S}}})\) and Dom\(({\overset{_\leftarrow }{\texttt {F}}})\) are both equal to \(\prod _{i= 1}^m L_2({D}\times V)\) and, hence, Dom\(({\overset{_\leftarrow }{\texttt {A}}})\) and Dom\(({\overset{_\leftarrow }{\texttt {T}}})\) agree.
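For the reader's convenience, the Cauchy-Schwarz step can be sketched as follows. Here \(\bar{\alpha }\) and \(\bar{\pi }\) are our notation for uniform upper bounds on \(\alpha ^i\) and \(\pi ^{i,j}\), and \(|V|\) denotes the Lebesgue measure of V, which is finite since V is bounded. Applying the Cauchy-Schwarz inequality on \(\{1,\ldots ,m\}\times V\), which has total mass \(m|V|\),

$$\begin{aligned} |\texttt {K}_i f(i,r,\upsilon )|^2 \le \bar{\alpha }^2\bar{\pi }^2\left( \sum _{j=1}^m\int _{V} |f(j, r, \upsilon ')|{ d }\upsilon '\right) ^2 \le \bar{\alpha }^2\bar{\pi }^2\, m|V| \sum _{j=1}^m\int _{V} f(j, r, \upsilon ')^2{ d }\upsilon ', \end{aligned}$$

and integrating both sides over \((r,\upsilon )\in D\times V\) gives

$$\begin{aligned} \int _{D\times V}|\texttt {K}_i f(i,r,\upsilon )|^2{ d }r{ d }\upsilon \le \bar{\alpha }^2\bar{\pi }^2\, m|V|^2 \langle f, f\rangle . \end{aligned}$$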

Note that there is no particular necessity to put solutions in an \(L_2\) space; one might equally work with the space \(\prod _{i = 1}^m L_p (D\times V)\), for \(p\in (1,\infty )\). As the reader might suspect, solutions of the backwards equation in an \(L_p\) space come hand in hand with a similarly formulated solution to the forward equation in the conjugate space \(\prod _{i = 1}^m L_q(D\times V)\), where \(q^{-1}+ p^{-1} =1\). See for example Chapter XXI of [6] or [25]. The reader will note the exclusion of the \(L_1\) and \(L_\infty \) conjugacy. The reason for the exclusion boils down to the cumbersome nature of the advection operator \({\overset{_\leftarrow }{\texttt {T}}} = \upsilon \cdot \nabla \). Quite simply it is not possible to verify the strong continuity property of the advection semigroup

$$\begin{aligned} \texttt {U}_t[g](i, r,\upsilon ) = g(i, r+\upsilon t, \upsilon )\mathbf {1}_{(t<\kappa ^D_{r,\upsilon })}, \quad t\ge 0. \end{aligned}$$
(5.10)

where \( \kappa _{r,\upsilon }^{D} := \inf \{t>0 : r+\upsilon t\not \in D\}. \) Hence we cannot give a meaning to \( \upsilon \cdot \nabla \) as the generator of a \(c_0\)-semigroup on \(L_\infty (D\times V)\). This is unfortunate as the latter is the more natural setting for probabilistic interpretation of solutions to the ACP. Having said that, the backwards scattering and fission operators, respectively \({\overset{_\leftarrow }{\texttt {S}}}\) and \({\overset{_\leftarrow }{\texttt {F}}}\), are well defined on all \(\prod _{i = 1}^mL_p(D\times V)\) spaces for \(p\in [1,\infty ]\).

One of our main results will be to establish the asymptotic (2.8) but now in the current setting. Recall that we have assumed that \(D\subseteq \mathbb {R}^3\) is a smooth open pathwise connected bounded domain of concern such that \(\partial D\) has zero Lebesgue measure.

Theorem 5.3

Let D be convex. We assume the following irreducibility conditions. For each \(i,j \in \{1, \dots , \ell \}\) assume that each of the cross sections \(\sigma _{\texttt {f} }^{i}(r, \upsilon )\pi _{\texttt {f} }^{i,j}(r, \upsilon , \upsilon ')\), \(\sigma ^i_{\texttt {f} }(r,\upsilon )m^j(r,\upsilon )\) and \(\sigma _{\texttt {s} }^{i}(r, \upsilon )\pi _{\texttt {s} }^{i}(r, \upsilon , \upsilon ')\) are piecewise continuous on \(\bar{D}\times V\times V\) and there exists \(k = k_{i,j}\in \{1, \dots , \ell \}\) such that

$$\begin{aligned} \sigma _{\texttt {f} }^i(r,\upsilon )\pi _{\texttt {f} }^{i,k}(r, \upsilon , \upsilon ') >0 \text { on } D \times V \times V \end{aligned}$$
(5.11)

and

$$\begin{aligned} \sigma _{\texttt {f} }^k(r,\upsilon )\pi _{\texttt {f} }^{k,j}(r, \upsilon , \upsilon ')>0 \text { on } D \times V \times V. \end{aligned}$$
(5.12)

Then,

  (i) the neutron transport operator \(\overset{_\leftarrow }{\texttt {A} }\) has a simple and isolated eigenvalue \(\lambda _c > -\lambda _{\ell +1}\), which is leading in the sense that \(\lambda _c = \sup \{\mathrm{Re}(\lambda ): \lambda \text { is an eigenvalue of }\overset{_\leftarrow }{\texttt {A} }\}\) and which has corresponding non-negative right and left eigenfunctions in \( \prod _{i=1}^mL_2(D\times V)\), \(\varphi \) and \({\tilde{\varphi }}\) respectively and

  (ii) there exists an \(\varepsilon >0\) such that, as \(t\rightarrow \infty \),

    $$\begin{aligned} || \mathrm{e}^{-\lambda _c t}{\texttt {V} }_t[f] -\langle f, \tilde{\varphi } \rangle \varphi ||_2 = O(\mathrm{e}^{-\varepsilon t}), \end{aligned}$$
    (5.13)

    for all \(f\in \prod _{i=1}^mL_2(D\times V)\), where \(({\texttt {V} }_t, t\ge 0)\) is defined in (5.6). To give a precise value for \(\varepsilon \), suppose we enumerate the eigenvalues of \(\overset{_\leftarrow }{\texttt {A} }\) in decreasing order by the set \(\{\lambda ^{(1)}, \ldots , \lambda ^{(n)}\}\) (noting from earlier that we have at least \(\lambda ^{(1)}= \lambda _c\)). Then \(\lambda ^{(n)}> -\lambda _{\ell + 1}\) and we can take any \(\varepsilon \) such that \(\varepsilon <\lambda _c- (\lambda ^{(2)}\vee (-\lambda _{\ell +1} ))\) where we understand \(\lambda ^{(2)} = -\infty \) if \(n = 1\).
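The content of (5.13) is perhaps easiest to appreciate in a finite-dimensional caricature. The sketch below is not the operator of the theorem: it takes a hypothetical \(2\times 2\) matrix `A` with strictly positive off-diagonal entries (a crude stand-in for irreducibility), computes its leading eigenvalue and the associated right and left eigenvectors in closed form, and measures the rescaled error, which decays at the rate of the spectral gap. All names are our own.

```python
import math

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_vec(a, x):
    return [sum(a[i][k] * x[k] for k in range(len(x))) for i in range(len(a))]

def expm(a, t, squarings=10, terms=20):
    """exp(t*a) via scaling and squaring with a truncated Taylor series."""
    n = len(a)
    b = [[t / (2 ** squarings) * a[i][j] for j in range(n)] for i in range(n)]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[v / k for v in row] for row in mat_mul(term, b)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result

A = [[-1.0, 2.0], [0.5, -3.0]]   # hypothetical, positive off-diagonal entries

# Eigenvalues of a 2x2 matrix from its trace and determinant.
tr = A[0][0] + A[1][1]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
root = math.sqrt(tr * tr - 4.0 * det)
lam_c, lam_2 = (tr + root) / 2.0, (tr - root) / 2.0

# Right and left eigenvectors for lam_c, normalised so that <phi, phi_tilde> = 1.
phi = [A[0][1], lam_c - A[0][0]]
phi_tilde = [A[1][0], lam_c - A[0][0]]
norm = sum(p * q for p, q in zip(phi, phi_tilde))
phi_tilde = [q / norm for q in phi_tilde]

def rescaled_error(t, f):
    """|| e^{-lam_c t} V_t[f] - <f, phi_tilde> phi ||_2 in the toy model."""
    vt = mat_vec(expm(A, t), f)
    coeff = sum(x * y for x, y in zip(f, phi_tilde))
    return math.sqrt(sum((math.exp(-lam_c * t) * v - coeff * p) ** 2
                         for v, p in zip(vt, phi)))

f = [1.0, 1.0]
for t in (1.0, 2.0, 3.0):
    print(t, rescaled_error(t, f))
```

In this toy model the error is exactly proportional to \(\mathrm{e}^{-(\lambda _c-\lambda ^{(2)})t}\), so any \(\varepsilon <\lambda _c-\lambda ^{(2)}\) works, in line with the recipe for \(\varepsilon \) given in the theorem (there being no analogue of \(-\lambda _{\ell +1}\) here).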

Remark 5.1

It could be argued that the assumptions in the above theorem rule out the possibility that we may, for example, include alpha or beta emissions in the model for that particular conclusion. Whilst alpha and beta emissions may scatter, they are not energetic enough to cause fission. The irreducibility conditions (5.11) and (5.12) would thus fail. On the other hand, it is also known that when such particles are energetic enough, they can draw gamma radiation or positrons out of nuclei when passing in close proximity. If the latter are sufficiently energetic, then they can induce fission.

6 Multi-species Neutron Branching Process

Heuristically speaking, (5.5) can be thought of as being closely related to the expectation semigroup of a Markov branching process, or Multi-species nuclear branching process (MNBP) as we shall call it, whose infinitesimal generator is \({\overset{_\leftarrow }{\texttt {T}}} + {\overset{_\leftarrow }{\texttt {S}}}+{\overset{_\leftarrow }{\texttt {F}}}\). Consider the system of typed emissions whose configurations in \(D\times V\) at time \(t\ge 0\) are given by \(\{r_{i,j}(t), \upsilon _{i,j}(t): i = 1, \dots , N_t^j\}\), where, for each \(j = 1, \dots , m\), \(N_t^j\) is the number of type j emissions alive at time t. In order to describe the system as Markovian, we will represent it by the atomic measures

$$\begin{aligned} X_t(j, A) = \sum _{i=1}^{N_t^j}\delta _{(r_{i,j}(t), \upsilon _{i,j}(t))}(A), \quad j = 1, \dots , m, \end{aligned}$$

where A is a Borel subset of \(D \times V\) and \(\delta \) is the Dirac measure defined on the same space. Then the system can be described via the m-tuple \(X_t(\cdot ) = (X_t(1,\cdot ),\dots , X_t(m,\cdot ))\), \(t\ge 0\), which evolves as follows.

\(\triangleright \) An emission of type \(i \in \{1, \dots , \ell \}\) with configuration \((r, \upsilon )\) moves in a straight line with velocity \(\upsilon \) from the point r until one of the following events occurs:

  • The emission leaves the domain, at which point it is killed.

  • Independently of all other emissions, a scattering event occurs when an emission comes in close proximity to an atomic nucleus and, accordingly, makes an instantaneous change of velocity. For an emission in the system of type \(i \in \{1, \dots , \ell \}\) with initial position and velocity \((r,\upsilon )\), if we write \(T^i_{\texttt {s}}\) for the random time until the next scattering occurs, then, independently of any other physical event that may affect the emission,

    $$\begin{aligned} \Pr (T^i_{\texttt {s}}>t) = \exp \left\{ -\int _0^t \sigma ^i_{\texttt {s}}(r+\upsilon s, \upsilon )\mathrm{d}s \right\} . \end{aligned}$$
    (6.1)
  • When scattering of an emission of type \(i \in \{1, \dots , \ell \}\) occurs at space-velocity \((r,\upsilon )\), the new velocity is selected independently with probability \(\pi ^i_{\texttt {s}}(r, \upsilon , \upsilon '){ d }\upsilon '\).

  • Independently of all other emissions, a fission event occurs when an emission smashes into an atomic nucleus. For an emission in the system with initial position and velocity \((r,\upsilon )\), we will write \(T^i_{\texttt {f}}\) for the random time that the next fission occurs. Then independently of any other physical event that may affect the emission,

    $$\begin{aligned} \Pr (T^i_{\texttt {f}}>t) = \exp \left\{ -\int _0^t \sigma ^i_{\texttt {f}}(r+\upsilon s, \upsilon )\mathrm{d}s \right\} . \end{aligned}$$
    (6.2)
  • When fission occurs, the smashing of the atomic nucleus releases a random number of other prompt emissions of type \(j =1,\ldots , \ell \), say \(N^{i,j}\ge 0\), which are ejected from the point of impact with randomly distributed, and possibly correlated, velocities, say \(\{\upsilon ^{i,j}_k: k = 1, \ldots , N^{i,j}\}\). When fission occurs at location \(r\in D\) from an emission with incoming velocity \(\upsilon \in {V}\), the quantity \(\pi ^{i,j}_{\texttt {f}}(r, \upsilon , \upsilon '){ d }\upsilon '\) describes the average number of type j prompt emissions released from nuclear fission with outgoing velocity in the infinitesimal neighbourhood of \(\upsilon '\). In particular

    $$\begin{aligned} \int _A\pi ^{i,j}_{\texttt {f}}(r, \upsilon , \upsilon '){ d }\upsilon ' = \mathrm{E}\left[ \sum _{k =1}^{N^{i,j}}\mathbf {1}_{(\upsilon ^{i,j}_k\in A)} \right] , \quad A\in \mathcal {B}(V). \end{aligned}$$
  • Note that \(\Pr (N^{i,j} = 0)>0\) is possible. If \(i = j = 1\) then this is tantamount to neutron capture or further decomposition into subatomic particles which are not counted.

  • Further, if the initial emission is a (type 1) neutron, a fission event (occurring at rate \(\sigma ^1_{\texttt {f}}\)) may result in the production of unstable isotopes (which later release delayed emissions). On this event, an average number, \(m^j(r, \upsilon )\), of type \(j \in \{\ell +1, \dots , m\}\) isotopes will be produced from a collision at position r from a neutron with incoming velocity \(\upsilon \). The isotopes will inherit the configuration of the incoming neutron at the time of collision.

\(\triangleright \) An isotope of type \(i \in \{\ell +1, \dots , m\}\) with inherited physical configuration \((r, \upsilon )\) stays in the same place for an exponentially distributed amount of time with rate \(\lambda _i\). At this point, it produces a random number of type \(j \in \{1, \dots , \ell \}\) prompt emissions, the average number of which, along with their corresponding velocities, are chosen according to \(\pi ^{i,j}_{\texttt {f}}(r, \upsilon , \upsilon ')\), in a similar way to previously described. We note that although unstable isotopes stay in the same spatial position, we will still assign them a velocity as a ‘mark’.
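Event times with survival functions of the form (6.1) and (6.2) can be sampled without inverting the integral, using the uniform bound on the cross sections from Assumption 5.1. The following is a minimal sketch by thinning, a standard technique that we substitute here for illustration; the one-dimensional cross section `sigma_s`, its bound 1.5 and the function names are hypothetical, and the exit from D is ignored.

```python
import math
import random

def sample_event_time(rate_along_path, rate_bound, rng):
    """Sample T with Pr(T > t) = exp(-int_0^t rate_along_path(s) ds) by
    thinning a Poisson process of intensity rate_bound.
    Requires 0 <= rate_along_path(s) <= rate_bound for all s >= 0."""
    t = 0.0
    while True:
        t += rng.expovariate(rate_bound)          # next candidate time
        if rng.random() * rate_bound <= rate_along_path(t):
            return t                              # candidate accepted

def sigma_s(x):
    """Hypothetical scattering cross section on the line, bounded by 1.5."""
    return 1.0 + 0.5 * math.sin(x)

def scatter_time(r, v, rng):
    """Time to the next scattering for an emission started at (r, v)."""
    return sample_event_time(lambda s: sigma_s(r + v * s), 1.5, rng)

rng = random.Random(0)
samples = [scatter_time(0.0, 2.0, rng) for _ in range(20000)]
empirical = sum(t > 1.0 for t in samples) / len(samples)
exact = math.exp(-(1.0 + 0.25 * (1.0 - math.cos(2.0))))
print(empirical, exact)
```

In a full simulation of the MNBP one would sample the scattering and fission times independently in this way and act on whichever occurs first, provided it precedes the deterministic exit time \(\kappa ^D_{r,\upsilon }\).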

In all cases, it is natural to make the following physical assumption, which will remain in force throughout.

Assumption 6.1

Random emissions of any type are bounded in number by the non-random constant \(n_{\texttt {max}}\ge 1\). In particular this means that

$$\begin{aligned} \sup _{1\le i\le m, 1\le j\le \ell , r\in D, \upsilon \in V}\pi ^{i,j}_{\texttt {f}}(r,\upsilon ,V)\le n_\texttt {max}\quad \text { and }\quad \sup _{r\in D, \upsilon \in V, \ell +1\le j\le m}m^j(r,\upsilon )\le {n}_{\texttt {max}}. \end{aligned}$$

For non-negative and uniformly bounded \(g: \prod _{i = 1}^m (D\times V)\mapsto [0,\infty )\), that is \(g\in \prod _{i=1}^mL^+_{\infty }(D\times V)\), define the expectation semigroup

$$\begin{aligned} \psi _t[g](i, r, v) := \mathbb {E}_{\delta _{(i,r, v)}}[\langle g, X_t \rangle ], \end{aligned}$$
(6.3)

where \(\mathbb {P}_{\delta _{(i,r, v)}}\) is the law of the process started from a single type i emission with configuration \((r,v)\), with corresponding expectation operator \(\mathbb {E}_{\delta _{(i,r, v)}}\).

As we have assumed that all cross sections are uniformly bounded, ignoring spatial trajectories of neutrons (in particular those that are killed by leaving the domain D), it is straightforward to compare the growth of \((\psi _t[g], t\ge 0)\) against that of a continuous-time Galton-Watson process with growth rate \( \eta \{(m\times n_{\texttt {max}})-1\} \), where \(\eta = \sup _{1\le i\le \ell , r\in D, \upsilon \in V}\sigma ^i_{\texttt {f}}(r,\upsilon ) + \max _{\ell +1\le i\le m}\lambda _i\).

The rate of growth \( \eta \{(m\times n_{\texttt {max}})-1\} \) simply assumes that each emission of type i gives rise to at most \(n_{\texttt {max}}\) emissions of any other type and at a rate which is uniformly bounded by a uniform upper bound of all possible rates at which fission events occur. Note this rate takes account of the emission count introduced into the system at a fission event and the single emission removed from the system which caused the fission event.

It is also straightforward to stochastically upper bound the process \(\langle 1, X_t\rangle \), \(t\ge 0\), by the aforesaid continuous-time Galton-Watson process on the same probability space. The latter process branches whenever X does, topping up the number of offspring always to \(n_{\texttt {max}}\), but also it has additional independent branching events at rate \((\eta -\mathbf {1}_{(1\le i\le \ell )}\sigma ^i_{\texttt {f}}(r,\upsilon ) - \mathbf {1}_{(\ell +1\le i\le m)}\lambda _i)\) always producing precisely \(n_{\texttt {max}}\) offspring of each of the m possible emissions.

If we denote this Galton-Watson process by \((Z_t, t\ge 0)\), then we have both the stochastic bound \(\langle 1, X_t \rangle \le Z_t\le Z_{t+s}\), for all \(s,t\ge 0\) and the upper estimate

$$\begin{aligned} \sup _{1\le i\le m, r\in D, \upsilon \in V}\psi _t[g](i, r, \upsilon ) \le ||g||_\infty \exp (\eta ((n_{\texttt {max}}\times m)-1) t),\quad t\ge 0. \end{aligned}$$
(6.4)
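As a sanity check on (6.4), under our reading that each individual of Z branches at rate \(\eta \) and is replaced by \(n_{\texttt {max}}\) offspring of each of the m types, the total count increases by \(m\,n_{\texttt {max}} - 1\) at each branching event. Writing \(\mathrm{E}\) for expectation when Z is started from a single individual, the mean therefore satisfies

$$\begin{aligned} \frac{{ d }}{{ d }t}\mathrm{E}[Z_t] = \eta (m\, n_{\texttt {max}}-1)\,\mathrm{E}[Z_t], \quad \mathrm{E}[Z_0] = 1, \quad \text {so that}\quad \mathrm{E}[Z_t] = \exp (\eta (m\,n_{\texttt {max}}-1)t). \end{aligned}$$

Since g is non-negative, \(\langle g, X_t\rangle \le ||g||_\infty \langle 1, X_t\rangle \le ||g||_\infty Z_t\), and taking expectations recovers the right-hand side of (6.4).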

If we put g in the smaller space \(\prod _{i = 1}^m C^+(D\times V)\), the space of non-negative, continuous and uniformly bounded vector functions on \((D\times V)\), then we also have by a dominated convergence argument, \( \lim _{t\rightarrow 0}\psi _t[g] = g\) in the pointwise sense. Otherwise the latter convergence is not necessarily clear.

The name ‘expectation semigroup’ is earned thanks to the behaviour of \((\psi _t, t\ge 0)\) under an application of the Markov branching property. Indeed, associated to the MNBP are the probabilities \(\mathbb {P}_{\mu }\) for atomic measures of the form

$$\begin{aligned} \mu =\left( \sum _{i =1}^{n_1} \delta _{(1, r_{i, 1}, \upsilon _{i, 1})}, \ldots ,\sum _{i =1}^{n_m} \delta _{(m, r_{i, m}, \upsilon _{i, m})} \right) =: (\mu _1,\ldots ,\mu _m). \end{aligned}$$
(6.5)

The Markov branching property dictates that, for \(g\in \prod _{i = 1}^m L_2 (D\times V)\) as before and \(t\ge 0\),

$$\begin{aligned} \mathbb {E}_{\mu }[\langle g, X_t\rangle ] = \sum _{j =1}^{m} \sum _{i =1}^{n_j} \mathbb {E}_{\delta _{(j, r_{i, j}, \upsilon _{i, j})}}[\langle g, X_t\rangle ] = \langle \mathbb {E}_{\delta _{(\cdot , \cdot , \cdot )}}[\langle g, X_t\rangle ] , \mu \rangle \end{aligned}$$

Here we are abusing our earlier notation in (5.4) and writing for finite atomic measures \(\mu \) of the form (6.5),

$$\begin{aligned} \langle g,\mu \rangle = \sum _{i = 1}^m ( g, \mu )_i, \quad \text {where}\quad (g,\mu )_i = \int _{D\times V} g(i, r, \upsilon ) \mu _i({ d }r, { d }\upsilon ) . \end{aligned}$$
(6.6)

Hence, by conditioning on the configuration of the system at time \(t\ge 0\), we have, for \(s\ge 0\),

$$\begin{aligned} \psi _{t+s}[g](i, r, v) = \mathbb {E}_{\delta _{(i,r, v)}} \left[ \mathbb {E}_{X_t} [ \langle g, X_s \rangle ]\right] =\mathbb {E}_{\delta _{(i,r, v)}} \left[ \langle \psi _s [g],X_t \rangle \right] = \psi _{t}[\psi _s[g]](i, r, v). \end{aligned}$$
(6.7)

The expectation semigroup property of \((\psi _t, t\ge 0)\) does not imply that it is necessarily a \(c_0\)-semigroup on \(\prod _{i= 1}^m L_2({D}\times V)\). Recalling our earlier discussion, if we were able to work with (5.5) in the setting of a \(c_0\)-semigroup on \(\prod _{i=1}^m L_\infty (D\times V)\), then we would be much closer to being able to match the expectation semigroup \((\psi _t, t\ge 0)\) to the solution \((u_t, t\ge 0)\). But even then, problems would occur with verifying strong continuity at the origin.

Nonetheless, classical literature supports the view that it is the physical process, i.e. in this setting the MNBP, that provides a stochastic representation of the solution to the backward MNTE. The authors are not aware of a formal proof of this fact. We will nonetheless try to address this point shortly in Sect. 8. In the meantime, let us present an alternative ‘mild’ form of the MNTE (also called a Duhamel solution in the PDE literature) which the semigroup \((\psi _t,t\ge 0)\) more comfortably solves.

Lemma 6.1

The expectation semigroup \((\psi _t[g], t\ge 0)\) is the unique solution in \(\prod _{i=1}^m L^+_\infty (D\times V)\) to the mild MNTE

$$\begin{aligned} u_t(i,r,\upsilon ) =\texttt {U} _t[g](i,r,\upsilon ) + \int _0^t \texttt {U} _s[({\overset{_\leftarrow }{\texttt {S} }+\overset{_\leftarrow }{\texttt {F} }})u_{t-s}](i,r,\upsilon ){ d }s, \end{aligned}$$
(6.8)

for \(t\ge 0\), \(1\le i\le m\), \(r\in D, \upsilon \in V\) and \(g\in \prod _{i = 1}^m L^+_\infty (D\times V)\).

Before proceeding to the proof, let us remark that, in the statement of the lemma, we are not working with \((\texttt {U}_t, t\ge 0)\) as a \(c_0\)-semigroup on \(\prod _{i = 1}^m L_\infty (D\times V)\), but a pointwise shift operator. The reader will recall from the discussion preceding (5.10) that \((\texttt {U}_t, t\ge 0)\) cannot be defined as such for \(\prod _{i = 1}^m L_\infty (D\times V)\).
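To illustrate the structure of a mild equation such as (6.8), consider a scalar caricature in which \(\texttt{U}_t[g] = \mathrm{e}^{-at}g\) stands in for the killed advection semigroup and multiplication by a constant b stands in for \({\overset{_\leftarrow }{\texttt {S}}}+{\overset{_\leftarrow }{\texttt {F}}}\). The sketch below solves the resulting equation by Picard iteration on a time grid, a technique we substitute here for illustration (the proof below argues differently); the constants, grid sizes and function name are all hypothetical.

```python
import math

def picard_mild_solution(g, a, b, t_max, steps=200, iterations=30):
    """Iterate u <- U[g] + int_0^t U_s[b u_{t-s}] ds on a uniform time grid,
    with U_t[g] = exp(-a t) g a scalar stand-in for the advection semigroup."""
    h = t_max / steps
    times = [k * h for k in range(steps + 1)]
    free = [math.exp(-a * t) * g for t in times]   # the term U_t[g]
    u = free[:]                                     # zeroth iterate
    for _ in range(iterations):
        new_u = []
        for n in range(steps + 1):
            if n == 0:
                integral = 0.0
            else:
                # Trapezoidal rule for int_0^{t_n} exp(-a s) b u(t_n - s) ds.
                vals = [math.exp(-a * times[k]) * b * u[n - k]
                        for k in range(n + 1)]
                integral = h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))
            new_u.append(free[n] + integral)
        u = new_u
    return times, u

times, u = picard_mild_solution(g=1.0, a=1.0, b=0.5, t_max=2.0)
print(u[-1], math.exp((0.5 - 1.0) * 2.0))   # numerical vs closed form
```

In this scalar setting the mild equation has the explicit solution \(u_t = \mathrm{e}^{(b-a)t}g\), which the iteration reproduces up to discretisation error.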

Proof of Lemma 6.1

First suppose we start with an emission of type i. By splitting the expectation in the definition of \(\psi _t[g]\) at the first scattering or fission event, and remembering that the time \(\kappa _{r,\upsilon }^{D}\) defined in (5.8) is deterministic, we have for \(r\in D\) and \(\upsilon \in V\),

$$\begin{aligned}&\psi _t[g](i,r,\upsilon ) \\&\quad =\mathrm{e}^{-\int _0^t\sigma ^i(r+\upsilon s,\upsilon ){ d }s}g(i,r+\upsilon t,\upsilon )\mathbf {1}_{(t<\kappa _{r,\upsilon }^{D} )}\\&\qquad +\,\int _0^{t\wedge \kappa _{r,\upsilon }^{D} } \sigma ^i(r+\upsilon s,\upsilon )\mathrm{e}^{-\int _0^s\sigma ^i(r+\upsilon u,\upsilon ){ d }u}\\&\qquad \times \, \Bigg \{ \frac{\sigma ^i_{\texttt {s}}(r+\upsilon s,\upsilon )}{\sigma ^i(r+\upsilon s,\upsilon )}\int _{V}\psi _{t-s}[g](i, r+\upsilon s, \upsilon ') \pi _{\texttt {s}}^i(r+\upsilon s, \upsilon , \upsilon '){ d }\upsilon '\\&\qquad +\,\frac{\sigma ^i_{\texttt {f}}(r+\upsilon s,\upsilon )}{\sigma ^i(r+\upsilon s,\upsilon )} \Bigg (\sum _{j =1}^\ell \int _{V}\psi _{t-s}[g](j, r+\upsilon s, \upsilon ') \pi ^{i,j}_{\texttt {f}}(r+\upsilon s, \upsilon , \upsilon '){ d }\upsilon ' \\&\qquad +\,\mathbf {1}_{(i=1)}\sum _{j = \ell +1}^m m^j(r+\upsilon s, \upsilon )\psi _{t-s}[g](j, r+\upsilon s, \upsilon )\Bigg )\Bigg \}{ d }s\\&\quad =\mathrm{e}^{-\int _0^t\sigma ^i(r+\upsilon s,\upsilon ){ d }s}g(i, r+\upsilon t,\upsilon )\mathbf {1}_{(t<\kappa _{r,\upsilon }^{D} )}\\&\qquad +\,\int _0^{t\wedge \kappa _{r,\upsilon }^{D} } \mathrm{e}^{-\int _0^s\sigma ^i(r+\upsilon u,\upsilon ){ d }u} ({\overset{_\leftarrow }{\texttt {S}}}_i+{\overset{_\leftarrow }{\texttt {F}}}_i+\sigma ^i)\psi _{t-s}[g](i, r+\upsilon s, \upsilon ){ d }s, \quad t\ge 0. \end{aligned}$$

Now appealing to an analogue of Lemma 1.2, Chapter 4 in [10] (see also the Appendix of [16]), we can transfer the exponential integrals in each of the terms on the right-hand side above to a potential term in the integral so that we end with

$$\begin{aligned} \psi _t[g](i,r,\upsilon )&= g(i, r+\upsilon t,\upsilon )\mathbf {1}_{(t<\kappa _{r,\upsilon }^{D} )}\nonumber \\&\quad +\,\int _0^{t\wedge \kappa _{r,\upsilon }^{D} } ({\overset{_\leftarrow }{\texttt {S}}}_i+{\overset{_\leftarrow }{\texttt {F}}}_i)\psi _{t-s}[g](i, r+\upsilon s, \upsilon ){ d }s, \quad t\ge 0, \end{aligned}$$
(6.9)

which agrees with (6.8), for \(1\le i\le \ell .\)

Following a similar approach, for \(\ell +1\le i\le m\), \(r\in D\), \(\upsilon \in V\), we also get

$$\begin{aligned} \psi _t[g](i,r,\upsilon )&=g(i, r+\upsilon t,\upsilon )\mathbf {1}_{(t<\kappa _{r,\upsilon }^{D} )} - \lambda _i \int _0^{t\wedge \kappa _{r,\upsilon }^{D} }\psi _{t-s}[g](i, r+\upsilon s, \upsilon )\mathrm{d}s\nonumber \\&\quad +\int _0^{t\wedge \kappa _{r,\upsilon }^{D} }\lambda _i\left\{ \sum _{j=1}^\ell \int _{V}\psi _{t-s}[g](j, r+\upsilon s, \upsilon ')\pi _{\texttt {f}}^{i,j}(r+\upsilon s, \upsilon , \upsilon ')\mathrm{d}\upsilon ' \right\} \mathrm{d}s\nonumber \\&=g(i, r+\upsilon t,\upsilon )\mathbf {1}_{(t<\kappa _{r,\upsilon }^{D} )} +\int _0^{t\wedge \kappa _{r,\upsilon }^{D} } ({\overset{_\leftarrow }{\texttt {S}}}_i+{\overset{_\leftarrow }{\texttt {F}}}_i)\psi _{t-s}[g](i, r+\upsilon s, \upsilon ){ d }s,\quad t\ge 0, \end{aligned}$$
(6.10)

noting in particular that, for \(\ell +1\le i\le m\), \({\overset{_\leftarrow }{\texttt {S}}}_i \equiv 0\). Now putting (6.9) and (6.10) together we obtain (6.8).

For uniqueness, suppose that \((\psi ^{(k)}_t, t\ge 0)\), \(k= 1,2\), are two bounded solutions to (6.8). Define \(\chi _t[g]: =|\psi ^{(1)}_t[g] - \psi ^{(2)}_t[g]|\) and note that, for \(i=1,\ldots ,m\),

$$\begin{aligned} \chi _t[g](i, r,\upsilon )&\le \int _0^{t\wedge \kappa _{r,\upsilon }^{D} } |({\overset{_\leftarrow }{\texttt {S}}}+{\overset{_\leftarrow }{\texttt {F}}}) \psi ^{(1)}_{t-s}[g](i,r+\upsilon s, \upsilon ) - ({\overset{_\leftarrow }{\texttt {S}}}+{\overset{_\leftarrow }{\texttt {F}}}) \psi ^{(2)}_{t-s}[g](i, r+\upsilon s, \upsilon )| { d }s \nonumber \\&\le \int _0^{t\wedge \kappa _{r,\upsilon }^{D} } ({\overset{_\leftarrow }{\texttt {S}}}+{\overset{_\leftarrow }{\texttt {F}}})|\psi ^{(1)}_{t-s}[g](i,r+\upsilon s, \upsilon ) - \psi ^{(2)}_{t-s}[g](i, r+\upsilon s, \upsilon )| { d }s \nonumber \\&\le \int _0^{t\wedge \kappa _{r,\upsilon }^{D} } ({\overset{_\leftarrow }{\texttt {S}}}+{\overset{_\leftarrow }{\texttt {F}}})\chi _{t-s}[g](i,r+\upsilon s, \upsilon ) { d }s \nonumber \\&\le C_1 \int _0^{t\wedge \kappa _{r,\upsilon }^{D} } \sum _{j=1}^m\int _{V} \chi _{t-s}[g](j, r+\upsilon s, \upsilon '){ d }\upsilon ' { d }s \nonumber \\&\quad +\,C_2\int _0^{t\wedge \kappa _{r,\upsilon }^{D} } \chi _{t-s}[g](i, r+\upsilon s, \upsilon ){ d }s \end{aligned}$$
(6.11)

for some constants \(C_1, C_2\in (0,\infty )\), where the final inequality follows on account of all cross sections being uniformly bounded. Now define \( {\bar{\chi }}_t[g]=\sup _{1\le i\le m, r\in {D}, \upsilon \in {V}}\chi _t[g](i,r, \upsilon )\), \(t\ge 0\). From (6.11) we have that

$$\begin{aligned} {\bar{\chi }}_t[g]&\le \left( C_1\sum _{j = 1}^m \texttt {Vol}(V) +C_2\right) \int _0^{t }{\bar{\chi }}_{t-s}[g] { d }s. \end{aligned}$$
(6.12)

Reversing the order of integration on the right-hand side above and then applying Grönwall’s Lemma allows us to conclude that \(\chi _t[g]\equiv 0 \), which shows uniqueness. \(\square \)
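To spell out this final step, set \(C = C_1 m\,\texttt {Vol}(V)+C_2\) and change variables \(s\mapsto t-s\) in (6.12), so that \({\bar{\chi }}_t[g]\le C\int _0^t{\bar{\chi }}_s[g]{ d }s\). Assuming, as we understand the word 'bounded' above, that \(K_T := \sup _{t\le T}{\bar{\chi }}_t[g]<\infty \) for each \(T>0\), iterating this inequality n times gives

$$\begin{aligned} {\bar{\chi }}_t[g]\le C^n\int _0^t\int _0^{s_1}\cdots \int _0^{s_{n-1}}{\bar{\chi }}_{s_n}[g]\,{ d }s_n\cdots { d }s_1\le K_T\frac{(Ct)^n}{n!}, \quad t\le T, \end{aligned}$$

and the right-hand side tends to zero as \(n\rightarrow \infty \). Hence \({\bar{\chi }}_t[g] = 0\) for all \(t\le T\), and T is arbitrary.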

7 Multi-species Neutron Random Walk and the Many-to-one Lemma

A second probabilistic perspective for analysing the MNTE is possible, one which seems rarely to have been discussed in the existing literature, if at all. This consists of collapsing the sum of the operators \({\overset{_\leftarrow }{\texttt {T}}} + {\overset{_\leftarrow }{\texttt {S}}}+{\overset{_\leftarrow }{\texttt {F}}}\) to take the form \({\overset{_\leftarrow }{\texttt {L}}}+\texttt {diag}({\beta })\) for an appropriate choice of \(\beta \), where \(\overset{_\leftarrow }{\texttt {L}}\) is an operator similar in structure to \({\overset{_\leftarrow }{\texttt {T}}} + {\overset{_\leftarrow }{\texttt {S}}}\). In essence, this transformation, which we will describe more rigorously in a moment, heuristically postulates that the operator \({\overset{_\leftarrow }{\texttt {T}}} + {\overset{_\leftarrow }{\texttt {S}}}+{\overset{_\leftarrow }{\texttt {F}}}\) can be reinterpreted via a Feynman–Kac formula as the infinitesimal generator of a single emission which undergoes linear transport and scattering and which also accumulates potential \(\beta \).

To describe this more precisely, we need to introduce the notion of a multi-species neutron random walk (MNRW). In the current setting this means a continuous-time typed random walk, denoted by \( (J_t, R_t, \Upsilon _t)\), \(t\ge 0\), on \(\{1,\ldots ,m\}\times (D\times V)\) with an additional cemetery state \(\{\dagger \}\), to which it is sent when it exits the physical domain D or when the emission otherwise disappears from the system. The MNRW is described by two fundamental quantities (which are functions of the current particle type, spatial position and velocity). First, a scattering rate \(\alpha ^i(r,\upsilon )\), \(i\in \{1,\ldots ,m\}, r\in D, \upsilon \in V\), such that \(\alpha ^i(r,\upsilon ) = \lambda _i\), for \(i \in \{\ell +1,\ldots ,m\}\). Second, a scattering probability kernel \(\pi ^{i,j}(r,\upsilon , \upsilon ')\), \(i,j\in \{1,\ldots , m\}, r\in D, \upsilon ,\upsilon '\in V\). In the spirit of the description of the MNBP, the MNRW is described as follows.

\(\triangleright \) When the MNRW is of type \(i \in \{1, \dots , \ell \}\) with configuration \((r, \upsilon )\), it moves in a straight line with velocity \(\upsilon \) from the point r until one of the following events occurs:

  • The MNRW position moves out of D, or the emission otherwise disappears from the system (e.g. it decomposes into an emission type that is not counted, or is captured in a nucleus), in which case it is instantaneously killed.

  • A scattering event occurs and, accordingly, the MNRW keeps the same emission type but makes an instantaneous change of velocity. If we write \(T^i_{\texttt {s}}\) for the random time until the next scattering occurs, then,

    $$\begin{aligned} \Pr (T^i_{\texttt {s}}>t) = \exp \left\{ -\int _0^t \alpha ^i(r+\upsilon s, \upsilon )\mathrm{d}s \right\} . \end{aligned}$$
    (7.1)
  • When scattering of an emission of type \(i \in \{1, \dots , \ell \}\) occurs at space-velocity \((r,\upsilon )\), the new velocity is selected independently with probability \(\pi ^i(r, \upsilon , \upsilon '){ d }\upsilon '\).

\(\triangleright \) Otherwise, if \(\ell +1\le i\le m\), then the emission remains motionless, i.e. the random walk is dormant, holding its initial position r, but retaining the velocity \(\upsilon \) as a mark. After an independent and exponentially distributed random time with rate \(\lambda _i\), the particle transfers to a type \(j\in \{1,\ldots ,\ell \}\) and acquires a new velocity \(\upsilon '\) with probability density \(\pi ^{i,j}(r,\upsilon ,\upsilon ')\).
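As an illustration of (7.1), the scattering time along a straight line with a position-dependent rate can be sampled by thinning. The Python sketch below is not taken from the paper: the function names, the one-dimensional geometry and the upper bound `alpha_max` are assumptions made purely for illustration, and exit from the domain D is ignored (in a full simulation the sampled time would be compared with the exit time).

```python
import random

def sample_scatter_time(alpha, r, v, alpha_max, rng):
    """Sample T with P(T > t) = exp(-int_0^t alpha(r + v*s, v) ds), cf. (7.1).

    Thinning: candidate times come from a homogeneous Poisson process of
    rate alpha_max, and a candidate at time s is accepted with probability
    alpha(r + v*s, v) / alpha_max. Requires alpha <= alpha_max everywhere.
    """
    t = 0.0
    while True:
        t += rng.expovariate(alpha_max)          # next candidate time
        if rng.random() * alpha_max <= alpha(r + v * t, v):
            return t                             # candidate accepted

# Hypothetical constant rate, used only to exercise the sampler.
def constant_rate(position, velocity):
    return 2.0

rng = random.Random(0)
samples = [sample_scatter_time(constant_rate, 0.0, 1.0, 4.0, rng)
           for _ in range(20000)]
mean_time = sum(samples) / len(samples)          # exponential with rate 2
```

With a constant rate the sampler reduces to an exponential random variable, which gives a simple sanity check; the same routine applies unchanged to any bounded rate function.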

We can associate to the MNRW the infinitesimal generator

$$\begin{aligned} {\overset{_\leftarrow }{\texttt {L}}}_if(i,r,\upsilon )&: = \mathbf {1}_{(i\le \ell )}\upsilon \cdot \nabla f(i, r,\upsilon )\mathbf {1}_{(r\in D)}\nonumber \\&\quad +\, \alpha ^i(r, \upsilon )\sum _{j=1}^m\int _{V}[f(j, r, \upsilon ') - f(i, r, \upsilon )]\pi ^{i,j}(r, \upsilon , \upsilon ')\mathrm{d}\upsilon ', \end{aligned}$$
(7.2)

for \(f\in \text {Dom}({\overset{_\leftarrow }{\texttt {L}}}) = \text {Dom}({\overset{_\leftarrow }{\texttt {T}}})\). We thus refer to the process as an \(\overset{_\leftarrow }{\texttt {L}}\)-MNRW.

With the notion of the MNRW in hand, let us consider the following algebraic manipulations. For \(i, j\in \{1, \dots , m\}\), \((r,\upsilon ) \in D\times V\) and \(\upsilon ' \in V\), define

$$\begin{aligned} \alpha ^i(r,\upsilon )&= \mathbf {1}_{(1\le i\le \ell )}\sigma _{\texttt {s}}^i(r,\upsilon ) \nonumber \\&\quad +\,\mathbf {1}_{(1\le i\le \ell )} \sigma _{\texttt {f}}^i(r,\upsilon )\left( \sum _{j=1}^\ell \int _V\pi _{\texttt {f}}^{i,j}(r, \upsilon ,\upsilon '){ d }\upsilon '+\mathbf {1}_{(i=1)}\sum _{j =\ell +1}^m m^j(r,\upsilon ) \right) \nonumber \\&\quad +\,\mathbf {1}_{(\ell +1\le i\le m)}\lambda _i\sum _{j=1}^\ell \int _V\pi _{\texttt {f}}^{i,j}(r, \upsilon , \upsilon '){ d }\upsilon ', \end{aligned}$$
(7.3)
$$\begin{aligned} \pi ^{i,j}(r, \upsilon , \upsilon ')&= (\alpha ^i(r,\upsilon ))^{-1}\Bigg [\sigma _{\texttt {s}}^i(r,\upsilon )\pi _{\texttt {s}}^i(r, \upsilon , \upsilon ')\mathbf {1}_{(1\le i=j\le \ell )} \nonumber \\&\quad +\,\sigma _{\texttt {f}}^i(r,\upsilon )\left( \pi _{\texttt {f}}^{i,j}(r, \upsilon , \upsilon ')\mathbf {1}_{(1\le i, j\le \ell )} + m^j(r,\upsilon )\mathbf {1}_{(i=1, j>\ell )} \right) \nonumber \\&\quad +\,\lambda _i\pi _{\texttt {f}}^{i,j}(r, \upsilon , \upsilon ')\mathbf {1}_{(\ell +1 \le i \le m, \, j\le \ell )}\Bigg ], \end{aligned}$$
(7.4)
$$\begin{aligned} \beta ^i(r,\upsilon )&= \alpha ^i(r,\upsilon ) - \mathbf {1}_{(1\le i\le \ell )} \sigma ^i_{\texttt {s}}(r,\upsilon ) - \mathbf {1}_{(\ell +1\le i\le m)}\lambda _i- \mathbf {1}_{(1\le i\le \ell )} \sigma ^i_{\texttt {f}}(r,\upsilon ) . \end{aligned}$$
(7.5)

Note, in particular, that for each fixed \(1\le i\le m\), \(r\in D\) and \(\upsilon \in V\), \(\pi ^{i,j}(r,\upsilon ,\upsilon ')\) is a probability distribution on \(\{1,\ldots , m\}\times V\) in the sense that \(\sum _{j=1}^m \int _{V} \pi ^{i,j}(r,\upsilon ,\upsilon '){ d }\upsilon ' = 1\). Note also that the assumption \(\sum _{j= 1}^\ell \int _V\pi _{\texttt {f}}^{i,j}(r, \upsilon ,\upsilon '){ d }\upsilon '\ge 1\) ensures that \(\beta ^i\ge 0\), for \(1\le i\le m\).
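For orientation, substituting (7.3) into (7.5) gives the potential explicitly. This is a direct computation from the two displays, recorded here only as a reading aid:

$$\begin{aligned} \beta ^i(r,\upsilon )&= \sigma ^i_{\texttt {f}}(r,\upsilon )\left( \sum _{j=1}^\ell \int _V\pi _{\texttt {f}}^{i,j}(r, \upsilon ,\upsilon '){ d }\upsilon '+\mathbf {1}_{(i=1)}\sum _{j =\ell +1}^m m^j(r,\upsilon ) - 1\right) , \quad 1\le i\le \ell ,\\ \beta ^i(r,\upsilon )&= \lambda _i\left( \sum _{j=1}^\ell \int _V\pi _{\texttt {f}}^{i,j}(r, \upsilon ,\upsilon '){ d }\upsilon ' - 1\right) , \quad \ell +1\le i\le m. \end{aligned}$$

In each case the sign of \(\beta ^i\) is that of the bracketed term.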

With simple algebra, we may now identify

$$\begin{aligned} ({\overset{_\leftarrow }{\texttt {T}}}+ {\overset{_\leftarrow }{\texttt {S}}}+ {\overset{_\leftarrow }{\texttt {F}}})f ( r, \upsilon ) = {\overset{_\leftarrow }{\texttt {L}}} f ( r, \upsilon )+ \texttt {diag}(\beta ) f ( r, \upsilon ) \end{aligned}$$
(7.6)

for \(f\in \) Dom\(({\overset{_\leftarrow }{\texttt {A}}}) \) (which, as remarked earlier, is equal to \(\text {Dom}({\overset{_\leftarrow }{\texttt {T}}})\)), where \({\overset{_\leftarrow }{\texttt {L}}}\) is given by (7.2).

Heuristically speaking, we have algebraically gathered all of the operators into the infinitesimal generator of an \(\overset{_\leftarrow }{\texttt {L}}\)-MNRW and a local potential \(\beta \). This has the attraction of leading us to the aforementioned representation of the solution to the MNTE as a single-emission Feynman–Kac formula. Said another way, one would expect, in the appropriate sense, the solution to the MNTE to be represented in the form

$$\begin{aligned} \phi _t[g](i,r,\upsilon ) = \mathbf {E}_{(i, r,\upsilon )}\left[ \mathrm{e}^{\int _0^t\beta ^{J_s}(R_s, \Upsilon _s)\mathrm{d}s}g(J_t, R_t, \Upsilon _t) \mathbf {1}_{(t < \tau _D)}\right] , \end{aligned}$$
(7.7)

for \(t\ge 0\), \(1\le i\le m\), \(r\in D, \upsilon \in V\). Here we write \(\mathbf{P}_{(i,r, \upsilon )}\) for the law of the \(\overset{_\leftarrow }{\texttt {L}}\)-MNRW starting from a single emission with configuration \((i, r, \upsilon )\), and \(\mathbf {E}_{(i,r, \upsilon )}\) for the corresponding expectation operator.
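To indicate how (7.7) might be used numerically, the following Python sketch estimates the expectation by averaging over simulated paths. It is a toy analogue under loudly stated assumptions that are not part of the paper: a single emission type, a one-dimensional rod \(D = (0, \texttt{length})\), velocities \(\pm \texttt{speed}\), a constant scattering rate `alpha`, a constant potential `beta`, and a uniformly chosen new direction at each scatter. All function and parameter names are hypothetical.

```python
import math
import random

def simulate_weight(r, v, t, alpha, beta, length, speed, g, rng):
    """One sample of exp(int_0^t beta ds) * g(R_t, V_t) * 1(t < tau_D)."""
    clock = 0.0
    potential = 0.0                      # accumulates int_0^clock beta ds
    while True:
        # time to the next scatter (never, if there is no scattering)
        wait = rng.expovariate(alpha) if alpha > 0 else math.inf
        # time to reach the boundary along the current straight line
        exit_time = (length - r) / v if v > 0 else r / (-v)
        remaining = t - clock
        if remaining < exit_time and remaining <= wait:
            potential += beta * remaining
            return math.exp(potential) * g(r + v * remaining, v)
        if exit_time <= wait:
            return 0.0                   # left D before time t: killed
        potential += beta * wait         # scatter happens first
        r += v * wait
        clock += wait
        v = rng.choice((-speed, speed))  # assumed uniform scattering kernel

def feynman_kac_estimate(r, v, t, alpha, beta, length, speed, n_paths,
                         g=lambda r, v: 1.0, seed=0):
    """Monte Carlo average of the path weights, cf. (7.7)."""
    rng = random.Random(seed)
    total = sum(simulate_weight(r, v, t, alpha, beta, length, speed, g, rng)
                for _ in range(n_paths))
    return total / n_paths

estimate = feynman_kac_estimate(r=0.5, v=1.0, t=1.0, alpha=1.0, beta=0.3,
                                length=1.0, speed=1.0, n_paths=10000)
```

Because each path is simulated independently of the others, the loop over paths can be split across workers without any communication. With `alpha = 0` the estimator is deterministic, which gives a simple check of the killing and weighting logic.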

Appealing to the Markov property for \((J, R, \Upsilon )\), it is not difficult to show that a semigroup property similar to (6.7) holds. That is to say, for \(s,t\ge 0\), \(1\le i\le m\), \(r\in D, \upsilon \in V\)

$$\begin{aligned} \phi _{s+t}[g](i,r,\upsilon ) = \phi _{s}[\phi _t[g]](i,r,\upsilon ). \end{aligned}$$

Similarly to the case of \((\psi _t[g], t\ge 0)\), if we put g in the smaller space \(\prod _{i = 1}^m C^+(D\times V)\) then we also have \(\lim _{t\rightarrow 0}\phi _t[g] = g\) in the pointwise sense, but otherwise strong continuity at \(t = 0\) is unclear. Note also that, since all cross sections are uniformly bounded, so is \(\beta \) (in all of its variables) by a constant, say \({\bar{\beta }}\). Hence, for \(g\in \prod _{i = 1}^m L_\infty (D\times V)\), we have \(\phi _t[g]\le ||g ||_\infty \exp ({\bar{\beta }} t)\), \(t\ge 0\). As with the case of \((\psi _t[g], t\ge 0)\), the notion that \((\phi _t[g], t\ge 0)\) solves (5.5) is not a straightforward claim. Nonetheless, as one might expect, these two expectation semigroups are equal, and we can see this by relating back to (6.8).

Indeed, by conditioning the expectation in the definition of \(\phi _t[g]\) on the first scattering event, and then appealing to Lemma 1.2, Chapter 4 in [10] in a similar manner to what was done in the proof of Lemma 6.1, one easily deduces the result below. In the spatial branching process literature, this would be called a ‘many-to-one’ lemma.

Lemma 7.1

For \(g\in \prod _{i = 1}^m L^+_\infty (D\times V)\), the two expectation semigroups \((\phi _t[g], t\ge 0)\) and \((\psi _{t}[g], t\ge 0)\) agree.
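As a sketch of the conditioning step, consider \(1\le i\le \ell \) and assume, consistently with the decomposition used later in (8.5), that \(\sigma ^i = \sigma ^i_{\texttt {s}}+\sigma ^i_{\texttt {f}}\), so that (7.5) gives \(\beta ^i-\alpha ^i = -\sigma ^i\). Splitting the expectation in (7.7) according to whether the first scattering time exceeds \(t\wedge \kappa _{r,\upsilon }^{D}\), and applying the Markov property at that time, one expects

$$\begin{aligned} \phi _t[g](i,r,\upsilon )&=\mathrm{e}^{-\int _0^t\sigma ^i(r+\upsilon s,\upsilon ){ d }s}g(i,r+\upsilon t,\upsilon )\mathbf {1}_{(t<\kappa _{r,\upsilon }^{D} )}\\&\quad +\,\int _0^{t\wedge \kappa _{r,\upsilon }^{D} } \alpha ^i(r+\upsilon s,\upsilon )\mathrm{e}^{-\int _0^s\sigma ^i(r+\upsilon u,\upsilon ){ d }u}\sum _{j=1}^m\int _{V}\phi _{t-s}[g](j, r+\upsilon s, \upsilon ') \pi ^{i,j}(r+\upsilon s, \upsilon , \upsilon '){ d }\upsilon '{ d }s. \end{aligned}$$

Here the factor \(\alpha ^i\mathrm{e}^{-\int _0^s\alpha ^i}\) is the density of the first scattering time from (7.1) and \(\mathrm{e}^{\int _0^s\beta ^i}\) is the potential accumulated up to that time. Inserting (7.4) for \(\alpha ^i\pi ^{i,j}\) yields an equation of the same form as the one obtained for \(\psi _t[g]\) by conditioning on the first event, which is the starting point for the comparison.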

8 Consolidating the ACP with the Expectation Semigroup

We want to understand how the \(\prod _{i= 1}^m L_2({D}\times V)\) semigroup \((\texttt {V}_t, t\ge 0)\) that represents the unique solution to the Abstract Cauchy Problem (5.5) relates to the expectation semigroups \((\psi _t, t\ge 0)\) and \((\phi _t, t\ge 0)\) that offer two different stochastic representations of the solution to the mild equation (6.8).

We start by noting that if \(g\in \prod _{i = 1}^m L^+_\infty (D\times V)\), then, on account of the fact that \(\mathrm{Vol}(\prod _{i =1}^m (D\times V)) =( \int _{D\times V}{ d }r{ d }\upsilon )^m<\infty \), we also have \(g\in \prod _{i= 1}^m L_2({D}\times V)\). Since it is unclear whether \((\psi _t[g], t\ge 0)\) is well defined for all \(g\in \prod _{i= 1}^m L_2({D}\times V)\), it makes sense to consider the comparison with \((\texttt {V}_t[g], t\ge 0)\) (defined in (5.6)) for the more restrictive choice \(g\in \prod _{i = 1}^m L_\infty (D\times V)\). The natural setting in which to make the comparison is in the space \(\prod _{i= 1}^m L_2({D}\times V)\) as, by (6.4), \(||\psi _t[g] ||_\infty <\infty \) and the latter implies \(||\psi _t[g] ||_2<\infty \), again thanks to the fact that \(\mathrm{Vol}(\prod _{i =1}^m (D\times V))<\infty .\)

Theorem 8.1

If \(g\in \prod _{i = 1}^m L^+_\infty (D\times V)\) then, for \(t\ge 0\), \(\texttt {V} _t[g] = \psi _t[g]\) on \(\prod _{i= 1}^m L_2({D}\times V)\), i.e. \(||\texttt {V} _t[g] - \psi _t[g] ||_2 = 0\).

Before moving to its proof, the reader should take care to note that this does not imply that \((\texttt {V}_t, t\ge 0)\) and \((\psi _t, t\ge 0)\) agree as \(c_0\)-semigroups on \(\prod _{i= 1}^m L_2({D}\times V)\). In particular, the comparison between the two semigroup operators is only made on \(\prod _{i = 1}^m L^+_\infty (D\times V)\), and \((\psi _t, t\ge 0)\) was not (and in fact cannot be) shown to demonstrate the strong continuity property on \(\prod _{i= 1}^m L_2({D}\times V)\).

Remark 8.1

If we consider Theorem 8.1 in light of Theorem 5.3, noting that \((\psi _t[g],t\ge 0)\) is uniformly bounded on compact time intervals, it is tempting to want to say that the leading eigenfunction \(\varphi \) belongs to \(\prod _{i = 1}^m L_\infty (D\times V)\). This is not necessarily the case and remains to be proved. In the setting of a single type of emission, this will be demonstrated in the forthcoming paper [14].

Proof of Theorem 8.1

Consider the adjusted ACP with inhomogeneity given by

$$\begin{aligned} \left\{ \begin{array}{ll} \dfrac{\partial u_t}{\partial t} &{}= {\overset{_\leftarrow }{\texttt {T}}} u_t+({\overset{_\leftarrow }{\texttt {S}}}+{\overset{_\leftarrow }{\texttt {F}}})\texttt {V}_{t}[g] \\ u_0&{} = g \end{array} \right. \end{aligned}$$
(8.1)

By taking the difference of two solutions and invoking the uniqueness of the ACP in \(\prod _{i= 1}^m L_2({D}\times V)\) with initial data \(g = 0\), we note that the solution to (8.1) is unique in \(\prod _{i= 1}^m L_2({D}\times V)\). However, on the one hand, it is straightforward to verify that

$$\begin{aligned} u_t= \mathrm{e}^{t{\overset{_\leftarrow }{\texttt {T}}}}g + \int _0^t\mathrm{e}^{(t-s){\overset{_\leftarrow }{\texttt {T}}}} ({\overset{_\leftarrow }{\texttt {S}}}+{\overset{_\leftarrow }{\texttt {F}}})\texttt {V}_{s}[g] { d }s, \quad t\ge 0, \end{aligned}$$

solves (8.1). On the other hand, taking account of the fact that \(({\texttt {V}}_t[g],t\ge 0)\) solves (5.5), it is also the case that

$$\begin{aligned} u_t = {\texttt {V}}_t[g], \quad t\ge 0, \end{aligned}$$

solves (8.1). Uniqueness thus tells us that on \(\prod _{i= 1}^m L_2({D}\times V)\),

$$\begin{aligned} {\texttt {V}}_t[g] = {\texttt {U}}_t[g]+ \int _0^t\mathrm{e}^{(t-s){\overset{_\leftarrow }{\texttt {T}}}} ({\overset{_\leftarrow }{\texttt {S}}}+{\overset{_\leftarrow }{\texttt {F}}}){\texttt {V}}_{s}[g] { d }s = {\texttt {U}}_t[g]+ \int _0^t\texttt {U}_s[ ({\overset{_\leftarrow }{\texttt {S}}}+{\overset{_\leftarrow }{\texttt {F}}}){\texttt {V}}_{t-s}[g]] { d }s, \quad t\ge 0, \end{aligned}$$

where in the second equality we have reversed the direction of integration. In conclusion, whereas \((\psi _t[g], t\ge 0)\) solves (6.8) in the pointwise sense, \(({\texttt {V}}_t[g], t\ge 0)\) solves it in the \(\prod _{i= 1}^m L_2({D}\times V)\) sense.

On the other hand, we know that \((\psi _t[g], t\ge 0)\) is valued in \(\prod _{i= 1}^m L_2({D}\times V)\), hence we can consider,

$$\begin{aligned} ||\psi _t[g] - {\texttt {V}}_t[g] ||_2 = \left\| \int _0^t \texttt {U}_s[ ({\overset{_\leftarrow }{\texttt {S}}}+{\overset{_\leftarrow }{\texttt {F}}})\{\psi _{t-s}[g]-\texttt {V}_{t-s}[g]\}]{ d }s\right\| _2, \quad t\ge 0. \end{aligned}$$

To this end, let us note that, for \(T>0\), and \(w_t\in \prod _{i= 1}^m L_2({D}\times V)\), \(t\le T\), we have

$$\begin{aligned} ||\int _0^t w_s{ d }s ||_2^2&= \int _{D\times V} \left( t\int _0^tw_s (r,\upsilon )\frac{{ d }s}{t}\right) ^2{ d }r{ d }\upsilon \nonumber \\&\le \int _{D\times V} t^2\left( \int _0^tw_s (r,\upsilon )^2\frac{{ d }s}{t}\right) { d }r{ d }\upsilon \nonumber \\&\le T\int _0^t ||w_s ||^2_2 { d }s, \quad t\le T, \end{aligned}$$
(8.2)

where in the first inequality we have used Jensen’s inequality, and in the second Tonelli’s theorem together with \(t\le T\). Moreover, for \(f\in \prod _{i= 1}^m L_2({D}\times V)\),

$$\begin{aligned} ||\texttt {U}_s[f] ||^2_2&= \sum _{i =1}^m\int _{D\times V} \mathbf {1}_{(s<\kappa _{r,\upsilon }^D)}f(i, r+\upsilon s, \upsilon )^2{ d }r{ d }\upsilon \nonumber \\&\le \sum _{i =1}^m\int _{D\times V} f(i, r', \upsilon )^2{ d }r '{ d }\upsilon \nonumber \\&=||f ||_2^2 \end{aligned}$$
(8.3)

where the inequality follows from the fact that, for each \(\upsilon \), the integral \(\int _D \mathbf {1}_{(s<\kappa _{r,\upsilon }^D)}f(i, r+\upsilon s, \upsilon )^2{ d }r\) integrates over a subdomain of D. Also, we have for the operator \({\overset{_\leftarrow }{\texttt {S}}}\) (and similarly for \({\overset{_\leftarrow }{\texttt {F}}}\)), for \(f\in \prod _{i= 1}^m L_2({D}\times V)\),

$$\begin{aligned} ||({\overset{_\leftarrow }{\texttt {S}}}+\texttt {diag}(\sigma _{\texttt {s}})) f ||_2&= \left( \sum _{i = 1}^m\int _{D\times V} \left( \int _{V} f(i,r,\upsilon ')\sigma ^i_{\texttt {s}}(r,\upsilon )\pi ^i_{\texttt {s}}(r,\upsilon ,\upsilon '){ d }\upsilon '\right) ^2 { d }r{ d }\upsilon \right) ^{1/2}\nonumber \\&\le C\left( \sum _{i = 1}^m\int _{D\times V} \left( \int _{V} f(i,r,\upsilon ')\times 1\,{ d }\upsilon '\right) ^2 { d }r{ d }\upsilon \right) ^{1/2}\nonumber \\&\le C \left( \sum _{i = 1}^m \mathrm{Vol}(V)\int _{D\times V} \int _{V}f(i, r, \upsilon ')^2{ d }\upsilon '{ d }r{ d }\upsilon \right) ^{1/2}\nonumber \\&\le C \max _{1\le i\le m}\mathrm{Vol}(V) ||f ||_2, \end{aligned}$$
(8.4)

where the constant C arises from bounding the uniformly bounded cross sections from above, and in the second inequality we have used Cauchy–Schwarz.

It thus follows from (8.2), (8.3) and (8.4) that, for \(t\le T\), writing \(\omega _t = \psi _t[g]-\texttt {V}_{t}[g]\), \(t\ge 0\),

$$\begin{aligned} ||\omega _t ||_2^2&= \left\| \int _0^t \texttt {U}_s[ ({\overset{_\leftarrow }{\texttt {S}}}+{\overset{_\leftarrow }{\texttt {F}}})\omega _{t-s}]{ d }s\right\| ^2_2\nonumber \\&\le T\int _0^t ||\texttt {U}_s[ ({\overset{_\leftarrow }{\texttt {S}}}+{\overset{_\leftarrow }{\texttt {F}}})\omega _{t-s}] ||_2^2{ d }s\nonumber \\&\le T\int _0^t||({\overset{_\leftarrow }{\texttt {S}}}+{\overset{_\leftarrow }{\texttt {F}}} )\omega _{t-s} ||_2^2{ d }s\nonumber \\&= T\int _0^t||({\overset{_\leftarrow }{\texttt {S}}}+{\overset{_\leftarrow }{\texttt {F}}}+\texttt {diag}(\sigma )-\texttt {diag}(\sigma ) )\omega _{s} ||_2^2{ d }s\nonumber \\&\le T\int _0^t \left( ||({\overset{_\leftarrow }{\texttt {S}}}+ \texttt {diag}(\sigma _{\texttt {s}}))\omega _s ||_2 +||({\overset{_\leftarrow }{\texttt {F}}}+ \texttt {diag}(\sigma _{\texttt {f}}))\omega _s ||_2 +||\texttt {diag}(\sigma )\omega _{s} ||_2\right) ^2 { d }s\nonumber \\&\le C' \int _0^t||\omega _s ||_2^2{ d }s,\quad t\le T, \end{aligned}$$
(8.5)

where the constant \(C'\) comes from the fact that \(\sigma \) is uniformly bounded. The final inequality in (8.5) together with Grönwall’s Lemma now tells us that \(||\omega _t ||_2 = 0\), for all \(t\le T\). Since T is chosen arbitrarily, it follows that \((\psi _t[g],t\ge 0)\) and \((\texttt {V}_t[g], t\ge 0)\) are indistinguishable in \(\prod _{i= 1}^m L_2({D}\times V)\). \(\square \)

The conclusion of this section is that it is not unreasonable to now understand the expectation semigroups \((\psi _t[g],t\ge 0)\) and \((\phi _t[g],t\ge 0)\) for non-negative, bounded and measurable g on \(D\times V\) as the ‘solution’ to the MNTE in place of \((\texttt {V}_t[g], t\ge 0)\) for the same class of g. Indeed, the two agree in \(\prod _{i= 1}^m L_2({D}\times V)\) and hence \(({ d }r\times { d }\upsilon )\)-Lebesgue almost everywhere.

The reader will also note that from the perspective of Monte Carlo simulation, the expectation semigroup \( (\phi _t[g],t\ge 0)\) carries the potential to be exploited in a way that \( (\psi _t[g],t\ge 0)\) cannot. More precisely, where branching trees are difficult to simulate and are not convenient for Monte Carlo computational parallelisation, random walks are. This simple idea is explored in greater detail in the accompanying paper to this one [5].

9 Asymptotic Behaviour of the MNTE: Proof of Theorem 5.3

In this section we return to the fundamental notion that the solution to the MNTE in the form (5.5) is described by its leading asymptotics for large times. That is to say, we give the proof of Theorem 5.3. Our proof follows closely ideas found in Chapters 4 and 5 of [25].

Recall that the quantities \(\alpha ^i\), \(\pi ^{i,j}\), \(\beta ^i\), \(i,j =1,\ldots ,m\) were defined in (7.3), (7.4) and (7.5) respectively. They were arranged into the operator \(\overset{_\leftarrow }{\texttt {A}}= \overset{_\leftarrow }{\texttt {T}}+\overset{_\leftarrow }{\texttt {S}}+\overset{_\leftarrow }{\texttt {F}}\), such that Dom\((\overset{_\leftarrow }{\texttt {A}})=\) Dom\(({\overset{_\leftarrow }{\texttt {T}}})\), described in (5.9).

For \(i,j = 1, \dots ,m\), let us introduce the operators \(\texttt {K}_{i,j}\) on \(L_2(D \times V) \) by

$$\begin{aligned} \texttt {K}_{i,j}f(r,\upsilon ) = \alpha ^i(r,\upsilon )\int _{V}f(r, \upsilon ')\pi ^{i,j}(r, \upsilon , \upsilon ')\mathrm{d}\upsilon ' . \end{aligned}$$

These are integral operators, which take the form

$$\begin{aligned} \texttt {K}_{i,j}f(r,\upsilon ) = \int _{V} f(r,\upsilon ') \texttt {k}_{i,j}(r,\upsilon ,\upsilon '){ d }\upsilon ' \end{aligned}$$

where the kernel \(\texttt {k}_{i,j}\) is defined on \(D\times V\times V\) by

$$\begin{aligned} \texttt {k}_{i,j}(r,\upsilon ,\upsilon ') = \sigma ^i_{\texttt {s}}\pi ^i_{\texttt {s}}(r,\upsilon ,\upsilon ')+ \sigma ^i_{\texttt {f}}\pi ^{i,j}_{\texttt {f}}(r,\upsilon ,\upsilon '). \end{aligned}$$
(9.1)

A similar computation to (8.4) also shows that \(\texttt {K}_{i,j}g \in L_2(D \times V)\) when \(g\in L_2(D \times V)\). Then from (5.1) and (7.4), taking care to note the use of the indicators for the inclusion of terms for different indices, we can write, for \(1\le i\le \ell \), for \(g\in \mathrm{Dom}(\overset{_\leftarrow }{\texttt {A}})\),

$$\begin{aligned} \overset{_\leftarrow }{\texttt {A}}_i g(i,r,\upsilon )&= \overset{_\leftarrow }{\texttt {T}}_ig(i,r,\upsilon ) -\sigma ^i (r,\upsilon )g(i,r,\upsilon )\nonumber \\&\quad +\sum _{j = 1}^\ell \texttt {K}_{i,j}g(j, r,\upsilon ) + \mathbf {1}_{(i=1)}\sigma ^1(r, \upsilon )\sum _{j =\ell +1}^m m^j(r,\upsilon )g(j,r,\upsilon ) \end{aligned}$$
(9.2)

Moreover, for \(\ell +1\le i\le m\),

$$\begin{aligned} \overset{_\leftarrow }{\texttt {A}}_i g(i,r,\upsilon )&= -\lambda _ig(i, r,\upsilon ) + \sum _{j=1}^\ell \texttt {K}_{i,j}g(j, r,\upsilon ) \end{aligned}$$
(9.3)

With this notation, write

$$\begin{aligned} \texttt {T}&=\texttt {diag}({\overset{_\leftarrow }{\texttt {T}}_1-\sigma ^1,\ldots ,\overset{_\leftarrow }{\texttt {T}}_\ell }-\sigma ^\ell ) ,\\ \Lambda&= \texttt {diag}(\lambda _{\ell +1}, \dots , \lambda _m),\\ \texttt {K}^\circ&= (\texttt {K}_{i,j}), \quad \text { for } i,j = 1, \dots , \ell ,\\ \texttt {M}&= (\texttt {M}_{i,j}), \quad \text { where } \texttt {M}_{i,j} = \sigma ^1(r,\upsilon )m^j(r,\upsilon )\mathbf {1}_{(i=1)}, \text { for } i = 1,\dots , \ell , j = \ell +1,\dots , m,\\ \texttt {K}_\circ&= (\texttt {K}_{i,j}), \quad \text { for } i = \ell +1, \dots , m, j = 1, \dots , \ell . \end{aligned}$$

Then the abstract Cauchy problem (5.5) on \(\prod _{j= 1}^m L_2({D}\times V)\) may now be written in matrix form

$$\begin{aligned} \frac{\partial }{\partial t}u_t = \varvec{A}u_t, \quad t\ge 0, \end{aligned}$$

where \(\varvec{A} = \varvec{T} + \varvec{K}\) and

$$\begin{aligned} \varvec{T} = \begin{bmatrix} \texttt {T}&\mathbf 0 \\ \mathbf 0&-\Lambda \end{bmatrix}\quad \text { and } \quad \varvec{K} = \begin{bmatrix} \texttt {K}^\circ&\texttt {M}\\ \texttt {K}_\circ&\mathbf 0 \end{bmatrix}. \end{aligned}$$

The matrix \(\varvec{T}\) is an operator on \(\prod _{i=1}^m L_2(D\times V)\) with domain

$$\begin{aligned} \mathrm{Dom}(\varvec{T}) = \prod _{i = 1}^\ell \mathrm{Dom}(\overset{_\leftarrow }{\texttt {T}}_i) \times \prod _{i = \ell +1}^m L_2(D \times V) \end{aligned}$$

which generates the strongly continuous semigroup \(({\texttt {U}}^{\varvec{T}}_t, t\ge 0)\) given by

$$\begin{aligned} {\texttt {U}}^{\varvec{T}}_t[g] = \left\{ \begin{array}{ll} \mathrm{e}^{-\int _0^t \sigma ^i(r+\upsilon s, \upsilon )\mathrm{d}s}\texttt {U}_t[g]&{} \quad 1\le i\le \ell \\ \mathrm{e}^{-\lambda _i t}g&{}\quad \ell + 1\le i\le m, \end{array} \right. \end{aligned}$$
(9.4)

for \(g\in \prod _{i=1}^m L_2(D\times V)\).

In order to prove Theorem 5.3, we consider a different operator that is related to \(\varvec{A}\) as follows. Consider the eigenvalue problem

$$\begin{aligned} \varvec{A}\varphi = \lambda \varphi , \quad \lambda > -\lambda _{\ell +1}, \end{aligned}$$
(9.5)

for \(\varphi \in \prod _{i =1}^m L_2(D\times V)\). Write

$$\begin{aligned} \varphi ^\circ (\cdot ) =(\varphi (1,\cdot ),\ldots , \varphi (\ell , \cdot ) )\text { and } \varphi _\circ (\cdot ) =(\varphi (\ell +1,\cdot ),\ldots , \varphi (m, \cdot ) ) \end{aligned}$$

so that \(\varphi \) is the concatenation \((\varphi ^\circ , \varphi _\circ )\). Separating (9.5) into its prompt and delayed components, it can be written as

$$\begin{aligned}&\texttt {T}\varphi ^\circ + \texttt {K}^\circ \varphi ^\circ + \texttt {M}\varphi _\circ = \lambda \varphi ^\circ \\&\lambda {\texttt {I}}_{m-\ell }\varphi _\circ = -\Lambda \varphi _\circ + \texttt {K}_\circ \varphi ^\circ , \end{aligned}$$

where \(\texttt {I}_{m-\ell }\) is the \((m-\ell )\times (m-\ell )\) identity matrix. Solving the second equation for \(\varphi _\circ \) and substituting the result into the first, we get

$$\begin{aligned} \varphi _\circ = (\lambda \texttt {I}_{m-\ell }+\Lambda )^{-1}\texttt {K}_\circ \varphi ^\circ \end{aligned}$$
(9.6)

and

$$\begin{aligned} (\lambda \texttt {I}_\ell - \texttt {T})^{-1}\texttt {K}^\circ (\lambda )\varphi ^\circ =\varphi ^\circ \text { where }\texttt {K}^\circ (\lambda ) = \texttt {K}^\circ + \texttt {M}(\lambda \texttt {I}_{m-\ell }+\Lambda )^{-1}\texttt {K}_\circ . \end{aligned}$$
(9.7)
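The reduction leading to (9.6) and (9.7) can be checked in a finite-dimensional analogue in which the operators are replaced by small matrices. The Python sketch below is illustrative only: the numerical values, the choice \(\ell = 2\), \(m = 3\), and the helper names are assumptions, and power iteration is used here simply as a convenient way to find the leading eigenpair of a matrix with non-negative off-diagonal entries.

```python
def mat_vec(a, x):
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]

def perron_pair(a, shift, n_iter=500):
    """Leading eigenpair of a matrix with non-negative off-diagonal entries,
    via power iteration on a + shift*I (shift makes every entry >= 0)."""
    n = len(a)
    shifted = [[a[i][j] + (shift if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    x = [1.0] * n
    mu = 1.0
    for _ in range(n_iter):
        y = mat_vec(shifted, x)
        mu = max(y)                      # normalise by the sup-norm
        x = [yi / mu for yi in y]
    return mu - shift, x

# Toy data (assumed): two prompt types and one delayed type.
T_diag = [-1.0, -1.5]             # diagonal stand-in for the operator T
Lam = 0.5                         # Lambda = diag(lambda_3)
K_up = [[0.4, 0.3], [0.2, 0.5]]   # K^circ (prompt block)
M = [0.6, 0.0]                    # column of M; only row i = 1 is non-zero
K_low = [0.3, 0.2]                # row of K_circ (delayed block)

A = [[T_diag[0] + K_up[0][0], K_up[0][1], M[0]],
     [K_up[1][0], T_diag[1] + K_up[1][1], M[1]],
     [K_low[0], K_low[1], -Lam]]

lam_c, phi = perron_pair(A, shift=2.0)
phi_prompt, phi_delayed = phi[:2], phi[2]

# K^circ(lambda) = K^circ + M (lambda I + Lambda)^{-1} K_circ, cf. (9.7)
K_lam = [[K_up[i][j] + M[i] * K_low[j] / (lam_c + Lam) for j in range(2)]
         for i in range(2)]
# (lambda I - T)^{-1} K^circ(lambda)
B = [[K_lam[i][j] / (lam_c - T_diag[i]) for j in range(2)] for i in range(2)]

# Residuals of (9.7) and (9.6) at the leading eigenpair of A.
residual_97 = max(abs(u - w)
                  for u, w in zip(mat_vec(B, phi_prompt), phi_prompt))
residual_96 = abs(phi_delayed
                  - sum(k * p for k, p in zip(K_low, phi_prompt)) / (lam_c + Lam))
```

Both residuals vanish up to rounding, reflecting that an eigenvector of the block matrix with eigenvalue \(\lambda \) restricts to a fixed point of the reduced matrix, and conversely that the delayed component is recovered from the prompt one.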

Our strategy is to show that there exists a \(\lambda _c\) such that \((\lambda _c \texttt {I}_\ell - \texttt {T})^{-1}\texttt {K}^\circ (\lambda _c )\) has leading eigenvalue 1, and that this is equivalent to \(\lambda _c\) being an eigenvalue of \(\varvec{A}\). The tool we shall use to do this is the Krein–Rutman theorem, which we recall here for convenience in a format that is appropriate for our use; cf. [6, p. 286].

Theorem 9.1

(Krein–Rutman theorem) Let X be a Banach space and suppose it contains a convex cone \(\mathcal {C}\) such that \(\mathcal {C} - \mathcal {C}: = \{h = f-g: f, g\in \mathcal {C}\}\) is dense in X. Suppose \(\mathcal {L}\) is a positive compact linear operator on X such that \({r}(\mathcal {L}) :=\sup \{|\lambda | : \lambda \in \Sigma (\mathcal {L})\} > 0\), where \(\Sigma (\mathcal {L})\) is the spectrum of the operator \(\mathcal {L}\). Then \({r}(\mathcal {L})\) is an eigenvalue of \(\mathcal {L}\) with a corresponding positive eigenfunction.

Our proof of Theorem 5.3 requires the intermediary result below. Before stating it, the reader is reminded that the eigenvalues \(\lambda _{\ell +1},\ldots , \lambda _m\) are arranged so that \(\lambda _{\ell +1}\) is the smallest. Thus, the condition \(\lambda >-\lambda _{\ell +1}\) ensures that \(\texttt {K}^\circ (\lambda )\) is well defined. In particular, \((\lambda \texttt {I}_{m-\ell }+\Lambda )\) is invertible. We will use the obvious meaning for \({\texttt {I}}_\ell \).

Proposition 9.1

Under the assumptions of Theorem 5.3, for each \(\lambda > -\lambda _{\ell +1}\), \({r}\big ((\lambda \texttt {I} _\ell - \texttt {T} )^{-1}\texttt {K} ^\circ (\lambda )\big )\) is the leading eigenvalue of \((\lambda \texttt {I} _\ell - \texttt {T} )^{-1}\texttt {K} ^\circ (\lambda )\) with a corresponding positive eigenfunction \(\varphi ^\circ _{\lambda }\).

Proof

In relation to the Krein–Rutman theorem stated above, our Banach space is \(X = \prod _{i = 1}^\ell L_2(D \times V)\) and the corresponding cone is \(\mathcal {C} = \prod _{i = 1}^\ell L^+_2(D \times V)\). It is clear that this cone is convex, and since every \(L_2\) function can be written as the difference of its positive and negative parts, \(\mathcal {C}\) satisfies the assumptions of the theorem. We now break the rest of the proof into a number of steps, each of which is stated and then immediately proved.

Step 1 First we claim that \((\lambda \texttt {I}_\ell - \texttt {T})^{-1}\texttt {K}^\circ (\lambda )\) is a compact operator.

Fix \(1\le i,j\le m\). By Fubini’s Theorem we have that \(r\mapsto \texttt {K}_{i,j}f(r,\upsilon )\) is measurable for \(f\in L_2(D\times V)\). The operators \( \texttt {K}_{i,j}\) are also integral operators and therefore are continuous on \(L_2(V)\) and compact. The assumed piecewise continuity of the cross sections \(\sigma _{\texttt {s}}^i\pi ^i_{\texttt {s}}\) and \(\sigma _{\texttt {f}}^i\pi ^{i,j}_{\texttt {f}}\) and the boundedness of the domain V are sufficient to ensure that \(r\mapsto \texttt {K}_{i,j}\cdot (r,\cdot )\) is continuous under the operator norm on \(L_2(V)\) and hence \(\{\texttt {K}_{i,j}\cdot (r,\cdot ): r\in D\}\) forms a relatively compact set in the space of linear operators on \(L_2(V)\). With these properties, the mapping \(r\mapsto \texttt {K}_{i,j}\cdot (r,\cdot )\), for \(r\in D\), is said to be regular. One similarly (but more easily) shows that \(r\mapsto \texttt {M}_{i,j}\cdot (r,\cdot )\) is regular for \(r\in D\) as operators on \(L_2(V)\). By linearity, this implies that, for \(1\le i,j\le \ell \), the mapping \(r\mapsto \texttt {K}^\circ (\lambda )_{i,j}\) is also regular. Hence, by [25, Theorem 4.1], \((\lambda \texttt {I}_\ell - \texttt {T})^{-1}\texttt {K}^\circ (\lambda )\) is a compact operator.

Remark 9.1

It is precisely at the application of [25, Theorem 4.1] that we need the convexity of the domain D, as this is required within the aforesaid result.

Step 2 Next we show that \((\lambda \texttt {I}_\ell - \texttt {T})^{-1}\texttt {K}^\circ (\lambda )\) is a positive irreducible operator.

Positivity is a straightforward consequence of the assumptions on the operators \(\texttt {K}_{i,j}\) and the form of the semigroup defined in (9.4). For irreducibility, it is enough to show that there exists an integer \(n \ge 1\) such that \([(\lambda \texttt {I}_\ell - \texttt {T})^{-1}\texttt {K}^\circ (\lambda )]^nf > 0\) for each \(f\in \prod _{i=1}^\ell L^+_2(D\times V)\). To this end, note that the entries of \(\texttt {K}^\circ (\lambda )(\lambda \texttt {I}_\ell - \texttt {T})^{-1}\texttt {K}^\circ (\lambda )\) satisfy

$$\begin{aligned} {[}\texttt {K}^\circ (\lambda )(\lambda \texttt {I}_\ell - \texttt {T})^{-1}\texttt {K}^\circ (\lambda )]_{i,j}&\ge [\texttt {K}^\circ (\lambda \texttt {I}_\ell - \texttt {T})^{-1}\texttt {K}^\circ ]_{i,j} = \sum _{k=1}^\ell \texttt {K}_{i,k}(\lambda \texttt {I}_\ell - \texttt {T})^{-1}_{k,k}\texttt {K}_{k,j}, \end{aligned}$$

and that \(\texttt {K}_{i,k}(\lambda \texttt {I}_\ell - \texttt {T})^{-1}_{k,k}\texttt {K}_{k,j}\) is an integral operator \(L_2(D\times V) \rightarrow L_2(D \times V)\), \(1\le i,j\le \ell \), whose kernel is greater than or equal to

$$\begin{aligned} \int _0^{\infty }\mathrm{e}^{-(\lambda +\underline{\sigma }^k(\frac{r-r'}{t}))t}{} \texttt {k}_{i,k}\left( r',\frac{r-r'}{t}, v'' \right) \texttt {k}_{k,j}\left( r,v, \frac{r-r'}{t} \right) \frac{\mathrm{d}t}{t^n}, \end{aligned}$$
(9.8)

where \(\underline{\sigma }^k(\upsilon ) = \inf _{r\in D}\{\sigma ^k(r,\upsilon )\}\). To produce this estimate, note that \((\lambda \texttt {I}_\ell - \texttt {T})^{-1}_{k,k}\) is the resolvent of \((\texttt {U}^{\varvec{T}}_t, t\ge 0)\) in (9.4). If we choose the index k as in the assumptions (5.11) and (5.12) then the lower bound (9.8) ensures that \([\texttt {K}^\circ (\lambda )(\lambda \texttt {I}_\ell - \texttt {T})^{-1}\texttt {K}^\circ (\lambda )]_{i,j}\) is positivity improving. It follows that \([(\lambda \texttt {I}_\ell - \texttt {T})^{-1}\texttt {K}^\circ (\lambda )]^2\) is also positivity improving and therefore \((\lambda \texttt {I}_\ell - \texttt {T})^{-1}\texttt {K}^\circ (\lambda )\) is irreducible.

Step 3 We claim that there exists a non-negative eigenfunction \( 0\ne \varphi ^\circ _\lambda \in \prod _{i = 1}^\ell L_2(D\times V) \) for the operator \((\lambda \texttt {I}_\ell - \texttt {T})^{-1}\texttt {K}^\circ (\lambda )\) with eigenvalue that agrees with \({r}\big ((\lambda \texttt {I}_\ell - \texttt {T})^{-1}\texttt {K}^\circ (\lambda )\big )\).

We use de Pagter’s Theorem, cf. [25, Theorem 5.7], which says that the spectral radius of an irreducible operator is strictly positive; that is to say \({r}\big ((\lambda \texttt {I}_\ell - \texttt {T})^{-1}\texttt {K}^\circ (\lambda )\big )>0\). In turn the Krein–Rutman theorem 9.1 states that \({r}\big ((\lambda \texttt {I}_\ell - \texttt {T})^{-1}\texttt {K}^\circ (\lambda )\big )\) is thus an eigenvalue for the operator \((\lambda \texttt {I}_\ell - \texttt {T})^{-1}\texttt {K}^\circ (\lambda )\) with a corresponding non-negative eigenfunction \(\varphi ^\circ _{\lambda }\). \(\square \)
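A finite-dimensional analogue may help to fix ideas. The following Python sketch is an illustration only and plays no role in the proof: the operator is replaced by a hypothetical \(3\times 3\) non-negative matrix `B` (our choice, not taken from the model) which has zeros on its diagonal but whose square is entrywise positive, mirroring the positivity improving property used in Step 2. Power iteration then approximates the spectral radius together with a strictly positive eigenvector, as the Perron–Frobenius theorem (the matrix counterpart of the Krein–Rutman theorem) predicts.

```python
def mat_vec(mat, vec):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, vec)) for row in mat]


def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def power_iteration(mat, steps=500):
    """Approximate the spectral radius and a positive eigenvector.

    Assumes mat is non-negative with an entrywise positive power, so that
    the iteration converges. The iterate is kept normalised to sum to one.
    """
    n = len(mat)
    vec = [1.0 / n] * n
    rho = 0.0
    for _ in range(steps):
        image = mat_vec(mat, vec)
        rho = sum(image)  # equals the eigenvalue once vec has converged
        vec = [x / rho for x in image]
    return rho, vec


# Hypothetical matrix: zero diagonal, so B itself is not entrywise positive.
B = [[0.0, 2.0, 1.0],
     [1.0, 0.0, 3.0],
     [2.0, 1.0, 0.0]]

B_squared = mat_mul(B, B)
square_is_positive = all(x > 0 for row in B_squared for x in row)

rho, phi = power_iteration(B)
residual = max(abs(y - rho * x) for y, x in zip(mat_vec(B, phi), phi))

print("B^2 entrywise positive:", square_is_positive)
print("spectral radius estimate:", rho)
print("eigenvector estimate:", phi)
print("residual of B phi = rho phi:", residual)
```

In this toy setting the spectral radius is strictly positive and is attained by an eigenvector with strictly positive entries, which is the matrix version of the conclusion of Step 3.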

Proof of Theorem 5.3

(i) In looking for a non-negative eigenfunction of \(\overset{_\leftarrow }{\texttt {A}}\) with real eigenvalue, our earlier discussion tells us we must equivalently look for a solution to (9.5) and hence (9.7). This is equivalent to finding a real value \(\lambda _c\) such that \(r\big ((\lambda _c \texttt {I}_\ell - \texttt {T})^{-1}\texttt {K}^\circ (\lambda _c)\big ) = 1\). We again achieve this goal in steps.

Step 1 We want to show that

$$\begin{aligned} \lim _{\lambda \downarrow -\lambda _{\ell +1}}{r}\big ((\lambda \texttt {I}_\ell - \texttt {T})^{-1}\texttt {K}^\circ (\lambda )\big ) = \infty . \end{aligned}$$
(9.9)

Recall that \((\lambda \texttt {I}_\ell - \texttt {T})^{-1}\texttt {K}^\circ (\lambda )\) is compact and irreducible so by [25, Theorem 5.13] we have the comparison of the spectral radii,

$$\begin{aligned} {r}\big ((\lambda \texttt {I}_\ell - \texttt {T})^{-1}\texttt {K}^\circ (\lambda )\big ) \ge {r}\big ((\lambda \texttt {I}_\ell - \texttt {T})^{-1}\Delta [{\texttt {K}}^\circ (\lambda )]\big ), \end{aligned}$$
(9.10)

where \(\Delta [{\texttt {K}}^\circ (\lambda )]\) is the matrix whose entries are given by \(\Delta [{\texttt {K}}^\circ (\lambda )] =\texttt {diag}( \texttt {K}^\circ (\lambda )_{1,1}, \ldots , \texttt {K}^\circ (\lambda )_{\ell ,\ell } )\).

Suppose \(\Delta \) is an \(\ell \times \ell \) diagonal matrix whose diagonal entries are given by operators \(\Delta _i\) on \(L_2(D\times V)\), for \(i = 1,\ldots , \ell \). If \(\mu \in \sigma (\Delta _1)\), the spectrum of \(\Delta _1\), then \((\mu \texttt {I}_\ell - \Delta )_{1,1}\) is not invertible, and so \(\mu \texttt {I}_\ell - \Delta \) is also not invertible. Hence \(\mu \in \sigma (\Delta )\), the spectrum of \(\Delta \), and so \(\sigma (\Delta _1) \subset \sigma (\Delta )\). Applying this argument to the diagonal matrix \((\lambda \texttt {I}_\ell - \texttt {T})^{-1}\Delta {\texttt {K}}^\circ (\lambda )\), we have that

$$\begin{aligned} \sigma ([(\lambda \texttt {I}_\ell - \texttt {T})^{-1}\Delta {\texttt {K}}^\circ (\lambda )]_{1,1}) \subset \sigma ((\lambda \texttt {I}_\ell - \texttt {T})^{-1}\Delta {\texttt {K}}^\circ (\lambda )) \end{aligned}$$
(9.11)

and so

$$\begin{aligned} {r}\big ((\lambda \texttt {I}_\ell -\texttt {T})^{-1}\Delta {\texttt {K}}^\circ (\lambda )\big )\ge & {} {r}\big ([(\lambda \texttt {I}_\ell - \texttt {T})^{-1}\Delta {\texttt {K}}^\circ (\lambda )]_{1,1}\big ) \nonumber \\\ge & {} {r}\big ((\lambda - \overset{_\leftarrow }{\texttt {T}}_1-\sigma ^1)^{-1}\Delta [{\texttt {K}}^\circ (\lambda )]_{1,1}\big ), \end{aligned}$$
(9.12)

where, in the first inequality, we have used (9.11).
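The chain of comparisons (9.10) to (9.12) can be checked by hand in a matrix setting. The sketch below is a toy illustration under our own assumptions: a hypothetical entrywise positive \(4\times 4\) matrix `K` is split into \(2\times 2\) blocks, `delta` keeps only the diagonal blocks (playing the role of \(\Delta [\texttt {K}^\circ (\lambda )]\)) and `k11` is the first diagonal block. Spectral radii are approximated by power iteration, which is adequate here because all blocks involved are entrywise positive.

```python
def mat_vec(mat, vec):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, vec)) for row in mat]


def spectral_radius(mat, steps=500):
    """Power iteration for a non-negative matrix, started from a positive vector."""
    n = len(mat)
    vec = [1.0 / n] * n
    rho = 0.0
    for _ in range(steps):
        image = mat_vec(mat, vec)
        rho = sum(image)  # vec sums to one, so this estimates the eigenvalue
        vec = [x / rho for x in image]
    return rho


def block_diagonal_part(mat, block):
    """Zero out every entry outside the diagonal blocks of size `block`."""
    n = len(mat)
    return [[mat[i][j] if i // block == j // block else 0.0 for j in range(n)]
            for i in range(n)]


# Hypothetical positive matrix, viewed as a 2 x 2 array of 2 x 2 blocks.
K = [[1.0, 2.0, 0.5, 1.0],
     [0.5, 1.0, 1.0, 0.5],
     [1.0, 0.5, 2.0, 1.0],
     [0.5, 1.0, 1.0, 3.0]]

delta = block_diagonal_part(K, 2)   # analogue of the diagonal matrix Delta[K]
k11 = [row[:2] for row in K[:2]]    # analogue of the (1,1) entry

r_full = spectral_radius(K)
r_diag = spectral_radius(delta)
r_block = spectral_radius(k11)

print("r(K)        =", r_full)
print("r(Delta[K]) =", r_diag)
print("r(K_11)     =", r_block)
```

The three printed values are ordered as in (9.10) and (9.12): discarding the off-diagonal blocks cannot increase the spectral radius, and the spectral radius of a block diagonal matrix dominates that of each of its diagonal blocks.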

Next recall that \((\lambda \texttt {I}_\ell - \texttt {T})^{-1}\texttt {K}^\circ (\lambda )\varphi ^\circ =\varphi ^\circ \text { where }\texttt {K}^\circ (\lambda ) = \texttt {K}^\circ + \texttt {M}(\lambda \texttt {I}_{m-\ell }+\Lambda )^{-1}\texttt {K}_\circ \). Similar reasoning to the proofs of previous steps shows us that \( (\lambda - \overset{_\leftarrow }{\texttt {T}}_1-\sigma ^1)^{-1}\Delta [{\texttt {K}}^\circ (\lambda )]_{1,1}\) and \((\lambda - \overset{_\leftarrow }{\texttt {T}}_1-\sigma ^1)^{-1}\sigma ^1_{\texttt {f}}m^{\ell +1}(\texttt {K}_\circ )_{1,\ell +1} \) are both compact and irreducible operators, so that

$$\begin{aligned} {r}\big ((\lambda - \overset{_\leftarrow }{\texttt {T}}_1-\sigma ^1)^{-1}\Delta [{\texttt {K}}^\circ (\lambda )]_{1,1}\big ) \ge \frac{{r}\big ((\lambda - \overset{_\leftarrow }{\texttt {T}}_1-\sigma ^1)^{-1}\sigma ^1_{\texttt {f}}m^{\ell +1}(\texttt {K}_\circ )_{1,\ell +1} \big )}{\lambda + \lambda _{\ell +1}} > 0, \end{aligned}$$
(9.13)

where the first inequality follows from [25, Theorem 5.13] and the second follows from [25, Theorem 5.7]. Combining (9.10), (9.12) and (9.13), we have

$$\begin{aligned} {r}\big ((\lambda \texttt {I}_\ell - \texttt {T})^{-1}\texttt {K}^\circ (\lambda )\big ) \ge \frac{{r}\big ((\lambda - \overset{_\leftarrow }{\texttt {T}}_1-\sigma ^1)^{-1}\sigma ^1_{\texttt {f}}m^{\ell +1}(\texttt {K}_\circ )_{1,\ell +1} \big )}{\lambda + \lambda _{\ell +1}} > 0, \end{aligned}$$

with the latter term tending to \(\infty \) as \(\lambda \downarrow -\lambda _{\ell +1}\), which gives (9.9).

Step 2 Next we need to show that

$$\begin{aligned} \lim _{\lambda \rightarrow \infty }r\big ((\lambda \texttt {I}_\ell - \texttt {T})^{-1}\texttt {K}^\circ (\lambda )\big ) < 1 \end{aligned}$$

The spectral radius \({r}\big ((\lambda \texttt {I}_\ell - \texttt {T})^{-1}\texttt {K}^\circ (\lambda )\big )\) is decreasing in \(\lambda \), as is \(\texttt {K}^\circ (\lambda )\). Using the standard operator norm \(||\cdot ||_2\) on \(\prod _{i=1}^\ell L_2(D\times V)\),

$$\begin{aligned} ||\texttt {K}^\circ (\lambda ) g ||_2=||\texttt {K}^\circ g + \texttt {M}\left( \texttt {diag}\big ((\lambda +\lambda _{\ell +1})^{-1},\ldots , (\lambda +\lambda _{m})^{-1}\big ) \right) \texttt {K}_\circ g ||_2 \end{aligned}$$

and, hence, by inspection, \(\texttt {K}^\circ (\lambda )\) is decreasing with \(\lambda \) and tends to \(\texttt {K}^\circ \) as \(\lambda \rightarrow \infty \). Note, moreover, that for all \(g\in \prod _{i=1}^\ell L_2(D\times V)\),

$$\begin{aligned} (\lambda \texttt {I}_\ell - \texttt {T})^{-1}\texttt {K}^\circ (\lambda ) g= \int _0^\infty \mathrm{e}^{-\lambda t}\, \texttt {U}^{\varvec{T}}_t[\texttt {K}^\circ (\lambda ) g]\, \mathrm{d}t, \end{aligned}$$

showing similarly that \((\lambda \texttt {I}_\ell - \texttt {T})^{-1}\texttt {K}^\circ (\lambda )\) is decreasing in \(\lambda \). Due to [25, Lemma 8.1] (note that it is not difficult to see from the proof of that lemma that the order of the operators there can be reversed), we have

$$\begin{aligned} \lim _{\lambda \rightarrow \infty }r\big ((\lambda \texttt {I}_\ell - \texttt {T})^{-1}\texttt {K}^\circ (\lambda )\big ) < 1. \end{aligned}$$

Step 3 In this penultimate step, we show that we have found a non-negative eigenfunction of \(\overset{_\leftarrow }{\texttt {A}}\), with eigenvalue \(\lambda _c\).

From Steps 1 and 2 we have the existence of a \(\lambda _c > -\lambda _{\ell +1}\) such that \({r}((\lambda _c\texttt {I}_\ell - \texttt {T})^{-1}\texttt {K}^\circ (\lambda _c))= 1\). That is to say, thanks to Proposition 9.1, we have found \(\varphi ^\circ = \varphi ^\circ _{\lambda _c}\) which solves (9.7), which in turn, thanks to (9.6), gives us that \(\varphi _\circ = (\lambda _c\texttt {I}_{m-\ell }+\Lambda )^{-1}\texttt {K}_\circ \varphi ^\circ _{\lambda _c}\), so that with the concatenation

$$\begin{aligned} \varphi = (\varphi ^\circ _{\lambda _c}, (\lambda _c\texttt {I}_{m-\ell }+\Lambda )^{-1}\texttt {K}_\circ \varphi ^\circ _{\lambda _c})\ge 0 \end{aligned}$$

we have the eigensolution

$$\begin{aligned} \varvec{A}\varphi = \lambda _c \varphi , \end{aligned}$$

which is equivalent to \(\overset{_\leftarrow }{\texttt {A}}\varphi = \lambda _c\varphi \).
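The strategy of Steps 1 to 3 can be imitated numerically in a finite-dimensional toy model. In the sketch below every number is a hypothetical choice of ours: \(\ell = 2\) prompt types and one delayed type, a diagonal matrix `T_DIAG` standing in for \(\texttt {T}\), positive matrices `K_UP`, `M`, `K_LOW` standing in for \(\texttt {K}^\circ \), \(\texttt {M}\), \(\texttt {K}_\circ \), and a single decay rate `DECAY` standing in for \(\lambda _{\ell +1}\). Since the spectral radius of the reduced matrix is decreasing in \(\lambda \), bisection locates the value at which it equals 1, and the concatenated vector is then checked against the full matrix `A`.

```python
def mat_vec(mat, vec):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, vec)) for row in mat]


def perron_pair(mat, steps=300):
    """Power iteration: spectral radius and positive eigenvector of a positive matrix."""
    n = len(mat)
    vec = [1.0 / n] * n
    rho = 0.0
    for _ in range(steps):
        image = mat_vec(mat, vec)
        rho = sum(image)
        vec = [x / rho for x in image]
    return rho, vec


# Hypothetical toy data (assumptions, not taken from the model).
T_DIAG = [-5.0, -6.0]             # diagonal stand-in for T
K_UP = [[1.0, 2.0], [1.5, 1.0]]   # stand-in for K^circ
M = [[1.0], [0.5]]                # stand-in for M
K_LOW = [[0.3, 0.2]]              # stand-in for K_circ
DECAY = [0.5]                     # stand-in for lambda_{ell+1}


def reduced_operator(lam):
    """Matrix of (lam I - T)^{-1} K^circ(lam), valid for lam > -DECAY[0]."""
    rows = []
    for i in range(len(T_DIAG)):
        row = []
        for j in range(len(T_DIAG)):
            k_lam = K_UP[i][j] + sum(
                M[i][k] * K_LOW[k][j] / (lam + DECAY[k]) for k in range(len(DECAY))
            )
            row.append(k_lam / (lam - T_DIAG[i]))
        rows.append(row)
    return rows


# Bisection: the radius is large just above -DECAY[0] and small for large lam.
lo, hi = -DECAY[0] + 1e-9, 50.0
for _ in range(100):
    mid = 0.5 * (lo + hi)
    if perron_pair(reduced_operator(mid))[0] > 1.0:
        lo = mid
    else:
        hi = mid
lam_c = 0.5 * (lo + hi)

# Concatenate the two components of the eigenvector as in Step 3.
r_at_root, phi_up = perron_pair(reduced_operator(lam_c))
phi_low = [sum(K_LOW[k][j] * phi_up[j] for j in range(2)) / (lam_c + DECAY[k])
           for k in range(len(DECAY))]
phi = phi_up + phi_low

# Full toy matrix with blocks [[T + K_UP, M], [K_LOW, -DECAY]].
A = [[T_DIAG[0] + K_UP[0][0], K_UP[0][1], M[0][0]],
     [K_UP[1][0], T_DIAG[1] + K_UP[1][1], M[1][0]],
     [K_LOW[0][0], K_LOW[0][1], -DECAY[0]]]
residual = max(abs(y - lam_c * x) for y, x in zip(mat_vec(A, phi), phi))

print("lambda_c estimate:", lam_c)
print("spectral radius at lambda_c:", r_at_root)
print("residual of A phi = lambda_c phi:", residual)
```

The small residual confirms, for this toy data, that solving the reduced problem with spectral radius 1 and concatenating the two components produces a non-negative eigenvector of the full matrix.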

Step 4 For the final step we need to show that \(\lambda _c\) is the leading real eigenvalue of \(\overset{_\leftarrow }{\texttt {A}}\), i.e.

$$\begin{aligned} \lambda _c = s(\varvec{A}) :=\sup \{\mathrm{Re}(\lambda ) : \lambda \in \sigma (\varvec{A})\}, \end{aligned}$$

where \(\sigma (\varvec{A})\) is the spectrum of the operator \(\varvec{A}\) or equivalently of \(\overset{_\leftarrow }{\texttt {A}}\). Moreover we need to show that it is simple and isolated.

We first note that since we have shown that \(\lambda _c \in \sigma (\varvec{A})\), in particular that the spectrum is non-empty, it follows from [25, Theorem 5.2] that \(s(\varvec{A}) \in \sigma (\varvec{A})\). Now suppose that \(\lambda _c \ne s(\varvec{A})\) so that, in particular, \(\lambda _c < s(\varvec{A})\). Then, thanks again to [25, Lemma 8.1], \(r\big ((s(\varvec{A}) \texttt {I}_{\ell }- \texttt {T})^{-1}\texttt {K}^\circ (s(\varvec{A}))\big ) < 1\) and so 1 is not an eigenvalue of \((s(\varvec{A}) \texttt {I}_{\ell }- \texttt {T})^{-1}\texttt {K}^\circ (s(\varvec{A}))\). Said another way, this means that \(s(\varvec{A})\) is not an eigenvalue of \(\varvec{A}\) (and hence of \(\overset{_\leftarrow }{\texttt {A}}\)), leading to a contradiction. Algebraic and geometric simplicity of \(\lambda _c\) follows from [6, Remark 12] and [6, Theorem 7(iii)], respectively. \(\square \)
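The identification \(\lambda _c = s(\varvec{A})\) can also be observed in a finite-dimensional toy model. The sketch below uses hypothetical data of our own choosing (two prompt types, one delayed type) and is not part of the argument. Because the toy matrix `A` has non-negative off-diagonal entries, adding a multiple `SHIFT` of the identity makes it entrywise positive, so power iteration on the shifted matrix returns its spectral bound; we then check that the reduced matrix \((\lambda \texttt {I}_\ell - \texttt {T})^{-1}\texttt {K}^\circ (\lambda )\) has spectral radius 1 at that value and spectral radius below 1 slightly to the right of it.

```python
def mat_vec(mat, vec):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, vec)) for row in mat]


def spectral_radius(mat, steps=600):
    """Power iteration for an entrywise positive matrix."""
    n = len(mat)
    vec = [1.0 / n] * n
    rho = 0.0
    for _ in range(steps):
        image = mat_vec(mat, vec)
        rho = sum(image)
        vec = [x / rho for x in image]
    return rho


# Hypothetical toy data (assumptions, not taken from the model).
T_DIAG = [-5.0, -6.0]             # diagonal stand-in for T
K_UP = [[1.0, 2.0], [1.5, 1.0]]   # stand-in for K^circ
M = [[1.0], [0.5]]                # stand-in for M
K_LOW = [[0.3, 0.2]]              # stand-in for K_circ
DECAY = [0.5]                     # stand-in for lambda_{ell+1}

# Full toy matrix with blocks [[T + K_UP, M], [K_LOW, -DECAY]].
A = [[T_DIAG[0] + K_UP[0][0], K_UP[0][1], M[0][0]],
     [K_UP[1][0], T_DIAG[1] + K_UP[1][1], M[1][0]],
     [K_LOW[0][0], K_LOW[0][1], -DECAY[0]]]

# Spectral bound of A via the positive matrix A + SHIFT * I.
SHIFT = 10.0
shifted = [[A[i][j] + (SHIFT if i == j else 0.0) for j in range(3)]
           for i in range(3)]
s_A = spectral_radius(shifted) - SHIFT


def reduced_operator(lam):
    """Matrix of (lam I - T)^{-1} K^circ(lam), valid for lam > -DECAY[0]."""
    return [[(K_UP[i][j] + M[i][0] * K_LOW[0][j] / (lam + DECAY[0]))
             / (lam - T_DIAG[i]) for j in range(2)] for i in range(2)]


r_at_bound = spectral_radius(reduced_operator(s_A))
r_above = spectral_radius(reduced_operator(s_A + 0.1))

print("s(A) estimate:", s_A)
print("reduced spectral radius at s(A):", r_at_bound)
print("reduced spectral radius at s(A) + 0.1:", r_above)
```

In this example the spectral bound lies strictly above `-DECAY[0]`, and it is exactly the point at which the reduced spectral radius crosses 1, in line with Step 4.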

Before turning to the proof of Theorem 5.3 (ii), we must state another intermediary result which is translated from a general setting of Banach operators to our current situation; cf. [25, Theorem 4.1] and [1, p. 359, Theorem 22].

Proposition 9.2

Under the assumptions of Theorem 5.3, the set

$$\begin{aligned} \sigma (\varvec{A}) \cap \{\lambda : \mathrm{Re}(\lambda ) >s(\varvec{T}) \} \end{aligned}$$

consists of isolated eigenvalues with finite multiplicities, where \(s(\varvec{T}) :=\sup \{\mathrm{Re}(\lambda ) : \lambda \in \sigma (\varvec{T})\}\).

Note that [1, p. 359, Theorem 22], from which the above proposition is derived, requires as a sufficient condition that \((\lambda {{\varvec{I}}} - \varvec{T})^{-1}\varvec{K}\) is compact, where \(\varvec{I}\) is the \(m\times m\) identity matrix. This fact follows easily from the conclusion of Step 1 of the proof of Proposition 9.1.

Finally we can complete the proof of Theorem 5.3.

Proof of Theorem 5.3

(ii) It is easy to see from the structure of \(\varvec{T}\) that \(-\lambda _{\ell +1},\ldots , -\lambda _{m}\) belong to its spectrum. Moreover, for all \(i = 1,\ldots , \ell \), \(s(\overset{_\leftarrow }{\texttt {T}}_i - \sigma ^i) = -\infty \). Since \(-\lambda _{\ell + 1}\) is the largest of these eigenvalues, and \(\lambda _c>-\lambda _{\ell + 1}\) (from part (i) of Theorem 5.3), Proposition 9.2 tells us that \(\sigma (\varvec{A}) \cap \{\lambda : {\mathrm{Re}}(\lambda )>-\lambda _{\ell +1}\}\) contains at least one isolated eigenvalue with finite (algebraic) multiplicity (i.e. the lead eigenvalue \(\lambda _c\)).

Suppose we enumerate the eigenvalues in \(\sigma (\varvec{A}) \cap \{\lambda : {\mathrm{Re}}(\lambda )>-\lambda _{\ell +1}\}\) in decreasing order as \(\lambda ^{(1)}, \ldots , \lambda ^{(n)}\) (noting from earlier that we have at least \(\lambda ^{(1)}= \lambda _c\) and \(\lambda ^{(n)}> -\lambda _{\ell + 1}\)). Then, from [6, p. 265], for \(g \in \mathrm{Dom}(\overset{_\leftarrow }{\texttt {A}})\), we have

$$\begin{aligned} \texttt {V}_t[g] = \sum _{k=1}^n \mathrm{e}^{\lambda ^{(k)} t }\left( \sum _{m = 0}^{\mathrm{order}(\lambda ^{(k)} ) - 1}t^m\Pi _k^m [g]\right) + O(\mathrm{e}^{-\lambda _{\ell +1} t}), \end{aligned}$$

as \(t\rightarrow \infty \), where \(\Pi _k\) are projectors in \(\mathrm{Dom}(\overset{_\leftarrow }{\texttt {A}})\).

We are really only interested in the projection onto the eigenfunction that we know exists in the real part of the spectrum. The projector \(\Pi _1\) can be written in the form

$$\begin{aligned} \Pi _1[g] = \langle g, \tilde{\varphi }\rangle \varphi , \quad g\in \prod _{i= 1}^m L_2({D}\times V), \end{aligned}$$

where \({\tilde{\varphi }}\) is the left-eigenfunction with eigenvalue \(\lambda _c\), which is guaranteed to exist by examining the preceding arguments for \(\overset{_\leftarrow }{\texttt {A}}\) and re-applying them for \(\overset{_\rightarrow }{\texttt {A}}:= \overset{_\rightarrow }{\texttt {T}}+\overset{_\rightarrow }{\texttt {S}}+\overset{_\rightarrow }{\texttt {F}}\), the adjoint operator of \(\overset{_\leftarrow }{\texttt {A}}\). Hence, we have the following leading order expansion,

$$\begin{aligned} \texttt {V}_t[g] = \mathrm{e}^{\lambda _ct}\langle g, \tilde{\varphi }\rangle \varphi +O(\mathrm{e}^{[\lambda ^{(2)}\vee (-\lambda _{\ell +1} )]t}). \end{aligned}$$

Note that since, according to Proposition 9.2, \(\lambda _c\) is isolated, there exists an \(\varepsilon >0\) such that \(\lambda ^{(2)}\vee (-\lambda _{\ell +1} )<\lambda _c- \varepsilon \), where we understand \(\lambda ^{(2)} = -\infty \) if \(n = 1\). The statement of part (ii) of Theorem 5.3 now follows. \(\square \)
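The leading order behaviour can be seen concretely in a matrix setting, where the semigroup is simply the matrix exponential. The following sketch rests on our own assumptions: `A` is a hypothetical \(3\times 3\) matrix with positive off-diagonal entries, the right and left eigenvectors `phi` and `phi_tilde` are obtained by power iteration on a shifted copy of `A` and of its transpose, and the matrix exponential is computed by a basic scaling and squaring routine written for this example. It compares \(\mathrm{e}^{-\lambda _c t}\texttt {V}_t[g]\) with \(\langle g, \tilde{\varphi }\rangle \varphi \) at a moderately large time.

```python
import math


def mat_vec(mat, vec):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, vec)) for row in mat]


def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def perron(mat, shift, steps=600):
    """Spectral bound and positive eigenvector of mat via power iteration on mat + shift*I."""
    n = len(mat)
    shifted = [[mat[i][j] + (shift if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    vec = [1.0 / n] * n
    rho = 0.0
    for _ in range(steps):
        image = mat_vec(shifted, vec)
        rho = sum(image)
        vec = [x / rho for x in image]
    return rho - shift, vec


def expm(mat, squarings=12, terms=25):
    """Matrix exponential by scaling and squaring with a truncated Taylor series."""
    n = len(mat)
    scaled = [[x / 2 ** squarings for x in row] for row in mat]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, scaled)]
        result = [[r + s for r, s in zip(r_row, s_row)]
                  for r_row, s_row in zip(result, term)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


# Hypothetical matrix with positive off-diagonal entries (toy stand-in for A).
A = [[-4.0, 2.0, 1.0],
     [1.5, -5.0, 0.5],
     [0.3, 0.2, -0.5]]

lam_c, phi = perron(A, 10.0)
_, phi_tilde = perron([list(col) for col in zip(*A)], 10.0)
pairing = sum(p * q for p, q in zip(phi, phi_tilde))
phi_tilde = [q / pairing for q in phi_tilde]  # now <phi, phi_tilde> = 1

g = [1.0, 2.0, 0.5]
t = 10.0
V_t_g = mat_vec(expm([[t * x for x in row] for row in A]), g)
rescaled = [math.exp(-lam_c * t) * x for x in V_t_g]
coefficient = sum(a * b for a, b in zip(g, phi_tilde))
limit = [coefficient * p for p in phi]
error = max(abs(a - b) for a, b in zip(rescaled, limit))

print("lambda_c estimate:", lam_c)
print("exp(-lambda_c t) V_t[g]:", rescaled)
print("<g, phi_tilde> phi:     ", limit)
print("max difference:", error)
```

The difference between the two vectors decays exponentially in `t`, at a rate governed by the gap between the leading eigenvalue and the rest of the spectrum, which is the finite-dimensional counterpart of the expansion above.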