1 Introduction

The linear Boltzmann transport problem describes the flow of particles through a scattering and absorbing medium, and is a widely used model in areas as diverse as medical imaging, radiotherapy treatment planning, and the design of nuclear reactors, for example. Here, we consider the numerical approximation of the stationary form of the problem, seeking a solution which is a function of up to six independent variables: d, \(d=2,3\), spatial variables varying over a domain in \(\mathbb {R}^d\), \((d-1)\) angular variables on the surface of the d-dimensional unit sphere \(\mathbb {S}\), and an energy variable on the non-negative real line \(\mathbb {R}_{\ge 0}\). The high dimensionality of this problem means that it is imperative to develop efficient numerical approximation methods. Over the years numerous methods have been proposed for this problem, which we shall briefly review below.

Given the structure of the underlying problem, the space, angle and energy components of the solution are typically discretised separately using a variety of techniques. Historically, there has largely been a predominant standard approach to energy discretisation known as the multigroup approximation; see [36, Chapter 2] and the references cited therein. Essentially, this approach approximates the energy by a piecewise constant function with respect to a finite number of non-overlapping energy groups. A key appeal of this approach is that the numerical solution is computed by sequentially solving a single monoenergetic Boltzmann transport problem (i.e., only depending on the spatial and angular variables) for each energy group. This is possible because the scattering process is typically structured in such a way that particles only lose energy in each collision with the medium, either by producing secondary particles or depositing energy locally, and hence the solution in a given energy group only depends on the solution in groups at higher energies, cf., also [21].

On the other hand, discretisations of the angular component of the solution have a rich history and numerous numerical schemes have been proposed. A few classes of such schemes have received particular attention within the literature due to their numerical properties. Spherical harmonic approximations are a widely used form of spectral discretisation in angle, constructed utilising a basis of typically high-order smooth spherical harmonic functions defined globally on the sphere; see [14, 19, 36]. The emphasis of such schemes is to simplify the implementation of the scattering operator, typically at the expense of a more expensive implementation of the streaming operator. Such schemes offer a natural variational setting for their analysis, but the global nature of the basis functions makes local adaptivity a challenging task and Gibbs’-type oscillations may be expected around sharp variations in the solution.

An alternative strain of methods are collectively known as discrete ordinates methods, in which the angular component of the problem is discretised via collocation at a discrete set of angular quadrature points. The advantage of this approach is that, when combined with an appropriate linear solver, the Boltzmann transport problem may be solved in parallel as a set of independent linear transport problems in the spatial domain with fixed wind directions. There appear to be two predominant flavours of discrete ordinates-type methods in the literature, which may be coarsely classified as global high-order methods and local low-order methods. Schemes in the former category typically fall within the family of spectral collocation methods, based on sets of interpolatory or quadrature points for high-order spherical harmonic functions on the sphere, designed according to the principles laid out by Sobolev and Vaskevich in [42]. Such schemes include those based on the widely used level symmetric quadrature formulae in [13, 28, 31], Lebedev quadrature schemes in [34, 35], general double cyclic triangle quadratures in [28, 29], or sets of points arranged on spherical t-designs in [3], to name but a few. The appeal of such methods is that they formally approximate the solution using high-order spherical harmonics, although generating efficient point sets can become difficult for very high-orders, limiting the theoretical accuracy of such schemes. Moreover, it is typically challenging to produce such point sets adaptively, i.e., to focus quadrature points in zones of the angular domain where higher resolution is required, for instance, around beams or other localised structures present in the underlying solution.

Complementing these are methods based on quadrature sets constructed locally using an angular mesh. Typically, the quadrature schemes used are exact for constant functions on each element, such as so-called \(T_N\) schemes, cf. [46], or sometimes linear or quadratic functions; see [25, 26, 32, 33, 48]. In a similar category, we include methods based on interpolation using continuous finite element basis functions in angle, such as those of [20], and schemes incorporating piecewise spherical harmonic approximations on an angular mesh in [30] and wavelet-based approaches in [2, 8]. Although such schemes formally approximate the solution using lower-order polynomials, the ability to generate a higher fidelity approximation by refining the mesh, either locally or globally, has contributed to their significant popularity. Recent work has generalised these schemes to use higher order polynomials in angle in various different ways; see, for example, [21, 30, 48]. While such schemes offer the possibility of high-order convergence and mesh adaptivity, underpinned by a variational framework, they can be more challenging to implement efficiently because the high-order nature of the basis functions on each angular element means that the problem may not immediately facilitate a discrete ordinates-like decoupling into independent spatial transport problems.

In this article, we propose a state of the art hp-version discontinuous Galerkin finite element method (DGFEM) for the discretisation of the linear Boltzmann transport problem, in which the space, angle, and energy components of the solution are approximated in a unified manner. In many applications, particularly those arising in medical physics, the spatial domain may be highly complicated; to deal with such strong complexity of the physical geometry, in an efficient manner, we admit the use of general polytopic meshes; see, for example, [10,11,12] and the references cited therein. The key properties and advantages of the proposed methodology include: the exploitation of a unified DGFEM discretisation of the linear Boltzmann problem over the entire computational domain ensures that the resulting scheme is naturally high-order; note that, in particular, the use of the aforementioned multigroup approximation limits the accuracy of the resulting numerical method to first-order. Taking advantage of the intrinsic variational formulation of the scheme means that the convergence and stability analysis of the underlying DGFEM can be developed, which is the key objective of this article. Furthermore, the proposed framework naturally lends itself to the exploitation of hp-adaptivity techniques coupled with rigorous a posteriori error estimation to ensure that the spatial, angular, and energy meshes can be focused around solution features of interest. Moreover, as already highlighted above, complex geometries can be efficiently meshed and easily handled. Finally, and perhaps most importantly from a practical viewpoint, the proposed method enables arbitrary order mesh-based approximations to be built independently in each of the space, angle and energy domains, while still being implemented in the same way as conventional multigroup discrete ordinates schemes. This highly efficient and naturally parallelisable implementation is made possible by exploiting a novel set of basis functions for the polynomial function spaces which satisfy a Lagrangian property at the nodes of a (tensor product) Gaussian quadrature scheme. We remark that employing a unified DGFEM approximation in all dimensions has been utilised in other application areas, such as, for example, coupled wave/circulation problems, cf. [37]. Previous work using higher order local basis functions in angle has also aimed to build quadrature sets which decouple the angular variables, although the focus was on schemes where the angular quadrature is used in a collocation-type manner such as [25, 32, 33] for evaluating the angular integrals. For this, the quadrature points and basis functions are chosen together such that the quadrature is exact for the basis functions, and the basis functions satisfy the Lagrangian property at the quadrature points. Despite the apparent similarity to our approach here, these existing approaches are based around piecewise spherical harmonic functions (on triangular or quadrilateral elements) and the quadratures are therefore unable to offer the Gaussian-type exactness property which is required to integrate products of basis functions and therefore produce a family of variational schemes which are stable (Theorem 6) and offer arbitrary order convergence rates (Theorem 7). As far as we are aware, analogous quadrature sets are not available for general order piecewise spherical harmonic basis functions.

To the best of the authors knowledge, the mathematical convergence results presented in this article represent the first hp-version error analysis of the DGFEM approximation applied to the polyenergetic linear Boltzmann transport problem, which also admits the use of general polytopic meshes in the spatial domain. To this end, a novel inf-sup bound is established based on extending the analysis presented in [10] to the high–dimensional setting, with the inclusion of the scattering operator. Here, great care is taken to not only track the discretization parameters for each finite element space employed in the space, angle, and energy domains, but also the material coefficients present in the underlying integro-differential equation; this latter issue is particularly important when studying the convergence rates of iterative solvers for the resulting system of linear equations, cf. [23], for example. For related error bounds, we also refer to the early paper by Johnson and Pitkäranta in [27], who derived the first a priori error estimates for a discrete ordinates h-version DGFEM approximation of the mono-energetic Boltzmann transport problem, assuming isotropic scattering, albeit under very low regularity assumptions on the analytical solution.

We remark that a very popular alternative computational framework for simulating the linear Boltzmann transport problem are Monte Carlo methods, which are widely used in practice. Such methods naturally incorporate the stochastic nature of the underlying physical processes and are highly efficient to implement, as the trajectories of individual incoming particles are simulated independently, yet the mean observed behaviour may only be expected to converge with the square root of the number of samples used. For this reason, Börgers in [9] identified that finite element-based methods could expect to perform more efficiently than Monte Carlo-based methods if high-order finite element methods could be utilised in a suitably efficient way. Our work therefore provides a stepping stone to answering the open question of how to achieve this objective in practice, as we are able to compute high-order numerical approximations with minimal additional computational overhead compared to conventional multigroup discrete ordinates methods.

The outline for this paper is as follows. In Sect. 2 we introduce the linear Boltzmann transport problem. Then in Sect. 3, we formulate the unified hp-version DGFEM discretisation. Section 4 introduces the necessary inverse and hp-approximation results; on the basis of these bounds, the stability and convergence analysis of the underlying DGFEM is undertaken in Sect. 5. In Sect. 6 we outline how the proposed DGFEM may be implemented in a highly efficient and parallelisable manner based on employing a careful selection of the quadrature and local polynomial bases in the angular and energy domains. The practical performance of the method is assessed in Sect. 7 through a series of numerical examples. Finally, in Sect. 8 we summarise the work presented in this paper and draw some conclusions.

1.1 Notation

For a bounded open set \(\omega \subset \mathbb {R}^{d}\), \(d\ge 1\), we write \(H^k(\omega )\) to denote the usual Hilbertian Sobolev space of index \(k\ge 0\) of real-valued functions defined on \(\omega \), endowed with the seminorm \(|\cdot |_{H^k(\omega )}\) and norm \(\Vert \cdot \Vert _{H^k(\omega )}\), as detailed in [1], for example. Furthermore, we let \(L_p(\omega )\), \(p \in [1,\infty ]\), be the standard Lebesgue space on \(\omega \), equipped with the norm \(\Vert \cdot \Vert _{L_p(\omega )}\). Similarly, for a bounded \((d-1)\)-dimensional surface S embedded in \(\mathbb {R}^{d}\), the spaces \(H^k(S)\) are defined in an analogous manner, cf. [18], for example.

2 Model Problem

Given an open bounded polyhedral spatial domain \(\varOmega \subset \mathbb {R}^{d}\) for \(d= 2\) or 3, let \(\mathcal {D}= \varOmega \times \mathbb {S}\times \mathbb {E}\), where \(\mathbb {S}= \{ \varvec{\mu }\in \mathbb {R}^{d}: {|\varvec{\mu }|}_2 = 1\}\) denotes the surface of the \(d\)-dimensional unit sphere and \(\mathbb {E}= \{ E\in \mathbb {R}: E\ge 0 \}\) is the real half line.

The linear Boltzmann transport problem reads: find \(u: \mathcal {D}\rightarrow \mathbb {R}\) such that

$$\begin{aligned} \varvec{\mu }\cdot \nabla _{\textbf{x}} u(\textbf{x}, \varvec{\mu }, E) + (\alpha (\textbf{x}, \varvec{\mu }, E) + \beta (\textbf{x}, \varvec{\mu }, E)) u(\textbf{x}, \varvec{\mu }, E)&= \mathcal {S}[u](\textbf{x}, \varvec{\mu }, E) \nonumber \\&\qquad + f(\textbf{x}, \varvec{\mu }, E) \text { in } \mathcal {D}, \nonumber \\ u(\textbf{x}, \varvec{\mu }, E)&= g_\textrm{D}(\textbf{x}, \varvec{\mu }, E) \text { on } \varGamma _{{\text {in}}}, \end{aligned}$$
(1)

where \(f,g_\textrm{D},\alpha , \beta : \mathcal {D}\rightarrow \mathbb {R}\) are given data terms (discussed further below), \(\nabla _{\textbf{x}}\) is the spatial gradient operator, and \(\varGamma _{{\text {in}}} = \{ (\textbf{x}, \varvec{\mu }, E) \in {\bar{\mathcal {D}}}: \textbf{x}\in \partial \varOmega \text { and } \varvec{\mu }\cdot \varvec{n}< 0\}\) denotes the inflow boundary of \(\mathcal {D}\), where \(\varvec{n}\) denotes the unit outward normal vector on the boundary \(\partial \varOmega \) of \(\varOmega \). The action of the scattering operator applied to the solution u is denoted by

$$\begin{aligned} \mathcal {S}[u](\textbf{x}, \varvec{\mu }, E) = \int _{\mathbb {E}} \int _{\mathbb {S}} \theta (\textbf{x}, \varvec{\eta }\rightarrow \varvec{\mu }, E^{\prime }\rightarrow E) u(\textbf{x}, \varvec{\eta }, E^{\prime }) \,d\varvec{\eta }\,dE^{\prime }, \end{aligned}$$

where \(\theta \) is a specified scattering kernel, and \(\beta (\textbf{x}, \varvec{\mu }, E) = \int _{\mathbb {E}} \int _{\mathbb {S}} \theta (\textbf{x}, \varvec{\mu }\rightarrow \varvec{\eta }, E\rightarrow E^{\prime }) \,d\varvec{\eta }\,dE^{\prime }\).

Physically, the model (1) describes the transport of particles through a scattering medium, and is linear due to the key physical assumption that particles are only scattered by interactions with the medium and do not interact with one another. The solution \(u(\textbf{x}, \varvec{\mu }, E)\) represents the fluence of particles with a particular energy \(E\in \mathbb {E}\), travelling in direction \(\varvec{\mu }\in \mathbb {S}\), passing through the point \(\textbf{x}\in \varOmega \). The scattering kernel \(\theta (\textbf{x}, \varvec{\eta }\rightarrow \varvec{\mu }, E^{\prime }\rightarrow E)\) represents the proportion of particles at position \(\textbf{x}\) with energy \(E^{\prime }\) travelling in direction \(\varvec{\eta }\) which transition to direction \(\varvec{\mu }\) and energy \(E\) as a result of an instantaneous collision with the medium. Conversely, the reaction coefficient \(\alpha (\textbf{x}, \varvec{\mu }, E) + \beta (\textbf{x}, \varvec{\mu }, E)\), commonly referred to as the macroscopic the total cross section, models loss of particles from the fluence in direction \(\varvec{\mu }\) with energy \(E\) as they are absorbed by the medium (\(\alpha \)) or scattered into other directions and energies (\(\beta \)).

We simplify the model slightly by assuming that the medium is angularly isotropic in the sense that \(\alpha (\textbf{x}, \varvec{\mu }, E) \equiv \alpha (\textbf{x}, E)\) and the scattering kernel depends only on the cosine of the angle between the initial and final directions; i.e., \(\theta (\textbf{x}, \varvec{\eta }\rightarrow \varvec{\mu }, E^{\prime }\rightarrow E) \equiv \theta (\textbf{x}, \varvec{\eta }\cdot \varvec{\mu }, E^{\prime }\rightarrow E)\). This has the implication that \(\beta (\textbf{x}, \varvec{\mu }, E) \equiv \beta (\textbf{x}, E)\) by symmetry. Furthermore, we make the (physically reasonable) assumption that \(\theta (\textbf{x}, \varvec{\eta }\cdot \varvec{\mu }, E^{\prime }\rightarrow E) = 0\) for \(E^{\prime }< E\), which states that particles cannot gain energy by scattering off the medium.

Remark 1

We remark that this assumption on the scattering kernel is not essential for the definition of the proposed DGFEM discretization or the proceeding a priori error analysis presented in Sect. 5; indeed, it is only relevant in Sect. 6, where we present a source iteration linear solver. For many applications involving photon/electron transport, the assumption that particles cannot gain energy by scattering off the medium is typically fulfilled, though of course up-scattering is relevant in some applications, e.g., thermal neutrons [36, 43]. In the case where up-scattering occurs, then the iterative solver presented in Sect. 6 would need to be suitably modified, whereby, for example, one sweeps both up and down the energy groups.

Finally, we assume that f and g are compactly supported functions of energy, and that there exists a constant \(c_0\) such that

$$\begin{aligned} c(\textbf{x}, \varvec{\mu }, E):= \alpha (\textbf{x}, \varvec{\mu }, E) + \frac{1}{2}(\beta (\textbf{x}, \varvec{\mu }, E) -\gamma (\textbf{x}, \varvec{\mu }, E)) \ge c_0> 0, \end{aligned}$$
(2)

where \(\gamma (\textbf{x}, \varvec{\mu }, E) = \int _{\mathbb {E}} \int _{\mathbb {S}} \theta (\textbf{x}, \varvec{\eta }\rightarrow \varvec{\mu }, E^{\prime }\rightarrow E) \,d\varvec{\eta }\,dE^{\prime }\). For notational simplicity, henceforth we will suppress the dependence of the data terms \(\alpha , \beta , f\) and \(g_\textrm{D}\) on the independent variables.

Remark 2

In practice the absorption cross section \(\alpha \) may be equal to zero; hence, in this setting, condition (2) reduces to the requirement that \(\beta - \gamma \ge c_0^{\prime }>0\), \(c_0^{\prime } = 2 c_0\), or more precisely that the macroscopic scattering cross section related to outgoing directions and energies (\(\beta \)) is greater than the corresponding quantity related to the incoming directions and energies (\(\gamma \)). An important scattering model employed in practice for photons is the Klein-Nishina scattering model, discussed in Sect. 7; one can show that this model does indeed satisfy (2) within a physical range of energies; see [39] for details.

Remark 3

In neutron transport applications, cross-section data may not be calculated from first principles, and instead must be determined through empirical means, cf. [36, Chapter 2.2]. By contrast, photon and electron applications may directly use analytic expressions of cross-section data [22].

3 Discrete Scheme

We discretise the Boltzmann transport problem (1) using a DGFEM approach, seeking an approximate solution which is a product of discontinuous piecewise polynomial functions with respect to meshes defined in the spatial, angular, and energy domains separately. For this, we introduce the following notation.

3.1 Spatial Discretisation

Let \(\mathcal {T}_{\varOmega }\) be a subdivision of the spatial domain \(\varOmega \) into non-overlapping open polytopic elements \(\kappa _{\tiny \varOmega }\) with diameter \(h_{\kappa _{\tiny \varOmega }}\) such that \(\overline{\varOmega } = \cup \,\overline{ \kappa _{\tiny \varOmega }}\). The set of faces in \(\mathcal {T}_{\varOmega }\) will be denoted by \(\mathcal {F}^{}_{\varOmega }\), which are defined as the \((d-1)\)-dimensional planar facets of the elements \(\kappa _{\tiny \varOmega }\) present in the mesh \(\mathcal {T}_{\varOmega }\). For \(d=3\), we assume that each planar face of an element \(\kappa _{\tiny \varOmega }\in \mathcal {T}_{\varOmega }\) can be subdivided into a set of co-planar \((d-1)\)-dimensional simplices and we refer to this set as the set of faces, as in [11].

Given \(\kappa _{\tiny \varOmega }\in \mathcal {T}_{\varOmega }\), we denote by \(p_{\kappa _{\tiny \varOmega }} \ge 0\) the polynomial degree on \(\kappa _{\tiny \varOmega }\), and define the vector \(\textbf{p}:= (p_{\kappa _{\tiny \varOmega }}: \kappa _{\tiny \varOmega }\in \mathcal {T}_{\varOmega })\). The spatial finite element space is then defined by

$$\begin{aligned} \mathbb {V}^\textbf{p}_{\varOmega }&= \{ v \in L_2(\varOmega ) : v|_{\kappa _{\tiny \varOmega }} \in {\mathbb {P}}_{p_{\kappa _{\tiny \varOmega }}}(\kappa _{\tiny \varOmega }) \text { for all } \kappa _{\tiny \varOmega }\in \mathcal {T}_{\varOmega }\}, \end{aligned}$$

where \({\mathbb {P}}_{k}(\kappa _{\tiny \varOmega })\) denotes the space of polynomials of total degree k on \(\kappa _{\tiny \varOmega }\). We denote by \(\partial \kappa _{\tiny \varOmega }\) the union of \((d-1)\)-dimensional open faces of the element \(\kappa _{\tiny \varOmega }\). Then, for a given direction \(\varvec{\mu }\in \mathbb {S}\) the inflow and outflow parts of \(\partial \kappa _{\tiny \varOmega }\) are defined as

$$\begin{aligned} \partial _{-}\kappa _{\tiny \varOmega }&=\{ \textbf{x}\in \partial \kappa _{\tiny \varOmega }: ~ \varvec{\mu }\cdot \varvec{n}_{\kappa _{\tiny \varOmega }}(\textbf{x})<0\} , \\ \partial _{+}\kappa _{\tiny \varOmega }&=\{\textbf{x}\in \partial \kappa _{\tiny \varOmega }: ~ \varvec{\mu }\cdot \varvec{n}_{\kappa _{\tiny \varOmega }}(\textbf{x}) \ge 0\}, \end{aligned}$$

respectively, where \(\varvec{n}_{\kappa _{\tiny \varOmega }}(\textbf{x})\) denotes the unit outward normal vector to \(\partial \kappa _{\tiny \varOmega }\) at \(\textbf{x}\in \partial \kappa _{\tiny \varOmega }\).

Given \(\kappa _{\tiny \varOmega }\in \mathcal {T}_{\varOmega }\), the trace of a (sufficiently smooth) function v on \(\partial _{-}\kappa _{\tiny \varOmega }\) from \(\kappa _{\tiny \varOmega }\) is denoted by \(v^+_{\kappa _{\tiny \varOmega }}\). Further, if \(\partial _{-}\kappa _{\tiny \varOmega }\backslash \partial \varOmega \) is nonempty, then for \(\textbf{x}\in \partial _{-}\kappa _{\tiny \varOmega }\backslash \partial \varOmega \) there exists a unique \(\kappa _{\tiny \varOmega }^\prime \in \mathcal {T}_{\varOmega }\) such that \(\textbf{x}\in \partial _{+}\kappa _{\tiny \varOmega }^\prime \); with this notation, we denote by \(v^-_{\kappa _{\tiny \varOmega }}\) the trace of \(v |_{\kappa _{\tiny \varOmega }^\prime }\) on \(\partial _{-}\kappa _{\tiny \varOmega }\backslash \partial \varOmega \). Hence the upwind jump of the function v across a face \(F \subset \partial _{-}\kappa _{\tiny \varOmega }\backslash \partial \varOmega \) is denoted by

$$\begin{aligned} \lfloor v\rfloor :=v^+_{\kappa _{\tiny \varOmega }}-v^-_{\kappa _{\tiny \varOmega }}. \end{aligned}$$

In the remainder of the article we suppress the subscript \(\kappa _{\tiny \varOmega }\), since it will always be clear which element \(\kappa _{\tiny \varOmega }\in \mathcal {T}_{\varOmega }\) the quantities \(v_{\kappa _{\tiny \varOmega }}^\pm \) correspond to.

3.2 Angular Discretisation

A general framework developed for solving partial differential equations on surfaces has been developed in [4, 17, 18] and the references cited therein. Given that our particular setting is greatly simplified, we proceed in a slightly different manner. Let \(\mathbb {S}_h\) denote a polyhedral surface in \(\mathbb {R}^d\) composed of (closed) planar faces \(\tilde{\kappa }_{\tiny \mathbb {S}}\) which are assumed to be either simplices (intervals if \(d=2\); triangles if \(d=3\)) or (affine) quadrilaterals (when \(d=3\)). We write \(\tilde{\mathcal {T}_{\mathbb {S}}}= \{\tilde{\kappa }_{\tiny \mathbb {S}}\}\) to denote the associated regular conforming triangulation of \(\mathbb {S}_h\), i.e., \(\mathbb {S}_h = \cup _{\tilde{\kappa }_{\tiny \mathbb {S}}\in \tilde{\mathcal {T}_{\mathbb {S}}}} \tilde{\kappa }_{\tiny \mathbb {S}}\). We now introduce a smooth invertible mapping \(\phi _\mathbb {S}: \mathbb {S}_h\rightarrow \mathbb {S}\); for example, assuming the surface is star-shaped with respect to the origin, we may simply define \(\phi _\mathbb {S}({\tilde{\varvec{\mu }}}) = {|{\tilde{\varvec{\mu }}}|}_{2}^{-1} {\tilde{\varvec{\mu }}}\), where \(|\cdot |_2\) denotes the \(l_2\)-norm. With this notation, we define a mesh of curved surface elements defined on \(\mathbb {S}\) by

$$\begin{aligned} \mathcal {T}_{\mathbb {S}}= \left\{ \kappa _{\tiny \mathbb {S}}: \kappa _{\tiny \mathbb {S}}= \phi _{\mathbb {S}}(\tilde{\kappa }_{\tiny \mathbb {S}}) ~ \forall \tilde{\kappa }_{\tiny \mathbb {S}}\in \tilde{\mathcal {T}_{\mathbb {S}}}\right\} . \end{aligned}$$

Crucially, we assume that elements \(\tilde{\kappa }_{\tiny \mathbb {S}}\in \tilde{\mathcal {T}_{\mathbb {S}}}\) are mapped to \(\kappa _{\tiny \mathbb {S}}\in \mathcal {T}_{\mathbb {S}}\), without any significant rescaling. More precisely, we assume that the determinant of the inverse of the first fundamental form of the mapping \(\phi _\mathbb {S}: \mathbb {S}_h\rightarrow \mathbb {S}\) is uniformly bounded from above and below, cf. [18]. Following [4, 17, 18], in the case when \(\tilde{\mathcal {T}_{\mathbb {S}}}\) is composed of simplices, then \(\mathbb {S}_h\) may, for example, be chosen to be a piecewise linear approximation of \(\mathbb {S}\), whereby the elements forming \(\mathbb {S}_h\) may be constructed so that their vertices lie on \(\mathbb {S}\). In the case when quadrilateral elements are employed, then an initial polyhedral domain \(\mathbb {S}_h^\prime \) may be constructed in a similar fashion, though in general the resulting element domains will not be affine. In this setting, we assume there exists \(\mathbb {S}_h\) consisting of affine quadrilateral elements, in such a manner that the corresponding quadrilateral facets of \(\mathbb {S}_h^\prime \) and \(\mathbb {S}_h\) may be mapped to one another without any significant rescaling. We stress that, irrespective of the specific choice of \(\mathbb {S}_h\), the assumption on scaling of the Jacobian of the mapping \(\phi _\mathbb {S}: \mathbb {S}_h\rightarrow \mathbb {S}\) is crucial to ensure that Lemma 5 holds, see Sect. 4 below.

Since the surface we are interested in discretising is simply the unit sphere in \(\mathbb {R}^d\), a practical choice for \(\mathbb {S}_h\) is the surface of the cube \([-1,1]^d\). This leads to the widely used cube-sphere discretisation of the sphere, and enables a particularly simplified implementation of the method.

Let \(\hat{\kappa }_{\tiny \mathbb {S}}\subset \mathbb {R}^{d-1}\) denote the reference element (either a simplex or quadrilateral), \(\phi _{\kappa _{\tiny \mathbb {S}}}: \hat{\kappa }_{\tiny \mathbb {S}}\rightarrow \tilde{\kappa }_{\tiny \mathbb {S}}\), which is assumed to be affine, and define \(F_{\kappa _{\tiny \mathbb {S}}}: \hat{\kappa }_{\tiny \mathbb {S}}\rightarrow \kappa _{\tiny \mathbb {S}}\) by \(F_{\kappa _{\tiny \mathbb {S}}} = \phi _{\mathbb {S}} \circ \phi _{\kappa _{\tiny \mathbb {S}}}\). For each \(\kappa _{\tiny \mathbb {S}}\in \mathcal {T}_{\mathbb {S}}\), let \(q_{\kappa _{\tiny \mathbb {S}}}\ge 0\) denote the polynomial degree used on \(\kappa _{\tiny \mathbb {S}}\), and introduce \(\textbf{q}:=(q_{\kappa _{\tiny \mathbb {S}}}: \kappa _{\tiny \mathbb {S}}\in \mathcal {T}_{\mathbb {S}})\). The finite element space defined on the surface of the sphere \(\mathbb {S}\) is then given by

$$\begin{aligned} \mathbb {V}^\textbf{q}_{\mathbb {S}}= \{v \in L_2(\mathbb {S}) : v|_{\kappa _{\tiny \mathbb {S}}} = {\hat{v}} \circ F_{\kappa _{\tiny \mathbb {S}}}^{-1}, ~{\hat{v}} \in {{\mathcal {R}}}_{q_{\kappa _{\tiny \mathbb {S}}}}(\hat{\kappa }_{\tiny \mathbb {S}}) \text { for all } \kappa _{\tiny \mathbb {S}}\in \mathcal {T}_{\mathbb {S}}\}, \end{aligned}$$

where \({{\mathcal {R}}}_k(\hat{\kappa }_{\tiny \mathbb {S}}) = {\mathbb {P}}_{k}(\hat{\kappa }_{\tiny \mathbb {S}})\) if \(\hat{\kappa }_{\tiny \mathbb {S}}\) is a simplex and \({{\mathcal {R}}}_k(\hat{\kappa }_{\tiny \mathbb {S}}) = {\mathbb {Q}}_{k}(\hat{\kappa }_{\tiny \mathbb {S}})\) if \(\hat{\kappa }_{\tiny \mathbb {S}}\) is a square; here \({\mathbb {Q}}_{k}(\hat{\kappa }_{\tiny \mathbb {S}})\) denotes the space of tensor product polynomials on \(\hat{\kappa }_{\tiny \mathbb {S}}\) of degree k in each coordinate direction.

3.3 Energy Discretisation

We first restrict the energy domain to be a finite interval by selecting minimum and maximum energy cutoffs \(E_{\min }\) and \(E_{\max }\), respectively. Due to the assumption that the problem data is compactly supported in energy and the assumption on the structure of the scattering kernel, these limits may be chosen so that the analytical solution is compactly supported in energy; with a slight abuse of notation we refer to \(\mathbb {E}\) to be this restricted domain \((E_{\min },E_{\max })\).

Then, for \(N_{\mathbb {E}} \ge 1\), let \(E_{\max } = E_{0}> E_{1}> \ldots> E_{N_{\mathbb {E}}-1} > E_{N_{\mathbb {E}}} = E_{\min }\) define a partition of the energy domain of the problem into \(N_{\mathbb {E}}\) energy groups. We will refer to the interval \(\kappa _g= (E_{g}, E_{g-1})\) as energy group g, \(g = 1, \dots , N_{\mathbb {E}}\), and define \(\mathcal {T}_{\mathbb {E}}= \{ \kappa _g\}_{g=1}^{N_{\mathbb {E}}}\). To each energy group \(\kappa _g\), \(g = 1, \dots , N_{\mathbb {E}}\), we associate a polynomial degree \(r_g \ge 0\). Defining \(\textbf{r} = (r_{\kappa _g})_{g = 1}^{N_{\mathbb {E}}}\), we introduce the energy finite element space

$$\begin{aligned} \mathbb {V}^\textbf{r}_{\mathbb {E}}= \{ v \in L_2(E_{\min }, E_{\max }) : v|_{\kappa _g} \in {\mathbb {P}}_{r_{\kappa _g}}(\kappa _g) \text { for all } \kappa _g\in \mathcal {T}_{\mathbb {E}}\}. \end{aligned}$$

3.4 Discontinuous Galerkin Scheme

Employing the definitions introduced in the previous sections, we define the full space-angle-energy mesh by

$$\begin{aligned} \mathcal {T}_{}= \mathcal {T}_{\varOmega }\times \mathcal {T}_{\mathbb {S}}\times \mathcal {T}_{\mathbb {E}}=\{\kappa : \kappa = \kappa _{\tiny \varOmega }\times \kappa _{\tiny \mathbb {S}}\times \kappa _g, ~\kappa _{\tiny \varOmega }\in \mathcal {T}_{\varOmega }, ~\kappa _{\tiny \mathbb {S}}\in \mathcal {T}_{\mathbb {S}}, ~\kappa _g\in \mathcal {T}_{\mathbb {E}}\}. \end{aligned}$$

Over the mesh \(\mathcal {T}_{}\), we combine the separate function spaces defined above to obtain the discretisation space

$$\begin{aligned} \mathbb {V}^{\textbf{p},\textbf{q},\textbf{r}}_{h}= \mathbb {V}^\textbf{p}_{\varOmega }\otimes \mathbb {V}^\textbf{q}_{\mathbb {S}}\otimes \mathbb {V}^\textbf{r}_{\mathbb {E}}, \end{aligned}$$

and, for any \(\varvec{\mu }\in \mathbb {S}\), let \(\mathcal {G}_{\varvec{\mu }, h} = \{ v \in L_2(\varOmega ): \varvec{\mu }\cdot \nabla _{\textbf{x}} v|_{\kappa _{\tiny \varOmega }} \in L_2(\kappa _{\tiny \varOmega }) \text { for all } \kappa _{\tiny \varOmega }\in \mathcal {T}_{\varOmega }\}\) denote the broken spatial graph space.

We define the upwind transport bilinear form \(a_{\varvec{\mu }}^{E}: \mathcal {G}_{\varvec{\mu }, h} \times \mathcal {G}_{\varvec{\mu }, h} \rightarrow \mathbb {R}\) as

$$\begin{aligned} a_{\varvec{\mu }}^{E}(w, v) =&\sum _{\kappa _{\tiny \varOmega }\in \mathcal {T}_{\varOmega }} \int _{\kappa _{\tiny \varOmega }} (\varvec{\mu }\cdot \nabla _\textbf{x}w v + (\alpha + \beta ) w v) \,d\textbf{x}\\&- \!\!\!\sum _{\kappa _{\tiny \varOmega }\in \mathcal {T}_{\varOmega }} \int _{\partial _{-}\kappa _{\tiny \varOmega }\backslash \partial \varOmega } \!\! (\varvec{\mu }\cdot \varvec{n}_{\kappa _{\tiny \varOmega }}) \lfloor w\rfloor v^+ \,ds \\&- \!\!\! \sum _{\kappa _{\tiny \varOmega }\in \mathcal {T}_{\varOmega }} \int _{\partial _{-}\kappa _{\tiny \varOmega }\cap \partial \varOmega } (\varvec{\mu }\cdot \varvec{n}_{\kappa _{\tiny \varOmega }}) w^+ v^+ \,ds, \end{aligned}$$

and further define the scattering bilinear form \(s_{\varvec{\mu }}^{E}: L_2(\mathcal {D}) \times L_2(\varOmega ) \rightarrow \mathbb {R}\) and load linear form \(\ell _{\varvec{\mu }}^{E}: \mathcal {G}_{\varvec{\mu }, h} \rightarrow \mathbb {R}\), respectively, by

$$\begin{aligned} s_{\varvec{\mu }}^{E}(w, v) = \int _{\varOmega } \mathcal {S}[w](\textbf{x}, \varvec{\mu }, E) v \,d\textbf{x}, \end{aligned}$$

and

$$\begin{aligned} \ell _{\varvec{\mu }}^{E}(v) = \int _{\varOmega } f w \,d\textbf{x}- \sum _{\kappa _{\tiny \varOmega }\in \mathcal {T}_{\varOmega }} \int _{\partial _{-}\kappa _{\tiny \varOmega }\cap \partial \varOmega } (\varvec{\mu }\cdot \varvec{n}_{\kappa _{\tiny \varOmega }}) g_\textrm{D}w \,ds. \end{aligned}$$

Finally, we introduce the DGFEM: find \(u_h \in \mathbb {V}^{\textbf{p},\textbf{q},\textbf{r}}_{h}\) such that

$$\begin{aligned} b(u_h, v_h) \equiv a(u_h, v_h) - s(u_h, v_h) = \ell (v_h) \end{aligned}$$
(3)

for all \(v_h \in \mathbb {V}^{\textbf{p},\textbf{q},\textbf{r}}_{h}\), where \(a, s: \mathbb {V}^{\textbf{p},\textbf{q},\textbf{r}}_{h}\times \mathbb {V}^{\textbf{p},\textbf{q},\textbf{r}}_{h}\rightarrow \mathbb {R}\) and \(\ell : \mathbb {V}^{\textbf{p},\textbf{q},\textbf{r}}_{h}\rightarrow \mathbb {R}\) are given, respectively, by

$$\begin{aligned} a(w_h, v_h) = \int _{\mathbb {E}} \int _{\mathbb {S}} a_{\varvec{\mu }}^{E}(w_h, v_h)&\,d\varvec{\mu }\,dE, \qquad s(w_h, v_h) = \int _{\mathbb {E}} \int _{\mathbb {S}} s_{\varvec{\mu }}^{E}(w_h, v_h) \,d\varvec{\mu }\,dE, \\ {}&\ell (v_h) = \int _{\mathbb {E}} \int _{\mathbb {S}} \ell _{\varvec{\mu }}^{E}(v_h) \,d\varvec{\mu }\,dE. \end{aligned}$$

We note that this scheme is consistent in the sense that if the analytical solution u to (1) is sufficiently smooth then

$$\begin{aligned} b(u, v) = \ell (v) \end{aligned}$$

for all \(v \in \mathbb {V}^{\textbf{p},\textbf{q},\textbf{r}}_{h}\).

Remark 4

We note that when \(\textbf{q} = \textbf{0}\) and \(\textbf{r} = \textbf{0}\), i.e., when piecewise constant polynomials are employed over the angular and energy meshes \(\mathcal {T}_{\mathbb {S}}\) and \(\mathcal {T}_{\mathbb {E}}\), respectively, then the proposed DGFEM scheme (3) is essentially equivalent to employing a discrete ordinates approximation over the angular domain \(\mathbb {S}\), with the multigroup approximation in the energy domain \(\mathbb {E}\), subject to a slightly different treatment of the scattering operator, cf. [21], for example.

4 Inverse Inequalities and Approximation Theory

In this section, we briefly outline the key technical results required to analyse the DGFEM defined in (3); for further details, we refer to [10,11,12]. We first introduce some assumptions on the polytopic spatial mesh \(\mathcal {T}_{\varOmega }\).

Assumption 1

The subdivision \(\mathcal {T}_{\varOmega }\) is shape regular in the sense that there exists a positive constant \(C_{\textrm{shape}}\), independent of the mesh parameters, such that:

$$\begin{aligned} \forall \kappa _{\tiny \varOmega }\in \mathcal {T}_{\varOmega }, \quad \frac{h_{\kappa _{\tiny \varOmega }}}{\rho _{\kappa _{\tiny \varOmega }}} \le C_{\textrm{shape}}, \end{aligned}$$

with \(\rho _{\kappa _{\tiny \varOmega }}\) denoting the diameter of the largest ball contained in \(\kappa _{\tiny \varOmega }\).

Assumption 2

There exists a positive constant \(C_F\), independent of the mesh parameters, such that

$$\begin{aligned} \max _{\kappa _{\tiny \varOmega }\in \mathcal {T}_{\varOmega }} \left( \text{ card } \left\{ F \in \mathcal {F}^{}_{\varOmega }: F \subset \partial \kappa _{\tiny \varOmega }\right\} \right) \le C_F. \end{aligned}$$

In order to state the following hp-version inverse estimates, proved in [10, 12], which are sharp with respect to \((d-k)\)-dimensional, \(k = 1,\ldots , d-1\), element facet degeneration, we first recall the following definition.

Definition 1

Let \({\tilde{\mathcal {T}_{\varOmega }}}\) denote the subset of elements \(\kappa _{\tiny \varOmega }\in \mathcal {T}_{\varOmega }\) which can each be covered by at most \(m_{\mathcal {T}_{\varOmega }}\) shape-regular simplices \(K_i\), \(i=1,\dots , m_{\mathcal {T}_{\varOmega }}\), and

$$\begin{aligned} {{\,\textrm{dist}\,}}(\kappa _{\tiny \varOmega }, \partial K_i)>C_{as}{\text {diam}}(K_i)/p_{\kappa _{\tiny \varOmega }}^{2}, \text { with } |K_i|\ge c_{as} |\kappa _{\tiny \varOmega }| \end{aligned}$$

for all \(i=1,\dots , m_{\mathcal {T}_{\varOmega }}\), for some \(m_{\mathcal {T}_{\varOmega }}\in {\mathbb {N}}\) and \(C_{as}, c_{as}>0\), independent of \(\kappa _{\tiny \varOmega }\) and \(\mathcal {T}_{\varOmega }\), where \(p_{\kappa _{\tiny \varOmega }}\) denotes the polynomial degree associated with element \(\kappa _{\tiny \varOmega }\), \(\kappa _{\tiny \varOmega }\in \mathcal {T}_{\varOmega }\).

Next we recall the following definition from [10].

Definition 2

For each element \(\kappa _{\tiny \varOmega }\in \mathcal {T}_{\varOmega }\), let \({\mathcal {F}}_{\flat }^{\kappa _{\tiny \varOmega }}\) denote the family of all possible d-dimensional simplices contained in \(\kappa _{\tiny \varOmega }\) and having at least one face in common with \(\kappa _{\tiny \varOmega }\). The notation \(\kappa _{\flat }^F\) will be used to indicate a simplex belonging to \({\mathcal {F}}_{\flat }^{\kappa _{\tiny \varOmega }}\) and sharing the face F with \(\kappa _{\tiny \varOmega }\).

With this definition, we introduce the mesh parameter \(h_{\kappa _{\tiny \varOmega }}^\bot \) defined by

$$\begin{aligned} h_{\kappa _{\tiny \varOmega }}^\bot : = \min _{F\subset \partial \kappa _{\tiny \varOmega }} \frac{\sup _{\kappa _{\flat }^F\subset \kappa _{\tiny \varOmega }} |\kappa _{\flat }^F|}{|F|} d \qquad \forall \kappa _{\tiny \varOmega }\in \mathcal {T}_{\varOmega }, ~~ d=2,3, \end{aligned}$$
(4)

and note that \(h_{\kappa _{\tiny \varOmega }}^\bot \le h_{\kappa _{\tiny \varOmega }}\). This enables us to recall the following inverse inequality, cf. [11] (equation (5.23)).

Lemma 1

Let \(\kappa _{\tiny \varOmega }\in \mathcal {T}_{\varOmega }\), \(F\subset \partial \kappa _{\tiny \varOmega }\) denote one of its faces. Then, for each \(v\in {\mathbb {P}}_p(\kappa _{\tiny \varOmega })\), we have the inverse estimate

$$\begin{aligned} \Vert v\Vert _{L_2(F)}^2 \le C_\textrm{inv}^{F}\frac{p^2}{h_{\kappa _{\tiny \varOmega }}^\bot }\Vert v\Vert _{L_2(\kappa _{\tiny \varOmega })}^2, \end{aligned}$$
(5)

where \(C_\textrm{inv}^{F}\) is a positive constant, which depends on the shape regularity of the covering of \(\kappa _{\tiny \varOmega }\), if \(\kappa _{\tiny \varOmega }\in {\tilde{\mathcal {T}_{\varOmega }}}\), but is independent of the discretisation parameters.

To state the \(H^1-L_2\) trace inequality, we need the following further assumption.

Assumption 3

We assume that every polytopic element \(\kappa _{\tiny \varOmega }\in \mathcal {T}_{\varOmega }\backslash {\tilde{\mathcal {T}_{\varOmega }}}\), admits a sub-triangulation into at most \(n_{\mathcal {T}_{\varOmega }}\) shape-regular simplices \({\mathfrak {s}}_i\), \(i=1,2,\dots , n_{\mathcal {T}_{\varOmega }}\), such that \({\bar{\kappa _{\tiny \varOmega }}} = \cup _{i=1}^{n_{\mathcal {T}_{\varOmega }}} \bar{{\mathfrak {s}}}_i\) and

$$\begin{aligned} |{\mathfrak {s}}_i| \ge {\hat{c}} |\kappa _{\tiny \varOmega }| \end{aligned}$$

for all \(i=1,\dots , n_{\mathcal {T}_{\varOmega }}\), for some \(n_{\mathcal {T}_{\varOmega }}\in {\mathbb {N}}\) and \({\hat{c}}>0\), independent of \(\kappa _{\tiny \varOmega }\) and \(\mathcal {T}_{\varOmega }\).

Lemma 2

([11] (Lemma 14)) Given Assumptions 1 and 3 are satisfied, for each \(v\in {\mathbb {P}}_p(\kappa _{\tiny \varOmega })\), \(\kappa _{\tiny \varOmega }\in \mathcal {T}_{\varOmega }\), the inverse estimate

$$\begin{aligned} \Vert \nabla _{\textbf{x}} v\Vert _{L_2(\kappa _{\tiny \varOmega })}^2 \le C_\textrm{inv}^{\kappa _{\tiny \varOmega }}\frac{p^4}{h_{\kappa _{\tiny \varOmega }}^2}\Vert v\Vert _{L_2(\kappa _{\tiny \varOmega })}^2, \end{aligned}$$
(6)

holds, with constant \(C_\textrm{inv}^{\kappa _{\tiny \varOmega }}\) independent of the element diameter \(h_{\kappa _{\tiny \varOmega }}\), the polynomial order p, and the function v, but dependent on the shape regularity of the covering of \(\kappa _{\tiny \varOmega }\), if \(\kappa _{\tiny \varOmega }\in {\tilde{\mathcal {T}_{\varOmega }}}\), or the sub-triangulation of \(\kappa _{\tiny \varOmega }\), if \(\kappa _{\tiny \varOmega }\in \mathcal {T}_{\varOmega }\backslash {\tilde{\mathcal {T}_{\varOmega }}}\).

Furthermore we recall the following multiplicative trace inequality, see [12], but written in a slightly different form, see [10].

Lemma 3

For \(v\in H^1(\kappa _{\tiny \varOmega })\), \(\kappa _{\tiny \varOmega }\in \mathcal {T}_{\varOmega }\), given \(F\subset \partial \kappa _{\tiny \varOmega }\), the following bound holds

$$\begin{aligned} {\Vert v\Vert }^2_{L_2(F)} \le \frac{C_T}{h_{\kappa _{\tiny \varOmega }}^\bot } \left( {\Vert v\Vert }^2_{L_2(\kappa _{\tiny \varOmega })} + h_{\kappa _{\tiny \varOmega }} {\Vert v\Vert }_{L_2(\kappa _{\tiny \varOmega })} {\Vert \nabla _\textbf{x}v\Vert }_{L_2(\kappa _{\tiny \varOmega })} \right) , \end{aligned}$$

where \(C_T\) is a positive constant which is independent of the element diameter \(h_{\kappa _{\tiny \varOmega }}\).

We now turn our attention to deriving suitable hp-version approximation results on each of the finite element spaces \(\mathbb {V}^\textbf{p}_{\varOmega }\), \(\mathbb {V}^\textbf{q}_{\mathbb {S}}\) and \(\mathbb {V}^\textbf{r}_{\mathbb {E}}\). Starting with the spatial finite element space \(\mathbb {V}^\textbf{p}_{\varOmega }\), we first introduce the following covering of the mesh \(\mathcal {T}_{\varOmega }\), see [12].

Definition 3

A (typically overlapping) covering \({\mathcal {T}}_{\sharp } = \{ {\mathcal {K}} \}\) related to the polytopic mesh \(\mathcal {T}_{\varOmega }\) is a set of shape-regular d-simplices \({\mathcal {K}}\), such that for each \(\kappa _{\tiny \varOmega }\in \mathcal {T}_{\varOmega }\), there exists a \({\mathcal {K}}\in {\mathcal {T}}_{\sharp }\), with \(\kappa _{\tiny \varOmega }\subset {\mathcal {K}}\). Moreover, we assume there exists a covering such that \( {\text {diam}}({\mathcal {K}})\le C_{{\text {diam}}} h_{\kappa }, \) for each pair \(\kappa _{\tiny \varOmega }\in \mathcal {T}_{\varOmega }\), \({\mathcal {K}}\in {\mathcal {T}}_{\sharp }\), with \(\kappa _{\tiny \varOmega }\subset {\mathcal {K}}\), for a constant \(C_{{\text {diam}}}>0\), uniformly with respect to the meshsize.

Furthermore, we introduce the following extension operator from [44] (Theorem 5) and [40] (Theorem 3).

Theorem 4

Let D be a domain with minimally smooth boundary. Then, there exists a linear extension operator \({\mathfrak {E}}:H^s(D) \rightarrow H^s({{\mathbb {R}}}^d)\), \(s \in {{\mathbb {N}}}_0\equiv \left\{ 0,1,2,\ldots \right\} \), such that \({\mathfrak {E}}v|_{D}=v\) and

$$\begin{aligned} \Vert {\mathfrak {E}} v \Vert _{H^s({{\mathbb {R}}}^d)} \le C_{{\mathfrak {E}}} \Vert v \Vert _{H^s(D)}, \end{aligned}$$

where \(C_{{\mathfrak {E}}}\) is a positive constant depending only on s and parameters which characterize the boundary \(\partial D\).

With this notation we recall the approximation result from [12] (Theorem 4.2).

Lemma 4

Let \(\kappa _{\tiny \varOmega }\in \mathcal {T}_{\varOmega }\) and \({\mathcal {K}}\in {\mathcal {T}}_{\sharp }\) denote the corresponding simplex such that \(\kappa _{\tiny \varOmega }\subset {\mathcal {K}}\), cf. Definition 3. Suppose that \(v\in L_2(\varOmega )\) is such that \({\mathfrak {E}} v|_{{\mathcal {K}}}\in H^{l_{\kappa _{\tiny \varOmega }}}({\mathcal {K}})\), for some \(l_{\kappa _{\tiny \varOmega }}\ge 0\). Then, there exists \(\varPi _{\varOmega } v\), such that \(\varPi _{\varOmega } v|_{\kappa _{\tiny \varOmega }} \in {\mathbb {P}}_{p_{\kappa _{\tiny \varOmega }}}(\kappa _{\tiny \varOmega })\), and the following bound holds

$$\begin{aligned} {\Vert v - \varPi _{\varOmega } v\Vert }_{H^m(\kappa _{\tiny \varOmega })} \le C \frac{h_{\kappa _{\tiny \varOmega }}^{s_{\kappa _{\tiny \varOmega }}-m}}{p_{\kappa _{\tiny \varOmega }}^{l_{\kappa _{\tiny \varOmega }}-m}}{\Vert {\mathfrak {E}}v\Vert }_{H^{l_{\kappa _{\tiny \varOmega }}}({\mathcal {K}} )},\quad l_{\kappa _{\tiny \varOmega }}\ge 0, \end{aligned}$$
(7)

for \(0\le m\le l_{\kappa _{\tiny \varOmega }}\). Here, \(s_{\kappa _{\tiny \varOmega }}=\min \{p_{\kappa _{\tiny \varOmega }}+1, l_{\kappa _{\tiny \varOmega }}\}\) and C is a positive constant, that depends on the shape-regularity of \({{\mathcal {K}}}\), but is independent of v, \(h_{\kappa _{\tiny \varOmega }}\), and \(p_{\kappa _{\tiny \varOmega }}\).

A careful inspection of the proof of Theorem 4 reveals that the constant \(C_{{\mathfrak {E}}}\) is independent of the measure of the underlying domain D, cf. [5]. Hence, employing Theorem 4, the bound (7) given in Lemma 4 may be stated in the following simplified form:

$$\begin{aligned} {\Vert v - \varPi _{\varOmega } v\Vert }_{H^m(\kappa _{\tiny \varOmega })} \le C \frac{h_{\kappa _{\tiny \varOmega }}^{s_{\kappa _{\tiny \varOmega }}-m}}{p_{\kappa _{\tiny \varOmega }}^{l_{\kappa _{\tiny \varOmega }}-m}}{\Vert v\Vert }_{H^{l_{\kappa _{\tiny \varOmega }}}(\kappa _{\tiny \varOmega })},\quad l_{\kappa _{\tiny \varOmega }}\ge 0, \end{aligned}$$
(8)

for \(0\le m\le l_{\kappa _{\tiny \varOmega }}\), and therefore the condition placed on the amount of overlap of the simplices \({\mathcal {K}}\) in [10,11,12] is not required.

To construct a projection operator onto the angular finite element space \(\mathbb {V}^\textbf{q}_{\mathbb {S}}\), some care is required to account for the curvature of \(\mathbb {S}\); for completeness we recall the key steps. Under our assumptions on the mapping \(\phi _\mathbb {S}: \mathbb {S}_h\rightarrow \mathbb {S}\), we first recall the following result from [4, 17].

Lemma 5

Let \(v \in H^j (\kappa _{\tiny \mathbb {S}})\), \(j\ge 0\); then writing \({\tilde{v}} = v \circ \phi _{\mathbb {S}}\), we have that

$$\begin{aligned} \frac{1}{C} \Vert v \Vert _{L_2(\kappa _{\tiny \mathbb {S}})} \le&\Vert {\tilde{v}} \Vert _{L_2(\tilde{\kappa }_{\tiny \mathbb {S}})} \le C \Vert v \Vert _{L_2(\kappa _{\tiny \mathbb {S}})}, \quad \text { and } \quad |{\tilde{v}}|_{H^j(\tilde{\kappa }_{\tiny \mathbb {S}})} \le C \Vert v\Vert _{H^j(\kappa _{\tiny \mathbb {S}})}, \end{aligned}$$

where C is a positive constant, which is independent of the meshsize \(h_{\kappa _{\tiny \mathbb {S}}}\).

Employing hp-approximation results for standard shaped elements, we recall the following result from [7, 41].

Lemma 6

Suppose that \({\mathfrak {K}}\) is a d-simplex or d-parallelepiped of diameter \(h_{{\mathfrak {K}}}\). Suppose further that \(v|_{{\mathfrak {K}}} \in H^{l_{{\mathfrak {K}}}}({\mathfrak {K}})\), \(l_{{\mathfrak {K}}} \ge 0\). Then, there exists \({\hat{\varPi }}_{p_{{\mathfrak {K}}}}v\) in \({\mathcal {R}}_{p_{{\mathfrak {K}}}}({\mathfrak {K}})\), \(p_{{\mathfrak {K}}} = 1,2, \dots ,\) such that for \(0 \le m \le l_{{\mathfrak {K}}}\),

$$\begin{aligned} \Vert v - {\hat{\varPi }}_{p_{{\mathfrak {K}}}} v\Vert _{H^m({\mathfrak {K}})} \le C \frac{h_{{\mathfrak {K}}}^{s_{{\mathfrak {K}}}-m}}{p_{{\mathfrak {K}}}^{l_{{\mathfrak {K}}}-m}} \Vert v\Vert _{H^{l_{{\mathfrak {K}}}}({\mathfrak {K}})}, \end{aligned}$$

where \(s_{{\mathfrak {K}}} = \min \{p_{{\mathfrak {K}}}+1, l_{{\mathfrak {K}}}\}\) and C is a positive constant, independent of v and the discretisation parameters.

Equipped with Lemma 6, we introduce the projection operator \(\varPi _{\mathbb {S}}\) by

$$\begin{aligned} \varPi _{\mathbb {S}} v|_{\kappa _{\tiny \mathbb {S}}} = ({\hat{\varPi }}_{q_{\kappa _{\tiny \mathbb {S}}}} v|_{\kappa _{\tiny \mathbb {S}}} \circ \phi _{\mathbb {S}}) \circ \phi _{\mathbb {S}}^{-1} \end{aligned}$$

for all \(\kappa _{\tiny \mathbb {S}}\in \mathcal {T}_{\mathbb {S}}\). Hence, employing Lemmas 56, together with the definition of \(\varPi _{\mathbb {S}}\) we deduce the following result.

Lemma 7

Let \(\kappa _{\tiny \mathbb {S}}\in \mathcal {T}_{\mathbb {S}}\), then given \(v|_{\kappa _{\tiny \mathbb {S}}}\in H^{l_{\kappa _{\tiny \mathbb {S}}}}(\kappa _{\tiny \mathbb {S}})\), for some \(l_{\kappa _{\tiny \mathbb {S}}}\ge 0\), the following bound holds

$$\begin{aligned} {\Vert v - \varPi _{\mathbb {S}} v\Vert }_{L_2(\kappa _{\tiny \mathbb {S}})} \le C \frac{h_{\kappa _{\tiny \mathbb {S}}}^{s_{\kappa _{\tiny \mathbb {S}}}}}{q_{\kappa _{\tiny \mathbb {S}}}^{l_{\kappa _{\tiny \mathbb {S}}}}}{\Vert v\Vert }_{H^{l_{\kappa _{\tiny \mathbb {S}}}}(\kappa _{\tiny \mathbb {S}})},\quad l_{\kappa _{\tiny \mathbb {S}}}\ge 0, \end{aligned}$$

where \(s_{\kappa _{\tiny \mathbb {S}}}=\min \{q_{\kappa _{\tiny \mathbb {S}}}+1, l_{\kappa _{\tiny \mathbb {S}}}\}\) and C is a positive constant, that depends on the shape-regularity of \(\kappa _{\tiny \mathbb {S}}\), but is independent of v, \(h_{\kappa _{\tiny \mathbb {S}}}\), and \(q_{\kappa _{\tiny \mathbb {S}}}\).

For approximation with respect to energy, we simply define the projection operator \(\varPi _{\mathbb {E}}\) by \(\varPi _{\mathbb {E}}v |_{\kappa _g} = {\hat{\varPi }}_{r_{\kappa _g}} v|_{\kappa _g}\), for \(g=1,\ldots ,N_{\mathbb {E}}\). Collecting these three projection operators, we define \(\varPi :L_2(\mathcal {D})\rightarrow \mathbb {V}^{\textbf{p},\textbf{q},\textbf{r}}_{h}\) by \(\varPi = \varPi _{\varOmega }\varPi _{\mathbb {S}}\varPi _{\mathbb {E}}\). With this notation we state the following approximation result for the projection operator \(\varPi \).

Lemma 8

Let \(\kappa \in \mathcal {T}_{}\) such that \(\kappa =\kappa _{\tiny \varOmega }\times \kappa _{\tiny \mathbb {S}}\times \kappa _g\), \(\kappa _{\tiny \varOmega }\in \mathcal {T}_{\varOmega }\), \(\kappa _{\tiny \mathbb {S}}\in \mathcal {T}_{\mathbb {S}}\), \(\kappa _g\in \mathcal {T}_{\mathbb {E}}\), then given \(v|_{\kappa }\in H^{l_\kappa }(\kappa )\), \(l_\kappa \ge 0\), the following bound holds

$$\begin{aligned} {\Vert v-\varPi v\Vert }^2_{L_2(\kappa )} \le C \left( \frac{h_{\kappa _{\tiny \varOmega }}^{2s_{\kappa _{\tiny \varOmega }}}}{p_{\kappa _{\tiny \varOmega }}^{2l_{\kappa }}} +\frac{h_{\kappa _{\tiny \mathbb {S}}}^{2s_{\kappa _{\tiny \mathbb {S}}}}}{q_{\kappa _{\tiny \mathbb {S}}}^{2l_{\kappa }}} +\frac{h_{\kappa _g}^{2s_{\kappa _g}}}{r_{\kappa _g}^{2l_{\kappa }}} \right) {\Vert v\Vert }_{H^{l_\kappa }(\kappa )}^2. \end{aligned}$$
(9)

Furthermore, assuming \(v|_{\kappa } \in H^{l_\kappa }(\kappa )\cup H^1(\kappa _{\tiny \varOmega };H^{l_\kappa }(\kappa _{\tiny \mathbb {S}}\times \kappa _g))\), \(l_\kappa \ge 1\), we have that

$$\begin{aligned} {\Vert \nabla _\textbf{x}(v-\varPi v)\Vert }^2_{L_2(\kappa )} \le&~C \left( \frac{h_{\kappa _{\tiny \varOmega }}^{2s_{\kappa _{\tiny \varOmega }}-2}}{p_{\kappa _{\tiny \varOmega }}^{2l_{\kappa }-2}}{\Vert v\Vert }_{H^{l_\kappa }(\kappa )}^2 \right. \nonumber \\&\left. +\left( \frac{h_{\kappa _{\tiny \mathbb {S}}}^{2s_{\kappa _{\tiny \mathbb {S}}}}}{q_{\kappa _{\tiny \mathbb {S}}}^{2l_{\kappa }}} +\frac{h_{\kappa _g}^{2s_{\kappa _g}}}{r_{\kappa _g}^{2l_{\kappa }}} \right) {\Vert \nabla _\textbf{x}v\Vert }_{L_2(\kappa _{\tiny \varOmega };H^{l_\kappa }(\kappa _{\tiny \mathbb {S}}\times \kappa _g) )}^2 \right) , \end{aligned}$$
(10)

and

$$\begin{aligned}&\int _{\kappa _g}\int _{\kappa _{\tiny \mathbb {S}}} {\Vert v-\varPi v\Vert }^2_{L_2(\partial \kappa _{\tiny \varOmega })} \,d\varvec{\mu }\,dE\nonumber \\&~~~~~~ \le ~C \left( \frac{1}{h_{\kappa _{\tiny \varOmega }}^\perp }\left( \frac{h_{\kappa _{\tiny \varOmega }}^{2s_{\kappa _{\tiny \varOmega }}}}{p_{\kappa _{\tiny \varOmega }}^{2l_{\kappa }-2}} +\frac{h_{\kappa _{\tiny \mathbb {S}}}^{2s_{\kappa _{\tiny \mathbb {S}}}}}{q_{\kappa _{\tiny \mathbb {S}}}^{2l_{\kappa }}} +\frac{h_{\kappa _g}^{2s_{\kappa _g}}}{r_{\kappa _g}^{2l_{\kappa }}} \right) {\Vert v\Vert }_{H^{l_\kappa }(\kappa )}^2 \right. \nonumber \\&~~~~~~~~~~ \left. +\frac{h_{\kappa _{\tiny \varOmega }}^2}{h_{\kappa _{\tiny \varOmega }}^\perp } \left( \frac{h_{\kappa _{\tiny \mathbb {S}}}^{2s_{\kappa _{\tiny \mathbb {S}}}}}{q_{\kappa _{\tiny \mathbb {S}}}^{2l_{\kappa }}} +\frac{h_{\kappa _g}^{2s_{\kappa _g}}}{r_{\kappa _g}^{2l_{\kappa }}} \right) {\Vert \nabla _\textbf{x}v\Vert }_{L_2(\kappa _{\tiny \varOmega };H^{l_\kappa }(\kappa _{\tiny \mathbb {S}}\times \kappa _g) )}^2 \right) . \end{aligned}$$
(11)

Here, \(s_{\kappa _{\tiny \varOmega }} = \min (p_{\kappa _{\tiny \varOmega }}+1,l_\kappa )\), \(s_{\kappa _{\tiny \mathbb {S}}} = \min (q_{\kappa _{\tiny \mathbb {S}}}+1,l_\kappa )\), \(s_{\kappa _g} = \min (r_{\kappa _g}+1,l_\kappa )\), and C is a positive constant that depends on the shape regularity of the element \(\kappa \), but is independent of the mesh parameters.

Proof

We start by first writing the projection error in the form

$$\begin{aligned} v-\varPi v = v -\varPi _{\mathbb {E}} v + \varPi _{\mathbb {E}} (v - \varPi _{\mathbb {S}} v) + \varPi _{\mathbb {E}} \varPi _{\mathbb {S}} (v - \varPi _{\varOmega } v). \end{aligned}$$

Then (9) follows immediately upon application of the triangle inequality, employing the \(L_2\)-stability of \(\varPi _{\mathbb {E}}\) and \(\varPi _{\mathbb {S}}\), and the approximation results stated in Lemma 4, cf. (8), Lemmas 6 and 7. The proof of (10) follows in an analogous fashion. To derive (11), we first employ the trace inequality stated in Lemma 3, together with (9) and (10). \(\square \)

5 Stability and Convergence of the Discrete Scheme

In this section we study the stability and convergence of the DGFEM (3). To this end, we introduce the DGFEM-energy norm

$$\begin{aligned} {|\hspace{-0.85358pt}|\hspace{-0.85358pt}|{v}|\hspace{-0.85358pt}|\hspace{-0.85358pt}|}_\textrm{DG}^2 =&\Vert \sqrt{c} \, v\Vert _{L_2(\mathcal {D})}^2 \nonumber \\&+ \frac{1}{2} \int _{\mathbb {E}} \int _{\mathbb {S}} \sum _{\kappa _{\tiny \varOmega }\in \mathcal {T}_{\varOmega }} \Big ( \Vert v^+ - v^-\Vert _{\partial _{-}\kappa _{\tiny \varOmega }\backslash \partial \varOmega }^2 + \Vert v^+\Vert _{\partial \kappa _{\tiny \varOmega }\cap \partial \varOmega }^2 \Big ) \,d\varvec{\mu }\,dE, \end{aligned}$$
(12)

where \(\Vert \cdot \Vert _{\omega }\), \(\omega \subset \partial \kappa _{\tiny \varOmega }\), denotes the (semi)norm associated with the (semi)inner product \((v,w)_\omega = \int _\omega |\varvec{\mu }\cdot \varvec{n}_{\kappa _{\tiny \varOmega }} |vw \,ds\), and define streamline norm

$$\begin{aligned} {|\hspace{-0.85358pt}|\hspace{-0.85358pt}|{v}|\hspace{-0.85358pt}|\hspace{-0.85358pt}|}_{s}^2 = {|\hspace{-0.85358pt}|\hspace{-0.85358pt}|{v}|\hspace{-0.85358pt}|\hspace{-0.85358pt}|}_\textrm{DG}^2 + \int _{\mathbb {E}} \int _{\mathbb {S}} \sum _{\kappa _{\tiny \varOmega }\in \mathcal {T}_{\varOmega }} \tau _{\kappa _{\tiny \varOmega }} {\Vert \varvec{\mu }\cdot \nabla _\textbf{x}v\Vert }_{L_2(\kappa _{\tiny \varOmega })}^2 \,d\varvec{\mu }\,dE. \end{aligned}$$

Furthermore, for \(\kappa _{\tiny \varOmega }\in \mathcal {T}_{\varOmega }\), we define

$$\begin{aligned} \tau _{\kappa _{\tiny \varOmega }} = \frac{h_{\kappa _{\tiny \varOmega }}^\bot }{p_{\kappa _{\tiny \varOmega }}^2}.\ \end{aligned}$$

Firstly, we state the following coercivity bound.

Theorem 5

(Coercivity) The DGFEM (3) is coercive with respect to the DGFEM-energy norm \({|\hspace{-0.85358pt}|\hspace{-0.85358pt}|{\cdot }|\hspace{-0.85358pt}|\hspace{-0.85358pt}|}_\textrm{DG}\) in the sense that the following bound holds:

$$\begin{aligned} b(v, v) \ge {|\hspace{-0.85358pt}|\hspace{-0.85358pt}|{v}|\hspace{-0.85358pt}|\hspace{-0.85358pt}|}_\textrm{DG}^2 \end{aligned}$$

for all \(v \in \mathbb {V}^{\textbf{p},\textbf{q},\textbf{r}}_{h}\).

Proof

Integrating by parts and rearranging the face terms, the transport bilinear form satisfies

$$\begin{aligned} a_{\varvec{\mu }}^{E}(v, v) = {\Vert (\alpha + \beta )^{1/2} v\Vert }_{L_2(\varOmega )}^2 + \frac{1}{2} \sum _{\kappa _{\tiny \varOmega }\in \mathcal {T}_{\varOmega }} \Big ( \Vert v^+ - v^-\Vert _{\partial _{-}\kappa _{\tiny \varOmega }\backslash \partial \varOmega }^2 + \Vert v^+\Vert _{\partial \kappa _{\tiny \varOmega }\cap \partial \varOmega }^2 \Big ), \end{aligned}$$

as shown in [24]. Recalling that \(\beta (\textbf{x}, \varvec{\mu },E) = \int _{\mathbb {E}} \int _{\mathbb {S}} \theta (\textbf{x}, \varvec{\mu }\cdot \varvec{\eta }, E\rightarrow E^{\prime }) \,d\varvec{\eta }\,dE^{\prime }\) and \(\gamma (\textbf{x}, \varvec{\mu },E) = \int _{\mathbb {E}} \int _{\mathbb {S}} \theta (\textbf{x}, \varvec{\mu }\cdot \varvec{\eta }, E^{\prime }\rightarrow E) \,d\varvec{\eta }\,dE^{\prime }\), employing the Cauchy–Schwarz inequality implies that the scattering term may be bounded by

$$\begin{aligned} s(v, v)&= \int _{\mathbb {E}} \int _{\mathbb {S}} \int _{\mathbb {E}} \int _{\mathbb {S}} \int _{\varOmega } \theta (\textbf{x}, \varvec{\mu }\cdot \varvec{\eta }, E^{\prime }\rightarrow E) v(\textbf{x}, \varvec{\eta }, E^{\prime }) v(\textbf{x}, \varvec{\mu }, E) \,d\textbf{x}\,d\varvec{\eta }\,dE^{\prime }\,d\varvec{\mu }\,dE\\ {}&\le \Vert \beta ^{\nicefrac {1}{2}} v \Vert _{L_2(\mathcal {D})} \Vert \gamma ^{\nicefrac {1}{2}} v \Vert _{L_2(\mathcal {D})} \le \frac{1}{2} \Vert \beta ^{\nicefrac {1}{2}} v \Vert _{L_2(\mathcal {D})}^2 + \frac{1}{2} \Vert \gamma ^{\nicefrac {1}{2}} v \Vert _{L_2(\mathcal {D})}^2. \end{aligned}$$

The result then follows by combining these bounds with the definition of \(c\) in (2). \(\square \)

We now derive an inf-sup stability result in the streamline norm \({|\hspace{-0.85358pt}|\hspace{-0.85358pt}|{\cdot }|\hspace{-0.85358pt}|\hspace{-0.85358pt}|}_{s}\).

Theorem 6

(Inf-sup stability) Given that Assumptions 12, and 3 hold, then the DGFEM (3) is inf-sup stable in the streamline norm, i.e., there exists a constant \(\varLambda > 0\), independent of discretisation parameters, such that

$$\begin{aligned} \inf _{v \in \mathbb {V}^{\textbf{p},\textbf{q},\textbf{r}}_{h}\setminus \{0\}} \sup _{w \in \mathbb {V}^{\textbf{p},\textbf{q},\textbf{r}}_{h}\setminus \{0\}} \frac{b(v, w)}{{|\hspace{-0.85358pt}|\hspace{-0.85358pt}|{v}|\hspace{-0.85358pt}|\hspace{-0.85358pt}|}_{s} {|\hspace{-0.85358pt}|\hspace{-0.85358pt}|{w}|\hspace{-0.85358pt}|\hspace{-0.85358pt}|}_{s}} \ge \varLambda . \end{aligned}$$

Proof

The proof follows a standard form for inf-sup results, and is similar to the argument presented in [10] for a scalar advection problem, adapted to the Boltzmann setting. To this end, we construct a function \(w \in \mathbb {V}^{\textbf{p},\textbf{q},\textbf{r}}_{h}\) for each \(v \in \mathbb {V}^{\textbf{p},\textbf{q},\textbf{r}}_{h}\) such that \({|\hspace{-0.85358pt}|\hspace{-0.85358pt}|{w}|\hspace{-0.85358pt}|\hspace{-0.85358pt}|}_{s} \le \varLambda _1 {|\hspace{-0.85358pt}|\hspace{-0.85358pt}|{v}|\hspace{-0.85358pt}|\hspace{-0.85358pt}|}_{s}\) and \(b(v, w) \ge \varLambda _2 {|\hspace{-0.85358pt}|\hspace{-0.85358pt}|{v}|\hspace{-0.85358pt}|\hspace{-0.85358pt}|}_{s}^2\). The result then follows with \(\varLambda = \nicefrac {\varLambda _2}{\varLambda _1}\).

Let \(w(\textbf{x}, \varvec{\mu },E) = v(\textbf{x}, \varvec{\mu },E) + \delta v_s(\textbf{x}, \varvec{\mu },E)\) where \(\delta > 0\) is a constant which will be determined, depending only on the problem data, and \(v_s(\textbf{x}, \varvec{\mu },E)|_{\kappa _{\tiny \varOmega }} = \tau _{\kappa _{\tiny \varOmega }} \varvec{\mu }\cdot \nabla _{\textbf{x}} v(\textbf{x}, \varvec{\mu },E)\) on each spatial element \(\kappa _{\tiny \varOmega }\in \mathcal {T}_{\varOmega }\). To prove that there exists \(C>0\) such that \({|\hspace{-0.85358pt}|\hspace{-0.85358pt}|{w}|\hspace{-0.85358pt}|\hspace{-0.85358pt}|}_{s} \le C {|\hspace{-0.85358pt}|\hspace{-0.85358pt}|{v}|\hspace{-0.85358pt}|\hspace{-0.85358pt}|}_{s}\), we apply the triangle inequality to find

$$\begin{aligned} {|\hspace{-0.85358pt}|\hspace{-0.85358pt}|{w}|\hspace{-0.85358pt}|\hspace{-0.85358pt}|}_{s} \le {|\hspace{-0.85358pt}|\hspace{-0.85358pt}|{v}|\hspace{-0.85358pt}|\hspace{-0.85358pt}|}_{s} + \delta {|\hspace{-0.85358pt}|\hspace{-0.85358pt}|{v_s}|\hspace{-0.85358pt}|\hspace{-0.85358pt}|}_{s}, \end{aligned}$$

and bound each term of \({|\hspace{-0.85358pt}|\hspace{-0.85358pt}|{v_s}|\hspace{-0.85358pt}|\hspace{-0.85358pt}|}_{s}\) by \({|\hspace{-0.85358pt}|\hspace{-0.85358pt}|{v}|\hspace{-0.85358pt}|\hspace{-0.85358pt}|}_{s}\) individually. Observing that \({|\varvec{\mu }|} = 1\), upon application of the inverse inequality stated in Lemma 2, recalling the definition of \(\tau _{\kappa _{\tiny \varOmega }}\) and noting that \(h_{\kappa _{\tiny \varOmega }}^\bot \le h_{\kappa _{\tiny \varOmega }}\), we deduce that

$$\begin{aligned} \Vert \sqrt{c}v_s\Vert _{L_2(\mathcal {D})}^2&= \int _{\mathbb {E}} \int _{\mathbb {S}} \sum _{\kappa _{\tiny \varOmega }\in \mathcal {T}_{\varOmega }} \tau _{\kappa _{\tiny \varOmega }}^2 {\Vert \sqrt{c} \varvec{\mu }\cdot \nabla _{\textbf{x}} v\Vert }_{L_2(\kappa _{\tiny \varOmega })}^2 \,d\varvec{\mu }\,dE\\&\le \frac{C_\textrm{inv}^{\kappa _{\tiny \varOmega }}{\Vert c\Vert }_{L_{\infty }\!(\mathcal {D})}}{c_0} {\Vert \sqrt{c} v\Vert }_{L_2(\mathcal {D})}^2. \end{aligned}$$

Similarly, we have

$$\begin{aligned}&\int _{\mathbb {E}} \int _{\mathbb {S}} \sum _{\kappa _{\tiny \varOmega }\in \mathcal {T}_{\varOmega }} \tau _{\kappa _{\tiny \varOmega }} {\Vert \varvec{\mu }\cdot \nabla _{\textbf{x}} v_s\Vert }_{L_2(\kappa _{\tiny \varOmega })}^2 \,d\varvec{\mu }\,dE\\&\qquad \le C_\textrm{inv}^{\kappa _{\tiny \varOmega }}\int _{\mathbb {E}} \int _{\mathbb {S}} \sum _{\kappa _{\tiny \varOmega }\in \mathcal {T}_{\varOmega }} \tau _{\kappa _{\tiny \varOmega }} {\Vert \varvec{\mu }\cdot \nabla _{\textbf{x}} v\Vert }_{L_2(\kappa _{\tiny \varOmega })}^2 \,d\varvec{\mu }\,dE. \end{aligned}$$

We now consider the face terms arising in the definition of the streamline norm \({|\hspace{-0.85358pt}|\hspace{-0.85358pt}|{\cdot }|\hspace{-0.85358pt}|\hspace{-0.85358pt}|}_{s}\). Noting that \({|\varvec{\mu }\cdot \varvec{n}_{\kappa _{\tiny \varOmega }}|} \le 1\), applying the inverse inequality stated in Lemma 1 gives

$$\begin{aligned}&\frac{1}{2} \int _{\mathbb {E}} \int _{\mathbb {S}} \sum _{\kappa _{\tiny \varOmega }\in \mathcal {T}_{\varOmega }} \Big ( \Vert v_s^+ - v_s^-\Vert _{\partial _{-}\kappa _{\tiny \varOmega }\backslash \partial \varOmega }^2 + \Vert v_s^+\Vert _{\partial \kappa _{\tiny \varOmega }\cap \partial \varOmega }^2 \Big ) \,d\varvec{\mu }\,dE\\&\qquad \le \int _{\mathbb {E}} \int _{\mathbb {S}} \sum _{\kappa _{\tiny \varOmega }\in \mathcal {T}_{\varOmega }} \sum _{F\subset \partial \kappa _{\tiny \varOmega }} \Vert v_s^+\Vert _{L_2(F)}^2 \,d\varvec{\mu }\,dE\le C_\textrm{inv}^{F}C_F {|\hspace{-0.85358pt}|\hspace{-0.85358pt}|{v}|\hspace{-0.85358pt}|\hspace{-0.85358pt}|}_{s}^2. \end{aligned}$$

Since the terms resulting from these bounds are components of \({|\hspace{-0.85358pt}|\hspace{-0.85358pt}|{\cdot }|\hspace{-0.85358pt}|\hspace{-0.85358pt}|}_{s}\), it follows that

$$\begin{aligned} {|\hspace{-0.85358pt}|\hspace{-0.85358pt}|{w}|\hspace{-0.85358pt}|\hspace{-0.85358pt}|}_{s} \le \varLambda _1 {|\hspace{-0.85358pt}|\hspace{-0.85358pt}|{v}|\hspace{-0.85358pt}|\hspace{-0.85358pt}|}_{s} \quad \text { with }\quad \varLambda _1 = 1 + \delta \Big ( C_\textrm{inv}^{\kappa _{\tiny \varOmega }}\Big (1+\frac{{\Vert c\Vert }_{L_{\infty }(\mathcal {D})}}{c_0} \Big ) + C_\textrm{inv}^{F}C_F \Big )^{1/2}. \end{aligned}$$

We now show that \(b(v, w) \ge \varLambda _2 {|\hspace{-0.85358pt}|\hspace{-0.85358pt}|{v}|\hspace{-0.85358pt}|\hspace{-0.85358pt}|}_{s}^2\). By linearity and the coercivity bound stated in Theorem 5, we deduce that

$$\begin{aligned} b(v, w) = b(v, v) + \delta b(v, v_s) \ge {|\hspace{-0.85358pt}|\hspace{-0.85358pt}|{v}|\hspace{-0.85358pt}|\hspace{-0.85358pt}|}_\textrm{DG}^2 + \delta (a(v, v_s) - s(v, v_s)), \end{aligned}$$
(13)

and expanding the second term on the right-hand side of (13) gives

$$\begin{aligned} a(v, v_s)&= \int _{\mathbb {E}} \int _{\mathbb {S}} \sum _{\kappa _{\tiny \varOmega }\in \mathcal {T}_{\varOmega }} \tau _{\kappa _{\tiny \varOmega }} \Big ( {\Vert \varvec{\mu }\cdot \nabla _\textbf{x}v\Vert }_{L_2(\kappa _{\tiny \varOmega })}^2 + \int _{\kappa _{\tiny \varOmega }} (\alpha + \beta ) (\varvec{\mu }\cdot \nabla _{\textbf{x}} v) v \,d\textbf{x}\\&\quad -\int _{\partial _{-}\kappa _{\tiny \varOmega }\backslash \partial \varOmega } (\varvec{\mu }\cdot \varvec{n}_{\kappa _{\tiny \varOmega }} ) \lfloor v\rfloor \varvec{\mu }\cdot \nabla _{\textbf{x}} v^+ \,ds \\&\quad -\int _{\partial _{-}\kappa _{\tiny \varOmega }\cap \partial \varOmega } (\varvec{\mu }\cdot \varvec{n}_{\kappa _{\tiny \varOmega }}) v^+ \varvec{\mu }\cdot \nabla _{\textbf{x}} v^+ \,ds \Big ) \,d\varvec{\mu }\,dE\\&\equiv \textrm{I} + \textrm{II} + \textrm{III} + \textrm{IV}. \end{aligned}$$

Term \(\textrm{I}\) is already in the required form; employing Lemma 2, Term \(\textrm{II}\) may be bounded as follows:

$$\begin{aligned} |\textrm{II}|&\le {\Vert \alpha + \beta \Vert }_{L_{\infty }(\mathcal {D})} \int _{\mathbb {E}} \int _{\mathbb {S}} \sum _{\kappa _{\tiny \varOmega }\in \mathcal {T}_{\varOmega }} \tau _{\kappa _{\tiny \varOmega }} \Vert \varvec{\mu }\cdot \nabla _\textbf{x}v\Vert _{L_2(\kappa _{\tiny \varOmega })} \Vert v\Vert _{L_2(\kappa _{\tiny \varOmega })} \,d\varvec{\mu }\,dE\\&\le \left( C_\textrm{inv}^{\kappa _{\tiny \varOmega }}\right) ^{\nicefrac {1}{2}}\frac{{\Vert \alpha + \beta \Vert }_{L_{\infty }(\mathcal {D})}}{c_0} \Vert \sqrt{c} v \Vert ^2_{L_2(\mathcal {D})} . \end{aligned}$$

We now consider the face terms present in terms \(\textrm{III}\) and \(\textrm{IV}\); employing the inverse inequality in Lemma 1 together with Young’s inequality, we deduce that

$$\begin{aligned} |\textrm{III}+\textrm{IV}|&\le \int _{\mathbb {E}} \int _{\mathbb {S}} \! \sum _{\kappa _{\tiny \varOmega }\in \mathcal {T}_{\varOmega }}\!\!\! \big ( C_F^2 C_\textrm{inv}^{F}\big (\Vert v^+ \!\!- \! v^-\Vert _{\partial _{-}\kappa _{\tiny \varOmega }\backslash \partial \varOmega }^2 + \Vert v^+\Vert _{\partial \kappa _{\tiny \varOmega }\cap \partial \varOmega }^2 \big ) \\&\qquad + \frac{\tau _{\kappa _{\tiny \varOmega }}}{4} \Vert \varvec{\mu }\cdot \nabla _{\textbf{x}} v\Vert _{L_2(\varOmega )}^2 \big ) \,d\varvec{\mu }\,dE. \end{aligned}$$

Finally, we bound the scattering term; recalling the definition of \(\beta \) and \(\gamma \), employing the Cauchy–Schwarz inequality and Lemma 2 gives

$$\begin{aligned} s(v, v_s)&= \sum _{\kappa _{\tiny \varOmega }\in \mathcal {T}_{\varOmega }} \tau _{\kappa _{\tiny \varOmega }} \\&\quad \times \int _{\kappa _{\tiny \varOmega }} \int _{\mathbb {E}} \int _{\mathbb {S}} \int _{\mathbb {E}} \int _{\mathbb {S}} \theta (\textbf{x}, \varvec{\mu }\cdot \varvec{\eta }, E^{\prime }\rightarrow E) v(\textbf{x}, \varvec{\eta },E^{\prime }) \varvec{\mu }\cdot \nabla _\textbf{x}v (\textbf{x}, \varvec{\mu },E) \,d\varvec{\eta }\,dE^{\prime }\,d\varvec{\mu }\,dE\,d\textbf{x}\\ {}&\le \Big ( \int _{\varOmega } \int _{\mathbb {E}} \int _{\mathbb {S}} \beta v^2 \,d\varvec{\mu }\,dE\,d\textbf{x}\Big )^{1/2} \Big ( \sum _{\kappa _{\tiny \varOmega }\in \mathcal {T}_{\varOmega }} \tau _{\kappa _{\tiny \varOmega }}^2 \int _{\kappa _{\tiny \varOmega }} \int _{\mathbb {E}} \int _{\mathbb {S}} \gamma (\varvec{\mu }\cdot \nabla _\textbf{x}v)^2 \,d\varvec{\mu }\,dE\,d\textbf{x}\Big )^{1/2} \\ {}&\le (C_\textrm{inv}^{\kappa _{\tiny \varOmega }})^{\nicefrac {1}{2}} \frac{{\Vert \beta \Vert }^{\nicefrac {1}{2}}_{L_\infty (\mathcal {D})} {\Vert \gamma \Vert }^{\nicefrac {1}{2}}_{L_\infty (\mathcal {D})}}{c_0} {\Vert \sqrt{c} v\Vert }_{L_2(\mathcal {D})}^2. \end{aligned}$$

Combining the individual estimates above, we deduce that

$$\begin{aligned} b(v, w)&\ge \int _{\mathbb {E}} \int _{\mathbb {S}} \sum _{\kappa _{\tiny \varOmega }\in \mathcal {T}_{\varOmega }} \Big ( C_1 \Vert \sqrt{c} v \Vert ^2_{L_2(\kappa _{\tiny \varOmega })} + C_2 \Big ( \Vert v^+ - v^-\Vert _{\partial _{-}\kappa _{\tiny \varOmega }\backslash \partial \varOmega }^2 + \Vert v^+\Vert _{\partial \kappa _{\tiny \varOmega }\cap \partial \varOmega }^2 \Big ) \\&\quad + \frac{3\delta }{4} \tau _{\kappa _{\tiny \varOmega }} {\Vert \varvec{\mu }\cdot \nabla _\textbf{x}v\Vert }_{L_2(\kappa _{\tiny \varOmega })}^2 \Big ) \,d\varvec{\mu }\,dE. \end{aligned}$$

where

$$\begin{aligned} C_1 = 1 - \delta \left( C_\textrm{inv}^{\kappa _{\tiny \varOmega }}\right) ^{\nicefrac {1}{2}}\frac{{\Vert \alpha + \beta \Vert }_{L_{\infty }(\mathcal {D})}}{c_0} - \delta \left( C_\textrm{inv}^{\kappa _{\tiny \varOmega }}\right) ^{\nicefrac {1}{2}} \frac{{\Vert \beta \Vert }^{\nicefrac {1}{2}}_{L_\infty (\mathcal {D})} {\Vert \gamma \Vert }^{\nicefrac {1}{2}}_{L_\infty (\mathcal {D})}}{c_0}, \end{aligned}$$

and \( C_2 = \frac{1}{2} - \delta C_F^2 C_\textrm{inv}^{F}\). Setting \( \varLambda _2 = \min \Big \{ \frac{3\delta }{4}, C_1, C_2 \Big \} \) which is positive for

$$\begin{aligned} 0< \delta < \min \left\{ \frac{c_0}{ \left( C_\textrm{inv}^{\kappa _{\tiny \varOmega }}\right) ^{\nicefrac {1}{2}} \big ( {\Vert \alpha + \beta \Vert }_{L_\infty (\mathcal {D})} + {\Vert \beta \Vert }^{\nicefrac {1}{2}}_{L_\infty (\mathcal {D})} {\Vert \gamma \Vert }^{\nicefrac {1}{2}}_{L_\infty (\mathcal {D})} \big ) } , \frac{1}{2 C_F^2 C_\textrm{inv}^{F}} \right\} , \end{aligned}$$

we conclude that \( b(v, w) \ge \varLambda _2 {|\hspace{-0.85358pt}|\hspace{-0.85358pt}|{v}|\hspace{-0.85358pt}|\hspace{-0.85358pt}|}_{s}^2 \) and the result follows. \(\square \)

Finally, we state the main result of this paper in the following theorem.

Theorem 7

(Convergence in the streamline norm) Given the mesh \(\mathcal {T}_{}\) defined over the space-angle-energy domain \(\mathcal {D}\), we assume that the spatial polytopic mesh \(\mathcal {T}_{\varOmega }\) satisfies Assumptions 1, 2, and 3. Let \(u_h \in \mathbb {V}^{\textbf{p},\textbf{q},\textbf{r}}_{h}\) denote the DGFEM approximation satisfying (3), let \(u\in H^1(\mathcal {D})\) denote the solution of the problem (1) and suppose that \(u|_{\kappa } \in H^{l_\kappa }(\kappa )\cup H^1(\kappa _{\tiny \varOmega };H^{l_\kappa }(\kappa _{\tiny \mathbb {S}}\times \kappa _g))\), \(l_\kappa >1\). Then it follows that

$$\begin{aligned} {|\hspace{-0.85358pt}|\hspace{-0.85358pt}|{u-u_h}|\hspace{-0.85358pt}|\hspace{-0.85358pt}|}_{s}^2&\le ~C \sum _{\kappa \in \mathcal {T}_{}} \left( \frac{h_{\kappa _{\tiny \varOmega }}^{2s_{\kappa _{\tiny \varOmega }}}}{p_{\kappa _{\tiny \varOmega }}^{2l_{\kappa }}} \left( {{\mathcal {L}}}_\kappa (\alpha ,\beta ,\gamma ) + \frac{1}{h_{\kappa _{\tiny \varOmega }}^\bot }(1+p_{\kappa _{\tiny \varOmega }}^2) + \frac{h_{\kappa _{\tiny \varOmega }}^\bot }{h^2_{\kappa _{\tiny \varOmega }}} \right) {\Vert u\Vert }_{H^{l_\kappa }(\kappa )}^2 \right. \\&\left. \quad +\left( \frac{h_{\kappa _{\tiny \mathbb {S}}}^{2s_{\kappa _{\tiny \mathbb {S}}}}}{q_{\kappa _{\tiny \mathbb {S}}}^{2l_{\kappa }}} +\frac{h_{\kappa _g}^{2s_{\kappa _g}}}{r_{\kappa _g}^{2l_{\kappa }}} \right) \left( \left( {{\mathcal {L}}}_\kappa (\alpha ,\beta ,\gamma ) + \frac{1}{h_{\kappa _{\tiny \varOmega }}^\bot } \right) {\Vert u\Vert }_{H^{l_\kappa }(\kappa )}^2 \right. \right. \\&\left. \left. \quad +\left( \frac{h_{\kappa _{\tiny \varOmega }}^2}{h_{\kappa _{\tiny \varOmega }}^\bot } + \frac{h_{\kappa _{\tiny \varOmega }}^\bot }{p_{\kappa _{\tiny \varOmega }}^2} \right) {\Vert u\Vert }_{H^1(\kappa _{\tiny \varOmega }; H^{l_\kappa }(\kappa _{\tiny \mathbb {S}}\times \kappa _g))}^2 \right) \right) , \end{aligned}$$

where

$$\begin{aligned} {{\mathcal {L}}}_\kappa (\alpha ,\beta ,\gamma ) = {\Vert c\Vert }_{L_\infty (\kappa )} + ({\Vert \alpha +\beta \Vert }_{L_\infty (\kappa )}^2 +{\Vert \beta \Vert }_{L_\infty (\kappa )}{\Vert \gamma \Vert }_{L_\infty (\kappa )})c_0^{-1}, \end{aligned}$$

\(s_{\kappa _{\tiny \varOmega }} = \min (p_{\kappa _{\tiny \varOmega }}+1,l_{\kappa })\), \(s_{\kappa _{\tiny \mathbb {S}}} = \min (q_{\kappa _{\tiny \mathbb {S}}}+1,l_{\kappa })\), \(s_{\kappa _g} = \min (r_{\kappa _g}+1,l_{\kappa })\) and C is a positive constant which is independent of the discretization parameters.

Proof

The triangle inequality implies that

$$\begin{aligned} {|\hspace{-0.85358pt}|\hspace{-0.85358pt}|{u - u_h}|\hspace{-0.85358pt}|\hspace{-0.85358pt}|}_{s} \le {|\hspace{-0.85358pt}|\hspace{-0.85358pt}|{u - \varPi u}|\hspace{-0.85358pt}|\hspace{-0.85358pt}|}_{s} + {|\hspace{-0.85358pt}|\hspace{-0.85358pt}|{\varPi u - u_h}|\hspace{-0.85358pt}|\hspace{-0.85358pt}|}_{s}, \end{aligned}$$
(14)

where \(\varPi \) denotes the projection operator defined in Sect. 4. Exploiting the approximation results derived in Lemma 6, the first term on the right-hand side of (14) can be bounded as follows:

$$\begin{aligned} {|\hspace{-0.85358pt}|\hspace{-0.85358pt}|{u - \varPi u}|\hspace{-0.85358pt}|\hspace{-0.85358pt}|}_{s}^2 \le&~ C \sum _{\kappa \in \mathcal {T}_{}} M_{\kappa } {\Vert u\Vert }_{H^{l_\kappa }(\kappa )}^2 +T_{\kappa } \left( \frac{h_{\kappa _{\tiny \varOmega }}^2}{h_{\kappa _{\tiny \varOmega }}^\bot } + \frac{h_{\kappa _{\tiny \varOmega }}^\bot }{p_{\kappa _{\tiny \varOmega }}^2} \right) {\Vert u\Vert }_{H^1(\kappa _{\tiny \varOmega }; H^{l_\kappa }(\kappa _{\tiny \mathbb {S}}\times \kappa _g))}^2 , \end{aligned}$$
(15)

where

$$\begin{aligned} T_{\kappa } = \frac{h_{\kappa _{\tiny \mathbb {S}}}^{2s_{\kappa _{\tiny \mathbb {S}}}}}{q_{\kappa _{\tiny \mathbb {S}}}^{2l_{\kappa }}} +\frac{h_{\kappa _g}^{2s_{\kappa _g}}}{r_{\kappa _g}^{2l_{\kappa }}}, \end{aligned}$$

and

$$\begin{aligned} M_{\kappa } = \frac{h_{\kappa _{\tiny \varOmega }}^{2s_{\kappa _{\tiny \varOmega }}}}{p_{\kappa _{\tiny \varOmega }}^{2l_{\kappa }}} \left( {\Vert c\Vert }_{L_\infty (\kappa )} + \frac{1}{h_{\kappa _{\tiny \varOmega }}^\bot }(1+p_{\kappa _{\tiny \varOmega }}^2) + \frac{h_{\kappa _{\tiny \varOmega }}^\bot }{h_{\kappa _{\tiny \varOmega }}^2} \right) + T_{\kappa } \left( {\Vert c\Vert }_{L_\infty (\kappa )} + \frac{1}{h_{\kappa _{\tiny \varOmega }}^\bot } \right) . \end{aligned}$$

Recalling the inf-sup bound derived in Theorem 6 and employing Galerkin orthogonality, the second term on the right-hand side of (14) can be bounded by

$$\begin{aligned} {|\hspace{-0.85358pt}|\hspace{-0.85358pt}|{\varPi u - u_h}|\hspace{-0.85358pt}|\hspace{-0.85358pt}|}_{s} \le \frac{1}{\varLambda } \sup _{w \in \mathbb {V}^{\textbf{p},\textbf{q},\textbf{r}}_{h}\setminus \{0\}} \frac{b(u -\varPi u, w)}{{|\hspace{-0.85358pt}|\hspace{-0.85358pt}|{w}|\hspace{-0.85358pt}|\hspace{-0.85358pt}|}_{s}}. \end{aligned}$$
(16)

We proceed by estimating the individual terms arising in \(b(u - \varPi u, w)\). Writing \(u_\varPi = u -\varPi u\) and integrating by parts elementwise gives

$$\begin{aligned}&a_{\varvec{\mu }}^{E}(u_\varPi , w) \\&\quad = \sum _{\kappa _{\tiny \varOmega }\in \mathcal {T}_{\varOmega }} \left( \int _{\kappa _{\tiny \varOmega }} ((\alpha + \beta ) u_\varPi w - u_\varPi \varvec{\mu }\cdot \nabla _\textbf{x}w) \,d\textbf{x}\right. \\&\qquad \left. +\int _{\partial _{-}\kappa _{\tiny \varOmega }\backslash \partial \varOmega } (\varvec{\mu }\cdot \varvec{n}_{\kappa _{\tiny \varOmega }}) \lfloor w\rfloor u_\varPi ^- \,ds -\int _{\partial _{+}\kappa _{\tiny \varOmega }\cap \partial \varOmega } (\varvec{\mu }\cdot \varvec{n}_{\kappa _{\tiny \varOmega }}) u_\varPi ^+ w^+ \,ds \right) . \end{aligned}$$

The Cauchy–Schwarz inequality therefore implies that

$$\begin{aligned}&a_{\varvec{\mu }}^{E}(u_\varPi , w) \\&\quad \le \sum _{\kappa _{\tiny \varOmega }\in \mathcal {T}_{\varOmega }} \bigg ( \frac{{\Vert \alpha + \beta \Vert }_{L_\infty (\kappa _{\tiny \varOmega })}}{\sqrt{c_0}} {\Vert u_\varPi \Vert }_{L_2(\kappa _{\tiny \varOmega })} {\Vert \sqrt{c} w\Vert }_{L_2(\kappa _{\tiny \varOmega })} + {\Vert u_\varPi ^+\Vert }_{\partial _{+}\kappa _{\tiny \varOmega }\cap \partial \varOmega } {\Vert w^+\Vert }_{\partial _{+}\kappa _{\tiny \varOmega }\cap \partial \varOmega } \\&\qquad +{\Vert \tau _{\kappa _{\tiny \varOmega }}^{-\nicefrac {1}{2}} u_\varPi \Vert }_{L_2(\kappa _{\tiny \varOmega })} {\Vert \tau _{\kappa _{\tiny \varOmega }}^{\nicefrac {1}{2}} \varvec{\mu }\cdot \nabla _\textbf{x}w\Vert }_{L_2(\kappa _{\tiny \varOmega })} + {\Vert u_\varPi ^-\Vert }_{\partial _{-}\kappa _{\tiny \varOmega }\backslash \partial \varOmega } {\Vert w^+-w^-\Vert }_{\partial _{-}\kappa _{\tiny \varOmega }\backslash \partial \varOmega } \bigg ) \end{aligned}$$

and, applying the Cauchy–Schwarz inequality once again gives

$$\begin{aligned} a_{\varvec{\mu }}^{E}(u_\varPi , w)&\le \bigg ( \! \sum _{\kappa _{\tiny \varOmega }\in \mathcal {T}_{\varOmega }} \!\!\! \bigg ( \!\! \bigg ( \frac{{\Vert \alpha + \beta \Vert }_{L_\infty (\kappa _{\tiny \varOmega })}^2}{c_0} + \frac{1}{\tau _{\kappa _{\tiny \varOmega }}} \bigg ) {\Vert u_\varPi \Vert }^2_{L_2(\kappa _{\tiny \varOmega })} \\&\quad \! + \! 2 {\Vert u_\varPi ^-\Vert }^2_{\partial _{-}\kappa _{\tiny \varOmega }\backslash \partial \varOmega } \! + \! 2 {\Vert u_\varPi ^+\Vert }^2_{\partial _{+}\kappa _{\tiny \varOmega }\cap \partial \varOmega } \bigg ) \!\! \bigg )^{\frac{1}{2}} \\&\quad \times \bigg ( \sum _{\kappa _{\tiny \varOmega }\in \mathcal {T}_{\varOmega }} \Big ( {\Vert \sqrt{c} w\Vert }_{L_2(\kappa _{\tiny \varOmega })}^2 + \tau _{\kappa _{\tiny \varOmega }}{\Vert \varvec{\mu }\cdot \nabla _\textbf{x}w\Vert }^2_{L_2(\kappa _{\tiny \varOmega })} \\&\quad + \frac{1}{2}{\Vert w^+-w^-\Vert }_{\partial _{-}\kappa _{\tiny \varOmega }\backslash \partial \varOmega }^2 + \frac{1}{2}{\Vert w^+\Vert }_{\partial _{+}\kappa _{\tiny \varOmega }\cap \partial \varOmega }^2 \Big ) \!\! \bigg )^{\frac{1}{2}}. \end{aligned}$$

Hence, integrating over energy and angle, and applying the Cauchy–Schwarz inequality and Lemma 6, we deduce that

$$\begin{aligned} a(u_\varPi , w)&\le ~C \left( \sum _{\kappa \in \mathcal {T}_{}} \left( \frac{h_{\kappa _{\tiny \varOmega }}^{2s_{\kappa _{\tiny \varOmega }}}}{p_{\kappa _{\tiny \varOmega }}^{2l_{\kappa }}} \left( \frac{{\Vert \alpha + \beta \Vert }_{L_\infty (\kappa )}^2}{c_0} + \frac{1}{h_{\kappa _{\tiny \varOmega }}^\bot } (1+p_{\kappa _{\tiny \varOmega }}^2) \right) {\Vert u\Vert }_{H^{l_\kappa }(\kappa )}^2 \right. \right. \\&\quad \left. +\left( \frac{h_{\kappa _{\tiny \mathbb {S}}}^{2s_{\kappa _{\tiny \mathbb {S}}}}}{q_{\kappa _{\tiny \mathbb {S}}}^{2l_{\kappa }}} +\frac{h_{\kappa _g}^{2s_{\kappa _g}}}{r_{\kappa _g}^{2l_{\kappa }}} \right) \left( \left( \frac{{\Vert \alpha + \beta \Vert }_{L_\infty (\kappa )}^2}{c_0} + \frac{1}{h_{\kappa _{\tiny \varOmega }}^\bot } \right) {\Vert u\Vert }_{H^{l_\kappa }(\kappa )}^2 \right. \right. \\&\left. \left. \left. \quad + \frac{h_{\kappa _{\tiny \varOmega }}^2}{h_{\kappa _{\tiny \varOmega }}^\bot } {\Vert u\Vert }_{H^1(\kappa _{\tiny \varOmega }; H^{l_\kappa }(\kappa _{\tiny \mathbb {S}}\times \kappa _g))}^2 \right) \right) \right) ^{\nicefrac {1}{2}} {|\hspace{-0.85358pt}|\hspace{-0.85358pt}|{w}|\hspace{-0.85358pt}|\hspace{-0.85358pt}|}_{s}. \end{aligned}$$

Finally, we consider the scattering term; applying the Cauchy–Schwarz inequality, recalling the definition of \(\beta \) and \(\gamma \), and using Lemma 6 gives

$$\begin{aligned}&s(u_\varPi , w)\\&\quad \le C \left( \sum _{\kappa \in \mathcal {T}_{}} \frac{{\Vert \beta \Vert }_{L_\infty (\kappa )} {\Vert \gamma \Vert }_{L_\infty (\kappa )}}{c_0} \left( \frac{h_{\kappa _{\tiny \varOmega }}^{2s_{\kappa _{\tiny \varOmega }}}}{p_{\kappa _{\tiny \varOmega }}^{2l_{\kappa }}} +\frac{h_{\kappa _{\tiny \mathbb {S}}}^{2s_{\kappa _{\tiny \mathbb {S}}}}}{q_{\kappa _{\tiny \mathbb {S}}}^{2l_{\kappa }}} +\frac{h_{\kappa _g}^{2s_{\kappa _g}}}{r_{\kappa _g}^{2l_{\kappa }}} \right) {\Vert u\Vert }_{H^{l_\kappa }(\kappa )}^2 \right) ^{\nicefrac {1}{2}} {\Vert \sqrt{c} w\Vert }_{L_2(\mathcal {D})}. \end{aligned}$$

The result then follows by inserting the above bounds into (16) and using (15). \(\square \)

Remark 5

(p-suboptimality of Theorem 7) Let \(h_{\kappa } = {\text {diam}}(\kappa _{\tiny \varOmega })\), \(\kappa \in \mathcal {T}_{}\), and \(h = \max _{\kappa \in \mathcal {T}_{}}h_\kappa \), and suppose we have a uniform polynomial degree for all elements, so \(p_{\kappa _{\tiny \varOmega }} = p\) for all \(\kappa _{\tiny \varOmega }\in \mathcal {T}_{\varOmega }\), \(q_{\kappa _{\tiny \mathbb {S}}} = p\) for all \(\kappa _{\tiny \mathbb {S}}\in \mathcal {T}_{\mathbb {S}}\), \(r_{\kappa _g} = p\) for all \(\kappa _g\in \mathcal {T}_{\mathbb {E}}\). Assume that we also have a uniform smoothness degree \(s_{\kappa }=s\) for all \(\kappa \in \mathcal {T}_{}\), \(s=\min (p+1,l)\), \(l\ge 1\), and that the diameter of the spatial faces of each element \(\kappa _{\tiny \varOmega }\in \mathcal {T}_{\varOmega }\) is of comparable size to the diameter of the corresponding element, i.e., so that \(h_{\kappa _{\tiny \varOmega }}^\bot \sim h_{\kappa _{\tiny \varOmega }}\). Then, the a priori bound stated in Theorem 7 yields

$$\begin{aligned} {|\hspace{-0.85358pt}|\hspace{-0.85358pt}|{u-u_h}|\hspace{-0.85358pt}|\hspace{-0.85358pt}|}_{s} \sim {{{\mathcal {O}}}} \left( \frac{h^{s-\nicefrac {1}{2}}}{p^{l-1}}\right) , \end{aligned}$$

as \(h \rightarrow 0\) and \(p\rightarrow \infty \). This bound is optimal with respect to the meshsize h, but suboptimal in the polynomial degree p by half an order, cf. the corresponding result derived in [10] for the DGFEM approximation of the linear transport problem on (spatial) polytopic meshes.

6 Efficient Implementation as a Multigroup Discrete Ordinates Scheme

The numerical method (3) introduced above can be implemented in the framework of a multigroup discrete ordinates scheme. Although at first sight it appears that the method fully couples the space, angle and energy unknowns, we show that, through a judicious choice of basis functions and element quadrature schemes, it is possible to evaluate the DGFEM solution by simply computing a sequence of linear transport problems in the d spatial variables. More precisely, we select (tensor-product) Gauss-Legendre quadrature points for angle and energy. This allows, on the reference element, exact integration of the bilinear form \(a(\cdot ,\cdot )\) generated by the (tensor product) polynomial basis elements which satisfy the Lagrangian property at the quadrature points. The Lagrangian property of the basis functions allows us to eliminate coupling between quadrature points, while retaining the same order of convergence of the underlying scheme. To this end, we first consider the multigroup approximation in energy before outlining the angular implementation. For further details, we refer to [47].

6.1 Multigroup Implementation in Energy

We first show how the energy dependence of the problem may be decoupled. If we had perfect knowledge of the function

$$\begin{aligned} u^+(\textbf{x}, \varvec{\mu }, E) = {\left\{ \begin{array}{ll} u(\textbf{x}, \varvec{\mu }, E) \text { for } E> {\hat{E}}, \\ 0 \text { otherwise,} \end{array}\right. } \end{aligned}$$

for some \({\hat{E}} > 0\), then the assumption that the scattering kernel satisfies \(\theta (\textbf{x}, \varvec{\eta }\cdot \varvec{\mu }, E^{\prime }\rightarrow E) = 0\) for \(E^{\prime }< E\), would imply that \({\hat{u}}(\textbf{x}, \varvec{\mu }) \equiv u(\textbf{x}, \varvec{\mu }, {\hat{E}})\) satisfies the monoenergetic radiation transport problem: find \({\hat{u}}: \varOmega \times \mathbb {S}\rightarrow \mathbb {R}\) such that

$$\begin{aligned} \begin{aligned} \varvec{\mu }\cdot \nabla _{{\textbf {x}}} {\hat{u}}({\textbf {x}}, \varvec{\mu }) + (\alpha ({\textbf {x}}, \varvec{\mu }, {\hat{E}}) + \beta ({\textbf {x}}, \varvec{\mu }, {\hat{E}})) {\hat{u}}({\textbf {x}}, \varvec{\mu })&= \mathcal {S}[u^+]({\textbf {x}}, \varvec{\mu }, {\hat{E}}) \\ {}&\quad + f({\textbf {x}}, \varvec{\mu }, {\hat{E}}) \text{ in } \mathcal {D}, \nonumber \\ {\hat{u}}({\textbf {x}}, \varvec{\mu })&= g({\textbf {x}}, \varvec{\mu }, {\hat{E}}) \text{ on } \varGamma _{{\text{ in }}}. \end{aligned} \end{aligned}$$

This is the observation underpinning the standard multigroup discretisation: in the discrete setting, we first solve for the fluence in the highest energy group (corresponding to \(g = 1\)) and then subsequently for each lower energy group in turn. Recalling that \(\kappa _g=(E_g,E_{g-1})\) denotes the gth energy group, \(1\le g\le N_\mathbb {E}\), we therefore introduce the following family of energy cutoff functions:

$$\begin{aligned} u_g^+(\textbf{x}, \varvec{\mu }, E) = {\left\{ \begin{array}{ll} u_h(\textbf{x}, \varvec{\mu }, E) \text { for } E\ge E_{g-1}, \\ 0 \text { otherwise,} \end{array}\right. } \end{aligned}$$

which represents the component of the discrete fluence which may be considered as pre-computed ‘data’ when solving for the fluence in group \(\kappa _g\), and focus on solving the problem in a single energy group \(\kappa _g\), \(1\le g\le N_\mathbb {E}\).

We expand \(u_h\) in group \(\kappa _g\) in terms of energy basis functions as

$$\begin{aligned} u_h(\textbf{x}, \varvec{\mu }, E)|_{\kappa _g} \equiv u_g(\textbf{x}, \varvec{\mu }, E) = \sum _{j=1}^{r_{\kappa _g} + 1} u_g^j(\textbf{x}, \varvec{\mu }) \varphi _g^j(E), \end{aligned}$$

where \(u_g^j \in \mathbb {V}^{\textbf{p},\textbf{q}}_{\varOmega ,\mathbb {S}}= \mathbb {V}^\textbf{p}_{\varOmega }\otimes \mathbb {V}^\textbf{q}_{\mathbb {S}}\), \(j=1,2,\ldots ,r_{\kappa _g} + 1\), and \(\{\varphi _g^j\}_{j=1}^{r_{\kappa _g}+1}\) forms a basis of \({\mathbb {P}}_{r_{\kappa _g}}(\kappa _g)\) (which is only supported on \(\kappa _g\)). Selecting \(v_h=v_g \varphi _g^i \in \mathbb {V}^{\textbf{p},\textbf{q},\textbf{r}}_{h}\), with \(v_g\in \mathbb {V}^{\textbf{p},\textbf{q}}_{\varOmega ,\mathbb {S}}\), \(i = 1,2, \dots , r_{\kappa _g}+1\), the fluence in group \(\kappa _g\) may then be computed by solving: find \(\left\{ u_g^i \right\} _{i=1}^{r_{\kappa _g}+1} \in \mathbb {V}^{\textbf{p},\textbf{q}}_{\varOmega ,\mathbb {S}}\) such that

$$\begin{aligned} \sum _{j=1}^{r_{\kappa _g}+1} \left( \int _{\kappa _g} \int _{\mathbb {S}} a_{\varvec{\mu }}^{E}(u_g^j, v_g) \varphi _g^j \varphi _g^i \,d\varvec{\mu }\,dE- s(u_g^j \varphi _g^j, v_g \varphi _g^i) \right) = s(u_g^+, v_g \varphi _g^i) + \ell (v_g \varphi _g^i) \end{aligned}$$
(17)

for all \(v_g \in \mathbb {V}^{\textbf{p},\textbf{q}}_{\varOmega ,\mathbb {S}}\) and \(i = 1,2, \dots , r_{\kappa _g}+1\).

Currently, this takes the form of a fully coupled system of monoenergetic Boltzmann transport problems for the \(r_{\kappa _g}+1\) unknowns within the energy group \(\kappa _g\). To simplify this structure, let \(\{E_g^q\}_{q=1}^{r_{\kappa _g}+1}\subset \kappa _g\) denote the \(r_{\kappa _g}+1\) Gauss-Legendre quadrature points on \(\kappa _g\) with associated weights \(\{\omega _g^q\}_{q=1}^{r_{\kappa _g}+1}\subset {\mathbb {R}}_{\ge 0}\). We then select the basis functions \(\{\varphi _g^i\}_{i=1}^{r_{\kappa _g}+1}\) to be the unique set of polynomials which satisfy the Lagrangian property \(\varphi _g^i(E_g^j) = \delta _{ij}\), \(i,j=1,2, \ldots , r_{\kappa _g}+1\), where \(\delta _{ij}\) denotes the Kronecker delta. This quadrature is exact for polynomials of degree \(2 r_{\kappa _g}+1\), and so we use it to evaluate the (energy) integrals present in the bilinear form \(a_{\varvec{\mu }}^{E}(\cdot ,\cdot )\), meaning we replace (17) with: find \(\left\{ u_g^j \right\} _{j=1}^{r_{\kappa _g}+1} \in \mathbb {V}^{\textbf{p},\textbf{q}}_{\varOmega ,\mathbb {S}}\) such that

$$\begin{aligned} \omega _g^i \int _{\mathbb {S}} a_{\varvec{\mu }}^{E_g^i}(u_g^i, v_g) \,d\varvec{\mu }- \sum _{j=1}^{r_{\kappa _g}+1} s(u_g^j \varphi _g^j, v_g \varphi _g^i) = s(u_g^+, v_g \varphi _g^i) + \ell (v_g \varphi _g^i) \end{aligned}$$
(18)

for all \(v_g \in \mathbb {V}^{\textbf{p},\textbf{q}}_{\varOmega ,\mathbb {S}}\) and \(i = 1,2, \dots , r_{\kappa _g}+1\). Here, \(a_{\varvec{\mu }}^{E_g^i}(\cdot ,\cdot )\) is defined analogously to \(a_{\varvec{\mu }}^{E}(\cdot ,\cdot )\) with the coefficient data \(\alpha \) and \(\beta \) evaluated at the energy quadrature point \(E_g^i\), \(i = 1,2, \dots , r_{\kappa _g}+1\). Furthermore, with a slight abuse of notation we have written \(\left\{ u_g^i \right\} _{i=1}^{r_{\kappa _g}+1}\) to also denote the solution of (18), though we stress that (18) is an approximation of (17).Footnote 1

We have not applied the above quadrature scheme in energy to the forcing and scattering terms, since in applications it is usually preferable to treat these terms separately. Instead, we express the scattering term in an alternative form. For \(w, v\in \mathbb {V}^{\textbf{p},\textbf{q}}_{\varOmega ,\mathbb {S}}\), we define

$$\begin{aligned} s_{g^{\prime },g}^{j,i}(w,v) = \int _{\mathbb {S}} \int _{\varOmega } \int _{\mathbb {S}} \varTheta _{g^{\prime }, g}^{j,i}(\textbf{x}, \varvec{\eta }\cdot \varvec{\mu }) w(\textbf{x},\varvec{\eta }) v(\textbf{x},\varvec{\mu }) \,d\varvec{\eta }\,d\textbf{x}\,d\varvec{\mu }, \end{aligned}$$

where

$$\begin{aligned} \varTheta _{g^{\prime },g}^{j,i}(\textbf{x}, \varvec{\eta }\cdot \varvec{\mu }) = \int _{\kappa _g} \int _{\kappa _{g^{\prime }}} \theta (\textbf{x}, \varvec{\eta }\cdot \varvec{\mu }, E^{\prime }\rightarrow E) \varphi _{g}^i(E) \varphi _{g^{\prime }}^j(E^{\prime }) \,dE^{\prime }\,dE, \end{aligned}$$

for \(g, g^{\prime } = 1,2,\ldots ,N_E\), \(i=1,2,\ldots ,r_{\kappa _g}+1\), and \(j=1,2,\ldots ,r_{\kappa _{g^\prime }}+1\). With this notation (18) may be rewritten in the following equivalent form: find \(\left\{ u_g^j \right\} _{j=1}^{r_{\kappa _g}+1} \in \mathbb {V}^{\textbf{p},\textbf{q}}_{\varOmega ,\mathbb {S}}\) satisfying the discrete monoenergetic radiation transport problem

$$\begin{aligned} \omega _g^i \int _{\mathbb {S}} a_{\varvec{\mu }}^{E_g^i}(u_g^j, v_g) \,d\varvec{\mu }- \sum _{j=1}^{r_{\kappa _g}+1} s_{g,g}^{j,i}(u_g^j,v_g) = \sum _{g^\prime =1}^{g-1} \sum _{j=1}^{r_{\kappa _{g^\prime }}+1} s_{g^{\prime },g}^{j,i}(u_{g^\prime }^j,v_g) + \ell (v_g \varphi _g^i) \end{aligned}$$
(19)

for all \(v_g \in \mathbb {V}^{\textbf{p},\textbf{q}}_{\varOmega ,\mathbb {S}}\) and \(i = 1,2, \dots , r_{\kappa _g}+1\). This yields a system of \(r_{\kappa _g}+1\) monoenergetic radiation transport problems to solve within each energy group, which are only coupled through the scattering operator. Moreover, the assumed structure of the scattering kernel implies that the problems within a given energy group depend only on the solutions within the same group and from higher energy groups.

6.2 Discrete Ordinates Implementation in Angle

We now focus on solving the monoenergetic radiation transport problem (19) for a single energy group g, \(g = 1,2,\ldots ,N_E\), and energy basis function \(\varphi _g^i\), \(i=1,2,\ldots ,r_{\kappa _g}+1\). To simplify the presentation in this section, we will use \(u_h\) to denote \(u_g^i\) for an arbitrary g and i, and write (19) in the following simplified form: find \(u_h \in \mathbb {V}^{\textbf{p},\textbf{q}}_{\varOmega ,\mathbb {S}}\) such that

$$\begin{aligned} \int _{\mathbb {S}} a_{\varvec{\mu }}(u_h, v) \,d\varvec{\mu }- {\tilde{s}}(u_h, v) = {\tilde{\ell }}(v) \end{aligned}$$
(20)

for all \(v \in \mathbb {V}^{\textbf{p},\textbf{q}}_{\varOmega ,\mathbb {S}}\), where

$$\begin{aligned} a_{\varvec{\mu }}(v,w)&= \omega _g^i a_{\varvec{\mu }}^{E_g^i}(v, w), \qquad {\tilde{s}}(v, w) = \sum _{j=1}^{r_{\kappa _g}+1} s_{g,g}^{j,i}(v,w), \\ {\tilde{\ell }}(v)&= \sum _{g^\prime =1}^{g-1} \sum _{j=1}^{r_{\kappa _{g^\prime }}+1} s_{g^{\prime },g}^{j,i}(u_{g^\prime }^j,v) + \ell (v \varphi _g^i) \end{aligned}$$

for some (fixed) g, \(g = 1,2,\ldots ,N_E\), and some (fixed) i, \(i=1,2,\ldots ,r_{\kappa _g}+1\).

For simplicity, we discuss the scheme in the context of the widely-used framework of source iteration, although similar simplifications may be incorporated into other linear solvers; indeed, source iteration may be effectively used as a preconditioner within a GMRES solver, for example, see [39].

We may express the problem (20) in the following equivalent matrix form: find the vector \(U \in \mathbb {R}^{N}\) of coefficients with respect to a basis of \(\mathbb {V}^{\textbf{p},\textbf{q}}_{\varOmega ,\mathbb {S}}\) such that

$$\begin{aligned} A U - S U = F \end{aligned}$$
(21)

where \(A, S \in \mathbb {R}^{N \times N}\) and \(F \in \mathbb {R}^{N}\) denote the matrix representation of the streaming and scattering operators and load term, respectively. Source iteration simply refers to the technique of solving this linear system using the Richardson iteration: given \(U^0 \in \mathbb {R}^{N}\), find \(U^r \in \mathbb {R}^{N}\) such that

$$\begin{aligned} A U^r = S U^{r-1} + F, \end{aligned}$$
(22)

for \(r =1,2, \ldots \). It may be shown that this iteration converges to the solution of (21) under certain assumptions on the problem data. The advantage of this approach is that it avoids inverting the scattering matrix, which is typically dense and highly coupled in angle.

To investigate the structure of the matrix A, we introduce the following notation: for an angular element \(\kappa _{\tiny \mathbb {S}}\), \(\kappa _{\tiny \mathbb {S}}\in \mathcal {T}_{\mathbb {S}}\), we define the local element basis by \(\{ \varphi _{\kappa _{\tiny \mathbb {S}}}^{i}\}_{i=1}^{|q_{\kappa _{\tiny \mathbb {S}}}|}\), where \(|q_{\kappa _{\tiny \mathbb {S}}}|\) denotes the dimension of the polynomial space defined on \(\kappa _{\tiny \mathbb {S}}\). Furthermore, write \(\mathbb {V}^\textbf{p}_{\varOmega }= \text{ span } \{\varphi _{\varOmega }^{i}\}_{i=1}^{N_{\varOmega }}\), \(N_\varOmega = \dim (\mathbb {V}^\textbf{p}_{\varOmega })\). Then, noting that the underlying DGFEM does not contain any communication terms between different angular elements, the matrix A has the natural nested block structure

$$\begin{aligned} A = \left[ \begin{array}{ccccc} D^1 &{} 0 \\ 0 &{} D^2 &{} 0 \\ &{} 0 &{} \ddots \\ &{}&{} &{} \ddots &{} 0 \\ &{}&{}&{} 0 &{} D^{|\mathcal {T}_{\mathbb {S}}|} \end{array} \right] , \quad \text { with }\quad D^n = \left[ \begin{array}{cccc} D^n_{1,1} &{} \dots &{} D^n_{1, |q_{\kappa _{\tiny \mathbb {S}}}|} \\ \vdots &{} \ddots &{} \vdots \\ D^n_{|q_{\kappa _{\tiny \mathbb {S}}}|,1} &{} \dots &{} D^n_{|q_{\kappa _{\tiny \mathbb {S}}}|,|q_{\kappa _{\tiny \mathbb {S}}}|} \end{array} \right] , \end{aligned}$$

where \(|\mathcal {T}_{\mathbb {S}}| = \mathop {\textrm{card}}\limits (\mathcal {T}_{\mathbb {S}})\) and, for \(n=1,2,\ldots ,|\mathcal {T}_{\mathbb {S}}|\), \( D^n_{i,j} = \int _{\kappa _{\tiny \mathbb {S}}} \varphi _{\kappa _{\tiny \mathbb {S}}}^i (\varvec{\mu }) \varphi _{\kappa _{\tiny \mathbb {S}}}^j (\varvec{\mu }) A_{\varvec{\mu }} \,d\varvec{\mu }, \) \(i,j = 1,2,\ldots ,|q_{\kappa _{\tiny \mathbb {S}}}|\), where \(A_{\varvec{\mu }} \in \mathbb {R}^{N_{\varOmega } \times N_{\varOmega }}\), with \((A_{\varvec{\mu }})_{i,j} = a_{\varvec{\mu }}(\phi _\varOmega ^j, \phi _\varOmega ^i )\), \(i,j = 1,2, \ldots , N_{\varOmega }\). Solving (22) therefore requires inverting each diagonal block \(D^n\), \(n=1,2,\ldots ,|\mathcal {T}_{\mathbb {S}}|\), which corresponds to solving a coupled system of spatial transport problems on each angular element.

By working once again as in Sect. 6.1, this algorithm can be made significantly more efficient. To enable this, we restrict the angular mesh to only consist of tensor-product elements, with local element spaces consisting of tensor-product polynomials. We can therefore define a basis on each angular element \(\kappa _{\tiny \mathbb {S}}\in \mathcal {T}_{\mathbb {S}}\) which satisfies the Lagrangian property with respect to a tensor-product Gauss-Legendre quadrature scheme, simply by using the tensor product of the 1D bases constructed above for the energy discretisation. Given the reference element \(\hat{\kappa }_{\tiny \mathbb {S}}\), let \(\{({\hat{\varvec{\mu }}}_q, {\hat{\omega }}_q)\}_{q=1}^{|q_{\kappa _{\tiny \mathbb {S}}}|}\) (where \(|q_{\kappa _{\tiny \mathbb {S}}}| = (q_{\kappa _{\tiny \mathbb {S}}}+1)^{d-1}\)) denote the tensor-product Gauss-Legendre quadrature scheme with \(q_{\kappa _{\tiny \mathbb {S}}}+1\) points in each direction. As in the 1D case, this scheme exactly integrates polynomials in the space \({\mathbb {Q}}_{2q_{\kappa _{\tiny \mathbb {S}}} + 1}(\hat{\kappa }_{\tiny \mathbb {S}})\).

On the reference element \(\hat{\kappa }_{\tiny \mathbb {S}}\), let \(\{{\hat{\varphi }}_i\}_{i=1}^{|q_{\kappa _{\tiny \mathbb {S}}}|}\) denote the Lagrangian basis for \({\mathbb {Q}}_{q_{\kappa _{\tiny \mathbb {S}}}}(\hat{\kappa }_{\tiny \mathbb {S}})\) constructed with respect to the Gauss-Legendre quadrature points \({\hat{\varvec{\mu }}}_q\), \(q=1,2,\ldots ,|q_{\kappa _{\tiny \mathbb {S}}}|\), which uniquely satisfies \({\hat{\varphi }}_i({\hat{\varvec{\mu }}}_j) = \delta _{ij}\), \(i,j = 1,2,\ldots ,|q_{\kappa _{\tiny \mathbb {S}}}|\). On each angular element \(\kappa _{\tiny \mathbb {S}}\), \(\kappa _{\tiny \mathbb {S}}\in \mathcal {T}_{\mathbb {S}}\), we map the local basis defined on the reference element to \(\kappa _{\tiny \mathbb {S}}\) based on employing the mapping \(F_{\kappa _{\tiny \mathbb {S}}}\); more precisely, this yields the local basis \(\{\varphi ^i_{\kappa _{\tiny \mathbb {S}}} = {\hat{\varphi }}_i \circ F_{\kappa _{\tiny \mathbb {S}}}^{-1} \}_{i=1}^{|q_{\kappa _{\tiny \mathbb {S}}}|}\) on \(\kappa _{\tiny \mathbb {S}}\). Furthermore, the quadrature scheme on \(\kappa _{\tiny \mathbb {S}}\), \(\kappa _{\tiny \mathbb {S}}\in \mathcal {T}_{\mathbb {S}}\), is given by \((\varvec{\mu }_q,\omega _q)_{q=1}^{|q_{\kappa _{\tiny \mathbb {S}}}|}\), where \(\varvec{\mu }_q = F_{\kappa _{\tiny \mathbb {S}}}({\hat{\varvec{\mu }}}_q)\), \(\omega _q = {\hat{\omega }}_q \mathcal {J}({\hat{\varvec{\mu }}}_q)\), \(q=1,2,\ldots ,|q_{\kappa _{\tiny \mathbb {S}}}|\), and \(\mathcal {J}\) denotes the square root of the determinant of the first fundamental form of the mapping \(F_{\kappa _{\tiny \mathbb {S}}}\). Hence, the mapped basis retains the Lagrangian property of the reference basis.

Using this quadrature to approximate the angular integrals in the first term on the left-hand side of (20), corresponding to the streaming operator, we deduce that

$$\begin{aligned} D^n \approx \left[ \begin{array}{ccccc} \omega _1 A_{\varvec{\mu }_1} &{} 0 &{} \\ 0 &{} \omega _2 A_{\varvec{\mu }_2} &{} \ddots \\ &{}\ddots &{} \ddots &{}0 \\ &{}&{}0 &{} \omega _{|q_{\kappa _{\tiny \mathbb {S}}}|} A_{\varvec{\mu }_{N_{|q_{\kappa _{\tiny \mathbb {S}}}|}}} \end{array}. \right] \end{aligned}$$

Consequently, with this approximation A becomes a block diagonal matrix formed from block diagonal matrices where the individual blocks correspond to a single spatial transport problem. Solving the source iteration system (22) therefore only requires the numerical solution of a set of independent spatial transport problems, one for each angular quadrature point, which may be performed in parallel.

Remark 6

We point out that similar ideas have also been exploited in [25, 32, 33], for example, to develop high-order discrete ordinates schemes, though there are a number of key differences in terms of the general methodology adopted here. Most notably, in this article we start from a variational framework, rather than directly considering a collocation method. Moreover, our basis functions are built from local mappings from a \((d-1)\)-dimensional reference element rather than piecewise spherical harmonics on the sphere. Although the philosophy of constructing interlinked basis functions and quadrature sets is similar, the Gaussian rules used here allow the method to be seamlessly interpreted as being both collocation-type (enabling an efficient discrete ordinates implementation) and variational (facilitating the stability analysis of Theorem 6 and the error estimates of Theorem 7). To the best of our knowledge, analogous quadrature sets are not available for general order piecewise spherical harmonic basis functions.

6.3 Full Algorithm

Combining the multigroup energy discretisation and the discrete ordinates angle discretisation described above, we arrive at the efficient algorithm for solving the problem presented in Algorithm 1. Here, we require a function GaussLegendre(\(\omega \),\(k+1\)) which provides the set of points within the one- or two-dimensional element \(\omega \) consisting of \(k+1\) points in each dimension, or the mapped analogue for an element on the spherical surface. The function weight is then used to obtain the quadrature weight associated with a given quadrature point. The notation parfor indicates a for loop where the individual iterations are independent of one another and may therefore be performed simultaneously and in parallel.

Algorithm 1
figure a

High order multigroup discrete ordinates implementation of the DGFEM scheme

We associate a solution vector \(U_{\varvec{\mu },E}\), containing degrees of freedom with respect to the basis \(\{\phi ^i_\varOmega \}_{i=1}^{N_{\varOmega }}\) of \(\mathbb {V}^\textbf{p}_{\varOmega }\), with each pair of angular quadrature points \(\varvec{\mu }\) and energy quadrature points \(E\) in the natural manner described above. The DGFEM solution \(u_h\) is therefore obtained by summing these solution vectors weighted by the space, angle and energy basis functions.

The general structure of the algorithm is to iterate through energy groups in order of decreasing energy, and apply the discrete ordinates algorithm within each group. We note that the solutions associated with all of the energy basis functions in a given energy group are necessarily coupled together through the scattering operator. This coupling is quite weak, however, and source iteration reduces this to alternating between two algorithmic steps. First, the scattering operator is evaluated (using the current solution within the energy group and the previously obtained solution from higher energy groups), which may be performed in parallel. Second, we solve the spatial transport problem associated with each angle and energy quadrature point. Again, these are independent problems which may be performed in parallel.

We note that this algorithm could be made more efficient by splitting up the evaluation of the scattering operator into intragroup and intergroup components as in (19), although we do not pursue this here to keep the presentation of the algorithm as simple as possible.

Remark 7

The convergence analysis of the source iteration linear solver outlined here is presented in our recent article [23]. Furthermore, in [23], we also study a modified source iteration solver for mono-energetic solvers, as well as a preconditioned GMRES method, where the preconditioner is selected based on employing the source iteration proposed here. This latter iterative solver is shown to be particularly effective in the low-energy photon scattering regime where traditional solvers may stagnate.

7 Numerical Results

In this section we present the results from a series of computational experiments designed to numerically investigate the asymptotic convergence behaviour of the proposed method for both polyenergetic and monoenergetic problems. The deal.II finite element library in [6] was used for the implementation of the method in these numerical examples.

7.1 Example 1: Polyenergetic Problem in 2D

In this example we consider the numerical approximation of the polyenergetic problem (1) posed in a two-dimensional spatial domain, i.e., \(d=2\), with a one-dimensional angular domain and a one-dimensional energy domain. To this end, the spatial domain is defined as \(\varOmega = (0,1)^2\) (in units of m) and the energy domain is \(\mathbb {E}= (500~\textrm{keV}, 1000~\textrm{keV})\). Furthermore, the macroscopic total absorption cross-section \(\alpha \) and the differential scattering cross-section \(\theta \) are chosen to mimic Compton scattering of photons travelling through water, see [16], albeit in a two-dimensional setting. This is achieved by setting \(\alpha = 0\) and

$$\begin{aligned} \theta ({\textbf{x}},\varvec{\mu }'\rightarrow \varvec{\mu },E'\rightarrow E) = \rho ({\textbf{x}}) \sigma _{KN}(E',E,\varvec{\mu }\cdot \varvec{\mu }') \delta (F(E',E,\varvec{\mu }\cdot \varvec{\mu }')), \end{aligned}$$

where \(\rho ({\textbf{x}}) \approx 3.34281\times 10^{29}\)e/m\(^{3}\) is the electron density of water, and \(\sigma _{KN}\) is the Klein-Nishina differential scattering cross-section, see [16], defined by

$$\begin{aligned} \sigma _{KN}(E,E',\cos \phi ) = \frac{1}{2} r_e^2 \left( \frac{E'}{E}\right) ^2 \left( \frac{E'}{E} + \frac{E}{E'} - \sin ^2\phi \right) , \end{aligned}$$

with \(r_e \approx 2.81794\times 10^{-15}\)m. Further, \(\delta \) denotes the Dirac delta distribution and

$$\begin{aligned} F(E,E',\cos \phi ) = E' - \frac{E}{1+\frac{E}{511}(1-\cos \phi )}, \end{aligned}$$

is used to enforce the conservation of particle momentum. Finally, f and \(g_\textrm{D}\) are selected so that the analytical solution to (1) is given by

$$\begin{aligned} u({\textbf{x}},\varvec{\mu },E) = \textrm{e}^{ -\left( \nicefrac {E \varvec{\mu }\cdot {\textbf{x}}}{E_{max}}\right) ^2} \ \textrm{e}^{ -(1-(\nicefrac {E}{E_{max}})^2)^{-1}}, \end{aligned}$$

where \(E_{max} = 1000\) keV.

Fig. 1
figure 1

Example 1: Convergence of the DGFEM under h-refinement for \(p=0,1,2\). Here, the DGFEM-norm is defined in (12)

We investigate the asymptotic behaviour of the proposed DGFEM on a sequence of successively finer meshes for different values of the polynomial degrees. To this end, the spatial meshes are (non-nested) polygonal grids generated using the Polymesher software package [45]. As noted in Sect. 3.2 the angular meshes are formed by mapping uniform interval elements, defined on the boundary of the square \((-1,1)^2\) to the unit circle \(\mathbb {S}\). We set polynomial degrees \(p_{\kappa _{\tiny \varOmega }} = p\) for all \(\kappa _{\tiny \varOmega }\in \mathcal {T}_{\varOmega }\), \(q_{\kappa _{\tiny \mathbb {S}}} = p\) for all \(\kappa _{\tiny \mathbb {S}}\in \mathcal {T}_{\mathbb {S}}\), and \(r_{\kappa _g} = p\) for all \(\kappa _g\in \mathcal {T}_{\mathbb {E}}\). Figure 1 shows the error, measured in terms of both the \(L_2(\mathcal {D})\) and DGFEM-norm, against the number of degrees of freedom (denoted by N) in the underlying finite element space \(\mathbb {V}^{\textbf{p},\textbf{q},\textbf{r}}_{h}\). Writing \(d_{\mathcal {D}}\) to denote the dimension of the domain \(\mathcal {D}=\varOmega \times \mathbb {S}\times \mathbb {E}\) (here, \(d_{\mathcal {D}}=4\)), we clearly observe that \(\Vert u-u_h \Vert _{L_2(\mathcal {D})} \sim {{\mathcal {O}}}(N^{\nicefrac {(p+1)}{d_\mathcal {D}}})\) as the space-angle-energy mesh \(\mathcal {T}_{}\) is uniformly refined for each fixed p. Equivalently, since \(h \sim N^{-1/d_\mathcal {D}}\), where h denotes the meshsize of \(\mathcal {T}_{}\), we note that \(\Vert u-u_h \Vert _{L_2(\mathcal {D})} \sim {{\mathcal {O}}}(h^{p+1})\) as h tends to zero for each fixed p. This is the expected optimal rate of convergence with respect to the \(L_2(\mathcal {D})\)-norm, though this rate of convergence for the DGFEM approximation of first-order hyperbolic PDEs is not guaranteed on general meshes, for further details see [38] and the remarks in [10]. Secondly, from Fig. 1 we also observe that for fixed p, \(p=0,1\), that the DGFEM-norm of the error behaves like \({{\mathcal {O}}}(N^{\nicefrac {(p+1/2)}{d_\mathcal {D}}})\), or equivalently \({{\mathcal {O}}}(h^{p+1/2})\), as the meshsize h tends to zero. This is in full agreement with Theorem 7 (see also Remark 5). In the case when \(p=2\), we observe that \({|\hspace{-0.85358pt}|\hspace{-0.85358pt}|{u-u_h}|\hspace{-0.85358pt}|\hspace{-0.85358pt}|}_\textrm{DG}\) converges at a slightly faster rate as h tends to zero; despite the large number of degrees of freedom in \(\mathbb {V}^{\textbf{p},\textbf{q},\textbf{r}}_{h}\), the meshes are relatively coarse and hence we expect that we are still in the pre-asymptotic regime.

7.2 Example 2: Monoenergetic Problem in 3D

Fig. 2
figure 2

Example 2: Convergence of the method under h-refinement for \(p=0,1,2\). Here, the DGFEM-norm is defined in (12)

We now consider the numerical approximation of a simplified monoenergetic variant of the problem (1), where the energy is assumed to remain constant, posed in a three-dimensional spatial domain with a two-dimensional angular domain. To this end, we let \(\varOmega = (0,1)^3\), \(\alpha =1\), \(\theta ({\textbf{x}},\varvec{\mu }'\rightarrow \varvec{\mu }) = \nicefrac {1}{|\mathbb {S}^{2}|} = \nicefrac {1}{4\pi }\), \(\beta ({\textbf{x}}) = \int _{\mathbb {S}} \theta ({\textbf{x}},\varvec{\mu }\rightarrow \varvec{\mu }') \ d \varvec{\mu }' = 1\), and select f and \(g_\textrm{D}\) so that the analytical solution of the underlying problem is given by

$$\begin{aligned} u({\textbf{x}},\varvec{\mu }) = \cos (4\phi )\left( x\cos y+y\sin x \right) , \end{aligned}$$

where \(\phi = \arccos \varvec{\mu }_3\) denotes the polar angle of \(\varvec{\mu }\).

Figure 2 shows the convergence of the DGFEM using meshes comprising of uniform cubes in the spatial domain \(\varOmega \) and mapped quadrilateral elements in the angular domain \(\mathbb {S}\). As before, we plot the error measured in both the \(L_2(\mathcal {D})\)-norm and the DGFEM-norm. As in the previous example we observe that \(\Vert u-u_h \Vert _{L_2(\mathcal {D})} \sim {{\mathcal {O}}}(N^{\nicefrac {(p+1)}{d_\mathcal {D}}})\), \(d_\mathcal {D}=5\), or equivalently \(\Vert u-u_h \Vert _{L_2(\mathcal {D})} \sim {{\mathcal {O}}}(h^{p+1})\) as h tends to zero for each fixed value of the polynomial degree p, \(p=0,1,2\). Moreover, we observe that \({|\hspace{-0.85358pt}|\hspace{-0.85358pt}|{u-u_h}|\hspace{-0.85358pt}|\hspace{-0.85358pt}|}_\textrm{DG} \sim {{\mathcal {O}}}(N^{\nicefrac {(p+1/2)}{d_\mathcal {D}}})\) (\(\sim {{\mathcal {O}}}(h^{p+1/2})\)) for \(p=0,1\), as h tends to zero. As in the previous example, we again observe a slighter faster rate of convergence of \({|\hspace{-0.85358pt}|\hspace{-0.85358pt}|{u-u_h}|\hspace{-0.85358pt}|\hspace{-0.85358pt}|}_\textrm{DG}\) for \(p=2\), which we attribute to being in the pre-asymptotic regime.

8 Conclusions

We have introduced a unified hp-version DGFEM for the numerical approximation of the linear Boltzmann transport problem. We have proven stability and convergence results for the method, through an inf-sup condition in an appropriate norm, and shown how it may be efficiently implemented as a high-order version of the widely used multigroup discrete ordinates method. The unified DGFEM formulation in the space, angle and energy domains therefore provides a simple and flexible way of computing arbitrary order approximations of solutions to the Boltzmann transport problem for the first time. General classes of polytopic elements are admitted for the design of the spatial computational mesh, which facilitates the accurate and efficient representation of complex geometries. Numerical experiments have been presented which confirm the theoretical results derived in this paper. Further work will include generalizing the scheme to include more general boundary conditions, such as reflective conditions, for example, utilizing automatic hp-refinement mesh adaptation, and investigating problems arising in medical physics applications. In addition, the extension of the proposed linear solver to the case when particles gain energy after scattering off the medium, which can occur in applications such as thermal neutrons, for example, cf. [36, 43], will also be considered