## Abstract

This is the third paper in a series in which we develop machine learning (ML) moment closure models for the radiative transfer equation. In our previous work (Huang et al. in J Comput Phys 453:110941, 2022), we proposed an approach to learn the gradient of the unclosed high order moment, which performs much better than learning the moment itself and the conventional \(P_N\) closure. However, while the ML moment closure has better accuracy, it is not able to guarantee hyperbolicity and has issues with long time stability. In our second paper (Huang et al., in: Machine learning moment closure models for the radiative transfer equation II: enforcing global hyperbolicity in gradient based closures, 2021. arXiv:2105.14410), we identified a symmetrizer which leads to conditions that enforce that the gradient based ML closure is symmetrizable hyperbolic and stable over long time. The limitation of this approach is that in practice the highest moment can only be related to four, or fewer, lower moments. In this paper, we propose a new method to enforce the hyperbolicity of the ML closure model. Motivated by the observation that the coefficient matrix of the closure system is a lower Hessenberg matrix, we relate its eigenvalues to the roots of an associated polynomial. We design two new neural network architectures based on this relation. The ML closure model resulting from the first neural network is weakly hyperbolic and guarantees the physical characteristic speeds, i.e., the eigenvalues are bounded by the speed of light. The second model is strictly hyperbolic and does not guarantee the boundedness of the eigenvalues. Several benchmark tests including the Gaussian source problem and the two-material problem show the good accuracy, stability and generalizability of our hyperbolic ML closure model.

## Data Availability

Enquiries about data availability should be directed to the authors.

## References

Alldredge, G.W., Hauck, C.D., O'Leary, D.P., Tits, A.L.: Adaptive change of basis in entropy-based moment closures for linear kinetic equations. J. Comput. Phys. **258**, 489–508 (2014)

Alldredge, G.W., Hauck, C.D., Tits, A.L.: High-order entropy-based closures for linear transport in slab geometry ii: a computational study of the optimization problem. SIAM J. Sci. Comput. **34**(4), B361–B391 (2012)

Alldredge, G.W., Li, R., Li, W.: Approximating the \( {M}_2 \) method by the extended quadrature method of moments for radiative transfer in slab geometry. Kinetic Relat. Models **9**(2), 237 (2016)

Amos, B., Xu, L., Kolter, J.Z.: Input convex neural networks. In: International Conference on Machine Learning, pp. 146–155. PMLR (2017)

Bois, L., Franck, E., Navoret, L., Vigon, V.: A neural network closure for the Euler-Poisson system based on kinetic simulations. Preprint arXiv:2011.06242 (2020)

Brunton, S.L., Proctor, J.L., Kutz, J.N.: Discovering governing equations from data by sparse identification of nonlinear dynamical systems. Proc. Natl. Acad. Sci. **113**(15), 3932–3937 (2016)

Cai, Z., Fan, Y., Li, R.: Globally hyperbolic regularization of Grad’s moment system in one-dimensional space. Commun. Math. Sci. **11**(2), 547–571 (2013)

Cai, Z., Fan, Y., Li, R.: Globally hyperbolic regularization of Grad’s moment system. Commun. Pure Appl. Math. **67**(3), 464–518 (2014)

Cai, Z., Fan, Y., Li, R.: On hyperbolicity of 13-moment system. Kinetic Relat. Models **7**(3), 415 (2014)

Chandrasekhar, S.: On the radiative equilibrium of a stellar atmosphere. Astrophys. J. **99**, 180 (1944)

Crockatt, M.M., Christlieb, A.J., Garrett, C.K., Hauck, C.D.: An arbitrary-order, fully implicit, hybrid kinetic solver for linear radiative transport using integral deferred correction. J. Comput. Phys. **346**, 212–241 (2017)

Crockatt, M.M., Christlieb, A.J., Garrett, C.K., Hauck, C.D.: Hybrid methods for radiation transport using diagonally implicit Runge-Kutta and space-time discontinuous Galerkin time integration. J. Comput. Phys. **376**, 455–477 (2019)

Elouafi, M., Hadj, A.D.A.: A recursion formula for the characteristic polynomial of Hessenberg matrices. Appl. Math. Comput. **208**(1), 177–179 (2009)

Fan, Y., Li, R., Zheng, L.: A nonlinear hyperbolic model for radiative transfer equation in slab geometry. SIAM J. Appl. Math. **80**(6), 2388–2419 (2020)

Fan, Y., Li, R., Zheng, L.: A nonlinear moment model for radiative transfer equation in slab geometry. J. Comput. Phys. **404**, 109128 (2020)

Frank, M., Hauck, C.D., Olbrant, E.: Perturbed, entropy-based closure for radiative transfer. Preprint arXiv:1208.0772 (2012)

Grad, H.: On the kinetic theory of rarefied gases. Commun. Pure Appl. Math. **2**(4), 331–407 (1949)

Han, J., Jentzen, A., Weinan, E.: Solving high-dimensional partial differential equations using deep learning. Proc. Natl. Acad. Sci. **115**(34), 8505–8510 (2018)

Han, J., Ma, C., Ma, Z., Weinan, E.: Uniformly accurate machine learning-based hydrodynamic models for kinetic equations. Proc. Natl. Acad. Sci. **116**(44), 21983–21991 (2019)

Hauck, C., McClarren, R.: Positive \({P_N}\) closures. SIAM J. Sci. Comput. **32**(5), 2603–2626 (2010)

Hauck, C.D.: High-order entropy-based closures for linear transport in slab geometry. Commun. Math. Sci. **9**(1), 187–205 (2011)

He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)

Huang, J., Cheng, Y., Christlieb, A.J., Roberts, L.F.: Machine learning moment closure models for the radiative transfer equation I: directly learning a gradient based closure. J. Comput. Phys. **453**, 110941 (2022)

Huang, J., Cheng, Y., Christlieb, A.J., Roberts, L.F., Yong, W.-A.: Machine learning moment closure models for the radiative transfer equation II: enforcing global hyperbolicity in gradient based closures. Preprint arXiv:2105.14410 (2021)

Huang, J., Ma, Z., Zhou, Y., Yong, W.-A.: Learning thermodynamically stable and Galilean invariant partial differential equations for non-equilibrium flows. J. Non-Equilibrium Thermodyn. **46**, 355–70 (2021)

Ioffe, S., Szegedy, C.: Batch normalization: Accelerating deep network training by reducing internal covariate shift. In: International Conference on Machine Learning, pp. 448–456. PMLR (2015)

Jiang, G.-S., Shu, C.-W.: Efficient implementation of weighted ENO schemes. J. Comput. Phys. **126**(1), 202–228 (1996)

Klose, A.D., Netz, U., Beuthan, J., Hielscher, A.H.: Optical tomography using the time-independent equation of radiative transfer-part 1: forward model. J. Quant. Spectrosc. Radiat. Transf. **72**(5), 691–713 (2002)

Koch, R., Becker, R.: Evaluation of quadrature schemes for the discrete ordinates method. J. Quant. Spectrosc. Radiat. Transf. **84**(4), 423–435 (2004)

Laboure, V.M., McClarren, R.G., Hauck, C.D.: Implicit filtered \({P}_{N}\) for high-energy density thermal radiation transport using discontinuous Galerkin finite elements. J. Comput. Phys. **321**, 624–643 (2016)

Larsen, E., Morel, J.: Asymptotic solutions of numerical transport problems in optically thick, diffusive regimes ii. J. Comput. Phys. **83**(1) (1989)

Levermore, C.: Relating Eddington factors to flux limiters. J. Quant. Spectrosc. Radiat. Transf. **31**(2), 149–160 (1984)

Levermore, C.D.: Moment closure hierarchies for kinetic theories. J. Stat. Phys. **83**(5), 1021–1065 (1996)

Li, R., Li, W., Zheng, L.: Direct flux gradient approximation to close moment model for kinetic equations. Preprint arXiv:2102.07641 (2021)

Ma, C., Zhu, B., Xu, X.-Q., Wang, W.: Machine learning surrogate models for Landau fluid closure. Phys. Plasmas **27**(4), 042502 (2020)

Maulik, R., Garland, N.A., Burby, J.W., Tang, X.-Z., Balaprakash, P.: Neural network representability of fully ionized plasma fluid model closures. Phys. Plasmas **27**(7), 072106 (2020)

McClarren, R.G., Hauck, C.D.: Robust and accurate filtered spherical harmonics expansions for radiative transfer. J. Comput. Phys. **229**(16), 5597–5614 (2010)

Murchikova, E., Abdikamalov, E., Urbatsch, T.: Analytic closures for M1 neutrino transport. Mon. Not. R. Astron. Soc. **469**(2), 1725–1737 (2017)

Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., Desmaison, A., Kopf, A., Yang, E., DeVito, Z., Raison, M., Tejani, A., Chilamkurthy, S., Steiner, B., Fang, L., Bai, J., Chintala, S.: Pytorch: An imperative style, high-performance deep learning library. Preprint arXiv:1912.01703 (2019)

Pomraning, G.C.: The Equations of Radiation Hydrodynamics. Pergamon Press, Oxford (1973)

Porteous, W.A., Laiu, M.P., Hauck, C.D.: Data-driven, structure-preserving approximations to entropy-based moment closures for kinetic equations. Preprint arXiv:2106.08973 (2021)

Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J. Comput. Phys. **378**, 686–707 (2019)

Schotthöfer, S., Xiao, T., Frank, M., Hauck, C.D.: A structure-preserving surrogate model for the closure of the moment system of the Boltzmann equation using convex deep neural networks. Preprint arXiv:2106.09445 (2021)

Scoggins, J.B., Han, J., Massot, M.: Machine learning moment closures for accurate and efficient simulation of polydisperse evaporating sprays. In: AIAA Scitech 2021 Forum, p. 1786 (2021)

Serre, D.: Systems of Conservation Laws 1: Hyperbolicity, Entropies, Shock Waves. Cambridge University Press (1999)

Shu, C.-W., Osher, S.: Efficient implementation of essentially non-oscillatory shock-capturing schemes. J. Comput. Phys. **77**(2), 439–471 (1988)

Szegő, G.: Orthogonal Polynomials, vol. 23. American Mathematical Soc. (1939)

Wang, L., Xu, X., Zhu, B., Ma, C., Lei, Y.-A.: Deep learning surrogate model for kinetic Landau-fluid closure with collision. AIP Adv. **10**(7), 075108 (2020)

Yong, W.-A.: Basic aspects of hyperbolic relaxation systems. In: Advances in the Theory of Shock Waves, pp. 259–305. Springer (2001)

Zhu, Y., Hong, L., Yang, Z., Yong, W.-A.: Conservation-dissipation formalism of irreversible thermodynamics. J. Non-Equilib. Thermodyn. **40**(2), 67–74 (2015)

## Acknowledgements

We thank Michael M. Crockatt at Sandia National Laboratories for providing the numerical solver for the radiative transfer equation. We acknowledge the High Performance Computing Center (HPCC) at Michigan State University for providing computational resources that have contributed to the research results reported within this paper. JH would like to thank Professor Wen-An Yong at Tsinghua University for many fruitful discussions. This work has been assigned a document release number LA-UR-21-28626.

## Ethics declarations

### Competing interests

The authors have not disclosed any competing interests.

## Additional information

### Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Yingda Cheng: Research is supported by NSF Grants DMS-2011838 and AST-2008004.

Andrew J. Christlieb: Research is supported by AFOSR Grants FA9550-19-1-0281 and FA9550-17-1-0394; NSF Grants DMS-1912183 and AST-2008004; and DoE Grant DE-SC0017955.

Luke F. Roberts: Research is supported by NSF Grant AST-2008004; and DoE Grant DE-SC0017955.

## Appendix A: Collections of Proofs

In this appendix, we collect some lemmas and proofs. We start with a lemma that characterizes the eigenspaces of an unreduced lower Hessenberg matrix.

### Lemma A.1

For an unreduced lower Hessenberg matrix \(H=(h_{ij})_{n\times n}\), the geometric multiplicity of any eigenvalue \(\lambda \) is 1 and the corresponding eigenvector is \((q_0(\lambda ), q_1(\lambda ), \ldots , q_{n-1}(\lambda ))^T\). Here \(\{q_i\}_{0\le i\le n-1}\) is the associated polynomial sequence defined in (2.1).

### Proof

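By Definition 2.1, we have \(h_{ij}=0\) for \(j>i+1\) and \(h_{i,i+1}\ne 0\) for \(i=1,\ldots ,n-1\). Let \(r=(r_1,r_2,\ldots ,r_n)^T\) be an eigenvector associated with \(\lambda \). We write \(Hr = \lambda r\) in the equivalent component-wise form:

\[
\sum _{j=1}^{i+1} h_{ij} r_j = \lambda r_i, \quad i=1,\ldots ,n-1, \tag{A.1}
\]

and

\[
\sum _{j=1}^{n} h_{nj} r_j = \lambda r_n. \tag{A.2}
\]

Here we used the fact that \(h_{ij}=0\) for \(j>i+1\). Since \(h_{i,i+1}\ne 0\) for \(i=1,\ldots ,n-1\), (A.1) is equivalent to

\[
r_{i+1} = \frac{1}{h_{i,i+1}}\Big (\lambda r_i - \sum _{j=1}^{i} h_{ij} r_j\Big ), \quad i=1,\ldots ,n-1. \tag{A.3}
\]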

From (A.3), we deduce that \(r_1\ne 0\): otherwise \(r_2=\cdots =r_n=0\), so that \(r=0\), which is impossible for an eigenvector. Moreover, \(r_i\) for \(i = 2,\ldots ,n\) are uniquely determined by \(r_1\). Therefore, the geometric multiplicity of \(\lambda \) is 1. Without loss of generality, we take \(r_1=1\). In this case, *r* coincides with \((q_0(\lambda ),q_1(\lambda ),\ldots ,q_{n-1}(\lambda ))^T\), where \(\{q_i\}_{0\le i\le n-1}\) is the associated polynomial sequence defined in (2.1). \(\square \)
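The construction in this proof can be checked numerically. The following is a minimal sketch, not the authors' code: it solves row \(i\) of \(Hr=\lambda r\) for \(r_{i+1}\), starting from \(r_1=1\), and then measures the residual of the full eigenvalue equation. The test matrix is an assumption made for illustration, a tridiagonal (hence lower Hessenberg) matrix built from the Legendre three-term recurrence, chosen only because its eigenvalues \(0,\pm \sqrt{3/5}\) are known in closed form. The function names are hypothetical.

```python
import math

def eigvec_from_recurrence(H, lam):
    """Build r with r_1 = 1 by solving row i of H r = lam r for r_{i+1}.

    H must be unreduced lower Hessenberg, i.e. H[i][i+1] != 0 (0-based rows).
    """
    n = len(H)
    r = [1.0]
    for i in range(n - 1):
        known = sum(H[i][j] * r[j] for j in range(i + 1))
        r.append((lam * r[i] - known) / H[i][i + 1])
    return r

def residual(H, lam, r):
    """Largest absolute entry of H r - lam r."""
    n = len(H)
    return max(
        abs(sum(H[i][j] * r[j] for j in range(n)) - lam * r[i])
        for i in range(n)
    )

# Illustrative matrix (an assumption): eigenvalues are 0 and +-sqrt(3/5).
H = [[0.0, 1.0, 0.0],
     [1 / 3, 0.0, 2 / 3],
     [0.0, 2 / 5, 0.0]]
eigenvalues = [-math.sqrt(3 / 5), 0.0, math.sqrt(3 / 5)]

for lam in eigenvalues:
    r = eigvec_from_recurrence(H, lam)
    print(lam, r, residual(H, lam, r))
```

The first \(n-1\) rows hold by construction for any \(\lambda \); only the last row fails when \(\lambda \) is not an eigenvalue, which is why the residual is a useful check.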

### Lemma A.2

Let \(H = (h_{ij})_{n\times n}\) be an unreduced lower Hessenberg matrix and let \(\{q_i\}_{0\le i\le n}\) be the polynomial sequence associated with *H*. If \(\lambda \) is an eigenvalue of *H*, then \(\lambda \) is a root of \(q_n\).

### Proof

By Lemma A.1, the geometric multiplicity of \(\lambda \) is 1 and the corresponding eigenvector is \(\varvec{q}_{n-1}(\lambda ) = (q_0(\lambda ),q_1(\lambda ),\ldots ,q_{n-1}(\lambda ))^T\), i.e., \(H \varvec{q}_{n-1}(\lambda ) = \lambda \varvec{q}_{n-1}(\lambda )\). Plugging \(\lambda \) into (2.2), we immediately obtain \(q_n(\lambda )=0\), i.e., \(\lambda \) is a root of \(q_n\). \(\square \)
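As an illustration of Lemma A.2, the sketch below evaluates a polynomial sequence \(q_0,\ldots ,q_n\) attached to a small lower Hessenberg matrix and locates a root of \(q_n\) by bisection. Two things are assumptions here rather than statements from the paper: the last step of the recurrence uses the normalization \(h_{n,n+1}=1\), and the test matrix is the same illustrative tridiagonal example with known eigenvalues \(0,\pm \sqrt{3/5}\). The helper names are hypothetical.

```python
import math

def associated_polys(H, x):
    """Return [q_0(x), ..., q_n(x)] for a lower Hessenberg matrix H.

    Rows 1..n-1 are solved for the next entry; the final step uses the
    assumed normalization h_{n,n+1} = 1, so q_n(x) is the last-row residual.
    """
    n = len(H)
    q = [1.0]
    for i in range(n):
        known = sum(H[i][j] * q[j] for j in range(i + 1))
        pivot = H[i][i + 1] if i < n - 1 else 1.0
        q.append((x * q[i] - known) / pivot)
    return q

def bisect_root(f, lo, hi, iters=80):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_lo < 0) == (f_mid < 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Illustrative matrix (an assumption): eigenvalues are 0 and +-sqrt(3/5).
H = [[0.0, 1.0, 0.0],
     [1 / 3, 0.0, 2 / 3],
     [0.0, 2 / 5, 0.0]]

def q_n(x):
    return associated_polys(H, x)[-1]

root = bisect_root(q_n, 0.5, 1.0)
print(root, math.sqrt(3 / 5))
```

For this matrix the recurrence gives \(q_3(x)=\tfrac{3}{2}x\,(x^2-\tfrac{3}{5})\), so the roots of \(q_3\) and the eigenvalues of the matrix coincide.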

### A.1: Proof of Theorem 2.4

### Proof

We start by proving that condition 1 and condition 2 are equivalent. First, it is easy to see that condition 2 implies condition 1. We only need to prove that condition 1 implies condition 2. Since *A* is real diagonalizable, all the eigenvalues of *A* are real. Moreover, for any eigenvalue of *A*, the geometric multiplicity is equal to its algebraic multiplicity. By Lemma A.1, the geometric multiplicity of any eigenvalue of an unreduced lower Hessenberg matrix is 1. Therefore, any eigenvalue of *A* has algebraic multiplicity of 1, i.e. all the eigenvalues of *A* are distinct.

Next, we prove the equivalence of condition 2 and condition 3. Condition 3 implies condition 2 by Theorem 2.3, and condition 2 implies condition 3 by Lemma A.2. This completes the proof. \(\square \)
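The \(2\times 2\) case makes the first equivalence concrete. The sketch below is an illustration under our own naming, not part of the original proof: for \(H=\begin{pmatrix} a & b\\ c & d\end{pmatrix}\) with \(b\ne 0\), the characteristic polynomial has discriminant \((a-d)^2+4bc\). When it vanishes, the eigenvalue is repeated, yet \(H-\lambda I\) still contains the nonzero entry \(b\), so its rank is 1 and the eigenspace is one-dimensional, in line with Lemma A.1.

```python
def classify(a, b, c, d):
    """Classify the 2x2 unreduced lower Hessenberg matrix [[a, b], [c, d]].

    Returns "diagonalizable" (distinct real eigenvalues), "defective"
    (repeated real eigenvalue, one-dimensional eigenspace) or "complex".
    """
    if b == 0:
        raise ValueError("matrix must be unreduced: b != 0")
    # Discriminant of x^2 - (a + d) x + (a d - b c).
    disc = (a - d) ** 2 + 4 * b * c
    if disc > 0:
        return "diagonalizable"
    if disc == 0:
        return "defective"
    return "complex"

print(classify(0.0, 1.0, 1 / 3, 0.0))   # symmetric-like example
print(classify(1.0, 1.0, 0.0, 1.0))     # Jordan block
print(classify(0.0, 1.0, -1.0, 0.0))    # rotation generator
```

The Jordan block \(\begin{pmatrix} 1 & 1\\ 0 & 1\end{pmatrix}\) shows why a repeated eigenvalue rules out real diagonalizability for an unreduced matrix.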

### A.2: Proof of Lemma 3.3

### Proof

We start from the definition of Legendre polynomials by the generating function:

Introduce the variable *s* such that

which is equivalent to

Therefore, we have

Define

By comparing the coefficients of \(t^n\) on both sides of (A.7), we find that \(a_{m,n} = 0\) if \(n>m\) or if *m* and *n* have different parity. For \(n = m - 2k\) with some integer \(k\ge 0\), we have

By introducing the variable \(\tau = s^2\) or equivalently \(s = \tau ^{\frac{1}{2}}\), we have

where in the fourth equality we used the relation between the gamma function and the beta function:

and in the last equality we used the properties of the gamma function: for any integer \(n\ge 0\)

Lastly, using the orthogonality relation \(\int _{-1}^1 P_m(x)P_n(x)\,dx = \frac{2}{2m+1}\delta _{m,n}\), we have for any integer \(m\ge 0\),

This completes the proof. \(\square \)
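The kind of identity this proof produces can be sanity-checked with exact arithmetic. The sketch below rests on our own assumptions and is not taken from the paper: it takes \(a_{m,n}=\int _{-1}^1 x^m P_n(x)\,dx\) and compares it with the closed form \(a_{m,n}=\binom{m}{n}2^{-n}B(k+\tfrac{1}{2},n+1)\) for \(m=n+2k\), where \(B(x,y)=\Gamma (x)\Gamma (y)/\Gamma (x+y)\). This closed form is what one obtains from the generating function followed by the substitution \(\tau =s^2\). Function names are hypothetical.

```python
from fractions import Fraction
from math import comb, gamma

def legendre_coeffs(n):
    """Coefficients (ascending powers) of P_n, built from
    (k + 1) P_{k+1} = (2k + 1) x P_k - k P_{k-1}."""
    p_prev, p = [Fraction(1)], [Fraction(0), Fraction(1)]
    if n == 0:
        return p_prev
    for k in range(1, n):
        x_p = [Fraction(0)] + p  # multiply P_k by x
        padded = p_prev + [Fraction(0)] * (len(x_p) - len(p_prev))
        p_next = [((2 * k + 1) * a - k * b) / (k + 1)
                  for a, b in zip(x_p, padded)]
        p_prev, p = p, p_next
    return p

def moment_exact(m, n):
    """Exact integral of x^m P_n(x) over [-1, 1]."""
    total = Fraction(0)
    for j, c in enumerate(legendre_coeffs(n)):
        if (m + j) % 2 == 0:  # odd powers integrate to zero
            total += c * Fraction(2, m + j + 1)
    return total

def moment_closed_form(m, n):
    """Assumed closed form: binom(m, n) 2^{-n} B(k + 1/2, n + 1), m = n + 2k."""
    if n > m or (m - n) % 2:
        return 0.0
    k = (m - n) // 2
    beta = gamma(k + 0.5) * gamma(n + 1) / gamma(n + k + 1.5)
    return comb(m, n) * beta / 2 ** n

for m in range(5):
    print(m, [str(moment_exact(m, n)) for n in range(5)])
```

Combined with the orthogonality relation, these coefficients give the expansion \(x^m=\sum _{n}\frac{2n+1}{2}\,a_{m,n}P_n(x)\), with only the terms \(n\le m\) of matching parity contributing.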

## Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

## About this article

### Cite this article

Huang, J., Cheng, Y., Christlieb, A.J. *et al.* Machine Learning Moment Closure Models for the Radiative Transfer Equation III: Enforcing Hyperbolicity and Physical Characteristic Speeds.
*J Sci Comput* **94**, 7 (2023). https://doi.org/10.1007/s10915-022-02056-7

DOI: https://doi.org/10.1007/s10915-022-02056-7