Introduction

Multiscale phonon transport from the ballistic limit to the diffusive extreme is ubiquitous in technologically important materials and applications, such as thermoelectrics1,2 and miniaturized electronic systems like central processing units and flat-panel displays3,4. A better understanding of thermal transport mechanisms across a wide span of length scales is critical to engineering such materials and devices to achieve better properties1 and performance5. The phonon Boltzmann transport equation (BTE)6,7 can model phonon transport from the ballistic to the diffusive regime, and a large number of research efforts have been made to devise numerical solvers for it8. However, due to the limited computational efficiency of existing methods, the phonon BTE faces difficulties when applied to complicated problems involving mode-resolved phonon properties, high spatial dimensions, and large temperature non-equilibrium. It is thus desirable to develop an accurate and efficient mode-resolved BTE solver for predicting heat conduction, particularly over wide temperature ranges and length scales, where phonons can follow very different transport mechanisms9,10.

Due to the challenges related to computational cost, apart from the commonly adopted assumptions like single-mode relaxation time approximation and isotropic phonon dispersion6, the temperature difference is usually assumed to be sufficiently small (relative to a reference temperature) to simplify the computations11,12,13,14. With small temperature differences, the phonon equilibrium deviational distribution can be linearly approximated by temperature15,16, and the relaxation time is often treated as temperature independent and thus spatially invariant. Although success has been achieved in investigating nanoscale size effects for phonon transport under these assumptions17,18,19, the results and conclusions cannot be simply generalized to the cases with large temperature differences, since the phonon relaxation time or mean free path does depend on the temperature9,10. Even in the fully diffusive limit, it is well known that as the local temperature changes drastically, the thermal conductivity can vary across the system domain. Moreover, under large temperature gradients, phonon transport can be in different regimes (diffusive or ballistic) for a given phonon frequency and polarization at different spatial locations20. In many applications, hotspots with temperatures much higher than the average system temperature can emerge, such as those in laser material processing21 and power electronics22,23, especially at cryogenic temperatures24. As a result, it is necessary to develop the capability to model the effects of large temperature differences in phonon BTE.

Despite the necessity of developing a robust solver for the phonon BTE under arbitrary temperature differences, this is a challenging task because the BTE can be multiscale in both the frequency and spatial domains. The temperature-dependent relaxation time further adds to the difficulty of solving the phonon BTE given the high dimensionality of this partial differential equation (PDE). Numerical methods have been proposed for this problem, such as the Monte Carlo (MC) method25,26,27 and deterministic discretization-based methods28,29,30. However, traditional MC methods suffer from statistical errors and become inefficient at small Knudsen numbers (Kn) due to their restrictions on time step and grid size26,31. While variance reduction techniques have been employed to enable fast MC simulations, they are only suitable for problems with small deviations from equilibrium, and thus the computational speedup can only be achieved under near-equilibrium conditions (i.e., small temperature differences)11,32. As a widely used deterministic solver, the discrete ordinates method (DOM) discretizes the angular space into small solid angles to capture the non-equilibrium phonon distribution. However, DOM and its variants usually converge slowly in the diffusive regime and require large memory33. Recently, a finite-volume discrete unified gas kinetic scheme (DUGKS)34 has been developed for arbitrary temperature differences, but the explicit scheme is known to be restricted by the Courant-Friedrichs-Lewy condition and is not efficient for real three-dimensional steady-state problems. In general, few of these methods are both accurate and efficient in predicting phonon heat conduction under arbitrary temperature gradients.

Machine learning-based techniques have started to play a role in studying and predicting physical properties in the past decade35,36,37,38,39,40. Deep learning has shown great potential in solving high-dimensional PDEs to describe the unknown or unrepresented physics41,42,43,44,45. Recently, we have developed a deep neural network (DNN) framework for solving the stationary mode-resolved phonon BTE46. The model can be trained by minimizing the BTE residuals to obey physical laws governed by the BTE without the need for any labeled data. Such a physics-informed neural network (PINN) can return accurate results in the domain of interest very efficiently. Different from other numerical methods, PINNs are trained to approximate a high-dimensional solution function of the phonon BTE by leveraging their capability as universal function approximators47. The evaluation of such trained models can be very fast as the feedforward algorithm only involves a few matrix multiplications. Moreover, parametric learning has also been enabled by treating system parameters (e.g., system size) as additional inputs besides mode-resolved phonon properties, providing a significant advantage in investigating the effects of parameters like the characteristic length scale, which is key to determining the ballistic or diffusive nature of the phonon transport process. However, as the first demonstration of PINN for the phonon BTE, a small temperature difference was still assumed46. To make this efficient tool more generally applicable, it is necessary to extend the PINN model so that it can handle problems with large temperature gradients.

In this work, we develop a data-free PINN scheme for solving stationary mode-resolved phonon BTE with arbitrary temperature difference. This scheme uses the temperature-dependent relaxation times and learns the solutions by minimizing the residuals of the governing equations and boundary conditions. Numerical experiments are conducted to validate the model with up to three spatial dimensions. We show that both the length scale and boundary temperature difference can be used as input variables to learn BTE solutions in parameterized spaces, so that a single training can enable the model to be used for evaluating thermal transport at any length scale or temperature difference. We also confirm the effects of large temperature variations, which are difficult to capture in conventional numerical methods. The proposed method performs well in accuracy and efficiency, providing a powerful tool for simulating device-level phonon heat conduction.

Results

Phonon Boltzmann transport equation

Under the single-mode relaxation time approximation, the mode-resolved phonon BTE at the steady state can be written as6,

$${\bf{v}}\cdot \nabla f=\frac{{f}^{{\rm{eq}}}(T)-f}{\tau (T)},$$
(1)

where \(f=f({\bf{x}},{\bf{s}},k,p)\) (or \(f({\bf{x}},{\bf{s}},\omega ,p)\)) is the phonon distribution function dependent on the spatial vector x, directional unit vector \({\bf{s}}=({\rm{cos}}\theta ,\,{\rm{sin}}\theta {\rm{cos}}\varphi ,\,{\rm{sin}}\theta {\rm{sin}}\varphi )\) (θ is the polar angle and φ is the azimuthal angle), wave number k (or angular frequency \(\omega =\omega (k,\,p)\)) and polarization p, and v is the phonon group velocity. \({f}^{{\rm{eq}}}\) represents the phonon equilibrium distribution following the Bose-Einstein distribution,

$${f}^{{\rm{eq}}}(\omega ,p,T)=1/({e}^{\frac{\hslash \omega }{{k}_{{\rm{B}}}T}}-1),$$
(2)

where \(\hslash\) is the reduced Planck’s constant, and kB is the Boltzmann constant. It is noted that in Eq. (1) the relaxation time \(\tau =\tau (\omega ,p,T)\) also depends on the local temperature T, meaning that τ changes spatially across the system for a given phonon frequency and polarization.
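As a small illustration of Eq. (2), the sketch below evaluates the Bose-Einstein distribution and its temperature derivative with NumPy. The function names `f_eq` and `dfeq_dT` are illustrative choices, not names from this work, and the relaxation time τ(ω, p, T) is left out because its functional form is given only in the "Methods" section.

```python
import numpy as np

HBAR = 1.054571817e-34  # reduced Planck constant, J s
KB = 1.380649e-23       # Boltzmann constant, J/K


def f_eq(omega, T):
    """Bose-Einstein occupation of Eq. (2) for angular frequency omega (rad/s) at temperature T (K)."""
    x = HBAR * np.asarray(omega, dtype=float) / (KB * T)
    # expm1 keeps precision when hbar*omega << kB*T
    return 1.0 / np.expm1(x)


def dfeq_dT(omega, T):
    """Analytical temperature derivative of Eq. (2): (x/T) e^x / (e^x - 1)^2 with x = hbar*omega/(kB*T)."""
    x = HBAR * np.asarray(omega, dtype=float) / (KB * T)
    return (x / T) * np.exp(x) / np.expm1(x) ** 2
```

For example, a mode with \(\hslash \omega ={k}_{{\rm{B}}}T\,{\rm{ln}}\,2\) has an equilibrium occupation of exactly one, which gives a quick sanity check of the implementation.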

In the case of a system without any internal heat source, we have a physical constraint that the divergence of the heat flux q must be zero, which can be obtained by integrating the energy-based form of Eq. (1) over the solid angle space (Ω) and the frequency space (ω, p)

$$\nabla \cdot {\bf{q}}=\sum\limits_{p}{\int_{0}^{{\omega }_{{\rm{max }},{\rm{p}}}}}{\int _{4{\rm{\pi }}}}\hslash \omega D\frac{{f}^{{\rm{eq}}}(T)-f}{\tau (T)}d\Omega d\omega =0,$$
(3)

where the heat flux is

$${\bf{q}}=\sum\limits_{p}{\int_{0}^{{\omega }_{{\rm{max }},{\rm{p}}}}}{\int_{4{\rm{\pi }}}}{\bf{v}}\hslash \omega Dfd\Omega d\omega ,$$
(4)

with \(D=D(\omega ,p)\) and \({\omega}_{{\mathrm{max}},{\mathrm{p}}}\) being the phonon density of states and maximum frequency, respectively.

PINNs for stationary phonon BTE with arbitrary temperature difference

For a multiscale thermal transport problem at the steady state, the physical constraints can be expressed as

$${\mathcal{R}}(f({\bf{x}},{\bf{s}},k,p,{\boldsymbol{\mu}}),T):=\left\{\begin{array}{c}{\bf{v}}\cdot \nabla f-\frac{{f}^{{\rm{eq}}}(T)-f}{\tau (T)}=0\,\\ \sum\limits_{p}{\int_{0}^{{\omega }_{{\rm{max }},{\rm{p}}}}}{\int_{4{\rm{\pi }}}}\hslash \omega D\frac{{f}^{{\rm{eq}}}(T)-f}{\tau (T)}d\Omega\, d\omega =0,\,\end{array}{\bf{x}},{\bf{s}},k,p\in {\Gamma } ,\, {\boldsymbol{\mu}} \in {{\mathbb{R}}}^{d}.\right.$$
(5)

Here, the phonon distribution f is a function of variables in domain Γ and additional parameters μ. The solution of f can be uniquely determined under certain boundary conditions,

$${ {\mathcal{B}} }_{i}({\bf{x}},\,{\bf{s}},\,k,\,p,\,f,{\boldsymbol{\mu }})=0,\,{\bf{x}},{\bf{s}},k,p\in {\Gamma }_{{\rm{b}}},\,{\boldsymbol{\mu}} \in {{{\mathbb{R}}}}^{d},$$
(6)

where \({ {\mathcal B} }_{i}\) are the boundary condition operators, and Γb denotes the boundary region. In the “Methods” section, we show three typical boundary conditions encountered in phonon BTE, including the isothermal boundary, the diffusely reflecting boundary, and the periodic boundary.

For fast predictions of steady-state multiscale thermal transport with arbitrary temperature difference, a PINN model is developed as depicted in Fig. 1. The input layer is composed of x, s, k, p, and parameters of interest μ. Parameter μ in this work is set to be either length scale L or boundary temperature difference ∆T. We use two fully connected DNNs to approximate the equilibrium part \({f}^{{\rm{eq}}}(T)\) and the non-equilibrium part \({f}^{{\rm{neq}}}=f-{f}^{{\rm{eq}}}(T)\) of the phonon distribution. Each sub-network maps the inputs to a target output, through several layers of neurons comprising affine linear transformations and scalar nonlinear activation functions. Specifically, the output from one DNN is the equilibrium temperature T, which determines \({f}^{{\rm{eq}}}(T)\) and \(\tau (T)\) accordingly. The other output is \({f}^{{\rm{neq}}}\), and we combine it with \({f}^{{\rm{eq}}}(T)\) to obtain the total phonon distribution function f. While the loss function can be explicitly defined with the residuals of Eqs. (5) and (6), it is difficult to directly minimize the second integral in Eq. (5) as proper nondimensionalization must be performed to evaluate it relative to some appropriate unit. Inspired by the way of linearly approximating \({f}^{{\rm{eq}}}\) under small temperature difference, an additional shallow neural network (NN) with only one hidden layer is pretrained to generate a scaling factor \(\beta (T)\) such that we have

$${f}^{{\rm{eq}}}(T)\approx {f}^{{\rm{eq}}}({T}_{{\rm{ref}}})+\beta (T)(T-{T}_{{\rm{ref}}}),$$
(7)

where Tref is the reference temperature. Like \({f}^{{\rm{eq}}}(T)\), the scaling factor \(\beta (T)\) is also implicitly dependent on (ω, p). It is noted that this factor reduces to \(\frac{\partial {f}^{{\rm{eq}}}}{\partial T}\) under the assumption of small temperature differences. Although Eq. (7) is still a nonlinear approximation of \({f}^{{\rm{eq}}}\) by T, substituting it into Eq. (3) allows us to close the system of equations. If we denote T in the linear part of Eq. (7) as T* and plug Eq. (7) into Eq. (3), we have

$${T}^{\ast }={T}_{{\rm{ref}}}+\frac{1}{4{\rm{\pi }}}\left(\sum\limits_{p}{\int_{0}^{{\omega }_{{\mathrm{max}},{\mathrm{p}}}}}{\int_{4\pi}}\hslash \omega D\frac{f-{f}^{{\rm{eq}}}({T}_{{\rm{ref}}})}{\tau (T)}d\Omega \, d\omega \right)\times {\left(\sum\limits_{p}{\int_{0}^{{\omega }_{{\rm{max }},{\rm{p}}}}}\frac{\hslash \omega D\beta (T)}{\tau (T)}d\omega \right)}^{-1}.$$
(8)
Fig. 1: A schematic of the PINN framework for solving stationary phonon BTE with arbitrary temperature differences.

Two DNNs are employed to approximate the temperature (T) and non-equilibrium (\({f}^{{\rm{neq}}}\)) parts of the phonon distribution function, respectively. Inputs include spatial vector x, directional unit vector \({\bf{s}}=({\rm{cos}}\theta ,\,{\rm{sin}}\theta {\rm{cos}}\varphi ,\,{\rm{sin}}\theta {\rm{sin}}\varphi )\) (θ is the polar angle and φ is the azimuthal angle), wave number k and polarization p. μ represents additional parameters, which is either characteristic length L or boundary temperature difference ∆T in this study. σ represents the activation function, which is set to be the Swish activation function. The pretrained shallow NN provides a scaling factor β for approximating the equilibrium phonon distribution \({f}^{{\rm{eq}}}\). The loss function contains residuals of the PDEs and boundary conditions on sampled collocation points in the simulation domain. The parameters in the two DNNs are learned by minimizing the total loss.

By construction, T and T* should be identical, so Eq. (5) becomes

$${\mathcal{R}}(f({\bf{x}},{\bf{s}},k,p,{\boldsymbol{\mu }}),\,T):=\left\{\begin{array}{c}{\bf{v}}\cdot \nabla f-\frac{{f}^{{\rm{eq}}}(T)-f}{\tau (T)}=0\\ {T}^{\ast }-T=0,\end{array}{\bf{x}},{\bf{s}},k,p\in {\Gamma },\,{\boldsymbol{\mu}} \in {{\mathbb{R}}}^{d},\right.$$
(9)

where T* is calculated based on Eq. (8). The difference between these two temperatures can be easily nondimensionalized by the temperature difference across the simulation domain. Different from conventional solvers that determine the local equilibrium temperature by bisection or Newton’s method, the introduction of β enables efficient computation of T* without any iterative process. It is also noted that pretraining must be conducted to learn β with a shallow NN, which is sufficient to capture the nonlinear relationship between β and (ω, p, T). Therefore, instead of directly using Eq. (2), in the present scheme we compute \({f}^{{\rm{eq}}}(T)\) based on Eq. (7) with the outputs β and T from the two networks (see Fig. 1).
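The non-iterative temperature evaluation of Eqs. (7) and (8) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: β is taken to be the secant slope \(({f}^{{\rm{eq}}}(T)-{f}^{{\rm{eq}}}({T}_{{\rm{ref}}}))/(T-{T}_{{\rm{ref}}})\), which makes Eq. (7) exact and stands in for the pretrained shallow NN; the phonon spectrum is a flat list of discrete modes; and `weights` is a hypothetical array holding \(\hslash \omega D\,\Delta \omega /\tau\) per mode. The distribution is assumed to be already averaged over the solid angle, which absorbs the 1/4π factor of Eq. (8).

```python
import numpy as np

HBAR = 1.054571817e-34  # J s
KB = 1.380649e-23       # J/K


def f_eq(omega, T):
    """Bose-Einstein distribution, Eq. (2)."""
    return 1.0 / np.expm1(HBAR * np.asarray(omega, dtype=float) / (KB * T))


def beta_secant(omega, T, T_ref):
    """Secant slope that makes Eq. (7) exact (assumed stand-in for the pretrained shallow NN)."""
    dT = np.asarray(T, dtype=float) - T_ref
    x = HBAR * np.asarray(omega, dtype=float) / (KB * T_ref)
    dfdT_ref = (x / T_ref) * np.exp(x) / np.expm1(x) ** 2  # limit of the secant as T -> T_ref
    small = np.abs(dT) < 1e-9
    safe_dT = np.where(small, 1.0, dT)                     # avoid division by zero
    secant = (f_eq(omega, T) - f_eq(omega, T_ref)) / safe_dT
    return np.where(small, dfdT_ref, secant)


def closed_form_temperature(f_avg, omega, weights, T, T_ref):
    """Discrete analogue of Eq. (8): T* from the angle-averaged distribution, with no iteration."""
    numerator = np.sum(weights * (f_avg - f_eq(omega, T_ref)))
    denominator = np.sum(weights * beta_secant(omega, T, T_ref))
    return T_ref + numerator / denominator
```

When the angle-averaged distribution equals \({f}^{{\rm{eq}}}(T)\) and β is evaluated at the same T, the function returns T* = T, which is the consistency condition enforced by the second residual in Eq. (9).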

As shown in Fig. 1, given Tref and the temperature range \(T\in [{T}_{{\rm{ref}}}-\Delta T,\,{T}_{{\rm{ref}}}+\Delta T]\), a shallow NN is first trained to provide the scaling factor \(\beta =\beta (\omega ,p,T)\). Then two DNNs are trained by minimizing the sum of residuals in Eqs. (6) and (9) as follows:

$${\mathcal{L}}({\bf{W}},{\bf{b}})={\left\Vert {\bf{v}}\cdot \nabla f-\frac{{f}^{{\rm{eq}}}-f}{\tau }\right\Vert }^{2}+{\Vert {T}^{\ast }-{T}\Vert}^{2} +\sum\limits_{i}{ {\Vert{\mathcal B}}_{i}\Vert}^{2},$$
(10)

where W and b refer to the weights and biases of the entire network, and \(\Vert \cdot \Vert\) is the L2 norm. An optimal set of network parameters can be obtained by minimizing this composite loss function,

$${{\bf{W}}}^{\ast },\,{{\bf{b}}}^{\ast }=\text{arg}{\min }_{{\bf{W}},{\bf{b}}} {\mathcal L} ({\bf{W}},{\bf{b}}).$$
(11)

The specific DNN architectures and training details are included in the “Methods” section.
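The structure of the composite loss in Eq. (10) can be illustrated with a short NumPy sketch that takes residuals already evaluated on the collocation points. The function name, the use of a mean rather than a sum of squares, the equal weighting of the three terms, and the scaling of the temperature residual by a single `dT_scale` are all assumptions made for illustration. In an actual PINN the residuals would come from automatic differentiation of the network outputs.

```python
import numpy as np


def composite_loss(bte_residual, T_star, T, boundary_residuals, dT_scale):
    """Sum of mean-squared residuals mirroring the three terms of Eq. (10).

    bte_residual       : residual of v . grad(f) - (f_eq - f)/tau on interior points
    T_star, T          : temperature from Eq. (8) and temperature predicted by the network
    boundary_residuals : list of arrays, one per boundary condition operator B_i
    dT_scale           : temperature difference used to nondimensionalize T* - T
    """
    loss_bte = np.mean(np.square(bte_residual))
    loss_temperature = np.mean(np.square((np.asarray(T_star) - np.asarray(T)) / dT_scale))
    loss_boundary = sum(np.mean(np.square(r)) for r in boundary_residuals)
    return loss_bte + loss_temperature + loss_boundary
```

A loss of zero then corresponds to a distribution that satisfies the BTE, the temperature consistency condition, and every boundary condition on all sampled points.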

Model systems

Numerical tests are carried out to evaluate the performance of this PINN scheme. We investigate several steady-state thermal transport problems with arbitrary temperature differences at different length scales, including 1D cross-plane, 2D in-plane, 2D rectangle and 3D cuboid. Single crystalline silicon is used as a model material, but the proposed model is applicable to other materials. We assume that the phonon dispersion relation of silicon is isotropic and the (100) direction information9 is used in all tests. The derivations of phonon frequencies and relaxation times are based on refs. 10,48 (see the “Methods” section). Only one longitudinal acoustic (LA, set as \(p=1\)) and two degenerate transverse acoustic (TA, set as \(p=0\)) phonon branches are considered because the optical branches contribute little to the thermal transport31. It should be noted that including optical branches is possible and only involves expanding the sample space of the discrete input variable p. Since we assume that the silicon system is not heavily doped, electron–phonon interaction is not considered important in this work, but its effect can be easily included by adding its influence in the phonon relaxation times49. For each phonon branch, we discretize the wave vector space \(k\in [0,\,2\pi /a]\) equally into Nk frequency bands by the midpoint rule, where \(a=5.431\,\AA\) is the lattice constant for silicon. We set \({N}_{k}=10\) in all cases as it gives a bulk thermal conductivity around 145.6 W m−1 K−1 at 300 K, which is in agreement with the literature value50. To obtain the average Knudsen number \(\overline{{\rm{Kn}}}=\bar{\lambda }/L\) at different temperatures, the average mean free path \(\bar{\lambda }\) is introduced as

$$\bar{\lambda }(T)=\left(\sum\limits_{p}{\int_{0}^{{\omega }_{{\rm{max }},{\rm{p}}}}}\hslash \omega D\frac{\partial {f}^{{\rm{eq}}}}{\partial T}|{\bf{v}}|\tau d\omega \right){\left(\sum\limits_{p}{\int_{0}^{{\omega }_{{\rm{max }},{\rm{p}}}}}\hslash \omega D\frac{\partial {f}^{{\rm{eq}}}}{\partial T}d\omega\right)}^{-1}.$$
(12)

The average mean free path as a function of temperature is shown in Fig. 2a.
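A discrete version of Eq. (12) is sketched below. It assumes the mode heat capacity \(C=\hslash \omega D\,\partial {f}^{{\rm{eq}}}/\partial T\), the group speed, and the relaxation time have already been tabulated per frequency band and branch; the array names and the helper `knudsen` are hypothetical.

```python
import numpy as np


def average_mfp(C_mode, speed, tau, d_omega):
    """Heat-capacity-weighted mean free path, a discrete form of Eq. (12).

    All inputs are arrays of the same shape, e.g. (n_branches, n_bands):
    C_mode  : hbar * omega * D * d(f_eq)/dT for each band
    speed   : group speed |v| for each band
    tau     : relaxation time at the temperature of interest
    d_omega : width of each frequency band
    """
    C_mode, speed, tau, d_omega = (np.asarray(a, dtype=float) for a in (C_mode, speed, tau, d_omega))
    numerator = np.sum(C_mode * speed * tau * d_omega)
    denominator = np.sum(C_mode * d_omega)
    return numerator / denominator


def knudsen(mean_free_path, L):
    """Average Knudsen number: mean free path divided by the characteristic length L."""
    return mean_free_path / L
```

If every mode had the same mean free path |v|τ, the weighted average would return that value regardless of the heat-capacity weights, which is a convenient check of the implementation.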

Fig. 2: Results of 1D cross-plane phonon transport with small temperature differences.

a Temperature-dependent average phonon mean free path. b A schematic of the quasi-1D cross-plane phonon transport and the boundary temperatures. c Dimensionless temperature profiles of silicon thin films with different thicknesses (L = 10 nm, 100 nm, 1 μm, 100 μm), where \({T}^{\ast }=(T-{T}_{{\rm{R}}})/({T}_{{\rm{L}}}-{T}_{{\rm{R}}})\) and \(X=x/L\). The black solid lines represent analytical solutions to the quasi-1D phonon BTE. d Effective thermal conductivity normalized by the bulk thermal conductivity as a function of the thickness L. The filled circles represent the parameter points used in training, while the open circles are predicted points not included in training.

The training and testing details about numerical experiments are summarized in Table 1. Nx is the number of interior points in the spatial domain (quasi-random Sobol sequences in training and uniform grids in testing), and Ns is the number of solid angles by the Gauss-Legendre quadrature. Nμ represents the number of parameter values (length scale L or boundary temperature difference ∆T) sampled in a range given in Table 1. Then the total number of collocation points is Nx × Ns × Nk × Np × Nμ, where \({N}_{p}=2\) is the number of phonon branches. The computation times and losses of training and testing processes are shown in Table 2, where the training loss is defined in Eq. (10) after nondimensionalization, and the validation loss is the total loss evaluated in testing with the settings shown in Table 1.
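To make the collocation count concrete, the sketch below evaluates Nx × Ns × Nk × Np × Nμ for the 1D cross-plane settings quoted in the text (40 spatial points, 16 Gauss-Legendre directions, Nk = 10, two branches, and 9 sampled thicknesses) and builds the angular quadrature with NumPy. The variable names are illustrative, and applying the quadrature to the direction cosine cos θ on [−1, 1] is an assumption about how the 1D angular integral is discretized.

```python
import numpy as np

# Settings quoted in the text for the 1D cross-plane case with L as the parameter
N_x, N_s, N_k, N_p, N_mu = 40, 16, 10, 2, 9

# Total number of collocation points: N_x * N_s * N_k * N_p * N_mu
n_collocation = N_x * N_s * N_k * N_p * N_mu

# Gauss-Legendre nodes and weights for the direction cosine in [-1, 1]
cos_theta, weights = np.polynomial.legendre.leggauss(N_s)

# Quadrature check on a smooth integrand: the integral of cos^2(theta) over [-1, 1] is 2/3
second_moment = np.sum(weights * cos_theta**2)
```

With these settings the product gives 115,200 collocation points.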

Table 1 Training and testing information of the numerical experiments.
Table 2 Computation times and losses of numerical experiments.

1D cross-plane phonon transport

We first evaluate the quasi-1D cross-plane thermal transport in silicon films (Fig. 2b). The problem is described by a 1D phonon BTE with two isothermal boundary conditions (see the “Methods” section). The thickness of the film is L, and a temperature difference ∆T is imposed in the x direction. The temperature of the left boundary is set as \({T}_{{\rm{L}}}={T}_{{\rm{ref}}}+\Delta T/2\), while that of the right boundary is set as \({T}_{{\rm{R}}}={T}_{{\rm{ref}}}-\Delta T/2\). For the training of our PINN model, the spatial domain is equally discretized with 40 training points, and 16-point Gauss-Legendre quadrature is used for the phonon transport direction \({{\bf{s}}}_{x}={\rm{cos}}\theta\) (Table 1). Here we have conducted several tests at different reference temperatures Tref, and the effects of L and ∆T are separately studied by parametric learning.

To validate the proposed scheme, we perform numerical tests under the small temperature difference limit but without explicitly linearizing the equilibrium deviational distribution with respect to temperature. This is the limit where analytical solutions exist for such quasi-1D problems. Here, Tref is set to be 300 K and ∆T is set to 2 K. In this case, the pretrained scaling factor β is almost equivalent to \(\frac{\partial {f}^{{\rm{eq}}}}{\partial T}\) since ∆T is sufficiently small. Parametric training is employed with thickness L as an input to the DNNs. The model is trained with 9 samples of L in the range between 10 nm and 100 μm. After training, the temperature and heat flux can be evaluated at any new L given the interpolation ability of DNNs. Figure 2c shows the dimensionless temperature profiles \({T}^{\ast }=(T-{T}_{{\rm{R}}})/({T}_{{\rm{L}}}-{T}_{{\rm{R}}})\) at different L. The analytical solutions by the method of degenerate kernels12 are included for comparison. Temperature profiles predicted by the PINN model are found to agree almost exactly with analytical solutions, with the discrepancy < 1%. We also calculate the dimensionless thermal conductivities (\({k}_{{\rm{eff}}}/{k}_{{\rm{bulk}}}\)), where \({k}_{{\rm{eff}}}=qL/\Delta T\) is the effective thermal conductivity defined by Fourier’s law, and \({k}_{{\rm{bulk}}}=\frac{1}{3}\sum\nolimits_{p}{\int }_{0}^{{\omega }_{{\rm{max }},{\rm{p}}}}C{|{\bf{v}}|}^{2}\tau d\omega\) is the bulk thermal conductivity in the diffusive limit. As shown in Fig. 2d, we again observe good agreement for all testing points with the analytical solution (error < 0.7%). Although only trained with discrete thickness values, our model provides accurate predictions of thermal conductivity at unseen input thicknesses.

We then apply this framework to problems with larger temperature differences. In the ballistic limit, where phonon scattering is rare, the phonon transport is governed by the Stefan-Boltzmann law51, and the temperature across the whole system approximately follows \({T}^{4}=({{T}_{{\rm{L}}}}^{4}+{{T}_{{\rm{R}}}}^{4})/2\). We consider a silicon film with \({T}_{{\rm{L}}}=50\) K, \({T}_{{\rm{R}}}=40\) K and \(L=10\) nm such that the average Knudsen number \(\overline{{\rm{Kn}}}=\bar{\lambda }/L\gg 10\). The shallow NN for β has been trained for the relevant temperature range. We note that training this shallow NN is very fast, taking < 20 s. Figure 3a shows the predicted temperature profile, which agrees closely with the analytical solution from the Stefan-Boltzmann law. The small difference stems from the fact that a small number of phonons still have mean free paths smaller than the thickness \(L=10\) nm, so they are close to diffusive.
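For reference, the ballistic-limit temperature can be evaluated directly. The sketch below uses the standard result for the phonon Stefan-Boltzmann regime, \({T}^{4}=({{T}_{{\rm{L}}}}^{4}+{{T}_{{\rm{R}}}}^{4})/2\), in which the film temperature is uniform and lies between the two boundary temperatures. The function name is illustrative.

```python
def ballistic_temperature(T_left, T_right):
    """Uniform film temperature in the ballistic (Stefan-Boltzmann) limit.

    The phonon energy density scales as T^4 in this regime, and the film
    sees the average of the energy densities emitted by the two walls.
    """
    return (0.5 * (T_left**4 + T_right**4)) ** 0.25


# Boundary temperatures of the test case in the text
T_film = ballistic_temperature(50.0, 40.0)
```

For the boundary temperatures used here (50 K and 40 K), this gives approximately 45.8 K, slightly above the arithmetic mean of 45 K because the hot wall contributes more energy density.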

Fig. 3: Results of 1D cross-plane phonon transport in the ballistic and diffusive limits with arbitrary temperature differences.

a Temperature profile of a silicon film at \(L=10\) nm, with boundary condition \({T}_{{\rm{L}}}=50\) K, \({T}_{{\rm{R}}}=40\) K, and \(X=x/L\). The black solid line is the analytical solution by Stefan-Boltzmann law in the ballistic limit. b Temperature profiles with various boundary temperature differences at \(L=100\) μm, and the black lines are derived based on Fourier’s law in the diffusive limit.

As the system size increases, the phonon transport becomes more diffusive. Here, we continue to study the thermal transport in the diffusive regime at \({T}_{{\rm{ref}}}=300\) K, but with a much larger ∆T. For fast predictions under different ∆T, this time we incorporate ∆T as an input variable and learn the solutions in a parametric setting. Similar to the previous case, a shallow NN is trained to provide β in the temperature range between 200 and 400 K. Then, the PINN model is trained with 5 sampled ∆T values (20, 60, 100, 150, and 200 K) at a fixed thickness \(L=100\) μm, where all phonons are expected to be diffusive. Figure 3b shows the predicted temperature profiles with different ∆T. Our model accurately reproduces the analytical solutions based on Fourier’s law, which is valid in the diffusive limit (mean absolute error < 0.4 K), and captures the nonlinear effect due to the temperature-dependent thermal conductivity. Specifically, different from the linear profile in the diffusive regime under small ∆T (Fig. 2c), the temperature profile is predicted to be convex with larger ∆T. Since lattice thermal conductivity decreases at higher temperatures due to stronger intrinsic phonon scattering, the local temperature gradient increases with increasing temperature for the same heat flux. Thus, the convex temperature profile correctly indicates that the thermal conductivity decreases with increasing temperature between 200 and 400 K.
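The convexity argument can be checked with a short Fourier-law calculation. In 1D steady state with a temperature-dependent conductivity k(T), the heat flux is constant, so the Kirchhoff variable \(U(T)=\int k(T)\,dT\) varies linearly with position. The sketch below inverts U numerically. The power-law conductivity in the usage note is a hypothetical model chosen only to illustrate the trend, not the silicon data used in this work.

```python
import numpy as np


def fourier_profile_1d(k_of_T, T_left, T_right, n_x=101, n_T=2001):
    """Steady 1D Fourier temperature profile with temperature-dependent conductivity.

    Uses the Kirchhoff transform: U(T) = integral of k(T) dT is linear in X = x/L.
    Returns the normalized coordinate X and the temperature T(X).
    """
    T_grid = np.linspace(min(T_left, T_right), max(T_left, T_right), n_T)
    k = k_of_T(T_grid)
    # Cumulative trapezoidal integral of k over T; increasing because k > 0
    U = np.concatenate(([0.0], np.cumsum(0.5 * (k[1:] + k[:-1]) * np.diff(T_grid))))
    U_left = np.interp(T_left, T_grid, U)
    U_right = np.interp(T_right, T_grid, U)
    X = np.linspace(0.0, 1.0, n_x)
    U_x = U_left + (U_right - U_left) * X  # linear variation of U across the film
    T_x = np.interp(U_x, U, T_grid)        # invert U(T) by interpolation
    return X, T_x


# Hypothetical conductivity that decreases with temperature, k ~ 1/T
X, T_profile = fourier_profile_1d(lambda T: 150.0 * (300.0 / T), 400.0, 200.0)
```

With this illustrative k ∝ 1/T model and boundary temperatures of 400 K and 200 K, the mid-plane temperature is the geometric mean of the boundary values (about 283 K), which lies below the 300 K of a linear profile, so the profile is convex.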

The same training procedure is adopted for cases at \({T}_{{\rm{ref}}}=100\) K, where the average mean free path changes more drastically (Fig. 2a). Considering the larger \(\bar{\lambda }\) at lower temperatures, we set \(L=5\) mm to ensure that the phonon transport is close to the diffusive limit. As shown in Fig. 4a, the predicted temperatures agree well with the Fourier solution for ∆T ranging from 10 to 50 K. We also successfully reproduce the nonlinear temperature curves, which cannot be captured under the assumption of small ∆T. To further investigate the effects of ∆T, we consider the cases near the ballistic regime with \(L=100\) nm. Tref is fixed at 100 K, while ∆T varies through a ratio \(R=\varDelta T/{T}_{{\rm{ref}}}\) and is added as an input in parametric training. The dimensionless temperature profiles T* with different R values are plotted in Fig. 4b. Since the phonon transport in this case is dominated by the phonon boundary scattering, we observe temperature slips near the boundaries. As ∆T increases, the dimensionless temperatures at the two boundaries increase, which can be attributed to the higher \(\overline{{\rm{Kn}}}\) at a lower temperature. The phonon boundary scattering has a larger impact at the cold boundary, leading to a larger temperature deviation from the predefined boundary temperature, and vice versa for the hot boundary.

Fig. 4: Results of 1D cross-plane phonon transport at Tref = 100 K.

a Temperature profiles with different boundary temperature differences at \(L=5\) mm, and the black lines are derived based on Fourier’s law. b Dimensionless temperature profiles with different boundary temperature differences at \(L=100\) nm, where \({T}^{\ast }=(T-{T}_{{\rm{R}}})/\Delta T\), \(R=\Delta T/{T}_{{\rm{ref}}}\), \({T}_{{\rm{ref}}}=100\) K, and \(X=x/L\).

All these testing cases confirm that the present scheme can not only describe the phonon transport correctly under small temperature ranges, but also provide accurate predictions when the temperature difference is large. As for the computational cost, the training time is estimated to be consistently less than one hour on a GPU in the form of parametric training, while all testing processes take less than one second (Table 2). It is important to note that since parameters like L and \(\Delta T\) can be added to the parametric space (μ) when training the model, a single training will allow the use of the model for different conditions (see Figs. 2c, d, 3b and 4)—a main advantage of PINN over traditional numerical solvers, which need new simulations from scratch when any of the parameters are changed.

2D in-plane phonon transport

For 2D in-plane thermal transport, we focus on a square silicon film (inset in Fig. 5) with a small temperature gradient (\(\Delta T=2\) K) along the x-direction, where analytical solutions can be derived from the Fuchs-Sondheimer theory for comparison18. Although this case approaches the small temperature gradient limit, we do not explicitly linearize the BTE but use the shallow NN for the scaling factor β. A diffusely reflecting boundary condition (see the “Methods” section) is applied to the top and bottom walls, and the other boundaries are periodic. The settings of the computational domain can be found in Table 1 and the “Methods” section. Here, Tref is fixed at 300 K, and the length scale L is used as an input parameter. Two PINN models are trained to predict the phonon transport at L within the ranges [10 nm, 1 μm] and [1 μm, 100 μm], respectively, to minimize the training loss for each range as phonon transport transitions from highly ballistic to diffusive.

Fig. 5: Results of 2D in-plane phonon transport (\(\Delta T=2\) K).

a Dimensionless x-directional heat flux results along the y-axis (see inset in panel (b)) in silicon films with different length scales, where \({q}_{x}^{\ast }={q}_{x}(Y)/{q}_{{\rm{bulk}}}\) and \(Y=y/L\). From bottom to top, the PINN predictions are shown as circles for L = 100 nm, 300 nm, 1 μm, 10 μm. The black solid lines represent analytical solutions by the Fuchs-Sondheimer theory. b Effective thermal conductivity normalized by the bulk thermal conductivity at different length scales. The filled circles represent the parameter points used in training, while the open circles are predicted points not included in training. The inset shows the schematic of the simulation domain and boundary conditions.

Figure 5a shows the dimensionless x-directional heat flux \({q}_{x}^{\ast }={q}_{x}(Y)/{q}_{{\rm{bulk}}}\) at different L, where \({q}_{{\rm{bulk}}}=-{k}_{{\rm{bulk}}}\cdot \Delta T/L\). The differences between the PINN predictions and the analytical solutions are < 2.9%. We also observe good agreement (error < 1.9%) in the effective thermal conductivity \({k}_{{\rm{eff}}}=-{(dT\!/\!dx)}^{-1}{\int }_{0}^{1}{q}_{x}(Y)dY\) as shown in Fig. 5b, and again the present method reproduces the varying effective thermal conductivity due to the change of length scale. Like our previous model devised for small temperature differences46, the present scheme shows high accuracy in 2D in-plane thermal transport. The evaluation is also very fast and takes < 8 s on a domain with many more collocation points than the training domain (Tables 1 and 2).

2D rectangle phonon transport

Next, we apply our method to the phonon transport in a 2D rectangle domain (Fig. 6a) with large temperature differences. The length and width of the geometry are L and 0.5L, respectively. To mimic the condition of Joule self-heating, we apply a Gaussian temperature distribution Th to the top boundary, while other boundaries are held at a lower temperature \({T}_{{\rm{c}}}=300\) K. The Gaussian temperature distribution is set with the full width at half maximum (FWHM) to be 0.4L, and the difference between the peak temperature at the center of the top wall Tmax and Tc is 100 K.
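The top-wall boundary temperature described above can be written out explicitly. The sketch below assumes the Gaussian rides on the cold-wall temperature Tc as a baseline, so that the profile peaks at Tmax in the center and falls to halfway between Tc and Tmax at a distance of FWHM/2 from the center. The function name and this baseline convention are assumptions, since the exact expression is not given in this section.

```python
import numpy as np


def gaussian_hotspot(X, T_c=300.0, T_max=400.0, fwhm=0.4, center=0.5):
    """Gaussian boundary temperature along the normalized coordinate X = x/L.

    The standard deviation follows from the full width at half maximum:
    sigma = FWHM / (2 * sqrt(2 * ln 2)).
    """
    sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    X = np.asarray(X, dtype=float)
    return T_c + (T_max - T_c) * np.exp(-0.5 * ((X - center) / sigma) ** 2)


# Top-wall temperature sampled on a uniform grid
X_top = np.linspace(0.0, 1.0, 101)
T_top = gaussian_hotspot(X_top)
```

With FWHM = 0.4L, the temperature is 400 K at X = 0.5 and 350 K at X = 0.3 and X = 0.7.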

Fig. 6: Results of 2D rectangle phonon transport.

a The computational domain is of size L × 0.5L. Gaussian temperature distribution Th is applied to the top boundary (with \({T}_{{\rm{\max }}}=400\) K), while all the other boundaries are maintained at a lower temperature (\({T}_{{\rm{c}}}=300\) K). b PINN-predicted temperature distributions at the top wall at different length scales. c–e Predicted temperature contours at length scale L = 100 nm, 1 μm, 100 μm. f Solution of the 2D heat equation based on Fourier’s law, which is obtained by a well-trained PINN. X and Y are normalized spatial coordinates.

Parametric training is conducted on 4 geometries with L ranging from 100 nm to 100 μm. The PINN successfully reproduces the Gaussian temperature distribution at the top boundary (Fig. 6b). Figure 6c–e shows the predicted 2D temperature contours at different L. Although there is no available analytical solution for direct comparison, we derive a Fourier solution (i.e., the solution in the diffusive limit) using a simple PINN model for a 2D steady-state heat equation (Fig. 6f). This result can be treated as the benchmark solution in the diffusive limit, as the final training loss is as low as \(4\times {10}^{-4}\).

The result of the 100 μm case (Fig. 6e), which is close to the diffusive limit, is nearly identical to the Fourier benchmark (Fig. 6f), with a mean absolute error < 0.3 K. Clear temperature slips near the top boundary are observed in the cases with smaller L, and the slip decreases as L increases. This test confirms the capability of the present scheme to solve 2D phonon transport under large temperature gradients. Another feature of this scheme is that, although the training time increases because more training points and a higher input dimension are used (Table 2), the evaluation cost can be less than one second if only the temperature profile is needed, since it is the output of a sub-network in our model (see Fig. 1).

3D cuboid phonon transport

To validate the present model for problems in a more realistic setting, we consider phonon transport in a 3D cuboid geometry as an extension of the previous 2D case. The test geometry is a silicon block of size L × L × 0.5L, with a circular hot spot following a Gaussian temperature distribution (FWHM = 0.4L) on the top surface, as depicted in Fig. 7a. To demonstrate that the present model is applicable to 3D problems where the phonon mean free path spans several orders of magnitude (e.g., below 200 K), we select a different temperature range from that in the 2D case. The peak temperature \({T}_{{\rm{\max }}}\) is set to 200 K at the center of the top surface, and the other surfaces are maintained at \({T}_{{\rm{c}}}=100\) K. A PINN model is first trained at L = 1 mm without additional input parameters. Similar to the previous test, we compare the results to the Fourier solution for the 3D heat equation from a PINN model (training loss < \(7\times {10}^{-4}\)).

Fig. 7: Results of 3D cuboid phonon transport.

a Schematic of the 3D thermal transport in a cuboid geometry of size L × L × 0.5L. Gaussian temperature distribution is applied to the top surface (\({T}_{{\rm{\max }}}=200\) K), while all the other surfaces are maintained at a lower temperature (\({T}_{{\rm{c}}}=100\) K). b Predicted steady-state temperature contour for a 3D system of size 1 mm × 1 mm × 0.5 mm. c Predicted temperature contour in the central plane (y = 0.5L) at L = 1 mm. d Solution of the 3D heat equation based on Fourier’s law under the same boundary conditions, which is obtained by a well-trained PINN. e, f Predicted temperature contours at two length scales through parametric training; L = 500 nm is not included in the training. X and Z are normalized spatial coordinates.

Figure 7b shows the temperature contours in two central planes predicted by our PINN model without parametric learning, evaluated on a computational domain with many more points than the training domain. Comparing our prediction with the Fourier benchmark in the plane at y = 0.5L (Fig. 7c, d), we find that the difference is small because the 1 mm case is close to the diffusive limit; the mean absolute temperature difference is < 0.2 K across the whole system. We have also performed parametric training with variable L sampled in the range between 300 nm and 3 μm (Table 1), which enables prediction of the temperature profile for various sample lengths within a few minutes. Figure 7e, f shows the predicted temperature profiles at two length scales; L = 500 nm is a predicted point not included in the training. Based on the good performance in 1D and 2D problems, we expect high computational accuracy for 3D geometries of different sizes as well, but there are no benchmark results to compare with for the 3D non-diffusive cases. To our knowledge, such fast and accurate prediction in 3D geometries under large temperature gradients has not been achieved by other methods.

Discussion

In summary, a deep learning-based PINN model is developed for solving the mode-resolved phonon BTE with arbitrary temperature differences. Numerical tests show that the present scheme can accurately predict steady-state phonon transport from 1D to 3D under arbitrary temperature differences, which is computationally challenging, if not impossible, for conventional numerical methods. Under large temperature gradients, the phonon transport is found to be very different from that under small temperature differences because of the temperature-dependent phonon relaxation times. When the temperature difference is large, the phonon relaxation time for a given frequency and polarization can vary significantly over space depending on the local temperature, and the present scheme handles such situations through the introduction of a scaling factor described by a pretrained shallow NN. Parametric learning is also enabled by including the length scale or boundary temperature difference as additional inputs to the model, allowing for efficient investigation of effects due to varying temperature differences and Knudsen numbers. As for the computational cost, we note that conventional solvers usually require long computation time (tens of hours) and large memory (hundreds of gigabytes) for 3D solutions under large temperature differences, even with large-scale parallel computing29,52. In contrast, the present model can provide solutions at any point in the computational domain within at most several minutes, and, thanks to parametric learning, it remains computationally efficient even when the training time is included.

The parametric learning feature, together with the low evaluation cost, may allow for efficient search in parameterized spaces for design purposes. To apply this scheme to the simulations of realistic device-level thermal transport, we need the phonon dispersion relations for the materials constituting the target system. When the component materials are heavily doped, the effects of electron–phonon interaction should be included by adjusting the phonon relaxation time accordingly. Besides, an efficient sampling strategy must be adopted for improved learning performance when the system structure is more complex. Since the current scheme approximates the solution function by minimizing the BTE residuals on the sampled collocation points, it is theoretically applicable to inhomogeneous systems such as heterojunctions and porous materials given the intrinsic phonon properties and interfacial phonon scattering. This PINN scheme is also easy to implement and does not need any labeled data for training. It can be a powerful tool in studying multiscale thermal transport for applications like thermoelectrics and electronics thermal management.

While accurate and efficient in predicting multiscale thermal transport, the current scheme still has limitations that warrant further research. In particular, our framework is designed for steady-state problems, and modifications are required to capture transient thermal transport. For example, the long short-term memory (LSTM) recurrent neural network architecture53 could potentially be used to deal with dynamic systems. Furthermore, for realistic electronic device-level simulations, it is desirable to solve the phonon BTE and electron BTE simultaneously. Since most existing methods for electro-thermal simulations have employed simplifications in physics equations54 or separate solvers for electrons and phonons55,56, a unified PINN model would be ideal for reliable investigation of self-heating effects by solving the coupled BTEs.

Methods

PINN architecture and training

The proposed PINN model consists of two trainable DNNs and one pretrained ANN; the two DNNs output the local temperature (T) and the non-equilibrium part (\({f}^{{\rm{neq}}}\)) of the phonon distribution function, respectively (Fig. 1). With 30 neurons per layer, the DNN for \({f}^{{\rm{neq}}}\) has 8 hidden layers, while the number of hidden layers in the DNN for T depends on the problem dimension. The pretrained ANN for the scaling factor β has only one hidden layer with 30 neurons. The two DNNs are trained simultaneously with a unified physics-informed loss function. We employ the Swish activation function (x·Sigmoid(x))57 in each layer except the last one, where a linear activation function is applied. The Adam optimizer58, a robust variant of the stochastic gradient descent algorithm, is used to solve the optimization problem defined in Eq. (10) by training on mini-batches of inputs. The initial learning rate is set to \(5\times {10}^{-3}\), and training points are generated by sampling the input domain. To approximate the integrals in Eq. (8), Gauss-Legendre quadrature59 is adopted for the solid angle space, while the midpoint rule is used for the frequency space. When the spatial domain is logically rectangular, we can set the interior training points as quasi-random low-discrepancy Sobol sequences60 to alleviate the curse of dimensionality. Input spatial coordinates are scaled to the range [0, 1]. The PINN algorithm is implemented within the PyTorch platform61, and all numerical experiments are performed on a single NVIDIA Tesla P100 Graphics Processing Unit (GPU).
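The two quadrature rules named above can be illustrated with a short, dependency-free sketch. It assumes the solid angle is parameterized by μ = cos θ ∈ [−1, 1] and azimuth φ ∈ [0, 2π], with Gauss-Legendre nodes in both variables; that parameterization, the node counts, and all function names are assumptions made for illustration, not the paper's implementation.

```python
import math


def legendre(n, x):
    """Return P_n(x) and P_n'(x) (n >= 1) via the three-term recurrence."""
    p_prev, p = 1.0, x
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    dp = n * (x * p - p_prev) / (x * x - 1.0)
    return p, dp


def gauss_legendre(n, a=-1.0, b=1.0):
    """Nodes and weights of the n-point Gauss-Legendre rule on [a, b]."""
    nodes, weights = [], []
    for i in range(1, n + 1):
        # Standard initial guess for the i-th root, refined by Newton steps.
        x = math.cos(math.pi * (i - 0.25) / (n + 0.5))
        for _ in range(50):
            p, dp = legendre(n, x)
            step = p / dp
            x -= step
            if abs(step) < 1e-15:
                break
        _, dp = legendre(n, x)
        w = 2.0 / ((1.0 - x * x) * dp * dp)
        # Affine map from the reference interval [-1, 1] to [a, b].
        nodes.append(0.5 * (b - a) * x + 0.5 * (b + a))
        weights.append(0.5 * (b - a) * w)
    return nodes, weights


def integrate_solid_angle(g, n_mu=8, n_phi=8):
    """Integrate g(mu, phi) over the full sphere, with dOmega = dmu dphi."""
    mus, w_mu = gauss_legendre(n_mu, -1.0, 1.0)
    phis, w_phi = gauss_legendre(n_phi, 0.0, 2.0 * math.pi)
    return sum(wm * wp * g(mu, phi)
               for mu, wm in zip(mus, w_mu)
               for phi, wp in zip(phis, w_phi))


def integrate_midpoint(h, lo, hi, n):
    """Midpoint rule for a one-dimensional (frequency) integral."""
    width = (hi - lo) / n
    return width * sum(h(lo + (j + 0.5) * width) for j in range(n))


total_solid_angle = integrate_solid_angle(lambda mu, phi: 1.0)
second_moment = integrate_solid_angle(lambda mu, phi: mu * mu)
freq_integral = integrate_midpoint(lambda w: w * w, 0.0, 1.0, 1000)
```

The first two results should match the closed-form values 4π and 4π/3, and the midpoint result approaches 1/3 as the number of frequency bands grows.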

Boundary conditions

Three categories of boundary conditions are usually met in phonon transport problems, including isothermal boundary conditions, diffusely reflecting boundary conditions, and periodic boundary conditions. These boundary conditions can be applied to problems with any parameter sets μ.

An isothermal boundary absorbs all incident phonons and emits phonons in thermal equilibrium with the boundary temperature \({T}_{b}\). Mathematically, this can be expressed as

$$f({{\bf{x}}}_{b},{\bf{s}},k,p)={f}^{{\rm{eq}}}(k,p,{T}_{b}),\,{\bf{s}}\cdot {{\bf{n}}}_{b}\;>\;0,$$
(13)

where \({{\bf{n}}}_{b}\) is the unit normal vector pointing into the simulation domain.
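Eq. (13) can be sketched in code as follows, assuming that \({f}^{{\rm{eq}}}\) is the Bose-Einstein distribution and that it is evaluated directly from the angular frequency ω of the mode (k, p) rather than from k and p; the function names are hypothetical.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant, J s
K_B = 1.380649e-23      # Boltzmann constant, J/K


def f_eq(omega, temperature):
    """Bose-Einstein occupation (assumed form of f^eq) at frequency omega."""
    return 1.0 / math.expm1(HBAR * omega / (K_B * temperature))


def isothermal_emission(s, n_b, omega, t_b):
    """Boundary value of f for directions entering the domain.

    Returns f^eq(omega, T_b) when s . n_b > 0, as in Eq. (13), and None
    for outgoing directions, where Eq. (13) imposes no condition.
    """
    if sum(si * ni for si, ni in zip(s, n_b)) > 0.0:
        return f_eq(omega, t_b)
    return None


# Frequency chosen so that hbar * omega = k_B * T * ln 2, giving f^eq = 1.
omega_test = K_B * 300.0 * math.log(2.0) / HBAR
incoming = isothermal_emission((0.0, 0.0, 1.0), (0.0, 0.0, 1.0), omega_test, 300.0)
outgoing = isothermal_emission((0.0, 0.0, -1.0), (0.0, 0.0, 1.0), omega_test, 300.0)
```

In a PINN setting, a condition of this kind would typically enter the loss as a penalty on the mismatch at sampled boundary points and inward directions.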

A diffusely reflecting boundary is a type of adiabatic boundary. The net heat flux across it is zero, and incident phonons are reflected with equal probability in all possible directions, namely,

$$f({{\bf{x}}}_{b},{\bf{s}},k,p)=\frac{1}{\pi}{\int_{{\bf{s}}^{{\prime}}\cdot {{\bf{n}}}_{b} < 0}}f({{\bf{x}}}_{b},{\bf{s}}^{\prime},k,p)|{\bf{s}}^{\prime}\cdot {{\bf{n}}}_{b}|d\Omega ,\,{\bf{s}}\cdot {{\bf{n}}}_{b}\;>\;0.$$
(14)
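As a brief consistency check on the \(1/\pi\) prefactor in Eq. (14), the cosine-weighted solid angle of the incident hemisphere can be evaluated directly. Here \(\theta ^{\prime}\) is the polar angle measured from \(-{{\bf{n}}}_{b}\) and \(\varphi ^{\prime}\) is the azimuthal angle, both introduced only for this illustration:

$$\int_{{\bf{s}}^{\prime}\cdot {{\bf{n}}}_{b} < 0}|{\bf{s}}^{\prime}\cdot {{\bf{n}}}_{b}|\,d\Omega =\int_{0}^{2\pi }d\varphi ^{\prime}\int_{0}^{\pi /2}\cos \theta ^{\prime}\sin \theta ^{\prime}\,d\theta ^{\prime}=2\pi \cdot \frac{1}{2}=\pi .$$

Hence an isotropic incident distribution \(f={f}_{0}\) is reflected as \({f}_{0}\), and for each mode the outgoing flux, proportional to \(\pi\) times the reflected value of f, balances the incident flux, which is consistent with the zero net heat flux at this boundary.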

For the periodic boundary, a phonon that crosses it is re-emitted at the opposite boundary with the same velocity vector and frequency. In addition, the deviations from local thermal equilibrium at the two corresponding boundaries are equal,

$$f({{\bf{x}}}_{{b}_{1}},{\bf{s}},k,p)-{f}^{{\rm{eq}}}(k,p,{T}_{{b}_{1}})=f({{\bf{x}}}_{{b}_{2}},{\bf{s}},k,p)-{f}^{{\rm{eq}}}(k,p,{T}_{{b}_{2}}),$$
(15)

where \({{\bf{x}}}_{{b}_{1}}\), \({T}_{{b}_{1}}\) and \({{\bf{x}}}_{{b}_{2}}\), \({T}_{{b}_{2}}\) are the spatial coordinates and temperatures of two associated periodic boundaries \({b}_{1}\) and \({b}_{2}\), respectively.

Phonon dispersion and scattering

The dispersion relations of the acoustic phonons are approximated as \(\omega ={c}_{1}k+{c}_{2}{k}^{2}\)48, where for the LA branch \({c}_{1}=9.01\times {10}^{5}\) cm/s and \({c}_{2}=-2.0\times {10}^{-3}\) cm\({}^{2}\)/s, and for the TA branch \({c}_{1}=5.23\times {10}^{5}\) cm/s and \({c}_{2}=-2.26\times {10}^{-3}\) cm\({}^{2}\)/s. Matthiessen’s rule is used to estimate the effective relaxation time by combining different scattering processes62, including impurity scattering and umklapp (U) and normal (N) phonon-phonon scattering, \({\tau }^{-1}={\tau }_{{\rm{impurity}}}^{-1}+{\tau }_{{\rm{U}}}^{-1}+{\tau }_{{\rm{N}}}^{-1}={\tau }_{{\rm{impurity}}}^{-1}+{\tau }_{{\rm{NU}}}^{-1}\), where the relaxation time formulas and coefficients10 are given in Table 3. We note that the dispersion and relaxation times can also be obtained from first-principles calculations for each discrete mode63, in which case the PINN implementation remains unchanged.
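The quadratic dispersion and Matthiessen's rule translate directly into code. The sketch below uses the coefficients quoted above in CGS units (k in cm⁻¹, ω in rad/s); the group velocity follows from differentiating the dispersion, \(d\omega /dk={c}_{1}+2{c}_{2}k\). The relaxation times passed to `matthiessen` are placeholders, since the actual formulas are those of Table 3, and all function names are hypothetical.

```python
# (c1 [cm/s], c2 [cm^2/s]) for each acoustic branch, as quoted in the text.
DISPERSION = {
    "LA": (9.01e5, -2.0e-3),
    "TA": (5.23e5, -2.26e-3),
}


def omega(k, branch):
    """Angular frequency omega = c1 * k + c2 * k^2 for wave number k [1/cm]."""
    c1, c2 = DISPERSION[branch]
    return c1 * k + c2 * k * k


def group_velocity(k, branch):
    """Group velocity d(omega)/dk = c1 + 2 * c2 * k, in cm/s."""
    c1, c2 = DISPERSION[branch]
    return c1 + 2.0 * c2 * k


def matthiessen(*relaxation_times):
    """Effective relaxation time: 1/tau = sum over processes of 1/tau_i."""
    return 1.0 / sum(1.0 / tau for tau in relaxation_times)


k_example = 1.0e7                       # 1/cm, illustrative wave number
w_la = omega(k_example, "LA")           # rad/s
v_la = group_velocity(k_example, "LA")  # cm/s
# Placeholder times (seconds) for the impurity and combined N+U channels.
tau_eff = matthiessen(2.0e-10, 2.0e-10)
```

Because the scattering rates add, the effective relaxation time is always shorter than the shortest individual one; two equal channels halve it.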

Table 3 Relaxation time formulas and coefficients.