Physics-informed deep learning for three-dimensional transient heat transfer analysis of functionally graded materials

We present a physics-informed deep learning (PIDL) model for the transient heat transfer analysis of three-dimensional functionally graded materials (FGMs) employing a Runge–Kutta discrete time scheme. First, the governing equation, associated boundary conditions and the initial condition for transient heat transfer analysis of FGMs with exponential material variations are presented. Then, the deep collocation method with the Runge–Kutta integration scheme for transient analysis is introduced. The prior physics that helps to generalize the physics-informed deep learning model is introduced by constraining the temperature variable with discrete time schemes and initial/boundary conditions. Furthermore, activation functions fitted for dynamic analysis are presented. Finally, we validate our approach through several numerical examples on FGMs with irregular shapes and a variety of boundary conditions. In the numerical experiments, the PIDL predictions of both temperature and flux distributions agree well with analytical solutions and other numerical methods, and the model adapts to transient analysis of FGMs with different shapes, which makes it a promising surrogate model for transient dynamic analysis.

ture for the purpose of controlling variations in physical properties. Due to their excellent thermal properties, functionally graded materials (FGMs) have been widely used in high-temperature environments such as aerospace engineering (e.g. as thermal barrier coatings for aerospace structures [1]), microelectronics, and power generation [2].
Apart from those traditional numerical methods, machine learning offers another opportunity to solve complex partial differential equations (PDEs). Such approaches can be traced back to the seminal work of Lagaris et al. [16] in 1997. However, they gained popularity only recently, probably due to advances in machine learning techniques and associated open-source tools such as TensorFlow or PyTorch. Some recent and innovative applications of artificial neural networks are summarized as follows. Khatir et al. [17] proposed a two-stage approach to damage detection, localization and quantification in functionally graded material (FGM) plate structures: in the first stage, isogeometric analysis (IGA) is used for modelling and an improved damage indicator based on the frequency response function (FRF) identifies the damaged elements; in the second stage, an improved artificial neural network using the arithmetic optimization algorithm (IANN-AOA) quantifies the damage. Wang et al. [18] proposed an intelligent algorithm based on deep learning to recognize different types of rail profiles and achieve rapid tracking of the railhead laser stripe. Ho et al. [19] combined feedforward neural networks and the marine predator algorithm for structural health monitoring in different scenarios including a simply supported beam, a two-span continuous beam, and a laboratory free-free beam. Based on a coupled model between an artificial neural network (ANN) and the antlion optimizer (ALO), Ho et al. [20] localized damage in fixed-free plate structures using a damage identification index based on mode shape derivatives.
Raissi et al. studied physics-informed machine learning by encoding physics in the kernel matrix of Gaussian processes [21,22]. The physics-informed Gaussian processes were applied to solve linear and nonlinear differential equations. They later introduced physics-informed neural networks [23,24] for supervised learning of nonlinear partial differential equations such as the Burgers' equation or the Navier-Stokes equations; see also their recent contributions in [25]. Two distinct models were tailored for spatio-temporal datasets: continuous time and discrete time models. Their physics-informed neural networks were successfully applied to solve coupled high-dimensional forward-backward stochastic differential equations. The convergence behaviour of physics-informed neural networks was studied by Shin et al. [26]. Fu et al. [27] proposed an extrinsic approach based on physics-informed neural networks (PINNs) for solving partial differential equations on surfaces embedded in a high-dimensional space, which showed good accuracy and higher efficiency compared with the embedding approach. Karniadakis et al. [25] gave a comprehensive review of the physics-informed machine learning framework, summarized the general approaches to introducing physics into machine learning, and presented some of the latest applications.
However, for transient analysis with physics-informed neural networks, current works mostly solve the PDEs with a continuous time model, and applications are often limited to simple one-dimensional cases [28-30]. Yu et al. [30] applied PINNs and extended physics-informed neural networks (XPINNs) to steady and transient heat conduction problems in FGMs based on a continuous time scheme; for the transient analysis of 2D FGMs, however, only the radial and temporal coordinates are considered. Raissi et al. [21] pointed out that the continuous time model needs a large number of collocation points in the entire spatio-temporal domain, which makes the training prohibitively expensive, and that it typically fails to handle long-time prediction tasks [31].
In this study, we propose a physics-informed deep learning based collocation method with a discrete time scheme, which avoids extra training data from simulations, for three-dimensional transient heat conduction analysis of FGMs. We also propose fitted activation functions suitable for transient heat transfer analysis. The proposed model is then validated through several numerical examples. The remainder of this paper is organized as follows. In Sect. 2, we describe the physical model. We present our deep collocation method in Sect. 3. Section 4 contains several numerical examples to demonstrate the performance of our approach before the manuscript concludes in Sect. 5.

Transient heat transfer analysis in 3D FGMs
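The general transient diffusion equation for functionally graded materials can be written as:

$$\nabla \cdot \left( k(\mathbf{x}) \nabla T(\mathbf{x}, t) \right) = c(\mathbf{x}) \frac{\partial T(\mathbf{x}, t)}{\partial t}, \quad \mathbf{x} \in \Omega \tag{1}$$

where $T(\mathbf{x}, t)$ is the temperature function, $c$ is the specific heat, $k$ is the thermal conductivity, $\Omega$ denotes the domain and $\Gamma$ denotes its boundary. We assume that the thermal conductivity and specific heat vary exponentially in the z-direction:

$$k(z) = k_0 e^{2\beta z}, \quad c(z) = c_0 e^{2\beta z} \tag{2}$$

where $\beta$ is the so-called non-homogeneous parameter. After substituting Equation (2) into Equation (1), we obtain

$$\nabla^2 T + 2\beta \frac{\partial T}{\partial z} = \frac{1}{\alpha} \frac{\partial T}{\partial t} \tag{3}$$

with $\alpha = k_0 / c_0$. The Dirichlet boundary ($\Gamma_D$) and Neumann boundary ($\Gamma_N$) conditions are given as:

$$T(\mathbf{x}, t) = \bar{T}(\mathbf{x}, t), \quad \mathbf{x} \in \Gamma_D$$

$$q(\mathbf{x}, t) = -k(\mathbf{x}) \frac{\partial T(\mathbf{x}, t)}{\partial \mathbf{n}} = \bar{q}(\mathbf{x}, t), \quad \mathbf{x} \in \Gamma_N$$

where $\mathbf{n}$ is the unit outward normal to $\Gamma_N$. In this paper, we assume the initial temperature to be zero.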
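The substitution step can be made explicit. The following worked derivation is a sketch that assumes the exponential gradation takes the form $k = k_0 e^{2\beta z}$, $c = c_0 e^{2\beta z}$ (the factor 2 is inferred from the operator $\alpha \nabla^2 + 2\alpha\beta\,\partial/\partial z$ used later in the formulation) and writes $\alpha = k_0 / c_0$ for the diffusivity:

```latex
\begin{align*}
\nabla \cdot (k \nabla T) &= k \nabla^2 T + \nabla k \cdot \nabla T \\
  &= k_0 e^{2\beta z} \left( \nabla^2 T + 2\beta \frac{\partial T}{\partial z} \right), \\
c \frac{\partial T}{\partial t} &= c_0 e^{2\beta z} \frac{\partial T}{\partial t}, \\
\Rightarrow \quad \nabla^2 T + 2\beta \frac{\partial T}{\partial z}
  &= \frac{c_0}{k_0} \frac{\partial T}{\partial t}
   = \frac{1}{\alpha} \frac{\partial T}{\partial t}.
\end{align*}
```

The exponential factor cancels on both sides, so the graded problem reduces to a constant-coefficient equation with one additional first-order term in z.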

Physics-informed neural network using collocation method
In this section, the deep learning based collocation method that uses physics-informed deep neural networks with Runge-Kutta (RK) integration schemes is introduced. First, a series of collocation points is generated in the physical domain and on the boundaries, denoted by $\mathbf{x}_\Omega = (x_1, \ldots, x_{N_\Omega})^T$ and $\mathbf{x}_\Gamma = (x_1, \ldots, x_{N_\Gamma})^T$, respectively, which form the training dataset. Then the time-dependent heat conduction equation is discretized using the classical Runge-Kutta method with q stages.

Collocation points generation
To generate randomly distributed collocation points, various sampling strategies have been developed. The Halton and Hammersley sequences generate quasi-random points by constructing the radical inverse [32]; both are low-discrepancy sequences. Another approach is based on the Korobov lattice [33]. The Sobol sequence is a quasi-random low-discrepancy sequence for generating sampling points [34]. Latin hypercube sampling (LHS) is a statistical method in which a near-random sample of parameter values is generated from a multidimensional distribution [35]. Monte Carlo methods create sampling points by repeated random sampling [36]. In a previous study, we compared different sampling strategies for the steady-state heat conduction equations in nonhomogeneous media and found that LHS yields favourable results as the number of layers increases [37]. Therefore, LHS is selected to generate the collocation points in the transient heat transfer analysis.
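The stratification idea behind Latin hypercube sampling can be sketched in a few lines. The snippet below is a minimal illustration using only the Python standard library; the function name `latin_hypercube`, the seed and the unit-cube bounds are assumptions made for this example and are not taken from the paper's implementation.

```python
import random

def latin_hypercube(n_points, bounds, seed=0):
    """Draw n_points samples inside the box given by bounds.

    Each dimension is split into n_points equal strata and every stratum
    receives exactly one sample, which is the defining property of LHS.
    """
    rng = random.Random(seed)
    columns = []
    for lo, hi in bounds:
        # One uniformly jittered value per stratum, in the unit interval
        strata = [(i + rng.random()) / n_points for i in range(n_points)]
        # Shuffle so that strata are paired at random across dimensions
        rng.shuffle(strata)
        columns.append([lo + s * (hi - lo) for s in strata])
    # Transpose the per-dimension columns into a list of points
    return [tuple(p) for p in zip(*columns)]

# Illustrative use: 400 interior points in a unit cube
interior = latin_hypercube(400, [(0.0, 1.0)] * 3)
```

Boundary points can be obtained in the same way by sampling two coordinates and fixing the third on a face.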

Runge-Kutta methods with q stages
For a general time-dependent partial differential equation $u_t = \mathcal{N}[u]$, applying the general form of Runge-Kutta methods with q stages [21] gives the update

$$\begin{aligned} u^{n+c_i} &= u^n + \Delta t \sum_{j=1}^{q} a_{ij}\, \mathcal{N}[u^{n+c_j}], \quad i = 1, \ldots, q, \\ u^{n+1} &= u^n + \Delta t \sum_{j=1}^{q} b_j\, \mathcal{N}[u^{n+c_j}], \end{aligned} \tag{6}$$

with $u^{n+c_j}(x) = u(t^n + c_j \Delta t, x)$ for $j = 1, \ldots, q$. Depending on the choice of the parameters $\{a_{ij}, b_j, c_j\}$, an implicit or explicit time-stepping scheme is obtained. The matrix $A = [a_{ij}]$ is the Runge-Kutta matrix, and $\mathbf{b} = [b_i]$ and $\mathbf{c} = [c_i]$ are the weights and nodes, which can be arranged in a Butcher tableau as follows:

$$\begin{array}{c|c} \mathbf{c} & A \\ \hline & \mathbf{b}^T \end{array}$$

The theoretical error estimates for Runge-Kutta methods with q stages predict a temporal error accumulation of $\mathcal{O}(\Delta t^{2q})$, assuming that $\Delta t < 1$; otherwise the solution may not converge.
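To make the stage structure concrete, the sketch below applies one implicit Runge-Kutta step to the scalar test equation $u' = \lambda u$. The text does not fix a particular tableau, so a two-stage Gauss-Legendre tableau (order $2q = 4$) is assumed here for illustration; the function name `irk_step_linear` is hypothetical.

```python
import math

# Two-stage Gauss-Legendre tableau (assumed for illustration; implicit, order 4)
S3 = math.sqrt(3.0)
A = [[0.25, 0.25 - S3 / 6.0],
     [0.25 + S3 / 6.0, 0.25]]
b = [0.5, 0.5]
c = [0.5 - S3 / 6.0, 0.5 + S3 / 6.0]

def irk_step_linear(u_n, lam, dt):
    """One implicit RK step for u' = lam * u.

    The stage values U_i ~ u(t_n + c_i * dt) satisfy
        U_i = u_n + dt * sum_j a_ij * lam * U_j,
    which is a 2x2 linear system, solved here by Cramer's rule.
    """
    m11 = 1.0 - dt * lam * A[0][0]
    m12 = -dt * lam * A[0][1]
    m21 = -dt * lam * A[1][0]
    m22 = 1.0 - dt * lam * A[1][1]
    det = m11 * m22 - m12 * m21
    U1 = (u_n * m22 - m12 * u_n) / det
    U2 = (m11 * u_n - m21 * u_n) / det
    # Final update with the weights b_j
    u_next = u_n + dt * sum(bj * lam * Uj for bj, Uj in zip(b, (U1, U2)))
    return (U1, U2), u_next

# Compare one step of size 0.1 with the exact solution exp(lam * dt)
stages, u_next = irk_step_linear(1.0, -1.0, 0.1)
error = abs(u_next - math.exp(-0.1))
```

Even with only two stages the single-step error is far below the step size, which is why a scheme with many stages can take large time steps.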

Discrete time approach
In a continuous time scheme, the neural network is used to approximate the mapping $(t, x) \mapsto u(t, x)$. The training data must be generated in the whole spatio-temporal domain, which can be too costly for high-dimensional problems, especially for long time integration. In contrast, we perform the transient analysis in a discrete time scheme by placing a multi-output neural network prior on

$$\left[ u^{n+c_1}(x), \ldots, u^{n+c_q}(x), u^{n+1}(x) \right]$$

so that the neural network approximates the mapping $x \mapsto (u^{n+c_1}(x), \ldots, u^{n+c_q}(x), u^{n+1}(x))$. This prior assumption along with Equation (6) results in a physics-informed neural network that takes $x$ as an input and outputs

$$\left[ u^n_1(x), \ldots, u^n_q(x), u^n_{q+1}(x) \right]$$

with $u(t^n, x) \approx u^n_{q+1}(x)$. The architecture of the neural network in this application can be found in Fig. 1. Each colored circle is a basic computational unit, and the purple one on the output layer is the solution at time step n.
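The shape of this multi-output network can be illustrated with a small forward pass. This is a sketch only: the layer sizes, the Glorot-style initialization, the SiLU hidden activation and the names `init_mlp` and `forward` are assumptions chosen for the example, and a real implementation would use a deep learning framework with automatic differentiation.

```python
import math
import random

def init_mlp(layer_sizes, seed=0):
    """Create (weights, biases) for each layer of a fully connected network."""
    rng = random.Random(seed)
    params = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        scale = math.sqrt(2.0 / (n_in + n_out))  # Glorot-style scale
        W = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
        params.append((W, [0.0] * n_out))
    return params

def silu(v):
    return v / (1.0 + math.exp(-v))

def forward(params, x):
    """Map spatial coordinates x to the q stage values plus the next-step value."""
    h = list(x)
    for k, (W, bias) in enumerate(params):
        h = [sum(w * hj for w, hj in zip(row, h)) + bk for row, bk in zip(W, bias)]
        if k < len(params) - 1:  # no activation on the output layer
            h = [silu(v) for v in h]
    return h

q = 4  # small number of stages, for illustration only
net = init_mlp([3, 20, 20, q + 1])
out = forward(net, (0.2, 0.5, 0.7))  # q + 1 outputs for one point (x, y, z)
```

The network sees only spatial coordinates; time enters through the Runge-Kutta relations that tie the q + 1 outputs to the known state at step n.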

Activation functions
To introduce the non-linearity associated with material variations into the neural network of Fig. 1 and to enable backpropagation, an activation function σ is defined on the hidden layers. Many activation functions are available, such as the sigmoid function and the hyperbolic tangent function (Tanh), to name a few [38]. Selecting the activation function remains an open issue in many cases and is commonly a trade-off between the expressivity and the trainability of the neural network [39]. For vibration analysis, Raissi et al. [40] report that the sinusoidal activation function is more stable than Tanh. For transient analysis with PINNs using a discrete time scheme, we found that Tanh fails to converge and that the sinusoidal activation function is also unstable in many cases. In our experience, the bipolar sigmoid function [41] and the sigmoid-weighted linear unit (SiLU) function $f(x) = x \cdot \mathrm{sigmoid}(x)$ [42] yield better results in transient dynamic analysis. The two activation functions fitted for transient analysis are shown in Fig. 2. The bipolar sigmoid function is a continuous activation function whose output varies gradually within the range (−1, 1) and which resembles the hyperbolic tangent function; however, the hyperbolic tangent function has a steeper slope. The SiLU function resembles the classical ReLU activation but is smooth, unbounded above and bounded below. Small negative values can capture underlying patterns in the data, while large negative values are filtered out to maintain sparsity.
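The two activation functions can be written down directly. The sketch below assumes the standard definition of the bipolar sigmoid, $(1 - e^{-x})/(1 + e^{-x})$, which equals $\tanh(x/2)$; the helper `slope_at_zero` is a hypothetical name used only to check the slope comparison made in the text.

```python
import math

def bipolar_sigmoid(x):
    # (1 - e^{-x}) / (1 + e^{-x}), written in the equivalent form tanh(x / 2)
    return math.tanh(0.5 * x)

def silu(x):
    # Sigmoid-weighted linear unit: x * sigmoid(x)
    return x / (1.0 + math.exp(-x))

def slope_at_zero(f, h=1e-6):
    # Central finite difference, used only to compare slopes at the origin
    return (f(h) - f(-h)) / (2.0 * h)

slope_bipolar = slope_at_zero(bipolar_sigmoid)  # close to 0.5
slope_tanh = slope_at_zero(math.tanh)           # close to 1.0
# Lower bound of SiLU, sampled on [-10, 0] with step 0.01
silu_min = min(silu(i / 100.0) for i in range(-1000, 1))
```

Under this definition the bipolar sigmoid has half the slope of Tanh at the origin, and SiLU is bounded below by roughly −0.28.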

Physics-informed deep learning formulation
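Substituting the Runge-Kutta scheme of Equation (6) into Equation (3), we have

$$\begin{aligned} T^{n+c_i} &= T^n + \Delta t \sum_{j=1}^{q} a_{ij}\, \mathcal{N}[T^{n+c_j}], \quad i = 1, \ldots, q, \\ T^{n+1} &= T^n + \Delta t \sum_{j=1}^{q} b_j\, \mathcal{N}[T^{n+c_j}]. \end{aligned} \tag{10}$$

Placing a multi-output neural network prior, with the coordinates $x$ as inputs, on the temperatures $(T^{n+c_1}(x), \ldots, T^{n+c_q}(x), T^{n+1}(x))$ yields

$$\left[ T^{n+c_1}(x), \ldots, T^{n+c_q}(x), T^{n+1}(x) \right] \approx \left[ \hat{T}^{n+c_1}(x; \theta), \ldots, \hat{T}^{n+c_q}(x; \theta), \hat{T}^{n+1}(x; \theta) \right].$$

Combined with Equation (10), we can devise the physics-informed neural network that outputs $(T^n_1(x), \ldots, T^n_q(x), T^n_{q+1}(x))$, which are approximated by

$$\begin{aligned} T^n_i(x) &\approx \hat{T}^{n+c_i}(x; \theta) - \Delta t \sum_{j=1}^{q} a_{ij}\, \mathcal{N}[\hat{T}^{n+c_j}(x; \theta)], \quad i = 1, \ldots, q, \\ T^n_{q+1}(x) &\approx \hat{T}^{n+1}(x; \theta) - \Delta t \sum_{j=1}^{q} b_j\, \mathcal{N}[\hat{T}^{n+c_j}(x; \theta)], \end{aligned} \tag{12}$$

where $\mathcal{N} = \alpha \nabla^2 + 2\alpha\beta\, \partial / \partial z$ is the differential operator. The loss function constructed from the mean square error is given by

$$\mathcal{L}(\theta) = MSE_G + MSE_\Gamma$$

with

$$MSE_G = \frac{1}{N_\Omega} \sum_{j=1}^{q+1} \sum_{i=1}^{N_\Omega} \left| T^n_j(x_i) - T^n(x_i) \right|^2$$

and

$$MSE_\Gamma = MSE_{T_{\Gamma_D}} + MSE_{q_{\Gamma_N}}$$

where $MSE_{T_{\Gamma_D}}$ and $MSE_{q_{\Gamma_N}}$ are defined, with $c_{q+1} = 1$, as:

$$MSE_{T_{\Gamma_D}} = \frac{1}{N_{\Gamma_D}} \sum_{j=1}^{q+1} \sum_{i=1}^{N_{\Gamma_D}} \left| \hat{T}^{n+c_j}(x_i; \theta) - \bar{T}(x_i, t^n + c_j \Delta t) \right|^2$$

and

$$MSE_{q_{\Gamma_N}} = \frac{1}{N_{\Gamma_N}} \sum_{j=1}^{q+1} \sum_{i=1}^{N_{\Gamma_N}} \left| -k(x_i) \frac{\partial \hat{T}^{n+c_j}(x_i; \theta)}{\partial \mathbf{n}} - \bar{q}(x_i, t^n + c_j \Delta t) \right|^2.$$

The transient analysis with the discrete time PINN model is thus reduced to an optimization problem:

$$\theta^* = \arg\min_{\theta} \mathcal{L}(\theta).$$

One of the most widely used optimization methods for training physics-informed neural networks is the combined Adam-L-BFGS-B optimization algorithm. This strategy consists of training the network with the Adam optimizer first and then refining the result with the L-BFGS-B optimizer. The basic scheme of the 'discrete physics-informed deep learning' is shown in Fig. 3. A fully connected neural network with the space coordinates as inputs is first applied to approximate the temperature at the Runge-Kutta nodes and at time step $(n + 1)\Delta t$. Then, the derivatives of the temperature outputs are calculated using automatic differentiation (AD) and used to formulate the loss. The network parameters θ are learnt by minimizing the loss function.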
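The core of the discrete-time loss is the mapping of the network outputs back to q + 1 estimates of the known state at step n, each of which should match that state. The sketch below illustrates this on the scalar test equation $u' = \lambda u$ in place of the heat operator, so that the operator values can be formed without automatic differentiation. The two-stage Gauss-Legendre tableau and the function names `back_mapped_states` and `discrete_time_loss` are assumptions for illustration, and boundary terms are omitted.

```python
import math

# Two-stage Gauss-Legendre tableau, assumed here for illustration
S3 = math.sqrt(3.0)
A = [[0.25, 0.25 - S3 / 6.0],
     [0.25 + S3 / 6.0, 0.25]]
B = [0.5, 0.5]
C = [0.5 - S3 / 6.0, 0.5 + S3 / 6.0]

def back_mapped_states(stages, op_stages, u_next, dt):
    """Return the q + 1 estimates of the state at t_n implied by the outputs.

    stages[j]    : predicted stage value at t_n + c_j * dt
    op_stages[j] : differential operator applied to that prediction
    u_next       : predicted value at t_n + dt
    """
    q = len(stages)
    est = [stages[i] - dt * sum(A[i][j] * op_stages[j] for j in range(q))
           for i in range(q)]
    est.append(u_next - dt * sum(B[j] * op_stages[j] for j in range(q)))
    return est

def discrete_time_loss(stages, op_stages, u_next, u_n, dt):
    """Mean squared mismatch between the back-mapped states and the known u_n."""
    est = back_mapped_states(stages, op_stages, u_next, dt)
    return sum((e - u_n) ** 2 for e in est) / len(est)

# Use the exact solution of u' = lam * u as a stand-in for the network outputs
lam, dt, u_n = -1.0, 0.1, 1.0
stages = [u_n * math.exp(lam * cj * dt) for cj in C]
op_stages = [lam * s for s in stages]
u_exact_next = u_n * math.exp(lam * dt)
loss_exact = discrete_time_loss(stages, op_stages, u_exact_next, u_n, dt)
loss_bad = discrete_time_loss(stages, op_stages, u_exact_next + 0.1, u_n, dt)
```

The loss for the exact solution is not exactly zero because the Runge-Kutta relations hold only up to the time discretization error, but it is many orders of magnitude smaller than the loss of a perturbed prediction.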

Numerical examples
We demonstrate the presented approach through four benchmark problems; 100 stages are employed for the discrete time scheme. The first six hidden layers have 20 neurons per layer and the remaining layers have q + 1 (101) neurons per layer.

Case 1: FGMs with exponential material gradation
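Let us consider a unit cube shown in Fig. 4 where the material properties vary smoothly and continuously in the z-direction. The initial and boundary conditions are given as follows:

$$T(x, y, z; 0) = 0$$

and

$$T(x, y, 0; t) = 0, \quad T(x, y, L; t) = \bar{T}$$

with the four lateral faces insulated, where $L = 1$ is the edge length and $\bar{T}$ is the temperature prescribed on the top surface. The thermal conductivity parameter in Eq. (2) is $k_0 = 5$, the specific heat parameter is $c_0 = 1$ and the non-homogeneous parameter is $\beta = 1$. The analytical solution for this problem is given as:

$$T(x, y, z; t) = \bar{T}\, \frac{1 - e^{-2\beta z}}{1 - e^{-2\beta L}} + \sum_{n=1}^{\infty} B_n \sin\left( \frac{n \pi z}{L} \right) e^{-\beta z}\, e^{-\left( \frac{n^2 \pi^2}{L^2} + \beta^2 \right) \alpha t}$$

with

$$B_n = -\frac{2 \bar{T} e^{\beta L}}{\beta^2 L^2 + n^2 \pi^2} \left[ \beta L \sin(n\pi)\, \frac{1 + e^{-2\beta L}}{1 - e^{-2\beta L}} - n \pi \cos(n\pi) \right].$$

The collocation points shown in Fig. 4 are generated inside the cube, with 400 collocation points in the domain and 80 collocation points on the boundaries. The learning rate is set to 0.001. The predicted and analytical temperatures at time t = 0.1 s are illustrated in Fig. 5. The temperature profiles in the z-direction at different time levels ranging from 0.002 to 0.1 s for the FGM cube are shown in Fig. 6 and compared with the analytical solution. At each time level the predicted temperature coincides with the analytical solution. The flux distribution at time t = 0.005 s is shown in Fig. 7 and agrees well with the analytical solution. The relative error between the predicted temperature and the analytical solution at time t = 0.1 s inside the whole functionally graded cube is 3.661e−04.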
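For reference, a series solution of this type can be evaluated numerically as sketched below. The sketch assumes zero temperature at z = 0, a prescribed temperature on the top face z = L, insulated lateral faces and a zero initial state. The argument `T_top` is a placeholder for the prescribed top-surface temperature, whose value is not given here, and the function names and the truncation at 200 terms are likewise assumptions.

```python
import math

def steady_state(z, T_top, beta, L):
    """Long-time limit of the temperature profile along z."""
    return T_top * (1.0 - math.exp(-2.0 * beta * z)) / (1.0 - math.exp(-2.0 * beta * L))

def analytical_temperature(z, t, T_top, beta=1.0, k0=5.0, c0=1.0, L=1.0, n_terms=200):
    """Truncated series solution; defaults follow the Case 1 parameters."""
    alpha = k0 / c0
    ratio = (1.0 + math.exp(-2.0 * beta * L)) / (1.0 - math.exp(-2.0 * beta * L))
    total = steady_state(z, T_top, beta, L)
    for n in range(1, n_terms + 1):
        npi = n * math.pi
        # The sin(n*pi) term vanishes for integer n but is kept to mirror the formula
        B_n = -(2.0 * T_top * math.exp(beta * L) / (beta ** 2 * L ** 2 + npi ** 2)) * (
            beta * L * math.sin(npi) * ratio - npi * math.cos(npi))
        decay = math.exp(-(npi ** 2 / L ** 2 + beta ** 2) * alpha * t)
        total += B_n * math.sin(npi * z / L) * math.exp(-beta * z) * decay
    return total

# Temperature at mid-height for a unit top temperature (placeholder value)
T_early = analytical_temperature(0.5, 0.01, T_top=1.0)
T_late = analytical_temperature(0.5, 0.1, T_top=1.0)
T_inf = steady_state(0.5, 1.0, 1.0, 1.0)
```

At mid-height the temperature rises monotonically in time towards the steady-state profile, which provides a quick sanity check on both the series and any surrogate prediction.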

Case 2: FGMs with time-dependent boundary condition
For this unit cube, the top surface is subjected to the time-dependent boundary condition $T(x, y, 1; t) = 10t$ and all other surfaces are insulated. With the zero initial temperature assumed above, the initial and boundary conditions are given as

$$T(x, y, z; 0) = 0, \quad T(x, y, 1; t) = 10t, \quad q = 0 \ \text{on all other surfaces},$$

and the results are compared with the reference solutions of [43]. The temperature and flux contours at time t = 1 s can be found in Fig. 9. Compared with BEM and FEM, the deep collocation method is easier to implement because no elaborate grids need to be built. Once the deep learning model is trained, it can predict the temperature and flux distributions in seconds while maintaining the same level of accuracy, which makes it a suitable surrogate model for traditional numerical methods.

Case 3: FGMs with Irregular domain
Let us now consider a problem with an irregular domain as shown in Fig. 10, with inner and outer radii $r_1$ and $r_2$. The boundary conditions are

$$\begin{cases} q(r_2, \theta, z; t) = 0; \quad q(r_1, \theta, z; t) = 0 \\ q(r, \frac{\pi}{3}, z; t) = 0; \quad q(r, 0, z; t) = 0 \\ T(r, \theta, 0.01; t) = 1; \quad T(r, \theta, 0; t) = 0 \end{cases} \tag{26}$$

where $r_1 \le r \le r_2$ and $0 \le \theta \le \frac{\pi}{3}$. First, we generate random collocation points inside the irregular physical domain as shown in Fig. 11, with 400 collocation points inside the domain and 200 collocation points on the boundaries. Figure 12 depicts the temperature profile along the z-direction at t = 0.00001 s, compared with the reference solutions from MFS and FEM [43]. Table 1 shows the profile of the overall temperature distribution. The results agree well with those from [43]. The gradation of the temperature along the z-axis matches the material gradation of the FGM.
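Because no mesh is needed, points in a curved domain can be drawn directly in cylindrical coordinates. The sketch below uses plain uniform random sampling rather than Latin hypercube sampling, to keep the coordinate mapping visible. The radii `R1` and `R2` are placeholder values, since the actual radii are not given here, and the function name `sample_sector` is hypothetical.

```python
import math
import random

def sample_sector(n_points, r1, r2, theta_max=math.pi / 3.0, height=0.01, seed=0):
    """Uniform random points in the sector r1 <= r <= r2, 0 <= theta <= theta_max."""
    rng = random.Random(seed)
    points = []
    for _ in range(n_points):
        # Square-root transform keeps the density uniform per unit area
        r = math.sqrt(r1 ** 2 + rng.random() * (r2 ** 2 - r1 ** 2))
        theta = rng.random() * theta_max
        z = rng.random() * height
        points.append((r * math.cos(theta), r * math.sin(theta), z))
    return points

R1, R2 = 1.0, 1.5  # placeholder radii, not the values used in Case 3
pts = sample_sector(400, R1, R2)
```

Sampling the radius uniformly instead of through the square-root transform would cluster points near the inner radius, so the transform matters whenever the sector is not thin.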

Case 4: Functionally graded rotor problem
Finally, we study a functionally graded rotor with eight holes, presented in Fig. 13. Due to symmetry, only one-eighth of the rotor is analyzed. The geometric parameters of this rotor are marked in Fig. 13, where the lines with arrows indicate lengths: the inner radius is $R_{inner} = 0.3$, the outer radius is $R_{outer} = 0.5$ and the height is 0.1; the diameter of the mounting hole is $Dia_{hole} = 0.075$. The thermal conductivity and specific heat parameters in Equation (2) are $k_0 = 5$ and $c_0 = 1$, and the non-homogeneous parameter is $\beta = 1.5$. The initial condition is

$$T(r, \theta, z; 0) = 0 \tag{27}$$

and temperature boundary conditions are imposed on the inner surface (0 K) and the outer surface (100 K), while the other surfaces are adiabatic, with the heat flux set to 0. Figure 14 shows the collocation points for training. Our results are compared with the results of ABAQUS simulations. The temperature profile along the right top edge at time t = 0.0066 s is shown in Fig. 15 and the temperature distributions at different time levels are listed in Table 2. The predicted temperature at specific locations and times matches the ABAQUS results well. The same holds for the evolution of the temperature distribution obtained with both numerical methods.

Conclusion
We presented a deep learning based collocation method for transient heat transfer analysis of three-dimensional functionally graded materials (FGMs). This deep collocation method combines the classical collocation method and deep learning in one framework, which is easy to implement and does not require elaborate grids. In the deep learning model, a physics-informed neural network is combined with a q-stage Runge-Kutta discrete time scheme for transient heat transfer analysis. Nonlinear activation functions fitted for dynamic analysis are adopted to introduce nonlinearity into the neural network. We found the bipolar sigmoid function and the sigmoid-weighted linear unit (SiLU) function most suitable for the physics-informed neural network in transient analysis. Based on our previous study, Latin hypercube sampling is selected for random sampling of the collocation points, which makes the proposed deep collocation method truly "meshfree", so that it can deal with irregularly shaped domains easily.
Various numerical examples were studied to validate the performance of the proposed method, including FGMs with an irregular shape and heat conduction with a variety of boundary conditions. The numerical results show that both the temperature and the flux inside FGMs predicted by the deep collocation method with the discrete time scheme and fitted activation functions agree well with analytical solutions and other classical numerical methods. The physics-informed deep collocation method is therefore a promising surrogate model for FEM in dynamic analysis.