Analysis of three-dimensional potential problems in non-homogeneous media with a physics-informed deep collocation method using material transfer learning and sensitivity analysis

In this work, we present a deep collocation method (DCM) for three-dimensional potential problems in non-homogeneous media. The approach employs a physics-informed neural network with material transfer learning, reducing the solution of the non-homogeneous partial differential equations to an optimization problem. We tested different configurations of the physics-informed neural network, including smooth activation functions, sampling methods for collocation point generation, and combined optimizers. A material transfer learning technique is applied to non-homogeneous media with different material gradations and parameters, which enhances the generality and robustness of the proposed method. To identify the most influential parameters of the network configuration, we carried out a global sensitivity analysis. Finally, we provide a convergence proof of our DCM. The approach is validated through several benchmark problems, also testing different material variations.

Nomenclature: relative error — measure of the model accuracy; $E_a$ — analytical solution; $E_{pred}$ — predicted solution; $\|\cdot\|$ — $L^2$-norm.

Introduction
Recent years have witnessed the rapidly growing application of neural networks in physics. This is partly due to the fact that, by training a neural network, high-dimensional raw data can be converted to low-dimensional codes [1], and thus high-dimensional PDEs can be solved directly using a 'meshfree' deep learning algorithm, which improves computational efficiency and reduces problem complexity. The deep learning method deploys a deep neural network architecture with nonlinear activation functions, which introduce the nonlinearity the system as a whole needs for learning nonlinear patterns. This lends credence to the application of physics-informed machine learning in discovering the physics behind potential problems in non-homogeneous media, which constitute a wide range of problems in physics and engineering.
The current wave of deep learning started around 2006, when Hinton et al. [2,3] introduced deep belief nets and unsupervised learning procedures that could create layers of feature detectors without the need for labelled data. Equipped with a deep learning model, information can be extracted from complicated raw input data with multiple levels of abstraction through a layer-by-layer process [4]. Various variants such as the multilayer perceptron (MLP), convolutional neural networks (CNN) and recurrent/recursive neural networks (RNN) [5] have been developed and applied to, e.g., image processing [6,7], object detection [8,9], speech recognition [10,11], biology [12,13] and even finance [14,15]. Over the past decade, deep learning has been widely used in applications due to the high performance it has demonstrated. Deep learning can learn features from data automatically, and these features can be used to approximate solutions to differential equations [16], which casts light on the possibility of using deep learning as a function approximator.
Artificial neural networks (ANNs) stand at the center of the deep learning revolution. They can be traced back to the 1940s [17], but became especially popular in the past few decades due to the vast development in computational power and sophisticated machine learning algorithms, such as the backpropagation technique, and advances in deep neural networks. Due to the simplicity and feasibility of ANNs in dealing with nonlinear and multi-dimensional problems, they were applied in inference and identification by data scientists [18]. They were also adopted to solve partial differential equations (PDEs) [19-21], but shallow ANNs are unable to learn complex nonlinear patterns effectively. With improved theories incorporating unsupervised pre-training, stacks of autoencoder variants, and deep belief nets, deep learning with enhanced learning abilities can also serve as an interesting alternative to classical methods such as the FEM.
According to the universal approximation theorem [22,23], any continuous function can be approximated by a feedforward neural network with a single hidden layer. However, the number of neurons in the hidden layer tends to increase exponentially with increasing complexity and nonlinearity of a model. Recent studies show that DNNs render better approximations for nonlinear functions [24]. Several researchers have employed deep learning for the solution of PDEs. E et al. developed a deep learning-based numerical method for high-dimensional parabolic PDEs and backward-forward stochastic differential equations [25,26]. Raissi et al. [27] introduced physics-informed neural networks for supervised learning of nonlinear partial differential equations. Beck et al. [28] employed deep learning to solve nonlinear stochastic differential equations and Kolmogorov equations. Sirignano and Spiliopoulos [29] provided a theoretical proof for deep neural networks as PDE approximators and concluded that the approximation converges as the number of hidden units tends to infinity. Karniadakis et al. [30] gave an overview of physics-informed machine learning and introduced applications of physics-informed neural networks in various fields such as fluid mechanics. For modelling problems in solid mechanics, we presented a Deep Collocation Method (DCM) [31,32]. Based on the DCM, we proposed a stochastic deep collocation method with a neural architecture search strategy for stochastic flow analysis in heterogeneous media and found that physics-informed deep learning models can account for stochastic disturbances/uncertainties efficiently and stably [33]. Rather than using the strong form of the boundary value problem with its higher-order derivatives, we also presented a Deep Energy Method (DEM) [34-37] that constrains the total potential energy in the loss. With the maturity of physics-informed deep learning models, they can be further applied to practical engineering problems.
Problems of potential represent a broad category of physical and engineering problems. Physical parameters in potential problems, for example heat conductivity, permeability, permittivity, resistivity and magnetic permeability, tend to have a spatial distribution and can vary with respect to one or more coordinates. To deal with such problems, the non-homogeneous problem is translated into a homogeneous one for certain classes of material variations. The steady-state heat conduction analysis of functionally graded materials (FGMs) is representative of this class of potential problems. Due to the inherent mathematical difficulties, closed-form solutions exist only for a few simple cases. Traditional powerful methods, such as the finite element method (FEM), the boundary element method (BEM), the method of fundamental solutions (MFS) and the dual reciprocity method (DRM), have been used to solve potential problems [38,39]. The 'meshfree' physics-informed neural networks offer a novel and robust approach for discovering the nonlinear patterns behind potential problems, especially in higher dimensions.
In practice, the learning ability of deep neural networks strongly depends on the network configuration and the optimization algorithm, including the activation function, the numbers of neurons and layers, the weight initialization method, the number of iterations, and so on. In this paper, we therefore compare different parameters to offer suggestions on the choice of a favourable configuration for the physics-informed neural network. Moreover, to increase the generality and robustness of the physics-informed deep-learning-based collocation method, a material transfer learning technique is integrated into the model, which reduces the computational cost for different material variation types and helps improve the numerical results. Further, to unveil the influential parameters of the proposed model, a global sensitivity analysis is supplemented, which is instructive for setting up physics-informed neural networks.
The paper is organized as follows: First, the three-dimensional potential problem in non-homogeneous media is presented. Then we introduce the physics-informed deep-learning-based collocation method, covering the neural network architecture, activation functions, sampling methods, a convergence proof, material transfer learning and sensitivity analysis. Subsequently, a comprehensive set of numerical examples is presented, investigating different neural network configurations, material transfer learning and model sensitivity. Finally, the effectiveness of the deep learning method for solving three-dimensional potential problems in non-homogeneous media is demonstrated.

The governing equation for 3D problems of potential
The general partial differential equation for a potential function $\phi$ defined on a region $\Omega$ bounded by a surface $\tau$, with an outward normal $\mathbf{n}$, can be written as:

$$\nabla \cdot \left( k(\mathbf{x}) \, \nabla \phi(\mathbf{x}) \right) = 0, \quad \mathbf{x} \in \Omega, \tag{1}$$

where $k$ is a position-dependent material function. Equation (1) is the field equation for a wide range of problems in physics and engineering such as heat transfer, incompressible flow, gravity fields, shaft torsion, electrostatics and magnetostatics, some of which are shown in Table 1 [40]. The Dirichlet ($\tau_D$) and Neumann ($\tau_N$) boundary conditions are given as:

$$\phi(\mathbf{x}) = \bar{\phi}(\mathbf{x}) \;\; \text{on } \tau_D, \qquad q(\mathbf{x}) = -k(\mathbf{x}) \frac{\partial \phi}{\partial n} = \bar{q}(\mathbf{x}) \;\; \text{on } \tau_N, \tag{2}$$

where $\mathbf{n}$ is the unit outward normal to $\tau_N$. The material properties of functionally graded materials (FGMs) vary gradually in space. Classical variations of $k(\mathbf{x})$ take the form $k(\mathbf{x}) = k_0 f(\mathbf{x})$, with $k_0$ denoting a reference value and $f(\mathbf{x})$ the material property variation function. Among the most common variation functions are the quadratic, exponential and trigonometric ones, e.g. for grading in the $z$-direction,

$$f(z) = (1 + \beta z)^2, \qquad f(z) = e^{2\beta z}, \qquad f(z) = \left( \cos(\beta z) + \gamma \sin(\beta z) \right)^2,$$

with grading parameters $\beta$ and $\gamma$. The governing equations for the different material variations in the $z$-direction are summarized in Table 2.
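For concreteness, the following minimal Python sketch implements these three variation profiles. The grading parameters `k0`, `beta` and `gamma` are illustrative placeholders, not the benchmark values used later (those appear in Table 4).

```python
import numpy as np

# Hypothetical grading parameters; benchmark values appear later in Table 4.
def k_quadratic(z, k0=1.0, beta=1.0):
    return k0 * (1.0 + beta * z) ** 2

def k_exponential(z, k0=1.0, beta=1.0):
    return k0 * np.exp(2.0 * beta * z)

def k_trigonometric(z, k0=1.0, beta=1.0, gamma=1.0):
    return k0 * (np.cos(beta * z) + gamma * np.sin(beta * z)) ** 2
```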

Physics-informed deep learning based collocation method

Feedforward neural network
The basic architecture of a fully connected feedforward neural network is shown in Figure 1. It comprises multiple layers: an input layer, one or more hidden layers and an output layer. Each layer consists of one or more nodes called neurons, shown in Figure 1 as small coloured circles. In this interconnected structure, every two neurons in neighbouring layers are connected, where the weight between neuron $k$ in hidden layer $l-1$ and neuron $j$ in hidden layer $l$ is denoted by $w_{jk}^l$, see Figure 1. No connections exist between neurons within the same layer or between non-neighbouring layers. Input data, $x_1$ to $x_N$, flow through the network via the connections between neurons, starting from the input layer, through the hidden layers $l-1$ and $l$, to the output layer, which finally outputs $y_1$ to $y_M$.
The activation function is applied to the output of each neuron in order to introduce nonlinearity into the neural network and to make backpropagation possible, where gradients are supplied along with an error to update weights and biases. The activation function in layer $l$ is denoted by $\sigma$. Many activation functions $\sigma$ have been proposed for inference and identification with neural networks, such as the sigmoid function [41], the hyperbolic tangent ($\tanh$) [41] and rectified linear units (ReLU), to name a few. Some recent smooth activation functions, such as Swish [42], LeCun's Tanh [41], the bipolar sigmoid [41], Mish [42] and Arctan [43], listed in Table B1 of Appendix B, are studied and compared in the numerical examples. All selected activation functions must be sufficiently smooth to avoid vanishing gradients during backpropagation, since the governing equation introduced in the loss contains second-order derivatives of the field variable. The value at each neuron in the hidden and output layers is obtained by adding the bias to the weighted sum of the outputs of the previous layer. The weighted input of neuron $j$ in layer $l$ is defined as

$$z_j^l = \sum_k w_{jk}^l \, y_k^{l-1} + b_j^l,$$

and its output is given by the activation of this weighted input,

$$y_j^l = \sigma\!\left(z_j^l\right),$$

where $y_k^{l-1}$ is the output of neuron $k$ in the previous layer. Based on the previous derivation and description, we can draw a definition that will be used in Section 3.3: the neural network as a whole defines a continuous bounded function mapping $\mathbb{R}^d$ to $\mathbb{R}^n$,

$$F: \mathbb{R}^d \rightarrow \mathbb{R}^n, \quad F(\mathbf{x}; \theta) = \sigma \circ \left( W \mathbf{x} + \mathbf{b} \right),$$

where $d$ indicates the dimension of the inputs, $n$ the number of field variables, $\theta = \{W; \mathbf{b}\}$ comprises the hyperparameters, i.e. the weights and biases, and $\circ$ denotes the element-wise operator.
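The following minimal NumPy sketch illustrates this forward pass for a fully connected network with a linear output layer; the layer sizes are arbitrary placeholders, not the configurations studied below.

```python
import numpy as np

def forward(x, weights, biases, sigma=np.tanh):
    """Forward pass: z^l = W^l y^{l-1} + b^l, y^l = sigma(z^l); linear output layer."""
    y = x
    for W, b in zip(weights[:-1], biases[:-1]):
        y = sigma(W @ y + b)                     # hidden layers
    return weights[-1] @ y + biases[-1]          # output layer (no activation)

# Example: a 3-50-50-1 network with random parameters, evaluated at one 3D point.
rng = np.random.default_rng(0)
sizes = [3, 50, 50, 1]
weights = [rng.standard_normal((m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
biases = [rng.standard_normal(m) for m in sizes[1:]]
phi = forward(np.array([0.5, 0.5, 0.5]), weights, biases)
```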
The universal approximation theorem [22,23] states that this continuous bounded function $F$ with nonlinear activation $\sigma$ can be adopted to capture the nonlinear behaviour of the system, in our case the potential problem. With this definition, the class $\mathcal{F}^n$ of such neural network functions, used in Section 3.3, can be defined following [44].

Backpropagation
Backpropagation (backward propagation of errors) can be used to train multilayer feedforward networks by computing the gradient of a loss function and searching for its minimum. The backward (output-to-input) flow determines how each weight is adjusted, as shown in Figure 2.
Backpropagation is based on the chain rule, which is used to calculate the derivative of the loss function with respect to the weights of the network. The governing equation in our problem requires the second partial derivatives of the potential function $\phi(\mathbf{x})$. In order to find the weights and biases, a loss function $Loss(f, \theta)$ is defined. The backpropagation algorithm for computing the gradient of $Loss(f, \theta)$ with respect to the weight coefficients $w$ and the neuron biases $b$ can be written as follows:

Feedforward: for $l$ from 2 to $L$, compute $z^l = w^l a^{l-1} + b^l$ and $a^l = \sigma(z^l)$.
Output error: compute $\delta^L = \nabla_a Loss \odot \sigma'(z^L)$.
Backpropagate the error: for $l$ from $L-1$ to 2, compute $\delta^l = \left( (w^{l+1})^T \delta^{l+1} \right) \odot \sigma'(z^l)$.
Output the gradient: $\partial Loss / \partial w_{jk}^l = a_k^{l-1} \delta_j^l$ and $\partial Loss / \partial b_j^l = \delta_j^l$.
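In practice these derivatives need not be coded by hand: automatic differentiation provides the first- and second-order derivatives of the network output required by the governing equation. A minimal TensorFlow sketch, assuming a Keras `model` mapping coordinates $(x, y, z)$ to $\phi$, could look as follows.

```python
import tensorflow as tf

def second_derivatives(model, x):
    """Gradient and Hessian of phi = model(x) w.r.t. the coordinates x (shape [N, 3])."""
    with tf.GradientTape() as outer:
        outer.watch(x)
        with tf.GradientTape() as inner:
            inner.watch(x)
            phi = model(x)                       # [N, 1]
        grad = inner.gradient(phi, x)            # [N, 3]: phi_x, phi_y, phi_z
    hess = outer.batch_jacobian(grad, x)         # [N, 3, 3] Hessian per point
    return grad, hess
```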

Physics-informed deep learning based collocation method
To train the network, we place collocation points in the physical domain and on the boundaries, denoted by $\mathbf{x}_\Omega = (x_1, \ldots, x_{N_\Omega})^T$ and $\mathbf{x}_\Gamma = (x_1, \ldots, x_{N_\Gamma})^T$, respectively. The potential function $\phi$ is then approximated with the aforementioned deep feedforward neural network $\phi^h(\mathbf{x}; \theta)$, and a loss function related to the underlying BVP is constructed. Substituting $\phi^h(\mathbf{x}_\Omega; \theta)$ into the governing equation, we obtain

$$G(\mathbf{x}_\Omega; \theta) = \nabla \cdot \left( k(\mathbf{x}_\Omega) \, \nabla \phi^h(\mathbf{x}_\Omega; \theta) \right),$$

which results in a physics-informed deep neural network $G(\mathbf{x}_\Omega; \theta)$. The boundary conditions illustrated in Section 2 can also be expressed through the neural network: $q^h(\mathbf{x}_{\Gamma_N}; \theta)$ is obtained from Equation (2) by combining it with $\phi^h(\mathbf{x}_{\Gamma_N}; \theta)$. Note that the induced physics-informed neural networks $G(\mathbf{x}; \theta)$ and $q(\mathbf{x}; \theta)$ share the same parameters as $\phi^h(\mathbf{x}; \theta)$. Considering the generated collocation points in the domain and on the boundaries, the parameters can be learned by minimizing the mean squared error loss function [45]:

$$Loss(\theta) = MSE_G + MSE_{\Gamma_D} + MSE_{\Gamma_N}, \tag{10}$$

with

$$MSE_G = \frac{1}{N_\Omega} \sum_{i=1}^{N_\Omega} \left\| G(\mathbf{x}_i; \theta) \right\|^2, \quad MSE_{\Gamma_D} = \frac{1}{N_{\Gamma_D}} \sum_{i=1}^{N_{\Gamma_D}} \left\| \phi^h(\mathbf{x}_i; \theta) - \bar{\phi}(\mathbf{x}_i) \right\|^2, \quad MSE_{\Gamma_N} = \frac{1}{N_{\Gamma_N}} \sum_{i=1}^{N_{\Gamma_N}} \left\| q^h(\mathbf{x}_i; \theta) - \bar{q}(\mathbf{x}_i) \right\|^2.$$

The minimizer of this loss is then a solution to the potential problem. The defined loss function measures how well the approximation satisfies the physical law (governing equation) and the boundary conditions. Our goal is to find a set of parameters $\theta$ such that the approximation $\phi^h(\mathbf{x}; \theta)$ minimizes the loss,

$$\theta^* = \arg\min_{\theta} Loss(\theta);$$

if $Loss$ attains a very small value, $\phi^h(\mathbf{x}; \theta)$ satisfies the governing equation and boundary conditions very closely. The solution of heat conduction problems by the deep collocation method is thereby reduced to an optimization problem. In the TensorFlow deep learning framework, a variety of optimizers are available; one of the most widely used is the Adam optimization algorithm, which is also adopted in this numerical study. The idea is to take a descent step at collocation point $\mathbf{x}_i$ with an Adam-based learning rate $\eta_i$,

$$\theta_{i+1} = \theta_i - \eta_i \nabla_{\theta} Loss\left(\mathbf{x}_i; \theta_i\right), \tag{13}$$

and the process in Equation (13) is repeated until a convergence criterion is satisfied.
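Continuing the sketch above, a compact TensorFlow version of this loss and an Adam training step could read as follows. The `second_derivatives` helper is the one sketched earlier; the conductivity `k`, its derivative `dk_dz`, the boundary data and the collocation tensors are assumed to be supplied by the user, and only the Dirichlet term is written out (a Neumann term would be built analogously from the flux).

```python
def loss_fn(model, x_dom, x_dir, phi_bar, k, dk_dz):
    # PDE residual of div(k grad phi) = 0 with z-graded conductivity:
    # k * (phi_xx + phi_yy + phi_zz) + dk/dz * phi_z
    grad, hess = second_derivatives(model, x_dom)
    lap = tf.linalg.trace(hess)[:, None]
    z = x_dom[:, 2:3]
    residual = k(z) * lap + dk_dz(z) * grad[:, 2:3]
    mse_g = tf.reduce_mean(tf.square(residual))
    # Dirichlet mismatch on the boundary collocation points
    mse_d = tf.reduce_mean(tf.square(model(x_dir) - phi_bar))
    return mse_g + mse_d

optimizer = tf.keras.optimizers.Adam(learning_rate=1e-3)

def train_step(model, x_dom, x_dir, phi_bar, k, dk_dz):
    with tf.GradientTape() as tape:
        loss = loss_fn(model, x_dom, x_dir, phi_bar, k, dk_dz)
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss
```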

Convergence of deep collocation method for non-homogeneous PDEs
With the universal approximation theorem of neural networks, a feedforward neural network is used to approximate the potential function as φ h (x; θ).
The approximation power of neural networks for quasilinear parabolic PDEs was shown by Sirignano et al. [29]. For non-homogeneous elliptic PDEs, the convergence study can be reduced to the following assumptions (Theorem 1): the non-homogeneous PDE has a unique solution $\phi \in C^2(\Omega)$ with uniformly bounded derivatives, and the conductivity function $k(\mathbf{x})$ is $C^{1,1}$ ($C^1$ with Lipschitz continuous derivative).
Theorem 2: Assume that $\Omega$ is compact and consider measures $\ell_1$, $\ell_2$ and $\ell_3$ whose supports are contained in $\Omega$, $\Gamma_D$ and $\Gamma_N$, respectively. Furthermore, assume that the governing Equation (1) subject to (2) has a unique classical solution and that the material function $k(\mathbf{x})$ is $C^{1,1}$ ($C^1$ with Lipschitz continuous derivative). Then, for every $\varepsilon > 0$ there exists a constant $K > 0$, which may depend on $\sup_\Omega \|\phi\|$ and $\sup_\Omega \|k\|$, such that there exists a neural network $\phi^h \in \mathcal{F}^n$ with $Loss(\phi^h) \leq K \varepsilon$.

Proof: For the governing Equation (1) subject to (2) and according to Theorem 1, for every $\varepsilon > 0$ there exists $\phi^h \in \mathcal{F}^n$ whose value and derivatives up to second order approximate those of $\phi$ uniformly within $\varepsilon$ on $\Omega$. Recalling that the loss is constructed by Equation (10), applying the triangle inequality to $MSE_G$ and using the $C^{1,1}$ regularity of the conductivity function $k(\mathbf{x})$, the interior residual is bounded by a constant multiple of $\varepsilon$. On the boundaries $\Gamma_D$ and $\Gamma_N$, the mismatch terms are bounded analogously. Combining these bounds, we obtain $Loss(\phi^h) \to 0$ as $n \to \infty$.

With Theorem 2 and the condition that $\Omega$ is a bounded open subset of $\mathbb{R}^d$, with $\phi^h \in \mathcal{F}^n \subset L^2(\Omega)$ for all $n \in \mathbb{N}^+$, it can be concluded from Sirignano et al. [29] that:

Theorem 3: For all $p < 2$, $\phi^h \in \mathcal{F}^n$ converges to $\phi$ strongly in $L^p(\Omega)$ as $n \to \infty$, with $\phi$ being the unique solution to the potential problem.
In summary, for feedforward neural networks $\mathcal{F}^n \subset L^p(\Omega)$ ($p < 2$), the approximated solution $\phi^h \in \mathcal{F}^n$ converges to the solution of the non-homogeneous PDE.

Collocation points generation
Model training is an important process in machine learning, and the quality of the training dataset determines the reliability of the machine learning model to a large extent. The deep collocation method (DCM) utilizes physics-informed neural networks for solving PDEs with randomly generated training points in the physical domain. In order to test the influence of the training points on stability and accuracy, different sampling methods are compared. The Halton and Hammersley sequences generate points by constructing the radical inverse [46]; both are low-discrepancy sequences. The Korobov method creates samples from Korobov lattice point sets [47]. The Sobol sequence is a quasi-random low-discrepancy sequence [48]. Latin hypercube sampling (LHS) is a statistical method in which a near-random sample of parameter values is generated from a multidimensional distribution [49]. Monte Carlo methods create points by repeated random sampling [50]. Distribution plots of the different sampling points inside a cube are shown in Table B2 of Appendix B.
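Several of these sequences are available off the shelf, for example in SciPy's `stats.qmc` module; a minimal sketch for generating collocation points in the unit cube might look as follows (Hammersley and Korobov sets are not in SciPy and would require another library such as chaospy).

```python
import numpy as np
from scipy.stats import qmc

n, d = 1024, 3  # number of collocation points, spatial dimension

halton = qmc.Halton(d=d, seed=0).random(n)
sobol = qmc.Sobol(d=d, seed=0).random(n)         # n a power of two keeps the set balanced
lhs = qmc.LatinHypercube(d=d, seed=0).random(n)
monte_carlo = np.random.default_rng(0).random((n, d))
# Each array has shape (n, 3) with points in [0, 1]^3, ready to scale to the domain.
```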

Material transfer learning
In order to improve the generality and robustness of the DCM, transfer learning is exploited, which makes use of the information from an already trained model and thus allows training with less data and a reduced training time. The basic idea is sketched in Figure 4. For different material variations in non-homogeneous media, the 'knowledge' of one material model is set up as the pretrained model, resulting in a two-stage paradigm. In the first stage, pretraining, the network is trained on a large dataset and over longer iterations for one material variation type. In the second stage, fine-tuning, the pretrained model is trained on other material variations with less data and fewer epochs. Consequently, the weights, biases and network configuration of a trained model are passed on to other, related models.
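A minimal sketch of this two-stage paradigm in Keras could look as follows; `loss_exponential` and `loss_quadratic` are hypothetical loss closures (one per material variation, e.g. built from the `loss_fn` above), and the epoch counts are placeholders.

```python
import tensorflow as tf

def build_model():
    return tf.keras.Sequential([
        tf.keras.layers.Dense(50, activation="tanh", input_shape=(3,)),
        tf.keras.layers.Dense(50, activation="tanh"),
        tf.keras.layers.Dense(1),
    ])

def train(model, loss_closure, epochs, lr=1e-3):
    opt = tf.keras.optimizers.Adam(lr)
    for _ in range(epochs):
        with tf.GradientTape() as tape:
            loss = loss_closure(model)
        opt.apply_gradients(zip(tape.gradient(loss, model.trainable_variables),
                                model.trainable_variables))

# Stage 1: pretrain on one material variation with a large budget.
model = build_model()
train(model, loss_exponential, epochs=5000)      # loss_exponential: hypothetical closure
model.save_weights("pretrained.h5")

# Stage 2: warm-start from the pretrained weights and fine-tune
# on a different material variation with a much smaller budget.
model.load_weights("pretrained.h5")
train(model, loss_quadratic, epochs=500)         # loss_quadratic: hypothetical closure
```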

Sensitivity analysis
Algorithm-specific parameters, such as the neural architecture configuration, optimizer-related parameters and the number of collocation points, significantly influence the model's accuracy. To quantify their influence on the accuracy, a global sensitivity analysis (GSA) is performed. Classical GSA techniques include regression methods, screening approaches such as the Morris method [51], variance-based measures such as Sobol's method [52], and the Fourier amplitude sensitivity test (FAST) [53] along with its extended variant (eFAST) [54].
Variance-based methods are usually more computationally expensive than derivative-based and regression methods. If the model is expensive to evaluate or the number of parameters is large, variance-based methods can be costly. The Morris method is generally robust in correctly screening the most and least sensitive parameters of a highly parameterized model, with up to 300 times fewer model evaluations than Sobol's method [55]. Therefore, the computational cost of a sensitivity analysis can be reduced by first performing parameter screening with the Morris method to identify non-influential parameters, thereby reducing the dimension of the parameter space to be studied, and then analysing the remaining parameters with the eFAST method. In this way, the effects of the inputs can be quantified accurately in a comparatively short time.

Method of Morris
The method of Morris [56] is a screening technique that ranks the importance of parameters by averaging coarse finite-difference quotients termed elementary effects. Given a model with $n$ parameters and $X = (X_1, X_2, \ldots, X_n)$ denoting the vector of parameter values, we specify an objective function $y(X) = f(X_1, X_2, \ldots, X_n)$, perturb the variables $X_i$ over specified ranges, and then calculate the distribution of elementary effects (EE) of each input factor with respect to the model output, i.e.

$$EE_i = \frac{f(X_1, \ldots, X_i + \Delta, \ldots, X_n) - f(X)}{\Delta}, \tag{20}$$
where $f(X)$ is evaluated at the prior point of the trajectory and $\Delta$ is the step size. Using the single trajectory in Equation (20), the elementary effects of all parameters can be calculated with $n + 1$ model evaluations. After sampling $r$ trajectories, the absolute values of the resulting elementary effects are averaged to obtain the total-order sensitivity measure of the $i$-th parameter,

$$\mu_i^* = \frac{1}{r} \sum_{j=1}^{r} \left| EE_i^{(j)} \right|. \tag{21}$$

Similarly, the variance of the set of elementary effects can be calculated as

$$\sigma_i^2 = \frac{1}{r-1} \sum_{j=1}^{r} \left( EE_i^{(j)} - \mu_i \right)^2. \tag{22}$$

The mean value $\mu^*$ quantifies the individual effect of a parameter on the output, while the variance $\sigma^2$ indicates the influence of parameter interactions. We rank the parameters according to $\sigma^2 + \mu^{*2}$.
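In practice such a screening can be run with the SALib library. The sketch below mirrors the setting used later (30 trajectories, 4 grid levels); the parameter ranges are illustrative placeholders, and `evaluate_dcm` is a hypothetical wrapper that trains the DCM for a given configuration and returns its final loss.

```python
import numpy as np
from SALib.sample.morris import sample as morris_sample
from SALib.analyze.morris import analyze as morris_analyze

# Illustrative ranges; evaluate_dcm is a hypothetical user-supplied wrapper.
problem = {
    "num_vars": 4,
    "names": ["layers", "neurons", "iterations", "collocation_points"],
    "bounds": [[1, 6], [10, 60], [500, 5000], [500, 4000]],
}

X = morris_sample(problem, N=30, num_levels=4)   # 30 trajectories, 4 grid levels
Y = np.array([evaluate_dcm(x) for x in X])       # loss value per configuration
res = morris_analyze(problem, X, Y, num_levels=4)
print(res["mu_star"], res["sigma"])              # ranking via sigma^2 + mu_star^2
```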

eFAST method
The eFAST method [54] is based on Fourier transformations: the spectrum obtained for each parameter separates the output variance due to that parameter from the variance due to interactions. Employing a suitable search function, the model $y(X) = f(X_1, X_2, \ldots, X_n)$ is transformed into a function $y = f(s)$ of a single scalar variable $s$, with Fourier coefficients

$$A_j = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(s) \cos(js) \, ds, \qquad B_j = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(s) \sin(js) \, ds.$$

The spectral curve of the Fourier progression is defined as $\Lambda_j = A_j^2 + B_j^2$. The variance of the model output due to the uncertainty in parameter $X_i$ is given by

$$D_i = 2 \sum_{p \in \mathbb{Z}_0} \Lambda_{p \omega_i},$$

with the parametric frequency $\omega_i$, the spectrum $\Lambda$ of the Fourier transform, and the non-zero integers $\mathbb{Z}_0$. The total variance is obtained by cumulatively summing the spectra over all frequencies,

$$D = \sum_{j \in \mathbb{Z}_0} \Lambda_j.$$

The fraction of the total output variance caused by each parameter alone, apart from interactions with other parameters, is measured by the first-order index

$$S_i = \frac{D_i}{D}.$$

To find the total sensitivity of $X_i$, the frequency of $X_i$ is set to $\omega_i$, while a common different frequency $\omega$ is assigned to all other parameters. By evaluating the spectra at the frequency $\omega$ and its higher harmonics $p\omega$, the output variance $D_{-i}$ due to all parameters except $X_i$ and their interrelationships is obtained. The total-order sensitivity index thus follows as

$$S_{Ti} = 1 - \frac{D_{-i}}{D}.$$
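Continuing from the Morris sketch, the corresponding variance-based analysis with SALib can be sketched as follows; the reduced parameter set reflects the screening workflow described above, the sample size is a placeholder, and `evaluate_dcm` is the same hypothetical wrapper.

```python
import numpy as np
from SALib.sample.fast_sampler import sample as fast_sample
from SALib.analyze.fast import analyze as fast_analyze

# After Morris screening, the non-influential parameter is dropped.
reduced_problem = {
    "num_vars": 3,
    "names": ["layers", "neurons", "iterations"],
    "bounds": [[1, 6], [10, 60], [500, 5000]],
}

X = fast_sample(reduced_problem, N=1000)         # N samples per parameter
Y = np.array([evaluate_dcm(x) for x in X])
res = fast_analyze(reduced_problem, Y)
print(res["S1"], res["ST"])                      # first- and total-order indices
```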

Numerical examples
In this section, several cases are considered to test the accuracy and efficiency of our DCM, including the influence of suitable neural network configurations, sampling methods and optimizers, taking advantage of the GSA. Also, different material variations are studied using material transfer learning. The accuracy is measured by the relative error between the predicted and the analytical solution,

$$\delta e = \frac{\left\| E_{pred} - E_a \right\|}{\left\| E_a \right\|},$$

where $E_a$ is the analytical solution, $E_{pred}$ is the predicted solution and $\|\cdot\|$ denotes the $L^2$-norm. All simulations were run on a 64-bit macOS Catalina computer with an Intel(R) Core(TM) i7-8850H CPU and 32 GB of memory. The parametric settings for training are summarized in Table 3.

Case 1: Sensitivity analysis
First, we perform a sensitivity analysis to determine the key parameters of the deep collocation method.

Parameter screening with the Morris method
The sensitivity indices computed by the Morris screening method with 30 trajectories and 4 grid levels are shown in Figures 5 and 6, depicting the effect of the numbers of neurons, layers, iterations and collocation points on the loss value. Figure 5 shows a horizontal bar plot of the GSA measure $\mu^*$. The highest $\mu^*$ values are found for the numbers of layers and neurons, whereas the number of collocation points barely affects the loss value. According to the classification scheme proposed by Garcia Sanchez et al. [57], the ratio $\sigma/\mu^*$ characterises the model parameters in terms of (non-)linearity ($\sigma/\mu^* < 0.1$), (non-)monotony ($0.1 < \sigma/\mu^* < 0.5$) or possible parameter interactions ($\sigma/\mu^* > 1$), see also Figure 6. For our test models, all parameters lie in the range $\sigma/\mu^* > 1$, suggesting that most parameters exhibit nonlinear behaviour, interaction effects with each other, or both. The plot of the mean value and standard deviation ($\sigma$, $\mu^*$) in Figure 6 reveals that the most influential parameter, with the largest $\sigma^2 + \mu^{*2}$, is the number of layers.

Variance-based sensitivity indices
We now take advantage of the variance-based eFAST method to compute sensitivity indices. The independent first-order sensitivity indices $S_i$ and the dependent total-order sensitivity indices $S_{Ti}$ are shown in Figure 8. Due to the high computational cost, with 3000 simulation runs and 1000 generated samples, no analysis of the variation of $S_i$ and $S_{Ti}$ with different sample sizes was performed. The associated scatter plots are shown in Figure 7: the more randomly the loss values are distributed, the less sensitive the parameter is. According to Figure 7, the number of layers is the most influential parameter, followed by the number of neurons and the number of iterations.

The first-order sensitivity index $S_i$ represents the individual parameter importance. The number of layers affects the model most, followed by the number of neurons, while the least influential parameter is the number of iterations, which agrees well with the results of the Morris method. However, the first-order indices barely exceed 0.01, which indicates that these algorithm-specific parameters individually have little influence on the loss value of the model. Parameters with a total-effect index $S_{Ti}$ greater than 0.8 can be regarded as very important; this holds for the numbers of layers and neurons, while the index for the number of iterations lies between 0.5 and 0.8. The large difference between the total and first-order sensitivity indices quantifies the effect of parameter interactions. It can be concluded that the output variance is mainly attributable to the interactions between the parameters rather than to their individual nonlinear effects, and that all interactions between these three parameters are noteworthy.

Case 2: Heat conduction in a unit cube

Three material variations listed in Table 4 are considered [58]. The profiles of the thermal conductivity $k(z)$ of the three material variation cases are illustrated in Figure 9, and the boundary conditions of the unit cube can be found in Figure 10. For each non-homogeneous thermal conductivity, the analytical solution of the potential function is summarized in Table 4.

Deep collocation method configurations
First, different neural network configurations are investigated. Figure 11 shows the relative error for various activation functions and numbers of layers. The arctan function yields the most stable and accurate results; both the arctan and tanh functions outperform the other activation functions. Figure 12 depicts the influence of different sampling methods on the relative error. The random sampling method obtains the most stable and accurate potentials with an increasing number of layers; the Korobov, Hammersley and Latin hypercube sampling methods also provide reasonable results. Next, we focus on the various material variations, see Figure 13. All material variations can be predicted accurately, but the most accurate results are obtained for the exponential conductivity. The results in Figures 11 to 13 suggest that 2 hidden layers are a good choice for the underlying problem. We then study different numbers of collocation points (inside the cube and on its surface). The relative error in the temperature is depicted in Figures 14 and 16, and a comparison with FEM results is given in Figure 15; see also Figure 17.
The predicted temperature and flux distributions for the three material variations inside the cube are shown in Figures 18-20. The heat distribution varies with the graded variation along the z-coordinate, which is consistent with the material properties of the FGMs.
Let us now test the influence of the optimizer on the results. First-order methods minimize the loss function using its gradient, while second-order methods additionally use second-order derivative (Hessian) information. In this application, a combination of the two is employed: the Adam algorithm as the first-order method and L-BFGS as the second-order method. The convergence history for the different optimizers is illustrated in Figure 21. Although first-order optimizers are cheaper per step, they require more iterations. The L-BFGS optimizer needs fewer iterations, but carries the risk of being trapped in local minima. With the combined optimizers, the loss reaches a significantly smaller value within an acceptable number of iterations while keeping the solution close to the global minimum.
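A sketch of this combined strategy, assuming the model and a `compute_loss` closure as in Section 3, warm-starts with Adam and then hands the flattened parameters to SciPy's L-BFGS-B implementation.

```python
import numpy as np
import tensorflow as tf
from scipy.optimize import minimize

def train_combined(model, compute_loss, adam_steps=2000):
    # Stage 1: first-order warm-up with Adam.
    opt = tf.keras.optimizers.Adam(1e-3)
    for _ in range(adam_steps):
        with tf.GradientTape() as tape:
            loss = compute_loss(model)
        opt.apply_gradients(zip(tape.gradient(loss, model.trainable_variables),
                                model.trainable_variables))

    # Stage 2: second-order refinement with L-BFGS on the flattened parameters.
    variables = model.trainable_variables
    shapes = [v.shape for v in variables]
    sizes = [int(np.prod(s)) for s in shapes]

    def set_params(theta):
        offset = 0
        for v, s, n in zip(variables, shapes, sizes):
            v.assign(tf.cast(tf.reshape(theta[offset:offset + n], s), v.dtype))
            offset += n

    def value_and_grad(theta):
        set_params(theta)
        with tf.GradientTape() as tape:
            loss = compute_loss(model)
        grads = tape.gradient(loss, variables)
        flat = np.concatenate([g.numpy().ravel() for g in grads]).astype(np.float64)
        return float(loss.numpy()), flat

    theta0 = np.concatenate([v.numpy().ravel() for v in variables]).astype(np.float64)
    minimize(value_and_grad, theta0, jac=True, method="L-BFGS-B")
```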

Case 3: Material transfer learning
The loss versus the number of iterations is shown in Figure 23. After fine-tuning, the loss decreases to a smaller value in fewer iterations for all three material variations. The numerical results are summarized in Table 5, demonstrating that the computational effort can be drastically reduced with transfer learning. Figure 24 shows the loss versus iterations when using transfer learning for different material parameters, while Tables 6 and 7 list the accuracy and CPU time with and without transfer learning.

Case 4: Irregularly shaped annular sector
Next, we present results for an irregularly shaped annular sector as depicted in Figure 29. The inner radius is 0.3, the outer radius 0.5, and the top surface lies at $z = 0.1$; the thermal conductivity varies exponentially according to $k(z) = 5e^{3z}$. The variation of the thermal conductivity $k(z)$ is illustrated in Figure 28. The temperature is specified along the inner radius as $T_{inner} = 0$ and along the outer radius as $T_{outer} = 100$; all other surfaces are insulated. The boundary conditions of the geometry are shown in Figure 29.
The predicted temperatures are shown in Figures 30-31 and compared with an FEM solution obtained with the commercial software package ABAQUS, since no analytical solution is available for this problem. The temperature along the radial direction at the edge is plotted and compared with the ABAQUS results in Figure 32.

Conclusion
We presented a transfer-learning-based deep collocation method (DCM) for solving potential problems in non-homogeneous media. It avoids classical discretization methods such as the FEM and treats the problem as a minimization problem, minimizing a loss function related to the underlying governing equation. Thanks to the nonlinear activation functions, the approach enables us to discover complex nonlinear patterns. The DCM requires sampling inside the physical domain; we therefore identified a suitable sampling method for the selected problems. In order to find the most favourable neural network configuration for specific problems, we carried out a sensitivity analysis quantifying the influence of algorithm-specific parameters on outputs such as the relative error in the $L^2$-norm. For different material variation forms and material parameters, material transfer learning is embedded into the framework to enhance the robustness and generality of the deep collocation method. To demonstrate the performance of the proposed DCM, various benchmark problems, including heat transfer and a representative potential problem, were studied, verifying its effectiveness and accuracy. Even though the DCM is promising for solving partial differential equations, limitations remain, such as the lack of a systematic procedure for preventing overfitting in physics-informed deep learning models and the issue of local minima; moreover, the learning ability and efficiency of the DCM for complex multi-physics and multi-scale modelling still need to be enhanced. These will be our major research topics in the future.