
Stochastic deep collocation method based on neural architecture search and transfer learning for heterogeneous porous media


We present a stochastic deep collocation method (DCM) based on neural architecture search (NAS) and transfer learning for heterogeneous porous media. We first carry out a sensitivity analysis to identify the key hyper-parameters of the network, reducing the search space, and subsequently employ hyper-parameter optimization to obtain their values. The presented NAS-based DCM also saves the weights and biases of the most favorable architectures, which are then used in the fine-tuning process. We further employ transfer learning techniques to drastically reduce the computational cost. The presented DCM is then applied to the stochastic analysis of heterogeneous porous material. To this end, a three-dimensional stochastic flow model is built, providing a benchmark for the simulation of groundwater flow in highly heterogeneous aquifers. The performance of the presented NAS-based DCM is verified in different dimensions using the method of manufactured solutions. We show that it significantly outperforms finite difference methods in both accuracy and computational cost.


In recent years, groundwater pollution has become one of the most important environmental problems worldwide. To protect groundwater quality, it is necessary to predict groundwater flow and solute transport, which in turn rests on the theory of porous media. The heterogeneity and complexity of porous media still pose a major challenge for such groundwater flow problems. The hydraulic conductivity describes the ability of the medium to transmit fluid through its pore spaces. Because of this intrinsic complexity, it is common to describe porous media by random fields with a given statistical structure [1]. Freeze [2] showed that the hydraulic conductivity field can be well characterized by a log-normal distribution. This approach is often used for flow analysis in saturated zones [3]. Both Gaussian [4] and exponential [5] correlations are commonly chosen for the log-normal probability distribution.

Different approaches have been used to express the permeability as a function of pore structure parameters [6,7,8]. Analytical spectral representation methods were first used by Bakr [9] to solve the stochastic flow and solute transport equations perturbed with a random hydraulic conductivity field. If a random field is homogeneous and has zero mean, it can always be represented by a Fourier (or Fourier–Stieltjes) decomposition into essentially uncorrelated random components. These random components in turn yield the spectral density function, i.e. the distribution of variance over wave numbers \(\textit{k}\). This theory is widely used to construct hydraulic conductivity fields, and a number of construction methods have been derived, such as the turning bands method [10], the HYDRO_GEN method [11] and the Kraichnan algorithm [12]. Ababou et al. [13] used the turning bands method and narrowed down a range of relevant parameters. Wörmann and Kronnäs [13] tested a gradual increase of the heterogeneity of the flow resistance and compared the numerically simulated residence time PDF with the observed one, based on the HYDRO_GEN method. Unlike the other two methods, Kraichnan proposed an approximation algorithm that directly controls the accuracy of the random field by increasing the number of modes, for a given variance of the log-hydraulic conductivity random field. Inspired by these results, Kolyukhin and Sabelfeld [1] constructed a randomized spectral model (RSM) to study steady flow in porous media in 3D, assuming small fluctuations. We adopt this approach here to generate hydraulic conductivity fields.

Deep learning methods have become very popular since the development of deep neural networks (DNNs) [14, 15]. They have been applied to a variety of problems including image processing [16], object detection [17] and speech recognition [18], to name just a few. While the majority of applications employ DNNs as regression models, there has been some recent interest in exploiting DNNs for the solution of PDEs. Mills et al., for instance, solved the Schrödinger equation with convolutional neural networks by directly learning the mapping between the potential and the energy [19]. Weinan E et al. presented deep learning-based numerical methods for high-dimensional parabolic PDEs and backward stochastic differential equations [20, 21]. Raissi et al. devised a machine learning approach for the solution of linear and nonlinear differential equations using Gaussian processes [22]. In [23, 24], they presented a so-called physics-informed neural network for the supervised learning of nonlinear partial differential equations, solving for instance Burgers’ equation and the Navier–Stokes equations. Beck et al. [25] solved stochastic differential and Kolmogorov equations with neural networks. For several forward and inverse problems in solid mechanics, we presented a deep collocation method (DCM) [26, 27]. We also presented a deep energy method (DEM) [28,29,30], which requires the definition of an energy potential instead of exploiting the strong form of the boundary value problem.

In a typical machine learning application, the practitioner must apply appropriate data pre-processing, feature engineering, feature extraction and feature selection methods to make the dataset suitable for machine learning. Following these pre-processing steps, practitioners must perform algorithm selection and hyper-parameter optimization to maximize the predictive performance of their final model. The physics-informed neural network (PINN) models discussed in the previous paragraph are no exception, although the randomly distributed collocation points are generated without the need for data pre-processing. Most of the time spent on PINN models goes into tuning the neural architecture configuration, which strongly influences the accuracy and stability of the approach. Since many of these steps are typically beyond the capabilities of non-experts, automated machine learning (AutoML) has become popular. The oldest AutoML library is AutoWEKA [31], first released in 2013, which automatically selects models and hyper-parameters. Other notable AutoML libraries include auto-sklearn [32], H2O AutoML [33] and TPOT [34]. Neural architecture search (NAS) [35] is a technique for the automatic design of neural networks, which allows algorithms to automatically design high-performance network structures based on sample sets. NAS aims to find configurations comparable to those of human experts on certain tasks and even to discover network structures that have not been proposed by humans before, which can effectively reduce the implementation cost of neural networks. In efficient NAS (ENAS) [36], the authors add a controller that learns to discover neural network architectures by searching for an optimal subgraph within a large computational graph, using parameter sharing between the subgraphs to speed up the computation.
The controller decides which parameter matrices are used by choosing the previous indices; therefore, in ENAS, all recurrent cells in the search space share the same set of parameters. Liu [37] used a sequential model-based optimization (SMBO) strategy to learn a surrogate model that guides the search of the structure space. To build up a NAS model, it is necessary to conduct dimensionality reduction and to identify valid parameter bounds to reduce the computation involved in auto-tuning. A global sensitivity analysis can be used to identify valid regions in the search space and subsequently decrease its dimensionality [38], which can serve as a starting point for an efficient calibration process.

The remainder of this paper is organized as follows. In Section 2, we describe the physical model of the groundwater flow problem, the randomized spectral method used to generate hydraulic conductivity fields, and the method of manufactured solutions used to verify the accuracy of our model. In Section 3, we introduce the neural architecture search model; we subsequently present an efficient sensitivity analysis and compare several hyper-parameter optimizers to find an accurate and efficient search method. In Section 4, we briefly describe the finite difference method employed to solve several benchmark problems for comparison. Finally, conclusions are drawn in Section 5.

Stochastic analysis of a heterogeneous porous medium

Darcy equation for groundwater flow problem

Consider the continuity equation for steady-state aquifer flow in a porous medium governed by Darcy's law:

$$\begin{aligned} {\varvec{q}}({\varvec{x}})=-K({\varvec{x}})\nabla (h({\varvec{x}})), \end{aligned}$$

where \({\varvec{q}}\) is the Darcy velocity, K the hydraulic conductivity and h the hydraulic head, \(h=H+\delta h\), with mean H and perturbation \(\delta h\). To describe the variation of the hydraulic conductivity as a function of the position vector \({\varvec{x}}\), it is convenient to introduce the variable

$$\begin{aligned} Y({\varvec{x}})=\ln {K({\varvec{x}})}, \end{aligned}$$

where \(Y({\varvec{x}})\) is the hydraulic log-conductivity with the mean \(\langle Y \rangle \) and perturbation \(Y'({\varvec{x}})\):

$$\begin{aligned} Y({\varvec{x}})=\langle Y \rangle +Y'({\varvec{x}}), \end{aligned}$$

with \(E[Y'({\varvec{x}})]=0\), and \(Y({\varvec{x}})\) is taken to be a three-dimensional statistically homogeneous random field characterized by its correlation function

$$\begin{aligned} C_Y({\varvec{r}}) = \langle Y'({\varvec{x}} + {\varvec{r}})Y'({\varvec{x}}) \rangle , \end{aligned}$$

where \({\varvec{r}}\) is the separation vector. According to the conservation equation \(\nabla \cdot {\varvec{q}} = 0\), Equation (1) can be rewritten in the following form:

$$\begin{aligned} E(h)=\sum _{j=1}^{N}\frac{\partial }{\partial x_j}\left( K({\varvec{x}})\frac{\partial h}{\partial x_j}\right) =0, \end{aligned}$$

which is subject to the Dirichlet and Neumann boundary conditions

$$\begin{aligned} \begin{aligned} h({\varvec{x}})={\bar{h}}, {\varvec{x}} \in \tau _D,\\ q_n({\varvec{x}})={\bar{q}}_n, {\varvec{x}} \in \tau _N. \end{aligned} \end{aligned}$$

with N denoting the dimension. The groundwater flow problem thus reduces to finding a solution h such that Equations (5) and (6) hold; E is an operator that maps elements of vector space H to vector space V:

$$\begin{aligned} E:H\rightarrow V, with\,h\in H. \end{aligned}$$

With Equation (5) and \(N=3\) in domain \(D=[0,L_x]\times [0,L_y]\times [0,L_z]\), the Dirichlet boundary and Neumann boundary conditions can be assumed as follows:

$$\begin{aligned} \left\{ \begin{array}{lr} h(0, y, z) = -J\cdot L_x, \quad h(L_x, y, z) = 0, &{}\forall y \in [0, L_y], z \in [0, L_z],\\ \frac{\partial h}{\partial y}(x,0,z)= \frac{\partial h}{\partial y}(x,L_y,z)=0, &{}\forall x \in [0, L_x],z \in [0, L_z],\\ \frac{\partial h}{\partial z}(x,y,0)= \frac{\partial h}{\partial z}(x,y,L_z)=0, &{} \forall x \in [0, L_x],y \in [0, L_y], \end{array} \right. \end{aligned}$$

where J is the mean slope of the hydraulic head in the x direction [9]. As suggested by Ababou [13], the scale of the fluctuations should be significantly smaller than the scale of the domain. The lengths \(L_x, L_y, L_z\) of the domain are therefore usually set to be ten times larger than the correlation length \(\uplambda \). A reasonable mesh size \(\Delta x\) then satisfies

$$\begin{aligned} \frac{\Delta x}{\uplambda } \le \frac{1}{5}. \end{aligned}$$

As \(Y'\) is homogeneous and isotropic, we consider two correlation functions: the exponential correlation function [39],

$$\begin{aligned} C_Y({\varvec{r}})=\sigma _Y^2exp\left( -\frac{\left|{\varvec{r}}\right|}{\uplambda }\right) , \end{aligned}$$

and the Gaussian correlation function [40],

$$\begin{aligned} C_Y({\varvec{r}})=\sigma _Y^2exp\left( -\frac{\left|{\varvec{r}}\right|^2}{\uplambda ^2}\right) , \end{aligned}$$

where \(\uplambda \) is the log conductivity correlation length scale.
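For reference, the two correlation models above are straightforward to evaluate numerically. The sketch below uses our own function names and the default values sigma² = 0.1 and λ = 1 that appear later in this paper:

```python
import numpy as np

def exp_correlation(r, sigma2=0.1, lam=1.0):
    """Exponential correlation C_Y(r) = sigma_Y^2 * exp(-|r|/lambda)."""
    return sigma2 * np.exp(-np.linalg.norm(r, axis=-1) / lam)

def gauss_correlation(r, sigma2=0.1, lam=1.0):
    """Gaussian correlation C_Y(r) = sigma_Y^2 * exp(-|r|^2/lambda^2)."""
    return sigma2 * np.exp(-(np.linalg.norm(r, axis=-1) / lam) ** 2)
```

Both functions accept a separation vector (or a batch of them along the leading axes) and return the correlation value; at r = 0 they return the variance sigma².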

Generate the hydraulic conductivity fields

Due to the intrinsic complexity of heterogeneous porous media, random field theory is employed to generate the heterogeneous field, which shows a fractal behavior. Applying the Wiener–Khinchin theorem to the heterogeneous groundwater flow problem, the Gaussian random field [41] with given spectral density \(S({\varvec{k}})\) is obtained from the Fourier transform of the correlation function (Equation (4)):

$$\begin{aligned}&C_Y({\varvec{r}})=\int _{{\mathbb {R}}^N} e^{i2\pi {\varvec{k}}\cdot {\varvec{r}}}S({\varvec{k}})\,\mathrm {d}{\varvec{k}}, \end{aligned}$$
$$\begin{aligned}&S({\varvec{k}})=\int _{{\mathbb {R}}^N} e^{-i2\pi {\varvec{k}}\cdot {\varvec{r}}}C_{Y}({\varvec{r}})\,\mathrm {d}{\varvec{r}}, \end{aligned}$$

with \(S({\varvec{k}})\) the spectral function of the random field \(Y'({\varvec{x}})\) and

$$\begin{aligned}&{\mathscr {F}}(exp(-\frac{\left|{\varvec{r}}\right|}{\uplambda }))=\frac{2\uplambda }{1+4\pi ^2{\varvec{k}}^2\uplambda ^2}, \end{aligned}$$
$$\begin{aligned}&{\mathscr {F}}(exp(-\frac{\left|{\varvec{r}}\right|^2}{\uplambda ^2}))=\uplambda \sqrt{\pi }e^{-\pi ^2{\varvec{k}}^2\uplambda ^2}. \end{aligned}$$

Substituting Eqs. (10), (11), (14) and (15) into Eq. (13), respectively, the spectral functions for the exponential and the Gaussian correlation can be derived:

$$\begin{aligned}&S({\varvec{k}},\uplambda )=\sigma _Y^2 \uplambda ^d (1+(2\pi {\varvec{k}}\uplambda )^2)^{-\frac{d+1}{2}}, \end{aligned}$$
$$\begin{aligned}&S({\varvec{k}},\uplambda )=\sigma _Y^2 \pi ^{d/2}\uplambda ^d e^{-(\pi {\varvec{k}}\uplambda )^2}. \end{aligned}$$

A Gaussian homogeneous random field can, in the general case, be retrieved as [42]:

$$\begin{aligned} Y'({\varvec{x}}) = \sqrt{\frac{2\sigma ^2}{N}}\sum _{i=1}^{N} \big (\xi _1 cos(2\pi {\varvec{k}}_i {\varvec{x}})+\xi _2 sin(2\pi {\varvec{k}}_i {\varvec{x}})\big ), \end{aligned}$$

where \(\xi _i\) are mutually independent Gaussian random variables. For the random variable \({\varvec{k}}_i\), we can obtain its probability density function \(p({\varvec{k}})\) and calculate its cumulative distribution function (\(\textit{cdf}\)) according to \(F(k)=\int _{-\infty }^{{\varvec{k}}}p(x)\mathrm {d}x\). As long as there exists another uniformly distributed random variable \(\theta \), the inverse function \({\varvec{k}} = F^{-1}(\theta )\) can be obtained, and \({\varvec{k}}\) then obeys the \(p({\varvec{k}})\) distribution. More details can be found in Appendix B, while the associated Python script is summarized in Appendix C. Figures 1 and 2 show the two- and three-dimensional random field space with fixed \(\langle k\rangle =\) 15, \(\sigma ^2 =\) 0.1 and \(N=\) 1000.
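As an illustration, Eq. (18) can be implemented directly for the Gaussian correlation model, for which p(k) is itself Gaussian, so the inverse-cdf step is not needed (the exponential case would require it). This is a sketch with our own function name, not the script of Appendix C, and it follows the equation as written:

```python
import numpy as np

def gaussian_random_field(x, n_modes=1000, sigma2=0.1, lam=1.0, seed=0):
    """Randomized-spectral (Kraichnan-type) sample of Y'(x) for the
    Gaussian correlation C_Y(r) = sigma^2 exp(-|r|^2/lambda^2).

    x : (n_points, d) array of evaluation points.
    Since S(k) is Gaussian here, each component of k_i is drawn from
    N(0, 1/(2 pi^2 lambda^2)).
    """
    rng = np.random.default_rng(seed)
    d = x.shape[1]
    k = rng.normal(0.0, 1.0 / (np.sqrt(2.0) * np.pi * lam), size=(n_modes, d))
    xi1 = rng.standard_normal(n_modes)     # mutually independent Gaussians
    xi2 = rng.standard_normal(n_modes)
    phase = 2.0 * np.pi * x @ k.T          # (n_points, n_modes)
    return np.sqrt(2.0 * sigma2 / n_modes) * (
        np.cos(phase) @ xi1 + np.sin(phase) @ xi2)
```

The conductivity field then follows as K(x) = exp(⟨Y⟩ + Y'(x)), e.g. `K = np.exp(meanY + gaussian_random_field(x))`.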

Fig. 1
figure 1

Two-dimensional hydraulic conductivity field with (a) exponential correlation and (b) Gaussian correlation

Fig. 2
figure 2

Three-dimensional hydraulic conductivity field with (a) exponential correlation and (b) Gaussian correlation

Defining numerical experimental model

After determining the expressions for the stochastic flow analysis in heterogeneous porous materials, we need to set the geometric and physical parameters. The number of modes N and the variance \(\sigma ^2\) govern the hydraulic conductivity field. For the exponential correlation, large values of N might lead to a non-differentiable K-field [43]. We set N to 500, 1000 and 2000. \(\sigma ^2\) determines the heterogeneity of the hydraulic conductivity, a larger \(\sigma ^2\) indicating a larger heterogeneity. In real geological formations, \(\sigma ^2\) varies widely. As summarized in Sudicky’s study [44], in the weakly heterogeneous Canadian Forces Base Borden aquifer \(\sigma ^2 =0.29\) and for Cape Cod \(\sigma ^2 =0.14\), but in the highly heterogeneous Columbus aquifer \(\sigma ^2 =4.5\). First-order analysis [9] has been proven to provide a solid basis for predictions. Numerical simulations [11] indicate that the first-order results are robust and applicable when \(\sigma ^2\) is close to and even above 1. With this approximation, we get \(e^{\langle Y\rangle }=\langle K\rangle exp(-\sigma ^2/2)\) for the one- and two-dimensional cases [45] and \(e^{\langle Y\rangle }=\langle K\rangle exp(-\sigma ^2/6)\) for the three-dimensional case [46]. In this paper, we set \(\sigma ^2\) to 0.1, 1 and 3, covering the three cases from small to medium and large. The mean hydraulic conductivity is fixed to \(\langle K\rangle = 15\,m/day\), a value representative of gravel or coarse sand aquifers [47]. All correlation lengths in the one- and two-dimensional cases are set to 1 m; in the three-dimensional case, we set \(\uplambda _1=0.5\,m\), \(\uplambda _2=0.2\,m\) and \(\uplambda _3=0.1\,m\). Based on the above settings, we finalized our test domains:

  • One-dimensional groundwater flow \(\rightarrow [0,25]\).

  • Two-dimensional groundwater flow \(\rightarrow [0,20]\times [0,20]\).

  • Three-dimensional groundwater flow \(\rightarrow [0,5]\times [0,2]\times [0,1]\).

Manufactured solutions

To verify the accuracy of our model and obtain an error estimation, we use the method of manufactured solutions (MMS), which provides a general procedure for generating analytical solutions [48]. Malaya et al. [49] discussed the method of manufactured solutions in constructing an error estimator for solution verification, where one simulates the phenomenon of interest with no a priori knowledge of the solution. An artificial solution is chosen and substituted into the equations. Since the chosen function is unlikely to be an exact solution of the original partial differential equations, a residual term arises; this residual can then be added as a source term. With MMS, the original problem of finding the solution of Equation (5) thus changes to the following form:

$$\begin{aligned} E({\hat{h}})=\sum _{j=1}^{N}\left( \frac{\partial }{\partial x_j}\left( K({\varvec{x}})\frac{\partial {\hat{h}}}{\partial x_j}\right) \right) =\sum _{j=1}^{N}f_j=f. \end{aligned}$$

For operator \(E({\hat{h}})\), we now get a source term f. By adding the source term to the original governing equation E, a slightly modified governing equation will be obtained:

$$\begin{aligned} E'({\hat{h}})=E({\hat{h}})-f=0, \end{aligned}$$

which is solved by the manufactured solution \({\hat{h}}\). The Neumann and Dirichlet boundary conditions are thus modified as follows:

$$\begin{aligned} \begin{aligned} {\hat{h}}({\varvec{x}})&={\hat{h}}_{MMS}({\varvec{x}}), {\varvec{x}} \in \tau _D,\\ {\hat{q}}_n({\varvec{x}})&=-K({\varvec{x}}){\hat{h}}_{MMS,n}({\varvec{x}}), {\varvec{x}} \in \tau _N. \end{aligned} \end{aligned}$$

We adopt the form of the manufactured solution mentioned in Tremblay’s study [48],

$$\begin{aligned} {\hat{h}}_{MMS}({\varvec{x}})=a_0+sin\left( \sum _{j=1}^{N}a_j x_j\right) , \end{aligned}$$

where \(\left\{ a_i \right\} \) are arbitrary non-zero real numbers. When the manufactured solution (22) is applied to the left side of Eq. (5), we obtain a source term f,

$$\begin{aligned} f(x_j)=a_j \frac{\partial K({\varvec{x}})}{\partial x_j} cos\left( \sum _{i=1}^{N}a_i x_i\right) -a_j^2 K({\varvec{x}}) sin\left( \sum _{i=1}^{N}a_i x_i\right) . \end{aligned}$$

To verify the adaptability of our model to different solutions, we also used another form of manufactured solution [49],

$$\begin{aligned} {\hat{h}}_{MMS}({\varvec{x}})=a_0+\sum _{j=1}^{N}sin( a_j x_j), \end{aligned}$$

where the parameter values are the same as in Equation (22). The source term then reads:

$$\begin{aligned} f(x_j)=a_j \frac{\partial K({\varvec{x}})}{\partial x_j} cos(a_j x_j)-a_j^2 K({\varvec{x}}) sin(a_j x_j). \end{aligned}$$

This changes the boundary conditions from Equation (6) to

$$\begin{aligned} \left\{ \begin{array}{lr} {\hat{h}}(0, y, z) ={\hat{h}}_{MMS}(0,y,z), &{}\forall y \in [0, L_y], z \in [0, L_z]\\ {\hat{h}}(L_x, y, z) = {\hat{h}}_{MMS}(L_x,y,z), &{}\forall y \in [0, L_y], z \in [0, L_z]\\ \frac{\partial {\hat{h}}}{\partial y}(x,0,z)= \frac{\partial {\hat{h}}_{MMS}}{\partial y}(x,0,z), &{}\forall x \in [0, L_x],z \in [0, L_z] \\ \frac{\partial {\hat{h}}}{\partial y}(x,L_y,z)= \frac{\partial {\hat{h}}_{MMS}}{\partial y}(x,L_y,z), &{}\forall x \in [0, L_x],z \in [0, L_z] \\ \frac{\partial {\hat{h}}}{\partial z}(x,y,0)= \frac{\partial {\hat{h}}_{MMS}}{\partial z}(x,y,0), &{} \forall x \in [0, L_x],y \in [0, L_y]\\ \frac{\partial {\hat{h}}}{\partial z}(x,y,L_z)= \frac{\partial {\hat{h}}_{MMS}}{\partial z}(x,y,L_z), &{} \forall x \in [0, L_x],y \in [0, L_y] \end{array} \right. \end{aligned}$$

These source terms can be used as the physical law describing the system and as a basis for evaluating the neural network. The specific forms of the constructed solutions and source terms f used in this paper are given in Appendix C.
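The consistency of a manufactured solution with its source term can be checked symbolically. The following sketch uses sympy in two dimensions with a generic conductivity K(x, y) and verifies that the source term of the first manufactured solution equals the divergence form of the governing operator applied to it (symbol names are ours):

```python
import sympy as sp

x, y, a0, a1, a2 = sp.symbols('x y a0 a1 a2')
K = sp.Function('K')(x, y)          # generic heterogeneous conductivity

# manufactured head: h = a0 + sin(a1*x + a2*y)
s = a1 * x + a2 * y
h = a0 + sp.sin(s)

# left-hand side: sum_j d/dx_j ( K dh/dx_j )
lhs = sp.diff(K * sp.diff(h, x), x) + sp.diff(K * sp.diff(h, y), y)

# source term: f = sum_j ( a_j dK/dx_j cos(s) - a_j^2 K sin(s) )
f = (a1 * sp.diff(K, x) * sp.cos(s) - a1**2 * K * sp.sin(s)
     + a2 * sp.diff(K, y) * sp.cos(s) - a2**2 * K * sp.sin(s))

assert sp.simplify(lhs - f) == 0    # the two expressions agree identically
```

The same check applies verbatim to the second manufactured solution by replacing h and f accordingly.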

Deep learning-based neural architecture search method

Modified neural architecture search (NAS) model

The conventional NAS approach has three main components [35]. The first one is a collection of candidate neural network structures called the search space; the second one is the search strategy and the last one is the performance evaluation. Inspired by Park [50], we construct the system configuration of NAS fitted to the PINN model in Fig. 3. It consists of a sensitivity analysis (SA), search methods and a physics-informed neural network (NN) generator, and eventually outputs the optimal neural architecture configuration with the corresponding weights and biases. A transfer learning model is then built based on these weights and biases and the selected neural network configuration.

Fig. 3
figure 3

Overall methodology for NAS

Components of conventional NAS

As already pointed out, the main components of the conventional neural architecture search method are

  • Search Space. The search space defines which architectures can be represented. Combining it with prior knowledge about the typical properties of architectures well suited to the underlying task can reduce the size of the search space and simplify the search. For the model in this study, the a priori knowledge of the search space is gained from the global sensitivity analysis. Figure 4b shows a common global search space with a chain structure. The chain-structured neural network architecture can be written as a sequence of n layers, where the ith layer \(L_i\) receives its input from layer \(i-1\) and its output serves as the input for layer \(i+1\):

    $$\begin{aligned} output = L_n\odot L_{n-1}\odot ... L_1\odot L_0, \end{aligned}$$

    where \(\odot \) are operations.

  • Search Method. The search method is an initial filtering step narrowing down the search space. In this paper, hyper-parameter optimizers are used. The choice of the search space largely determines the difficulty of the optimization problem, which may remain (i) noncontinuous and (ii) high-dimensional. Thus, some prior knowledge of the model features is needed.

  • Performance Estimation Strategy. The simplest option for a performance estimation strategy is standard training and validation of the data for the architecture. As pointed out in Sect. 2.4, we define the relative error with respect to the manufactured solution as the performance estimate:

    $$\begin{aligned} \delta h=\frac{\Vert {\hat{h}}-{\hat{h}}_{MMS}\Vert _2}{\Vert {\hat{h}}_{MMS}\Vert _2}. \end{aligned}$$
Fig. 4
figure 4

(a) Abstract illustration of NAS methods and (b) search space
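The relative error in Eq. (28) is straightforward to compute from the predicted and manufactured heads evaluated at the validation points; a minimal sketch:

```python
import numpy as np

def relative_l2_error(h_pred, h_mms):
    """Relative L2 error ||h_pred - h_mms||_2 / ||h_mms||_2, used as the
    NAS performance estimate of a trained architecture."""
    return np.linalg.norm(h_pred - h_mms) / np.linalg.norm(h_mms)
```

Lower values indicate that the trained network reproduces the manufactured solution more accurately.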

Modified NAS

For the modified model shown in Fig. 3, NAS is divided into four main phases. First, a sensitivity analysis constructs the search space with less human expert knowledge. Second, we test several optimization strategies, including random search, Bayesian optimization, Hyperband optimization and Jaya optimization. The third phase is neural network generation, i.e. the generation of physics-informed deep neural networks tailored to the mechanical model based on the information from the optimization. The final phase is training and validation of the models with the input neural architectures, which outputs the performance estimate. A suitable estimate is given in Eq. (28).

Neural networks generator

Mathematicians have developed many tools to approximate functions, such as interpolation theory, spectral methods and finite elements. From the perspective of approximation theory, neural networks can be viewed as nonlinear smooth function approximators. Using the neural network (NN), we obtain an output value that reflects the quality, validity, etc. of the input data, adjust the configuration of the neural network based on this result, recalculate the results and repeat these steps until the target is reached. Physics-informed neural networks, on the other hand, add physical conservation laws and prior physical knowledge to the existing neural network; they require substantially less training data and can result in simpler neural network structures, while achieving high accuracy. A diagram of the structure is shown in Fig. 5. In this section, we formulate the generation of physics-informed neural networks from two aspects: first, the deep neural network as a universal smooth approximator is introduced; second, a simple and generalized way to introduce the physics information for flow in heterogeneous media into the deep neural network is presented.

Fig. 5
figure 5

Physics-informed neural networks

Physics-informed neural network

Physics-informed neural network generators include a neural network interpreter, which represents the configuration of the NN, and a physical information checker. The neural network interpreter consists of a deep neural network with multiple layers: the input layer, one or more hidden layers and the output layer. Each layer consists of one or more nodes called neurons, shown in Fig. 5 by small coloured circles, which are the basic units of computation. In a fully interconnected structure, every pair of neurons in neighbouring layers is connected, the connection being represented by a weight, see Fig. 5. Mathematically, the output of a node is computed by

$$\begin{aligned} y_{i}=\sigma _i\left( \sum _{j} w_{j}^{i}z_{j}^{i}+b^{i}\right) \end{aligned}$$

with input \(z^{i}\), weight \(w^{i}\), bias \(b^{i}\) and activation function \(\sigma _i\). Now let us define:

Definition 3.1

(Feedforward Neural Network) A generalized neural network can be written in tuple form \(\left( (f_1,\sigma _1),...,(f_n,\sigma _n)\right) \), \(f_i\) being an affine-linear function \((f_i = W_i{\varvec{{x}}}+b_i)\) that maps \(R^{i-1} \rightarrow R^{i}\) and the activation \(\sigma _i\) mapping \(R^{i} \rightarrow R^{i}\). The tuple form defines a continuous bounded function mapping \(R^{d}\) to \(R^{n}\):

$$\begin{aligned} FNN: {\mathbb {R}}^d \rightarrow {\mathbb {R}}^n, \; \text {with}\; \; F^n\left( {\varvec{{x}}};\theta \right) = \sigma _n\circ f_n \circ \cdots \circ \sigma _1 \circ f_1, \end{aligned}$$

where d is the dimension of the input, n the number of field variables, \(\theta \) the set of parameters consisting of the weights and biases, and \(\circ \) denotes the element-wise composition.

The universal approximation theorem [51, 52] states that this continuous bounded function F with nonlinear activation \(\sigma \) can capture the smooth, nonlinear behaviour of the system. Accordingly, the following theorem can be shown to hold [53]:

Theorem 1

If \(\sigma ^i \in C^m(R^i)\) is nonconstant and bounded, then \(F^n\) is uniformly m-dense in \(C^m(R^n)\).
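Definition 3.1 amounts to an alternating composition of affine maps and activations. A minimal numpy sketch (the function name is ours; a linear output layer is used here, a common choice for regression problems such as approximating the hydraulic head):

```python
import numpy as np

def fnn(x, params, activation=np.tanh):
    """Feedforward network F(x; theta) = sigma_n o f_n o ... o sigma_1 o f_1,
    where each f_i(z) = W_i z + b_i is affine-linear (cf. Definition 3.1).

    params : list of (W, b) pairs, one per layer.
    """
    z = x
    for W, b in params[:-1]:
        z = activation(W @ z + b)   # hidden layer: affine map then activation
    W, b = params[-1]
    return W @ z + b                # linear output layer
```

For instance, `params` built from weight matrices of shapes (20, 3), (20, 20) and (1, 20) realizes a map from R³ (spatial coordinates) to R¹ (hydraulic head).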

Deep collocation method

The collocation method is a widely used method for seeking numerical solutions of partial differential and integral equations [54]. It is also a popular method for trajectory optimization in control theory. A set of randomly distributed points (also known as collocation points) represents a desired trajectory that minimizes the loss function while satisfying a set of constraints. The collocation method is relatively insensitive to instabilities (such as exploding/vanishing gradients in neural networks) and is a viable way to train deep neural networks [55].

The modified Darcy equation (19) reduces to a second-order differential equation with boundary constraints. Hence, we first discretize the physical domain with collocation points denoted by \({\varvec{{x}}}\,_\Omega =(x_1,...,x_{N_\Omega })^T\). Another set of collocation points, denoted by \({\varvec{{x}}}\,_\Gamma =(x_1,...,x_{N_\Gamma })^T\), is employed to discretize the boundary conditions. The hydraulic head \({\hat{h}}\) is then approximated with the aforementioned deep feedforward neural network \({\hat{h}}^h ({\varvec{x}};\theta )\). A loss function can thus be constructed to find the approximate solution \({\hat{h}}^h \left( {\varvec{x}};\theta \right) \) by minimizing the residual of the governing equation and the boundary conditions. Substituting \({\hat{h}}^h \left( {\varvec{x}}\,_\Omega ;\theta \right) \) into the governing equation, we obtain

$$\begin{aligned} E'\left( {\varvec{{x}}}\,_\Omega ;\theta \right) =K({\varvec{x}}){\hat{h}}_{,ii}^{h}\left( {\varvec{{x}}}\,_\Omega ;\theta \right) +K_{,i}({\varvec{x}}){\hat{h}}^{h}_{,i}\left( {\varvec{{x}}}\,_\Omega ;\theta \right) -f\left( {\varvec{{x}}}\,_\Omega \right) , \end{aligned}$$

which results in a physics-informed deep neural network \(E'\left( {\varvec{{x}}}\,_\Omega ;\theta \right) \). The boundary conditions illustrated in Section 2 can also be expressed through the neural network approximation \({\hat{h}}^h \left( {\varvec{{x}}}\,_\Gamma ;\theta \right) \):

On \(\Gamma _{D}\), we have

$$\begin{aligned} {\hat{h}}^h \left( {\varvec{{x}}}\,_{\Gamma _D};\theta \right) ={\hat{h}}_{MMS}\left( {\varvec{{x}}}\,_{\Gamma _D}\right) . \end{aligned}$$

On \(\Gamma _{N}\),

$$\begin{aligned} {\hat{q}}_n^h \left( {\varvec{{x}}}\,_{\Gamma _N};\theta \right) = -K\left( {\varvec{{x}}}\,_{\Gamma _N}\right) {\hat{h}}_{MMS,n}\left( {\varvec{{x}}}\,_{\Gamma _N}\right) . \end{aligned}$$

Note that the neural networks \(E'\left( {\varvec{{x}}};\theta \right) \) and \({\hat{q}}_n\left( {\varvec{{x}}};\theta \right) \) share the same parameters as \({\hat{h}}^h \left( {\varvec{{x}}};\theta \right) \). With the generated collocation points in the domain and on the boundaries as training dataset, the field function can be learned by minimizing the mean squared error loss function:

$$\begin{aligned} L\left( \theta \right) =MSE=MSE_{E'}+MSE_{\Gamma _{D}}+MSE_{\Gamma _{N}}, \end{aligned}$$


$$\begin{aligned} \begin{aligned}&MSE_{E'}=\frac{1}{N_d}\sum _{i=1}^{N_d}\begin{Vmatrix} E'\left( {\varvec{{x}}}\,_\Omega ;\theta \right) \end{Vmatrix}^2,\\&MSE_{\Gamma _{D}}=\frac{1}{N_{\Gamma _D}}\sum _{i=1}^{N_{\Gamma _D}}\begin{Vmatrix} {\hat{h}}^h \left( {\varvec{{x}}}\,_{\Gamma _D};\theta \right) -{\hat{h}}_{MMS}\left( {\varvec{{x}}}\,_{\Gamma _D}\right) \end{Vmatrix}^2,\\&MSE_{\Gamma _{N}}=\frac{1}{N_{\Gamma _N}}\sum _{i=1}^{N_{\Gamma _N}}\begin{Vmatrix} {\hat{q}}_n\left( {\varvec{{x}}}\,_{\Gamma _N};\theta \right) +K\left( {\varvec{{x}}}\,_{\Gamma _N}\right) {\hat{h}}_{MMS,n}\left( {\varvec{{x}}}\,_{\Gamma _N}\right) \end{Vmatrix}^2, \end{aligned} \end{aligned}$$

where \({\varvec{{x}}}\,_\Omega \in {R^N} \) are the collocation points and \(\theta \in {R^K}\) are the neural network parameters. If \(L\left( \theta \right) = 0\), then \({\hat{h}}^h \left( {\varvec{{x}}};\theta \right) \) is a solution to the hydraulic head. The defined loss function measures how well the approximation satisfies the physical law (governing equation) and the boundary conditions. Our goal is to find a set of parameters \(\theta \) such that the approximation \({\hat{h}}^h \left( {\varvec{{x}}};\theta \right) \) minimizes the loss L. If L is very small, the approximation \({\hat{h}}^h \left( {\varvec{{x}}};\theta \right) \) closely satisfies the governing equation and the boundary conditions, namely

$$\begin{aligned} {\hat{h}}^h = \mathop {\arg \min }_{\theta \in R^K} L\left( \theta \right) . \end{aligned}$$
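The composite loss above can be sketched in plain NumPy. This is a minimal illustration of how the three mean-squared terms combine; the function and argument names are ours, not the authors' implementation, and the residual arrays are assumed to come from evaluating the network and the manufactured solution at the collocation points:

```python
import numpy as np

def pinn_loss(pde_residual, h_pred_D, h_mms_D, q_pred_N, flux_mms_N):
    """Composite loss L(theta) = MSE_E' + MSE_Gamma_D + MSE_Gamma_N.

    pde_residual : E'(x; theta) at the interior collocation points
    h_pred_D     : predicted head on the Dirichlet boundary
    h_mms_D      : manufactured-solution head on the Dirichlet boundary
    q_pred_N     : predicted normal flux on the Neumann boundary
    flux_mms_N   : K * dh_MMS/dn on the Neumann boundary (the '+' sign in
                   MSE_Gamma_N matches the convention q = -K dh/dn)
    """
    mse_pde = np.mean(pde_residual ** 2)
    mse_dirichlet = np.mean((h_pred_D - h_mms_D) ** 2)
    mse_neumann = np.mean((q_pred_N + flux_mms_N) ** 2)
    return mse_pde + mse_dirichlet + mse_neumann
```

The loss vanishes exactly when the PDE residual and both boundary mismatches vanish at every collocation point.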

The solution of groundwater flow problems by the deep collocation method can thus be reduced to an optimization problem. To train the deep feedforward neural network, gradient-descent based optimization algorithms such as Adam are employed. The idea is to take a descent step at collocation point \({\varvec{{x}}}_{i}\) with an Adam-based learning rate \(\alpha _i\),

$$\begin{aligned} \theta _{i+1} = \theta _{i} - \alpha _i \bigtriangledown _{\theta } L \left( {\varvec{{x}}}_i;\theta _i \right) . \end{aligned}$$

The step in Eq. (37) is repeated until a convergence criterion is satisfied. A combined Adam/L-BFGS-B minimization algorithm is used to train the physics-informed neural networks: the network is first trained with the Adam algorithm and, after a defined number of iterations, the loss is further minimized with L-BFGS-B under a small iteration limit.
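The two-stage strategy can be sketched on a toy convex loss. This is only an illustration of the staging, under two stated simplifications: plain gradient descent stands in for Adam in stage 1, and SciPy's L-BFGS-B minimizer stands in for the authors' L-BFGS-B setup in stage 2:

```python
import numpy as np
from scipy.optimize import minimize

# Toy convex loss standing in for L(theta); its gradient is analytic here.
def loss(theta):
    return float(np.sum((theta - 3.0) ** 2))

def grad(theta):
    return 2.0 * (theta - 3.0)

theta = np.zeros(2)

# Stage 1: a fixed number of first-order descent steps
# (the paper uses Adam; plain gradient descent keeps the sketch simple).
lr = 0.1
for _ in range(200):
    theta = theta - lr * grad(theta)

# Stage 2: refine with L-BFGS-B under a small iteration budget.
res = minimize(loss, theta, jac=grad, method="L-BFGS-B",
               options={"maxiter": 50})
theta = res.x
```

The quasi-Newton refinement typically sharpens the last digits of accuracy at low cost once the first-order phase has reached the basin of the minimum.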

The approximation capability of neural networks for solving partial differential equations has been proven by Sirignano and Spiliopoulos [56]. For the stochastic analysis of the porous material model, as long as the problem has a unique solution \({\hat{h}} \in C^2(\Omega )\) with uniformly bounded derivatives and the heterogeneous hydraulic conductivity function \(K({\varvec{{x}}})\) is \(C^{1,1}\) (\(C^1\) with Lipschitz continuous derivative), we can conclude that

$$\begin{aligned} \exists \;\;{\hat{h}}^h \in F^n, \;\;s.t. \;\;as\;\;n\rightarrow \infty ,\;\;L(\theta )\rightarrow 0,\;\;{\hat{h}}^h\rightarrow {\hat{h}}. \end{aligned}$$

More details can be found in Appendix D and [56].

Sensitivity analyses (SA)

Sensitivity analysis determines the influence of each model parameter on the output; only the most important parameters are then considered in the model calibration process. Parameters that have little or no effect on the model results can be disregarded, which significantly reduces the workload of model calibration [57,58,59]. In this work, the parameter sensitivity analysis contributes to the whole NAS model by offering prior knowledge of the DCM, which reduces the dimension of the search space and further improves the computational efficiency of the optimization method.

Global sensitivity analysis methods can be subdivided into qualitative ones, such as the Morris method [60] and the Fourier amplitude sensitivity test (FAST) [61], and quantitative ones, including the Sobol' method [62] and extended FAST (eFAST) [63]. Numerous experiments have compared the advantages and disadvantages of these methods [64,65,66]. The results show that the Sobol' method provides quantitative sensitivity measures but requires a large number of runs to obtain stable results. eFAST is more efficient and stable than the Sobol' method and is thus a good alternative. The method of Morris correctly screens the most and least sensitive parameters of a highly parameterized model with 300 times fewer model evaluations than the Sobol' method. We therefore follow an approach proposed by Crosetto [67]: first screen all hyper-parameters with the Morris method, remove the most and the least influential parameter, and then filter the remaining ones with the eFAST method. This yields the highest accuracy in a relatively small amount of time.
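The Morris screening step can be illustrated with a simplified one-at-a-time sketch. This is not the trajectory-based sampler of [60] but captures its output quantities: the mean absolute elementary effect \(\mu ^*\) (overall importance) and its spread \(\sigma \) (nonlinearity/interaction). All names and the step size are illustrative:

```python
import numpy as np

def morris_screening(model, bounds, n_traj=20, delta=0.1, seed=0):
    """Simplified one-at-a-time Morris screening.

    For each of n_traj random base points, each input is perturbed in turn
    by a step `delta` (in range-scaled units) and the elementary effect
    (f(x + step) - f(x)) / delta is recorded.
    """
    rng = np.random.default_rng(seed)
    bounds = np.asarray(bounds, dtype=float)
    lo, hi = bounds[:, 0], bounds[:, 1]
    k = len(bounds)
    effects = np.zeros((n_traj, k))
    for t in range(n_traj):
        # Base point, leaving room for the positive step to stay in bounds.
        x = lo + rng.random(k) * (hi - lo) * (1.0 - delta)
        f0 = model(x)
        for i in range(k):
            xp = x.copy()
            xp[i] += delta * (hi[i] - lo[i])
            effects[t, i] = (model(xp) - f0) / delta
    return np.abs(effects).mean(axis=0), effects.std(axis=0)
```

For the NAS application, `model` would map a hyper-parameter vector to the relative error of the trained network; parameters with small \(\mu ^*\) are the candidates for removal before the eFAST stage.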

Search methods for NNs

After the sensitivity analysis, the search space is reduced and a suitable search method is employed to explore the space of neural architectures. The search method adopts the performance metric as a reward and learns to generate high-performance architecture candidates. We employ the classical random search method, the Bayesian optimization method [68] and some recently proposed optimization methods, including the Hyperband algorithm [69] and the Jaya algorithm [70].

Transfer learning (TL)

A combined optimizer is adopted for the model training. To improve the computational efficiency and inherit the knowledge learned by the trained model, a transfer learning algorithm is added to the training. Transfer learning stores knowledge gained while solving one problem and applies it to a different but related problem. The basic transfer learning architecture of this model is shown in Fig. 6. It is composed of a pre-trained model and several fine-tuned models. During the neural architecture search, the optimal neural architecture configuration is obtained through a hyperparameter optimization algorithm, and the corresponding weights and biases are saved. These weights and biases are then transferred to the fine-tuning model. The numerical example section shows that this inheritance can greatly improve the learning efficiency: for different statistical parameters of the random log-hydraulic conductivity field, there is no need to train the whole model from scratch, and the solution to the modified Darcy equation is obtained with fewer iterations, a lower learning rate and higher accuracy.
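The pre-train/fine-tune pattern can be sketched with a toy training loop. Plain gradient descent on quadratic losses stands in for the Adam/L-BFGS training of the PINN loss, and the shifted minimum mimics changing the statistical parameters of the conductivity field; everything here is illustrative:

```python
import numpy as np

def train(theta, grad_fn, lr, n_iter):
    """Plain gradient descent, standing in for the Adam/L-BFGS training loop."""
    for _ in range(n_iter):
        theta = theta - lr * grad_fn(theta)
    return theta

# Pre-training on a base field configuration (toy loss with minimum at 1.0).
grad_base = lambda th: 2.0 * (th - 1.0)
theta_pre = train(np.zeros(3), grad_base, lr=0.1, n_iter=300)
# In practice the weights and biases would now be persisted, e.g. via np.savez.

# Fine-tuning on a related configuration (minimum shifted to 1.2): warm-start
# from the pre-trained parameters, with a lower learning rate and far fewer
# iterations than training from scratch would need.
grad_new = lambda th: 2.0 * (th - 1.2)
theta_ft = train(theta_pre.copy(), grad_new, lr=0.01, n_iter=50)
```

Because the warm start begins close to the new optimum, the same small iteration budget leaves a much smaller residual error than a cold start would.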

Fig. 6
figure 6

Transfer learning schematic

Numerical examples

In this section, numerical examples in different dimensions and with various boundary conditions are studied and compared. First, the influence of the exponential and Gaussian correlation functions is discussed. Next, we filter the algorithm-specific parameters by means of a sensitivity analysis and select the parameters that have the greatest impact on the model as our search space. Then, four different hyperparameter optimization algorithms are compared in both accuracy and efficiency to identify a suitable search method for the NAS model. The relative error in Eq. (28) between the predicted results and the manufactured solution is used to build the search strategy for the NAS model. The resulting configurations are then substituted into the PINN, and the results of the PINN model are compared to results obtained with the FDM. All simulations are run on a 64-bit Windows 10 server with an Intel(R) Core(TM) i7-7700HQ CPU and 8 GB memory. The accuracy of the numerical results is compared through the relative error of the hydraulic head, defined as

$$\begin{aligned} \delta {\hat{h}}=\frac{\Vert {\hat{h}}_{predict}-{\hat{h}}_{MMS}\Vert }{\Vert {\hat{h}}_{MMS}\Vert } \end{aligned}$$

with \(\Vert \cdot \Vert \) referring to the \(l^2\)-norm.
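This error measure translates directly into code; the helper below mirrors the formula above, with illustrative argument names for the predicted and manufactured heads sampled at the same points:

```python
import numpy as np

def relative_error(h_pred, h_mms):
    """Relative l2 error between predicted and manufactured hydraulic head."""
    return np.linalg.norm(h_pred - h_mms) / np.linalg.norm(h_mms)
```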

Comparison of Gaussian and exponential correlations

We first compare the two correlation functions, Gaussian and exponential, which are the most widely used correlations for random field generation. We computed the results obtained with these two correlation functions for the one-dimensional (1D), two-dimensional (2D) and three-dimensional (3D) stochastic groundwater flow cases with the same parameters. The number of hidden layers and the number of neurons per layer are uniformly set to 6 and 16, respectively.

One-dimensional groundwater flow with both correlations

The non-homogeneous 1D flow problem for the Darcy equation can be reduced to Eq (19) subject to Eq (C16). The hydraulic conductivity K is constructed from Eq (18) by the random spectral method, see Eq (C17). The source term f of the manufactured solution, Eq (C15), is obtained from Eq (C18). The detailed derivation can be found in Appendix C.
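A randomization-type spectral sample of the 1D log-normal conductivity field can be sketched as follows. The wavenumber scaling assumes the Gaussian correlation convention \(C(r)=\sigma ^2\exp (-r^2/\lambda ^2)\) for \(\ln K\); the exact normalization used in the paper is fixed by Eq (C17), so treat this as an assumption, and the geometric mean of K is normalized to 1:

```python
import numpy as np

def lognormal_K_1d(x, sigma2=0.1, n_modes=2000, corr_len=1.0, seed=0):
    """Kraichnan-type randomized spectral sample of a log-normal conductivity
    field whose ln(K) has Gaussian correlation sigma2*exp(-r^2/corr_len^2)."""
    rng = np.random.default_rng(seed)
    # Wavenumbers drawn from the (assumed) Gaussian spectral density.
    k = rng.normal(0.0, np.sqrt(2.0) / corr_len, n_modes)
    phi = rng.uniform(0.0, 2.0 * np.pi, n_modes)   # random phases
    # Superposition of cosine modes -> approximately Gaussian ln(K) field.
    Y = np.sqrt(2.0 * sigma2 / n_modes) * np.cos(np.outer(x, k) + phi).sum(axis=1)
    return np.exp(Y)
```

As the number of modes N grows, the cosine superposition converges to a Gaussian random field with the prescribed variance and correlation, which is why the tables below report accuracy as a function of N.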

The relative errors \(\delta {\hat{h}}\) of the predicted hydraulic heads for the exponential and Gaussian correlation of the ln(K) field are shown in Tables 1 and 2. The Gaussian correlation is more accurate for all N and \(\sigma ^2\). With the transfer learning model, the accuracy improves further. The predicted hydraulic head, the velocity and the manufactured solution for both exponential and Gaussian correlations with \(\sigma ^2=0.1\) and \(N=2000\) are shown in Fig. 7. The predicted results nearly coincide with the manufactured solution, Eq (C15), in the 1D domain.

Table 1 \(\delta {\hat{h}}\) for 1D case computed with exponential correlation in different variance and number of modes
Table 2 \(\delta {\hat{h}}\) for 1D case computed with Gaussian correlation in different variance and number of modes for one-dimensional case
Fig. 7
figure 7

One-dimensional hydraulic head for \(\sigma ^2=0.1\), \(N=2000\) with (a) exponential correlation and (b) Gaussian correlation

Fig. 8
figure 8

One-dimensional logarithmic loss function with (a) exponential correlation and (b) Gaussian correlation

The log(Loss) vs. iteration graphs for different parameters of the random log-hydraulic conductivity field in Fig. 8 show that (1) the loss value for the Gaussian correlation is much smaller than for the exponential correlation for all \(\sigma ^2\) and N values, and (2) with transfer learning, the loss function drops significantly faster and the number of required iterations is greatly reduced. In summary, the Gaussian correlation outperforms the exponential one in generating random log-hydraulic conductivity fields for the PINN.

Two-dimensional groundwater flow with both correlations

To solve the non-homogeneous 2D flow problem for the Darcy equation, the manufactured solution in Eq (C19) is adopted. The hydraulic conductivity K is constructed from Eq (18) by the random spectral method, see Eq (C21). The source term f is computed according to Eq (C22); see Appendix C for more details. The exponential and Gaussian correlations for the heterogeneous hydraulic conductivity are tested with varying \(\sigma ^2\) and N values. Tables 3 and 4 lead to the same conclusion as before: with increasing N, the predicted hydraulic head becomes more accurate, whereas the accuracy deteriorates in most cases as \(\sigma ^2\) grows. The PINNs with Gaussian correlation based hydraulic conductivity outperform those with exponential correlation. The contour plots of the predicted hydraulic head and velocity as well as the manufactured solution for both correlations with \(\sigma ^2=0.1\) and \(N=2000\) are listed in the supplementary material, in Figs. S1, S2, S3 and S4. The predicted physical patterns agree well with the manufactured solution, Eq (C19).

Table 3 \(\delta {\hat{h}}\) for 2D case computed with exponential correlation in different variance and number of modes
Table 4 \(\delta {\hat{h}}\) for 2D case computed with Gaussian correlation in different variance and number of modes

The log(Loss) vs. iteration graph for different parameters of the random log-hydraulic conductivity field is illustrated in Fig. 9. The loss for the PINN with Gaussian correlation is much smaller and decreases faster, while the loss is not fully minimized for the exponential correlation. With transfer learning, the loss function converges in fewer iterations, which largely reduces the training time. Also for the two-dimensional groundwater flow, the Gaussian correlation performs much better than the exponential correlation.

Fig. 9
figure 9

Two-dimensional logarithmic loss function with (a) exponential correlation and (b) Gaussian correlation

Three-dimensional groundwater flow with both correlations

Let us now focus on the 3D non-homogeneous Darcy equation problem [43]. The manufactured solution in Eq (C30) is adopted. The hydraulic conductivity K is constructed according to Eq (C28), and the source term f is derived from Eq (C29). The exponential and Gaussian correlations for the heterogeneous hydraulic conductivity are again tested with varying \(\sigma ^2\) and N values. Tables 5 and 6 list the relative error of the hydraulic head for the DCM with and without transfer learning. For different \(\sigma ^2\) and N values, the performance of the PINN varies considerably for both correlations. The same tendency as in 1D and 2D holds: the Gaussian correlation outperforms the exponential one, and transfer learning has a significant impact on the computational cost. The hydraulic heads predicted with both correlation functions for \(\sigma ^2=0.1\) and \(N=2000\) are shown in Figs. S5, S6, S7 and S8 (Fig. 10).

Table 5 \(\delta {\hat{h}}\) for 3D case computed with exponential correlation in different variance and number of modes
Table 6 \(\delta {\hat{h}}\) for 3D case computed with Gaussian correlation in different variance and number of modes
Fig. 10
figure 10

Three-dimensional logarithmic loss function with (a) exponential correlation and (b) Gaussian correlation

The computational cost of the DCM with both correlation functions is shown in Table 7. The Gaussian correlation function is not only more accurate but also more efficient.

Table 7 Calculation time required in different dimensions

In summary, the comparison reveals that the loss function for the Gaussian correlation tends to decrease faster than for the exponential one, and that the error with the Gaussian correlation is much smaller and more stable. The Gaussian correlation also requires less computation time. Note also that the loss function of the exponential correlation leads to gradient explosion when the number of collocation points exceeds 150, while this is not observed for the Gaussian correlation, even for much larger numbers of collocation points. Subsequently, we will only use the Gaussian correlation.

Sensitivity analysis results

The sensitivity analysis eliminates irrelevant variables and thereby reduces the computational cost of the hyperparameter optimizer. The hyper-parameters in this flow problem are listed in Table 8.

Table 8 Hyper-parameters and their intervals in groundwater flow problem

The sensitivity analysis results obtained by the hybrid Morris-eFAST method are shown as follows:

Fig. 11
figure 11

Sensitivity histogram of Morris

Fig. 12
figure 12

Morris \(\mu ^*\) and \(\sigma \) computed using the Morris screening algorithm

Fig. 13
figure 13

Sensitivity histogram of eFAST

From Figs. 11 and 12, we conclude that the number of layers and the number of neurons have the greatest impact, whereas the maximum line search depth (maxls) of L-BFGS has almost no effect. We therefore remove the number of layers and maxls and calculate the sensitivity of the remaining three parameters with the eFAST model. The results are summarized in Fig. 13: the number of neurons is the second most important parameter. Hence, the layers and the neurons are chosen as the hyper-parameters in the search space for the automated machine learning approach.

Hyperparameter optimizations method comparison

To select the most suitable hyperparameter optimization algorithm, we use the two hyperparameters identified by the sensitivity analysis in Sect. 4.2 as search variables and run the four algorithms presented in the previous section, with all other conditions kept equal. The horizontal coordinates in Fig. 14 represent the number of neurons per layer and the vertical coordinates the number of hidden layers.
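The simplest of the four candidates, random search over the two retained variables, can be sketched as follows. The intervals and the `evaluate` callback are illustrative, not the paper's exact search space; `evaluate` would train a DCM with the given configuration and return its relative error:

```python
import numpy as np

def random_search(evaluate, n_trials=20, seed=0):
    """Random search over the two NAS variables kept after the sensitivity
    analysis: number of hidden layers and neurons per layer."""
    rng = np.random.default_rng(seed)
    best_err, best_cfg = np.inf, None
    for _ in range(n_trials):
        layers = int(rng.integers(2, 11))    # 2..10 hidden layers (assumed)
        neurons = int(rng.integers(8, 65))   # 8..64 neurons per layer (assumed)
        err = evaluate(layers, neurons)      # e.g. relative error of the trained DCM
        if err < best_err:
            best_err, best_cfg = err, (layers, neurons)
    return best_err, best_cfg
```

Bayesian optimization replaces the uniform draws with proposals guided by a surrogate model of the error surface, which is why it reaches a good configuration in fewer trials in the comparison below.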

Fig. 14
figure 14

Neural network configuration search results with (a) random search; (b) Bayesian optimization; (c) Hyperband optimization; (d) Jaya optimization

The time required for each method and the search accuracy are shown in Table 9.

Table 9 Hyper-parameters search results with different algorithms

The Bayesian method gives the best accuracy in the shortest time and is subsequently adopted. Due to the limited number of searches, the optimum found by the algorithm is not necessarily the globally best one; it gradually approaches the optimal configuration as the number of searches increases. For the two- and three-dimensional cases, the optimal configurations are illustrated in Figs. 15 and 16.

Fig. 15
figure 15

Neural network configuration search results of Bayesian optimization in two dimensions

Fig. 16
figure 16

Neural network configuration search results of Bayesian optimization in three dimensions

The optimal configuration obtained after screening is shown in Table 10. These neural network configurations will be used as input parameters for the next numerical tests.

Table 10 Neural architecture search results with Bayesian optimization

Model validation in different dimensions

We now solve the modified Darcy equation, Eq (19), with the NAS based DCM and the optimized configurations from Sect. 3, i.e. we fix \(\sigma ^2\) to 0.1 and N to 1000. The manufactured solutions can be found in Appendix C. The results are compared to solutions obtained with the finite difference method (FDM).
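For reference, a minimal 1D version of the FDM baseline can be sketched as below. It solves \(-(K(x)\,h'(x))' = f(x)\) with Dirichlet ends using the standard conservative three-point stencil; the grid size, sign convention and helper names are ours and may differ from the FDM setup actually used for the comparison:

```python
import numpy as np

def fdm_darcy_1d(K, f, h0, hL, L=25.0, n=200):
    """Conservative second-order FDM for -(K(x) h'(x))' = f(x) on [0, L]
    with Dirichlet values h(0)=h0, h(L)=hL."""
    x = np.linspace(0.0, L, n + 1)
    dx = x[1] - x[0]
    Km = K(0.5 * (x[:-1] + x[1:]))            # conductivity at cell midpoints
    A = np.zeros((n - 1, n - 1))
    b = f(x[1:-1]) * dx ** 2
    for i in range(n - 1):
        A[i, i] = Km[i] + Km[i + 1]
        if i > 0:
            A[i, i - 1] = -Km[i]
        if i < n - 2:
            A[i, i + 1] = -Km[i + 1]
    b[0] += Km[0] * h0                        # fold Dirichlet values into rhs
    b[-1] += Km[-1] * hL
    return x, np.concatenate(([h0], np.linalg.solve(A, b), [hL]))
```

For the strongly oscillatory conductivity fields considered here, such a scheme needs a very fine grid to resolve K(x), which is the root of the cost gap reported below.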

One-dimensional case model validation

The 1D manufactured solution is Eq (C15). We validate the two methods by comparing the hydraulic head in the x-direction in the interval [0, 25], see Fig. 17.

Fig. 17
figure 17

Hydraulic head calculated by FDM and DCM methods in one dimension

Two-dimensional case model validation

The 2D manufactured solution is Eq (C23). We focus on the hydraulic head and the velocity in the x-direction along the midline \(y=10\) of the domain \([0,20]\times [0,20]\).

Fig. 18
figure 18

Hydraulic head along \(y=10\) calculated by the FDM and DCM in two dimensions

Fig. 19
figure 19

Velocity in x-direction along \(y=10\) calculated by the FDM and DCM in two dimensions

Figure 18 demonstrates that both methods match the exact solution for the hydraulic head well. However, as seen from Fig. 19, the FDM predicts \(v_x\) poorly, while the proposed DCM still agrees well with the exact solution.

Three-dimensional case model validation

The manufactured solution for the 3D case is given by Eq (C30). We again compute the hydraulic head and the velocity in the x-direction at \(y=1, z=0.5\) over the domain \([0,5]\times [0,2]\times [0,1]\). The results are summarized in Tables 11 and 12. While the results obtained with the FDM are rather poor, the DCM approach still provides solutions close to the exact one (Fig. 20).

Table 11 Solving Darcy equation with FDM
Table 12 Solving Darcy equation with DCM
Fig. 20
figure 20

Logarithm loss function with and without transfer learning

The higher the dimensionality of the problem, the more pronounced is the difference between the two methods (FDM and DCM). The FDM method requires an extremely dense discretization, which in turn leads to a high computational cost. DCM yields very accurate results even for very few training points. Transfer learning further reduces the computational cost while simultaneously slightly improving the accuracy. The contour plots for the hydraulic head and velocity are visualized in Figs. 21 and 22.

Fig. 21
figure 21

Three-dimensional hydraulic head for \(\sigma ^2=0.1\), \(N=1000\) with Gaussian correlation: (a) exact solution and (b) predicted solution

Fig. 22
figure 22

Three-dimensional velocity for \(\sigma ^2=0.1\), \(N=1000\) with Gaussian correlation: (a) exact solution and (b) predicted solution

The isosurface diagrams of the predicted head and velocity are illustrated in Figs. 23 and 24.

Fig. 23
figure 23

Three-dimensional isosurface diagram of the hydraulic head for \(\sigma ^2=0.1\), \(N=1000\) with Gaussian correlation: (a) exact solution and (b) predicted solution

Fig. 24
figure 24

Three-dimensional isosurface diagram of the velocity for \(\sigma ^2=0.1\), \(N=1000\) with Gaussian correlation: (a) exact solution and (b) predicted solution


In this paper, we proposed a NAS-based stochastic DCM employing sensitivity analysis and transfer learning to reduce the computational cost and improve the accuracy. The random spectral method in closed form is adopted for the generation of log-normal hydraulic conductivity fields; it was calibrated to generate heterogeneous hydraulic conductivity fields with a Gaussian correlation function. Based on the sensitivity analysis and a comparison of hyperparameter selection methods, the Bayesian algorithm was identified as the most suitable optimizer for the search strategy in the NAS model. While the sensitivity analysis and NAS incur additional cost, the overall computational cost for a specified accuracy is still reduced. Furthermore, for certain types of problems, it is not necessary to repeat these steps. To validate our approach, groundwater flow in highly heterogeneous aquifers is considered.

Since no feature engineering is involved in our PINN, the NAS based DCM can be considered a truly automated "meshfree" method that can approximate any continuous function. The presented automated DCM is simple to implement, as it requires only the definition of the underlying BVP/IBVP and its boundary conditions.

Through several numerical examples in 1D, 2D and 3D, we showed that the presented NAS-based DCM significantly outperforms the FDM in terms of computational efficiency and accuracy. The benefits become more pronounced with increasing dimension and hence 'complexity'. Note that the presented NAS-based DCM outperforms the FDM even when all the steps from sensitivity analysis, optimization and training are accounted for. Moreover, once the deep neural networks are trained, they can evaluate the solution at any desired points with minimal additional computation time. The limitations of the proposed stochastic deep collocation method are the computational cost of the neural architecture search for large multi-scale complex problems and the fact that the gradient-descent based optimizer may get stuck in a local optimum. These topics will be investigated in our future research towards a more generalised and improved NAS-based deep collocation method.


  1. Kolyukhin Dmitry, Sabelfeld Karl (2005) Stochastic flow simulation in 3d porous media. Monte Carlo Methods Appl 11(1):15–37

    MathSciNet  MATH  Google Scholar 

  2. Freeze RA (1975) A stochastic-conceptual analysis of one-dimensional groundwater flow in nonuniform homogeneous media. Water Resour Res 11(5):725–741

    Google Scholar 

  3. Matheron G, De Marsily G (1980) Is transport in porous media always diffusive? A counter example. Water Resour Res 16(5):901–917

    Google Scholar 

  4. Kolyukhin D, Sabelfeld K (2010) Stochastic flow simulation and particle transport in a 2d layer of random porous medium. Transp Porous Media 85:347–373

    MathSciNet  Google Scholar 

  5. Gelhar LW (1986) Stochastic subsurface hydrology from theory to applications. Water Resour Res 22(9S):S135S-145S

    Google Scholar 

  6. Carman Phillip C (1997) Fluid flow through granular beds. Chem Eng Res Design 75:S32–S48

    Google Scholar 

  7. Rumpf H, Gupte AR (1975) The influence of porosity and grain size distribution on the permeability equation of porous flow. Chem Ing Technol 43(6):367–375

    Google Scholar 

  8. Pape H, Clauser C, Iffland J (2000) Variation of permeability with porosity in sandstone diagenesis interpreted with a fractal pore space model. Pure Appl Geophys 157:603–619

    Google Scholar 

  9. Bakr et al (1983) Stochastic analysis of spatial variability in subsurface flows 1. Comparison of one- and three-dimensional flows. Water Resour Res 19(1):161–180

    Google Scholar 

  10. Mantoglou A, Wilson JL (1982) The turning bands method for simulation of random fields using line generation by a spectral method. Water Resour Res 18(5):1379–1394

    Google Scholar 

  11. Bellin A, Rubin Y (1996) Hydrogen: a spatially distributed random field generator for correlated properties. Stoch Hydrol Hydraul 10:253–278

    MATH  Google Scholar 

  12. Kraichnan RH (1970) Diffusion by a random velocity field. Phys Fluids 13(1):22–31

    MATH  Google Scholar 

  13. Ababou R, McLaughlin D, Gelhar LW (1989) Numerical simulation of three-dimensional saturated flow in randomly heterogeneous porous media. Transp Porous Media 4(6):549–565

    Google Scholar 

  14. Hinton Geoffrey E, Osindero Simon, Teh Yee-Whye (2006) A fast learning algorithm for deep belief nets. Neural Comput 18(7):1527–1554

    MathSciNet  MATH  Google Scholar 

  15. LeCun Yann, Bengio Yoshua, Hinton Geoffrey (2015) Deep learning. Nature 521(7553):436

    Google Scholar 

  16. Yang Liping, MacEachren Alan, Mitra Prasenjit, Onorati Teresa (2018) Visually-enabled active deep learning for (geo) text and image classification: a review. ISPRS Int J Geo-Inf 7(2):65

    Google Scholar 

  17. Zhao Z-Q, Zheng P, Xu S-t, Wu X (2019) Object detection with deep learning: a review. IEEE Trans Neural Netw Learn Syst 30(11):3212–3232

    Google Scholar 

  18. Nassif AB, Shahin I, Attili I, Azzeh M, Shaalan K (2019) Speech recognition using deep neural networks: a systematic review. IEEE Access 7:19143–19165

    Google Scholar 

  19. Mills K, Spanner M, Tamblyn I (2017) Deep learning and the Schrödinger equation. Phys Rev A 96(4):042113

    Google Scholar 

  20. Weinan E, Han Jiequn, Jentzen Arnulf (2017) Deep learning-based numerical methods for high-dimensional parabolic partial differential equations and backward stochastic differential equations. Commun Math Stat 5(4):349–380

    MathSciNet  MATH  Google Scholar 

  21. Han Jiequn, Jentzen Arnulf, Weinan E (2018) Solving high-dimensional partial differential equations using deep learning. PNAS; Proc Natl Acad Sci 115(34):8505–8510

    MathSciNet  MATH  Google Scholar 

  22. Raissi Maziar, Perdikaris Paris, Karniadakis George E (2019) Physics-informed neural networks: a deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J Comput Phys 378:686–707

    MathSciNet  MATH  Google Scholar 

  23. Raissi Maziar, Em Karniadakis George (2018) Hidden physics models: machine learning of nonlinear partial differential equations. J Comput Phys 357:125–141

    MathSciNet  MATH  Google Scholar 

  24. Raissi Maziar (2018) Deep hidden physics models: deep learning of nonlinear partial differential equations. J Mach Learn Res 19(1):932–955

    MathSciNet  MATH  Google Scholar 

  25. Beck Christian, Weinan E, Jentzen Arnulf (2019) Machine learning approximation algorithms for high-dimensional fully nonlinear partial differential equations and second-order backward stochastic differential equations. J Nonlinear Sci 29(4):1563–1619

    MathSciNet  MATH  Google Scholar 

  26. Anitescu Cosmin, Atroshchenko Elena, Alajlan Naif, Rabczuk Timon (2019) Artificial neural network methods for the solution of second order boundary value problems. Comput Mater Contin 59(1):345–359

    Google Scholar 

  27. Guo Hongwei, Zhuang Xiaoying, Rabczuk Timon (2019) A deep collocation method for the bending analysis of kirchhoff plate. Comput Mater Continua 59(2):433–456

    Google Scholar 

  28. Samaniego Esteban, Anitescu Cosmin, Goswami Somdatta, Nguyen-Thanh Vien Minh, Guo Hongwei, Hamdia Khader, Zhuang X, Rabczuk T (2020) An energy approach to the solution of partial differential equations in computational mechanics via machine learning: Concepts, implementation and applications. Comput Methods Appl Mech Eng 362:112790

    MathSciNet  MATH  Google Scholar 

  29. Nguyen-Thanh Vien Minh, Zhuang Xiaoying, Rabczuk Timon (2020) A deep energy method for finite deformation hyperelasticity. Eur J Mech-A/Solids 80:103874

    MathSciNet  MATH  Google Scholar 

  30. Goswami Somdatta, Anitescu Cosmin, Chakraborty Souvik, Rabczuk Timon (2020) Transfer learning enhanced physics informed neural network for phase-field modeling of fracture. Theor Appl Fract Mech 106:102447

    Google Scholar 

  31. Thornton C, Hutter F, Hoos HH, Leyton-Brown K (2013) Auto-weka: combined selection and hyperparameter optimization of classification algorithms. In: Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining, pp 847–855

  32. Feurer M, Klein A, Eggensperger K, Springenberg J, Blum M, Hutter F (2015) Efficient and robust automated machine learning. In: Advances in neural information processing systems, pp 2962–2970

  33. H2O AutoML, June 2017. H2O version

  34. Le Trang T, Fu W, Moore JH (2020) Scaling tree-based automated machine learning to biomedical big data with a feature set selector. Bioinformatics 36(1):250–256

    Google Scholar 

  35. Elsken Thomas, Metzen Jan Hendrik, Hutter Frank et al (2019) Neural architecture search: a survey. J Mach Learn Res 20(55):1–21

    MathSciNet  MATH  Google Scholar 

  36. Pham H, Guan M, Zoph B, Le Q, Dean J (2018) Efficient neural architecture search via parameters sharing. In: International Conference on Machine Learning, PMLR, pp 4095–4104

  37. Liu , Zoph B, Neumann M, Shlens J, Hua W, Li L-J, Fei-Fei L, Yuille A, JHuang, Murphy K (2018) Progressive neural architecture search. In: Proceedings of the European Conference on Computer Vision (ECCV), pp 19–34

  38. Fraikin Nicolas, Funk Kilian, Frey Michael, Gauterin Frank (2019) Dimensionality reduction and identification of valid parameter bounds for the efficient calibration of automated driving functions. Automot Engine Technol 4(1–2):75–91

    Google Scholar 

  39. Salandin P, Fiorotto V (1998) Solute transport in highly heterogeneous aquifers. Water Resour Res 34:949–961

    Google Scholar 

  40. Dentz M, Kinzelbach H, Attinger S, Kinzelbach W (2002) Temporal behaviour of a solute cloud in a heterogeneous porous medium. 3. Numerical simulations. Water Resour Res 38(7):1118–1130

    MATH  Google Scholar 

  41. Heße Falk, Prykhodko Vladyslav, Schlüter Steffen, Attinger Sabine (2014) Generating random fields with a truncated power-law variogram: a comparison of several numerical methods. Environ Model Softw 55:32–48

    Google Scholar 

  42. Kramer PR, Kurbanmuradov O, Sabelfeld K (2007) Comparative analysis of multiscale gaussian random field simulation algorithms. J Comput Phys 226(1):897–924

    MathSciNet  MATH  Google Scholar 

  43. Alecsa Cristian D, Boros Imre, Frank Florian, Knabner Peter, Nechita Mihai, Prechtel Alexander, Rupp Andreas, Suciu Nicolae (2020) Numerical benchmark study for flow in highly heterogeneous aquifers. Adv Water Res 138:103558

    Google Scholar 

  44. Sudicky EA, Illman WA, Goltz IK, Adams JJ, McLaren RG (2010) Heterogeneity in hydraulic conductivity and its role on the macroscale transport of a solute plume: From measurements to a practical application of stochastic flow and transport theory. Water Resour Res 46(1):W01508

    Google Scholar 

  45. Attinger Sabine (2003) Generalized coarse graining procedures for flow in porous media. Comput Geosci 7(4):253–273

    MathSciNet  MATH  Google Scholar 

  46. Gelhar Lynn W, Axness Carl L (1983) Three-dimensional stochastic analysis of macrodispersion in aquifers. Water Resour Res 19(1):161–180

    Google Scholar 

  47. Dagan G (1989) Flow and transport in porous formations. Springer, Berlin

    Google Scholar 

  48. Tremblay D, Etienne S, Pelletier D (2006) Code verification and the method of manufactured solutions for fluid-structure interaction problems. In: 36th AIAA Fluid Dynamics Conference and Exhibit, pp 3218

  49. Malaya Nicholas, Estacio-Hiroms Kemelli C, Stogner Roy H, Schulz Karl W, Bauman Paul T, Carey Graham F (2013) Masa: a library for verification using manufactured and analytical solutions. Eng Comput 29(4):487–496

    Google Scholar 

  50. Kang-moon P, Shin D, Yoo Y (2020) Evolutionary neural architecture search (NAS) using chromosome non-disjunction for korean grammaticality tasks. Appl Sci 10(10):3457

    Google Scholar 

  51. Funahashi Ken-Ichi (1989) On the approximate realization of continuous mappings by neural networks. Neural Netw 2(3):183–192

    Google Scholar 

  52. Hornik Kurt, Stinchcombe Maxwell, White Halbert (1989) Multilayer feedforward networks are universal approximators. Neural Netw 2(5):359–366

    MATH  Google Scholar 

  53. Hornik Kurt (1991) Approximation capabilities of multilayer feedforward networks. Neural netw 4(2):251–257

    MathSciNet  Google Scholar 

  54. Atluri SN (2005) Methods of computer modeling in engineering & the sciences, vol 1. Tech Science Press, Palmdale

    Google Scholar 

  55. Liaqat A, Fukuhara M, Takeda T (2003) Optimal estimation of parameters of dynamical systems by neural network collocation method. Comput Phys Commun 150(3):215–234

  56. Sirignano J, Spiliopoulos K (2018) DGM: a deep learning algorithm for solving partial differential equations. J Comput Phys 375:1339–1364

  57. Gardner RH, O'Neill RV, Mankin JB, Carney JH (1981) A comparison of sensitivity analysis and error analysis based on a stream ecosystem model. Ecol Model 12(3):173–190

  58. Henderson-Sellers B, Henderson-Sellers A (1996) Sensitivity evaluation of environmental models using fractional factorial experimentation. Ecol Model 86(2–3):291–295

  59. Majkowski J, Ridgeway JM, Miller DR (1981) Multiplicative sensitivity analysis and its role in development of simulation models. Ecol Model 12(3):191–208

  60. Morris MD (1991) Factorial sampling plans for preliminary computational experiments. Technometrics 33(2):161–174

  61. McRae GJ, Tilden JW, Seinfeld JH (1982) Global sensitivity analysis—a computational implementation of the Fourier amplitude sensitivity test (FAST). Comput Chem Eng 6(1):15–25

  62. Nossent J, Elsen P, Bauwens W (2011) Sobol' sensitivity analysis of a complex environmental model. Environ Model Softw 26(12):1515–1525

  63. Zhang J, Wei S et al (2012) Sensitivity analysis of CERES-Wheat model parameters based on EFAST method. J China Agric Univ 17(5):149–154

  64. Wang A, Solomatine DP (2019) Practical experience of sensitivity analysis: comparing six methods, on three hydrological models, with three performance criteria. Water 11(5):1062

  65. Herman JD, Kollat JB, Reed PM, Wagener T (2013) Method of Morris effectively reduces the computational demands of global sensitivity analysis for distributed watershed models. Hydrol Earth Syst Sci 17(7):2893–2903

  66. Brevault L, Balesdent M, Bérend N, Le Riche R (2013) Comparison of different global sensitivity analysis methods for aerospace vehicle optimal design. In: 10th World Congress on Structural and Multidisciplinary Optimization, WCSMO-10

  67. Crosetto M, Tarantola S (2001) Uncertainty and sensitivity analysis: tools for GIS-based model implementation. Int J Geogr Inf Sci 15(5):415–437

  68. Snoek J, Larochelle H, Adams RP (2012) Practical Bayesian optimization of machine learning algorithms. In: Advances in neural information processing systems, pp 2951–2959

  69. Li L, Jamieson K, DeSalvo G, Rostamizadeh A, Talwalkar A (2017) Hyperband: a novel bandit-based approach to hyperparameter optimization. J Mach Learn Res 18(1):6765–6816

  70. Rao RV (2019) Jaya: an advanced optimization algorithm and its engineering applications. Springer International Publishing AG, Switzerland

  71. Weisstein EW (2002) Sphere point picking. From MathWorld, A Wolfram Web Resource

  72. Cramér H, Leadbetter MR (2013) Stationary and related stochastic processes: sample function properties and their applications. Dover Publications, New York


The authors extend their appreciation to the Distinguished Scientist Fellowship Program (DSFP) at King Saud University for funding this work.


Open Access funding enabled and organized by Projekt DEAL.

Author information


Corresponding author

Correspondence to Timon Rabczuk.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (DOCX 5 KB)



Data flow in stochastic deep collocation method

The submodules of the stochastic deep collocation method and the data flow through the whole model are illustrated in Figure 25. Three main submodules are involved: the neural architecture search method, the physics-informed deep neural networks, and a transfer learning technique. First, the neural architecture search model shown in the left module is run to find a physics-informed deep neural network with optimal performance; the deep collocation method on the right is then constructed from the deep neural network with the searched configuration. To enhance the model generality and efficiency, the resulting network settings are inherited for transfer learning. In general, the data first flow through the neural architecture search model, which identifies an optimal deep neural network architecture and the corresponding parameter settings, and then into the PINNs model for the stochastic analysis of heterogeneous porous media with initial hydraulic and material parameters. Finally, the data flow into the transfer learning module for stochastic flow analysis in more general cases, which reduces the computational cost and improves the model accuracy and generality.

Fig. 25

Data flow in stochastic deep collocation method and its submodules

Choices of random variables \(\mathbf{k} \)

From Section 2.2, we can derive the following probability density functions (PDFs) \(p({\varvec{k}})\) for exponential and Gaussian correlations, respectively:

$$\begin{aligned} p({\varvec{k}})= & {} \uplambda ^d \frac{\Gamma [\frac{d+1}{2}]}{(\pi (1+({\varvec{k}}\uplambda )^2))^{\frac{d+1}{2}}} \end{aligned}$$
$$\begin{aligned} p({\varvec{k}})= & {} \pi ^{d/2}\uplambda ^d e^{-(\pi {\varvec{k}}\uplambda )^2}, \end{aligned}$$

where \(\Gamma \) denotes the gamma function, with \(\Gamma (n)=(n-1)!\) and \(\Gamma (n+1/2)=\frac{\sqrt{\pi }}{2^n}(2n-1)!!\) for \(n=1,2,3,\ldots \)

For Gaussian correlations in the three-dimensional case, the PDF of \({\varvec{k}}\) factorizes as:

$$\begin{aligned} p({\varvec{k}})= & {} \left( \sqrt{\pi }\uplambda _1 e^{-(\pi k_1\uplambda _1)^2}\right) \left( \sqrt{\pi }\uplambda _2 e^{-(\pi k_2\uplambda _2)^2}\right) \nonumber \\&\quad \left( \sqrt{\pi }\uplambda _3 e^{-(\pi k_3\uplambda _3)^2}\right) . \end{aligned}$$

Each factor of Equation (B3) can be regarded as a normal distribution with \(\mu =0\) and \(\sigma =\frac{1}{\sqrt{2}\pi \uplambda _i}\). Thus, the random vector \({\varvec{k}}\) can be simulated as \({\varvec{k}}=\frac{1}{\sqrt{2}\pi }(\mu _1/\uplambda _1,\mu _2/\uplambda _2,\mu _3/\uplambda _3)\), where the \(\mu _i\) are independent standard Gaussian random variables.

The two-dimensional case follows by analogy: \({\varvec{k}}=\frac{1}{\sqrt{2}\pi }(\mu _1/\uplambda _1,\mu _2/\uplambda _2)\).
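This sampling rule can be sketched in a few lines of NumPy (a minimal illustration; the function name and the correlation lengths below are our own choices):

```python
import numpy as np

def sample_k_gaussian(lam, n_modes, rng):
    """Draw wavenumber vectors k for Gaussian correlations:
    component-wise k_i = mu_i / (sqrt(2) * pi * lam_i) with mu_i ~ N(0, 1)."""
    lam = np.asarray(lam, dtype=float)   # correlation lengths (lam_1, ..., lam_d)
    mu = rng.standard_normal((n_modes, lam.size))
    return mu / (np.sqrt(2.0) * np.pi * lam)

rng = np.random.default_rng(0)
lam = np.array([1.0, 2.0, 4.0])          # illustrative correlation lengths
k = sample_k_gaussian(lam, 100_000, rng)
# Each column should be N(0, sigma_i^2) with sigma_i = 1 / (sqrt(2) * pi * lam_i).
print(k.shape)
print(np.allclose(k.std(axis=0), 1.0 / (np.sqrt(2.0) * np.pi * lam), rtol=0.02))
```

The two-dimensional variant is obtained by simply passing two correlation lengths.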

For exponential correlations in the two-dimensional case, the PDF of \({\varvec{k}}\) can be transformed into the following form:

$$\begin{aligned} p({\varvec{k}})= \frac{\uplambda _1 \uplambda _2}{2\pi \big (1+(k_1\uplambda _1)^2+(k_2\uplambda _2)^2\big )^{\frac{3}{2}}}. \end{aligned}$$

A possible way to calculate the cumulative distribution function (CDF) is to transform from Cartesian into polar coordinates, i.e. a representation of the form:

$$\begin{aligned} \begin{aligned} k_1=r \cos (2\pi {\hat{h}})/\uplambda _1,\\ k_2=r \sin (2\pi {\hat{h}})/\uplambda _2. \end{aligned} \end{aligned}$$

Here \({\hat{h}}\) is a uniformly distributed random variable and r is a random variable distributed according to

$$\begin{aligned} p_r(r)=\frac{2\pi r p(r)}{\uplambda _1\uplambda _2}. \end{aligned}$$

Integrating Equation (B6) from zero (the radial variable is non-negative) yields the CDF

$$\begin{aligned} \begin{aligned} F(r)&=\int _{0}^{r} p_r(r') \mathrm {d}r'\\&=\int _{0}^{r} \frac{r'}{(1+r'^2)^{3/2}} \mathrm {d}r'\\&=-\frac{1}{(1+r'^2)^{1/2}}\bigg |_{0}^{r}\\&=1-\frac{1}{(1+r^2)^{1/2}}. \end{aligned} \end{aligned}$$

Choosing a uniformly distributed random variable \(\mu \) on (0, 1), the inverse \(r=F^{-1}(\mu )\) can be obtained; since \(1-\mu \) is uniform whenever \(\mu \) is, we may solve \(\mu =(1+r^2)^{-1/2}\) directly:

$$\begin{aligned} \begin{aligned} \mu&=\frac{1}{(1+r^2)^{1/2}}\\ \mu ^2&=\frac{1}{1+r^2}\\ r&=\sqrt{1/\mu ^2-1}. \end{aligned} \end{aligned}$$

Substituting this expression for r into Equation (B5), we obtain \(k_1=(1/\mu ^2-1)^{1/2}\cos (2\pi {\hat{h}})/\uplambda _1\) and \(k_2=(1/\mu ^2-1)^{1/2}\sin (2\pi {\hat{h}})/\uplambda _2\).
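The complete inverse-transform sampler for the two-dimensional exponential correlation then reads, as a minimal sketch (all names are our own):

```python
import numpy as np

def sample_k_exp_2d(lam1, lam2, n_modes, rng):
    """Sample k for a 2D exponential correlation via inverse transform:
    r = sqrt(1/mu^2 - 1) with mu ~ U(0,1); angle = 2*pi*h with h ~ U(0,1)."""
    mu = rng.uniform(size=n_modes)
    h = rng.uniform(size=n_modes)
    r = np.sqrt(1.0 / mu**2 - 1.0)
    return r * np.cos(2.0 * np.pi * h) / lam1, r * np.sin(2.0 * np.pi * h) / lam2

rng = np.random.default_rng(1)
lam1, lam2 = 1.0, 2.0
k1, k2 = sample_k_exp_2d(lam1, lam2, 200_000, rng)
# Sanity check: F(r) = 1 - 1/sqrt(1 + r^2) of the sampled radii is U(0,1),
# so its empirical mean should be close to 1/2.
r = np.hypot(k1 * lam1, k2 * lam2)
F = 1.0 - 1.0 / np.sqrt(1.0 + r**2)
print(abs(F.mean() - 0.5) < 0.005)
```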

For exponential correlations in the three-dimensional case, the PDF of \({\varvec{k}}\) can be transformed into the following form:

$$\begin{aligned} p({\varvec{k}})= \frac{\uplambda _1 \uplambda _2 \uplambda _3}{\pi ^2(1+(k_1\uplambda _1)^2+(k_2\uplambda _2)^2+(k_3\uplambda _3)^2)^2}. \end{aligned}$$

A similar procedure can be used, with spherical instead of polar coordinates:

$$\begin{aligned} \begin{aligned} k_1&=r \sin (\theta ) \cos (2\pi \gamma )/\uplambda _1,\\ k_2&=r \sin (\theta ) \sin (2\pi \gamma )/\uplambda _2,\\ k_3&=r \cos (\theta )/\uplambda _3. \end{aligned} \end{aligned}$$

Here \(\gamma \) is again a uniformly distributed random variable and \(\theta \) is given as

$$\begin{aligned} \theta = \arccos (1 - 2\xi ), \end{aligned}$$

with \(\xi \) being a uniformly distributed random variable. The two random variables were chosen following Weisstein's recipe for generating uniformly distributed points on the surface of a unit sphere [71]. The radius r is distributed according to

$$\begin{aligned} p_r(r)=\frac{4\pi r^2 p(r)}{\uplambda _1\uplambda _2\uplambda _3}. \end{aligned}$$

The CDF can be calculated as follows:

$$\begin{aligned} \begin{aligned} F(r)&=\int _{0}^{r} p_r(r') \mathrm {d}r'\\&=\int _{0}^{r} \frac{4 r'^2}{\pi (1+r'^2)^2} \mathrm {d}r'\\&=\frac{2}{\pi }\left( -\frac{r'}{1+r'^2}\bigg |_{0}^{r}+\int _{0}^{r} \frac{1}{1+r'^2} \mathrm {d}r'\right) \\&=\frac{2}{\pi }\left( \arctan (r)-\frac{r}{1+r^2}\right) . \end{aligned} \end{aligned}$$

Choosing a uniformly distributed random variable \(\gamma _1\), r is obtained by solving Equation (B14) numerically:

$$\begin{aligned} \frac{2}{\pi }\left( \arctan (r)-\frac{r}{1+r^2}\right) =\gamma _1. \end{aligned}$$
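Equation (B14) has no closed-form inverse, so r must be recovered numerically; since the radial CDF is monotone on \([0,\infty )\), simple bisection suffices. A minimal sketch (function names are our own):

```python
import numpy as np

def F_radial(r):
    """Radial CDF for the 3D exponential correlation (derived above)."""
    return (2.0 / np.pi) * (np.arctan(r) - r / (1.0 + r**2))

def invert_F(gamma1, hi=1e8, iters=200):
    """Solve F_radial(r) = gamma1 for r by bisection on [0, hi]."""
    lo = 0.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if F_radial(mid) < gamma1:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# The recovered radii reproduce the prescribed CDF values.
residuals = [abs(F_radial(invert_F(g)) - g) for g in (0.1, 0.5, 0.9)]
print(max(residuals) < 1e-9)
```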

Manufactured solutions

For the 1D case, we select the following manufactured solution according to the benchmark [43]:

$$\begin{aligned} {\hat{h}}_{MMS}(x)=3+\sin (x), \quad \text {with } x \in [0,25]. \end{aligned}$$

This leads to the following Dirichlet boundary conditions:

$$\begin{aligned} \left\{ \begin{array}{lr} {\hat{h}}(0)=3, &{} \\ {\hat{h}}(25)=3+\sin (25). &{} \end{array} \right. \end{aligned}$$

The function K is now given by

$$\begin{aligned} K(x)=C_1 \exp \bigg (C_2\sum _{i=1}^{N}\cos \big (\xi _1 +2\pi (k_{i,1}x+k_{i,2})\big )\bigg ), \end{aligned}$$

where we use the shorthand notations \(C_1=\langle K\rangle \exp (-\frac{\sigma ^2}{2})\) and \(C_2=\sigma \sqrt{\frac{2}{N}}\). The source term f then takes the form:

$$\begin{aligned} \begin{aligned} f(x)=&C_1 \exp \bigg (C_2\sum _{i=1}^{N}\cos \big (\xi _1 +2\pi (k_{i,1}x+k_{i,2})\big )\bigg )\\&\cdot \bigg (-2\pi C_2\sum _{i=1}^{N}k_{i,1}\sin \big (\xi _1 +2\pi (k_{i,1}x+k_{i,2})\big )\\&\quad \cos (x)-\sin (x)\bigg ). \end{aligned} \end{aligned}$$
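Such a source term can be sanity-checked numerically by comparing it against a finite-difference derivative of the flux \(K(x)\cos (x)\). A minimal sketch with hypothetical parameter values (N, \(\sigma \), \(\langle K\rangle \) and the random samples below are chosen only for illustration):

```python
import numpy as np

# Hypothetical parameters, chosen only for illustration.
rng = np.random.default_rng(2)
N, sigma, K_mean = 4, 0.5, 1.0
C1 = K_mean * np.exp(-sigma**2 / 2.0)
C2 = sigma * np.sqrt(2.0 / N)
xi1 = rng.uniform(0.0, 2.0 * np.pi)
k = rng.standard_normal((N, 2))          # stand-in samples of (k_{i,1}, k_{i,2})

def K(x):
    return C1 * np.exp(C2 * np.cos(xi1 + 2*np.pi*(k[:, 0]*x + k[:, 1])).sum())

def f(x):
    dY = (-2*np.pi * k[:, 0] * np.sin(xi1 + 2*np.pi*(k[:, 0]*x + k[:, 1]))).sum()
    return K(x) * (C2 * dY * np.cos(x) - np.sin(x))

# f must equal d/dx (K(x) * cos(x)), since h_MMS'(x) = cos(x).
x, eps = 7.3, 1e-5
fd = (K(x + eps)*np.cos(x + eps) - K(x - eps)*np.cos(x - eps)) / (2.0 * eps)
print(abs(fd - f(x)) < 1e-6)
```

The same central-difference check carries over to the 2D and 3D source terms below.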

The conductivity field K is generated in Python directly from this randomized spectral representation.
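A minimal sketch of such a generator (assuming Gaussian correlations with unit correlation length; the function name, parameter values, and sampling details below are our own assumptions, not the original listing):

```python
import numpy as np

def generate_K(x, K_mean=1.0, sigma=1.0, N=1000, lam=1.0, seed=0):
    """K(x) = C1 * exp(C2 * sum_i cos(xi_1 + 2*pi*(k_{i,1}*x + k_{i,2})))."""
    rng = np.random.default_rng(seed)
    C1 = K_mean * np.exp(-sigma**2 / 2.0)
    C2 = sigma * np.sqrt(2.0 / N)
    xi1 = rng.uniform(0.0, 2.0 * np.pi)
    # Gaussian-correlation wavenumbers (Appendix B) and uniform shifts.
    k1 = rng.standard_normal(N) / (np.sqrt(2.0) * np.pi * lam)
    k2 = rng.uniform(size=N)
    phase = xi1 + 2.0 * np.pi * (np.outer(x, k1) + k2)
    return C1 * np.exp(C2 * np.cos(phase).sum(axis=1))

x = np.linspace(0.0, 25.0, 501)
K = generate_K(x)
print(K.shape, bool(np.all(K > 0.0)))
```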

For the 2D case, we consider the following smooth manufactured solution:

$$\begin{aligned} {\hat{h}}_{MMS}(x,y)= & {} 1+\sin (2x+y), \nonumber \\&\quad \text {with } x \in [0,20] \text { and } y \in [0,20], \end{aligned}$$

along with the Dirichlet and Neumann boundary conditions:

$$\begin{aligned} \left\{ \begin{array}{lr} {\hat{h}}(0,y)=1+\sin (y), &{} \forall y \in [0,20],\\ {\hat{h}}(20,y)=1+\sin (2\times 20+y), &{} \forall y \in [0,20],\\ \frac{\partial {\hat{h}}}{\partial y}(x,0)=\cos (2x), &{} \forall x \in [0,20],\\ \frac{\partial {\hat{h}}}{\partial y}(x,20)=\cos (2x+20), &{} \forall x \in [0,20].\\ \end{array} \right. \end{aligned}$$

The function K is now given by

$$\begin{aligned} K(x,y)=C_1 \exp \bigg (C_2\sum _{i=1}^{N}\cos \big (\xi _1 +2\pi (k_{i,1}x+k_{i,2}y)\big )\bigg ), \end{aligned}$$

where the shorthand notations \(C_1\) and \(C_2\) are the same as in the one-dimensional case. The source term f then takes the form:

$$\begin{aligned} \begin{aligned} f(x,y)=&2C_1C_2 \exp \bigg (C_2\sum _{i=1}^{N}\cos \big (\xi _1 +2\pi (k_{i,1}x+k_{i,2}y)\big )\bigg )\\&\cdot \sum _{i=1}^{N}\bigg (-2\pi k_{i,1}\sin \big (\xi _1 +2\pi (k_{i,1}x+k_{i,2}y)\big )\bigg )\\&\quad \cos (2x+y)\\&-5C_1\exp \bigg (C_2\sum _{i=1}^{N}\cos \big (\xi _1 +2\pi (k_{i,1}x+k_{i,2}y)\big )\bigg )\\&\quad \sin (2x+y)\\&+ C_1C_2\exp \bigg (C_2\sum _{i=1}^{N}\cos \big (\xi _1 +2\pi (k_{i,1}x+k_{i,2}y)\big )\bigg )\\&\cdot \sum _{i=1}^{N}\bigg (-2\pi k_{i,2}\sin \big (\xi _1 +2\pi (k_{i,1}x+k_{i,2}y)\big )\bigg )\\&\quad \cos (2x+y). \end{aligned} \end{aligned}$$

An alternative manufactured solution is

$$\begin{aligned} {\hat{h}}_{MMS}(x,y)= & {} 1+\sin (2x)+\sin (y), \nonumber \\&\quad \text {with } x \in [0,20] \text { and } y \in [0,20], \end{aligned}$$

along with the Dirichlet and Neumann boundary conditions:

$$\begin{aligned} \left\{ \begin{array}{lr} {\hat{h}}(0,y)=1+\sin (y), &{} \forall y \in [0,20],\\ {\hat{h}}(20,y)=1+\sin (2\times 20)+\sin (y), &{} \forall y \in [0,20],\\ \frac{\partial {\hat{h}}}{\partial y}(x,0)=\cos (0), &{} \forall x \in [0,20],\\ \frac{\partial {\hat{h}}}{\partial y}(x,20)=\cos (20), &{} \forall x \in [0,20].\\ \end{array} \right. \end{aligned}$$

The source term f has the following form:

$$\begin{aligned} \begin{aligned} f(x,y)=&2C_1C_2 \exp \bigg (C_2\sum _{i=1}^{N}\cos \big (\xi _1 +2\pi (k_{i,1}x+k_{i,2}y)\big )\bigg )\\&\cdot \sum _{i=1}^{N}\bigg (-2\pi k_{i,1}\sin \big (\xi _1 +2\pi (k_{i,1}x+k_{i,2}y)\big )\bigg )\cos (2x)\\&-C_1\exp \bigg (C_2\sum _{i=1}^{N}\cos \big (\xi _1 +2\pi (k_{i,1}x+k_{i,2}y)\big )\bigg )\\&\quad \big (4\sin (2x)+\sin (y)\big )\\&+ C_1C_2\exp \bigg (C_2\sum _{i=1}^{N}\cos \big (\xi _1 +2\pi (k_{i,1}x+k_{i,2}y)\big )\bigg )\\&\cdot \sum _{i=1}^{N}\bigg (-2\pi k_{i,2}\sin \big (\xi _1 +2\pi (k_{i,1}x+k_{i,2}y)\big )\bigg )\cos (y). \end{aligned} \end{aligned}$$

For the 3D case, we consider the following smooth manufactured solution:

$$\begin{aligned} {\hat{h}}_{MMS}(x,y,z)= & {} 1+\sin (3x+2y+z), \nonumber \\&\quad \text {with } x \in [0,5],\; y \in [0,2] \text { and } z \in [0,1], \end{aligned}$$

along with the Dirichlet and Neumann boundary conditions:

$$\begin{aligned} \left\{ \begin{array}{lr} {\hat{h}}(0,y,z)=1+\sin (2y+z), &{} \forall y \in [0,2],\forall z \in [0,1],\\ {\hat{h}}(5,y,z)=1+\sin (3\times 5+2y+z), &{} \forall y \in [0,2],\forall z \in [0,1],\\ \frac{\partial {\hat{h}}}{\partial y}(x,0,z)=2\cos (3x+z), &{} \forall x \in [0,5],\forall z \in [0,1],\\ \frac{\partial {\hat{h}}}{\partial y}(x,2,z)=2\cos (3x+2\times 2+z), &{} \forall x \in [0,5],\forall z \in [0,1],\\ \frac{\partial {\hat{h}}}{\partial z}(x,y,0)=\cos (3x+2y), &{} \forall x \in [0,5],\forall y \in [0,2],\\ \frac{\partial {\hat{h}}}{\partial z}(x,y,1)=\cos (3x+2y+1), &{} \forall x \in [0,5],\forall y \in [0,2]. \end{array} \right. \end{aligned}$$

The function K is now given by

$$\begin{aligned} K(x,y,z)=C_1 \exp \bigg (C_2\sum _{i=1}^{N}\cos \big (\xi _1 +2\pi (k_{i,1}x+k_{i,2}y+k_{i,3}z)\big )\bigg ), \end{aligned}$$

where the shorthand notation \(C_2\) is the same as in the one-dimensional case, but now \(C_1=\langle K\rangle \exp (-\frac{\sigma ^2}{6})\). The source term f then takes the form:

$$\begin{aligned} \begin{aligned} f(x,y,z)=&3C_1C_2 \exp \bigg (C_2\sum _{i=1}^{N}\cos \big (\xi _1 +2\pi (k_{i,1}x+k_{i,2}y+k_{i,3}z)\big )\bigg )\\&\cdot \sum _{i=1}^{N}\bigg (-2\pi k_{i,1}\sin \big (\xi _1 +2\pi (k_{i,1}x+k_{i,2}y+k_{i,3}z)\big )\bigg )\\&\quad \cos (3x+2y+z)\\&+2C_1C_2 \exp \bigg (C_2\sum _{i=1}^{N}\cos \big (\xi _1 +2\pi (k_{i,1}x+k_{i,2}y+k_{i,3}z)\big )\bigg )\\&\cdot \sum _{i=1}^{N}\bigg (-2\pi k_{i,2}\sin \big (\xi _1 +2\pi (k_{i,1}x+k_{i,2}y+k_{i,3}z)\big )\bigg )\\&\quad \cos (3x+2y+z)\\&+C_1C_2 \exp \bigg (C_2\sum _{i=1}^{N}\cos \big (\xi _1 +2\pi (k_{i,1}x+k_{i,2}y+k_{i,3}z)\big )\bigg )\\&\cdot \sum _{i=1}^{N}\bigg (-2\pi k_{i,3}\sin \big (\xi _1 +2\pi (k_{i,1}x+k_{i,2}y+k_{i,3}z)\big )\bigg )\\&\quad \cos (3x+2y+z)\\&-14C_1 \exp \bigg (C_2\sum _{i=1}^{N}\cos \big (\xi _1 +2\pi (k_{i,1}x+k_{i,2}y+k_{i,3}z)\big )\bigg )\\&\cdot \sin (3x+2y+z). \end{aligned} \end{aligned}$$

An alternative manufactured solution is

$$\begin{aligned} {\hat{h}}_{MMS}(x,y,z)= & {} 5+\sin (3x)+\sin (2y)+\sin (z), \nonumber \\&\quad \text {with } x \in [0,5],\; y \in [0,2] \text { and } z \in [0,1], \end{aligned}$$

along with the Dirichlet and Neumann boundary conditions:

$$\begin{aligned} \left\{ \begin{array}{lr} {\hat{h}}(0,y,z)=5+\sin (2y)+\sin (z), &{} \forall y \in [0,2],\forall z \in [0,1],\\ {\hat{h}}(5,y,z)=5+\sin (3\times 5)+\sin (2y)+\sin (z), &{} \forall y \in [0,2],\forall z \in [0,1],\\ \frac{\partial {\hat{h}}}{\partial y}(x,0,z)=2\cos (0), &{} \forall x \in [0,5],\forall z \in [0,1],\\ \frac{\partial {\hat{h}}}{\partial y}(x,2,z)=2\cos (2\times 2), &{} \forall x \in [0,5],\forall z \in [0,1],\\ \frac{\partial {\hat{h}}}{\partial z}(x,y,0)=\cos (0), &{} \forall x \in [0,5],\forall y \in [0,2],\\ \frac{\partial {\hat{h}}}{\partial z}(x,y,1)=\cos (1), &{} \forall x \in [0,5],\forall y \in [0,2]. \end{array} \right. \end{aligned}$$

The source term f then takes the form:

$$\begin{aligned} \begin{aligned} f(x,y,z)=&3C_1C_2 \exp \bigg (C_2\sum _{i=1}^{N}\cos \big (\xi _1 +2\pi (k_{i,1}x+k_{i,2}y+k_{i,3}z)\big )\bigg )\\&\cdot \sum _{i=1}^{N}\bigg (-2\pi k_{i,1}\sin \big (\xi _1 +2\pi (k_{i,1}x+k_{i,2}y+k_{i,3}z)\big )\bigg )\cos (3x)\\&+2C_1C_2 \exp \bigg (C_2\sum _{i=1}^{N}\cos \big (\xi _1 +2\pi (k_{i,1}x+k_{i,2}y+k_{i,3}z)\big )\bigg )\\&\cdot \sum _{i=1}^{N}\bigg (-2\pi k_{i,2}\sin \big (\xi _1 +2\pi (k_{i,1}x+k_{i,2}y+k_{i,3}z)\big )\bigg )\cos (2y)\\&+C_1C_2 \exp \bigg (C_2\sum _{i=1}^{N}\cos \big (\xi _1 +2\pi (k_{i,1}x+k_{i,2}y+k_{i,3}z)\big )\bigg )\\&\cdot \sum _{i=1}^{N}\bigg (-2\pi k_{i,3}\sin \big (\xi _1 +2\pi (k_{i,1}x+k_{i,2}y+k_{i,3}z)\big )\bigg )\cos (z)\\&-C_1 \exp \bigg (C_2\sum _{i=1}^{N}\cos \big (\xi _1 +2\pi (k_{i,1}x+k_{i,2}y+k_{i,3}z)\big )\bigg )\\&\cdot \big (9\sin (3x)+4\sin (2y)+\sin (z)\big ). \end{aligned} \end{aligned}$$

Approximation proof

Here we give a proof of the convergence of the physics-informed neural network approximation of the hydraulic head for the proposed model. First, we assume that the partial differential equation has a unique solution \({\hat{h}} \in C^2(\Omega )\) with uniformly bounded derivatives, and that the heterogeneous hydraulic conductivity function \(K({\varvec{{x}}})\) is \(C^{1,1}\) (\(C^1\) with Lipschitz continuous derivative). The smoothness of the K field is essentially determined by the correlation of the random field \(Y'\). According to [72], the smoothness conditions are fulfilled if the correlation of \(Y'\) has a Gaussian shape and is infinitely differentiable. The smoothness of the source term is determined by the constructed manufactured solution \({\hat{h}}_{MMS}\) in Equations (22) and (24), which is obviously continuous and infinitely differentiable, so that \(f\in C^\infty (\Omega )\).

Theorem 2

Assume that \(\Omega \) is compact and consider measures \(\ell _1\), \(\ell _2\), and \(\ell _3\) whose supports are contained in \(\Omega \), \(\Gamma _D\), and \(\Gamma _N\), respectively. Assume further that the governing Equation (19) subject to boundary conditions (21) has a unique classical solution and that the conductivity function \(K({\varvec{{x}}})\) is \(C^{1,1}\) (\(C^1\) with Lipschitz continuous derivative). Then \(\forall \; \varepsilon >0\), \(\exists \; \uplambda >0\), which may depend on \(\sup _{\Omega }\left\| {\hat{h}}_{,ii}\right\| \) and \(\sup _{\Omega }\left\| {\hat{h}}_{,i}\right\| \), such that \(\exists \; {\hat{h}}^h\in F^n\) satisfying \(L(\theta )\le \uplambda \varepsilon \).


Proof For the governing Equation (19) subject to boundary conditions (21), according to Theorem 1, \(\forall \; \varepsilon >0\), \(\exists \; {\hat{h}}^h \in F^n\), s.t.

$$\begin{aligned} \sup _{x\in \Omega }\left\| {\hat{h}}_{,i}\left( {\varvec{{x}}}\,_\Omega \right) - {\hat{h}}^h_{,i}\left( {\varvec{{x}}}\,_\Omega \right) \right\| ^2+\sup _{x\in \Omega }\left\| {\hat{h}}_{,ii}\left( {\varvec{{x}}}\,_\Omega \right) - {\hat{h}}^h_{,ii}\left( {\varvec{{x}}}\,_\Omega \right) \right\| ^2<\varepsilon . \end{aligned}$$

Recalling that the loss is constructed in the form shown in Equation (34), applying the triangle inequality to \(MSE_G\) yields:

$$\begin{aligned} \begin{aligned} \begin{Vmatrix} G\left( {\varvec{{x}}}\,_\Omega ;\theta \right) \end{Vmatrix}^2\leqslant \begin{Vmatrix} K({\varvec{x}}_\Omega ){\hat{h}}_{,ii}^{h}\left( {\varvec{{x}}}\,_\Omega ;\theta \right) \end{Vmatrix}^2\\+\begin{Vmatrix} K_{,i}({\varvec{x}}_\Omega ){\hat{h}}^{h}_{,i}\left( {\varvec{{x}}}\,_\Omega ;\theta \right) \end{Vmatrix}^2+\begin{Vmatrix} f\left( {\varvec{{x}}}\,_\Omega \right) \end{Vmatrix}^2. \end{aligned} \end{aligned}$$

Also, since the conductivity function \(K({\varvec{{x}}})\) is \(C^{1,1}\), \(\exists \; M_1>0,\; M_2>0\) such that \(\forall x \in \Omega \), \(\left\| K({\varvec{{x}}})\right\| \leqslant M_1\) and \(\left\| K_{,i}({\varvec{{x}}})\right\| \leqslant M_2\). From Equation (D33), it follows that

$$\begin{aligned} \begin{aligned} \int _{\Omega }K_{,i}^2({\varvec{x}}_\Omega )\left( {\hat{h}}_{,i}^h-{\hat{h}}_{,i} \right) ^2d\ell _1\leqslant M_2^2 \varepsilon ^2\ell _1(\Omega ) \\ \int _{\Omega }K^2({\varvec{x}}_\Omega )\left( {\hat{h}}_{,ii}^h-{\hat{h}}_{,ii} \right) ^2d\ell _1\leqslant M_1^2 \varepsilon ^2\ell _1(\Omega ). \end{aligned} \end{aligned}$$

On boundaries \(\Gamma _{D}\) and \(\Gamma _{N}\),

$$\begin{aligned} \begin{aligned}&\int _{\Gamma _{D}}\left( {\hat{h}}^h \left( {\varvec{{x}}}\,_{\Gamma _D};\theta \right) -{\hat{h}}\left( {\varvec{{x}}}\,_{\Gamma _D};\theta \right) \right) ^2d\ell _2\leqslant \varepsilon ^2\ell _2(\Gamma _{D})\\&\int _{\Gamma _{N}}K^2({\varvec{x}}_{\Gamma _N})\left( {\hat{h}}^h_{,n} \left( {\varvec{{x}}}\,_{\Gamma _N};\theta \right) -{\hat{h}}_{,n}\left( {\varvec{{x}}}\,_{\Gamma _N};\theta \right) \right) ^2\\&\quad d\ell _3\leqslant M_1^2\varepsilon ^2\ell _3(\Gamma _{N}). \end{aligned} \end{aligned}$$

Therefore, using Equations (D35) and (D36), as \(n\rightarrow \infty \)

$$\begin{aligned} \begin{aligned} L\left( \theta \right)&=\frac{1}{N_\Omega }\sum _{i=1}^{N_\Omega }\begin{Vmatrix} K({\varvec{x}}_\Omega ){\hat{h}}_{,ii}^{h}\left( {\varvec{{x}}}\,_\Omega ;\theta \right) +K_{,i}({\varvec{x}}_\Omega ){\hat{h}}^{h}_{,i}\left( {\varvec{{x}}}\,_\Omega ;\theta \right) -f\left( {\varvec{{x}}}\,_\Omega \right) \end{Vmatrix}^2+\\&\frac{1}{N_{\Gamma _D}}\sum _{i=1}^{N_{\Gamma _D}}\begin{Vmatrix} {\hat{h}}^h \left( {\varvec{{x}}}\,_{\Gamma _D};\theta \right) -{\hat{h}}_{MMS}\left( {\varvec{{x}}}\,_{\Gamma _D}\right) \end{Vmatrix}^2+\\&\frac{1}{N_{\Gamma _N}}\sum _{i=1}^{N_{\Gamma _N}}\begin{Vmatrix} -K({\varvec{x}}_{\Gamma _N})\frac{\partial {\hat{h}}^h\left( {\varvec{{x}}}_{\Gamma _N};\theta \right) }{\partial n}+K({\varvec{x}}_{\Gamma _N})\frac{\partial {\hat{h}}_{MMS}}{\partial n} \end{Vmatrix}^2 \\&\leqslant \frac{1}{N_\Omega }\sum _{i=1}^{N_\Omega }\begin{Vmatrix} K({\varvec{x}}_\Omega ){\hat{h}}_{,ii}^{h}\left( {\varvec{{x}}}\,_\Omega ;\theta \right) \end{Vmatrix}^2\\&\quad +\frac{1}{N_\Omega }\sum _{i=1}^{N_\Omega }\begin{Vmatrix} K_{,i}({\varvec{x}}_\Omega ){\hat{h}}^{h}_{,i}\left( {\varvec{{x}}}\,_\Omega ;\theta \right) \end{Vmatrix}^2\\&+\frac{1}{N_\Omega }\sum _{i=1}^{N_\Omega }\begin{Vmatrix} f\left( {\varvec{{x}}}\,_\Omega \right) \end{Vmatrix}^2\\&\quad +\frac{1}{N_{\Gamma _D}}\sum _{i=1}^{N_{\Gamma _D}}\begin{Vmatrix} {\hat{h}}^h \left( {\varvec{{x}}}\,_{\Gamma _D};\theta \right) -{\hat{h}}_{MMS}\left( {\varvec{{x}}}\,_{\Gamma _D}\right) \end{Vmatrix}^2+\\&\frac{1}{N_{\Gamma _N}}\sum _{i=1}^{N_{\Gamma _N}}\begin{Vmatrix} -K({\varvec{x}}_{\Gamma _N})\frac{\partial {\hat{h}}^h\left( {\varvec{{x}}}_{\Gamma _N};\theta \right) }{\partial n}+K({\varvec{x}}_{\Gamma _N})\frac{\partial {\hat{h}}_{MMS}}{\partial n} \end{Vmatrix}^2 \\&\leqslant (M_2^2+M_1^2+1)\varepsilon ^2\ell _1(\Omega )+\varepsilon ^2\ell _2(\Gamma _{D})+M_1^2\varepsilon ^2\ell _3(\Gamma _{N})=\uplambda \varepsilon ^2, \end{aligned} \end{aligned}$$

with \(\uplambda =(M_2^2+M_1^2+1)\ell _1(\Omega )+\ell _2(\Gamma _D)+M_1^2\ell _3(\Gamma _N)\), so that \(L(\theta )\leqslant \uplambda \varepsilon \) for \(\varepsilon \leqslant 1\).

\(\square \)

With Theorem 2 established, and under the conditions that \(\Omega \) is a bounded open subset of \(R\) and, \(\forall n\in N_+\), \({\hat{h}}^h\in F^n \subset L^2(\Omega )\), it can be concluded from Sirignano et al. [56] that

Theorem 3

\(\forall \;p<2\), \({\hat{h}}^h\in \;F^n\) converges to \({\hat{h}}\) strongly in \(L^p(\Omega )\) as \(n\rightarrow \infty \), with \({\hat{h}}\) being the unique solution to the potential problem.

In summary, for feedforward neural networks \(F^n \subset L^p(\Omega )\) (\(p<2\)), the approximate solution \({\hat{h}}^h\in F^n\) converges to the solution of this PDE.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit


About this article


Cite this article

Guo, H., Zhuang, X., Chen, P. et al. Stochastic deep collocation method based on neural architecture search and transfer learning for heterogeneous porous media. Engineering with Computers (2022).


  • Deep learning
  • Neural architecture search
  • Error estimation
  • Randomized spectral representation
  • Method of manufactured solutions
  • Log-normally distributed
  • Physics-informed
  • Sensitivity analysis
  • Hyper-parameter optimization algorithms
  • Transfer learning