
NeuFENet: neural finite element solutions with theoretical bounds for parametric PDEs

Original Article, published in Engineering with Computers

Abstract

We present a mesh-based approach for training a neural network to produce field predictions of solutions to parametric partial differential equations (PDEs). This contrasts with current “neural PDE solver” approaches, which employ collocation to make pointwise predictions of the solution. The mesh-based approach naturally enforces a variety of boundary conditions and makes it easy to invoke well-developed PDE theory, including analyses of numerical stability and convergence, to obtain capacity bounds for the proposed neural networks on discretized domains. We explore this strategy, called NeuFENet, using a weighted Galerkin loss function based on the finite element method (FEM) for a parametric elliptic PDE. The weighted Galerkin (FEM) loss is akin to an energy functional; it produces improved solutions, satisfies a priori mesh convergence, and can model both Dirichlet and Neumann boundary conditions. We prove theoretically, and illustrate with experiments, convergence results analogous to the mesh convergence analysis deployed for finite element solutions of PDEs. These results suggest that a mesh-based neural network is a promising approach for solving parametric PDEs with theoretical bounds.
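To make the contrast with collocation concrete, the following is a minimal sketch, not the authors' NeuFENet implementation, of the core idea: a network outputs nodal values on a mesh, and training minimizes an energy-functional (Ritz/Galerkin-style) loss rather than pointwise residuals. The 1D Poisson problem, network architecture, and optimizer settings are illustrative assumptions.

```python
# Minimal sketch (assumed problem and sizes): train a network whose output is a
# field of nodal values on a 1D mesh by minimizing a Ritz/Galerkin energy loss
# for -u'' = f on (0,1) with u(0) = u(1) = 0, using linear elements and
# midpoint quadrature. Dirichlet BCs are enforced exactly by construction.
import math
import torch

n = 64                                   # number of elements
h = 1.0 / n
x_mid = (torch.arange(n) + 0.5) * h      # element midpoints (quadrature points)
f = torch.sin(math.pi * x_mid)           # source term; exact solution is sin(pi x)/pi^2

net = torch.nn.Sequential(               # maps a (dummy) PDE parameter to nodal values
    torch.nn.Linear(1, 64), torch.nn.Tanh(), torch.nn.Linear(64, n - 1)
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
z = torch.zeros(1)                       # placeholder parametric input

for step in range(2000):
    u_int = net(z)                                           # interior nodal values
    u = torch.cat([torch.zeros(1), u_int, torch.zeros(1)])   # boundary nodes pinned to 0
    du = (u[1:] - u[:-1]) / h                                # elementwise du/dx
    u_mid = 0.5 * (u[1:] + u[:-1])                           # u at element midpoints
    loss = torch.sum(0.5 * du**2 * h) - torch.sum(f * u_mid * h)  # Ritz energy
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Minimizing this energy over the mesh function space reproduces the finite element solution, which is what lets standard FEM convergence theory carry over to the trained network.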


Data availability

The data and code associated with this publication will be available on GitHub.

Notes

  1. In contrast, state-of-the-art neural methods allow us to use basis functions beyond polynomials or Fourier bases and to approximate much more complicated mappings. Although such methods can be analyzed theoretically, the resulting estimates are often impractical [25, 26, 27]. This is a very active area of research, and we expect tighter estimates in the future.

  2. While a probability-based definition of \(\omega\) is not needed for defining a parametric PDE, we choose this definition for two reasons. First, such a formulation allows easy extension to the stochastic PDE case. Second, it allows the use of expectation-based arguments in the analysis of convergence.

References

  1. Raissi M, Perdikaris P, Karniadakis GE (2019) Physics-informed neural networks: a deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J Comput Phys 378:686–707

  2. Rudy S, Alla A, Brunton SL, Kutz JN (2019) Data-driven identification of parametric partial differential equations. SIAM J Appl Dyn Syst 18(2):643–660

  3. Tompson J, Schlachter K, Sprechmann P, Perlin K (2017) Accelerating Eulerian fluid simulation with convolutional networks. In: International Conference on Machine Learning, pp 3424–3433. PMLR

  4. Raissi M, Karniadakis GE (2018) Hidden physics models: machine learning of nonlinear partial differential equations. J Comput Phys 357:125–141

  5. Lu L, Jin P, Karniadakis GE (2019) DeepONet: learning nonlinear operators for identifying differential equations based on the universal approximation theorem of operators. arXiv preprint arXiv:1910.03193

  6. Kharazmi E, Zhang Z, Karniadakis GE (2019) Variational physics-informed neural networks for solving partial differential equations. arXiv preprint arXiv:1912.00873

  7. Sirignano J, Spiliopoulos K (2018) DGM: a deep learning algorithm for solving partial differential equations. J Comput Phys 375:1339–1364

  8. Yang L, Zhang D, Karniadakis GE (2018) Physics-informed generative adversarial networks for stochastic differential equations. arXiv preprint arXiv:1811.02033

  9. Pang G, Lu L, Karniadakis GE (2019) fPINNs: fractional physics-informed neural networks. SIAM J Sci Comput 41(4):A2603–A2626

  10. Karumuri S, Tripathy R, Bilionis I, Panchal J (2020) Simulator-free solution of high-dimensional stochastic elliptic partial differential equations using deep neural networks. J Comput Phys 404:109120

  11. Han J, Jentzen A, E W (2018) Solving high-dimensional partial differential equations using deep learning. Proc Natl Acad Sci 115(34):8505–8510

  12. Michoski C, Milosavljevic M, Oliver T, Hatch D (2019) Solving irregular and data-enriched differential equations using deep neural networks. arXiv preprint arXiv:1905.04351

  13. Samaniego E, Anitescu C, Goswami S, Nguyen-Thanh VM, Guo H, Hamdia K, Zhuang X, Rabczuk T (2020) An energy approach to the solution of partial differential equations in computational mechanics via machine learning: concepts, implementation and applications. Comput Methods Appl Mech Eng 362:112790

  14. Ramabathiran AA, Ramachandran P (2021) SPINN: sparse, physics-based, and partially interpretable neural networks for PDEs. J Comput Phys 445:110600

  15. Botelho S, Joshi A, Khara B, Sarkar S, Hegde C, Adavani S, Ganapathysubramanian B (2020) Deep generative models that solve PDEs: distributed computing for training large data-free models. arXiv preprint arXiv:2007.12792

  16. Mitusch SK, Funke SW, Kuchta M (2021) Hybrid FEM-NN models: combining artificial neural networks with the finite element method. J Comput Phys 446:110651

  17. Jokar M, Semperlotti F (2021) Finite element network analysis: a machine learning based computational framework for the simulation of physical systems. Comput Struct 247:106484

  18. Sitzmann V, Martel JNP, Bergman AW, Lindell DB, Wetzstein G (2020) Implicit neural representations with periodic activation functions. Adv Neural Inf Process Syst 33:7462–7473

  19. Mishra S, Rusch TK (2021) Enhancing accuracy of deep learning algorithms by training with low-discrepancy sequences. SIAM J Numer Anal 59(3):1811–1834

  20. Zhu Y, Zabaras N, Koutsourelakis P-S, Perdikaris P (2019) Physics-constrained deep learning for high-dimensional surrogate modeling and uncertainty quantification without labeled data. J Comput Phys 394:56–81

  21. Wen G, Li Z, Azizzadenesheli K, Anandkumar A, Benson SM (2021) U-FNO: an enhanced Fourier neural operator-based deep learning model for multiphase flow. arXiv preprint arXiv:2109.03697

  22. Ranade R, Hill C, Pathak J (2021) DiscretizationNet: a machine-learning based solver for Navier-Stokes equations using finite volume discretization. Comput Methods Appl Mech Eng 378:113722

  23. Brenner S, Scott R (2007) The mathematical theory of finite element methods, vol 15. Springer Science & Business Media

  24. Larson MG, Bengzon F (2013) The finite element method: theory, implementation, and applications, vol 10. Springer Science & Business Media

  25. Shin Y, Zhang Z, Karniadakis GE (2020) Error estimates of residual minimization using neural networks for linear PDEs. arXiv preprint arXiv:2010.08019

  26. Mishra S, Molinaro R (2020) Estimates on the generalization error of physics informed neural networks (PINNs) for approximating PDEs. arXiv preprint arXiv:2006.16144

  27. Jiao Y, Lai Y, Luo Y, Wang Y, Yang Y (2021) Error analysis of deep Ritz methods for elliptic equations. arXiv preprint arXiv:2107.14478

  28. He J, Li L, Xu J, Zheng C (2018) ReLU deep neural networks and linear finite elements. arXiv preprint arXiv:1807.03973

  29. Takeuchi J, Kosugi Y (1994) Neural network representation of finite element method. Neural Netw 7(2):389–395

  30. Xu G, Littlefair G, Penson R, Callan R (1999) Application of FE-based neural networks to dynamic problems. In: ICONIP'99 (6th International Conference on Neural Information Processing), vol 3, pp 1039–1044. IEEE

  31. Ramuhalli P, Udpa L, Udpa SS (2005) Finite-element neural networks for solving differential equations. IEEE Trans Neural Netw 16(6):1381–1392

  32. Chao X, Wang C, Ji F, Yuan X (2012) Finite-element neural network-based solving 3-D differential equations in MFL. IEEE Trans Magn 48(12):4747–4756

  33. Khodayi-Mehr R, Zavlanos M (2020) VarNet: variational neural networks for the solution of partial differential equations. In: Learning for Dynamics and Control, pp 298–307. PMLR

  34. Gao H, Zahr MJ, Wang J-X (2022) Physics-informed graph neural Galerkin networks: a unified framework for solving PDE-governed forward and inverse problems. Comput Methods Appl Mech Eng 390:114502

  35. Yao H, Gao Y, Liu Y (2020) FEA-Net: a physics-guided data-driven model for efficient mechanical response prediction. Comput Methods Appl Mech Eng 363:112892

  36. Evans LC (1998) Partial differential equations. Graduate Studies in Mathematics, vol 19. American Mathematical Society

  37. Reddy JN (2010) An introduction to the finite element method, vol 1221. McGraw-Hill, New York

  38. E W, Yu B (2017) The deep Ritz method: a deep learning-based numerical algorithm for solving variational problems. arXiv preprint arXiv:1710.00211

  39. Liao Y, Ming P (2021) Deep Nitsche method: deep Ritz method with essential boundary conditions. Commun Comput Phys 29(5):1365–1384

  40. Courte L, Zeinhofer M (2021) Robin pre-training for the deep Ritz method. arXiv preprint arXiv:2106.06219

  41. Müller J, Zeinhofer M (2022) Error estimates for the deep Ritz method with boundary penalty. In: Mathematical and Scientific Machine Learning, pp 215–230. PMLR

  42. Müller J, Zeinhofer M (2022) Notes on exact boundary values in residual minimisation. In: Mathematical and Scientific Machine Learning, pp 231–240. PMLR

  43. Dondl P, Müller J, Zeinhofer M (2022) Uniform convergence guarantees for the deep Ritz method for nonlinear problems. Adv Contin Discrete Models 2022(1):1–19

  44. Lee H, Kang IS (1990) Neural algorithm for solving differential equations. J Comput Phys 91(1):110–131

  45. Lagaris IE, Likas A, Fotiadis DI (1998) Artificial neural networks for solving ordinary and partial differential equations. IEEE Trans Neural Netw 9(5):987–1000

  46. Malek A, Shekari Beidokhti R (2006) Numerical solution for high order differential equations using a hybrid neural network-optimization method. Appl Math Comput 183(1):260–271

  47. Sukumar N, Srivastava A (2021) Exact imposition of boundary conditions with distance functions in physics-informed deep neural networks. arXiv preprint arXiv:2104.08426

  48. Lagaris IE, Likas AC, Papageorgiou DG (2000) Neural-network methods for boundary value problems with irregular boundaries. IEEE Trans Neural Netw 11(5):1041–1049

  49. van der Meer R, Oosterlee CW, Borovykh A (2020) Optimally weighted loss functions for solving PDEs with neural networks. arXiv preprint arXiv:2002.06269

  50. Wang S, Teng Y, Perdikaris P (2020) Understanding and mitigating gradient pathologies in physics-informed neural networks. arXiv preprint arXiv:2001.04536

  51. Hennigh O, Narasimhan S, Nabian MA, Subramaniam A, Tangsali K, Fang Z, Rietmann M, Byeon W, Choudhry S (2021) NVIDIA SimNet: an AI-accelerated multi-physics simulation framework. In: International Conference on Computational Science, pp 447–461. Springer

  52. Wang S, Perdikaris P (2021) Long-time integration of parametric evolution equations with physics-informed DeepONets. arXiv preprint arXiv:2106.05384

  53. Paganini M, de Oliveira L, Nachman B (2018) CaloGAN: simulating 3D high energy particle showers in multilayer electromagnetic calorimeters with generative adversarial networks. Phys Rev D 97(1):014021

  54. Krishnapriyan AS, Gholami A, Zhe S, Kirby RM, Mahoney MW (2021) Characterizing possible failure modes in physics-informed neural networks. arXiv preprint arXiv:2109.01050

  55. Wang S, Teng Y, Perdikaris P (2021) Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM J Sci Comput 43(5):A3055–A3081

  56. Fox C (1987) An introduction to the calculus of variations. Courier Corporation

  57. Kingma DP, Ba J (2014) Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980

  58. Ronneberger O, Fischer P, Brox T (2015) U-Net: convolutional networks for biomedical image segmentation. In: International Conference on Medical Image Computing and Computer-Assisted Intervention, pp 234–241. Springer

  59. Çiçek Ö, Abdulkadir A, Lienkamp SS, Brox T, Ronneberger O (2016) 3D U-Net: learning dense volumetric segmentation from sparse annotation. In: International Conference on Medical Image Computing and Computer-Assisted Intervention, pp 424–432. Springer

  60. Paszke A, Gross S, Massa F, Lerer A, Bradbury J, Chanan G, Killeen T, Lin Z, Gimelshein N, Antiga L et al (2019) PyTorch: an imperative style, high-performance deep learning library. Adv Neural Inf Process Syst 32:8026–8037

  61. Bubeck S, Eldan R, Lee YT, Mikulincer D (2020) Network size and weights size for memorization with two-layers neural networks. arXiv preprint arXiv:2006.02855

  62. LeCun Y, Bottou L, Bengio Y, Haffner P (1998) Gradient-based learning applied to document recognition. Proc IEEE 86(11):2278–2324

  63. Oden JT, Reddy JN (2012) An introduction to the mathematical theory of finite elements. Courier Corporation

  64. Barron AR (1993) Universal approximation bounds for superpositions of a sigmoidal function. IEEE Trans Inf Theory 39(3):930–945

  65. Balu A, Botelho S, Khara B, Rao V, Hegde C, Sarkar S, Adavani S, Krishnamurthy A, Ganapathysubramanian B (2021) Distributed multigrid neural solvers on megavoxel domains. arXiv preprint arXiv:2104.14538

  66. Lu L, Meng X, Mao Z, Karniadakis GE (2021) DeepXDE: a deep learning library for solving differential equations. SIAM Rev 63(1):208–228

  67. Hughes TJR, Scovazzi G, Franca LP (2017) Multiscale and stabilized methods. In: Stein E, de Borst R, Hughes TJR (eds) Encyclopedia of Computational Mechanics, 2nd edn, vol 5, chap 2, pp 1–64. John Wiley & Sons. https://doi.org/10.1002/9781119176817.ecm2051

  68. Ghanem RG, Spanos PD (2003) Stochastic finite elements: a spectral approach. Courier Corporation


Acknowledgements

This work was supported in part by the National Science Foundation under Grant Nos. CCF-2005804, LEAP-HI-2053760, OAC-1750865, CPS-FRONTIER-1954556, and USDA-NIFA-2021-67021-35329.

Author information

Corresponding authors

Correspondence to Adarsh Krishnamurthy or Baskar Ganapathysubramanian.

Ethics declarations

Conflict of interest

The authors have no conflicts of interest to declare that are relevant to the content of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix 1: Representation of random diffusivity

With \(\omega\) taken from the sample space \(\Omega\), the diffusivity/permeability \(\nu\) can be written as an exponential of a random quantity Z

$$\begin{aligned} \nu = \exp \left( Z({\textbf{x}}; \omega ) \right) . \end{aligned}$$
(46)

We assume that Z is square integrable, i.e., \(\mathbb {E}\left[ |Z({\textbf{x}}; \omega )|^2 \right] < \infty\). Then, we can write Z using the Karhunen–Loève expansion [68], as

$$\begin{aligned} Z({\textbf{x}}; \omega ) = {\bar{Z}}({\textbf{x}}) + \sum _{i = 1}^{\infty } \sqrt{\lambda _i}\phi _i({\textbf{x}})\psi _i(\omega ), \end{aligned}$$
(47)

where \({\bar{Z}}({\textbf{x}}) = \mathbb {E}(Z({\textbf{x}}, \omega ))\), and \(\psi _i(\omega )\) are independent random variables with zero mean and unit variance. \(\lambda _i\) and \(\phi _i({\textbf{x}})\) are the eigenvalues and eigenfunctions corresponding to the Fredholm equation

$$\begin{aligned} \int _D C_Z({\textbf{s}},{\textbf{t}})\phi ({\textbf{s}})\,\textrm{d}{\textbf{s}} = \lambda \phi ({\textbf{t}}), \end{aligned}$$
(48)

where \(C_Z({\textbf{s}},{\textbf{t}})\) is the covariance kernel given by

$$\begin{aligned} C_Z({\textbf{s}}, {\textbf{t}}) = \sigma _Z^2 \exp \left( -\left[ \frac{|s_1-t_1|}{\eta _1}+\frac{|s_2-t_2|}{\eta _2}+\frac{|s_3-t_3|}{\eta _3} \right] \right) , \end{aligned}$$
(49)

where \(\eta _i\) is the correlation length in the \(x_i\) coordinate. This particular form of the covariance kernel is separable in the three coordinates, so the eigenvalues and eigenfunctions of the multi-dimensional case can be obtained by combining those of the one-dimensional covariance kernel

$$\begin{aligned} C_Z(s,t) = \sigma _Z^2\exp \left( -\frac{|s-t|}{\eta }\right) , \end{aligned}$$
(50)

where \(\sigma _Z^2\) is the variance and \(\eta\) is the correlation length in one dimension.

Equation 46 can then be written as

$$\begin{aligned} {\tilde{\nu }}({\textbf{x}}; \omega ) = \exp \left( \sum _{i = 1}^{m} a_i \sqrt{\lambda _{xi}\lambda _{yi}}\, \phi _i(x)\psi _i(y) \right) , \end{aligned}$$
(51)

where \({\textbf{a}} = (a_1, \ldots , a_m)\) is an m-dimensional parameter vector; \(\lambda _x\) and \(\lambda _y\) are vectors of real numbers arranged in monotonically decreasing order; and \(\phi\) and \(\psi\) are functions of x and y, respectively. \(\lambda _{xi}\) is calculated as

$$\begin{aligned} \lambda _{xi} = \frac{2\eta \sigma _x}{(1+\eta ^2 \omega _x^2)}, \end{aligned}$$
(52)

where \(\omega _x\) is the solution to the system of transcendental equations obtained after differentiating Eq. 48 with respect to \({\textbf{t}}\). \(\lambda _{yi}\) are calculated similarly. \(\phi _i(x)\) are given by

$$\begin{aligned} \phi _i(x) = \frac{a_i}{2}\cos (a_i x) + \sin (a_i x), \end{aligned}$$
(53)

and \(\psi _i(y)\) are calculated similarly. We take \(m = 6\) and assume that each \(a_i\) is uniformly distributed in \([-\sqrt{3},\sqrt{3}]\), so that \({\textbf{a}} \in [-\sqrt{3},\sqrt{3}]^6\). The input diffusivity \(\nu\) in all the examples in Sects. 6.3 and 6.4 is calculated by choosing the 6-dimensional coefficient vector \({\textbf{a}}\) from \([-\sqrt{3},\sqrt{3}]^6\).
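The sampling procedure above can be sketched in a few lines; this is a minimal illustration, not the authors' code. Rather than solving the transcendental equations for \(\omega _x\) analytically, it approximates the 1D KL eigenpairs with a Nyström (quadrature) discretization of the Fredholm equation (48); the grid size, \(\sigma _Z\), \(\eta\), and the use of the same kernel in both coordinates are assumptions.

```python
# Minimal sketch (assumed parameters): sample the diffusivity
#   nu(x, y) = exp( sum_i a_i * sqrt(lambda_xi * lambda_yi) * phi_i(x) * psi_i(y) )
# on the unit square, with the 1D KL eigenpairs of C(s,t) = sigma^2 exp(-|s-t|/eta)
# obtained numerically via a midpoint-rule Nystrom discretization of Eq. (48).
import numpy as np

def kl_eigenpairs_1d(n_pts=64, eta=0.5, sigma_z=1.0, m=6):
    x = (np.arange(n_pts) + 0.5) / n_pts                  # quadrature nodes on (0,1)
    w = 1.0 / n_pts                                       # uniform quadrature weight
    C = sigma_z**2 * np.exp(-np.abs(x[:, None] - x[None, :]) / eta)
    evals, evecs = np.linalg.eigh(C * w)                  # discrete Fredholm eigenproblem
    idx = np.argsort(evals)[::-1][:m]                     # keep the m largest eigenvalues
    lam = evals[idx]
    phi = evecs[:, idx] / np.sqrt(w)                      # normalize so int_0^1 phi_i^2 = 1
    return x, lam, phi

x, lam, phi = kl_eigenpairs_1d(m=6)
a = np.random.uniform(-np.sqrt(3), np.sqrt(3), size=6)    # a in [-sqrt(3), sqrt(3)]^6
# lambda_x = lambda_y and phi = psi here because both coordinates share one kernel
Z = sum(a[i] * np.sqrt(lam[i] * lam[i]) * np.outer(phi[:, i], phi[:, i])
        for i in range(6))
nu = np.exp(Z)                                            # one realization of nu(x, y)
```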

Appendix 2: Further discussion on convergence studies

1.1 Discussion on the role of keeping \(e_{\mathcal {H}}\) and \(e_{\theta }\) low

If we choose a fixed network architecture and use it to solve Eq. 35 across different h-levels, then the errors do not necessarily decrease with decreasing h. As shown in Fig. 13, the errors actually increase once \(h<2^{-5}\). The reason for this behavior is that, as h decreases, the number of discrete unknowns in the mesh (i.e., the \(U_i\)'s in Eq. 14) increases; in fact, the number of basis functions/unknowns N is exactly equal to \(\frac{1}{h^2}\). As h decreases, the size of the space \(V^h\) therefore grows. Since the network remains the same, the discrete function space \(V^h\) does not remain a subspace of \(V^h_\theta\). The network function class also needs to grow to accommodate all the possible functions at the lower values of h, as the calculation below illustrates. Figure 13 also shows the errors obtained when the network is indeed enlarged so that \(V^h_\theta \supset V^h\) (these are the same errors plotted in Fig. 7).
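The growth of N is easy to tabulate (a minimal sketch; the h-levels shown are illustrative):

```python
# Number of discrete unknowns N = 1/h^2 at successive refinement levels: a
# fixed-size network that contains V^h on coarse meshes eventually cannot,
# because N outgrows what its parameters can represent.
for k in range(3, 8):
    h = 2.0 ** -k
    print(f"h = 2^-{k}:  N = {round(1 / h**2)} unknowns")
```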

Fig. 13: Convergence of the error in the \(L^2\) norm for the Poisson equation with analytical solution \(u = \sin (\pi x)\sin (\pi y)\)

Appendix 3: Solutions to the parametric Poisson’s equation

1.1 Randomly selected examples

See Figs. 14, 15.

Table 2: Norm of solution fields for a few randomly selected examples

Fig. 14: Contours for the randomly selected examples presented in Table 2: (left) \(\ln (\nu )\), (mid-left) \(u_{\theta }\), (mid-right) \(u^h\), and (right) \(u_{\theta }-u^h\)

1.2 Mean and standard-deviation fields

See Table 3.

Table 3: Norm of the mean and standard-deviation fields

Fig. 15: (top) Mean and (bottom) standard-deviation fields: (left) NeuFENet (\(u_{\theta }\)), (mid) conventional FEM (\(u^h\)), (right) pointwise difference (\(u_{\theta }-u^h\))

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Khara, B., Balu, A., Joshi, A. et al. NeuFENet: neural finite element solutions with theoretical bounds for parametric PDEs. Engineering with Computers (2024). https://doi.org/10.1007/s00366-024-01955-7

