Abstract
We consider a mesh-based approach for training a neural network to produce field predictions of solutions to parametric partial differential equations (PDEs). This contrasts with current approaches to "neural PDE solvers," which employ collocation-based methods to make pointwise predictions. The mesh-based approach has two advantages: it naturally enforces different boundary conditions, and it makes it easy to invoke well-developed PDE theory, including analysis of numerical stability and convergence, to obtain capacity bounds for the proposed neural networks on discretized domains. We explore this strategy, called NeuFENet, using a weighted Galerkin loss function based on the finite element method (FEM) applied to a parametric elliptic PDE. The weighted Galerkin (FEM) loss is akin to an energy functional; it produces improved solutions, satisfies a priori mesh convergence, and can model both Dirichlet and Neumann boundary conditions. We prove theoretically, and illustrate with experiments, convergence results analogous to the mesh convergence analysis used for finite element solutions of PDEs. These results suggest that a mesh-based neural network approach is a promising method for solving parametric PDEs with theoretical bounds.
Data availability
The data and code associated with this publication will be available on GitHub.
Notes
In contrast, state-of-the-art neural methods allow us to use basis functions beyond polynomials or Fourier bases and to approximate much more complicated mappings. Although such methods can be analyzed theoretically, the estimates are often impractical [25,26,27]. This is a very active area of research, and we expect tighter estimates in the future.
While a probability-based definition of \(\omega\) is not needed for defining a parametric PDE, we choose this definition for two reasons. First, such a formulation allows easy extension to the stochastic PDE case. Second, it allows the use of expectation-based arguments in the analysis of convergence.
References
Raissi M, Perdikaris P, Karniadakis GE (2019) Physics-informed neural networks: a deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J Comput Phys 378:686–707
Rudy S, Alla A, Brunton SL, Nathan Kutz J (2019) Data-driven identification of parametric partial differential equations. SIAM J Appl Dyn Syst 18(2):643–660
Tompson J, Schlachter K, Sprechmann P, Perlin K (2017) Accelerating Eulerian fluid simulation with convolutional networks. In: International Conference on machine learning, pp 3424–3433. PMLR
Raissi M, Karniadakis GE (2018) Hidden physics models: machine learning of nonlinear partial differential equations. J Comput Phys 357:125–141
Lu L, Jin P, Karniadakis GE (2019) DeepONet: learning nonlinear operators for identifying differential equations based on the universal approximation theorem of operators. arXiv preprint arXiv:1910.03193
Kharazmi E, Zhang Z, Karniadakis GE (2019) Variational physics-informed neural networks for solving partial differential equations. arXiv preprint arXiv:1912.00873
Sirignano J, Spiliopoulos K (2018) DGM: a deep learning algorithm for solving partial differential equations. J Comput Phys 375:1339–1364
Yang L, Zhang D, Karniadakis GE (2018) Physics-informed generative adversarial networks for stochastic differential equations. arXiv preprint arXiv:1811.02033
Pang G, Lu L, Karniadakis GE (2019) fPINNs: fractional physics-informed neural networks. SIAM J Sci Comput 41(4):A2603–A2626
Karumuri S, Tripathy R, Bilionis I, Panchal J (2020) Simulator-free solution of high-dimensional stochastic elliptic partial differential equations using deep neural networks. J Comput Phys 404:109120
Han J, Jentzen A, E W (2018) Solving high-dimensional partial differential equations using deep learning. Proc Natl Acad Sci 115(34):8505–8510
Michoski C, Milosavljevic M, Oliver T, Hatch D (2019) Solving irregular and data-enriched differential equations using deep neural networks. arXiv preprint arXiv:1905.04351
Samaniego E, Anitescu C, Goswami S, Nguyen-Thanh VM, Guo H, Hamdia K, Zhuang X, Rabczuk T (2020) An energy approach to the solution of partial differential equations in computational mechanics via machine learning: concepts, implementation and applications. Comput Methods Appl Mech Eng 362:112790
Ramabathiran AA, Ramachandran P (2021) SPINN: sparse, physics-based, and partially interpretable neural networks for PDEs. J Comput Phys 445:110600
Botelho S, Joshi A, Khara B, Sarkar S, Hegde C, Adavani S, Ganapathysubramanian B (2020) Deep generative models that solve PDEs: distributed computing for training large data-free models. arXiv preprint arXiv:2007.12792
Mitusch SK, Funke SW, Kuchta M (2021) Hybrid FEM-NN models: combining artificial neural networks with the finite element method. J Comput Phys 446:110651
Jokar M, Semperlotti F (2021) Finite element network analysis: a machine learning based computational framework for the simulation of physical systems. Comput Struct 247:106484
Sitzmann V, Martel JNP, Bergman AW, Lindell DB, Wetzstein G (2020) Implicit neural representations with periodic activation functions. Adv Neural Inform Process Syst 33:7462–7473
Mishra S, Rusch TK (2021) Enhancing accuracy of deep learning algorithms by training with low-discrepancy sequences. SIAM J Numer Anal 59(3):1811–1834
Zhu Y, Zabaras N, Koutsourelakis P-S, Perdikaris P (2019) Physics-constrained deep learning for high-dimensional surrogate modeling and uncertainty quantification without labeled data. J Comput Phys 394:56–81
Wen G, Li Z, Azizzadenesheli K, Anandkumar A, Benson SM (2021) U-FNO: an enhanced Fourier neural operator-based deep learning model for multiphase flow. arXiv preprint arXiv:2109.03697
Ranade R, Hill C, Pathak J (2021) DiscretizationNet: a machine-learning based solver for Navier–Stokes equations using finite volume discretization. Comput Methods Appl Mech Eng 378:113722
Brenner S, Scott R (2007) The mathematical theory of finite element methods, vol 15. Springer Science & Business Media
Larson MG, Bengzon F (2013) The finite element method: theory, implementation, and applications, vol 10. Springer Science & Business Media
Shin Y, Zhang Z, Karniadakis GE (2020) Error estimates of residual minimization using neural networks for linear PDEs. arXiv preprint arXiv:2010.08019
Mishra S, Molinaro R (2020) Estimates on the generalization error of physics-informed neural networks (PINNs) for approximating PDEs. arXiv preprint arXiv:2006.16144
Jiao Y, Lai Y, Luo Y, Wang Y, Yang Y (2021) Error analysis of deep Ritz methods for elliptic equations. arXiv preprint arXiv:2107.14478
He J, Li L, Xu J, Zheng C (2018) ReLU deep neural networks and linear finite elements. arXiv preprint arXiv:1807.03973
Takeuchi J, Kosugi Y (1994) Neural network representation of finite element method. Neural Netw 7(2):389–395
Xu G, Littlefair G, Penson R, Callan R (1999) Application of fe-based neural networks to dynamic problems. In: ICONIP’99. ANZIIS’99 & ANNES’99 & ACNN’99. 6th International Conference on neural information processing. Proceedings (Cat. No. 99EX378), volume 3, pp 1039–1044. IEEE
Ramuhalli P, Udpa L, Udpa SS (2005) Finite-element neural networks for solving differential equations. IEEE Trans Neural Netw 16(6):1381–1392
Chao X, Wang C, Ji F, Yuan X (2012) Finite-element neural network-based solving 3-D differential equations in MFL. IEEE Trans Magn 48(12):4747–4756
Khodayi-Mehr R, Zavlanos M (2020) VarNet: variational neural networks for the solution of partial differential equations. In: Learning for dynamics and control, pp 298–307. PMLR
Gao H, Zahr MJ, Wang J-X (2022) Physics-informed graph neural Galerkin networks: a unified framework for solving PDE-governed forward and inverse problems. Comput Methods Appl Mech Eng 390:114502
Yao H, Gao Y, Liu Y (2020) Fea-net: a physics-guided data-driven model for efficient mechanical response prediction. Comput Methods Appl Mech Eng 363:112892
Evans LC (1998) Partial differential equations. Graduate Stud Math 19(4):7
Reddy JN (2010) An introduction to the finite element method, vol 1221. McGraw-Hill New York
E W, Yu B (2017) The deep Ritz method: a deep learning-based numerical algorithm for solving variational problems. arXiv preprint arXiv:1710.00211
Liao Y, Ming P (2021) Deep Nitsche method: deep Ritz method with essential boundary conditions. Commun Comput Phys 29(5):1365–1384
Courte L, Zeinhofer M (2021) Robin pre-training for the deep Ritz method. arXiv preprint arXiv:2106.06219
Müller J, Zeinhofer M (2022) Error estimates for the deep Ritz method with boundary penalty. In: Mathematical and Scientific Machine Learning, pp 215–230. PMLR
Müller J, Zeinhofer M (2022) Notes on exact boundary values in residual minimisation. In: Mathematical and Scientific Machine Learning, pp 231–240. PMLR
Dondl P, Müller J, Zeinhofer M (2022) Uniform convergence guarantees for the deep Ritz method for nonlinear problems. Adv Contin Discrete Models 2022(1):1–19
Lee H, Kang IS (1990) Neural algorithm for solving differential equations. J Comput Phys 91(1):110–131
Lagaris IE, Likas A, Fotiadis DI (1998) Artificial neural networks for solving ordinary and partial differential equations. IEEE Trans Neural Netw 9(5):987–1000
Malek A, Beidokhti Shekari R (2006) Numerical solution for high order differential equations using a hybrid neural network optimization method. Appl Math Comput 183(1):260–271
Sukumar N, Srivastava A (2021) Exact imposition of boundary conditions with distance functions in physics-informed deep neural networks. arXiv preprint arXiv:2104.08426
Lagaris IE, Likas AC, Papageorgiou DG (2000) Neural-network methods for boundary value problems with irregular boundaries. IEEE Trans Neural Netw 11(5):1041–1049
van der Meer R, Oosterlee CW, Borovykh A (2020) Optimally weighted loss functions for solving PDEs with neural networks. arXiv preprint arXiv:2002.06269
Wang S, Teng Y, Perdikaris P (2020) Understanding and mitigating gradient pathologies in physics-informed neural networks. arXiv preprint arXiv:2001.04536
Hennigh O, Narasimhan S, Nabian MA, Subramaniam A, Tangsali K, Fang Z, Rietmann M, Byeon W, Choudhry S (2021) NVIDIA SimNet: an AI-accelerated multi-physics simulation framework. In: International Conference on computational science, pp 447–461. Springer
Wang S, Perdikaris P (2021) Long-time integration of parametric evolution equations with physics-informed deeponets. arXiv preprint arXiv:2106.05384
Paganini M, de Oliveira L, Nachman B (2018) CaloGAN: simulating 3D high energy particle showers in multilayer electromagnetic calorimeters with generative adversarial networks. Phys Rev D 97(1):014021
Krishnapriyan AS, Gholami A, Zhe S, Kirby RM, Mahoney MW (2021) Characterizing possible failure modes in physics-informed neural networks. arXiv preprint arXiv:2109.01050
Wang S, Teng Y, Perdikaris P (2021) Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM J Sci Comput 43(5):A3055–A3081
Fox C (1987) An introduction to the calculus of variations. Courier Corporation
Kingma DP, Ba J (2014) Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980
Ronneberger O, Fischer P, Brox T (2015) U-Net: convolutional networks for biomedical image segmentation. In: International Conference on medical image computing and computer-assisted intervention, pp 234–241. Springer
Çiçek Ö, Abdulkadir A, Lienkamp SS, Brox T, Ronneberger O (2016) 3D U-Net: learning dense volumetric segmentation from sparse annotation. In: International Conference on medical image computing and computer-assisted intervention, pp 424–432. Springer
Paszke A, Gross S, Massa F, Lerer A, Bradbury J, Chanan G, Killeen T, Lin Z, Gimelshein N, Antiga L et al (2019) PyTorch: an imperative style, high-performance deep learning library. Adv Neural Inf Process Syst 32:8026–8037
Bubeck S, Eldan R, Lee YT, Mikulincer D (2020) Network size and weights size for memorization with two-layers neural networks. arXiv preprint arXiv:2006.02855
LeCun Y, Bottou L, Bengio Y, Haffner P (1998) Gradient-based learning applied to document recognition. Proc IEEE 86(11):2278–2324
Oden JT, Reddy JN (2012) An introduction to the mathematical theory of finite elements. Courier Corporation
Barron AR (1993) Universal approximation bounds for superpositions of a sigmoidal function. IEEE Trans Inf Theory 39(3):930–945
Balu A, Botelho S, Khara B, Rao V, Hegde C, Sarkar S, Adavani S, Krishnamurthy A, Ganapathysubramanian B (2021) Distributed multigrid neural solvers on megavoxel domains. arXiv preprint arXiv:2104.14538
Lu L, Meng X, Mao Z, Karniadakis GE (2021) DeepXDE: a deep learning library for solving differential equations. SIAM Rev 63(1):208–228
Hughes TJR, Scovazzi G, Franca LP (2017) Multiscale and stabilized methods. In: Stein E, de Borst R, Hughes TJR (eds) Encyclopedia of computational mechanics second edition, vol 5, chap 2, pp 1–64. John Wiley & Sons, Ltd. https://doi.org/10.1002/9781119176817.ecm2051
Ghanem RG, Spanos PD (2003) Stochastic finite elements: a spectral approach. Courier Corporation
Acknowledgements
This work was supported in part by the National Science Foundation under Grant Nos. CCF-2005804, LEAP-HI-2053760, OAC-1750865, CPS-FRONTIER-1954556, and USDA-NIFA-2021-67021-35329.
Author information
Authors and Affiliations
Corresponding authors
Ethics declarations
Conflict of interest
The authors have no conflicts of interest to declare that are relevant to the content of this article.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Appendix 1: Representation of random diffusivity
With \(\omega\) taken from the sample space \(\Omega\), the diffusivity/permeability \(\nu\) can be written as an exponential of a random quantity Z
We assume that Z is square integrable, i.e., \(\mathbb {E}\left[ |Z({\textbf{x}}; \omega )|^2 \right] < \infty\). Then, we can write Z using the Karhunen–Loeve expansion [68], as
where \({\bar{Z}}({\textbf{x}}) = \mathbb {E}(Z({\textbf{x}}, \omega ))\), and \(\psi _i(\omega )\) are independent random variables with zero mean and unit variance. \(\lambda _i\) and \(\phi _i({\textbf{x}})\) are the eigenvalues and eigenvectors corresponding to the Fredholm equation
where \(C_Z(s,t)\) is the covariance kernel given by,
where \(\eta _i\) is the correlation length in the \(x_i\) coordinate. This particular form of the covariance kernel is separable in the three coordinates, thus the eigenvalues and the eigenfunctions of the multi-dimensional case can be obtained by combining the eigenvalues and eigenfunctions of the one-dimensional covariance kernel given by:
where \(\sigma _Z\) is the variance and \(\eta\) is the correlation length in one-dimension.
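For reference, the construction described above can be summarized in the following standard form; this is a reconstruction from the surrounding text and the spectral approach of [68], not the article's numbered equations:

```latex
% Log-diffusivity and its Karhunen--Loeve expansion (symbols as in the text)
\nu(\mathbf{x};\omega) = e^{Z(\mathbf{x};\omega)}, \qquad
Z(\mathbf{x};\omega) = \bar{Z}(\mathbf{x})
  + \sum_{i=1}^{\infty} \sqrt{\lambda_i}\,\phi_i(\mathbf{x})\,\psi_i(\omega),
% where (\lambda_i, \phi_i) solve the Fredholm eigenvalue problem
\int_{D} C_Z(\mathbf{s},\mathbf{t})\,\phi_i(\mathbf{t})\,d\mathbf{t}
  = \lambda_i\,\phi_i(\mathbf{s}), \qquad
% with the separable exponential covariance kernel
C_Z(\mathbf{s},\mathbf{t}) = \sigma_Z^2
  \exp\!\Big(-\sum_{j}\tfrac{|s_j - t_j|}{\eta_j}\Big).
```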
Equation 46 can then be written as
where \({\textbf{a}} = (a_1, \ldots, a_m)\) is an m-dimensional parameter, \(\lambda _x\) and \(\lambda _y\) are vectors of real numbers arranged in monotonically decreasing order, and \(\phi\) and \(\psi\) are functions of x and y, respectively. \(\lambda _{xi}\) is calculated as
where \(\omega _x\) is the solution to the system of transcendental equations obtained after differentiating Eq. 48 with respect to \({\textbf{t}}\). \(\lambda _{yi}\) are calculated similarly. \(\phi _i(x)\) are given by
and \(\psi _i(y)\) are calculated similarly. We take \(m = 6\) and assume that each \(a_i\) is uniformly distributed in \([-\sqrt{3},\sqrt{3}]\), so that \({\textbf{a}} \in [-\sqrt{3},\sqrt{3}]^6\). The input diffusivity \(\nu\) in all the examples in Sects. 6.3 and 6.4 is calculated by choosing the 6-dimensional coefficient \({\textbf{a}}\) from \([-\sqrt{3},\sqrt{3}]^6\).
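As a concrete illustration, the sketch below (not the authors' implementation; `kl_eigenpairs_1d`, `sample_log_diffusivity`, and all parameter values are hypothetical) approximates the eigenpairs of the one-dimensional exponential covariance kernel numerically with the Nyström method, rather than solving the transcendental equations, and then draws a diffusivity sample \(\nu = e^Z\) with \({\textbf{a}} \in [-\sqrt{3},\sqrt{3}]^m\):

```python
import numpy as np

def kl_eigenpairs_1d(n=64, sigma=1.0, eta=0.5, m=6):
    """Approximate the m largest KL eigenpairs of
    C(s, t) = sigma^2 * exp(-|s - t| / eta) on [0, 1] (Nystrom method)."""
    x = (np.arange(n) + 0.5) / n                   # midpoint quadrature nodes
    w = 1.0 / n                                    # uniform quadrature weight
    C = sigma**2 * np.exp(-np.abs(x[:, None] - x[None, :]) / eta)
    vals, vecs = np.linalg.eigh(C * w)             # discrete eigenproblem
    idx = np.argsort(vals)[::-1][:m]               # keep the m largest
    lam = vals[idx]
    phi = vecs[:, idx] / np.sqrt(w)                # L2-normalized modes
    return x, lam, phi

def sample_log_diffusivity(rng, m=6):
    """Draw a ~ U[-sqrt(3), sqrt(3)]^m (zero mean, unit variance), assemble
    Z(x) = sum_i a_i sqrt(lam_i) phi_i(x) (taking Zbar = 0), set nu = exp(Z)."""
    x, lam, phi = kl_eigenpairs_1d(m=m)
    a = rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), size=m)
    Z = phi @ (a * np.sqrt(lam))
    return x, np.exp(Z)

rng = np.random.default_rng(0)
x, nu = sample_log_diffusivity(rng)
print(nu.shape, nu.min() > 0)   # prints: (64,) True
```

The Nyström route trades closed-form eigenfunctions for a small dense eigenproblem, which generalizes readily to other covariance kernels.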
Appendix 2: Further discussion on convergence studies
1.1 Discussion on the role of keeping \(e_{\mathcal {H}}\) and \(e_{\theta }\) low
If we choose a fixed network architecture and use it to solve Eq. 35 across different h-levels, the errors do not necessarily decrease with decreasing h. As shown in Fig. 13, the errors actually increase once \(h<2^{-5}\). The reason for this behavior is that, as h decreases, the number of discrete unknowns in the mesh (i.e., the \(U_i\)'s in Eq. 14) increases; in fact, the number of basis functions/unknowns N is exactly \(\frac{1}{h^2}\). As h decreases, the size of the space \(V^h\) grows. Since the network remains the same, the discrete function space \(V^h\) no longer remains a subspace of \(V^h_\theta\); the network function class must also grow to accommodate all the possible functions at the lower values of h. Figure 13 also shows the errors obtained when the network is indeed enlarged so that \(V^h_\theta \supset V^h\) (these are the same errors plotted in Fig. 7).
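The capacity argument above can be made concrete with a small sketch; the parameter budget of 1024 below is a hypothetical stand-in for a fixed network's capacity, not a value from the article:

```python
# With mesh size h, the number of nodal unknowns is N = 1/h^2, so a network
# with a fixed parameter budget eventually cannot represent every function
# in V^h once N exceeds that budget.

def unknowns(h: float) -> int:
    """Number of basis functions/unknowns N = 1/h^2 on a 2D mesh."""
    return round(1.0 / h**2)

fixed_budget = 1024  # hypothetical fixed network capacity
for k in range(3, 8):
    h = 2.0**-k
    N = unknowns(h)
    print(f"h = 2^-{k}: N = {N:6d}, within budget: {N <= fixed_budget}")
```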
Appendix 3: Solutions to the parametric Poisson’s equation
1.1 Randomly selected examples
1.2 Mean and standard-deviation fields
See Table 3.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Khara, B., Balu, A., Joshi, A. et al. NeuFENet: neural finite element solutions with theoretical bounds for parametric PDEs. Engineering with Computers (2024). https://doi.org/10.1007/s00366-024-01955-7
Received:
Accepted:
Published:
DOI: https://doi.org/10.1007/s00366-024-01955-7