Abstract
We consider a mesh-based approach for training a neural network to produce field predictions of solutions to parametric partial differential equations (PDEs). This contrasts with current approaches to "neural PDE solvers," which employ collocation-based methods to make pointwise predictions. The mesh-based approach has two advantages: it naturally enforces different boundary conditions, and it makes it easy to invoke well-developed PDE theory, including analysis of numerical stability and convergence, to obtain capacity bounds for the proposed neural networks on discretized domains. We explore this strategy, called NeuFENet, using a weighted Galerkin loss function based on the finite element method (FEM) applied to a parametric elliptic PDE. The weighted Galerkin (FEM) loss is akin to an energy functional; it produces improved solutions, satisfies a priori mesh convergence, and can model both Dirichlet and Neumann boundary conditions. We prove theoretically, and illustrate with experiments, convergence results analogous to the mesh convergence analysis used for finite element solutions of PDEs. These results suggest that a mesh-based neural network approach is a promising method for solving parametric PDEs with theoretical bounds.
Data availability
The data and code associated with this publication will be available on GitHub.
Notes
In contrast, state-of-the-art neural methods allow us to use basis functions beyond polynomials or Fourier bases and to approximate much more complicated mappings. Although such methods can be analyzed theoretically, the estimates are often impractical [25,26,27]. This is a very active area of research, and we expect tighter estimates in the future.
While a probability-based definition of \(\omega\) is not needed for defining a parametric PDE, we choose this definition for two reasons. First, such a formulation allows easy extension to the stochastic PDE case. Second, it allows the use of expectation-based arguments in the analysis of convergence.
References
Raissi M, Perdikaris P, Karniadakis GE (2019) Physics-informed neural networks: a deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J Comput Phys 378:686–707
Rudy S, Alla A, Brunton SL, Nathan Kutz J (2019) Data-driven identification of parametric partial differential equations. SIAM J Appl Dyn Syst 18(2):643–660
Tompson J, Schlachter K, Sprechmann P, Perlin K (2017) Accelerating Eulerian fluid simulation with convolutional networks. In: International Conference on machine learning, pp 3424–3433. PMLR
Raissi M, Karniadakis GE (2018) Hidden physics models: machine learning of nonlinear partial differential equations. J Comput Phys 357:125–141
Lu L, Jin P, Karniadakis GE (2019) DeepONet: learning nonlinear operators for identifying differential equations based on the universal approximation theorem of operators. arXiv preprint arXiv:1910.03193
Kharazmi E, Zhang Z, Karniadakis GE (2019) Variational physics-informed neural networks for solving partial differential equations. arXiv preprint arXiv:1912.00873
Sirignano J, Spiliopoulos K (2018) DGM: a deep learning algorithm for solving partial differential equations. J Comput Phys 375:1339–1364
Yang L, Zhang D, Karniadakis GE (2018) Physics-informed generative adversarial networks for stochastic differential equations. arXiv preprint arXiv:1811.02033
Pang G, Lu L, Karniadakis GE (2019) fPINNs: fractional physics-informed neural networks. SIAM J Sci Comput 41(4):A2603–A2626
Karumuri S, Tripathy R, Bilionis I, Panchal J (2020) Simulator-free solution of high-dimensional stochastic elliptic partial differential equations using deep neural networks. J Comput Phys 404:109120
Han J, Jentzen A, E W (2018) Solving high-dimensional partial differential equations using deep learning. Proc Natl Acad Sci 115(34):8505–8510
Michoski C, Milosavljevic M, Oliver T, Hatch D (2019) Solving irregular and data-enriched differential equations using deep neural networks. arXiv preprint arXiv:1905.04351
Samaniego E, Anitescu C, Goswami S, Nguyen-Thanh VM, Guo H, Hamdia K, Zhuang X, Rabczuk T (2020) An energy approach to the solution of partial differential equations in computational mechanics via machine learning: concepts, implementation and applications. Comput Methods Appl Mech Eng 362:112790
Ramabathiran AA, Ramachandran P (2021) SPINN: sparse, physics-based, and partially interpretable neural networks for PDEs. J Comput Phys 445:110600
Botelho S, Joshi A, Khara B, Sarkar S, Hegde C, Adavani S, Ganapathysubramanian B (2020) Deep generative models that solve PDEs: distributed computing for training large data-free models. arXiv preprint arXiv:2007.12792
Mitusch SK, Funke SW, Kuchta M (2021) Hybrid FEM-NN models: combining artificial neural networks with the finite element method. J Comput Phys 446:110651
Jokar M, Semperlotti F (2021) Finite element network analysis: a machine learning based computational framework for the simulation of physical systems. Comput Struct 247:106484
Sitzmann V, Martel JNP, Bergman AW, Lindell DB, Wetzstein G (2020) Implicit neural representations with periodic activation functions. Adv Neural Inform Process Syst 33:7462–7473
Mishra S, Rusch TK (2021) Enhancing accuracy of deep learning algorithms by training with low-discrepancy sequences. SIAM J Numer Anal 59(3):1811–1834
Zhu Y, Zabaras N, Koutsourelakis P-S, Perdikaris P (2019) Physics-constrained deep learning for high-dimensional surrogate modeling and uncertainty quantification without labeled data. J Comput Phys 394:56–81
Wen G, Li Z, Azizzadenesheli K, Anandkumar A, Benson SM (2021) U-FNO: an enhanced Fourier neural operator-based deep learning model for multiphase flow. arXiv preprint arXiv:2109.03697
Ranade R, Hill C, Pathak J (2021) DiscretizationNet: a machine-learning based solver for Navier–Stokes equations using finite volume discretization. Comput Methods Appl Mech Eng 378:113722
Brenner S, Scott R (2007) The mathematical theory of finite element methods, vol 15. Springer Science & Business Media
Larson MG, Bengzon F (2013) The finite element method: theory, implementation, and applications, vol 10. Springer Science & Business Media
Shin Y, Zhang Z, Karniadakis GE (2020) Error estimates of residual minimization using neural networks for linear PDEs. arXiv preprint arXiv:2010.08019
Mishra S, Molinaro R (2020) Estimates on the generalization error of physics-informed neural networks (PINNs) for approximating PDEs. arXiv preprint arXiv:2006.16144
Jiao Y, Lai Y, Luo Y, Wang Y, Yang Y (2021) Error analysis of deep Ritz methods for elliptic equations. arXiv preprint arXiv:2107.14478
He J, Li L, Xu J, Zheng C (2018) ReLU deep neural networks and linear finite elements. arXiv preprint arXiv:1807.03973
Takeuchi J, Kosugi Y (1994) Neural network representation of finite element method. Neural Netw 7(2):389–395
Xu G, Littlefair G, Penson R, Callan R (1999) Application of fe-based neural networks to dynamic problems. In: ICONIP’99. ANZIIS’99 & ANNES’99 & ACNN’99. 6th International Conference on neural information processing. Proceedings (Cat. No. 99EX378), volume 3, pp 1039–1044. IEEE
Ramuhalli P, Udpa L, Udpa SS (2005) Finite-element neural networks for solving differential equations. IEEE Trans Neural Netw 16(6):1381–1392
Chao X, Wang C, Ji F, Yuan X (2012) Finite-element neural network-based solving 3-D differential equations in MFL. IEEE Trans Magn 48(12):4747–4756
Khodayi-Mehr R, Zavlanos M (2020) VarNet: variational neural networks for the solution of partial differential equations. In: Learning for dynamics and control, pp 298–307. PMLR
Gao H, Zahr MJ, Wang J-X (2022) Physics-informed graph neural Galerkin networks: a unified framework for solving PDE-governed forward and inverse problems. Comput Methods Appl Mech Eng 390:114502
Yao H, Gao Y, Liu Y (2020) Fea-net: a physics-guided data-driven model for efficient mechanical response prediction. Comput Methods Appl Mech Eng 363:112892
Evans LC (1998) Partial differential equations. Graduate Stud Math 19(4):7
Reddy JN (2010) An introduction to the finite element method, vol 1221. McGraw-Hill New York
E W, Yu B (2017) The deep Ritz method: a deep learning-based numerical algorithm for solving variational problems. arXiv preprint arXiv:1710.00211
Liao Y, Ming P (2021) Deep Nitsche method: deep Ritz method with essential boundary conditions. Commun Comput Phys 29(5):1365–1384
Courte L, Zeinhofer M (2021) Robin pre-training for the deep Ritz method. arXiv preprint arXiv:2106.06219
Müller J, Zeinhofer M (2022) Error estimates for the deep Ritz method with boundary penalty. In: Mathematical and Scientific Machine Learning, pp 215–230. PMLR
Müller J, Zeinhofer M (2022) Notes on exact boundary values in residual minimisation. In: Mathematical and Scientific Machine Learning, pp 231–240. PMLR
Dondl P, Müller J, Zeinhofer M (2022) Uniform convergence guarantees for the deep Ritz method for nonlinear problems. Adv Contin Discrete Models 2022(1):1–19
Lee H, Kang IS (1990) Neural algorithm for solving differential equations. J Comput Phys 91(1):110–131
Lagaris IE, Likas A, Fotiadis DI (1998) Artificial neural networks for solving ordinary and partial differential equations. IEEE Trans Neural Netw 9(5):987–1000
Malek A, Beidokhti Shekari R (2006) Numerical solution for high order differential equations using a hybrid neural network optimization method. Appl Math Comput 183(1):260–271
Sukumar N, Srivastava A (2021) Exact imposition of boundary conditions with distance functions in physics-informed deep neural networks. arXiv preprint arXiv:2104.08426
Lagaris IE, Likas AC, Papageorgiou DG (2000) Neural-network methods for boundary value problems with irregular boundaries. IEEE Trans Neural Netw 11(5):1041–1049
van der Meer R, Oosterlee CW, Borovykh A (2020) Optimally weighted loss functions for solving PDEs with neural networks. arXiv preprint arXiv:2002.06269
Wang S, Teng Y, Perdikaris P (2020) Understanding and mitigating gradient pathologies in physics-informed neural networks. arXiv preprint arXiv:2001.04536
Hennigh O, Narasimhan S, Nabian MA, Subramaniam A, Tangsali K, Fang Z, Rietmann M, Byeon W, Choudhry S (2021) NVIDIA SimNet: an AI-accelerated multi-physics simulation framework. In: International Conference on computational science, pp 447–461. Springer
Wang S, Perdikaris P (2021) Long-time integration of parametric evolution equations with physics-informed deeponets. arXiv preprint arXiv:2106.05384
Paganini M, de Oliveira L, Nachman B (2018) CaloGAN: simulating 3D high energy particle showers in multilayer electromagnetic calorimeters with generative adversarial networks. Phys Rev D 97(1):014021
Krishnapriyan AS, Gholami A, Zhe S, Kirby RM, Mahoney MW (2021) Characterizing possible failure modes in physics-informed neural networks. arXiv preprint arXiv:2109.01050
Wang S, Teng Y, Perdikaris P (2021) Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM J Sci Comput 43(5):A3055–A3081
Fox C (1987) An introduction to the calculus of variations. Courier Corporation
Kingma DP, Ba J (2014) Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980
Ronneberger O, Fischer P, Brox T (2015) U-Net: convolutional networks for biomedical image segmentation. In: International Conference on medical image computing and computer-assisted intervention, pp 234–241. Springer
Çiçek Ö, Abdulkadir A, Lienkamp SS, Brox T, Ronneberger O (2016) 3D U-Net: learning dense volumetric segmentation from sparse annotation. In: International Conference on medical image computing and computer-assisted intervention, pp 424–432. Springer
Paszke A, Gross S, Massa F, Lerer A, Bradbury J, Chanan G, Killeen T, Lin Z, Gimelshein N, Antiga L et al (2019) PyTorch: an imperative style, high-performance deep learning library. Adv Neural Inf Process Syst 32:8026–8037
Bubeck S, Eldan R, Lee YT, Mikulincer D (2020) Network size and weights size for memorization with two-layers neural networks. arXiv preprint arXiv:2006.02855
LeCun Y, Bottou L, Bengio Y, Haffner P (1998) Gradient-based learning applied to document recognition. Proc IEEE 86(11):2278–2324
Oden JT, Reddy JN (2012) An introduction to the mathematical theory of finite elements. Courier Corporation
Barron AR (1993) Universal approximation bounds for superpositions of a sigmoidal function. IEEE Trans Inf Theory 39(3):930–945
Balu A, Botelho S, Khara B, Rao V, Hegde C, Sarkar S, Adavani S, Krishnamurthy A, Ganapathysubramanian B (2021) Distributed multigrid neural solvers on megavoxel domains. arXiv preprint arXiv:2104.14538
Lu L, Meng X, Mao Z, Karniadakis GE (2021) DeepXDE: a deep learning library for solving differential equations. SIAM Rev 63(1):208–228
Hughes TJR, Scovazzi G, Franca LP (2017) Multiscale and stabilized methods. In: Stein E, de Borst R, Hughes TJR (eds) Encyclopedia of computational mechanics second edition, vol 5, chap 2, pp 1–64. John Wiley & Sons, Ltd. https://doi.org/10.1002/9781119176817.ecm2051
Ghanem RG, Spanos PD (2003) Stochastic finite elements: a spectral approach. Courier Corporation
Acknowledgements
This work was supported in part by the National Science Foundation under Grant Nos. CCF-2005804, LEAP-HI-2053760, OAC-1750865, CPS-FRONTIER-1954556, and USDA-NIFA-2021-67021-35329.
Author information
Authors and Affiliations
Corresponding authors
Ethics declarations
Conflict of interest
The authors have no conflicts of interest to declare that are relevant to the content of this article.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Appendix 1: Representation of random diffusivity
With \(\omega\) taken from the sample space \(\Omega\), the diffusivity/permeability \(\nu\) can be written as an exponential of a random quantity Z
We assume that Z is square integrable, i.e., \(\mathbb {E}\left[ |Z({\textbf{x}}; \omega )|^2 \right] < \infty\). Then, we can write Z using the Karhunen–Loeve expansion [68], as
where \({\bar{Z}}({\textbf{x}}) = \mathbb {E}(Z({\textbf{x}}, \omega ))\), and \(\psi _i(\omega )\) are independent random variables with zero mean and unit variance. \(\lambda _i\) and \(\phi _i({\textbf{x}})\) are the eigenvalues and eigenvectors corresponding to the Fredholm equation
where \(C_Z(s,t)\) is the covariance kernel given by,
where \(\eta _i\) is the correlation length in the \(x_i\) coordinate. This particular form of the covariance kernel is separable in the three coordinates, thus the eigenvalues and the eigenfunctions of the multi-dimensional case can be obtained by combining the eigenvalues and eigenfunctions of the one-dimensional covariance kernel given by:
where \(\sigma _Z\) is the variance and \(\eta\) is the correlation length in one-dimension.
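For reference, the construction described above can be summarized in the following standard form; this is a reconstruction from the surrounding text and the spectral approach of [68], not the article's numbered equations:

```latex
% Log-diffusivity and its Karhunen--Loeve expansion (symbols as in the text)
\nu(\mathbf{x};\omega) = e^{Z(\mathbf{x};\omega)}, \qquad
Z(\mathbf{x};\omega) = \bar{Z}(\mathbf{x})
  + \sum_{i=1}^{\infty} \sqrt{\lambda_i}\,\phi_i(\mathbf{x})\,\psi_i(\omega),
% where (\lambda_i, \phi_i) solve the Fredholm eigenvalue problem
\int_{D} C_Z(\mathbf{s},\mathbf{t})\,\phi_i(\mathbf{t})\,d\mathbf{t}
  = \lambda_i\,\phi_i(\mathbf{s}), \qquad
% with the separable exponential covariance kernel
C_Z(\mathbf{s},\mathbf{t}) = \sigma_Z^2
  \exp\!\Big(-\sum_{j}\tfrac{|s_j - t_j|}{\eta_j}\Big).
```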
Equation 46 can then be written as
where \({\textbf{a}} = (a_1, \ldots, a_m)\) is an m-dimensional parameter, \(\lambda _x\) and \(\lambda _y\) are vectors of real numbers arranged in monotonically decreasing order, and \(\phi\) and \(\psi\) are functions of x and y, respectively. \(\lambda _{xi}\) is calculated as
where \(\omega _x\) is the solution to the system of transcendental equations obtained after differentiating Eq. 48 with respect to \({\textbf{t}}\). \(\lambda _{yi}\) are calculated similarly. \(\phi _i(x)\) are given by
and \(\psi _i(y)\) are calculated similarly. We take \(m = 6\) and assume that each \(a_i\) is uniformly distributed in \([-\sqrt{3},\sqrt{3}]\), so that \({\textbf{a}} \in [-\sqrt{3},\sqrt{3}]^6\). The input diffusivity \(\nu\) in all the examples in Sects. 6.3 and 6.4 is calculated by choosing the 6-dimensional coefficient \({\textbf{a}}\) from \([-\sqrt{3},\sqrt{3}]^6\).
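As a concrete illustration, the sketch below (not the authors' implementation; `kl_eigenpairs_1d`, `sample_log_diffusivity`, and all parameter values are hypothetical) approximates the eigenpairs of the one-dimensional exponential covariance kernel numerically with the Nyström method, rather than solving the transcendental equations, and then draws a diffusivity sample \(\nu = e^Z\) with \({\textbf{a}} \in [-\sqrt{3},\sqrt{3}]^m\):

```python
import numpy as np

def kl_eigenpairs_1d(n=64, sigma=1.0, eta=0.5, m=6):
    """Approximate the m largest KL eigenpairs of
    C(s, t) = sigma^2 * exp(-|s - t| / eta) on [0, 1] (Nystrom method)."""
    x = (np.arange(n) + 0.5) / n                   # midpoint quadrature nodes
    w = 1.0 / n                                    # uniform quadrature weight
    C = sigma**2 * np.exp(-np.abs(x[:, None] - x[None, :]) / eta)
    vals, vecs = np.linalg.eigh(C * w)             # discrete eigenproblem
    idx = np.argsort(vals)[::-1][:m]               # keep the m largest
    lam = vals[idx]
    phi = vecs[:, idx] / np.sqrt(w)                # L2-normalized modes
    return x, lam, phi

def sample_log_diffusivity(rng, m=6):
    """Draw a ~ U[-sqrt(3), sqrt(3)]^m (zero mean, unit variance), assemble
    Z(x) = sum_i a_i sqrt(lam_i) phi_i(x) (taking Zbar = 0), set nu = exp(Z)."""
    x, lam, phi = kl_eigenpairs_1d(m=m)
    a = rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), size=m)
    Z = phi @ (a * np.sqrt(lam))
    return x, np.exp(Z)

rng = np.random.default_rng(0)
x, nu = sample_log_diffusivity(rng)
print(nu.shape, nu.min() > 0)   # prints: (64,) True
```

The Nyström route trades closed-form eigenfunctions for a small dense eigenproblem, which generalizes readily to other covariance kernels.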
Appendix 2: Further discussion on convergence studies
1.1 Discussion on the role of keeping \(e_{\mathcal {H}}\) and \(e_{\theta }\) low
If we choose a fixed network architecture and use it to solve Eq. 35 across different h-levels, the errors do not necessarily decrease with decreasing h. As shown in Fig. 13, the errors actually increase once \(h<2^{-5}\). The reason for this behavior is that, as h decreases, the number of discrete unknowns in the mesh (i.e., the \(U_i\)'s in Eq. 14) increases; in fact, the number of basis functions/unknowns N is exactly \(\frac{1}{h^2}\). As h decreases, the size of the space \(V^h\) grows. Since the network remains the same, the discrete function space \(V^h\) no longer remains a subspace of \(V^h_\theta\); the network function class must also grow to accommodate all the possible functions at the lower values of h. Figure 13 also shows the errors obtained when the network is indeed enlarged so that \(V^h_\theta \supset V^h\) (these are the same errors plotted in Fig. 7).
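The capacity argument above can be made concrete with a small sketch; the parameter budget of 1024 below is a hypothetical stand-in for a fixed network's capacity, not a value from the article:

```python
# With mesh size h, the number of nodal unknowns is N = 1/h^2, so a network
# with a fixed parameter budget eventually cannot represent every function
# in V^h once N exceeds that budget.

def unknowns(h: float) -> int:
    """Number of basis functions/unknowns N = 1/h^2 on a 2D mesh."""
    return round(1.0 / h**2)

fixed_budget = 1024  # hypothetical fixed network capacity
for k in range(3, 8):
    h = 2.0**-k
    N = unknowns(h)
    print(f"h = 2^-{k}: N = {N:6d}, within budget: {N <= fixed_budget}")
```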
Appendix 3: Solutions to the parametric Poisson’s equation
1.1 Randomly selected examples
1.2 Mean and standard-deviation fields
See Table 3.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Khara, B., Balu, A., Joshi, A. et al. NeuFENet: neural finite element solutions with theoretical bounds for parametric PDEs. Engineering with Computers (2024). https://doi.org/10.1007/s00366-024-01955-7
Received:
Accepted:
Published:
DOI: https://doi.org/10.1007/s00366-024-01955-7