Introduction

Partial differential equations (PDEs) are mathematical expressions that describe physical laws governing natural and engineered systems, such as heat transfer, fluid flow, and wave propagation. They are widely used to predict physical variables of interest in spatial and temporal domains. The solutions of PDEs are often combined with laboratory and field experiments to help stakeholders make informed decisions; accurate estimation of the variables of interest is therefore critical. Numerous methods are used to solve PDEs, including analytical1,2,3, finite difference4,5, finite volume5,6,7, and finite element5,8 methods.

In recent times, there have been significant developments in machine learning and physics-informed machine learning methods, leading to the emergence of physics-informed neural networks (PINN)9,10,11,12,13 as a popular tool for solving partial differential equations (PDEs). The approach involves representing the primary variable of the PDE with a neural network. The derivatives in the underlying PDE are then evaluated using automatic differentiation, and modern optimization frameworks are used to minimize a loss function that combines the PDE residual with the initial and boundary conditions. Such PINN methods have been used to solve a wide range of scientific and engineering problems, including metamaterials, fluid dynamics, and complex geometries9,10,11,12,13,14,15,16,17,18,19,20,21.

These literature examples demonstrate that PINN can handle complex geometry quite effectively. However, it has been shown that training a PINN model is computationally expensive, and the resulting solutions are not always as accurate as numerical solutions11,22,23. For instance, the training process can take thousands of epochs, and the absolute error of forward solutions is approximately \(10^{-5}\) due to the difficulties involved in solving high-dimensional non-convex optimization problems23,24.

Several studies have examined the limitations of PINN in terms of computation time and accuracy. For example, Jagtap et al.22 addressed the accuracy issue of PINN by introducing an additional constraint term to the loss function that enforces continuity/flux across the subdomain boundaries. However, this additional term makes the model more complex to train, requiring more data and increasing the computational cost. On the other hand, Huang et al.25 focused on improving the training efficiency of PINN by incorporating multi-resolution hash encoding that provides locally aware coordinate inputs to the neural network. However, this encoding method requires careful selection of hyperparameters and careful handling of auto-differentiation, complicating the training and post-training processes. A comprehensive study is required to evaluate how well PINN solutions satisfy balance laws, such as mass, momentum, and energy.

This study aimed to determine whether PINN conserves local and global mass and to compare its performance with the FV method. For this purpose, we solved the steady-state 1D groundwater flow (diffusion) equation for hydraulic heads using analytical, two-point flux finite volume (FV), and PINN methods for homogeneous and heterogeneous media. We conducted hyperparameter tuning by generating an ensemble of 10,800 unique scenarios for each case using a grid search technique, which explores all possible combinations of hyperparameters, to find the best PINN model. Subsequently, we computed the accuracy of the FV and PINN models against the hydraulic heads from the analytical solution using the performance metrics MSE, \( R^2\), MBE, and NSE (spatial). Finally, we investigated the integrity of the PINN models by computing Darcy’s flux, the local mass balance error, and the global mass balance error.

The paper is structured as follows: the section “Governing equations” provides the governing equations for groundwater flow and the equations for mass balance error calculation. The section “Methodology” presents the analytical, two-point flux finite volume, and PINN solutions for the governing equations described in section “Governing equations”, hyperparameter tuning, and performance metrics. The section “Results and discussion” discusses the results of tuning PINN to find the best model, and compares the analytical, finite volume, and PINN solutions for the hydraulic head, Darcy’s flux, and mass balance error. Finally, conclusions are drawn in the section “Conclusions”.

Governing equations

The one-dimensional steady-state balance of mass for groundwater flow in the absence of sources and sinks within the flow domain is:

$$\begin{aligned} \frac{\partial }{\partial {x}}\left[ {{q}(x)}\right] = 0, \end{aligned}$$
(1)

where q(x) [L/T] is the Darcy flux, which follows Darcy’s model:

$$\begin{aligned} {q}(x) = - K(x)\frac{{\partial {h}}}{{ \partial {x}}}, \end{aligned}$$
(2)

where h [L] is the piezometric head, x [L] is the spatial coordinate, and K [L/T] is the hydraulic conductivity. Combining Eqs. 1 and 2, the governing equation for the balance of mass becomes:

$$\begin{aligned} \frac{\partial }{{\partial {x}}}\left[ {K\left( x \right) \frac{\partial }{{\partial x}}h(x)} \right] = 0. \end{aligned}$$
(3)

Equation 3 requires two boundary conditions for the head. Assuming a unit-length domain, if \(h_L, h_R\) are the heads at the left and right faces of the domain, then \(h(x=0) = h_L\) and \(h(x=1) = h_R\).

The model domain (\(x \in [0,1]\)) is discretized into a finite number of cells (NCells). Figure 1 shows the discretization for obtaining both the FV and PINN solutions. Assuming a unit area for the one-dimensional domain, the mass flux at face \(i-\frac{1}{2}\) (\(m_{\textrm{local}_{i-\frac{1}{2}}}\)), the local mass balance error (LMBE) in cell i, and the global mass balance error (GMBE) are:

$$\begin{aligned} m_{\textrm{local}_{i-\frac{1}{2}}}&= \rho _w q_{i-\frac{1}{2}}, \end{aligned}$$
(4a)
$$\begin{aligned} LMBE_i&= {m_{\textrm{local}}}_{i + \frac{1}{2}} - {m_{\textrm{local}}}_{i - \frac{1}{2}} \quad \forall i = 1, 2, \cdots , NCells, \end{aligned}$$
(4b)
$$\begin{aligned} GMBE&= \sum \limits _{i = 1}^{NCells} LMBE_i, \end{aligned}$$
(4c)

where \({\rho _w}\) \([M/L^3]\) is the density of water, \(q_{i-\frac{1}{2}}\) [L/T] is Darcy’s flux at face \(i-\frac{1}{2}\), and \(x_{i-1/2}\) and \(x_{i+1/2}\) are the locations of the faces where the Darcy fluxes are computed. For simplicity, we set \(\rho_w = 1\).
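
For concreteness, Eqs. 4a–4c translate into a few lines of NumPy. The sketch below is ours (not the authors’ code) and assumes heads and conductivities at the cell centers, with face fluxes formed from two-point head differences and arithmetic-mean conductivity at interior faces, consistent with the FV scheme in the “Methodology” section; boundary-face fluxes, which require the Dirichlet data, are omitted for brevity.

```python
import numpy as np

def mass_balance_errors(h, K, dx, rho_w=1.0):
    """LMBE (Eq. 4b) for interior cells and their GMBE (Eq. 4c).

    h, K : heads and hydraulic conductivities at the cell centers
    dx   : uniform cell width
    """
    K_face = 0.5 * (K[:-1] + K[1:])            # arithmetic-mean K at faces
    q_face = -K_face * (h[1:] - h[:-1]) / dx   # Darcy flux at faces, Eq. 2
    m_face = rho_w * q_face                    # mass flux, Eq. 4a
    lmbe = m_face[1:] - m_face[:-1]            # Eq. 4b, interior cells
    gmbe = lmbe.sum()                          # Eq. 4c over those cells
    return lmbe, gmbe
```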

Figure 1

Conceptual 1D model illustrating the locations of the head, Darcy’s flux, and local mass balance error (LMBE) calculations.

Methodology

This section briefly describes the analytical, FV, and PINN methods used to solve Eq. 3. We consider two scenarios: homogeneous (K is constant) and heterogeneous (K varies with x) porous media.

Analytical solution

Considering K as constant and choosing \(h(x=0) = 1\) and \(h(x=1) = 0.9\) as the Dirichlet boundary conditions, the analytical solution can be derived as follows:

$$\begin{aligned} h = - 0.1x + 1. \end{aligned}$$
(5)

With K varying with space as \( K(x) = (x + 0.5)^2/2.25\) (a non-negative convex function), the analytical solution for the heterogeneous scenario can be derived as:

$$\begin{aligned} h = \frac{{0.075}}{{\left( {x + 0.5} \right) }} + {0.85}. \end{aligned}$$
(6)
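
The derivation is brief: Eq. 3 states that \(K(x)\,dh/dx\) equals a constant, say C, so

$$\begin{aligned} h(x) = C\int {\frac{dx}{K(x)}} + D = - \frac{2.25\,C}{x + 0.5} + D, \end{aligned}$$

and imposing \(h(0) = 1\) and \(h(1) = 0.9\) gives \(C = -1/30\) and \(D = 0.85\), recovering Eq. 6.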

Numerical solution using finite volume

For the homogeneous medium, the discretized form of Eq. 3 using the two-point flux finite volume scheme with a central difference for the gradient, \(\frac{dh}{dx}\), is:

$$\begin{aligned} {h_{i + 1}} - 2{h_i} + {h_{i - 1}} = 0, \end{aligned}$$
(7)

while for the heterogeneous medium it is:

$$\begin{aligned} \left( {{K_{i + 1}} + {K_i}} \right) {h_{i + 1}} - \left( {{K_{i + 1}} + 2{K_i} + {K_{i - 1}}} \right) {h_i} + \left( {{K_i} + {K_{i - 1}}} \right) {h_{i - 1}} = 0. \end{aligned}$$
(8)

Along with the boundary conditions \(h(x=0) = 1\) and \(h(x=1) = 0.9\), the linear systems of Eqs. 7 and 8 were solved using the LU decomposition method in the SciPy Python package26.
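
As an illustration, a minimal NumPy/SciPy sketch of this solve for the heterogeneous case is given below. It treats the first and last points as boundary nodes and assembles Eq. 8 row by row; the grid size and variable names are ours, not the authors’ code. Setting K constant reduces the interior rows to Eq. 7.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

n = 81                              # number of points (illustrative)
x = np.linspace(0.0, 1.0, n)        # node coordinates on [0, 1]
K = (x + 0.5) ** 2 / 2.25           # heterogeneous conductivity
h_L, h_R = 1.0, 0.9                 # Dirichlet boundary heads

A = np.zeros((n, n))
b = np.zeros(n)
A[0, 0] = A[-1, -1] = 1.0           # boundary rows: prescribe the head
b[0], b[-1] = h_L, h_R
for i in range(1, n - 1):           # interior rows: Eq. 8
    A[i, i - 1] = K[i] + K[i - 1]
    A[i, i] = -(K[i + 1] + 2.0 * K[i] + K[i - 1])
    A[i, i + 1] = K[i + 1] + K[i]

h = lu_solve(lu_factor(A), b)       # LU decomposition via SciPy
```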

PINN solution

The deep neural network (DNN) approximates the groundwater head as a function of x, parameterized by the weights and biases (\(\theta \)) of the network:

$$\begin{aligned} h(x) \approx \hat{h}(x;\theta ). \end{aligned}$$
(9)

The DNN approximation of the governing Eq. 3 for the homogeneous case is given by:

$$\begin{aligned} f(x) = \frac{\partial ^2{h}}{\partial {x^2}} \approx \hat{f}(x;\theta ) = \frac{{{\partial ^2}\hat{h}(x;\theta )}}{{\partial {x^2}}}, \end{aligned}$$
(10)

while for the heterogeneous case it is

$$\begin{aligned} f(x) = \frac{\partial }{{\partial {x}}}\left[ {K\left( x \right) \frac{\partial }{{\partial x}}h(x)} \right] \approx \hat{f}(x;\theta ) = \frac{\partial }{{\partial {x}}}\left[ {K\left( x \right) \frac{\partial }{{\partial x}} \hat{h}(x;\theta )} \right] . \end{aligned}$$
(11)

The loss function (accounting for the losses in the PDE residual and the residual due to the boundary conditions) without training data for the above scenarios is

$$\begin{aligned} \mathcal {L}(\theta ) = \frac{1}{{{N_c}}}{\sum \limits _{i = 1}^{{N_c}} {\left[ {\hat{f}(x_i^c;\theta )} \right] } ^2} + \frac{1}{{{N_D}}}{\sum \limits _{i = 1}^{{N_D}} {\left[ {\hat{h}\left( {x_i^D;\theta } \right) - g_i^*} \right] } ^2}. \end{aligned}$$
(12)

The first term on the right-hand side of Eq. 12 is the loss due to the PDE residual at the collocation points, and the second term is the loss due to the Dirichlet boundary conditions. In this loss function, \({N_c}\) and \({N_D}\) represent the number of collocation points and the number of Dirichlet boundary conditions, respectively, and \(x_i^c\) and \(x_i^D\) are the locations of the collocation points and Dirichlet boundaries.

If we include training data for the heads, an additional loss term is added to Eq. 12, and the overall loss function becomes

$$\begin{aligned} \mathcal {L}(\theta ) = \frac{1}{{{N_c}}}{\sum \limits _{i = 1}^{{N_c}} {\left[ {\hat{f}(x_i^c;\theta )} \right] } ^2} + \frac{1}{{{N_D}}}{\sum \limits _{i = 1}^{{N_D}} {\left[ {\hat{h}\left( {x_i^D;\theta } \right) - g_i^*} \right] } ^2} + \frac{1}{{{N_h}}}{\sum \limits _{i = 1}^{{N_h}} {\left[ {\hat{h}\left( {x_i^h;\theta } \right) - h_i^*} \right] } ^2}. \end{aligned}$$
(13)

The last term in Eq. 13 is the loss that matches h(x) to the measurements \(h^{*}\), where \({N_h}\) represents the number of head measurements. The head training data and Dirichlet boundary values are denoted by \(h^*\) and \(g^*\), and \(x_i^c\), \(x_i^D\), and \(x_i^h\) are the locations of the collocation points, Dirichlet boundaries, and head data, respectively.

Figure 2 illustrates the PINN structure used in this study to solve partial differential equations describing 1D steady-state groundwater flow problems in homogeneous and heterogeneous porous media. The PINN solution, \(\hat{h}(x;\theta )\) (see Fig. 2), is obtained by

$$\begin{aligned} \mathop {\text{ minimize }}_{\hat{h}(x;\theta ) \in \mathbb {R}^{+}} \mathcal {L}(\theta ), \end{aligned}$$
(14)

and the optimal solution is obtained by selecting the model with minimal loss through extensive hyperparameter tuning.

Figure 2

PINN structure for solving the partial differential equation describing 1D steady-state groundwater flow problems in homogeneous and heterogeneous porous media.

To develop the PINN, multi-layer feed-forward DNNs27,28 are used. The hyperbolic tangent was chosen as the activation function because it is infinitely differentiable, which facilitates differentiating Eq. 3 twice. The remaining PINN hyperparameters are the collocation points, the learning rate, the training data, and the number of epochs. Collocation points (\(x_i^c\) in Eq. 13) are the locations in the model domain where the PINN is trained to satisfy the PDE; we placed them at the FV cell centers so that the PINN solution can be compared directly with the analytical and FV solutions. Training data are observations/measurements, or values sampled from an analytical solution, used to train the PINN. An epoch is one complete pass of the training dataset (PDE, BCs, ICs where needed, and the training data) through the optimization algorithm. The learning rate is the step size taken at each epoch as the PINN approaches its minimum loss. Our PINN code used the JAX Python package29.
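
To make the setup concrete, the listing below sketches the data-free loss of Eq. 12 for the homogeneous case using JAX automatic differentiation. It is a minimal illustration under our own naming and initialization choices, not the authors’ code; the architecture shown matches the best homogeneous model (two hidden layers of 50 tanh nodes).

```python
import jax
import jax.numpy as jnp

def init_params(key, layers=(1, 50, 50, 1)):
    """Random weights and biases (theta) for a tanh MLP."""
    params = []
    for n_in, n_out in zip(layers[:-1], layers[1:]):
        key, sub = jax.random.split(key)
        W = jax.random.normal(sub, (n_in, n_out)) * jnp.sqrt(2.0 / n_in)
        params.append((W, jnp.zeros(n_out)))
    return params

def h_hat(params, x):
    """DNN approximation of the head, Eq. 9 (scalar in, scalar out)."""
    a = jnp.atleast_1d(x)
    for W, b in params[:-1]:
        a = jnp.tanh(a @ W + b)        # tanh: infinitely differentiable
    W, b = params[-1]
    return (a @ W + b)[0]

def loss(params, x_c):
    """Eq. 12: PDE residual at collocation points plus Dirichlet BC terms."""
    d2h = jax.vmap(jax.grad(jax.grad(h_hat, 1), 1), (None, 0))(params, x_c)
    pde = jnp.mean(d2h ** 2)           # residual of Eq. 10
    bc = (h_hat(params, 0.0) - 1.0) ** 2 + (h_hat(params, 1.0) - 0.9) ** 2
    return pde + bc / 2.0              # 1/N_D with N_D = 2

grad_loss = jax.jit(jax.grad(loss))    # gradient of Eq. 12 w.r.t. theta
```

An Adam-style update loop (e.g., via the Optax library) would then apply `grad_loss` for the chosen number of epochs at the chosen learning rate to carry out the minimization in Eq. 14.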

Hyperparameter tuning to find the best PINN model

To evaluate the performance of PINN solutions, we conducted detailed hyperparameter tuning. Hyperparameters are settings that control the PINN’s learning process, such as the learning rate in an optimization algorithm; they are distinct from process model parameters (e.g., permeability) learned from the data. Our approach involved varying several hyperparameters to generate an ensemble of 10,800 unique scenarios using grid search, which explores all possible combinations of hyperparameters to find the best PINN model within the search space. In grid search, we specified a grid of possible values for each hyperparameter to be tuned, defining the search space; the search algorithm then trains the PINN model on every combination of hyperparameter values from the grid, and the optimal set is the combination that yields the best performance metric (see section “Performance metrics”). We aimed to identify the optimal PINN models for the homogeneous and heterogeneous cases. The hyperparameters varied were the number of hidden layers, the number of nodes per hidden layer, the learning rate, the number of epochs, the number of collocation points, and the number of training data points for the head; Table 1 lists the values chosen for these parameters. To run the 10,800 realizations, we used the high-performance computing cluster Tahoma30 at the Environmental Molecular Sciences Laboratory, which enabled us to distribute the evaluation of different hyperparameter combinations across multiple processors. Tahoma has 184 Intel Cascade Lake nodes, each with a clock speed of 3.1 GHz; each compute node has two 18-core sockets (36 Intel Xeon Gold 6254 cores per node) and 384 GB of memory, giving 10 2/3 GB per core. By distributing the search, we reduced the overall tuning time to less than 24 h.
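
Schematically, the grid search amounts to enumerating the Cartesian product of the hyperparameter grid and training one model per combination. The sketch below uses only values quoted in this paper (the complete grid is in Table 1), and `train_pinn` is a hypothetical stand-in for the actual training routine.

```python
import itertools

# Grid values quoted in the text (abridged); the complete grid is in Table 1.
grid = {
    "n_hidden_layers": [1, 2, 3],
    "n_nodes":         [10, 25, 50],
    "learning_rate":   [1e-4, 1e-2, 1e-1],
    "n_epochs":        [10_000, 80_000, 100_000],
    "n_collocation":   [11, 21, 41, 81, 161],
    "n_train_data":    [0, 7, 20, 40, 80, 160],
}

# One scenario per combination; each scenario trains independently, so the
# list can be scattered across the cores of an HPC cluster.
scenarios = [dict(zip(grid, combo)) for combo in itertools.product(*grid.values())]

# best = min(scenarios, key=train_pinn)  # train_pinn: hypothetical trainer
```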

Table 1 The chosen hyperparameters and their ranges for finding the best PINN model for homogeneous and heterogeneous scenarios.

Performance metrics

In addition to the loss function, we used the mean squared error (MSE), coefficient of determination (\(R^2\)), mean bias error (MBE), and Nash-Sutcliffe efficiency (spatial version of NSE) performance metrics31,32,33,34 to evaluate the PINN and FV methods. The MSE, \(R^2\), MBE, and NSE are given by:

$$\begin{aligned} MSE&= \frac{1}{n} \sum \limits _{i = 1}^n (h_{\textrm{a}_i} - h_{\textrm{p}_i})^2, \end{aligned}$$
(15a)
$$\begin{aligned} R^2&= 1 - \frac{\sum \nolimits _{i = 1}^n {{{\left( {{h_{\textrm{a}_i}} - h_{\textrm{p}_i}} \right) }^2}}}{\sum \nolimits _{i = 1}^n {{{\left( {{h_{\textrm{a}_i}} - {{\bar{h}}_{\textrm{a}}}} \right) }^2}}}, \end{aligned}$$
(15b)
$$\begin{aligned} MBE&= \frac{1}{n \cdot \bar{h}_{\textrm{a}}} \times \sum \limits _{i = 1}^n (h_{\textrm{p}_i} - h_{\textrm{a}_i}), \end{aligned}$$
(15c)
$$\begin{aligned} NSE&= 1 - \frac{\sum \nolimits _{i = 1}^n (h_{\textrm{p}_i} - h_{\textrm{a}_i})^2}{\sum \nolimits _{i = 1}^n (h_{\textrm{a}_i} - \bar{h}_{\textrm{a}})^2}, \end{aligned}$$
(15d)

where n is the number of hydraulic heads, \(h_{\textrm{a}}\) is the head from the analytical solution, \(h_{\textrm{p}}\) is the predicted head from the PINN (or FV) model, and \({\bar{h}}_{\textrm{a}}\) is the mean of the heads derived from the analytical solution. For a perfect model, the MSE, \(R^2\), MBE, and spatial NSE would be 0, 1, 0, and 1, respectively32,33. A positive MBE indicates that the model tends to overestimate the true values, a negative MBE indicates underestimation, and an MBE of zero implies that the model predictions are, on average, unbiased and align closely with the true values.
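
Eqs. 15a–15d translate directly into array operations; the short sketch below (our naming) computes all four metrics from the analytical and predicted heads.

```python
import numpy as np

def performance_metrics(h_a, h_p):
    """MSE, R^2, MBE, and spatial NSE per Eqs. 15a-15d.

    h_a : heads from the analytical solution
    h_p : heads predicted by the PINN (or FV) model
    """
    resid = h_a - h_p
    var = np.sum((h_a - h_a.mean()) ** 2)              # spread about the mean
    mse = np.mean(resid ** 2)                          # Eq. 15a
    r2 = 1.0 - np.sum(resid ** 2) / var                # Eq. 15b
    mbe = np.sum(h_p - h_a) / (h_a.size * h_a.mean())  # Eq. 15c
    nse = 1.0 - np.sum((h_p - h_a) ** 2) / var         # Eq. 15d
    return mse, r2, mbe, nse
```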

Results and discussion

This section will present the results of tuning the PINN’s hyperparameters to find the best model. We will then compare the tuned model’s performance with the analytical and FV solutions. Finally, we will assess the mass balance errors associated with the FV and PINN solutions for both homogeneous and heterogeneous cases.

Homogeneous porous media

Tuning PINN for homogeneous media

We trained PINN models for 10,800 different hyperparameter scenarios to find the best model for the homogeneous case, using the loss in Eq. 12 for PINN without training data and Eq. 13 for PINN with training data. The optimal set of hyperparameters was determined by selecting the scenario with the lowest loss value during PINN training. The hyperparameters considered were: the number of training data points (Fig. 3a), the number of hidden layers (Fig. 3b), the number of nodes per hidden layer (Fig. 3c), the learning rate (Fig. 3d), the number of epochs (Fig. 3e), and the number of collocation points (Fig. 3f). We trained the PINN model using 0, 7, 20, 40, 80, and 160 training data points to compare the performance of PINN with and without training data (see Fig. 3a). The results revealed that the PINN model trained without data outperformed those trained with data: of the 1,800 architectures trained without data, the best achieved the lowest loss of \(1.83 \times 10^{-11}\), whereas the best of the 9,000 architectures trained with data attained a lowest loss of \(2.23 \times 10^{-11}\), with 80 training data points (see Table 2). This thorough tuning analysis led us to focus on the best PINN model without training data, providing a robust basis for our discussion.

Figure 3b shows that the tuned PINN achieved the lowest losses across scenarios with 1, 2, and 3 hidden layers, although fewer hidden layers produced a higher upper bound on the best losses, and more hidden layers generally led to lower best-loss values. Out of the 3,600 models with two hidden layers, one achieved the lowest best loss of \(1.83 \times 10^{-11}\) (Table 2). We also investigated how changing the number of nodes (10, 25, and 50) in each hidden layer affected the optimal loss; as shown in Fig. 3c, a PINN model with 50 nodes per hidden layer achieved the lowest loss. Figure 3d shows that relatively larger learning rates result in significantly lower best losses. A low learning rate, such as \(10^{-4}\), can leave the optimizer stuck among the local minima of the non-convex loss function, leading to high best-loss values (Fig. 3d). The learning rate of \(10^{-2}\) yielded the best PINN model, achieving the lowest best loss of \(1.83 \times 10^{-11}\) (Fig. 3d). Figure 3e shows that training the PINN models for more epochs results in lower best losses. For instance, training for only 10,000 epochs produced many scenarios with high best losses, while the lowest best losses were achieved with 100,000 epochs (Fig. 3e).

Figure 3

Best losses for 10,800 hyperparameter scenarios for the homogeneous case using the loss functions in Eqs. 12 and 13. The hyperparameters considered were: (a) the number of training data points, (b) the number of hidden layers, (c) the number of nodes per hidden layer, (d) the learning rate, (e) the number of epochs, and (f) the number of collocation points. (b–e) show that the optimal number of hidden layers, number of nodes per hidden layer, learning rate, and number of epochs are 2, 50, 10\(^{-2}\), and 10\(^5\), respectively, for obtaining the lowest best loss. (a) shows that the PINN model with no data outperforms the PINN models with training data.

Table 2 Hyperparameters and loss of the best two PINN models with and without training data out of 10,800 trained PINN models for the homogeneous case.

Figure 3f illustrates that the number of collocation points has a relatively minor impact on achieving the lowest best loss: models with 41, 81, and 161 collocation points consistently attained the lowest loss values. To determine the optimal number of collocation points, we compared the accuracy of the tuned PINN predictions and the finite volume (FV) solutions against the analytical solution using the mean squared error (MSE) metric. Figure 4a shows that for all collocation points the MSE of the FV solutions converged to double precision (\(< 10^{-28}\)), while the MSE of the best PINN model at each collocation-point count remained relatively higher (\(\approx 10^{-13}\)). Notably, 41 collocation points achieved the lowest MSE for the PINN model. Based on this systematic study, the best set of hyperparameters for the PINN model is two hidden layers, 50 nodes per hidden layer, a learning rate of 0.01, 100,000 epochs, 41 collocation points, and no training data.

Figure 4

(a) MSE of the FV solutions and the best PINN models without training data against the corresponding analytical solutions for all collocation points. (a) shows the optimal number of collocation points for the PINN model, 41, where the PINN shows the minimum MSE. (b) Loss versus epoch during training of the best PINN model, which achieved its best loss of \(1.83 \times 10^{-11}\) at epoch 97,386.

Figure 4b shows the evolution of the loss over the epochs during training of the PINN model with the best set of hyperparameters. The loss fluctuates significantly throughout training. Initially, it decreases rapidly, showing that the model is learning quickly and improving its predictions; however, periodic spikes occur in which the loss increases sharply before decreasing again. This behavior could be due to several factors, including the optimizer’s dynamics, learning rate adjustments, or the non-convex nature of the loss landscape. Despite the frequent fluctuations, the long-term trend of the loss is downward, suggesting that the model gradually improves its performance and reduces the error over time. By the end of the training period (100,000 epochs), the loss reaches a very low value of \(1.83 \times 10^{-11}\) at epoch 97,386, indicating that the PINN has achieved a highly accurate fit to the underlying physical equations.

Hydraulic head solutions for the homogeneous media

The hydraulic heads at each collocation point were calculated using the best PINN model with the lowest loss; the heads at these points were also computed using the analytical and finite volume (FV) methods. The FV and PINN solutions overlap the analytical solution (Fig. 5). Table 3 presents the statistical performance of the FV and PINN methods against the corresponding analytical solution, showing that both methods accurately approximate the true solution for the hydraulic head. The values of MSE, \(R^2\), MBE, and NSE (spatial) between the analytical and FV solutions are \(5.48 \times 10^{-31}\) (approximately the square of machine precision), 1.00, \(-1.54 \times 10^{-15}\), and 1.00, respectively. These metrics show that the FV solution is essentially identical to the analytical solution, demonstrating its accuracy and reliability; the very low MSE and MBE confirm minimal errors and biases, and the high \(R^2\) and NSE values further confirm its precision. Similarly, the values of MSE, \(R^2\), MBE, and NSE (spatial) between the analytical solution and the PINN predictions are \(5.27 \times 10^{-14}\), 1.00, \(-8.92 \times 10^{-7}\), and 1.00, respectively, indicating that the PINN prediction is accurate and closely matches the analytical solution. Although the MSE for the PINN prediction is slightly higher than that for the FV solution, it is still very low, indicating minimal errors.

Figure 5

Solution of hydraulic heads for the homogeneous porous media from the analytical (black line), FV (blue dots), and PINN (red dashes) models.

Table 3 Performance of the FV and PINN with the corresponding analytical solution for the homogeneous case.

Darcy’s flux, local and global mass balance errors

Darcy’s flux is a key indicator of the integrity of a numerical solution in porous media; for this steady-state problem, it should remain constant throughout the domain. We calculated Darcy’s fluxes at all nodes/collocation points within the model domain using Eq. 2. The Darcy fluxes computed from the FV and PINN solutions are nearly constant across the solution domain and close to the analytical solution (Fig. 6a). The mean Darcy fluxes for the analytical, FV, and PINN methods are \(4.833 \times 10^{-2}\) [L/T], \(4.896 \times 10^{-2}\) [L/T], and \(4.896 \times 10^{-2}\) [L/T], respectively. The mean fluxes for the FV and PINN methods are slightly higher than that computed from the analytical solution (Table 4), reflecting numerical errors in the FV and PINN methods.
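
For the PINN, the flux in Eq. 2 can also be evaluated with automatic differentiation rather than finite differences. The fragment below reuses the `h_hat` network and trained `params` from the earlier JAX sketch; it is again our illustration, with the constant K left as a placeholder value.

```python
import jax
import jax.numpy as jnp

def darcy_flux(params, x, K=1.0):
    """q(x) = -K dh/dx (Eq. 2), with dh/dx from autodiff of h_hat."""
    return -K * jax.grad(h_hat, 1)(params, x)

# Flux at every collocation point; for this steady 1D problem it should be
# (numerically) constant across the domain.
x_c = jnp.linspace(0.0, 1.0, 41)
q = jax.vmap(darcy_flux, (None, 0))(params, x_c)
```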

Figure 6

(a) Darcy’s fluxes computed using hydraulic heads from the analytical, FV, and PINN methods for the homogeneous media; the Darcy fluxes of the FV and PINN methods are higher than those of the analytical solution. (b) The LMBE of the FV method is close to machine precision, whereas the LMBE of PINN is higher.

The local mass balance error (LMBE) and global mass balance error (GMBE) in the model domain were calculated using Eqs. 4b and 4c for both the FV and PINN solutions. Analytical solutions are exact and incur no round-off or numerical truncation errors, so their LMBE and GMBE are zero and are not discussed here. The LMBEs of the FV solution range from 0 to \(4.46 \times 10^{-15}\) [M/TL\(^2\)], close to machine precision, with a mean LMBE of \(1.28 \times 10^{-15}\) [M/TL\(^2\)]. In contrast, the LMBEs of the PINN prediction vary from 0 to \(4.46 \times 10^{-6}\) [M/TL\(^2\)], with a mean LMBE of \(4.18 \times 10^{-7}\) [M/TL\(^2\)] (Fig. 6b, Table 4): eight orders of magnitude higher than that of the FV solution. The GMBEs of the FV and PINN solutions are \(5.02 \times 10^{-14}\) and \(1.63 \times 10^{-5}\), respectively (Table 4); the GMBE of the PINN prediction is nine orders of magnitude higher than that of the FV solution.

The LMBE and GMBE for PINN are much higher than those of the FV solution, indicating that PINN does not conserve mass locally or globally (Table 4). This discrepancy arises primarily because FV imposes a strict mass balance constraint when solving Eq. 1. Although PINN accurately computes the heads, it fails to balance the mass because it aims to minimize the residual contributions from the PDE and the BCs instead of setting these residuals to zero. Another potential reason for this mismatch is the neural network’s inability to further optimize its weights and biases as the loss function value decreases, owing to the non-convex nature of PINN training.

Table 4 Calculated mean Darcy’s flux [L/T], mean and maximum LMBE [M/TL\(^2\)], and GMBE [M/TL\(^2\)] of the analytical, FV, and PINN solutions for homogeneous media.

Heterogeneous porous media

Tuning PINN for heterogeneous media

Similar to the homogeneous case, we explored 10,800 hyperparameter scenarios to optimize the PINN for the heterogeneous case, using the loss in Eq. 12 for PINN without training data and Eq. 13 for PINN with training data. The optimal hyperparameters were selected based on the lowest loss value during PINN training. The hyperparameters assessed were the number of training data points (Fig. 7a), the number of hidden layers (Fig. 7b), the number of nodes per hidden layer (Fig. 7c), the learning rate (Fig. 7d), the number of epochs (Fig. 7e), and the number of collocation points (Fig. 7f). We trained the PINN model with 0, 7, 20, 40, 80, and 160 training data points to evaluate the performance of PINN with and without training data (see Fig. 7a). As in the homogeneous case, the PINN model without training data performed better than those trained with data: of the 1,800 models trained without data, the best achieved the lowest loss of \(4.52 \times 10^{-11}\), while the best-performing model among the 9,000 trained with data reached a lowest loss of \(5.95 \times 10^{-11}\), with seven training data points (see Table 5). This comprehensive approach led us to focus on the best PINN model without training data for further analysis. Figure 7b shows that the best losses for 1, 2, and 3 hidden layers vary depending on the other hyperparameter settings, although the PINN models generally achieved low best losses; out of the 3,600 models with two hidden layers, one attained the lowest best loss of \(4.52 \times 10^{-11}\) (Table 5). Adjusting the number of nodes (10, 25, and 50) per hidden layer had similar effects: depending on the other hyperparameters, the best losses could be higher or lower for any node count (Fig. 7c), and a PINN model with ten nodes per hidden layer achieved the lowest best loss. Figure 7d shows that relatively larger learning rates lead to significantly lower best losses; a low learning rate, such as \(10^{-4}\), results in higher best losses, and the best PINN model achieved the lowest best loss of \(4.52 \times 10^{-11}\) with a learning rate of \(10^{-1}\). Figure 7e shows that training for more epochs leads to lower best losses, whereas fewer epochs resulted in significantly higher ones; models trained for 80,000, 90,000, and 100,000 epochs exhibited similar best-loss values, with the model trained for 80,000 epochs achieving the lowest. The lowest best loss for a PINN model was achieved with 11 collocation points (Fig. 7f). To determine the optimal number of collocation points for comparison, we analyzed the convergence of the PINN prediction and the FV solution to the analytical solution using the mean squared error metric (MSE, Eq. 15a). The MSE of the FV solution decreases as the number of collocation/grid points within the solution domain increases.
Increasing the number of collocation/grid points allows the FV method to better approximate the continuous partial differential equation governing the physical phenomena, decreasing the MSE and improving the solution’s accuracy (Fig. 8a). The MSE of the best PINN model for 11, 21, and 41 collocation points is lower than that of the FV solutions, illustrating the spatial-resolution independence of the PINN method in predicting accurate solutions (Fig. 8a). For further analysis and comparison of the PINN prediction with the FV solution, we selected 81 collocation points, where the MSE of both methods is comparable (Fig. 8a): \(1.56 \times 10^{-12}\) for the PINN prediction and \(9.27 \times 10^{-13}\) for the FV solution. Finally, the optimal hyperparameter values for the PINN model are two hidden layers, 10 nodes per hidden layer, a learning rate of 0.1, 80,000 epochs, 81 collocation points, and no training data.

Figure 7

Best losses for 10,800 hyperparameter scenarios for the heterogeneous case using the loss functions in Eqs. 12 and 13. The hyperparameters considered were: (a) the number of training data points, (b) the number of hidden layers, (c) the number of nodes per hidden layer, (d) the learning rate, (e) the number of epochs, and (f) the number of collocation points. (b–e) show that the optimal number of hidden layers, number of nodes per hidden layer, learning rate, and number of epochs are 2, 10, 10\(^{-1}\), and \(8 \times 10^{4}\), respectively, to obtain the best loss. (a) shows that the PINN model with no data performs better than the PINN models with training data.

Table 5 Optimal hyperparameters of the best PINN models with and without training data out of 10,800 trained PINN models for the heterogeneous case.

Figure 8

(a) MSE of the FV solutions and the best PINN models without training data against the corresponding analytical solutions for all collocation points for heterogeneous media. (a) shows the optimal number of collocation points for the FV and PINN models, 81, where both solutions show comparable MSE. (b) Loss versus epoch during training of the best PINN model with 81 collocation points; the model obtained its best loss of \(4.40 \times 10^{-7}\) at epoch 51,030.

Figure 8b shows the changes in loss over the epochs of the PINN model during training with the best set of hyperparameters. The loss fluctuates significantly during training: initially it decreases rapidly, indicating that the model is learning quickly and improving its predictions, but periodic spikes occur in which the loss increases sharply before decreasing again. By the end of the 80,000-epoch training period, the loss reaches a very low value of \(8.40 \times 10^{-9}\) at epoch 51,030, indicating that the PINN has achieved a highly accurate fit to the underlying physical equations.

Hydraulic head solutions for the heterogeneous media

The best PINN model, which achieved the lowest loss, was used to calculate the hydraulic heads at each collocation point; the heads at these points were also computed using the analytical and FV methods. The FV solution and the PINN prediction closely follow the analytical hydraulic heads (Fig. 9). The performance evaluation metrics show that the FV and PINN methods accurately approximate the true solution for the hydraulic head (Table 6). For the FV solution, the mean squared error (MSE) is \(9.27 \times 10^{-13}\), the coefficient of determination (\(R^2\)) is 1.00, the mean bias error (MBE) is \(2.09 \times 10^{-6}\), and the spatial Nash-Sutcliffe efficiency (NSE) is 1.00. These metrics indicate that the FV solution is very close to the analytical solution, demonstrating high accuracy and reliability; the low MSE and MBE confirm minimal errors and biases, and the \(R^2\) and NSE values of 1.00 underscore its precision. For the PINN prediction, the MSE is \(1.56 \times 10^{-12}\), \(R^2\) is 1.00, MBE is \(3.47 \times 10^{-6}\), and NSE (spatial) is 1.00, indicating that the PINN prediction is highly accurate and closely matches the analytical solution. Although the MSE for the PINN prediction is slightly higher than that for the FV solution, it remains very low, suggesting minimal errors.

Figure 9

Solution of hydraulic heads for the heterogeneous porous media with the analytical, FV, and best PINN models. The PINN solution for the 1D steady-state flow problem agrees excellently with the analytical and FV solutions.

Table 6 Statistical analysis for performance evaluation of the FV and PINN with the corresponding analytical solution for the heterogeneous case.

Darcy’s flux, local and global mass balance errors

We calculated Darcy’s fluxes at all nodes/collocation points across the model domain to assess the reliability of the solutions to the steady-state problem in the heterogeneous case; as before, Darcy’s flux should remain constant throughout the model domain. The mean Darcy fluxes for the analytical, FV, and PINN methods are \(3.333 \times 10^{-2}\) [L/T], \(3.361 \times 10^{-2}\) [L/T], and \(3.361 \times 10^{-2}\) [L/T], respectively (Table 7). Due to numerical error, the Darcy fluxes for the FV and PINN methods are slightly higher than those of the analytical solution (Fig. 10). While the mean Darcy fluxes of the FV and PINN solutions are the same, the Darcy fluxes of the PINN prediction are not constant across the domain (Fig. 10a). These results indicate that although PINN can produce accurate head predictions over the solution domain, it does not maintain the integrity of the solution.

Figure 10

(a) Darcy’s fluxes using hydraulic heads obtained from the analytical, FV, and PINN solutions for the heterogeneous media. The Darcy fluxes of the FV and PINN methods are higher than those of the analytical solution, and the Darcy fluxes of the PINN solution are not constant across the domain. (b) Local mass balance error (LMBE) of the FV and PINN solutions; the LMBEs of the PINN method are higher than those of the FV method.

The maximum and mean LMBE of the FV solution are \(2.43 \times 10^{-7}\) [M/TL\(^2\)] and \(5.73 \times 10^{-8}\) [M/TL\(^2\)], respectively, whereas the maximum and mean LMBE of the PINN prediction are \(7.71 \times 10^{-5}\) [M/TL\(^2\)] and \(2.05 \times 10^{-5}\) [M/TL\(^2\)] (Fig. 10b, Table 7). The mean LMBE of the PINN prediction is three orders of magnitude higher than that of the FV solution. The GMBEs of the FV and PINN solutions are \(5.02 \times 10^{-14}\) and \(1.63 \times 10^{-5}\), respectively (Table 7); the GMBE of the PINN prediction is again orders of magnitude higher than that of the FV solution.

The LMBE and GMBE for PINN are higher than those of the FV solution, suggesting that PINN does not conserve mass locally or globally (see Table 7). One of the main reasons for the significant difference in LMBE and GMBE between the FV and PINN solutions is that the FV method strictly enforces a mass balance constraint when solving Eq. 3. Although the PINN model accurately computes the head values, it fails to maintain mass balance because its primary goal is to minimize the global residual contributions from the PDE and BCs rather than setting these residuals exactly to zero.

Table 7 Computed mean Darcy’s flux [L/T], mean and maximum LMBE [M/TL\(^2\)], and GMBE [M/TL\(^2\)] of the analytical, FV, and PINN solutions for heterogeneous media.

Conclusions

PINN offers an alternative approach to solving physical problems described by partial differential equations. We examined whether PINN preserves local and global mass and assessed its performance against the FV method. This involved solving a steady-state 1D groundwater flow equation to determine hydraulic heads using analytical, FV, and PINN approaches for homogeneous and heterogeneous media scenarios. We tuned hyperparameters by generating 10,800 unique scenarios per case through grid search. Subsequently, we evaluated FV and PINN accuracy using the MSE, \(R^2\), MBE, and NSE (spatial) metrics relative to the analytical solution for hydraulic heads. Lastly, we assessed the integrity of the PINN models by analyzing Darcy’s flux and the local and global mass balance errors.

In the case of homogeneous media, the 10,800 hyperparameter tuning scenarios identified the optimal PINN configuration as two hidden layers with 50 nodes per hidden layer, a learning rate of 0.01, 100,000 epochs, 41 collocation points, and no training data. The MSE, \(R^2\), MBE, and NSE (spatial) scores were \(5.27 \times 10^{-14}\), 1.0, \(-8.92 \times 10^{-7}\), and 1.0, respectively, indicating the reliability and accuracy of the PINN method in predicting hydraulic heads. The Darcy fluxes from the FV and PINN methods are similar and relatively constant, resembling the analytical solution, although slightly higher due to numerical truncation error. The mean LMBE and GMBE of the PINN prediction are eight and nine orders of magnitude higher, respectively, than those of the FV solution. These higher LMBE and GMBE values suggest that the PINN method struggles to conserve mass locally and globally while solving a PDE problem.

For the heterogeneous media case, convergence analysis and comparison with the FV solution led us to select two hidden layers, ten nodes per hidden layer, a learning rate of 0.1, 80,000 epochs, 81 collocation points, and no training data for PINN training and prediction. PINN again demonstrated reliability and accuracy in predicting the hydraulic heads: the MSE, \(R^2\), MBE, and spatial NSE scores are \(1.56 \times 10^{-12}\), 1.0, \(3.47 \times 10^{-6}\), and 1.0, respectively. While the mean Darcy fluxes of the finite volume and PINN solutions are consistent, the Darcy fluxes of the PINN prediction fluctuate within the solution domain. This contradicts the expected behavior, since Darcy’s flux should remain constant throughout the solution domain for a steady-state problem; the fluctuation indicates a lack of integrity in the PINN solution. The mean LMBE and GMBE of the PINN prediction are three orders of magnitude higher than those of the FV solution. As in the homogeneous case, PINN fails to conserve mass locally and globally while solving the PDE for the heterogeneous media case.

In both the homogeneous and heterogeneous cases, the significant difference in LMBE and GMBE between the FV and PINN solutions stems from FV’s imposition of a strict mass balance constraint, whereas PINN enforces this constraint only softly when solving PDE problems. Although PINN calculates accurate heads, it struggles to balance the mass because it aims to minimize the residual contributions from the PDE and the BCs instead of precisely setting these residuals to zero. Another potential reason for this discrepancy is the neural network’s inability to further optimize its weights and biases as the loss function value decreases, owing to the non-convex nature of PINN training. These findings highlight the limitations of PINN in applications where local or global mass conservation is crucial.