Introduction

Tight sandstone oil is China’s most promising strategic replacement resource for petroleum. Capillary pressure and relative permeability, two of the most critical properties of tight sandstones, are extensively used to model oil migration in tight sandstone reservoirs (Blunt 2017; Wang et al. 2022). Because reservoirs are large, the efficient and accurate assessment of capillary pressure and relative permeability for numerous reservoir rock samples from a limited set of routine parameters is essential for reservoir classification, reservoir modeling, and productivity prediction. However, current software for estimating capillary pressure and relative permeability is applicable only to conventional reservoirs, not to unconventional tight reservoirs. Tight reservoirs are highly heterogeneous and show significant variations in porosity and permeability. Moreover, tight sandstone usually has multiscale pores (including numerous nanopores and micropores), multiple fluid flow mechanisms (including non-Darcy and Darcy flow), and complex wettability. All of this makes the conventional estimation methods for capillary pressure and relative permeability unsuitable for tight sandstone (Xiao et al. 2017; Zhou et al. 2022). Thus, it is necessary to develop a new two-phase flow prediction method for tight sandstone reservoirs.

Theoretical models for multiphase flow in rock have been researched for several years (Valvatne and Blunt 2004). The three primary theoretical methods to simulate multiphase fluid flow in porous media are the direct numerical method, the dynamic pore network model (PNM), and the quasi-static pore network model. The direct numerical method usually solves the Navier–Stokes equations on a meshed 3D digital core of the porous media by traditional computational fluid dynamics algorithms (Ferrari and Lunati 2014). However, when simulating multiphase flow in samples with multiscale pore space (such as tight sandstone and shale cores), most direct methods are very time-consuming and suffer from serious numerical instability (Porter et al. 2009; Raeini et al. 2012). The dynamic and quasi-static pore network models simulate the multiphase fluid flow in a simplified pore network extracted from the actual 3D digital core of the porous media (Al-Gharbi and Blunt 2005; Mehmani et al. 2013; Qin 2015). The dynamic pore network model can simulate the transient flow characteristics of the multiphase fluid in the network (Joekar-Niasar et al. 2010; Zhang et al. 2014; Gong et al. 2021a, b). Its computational cost is very high because it usually calculates the coupled fluid pressure field by solving mass conservation equations for all phases simultaneously throughout the pore network. Recently, a new, cost-effective, heavily parallelized dynamic pore network modeling framework has been developed to study two-phase flow in rough-walled fractures (Gong et al. 2021a, b). It dramatically improves the computational efficiency by using a 2D equivalent pore network of the fracture and parallel computing. However, the dynamic pore network model for tight sandstone with multiscale pore structures still needs much research.
The computational efficiency of the quasi-static pore network method is much higher than that of the dynamic one because it does not need to solve the pressure field (Idowu and Blunt 2010; Zhao et al. 2010). However, all of the above theoretical models require the 3D structure and pore network of the rock sample to simulate capillary pressure and relative permeability accurately. Obtaining the 3D structures and pore networks of many tight sandstones in a tight oil reservoir is usually very difficult. Therefore, the theoretical models cannot be directly used to calculate the capillary pressure and relative permeability of the numerous samples needed for reservoir modeling. It is worthwhile to develop new approaches that estimate the capillary pressure and relative permeability of two-phase flow using only routine parameters (such as porosity, permeability, and wettability).

Over the past few decades, many scholars have researched prediction methods for capillary pressure and relative permeability. Leverett proposed the J-Function method to calculate capillary pressure curves that reflect the complexity of the whole reservoir (Leverett 1941). The J-Function averages the capillary pressure curve using several parameters (porosity, permeability, wettability, and interfacial tension). It is the most commonly used method in reservoir modeling and applies to conventional homogeneous reservoirs. However, for tight reservoirs with significant variations in porosity and permeability, the J-Function values are dispersed and may lead to large errors. Burdine derived equations to calculate the relative permeability from the pore size distribution based on the fluid flow laws in porous media (Burdine 1953). Other prediction models have been proposed to represent the relationship among capillary pressure, relative permeability, and petrophysical properties (porosity, permeability, wettability, etc.) (Jason et al. 2007; Standnes 2009; Hou et al. 2012). However, the parameters in these prediction models need to be fitted for specific rock samples and fluids, which limits their application range and computational accuracy. The significant advancements in machine learning methods offer a different and more efficient way of predicting capillary pressure and relative permeability in porous media (Golsanami et al. 2015; Xiao et al. 2016; You et al. 2018; Arigbe et al. 2019). You et al. (2018) utilized an artificial neural network and the particle swarm optimization method to calculate the capillary pressure curves for rocks in both homogeneous and heterogeneous reservoirs. The results demonstrate that the PSO-BP network exhibits higher accuracy than the J-Function method. However, only capillary pressure curves of mercury injection were considered, and oil–water flow is much more complex than mercury injection.
A prediction model of capillary pressure and relative permeability for two-phase flow in sandstone has been constructed based on the capillary tube model and a neural network (Liu et al. 2019). Note that capillary tubes are too simple to describe the two-phase flow in tight sandstone, so the prediction model still needs to be improved. The pore network model (PNM) and computational fluid dynamics (CFD) have been used to construct training data, and machine learning methods have then been trained on these datasets to predict the two-phase flow in porous media (Rabbani et al. 2020; Zhao et al. 2020; Zeinedini et al. 2022). This approach suggests a way to improve the accuracy of the prediction models. Recently, Yoga et al. (2022) developed a physics-informed data-driven approach to predict the relative permeability based on an ANN, using physical limits within a specific space as constraint conditions. The results showed that physical limits could improve predictability outside the region of measured data. However, the relative permeability was predicted as a function of phase saturation and phase connectivity, while other important parameters such as wettability, pore structure, and capillary number were held constant. This assumption is too idealistic and limits the applicability range of the prediction model. All the above studies focus on homogeneous porous media with relatively simple properties, whereas the two-phase fluid flow in tight sandstone is more complicated than in conventional sandstone. Also, most current prediction models based on machine learning methods rely only on correlations and lack a rigorous physical basis. In conclusion, the rapid prediction of capillary pressure and relative permeability for two-phase fluid flow in tight sandstone is still an immature research field and needs more research.

This paper introduces a physics-informed neural network to simulate the capillary pressure and relative permeability curves in tight sandstone. The paper is organized as follows. First, the physics-informed neural network, which combines neural networks, the improved parallel genetic algorithm (PGA), and physical models, is described in detail. Second, by analyzing actual rock samples from the Ordos Basin of China, a tight sandstone dataset including a variety of petrophysical and fluid flow characteristics is constructed, which provides adequate data for training the physics-informed neural network. Third, the prediction model of capillary pressure and relative permeability curves in tight sandstone is established based on the dataset and the physics-informed neural network. Finally, the prediction model is validated and some important results are summarized.

Methodology

In this section, we will describe the physics-informed neural network that we use to predict the capillary pressure and relative permeability curves of oil–water flow in tight sandstone. The structure of the physics-informed neural network is shown in Fig. 1. Due to the existence of the multiscale pore structure and complex mineral component, nonlinear multiphase flow and wetting behavior exist in tight sandstone, and the estimation of the capillary pressure and relative permeability is very complicated. Fortunately, the physics that govern the nonlinear flow in tight sandstone have been studied and understood. Thus, physical models can be combined with the neural network to help us to build the physics-informed prediction model. Note that the genetic algorithm (GA) is one of the most successful optimization algorithms that can deal with highly nonlinear problems. But the learning rate of the traditional GA is slow. Considering that the parallel algorithm can improve computational efficiency and the oil–water flow in tight sandstone is complex, the PGA is combined with the rock typing method to enhance the precision of the prediction model.

Fig. 1

The structure of the physics-informed neural network (The green circles are the input parameters of the physics-informed neural network. The red circles are the neurons of the physical layers. The yellow circles are the neurons of the hidden layers. The brown circles are the output of Pc, and they are also part of the input for the training of the relative permeability. The purple circles are the output of water and oil fluxes, and the light blue circles are the output of Smax,o. The sky-blue circles are the neurons of the transformation layers. The pink circles are the output of the relative permeability.)

Figure 1 shows that the physics-informed neural network includes five machine learning structures: the input layers, the physical layers, the fully connected neural networks, the transformation layers, and the output layers. The task of the physics-informed neural network is to map the input parameters to the capillary pressure (relative permeability) of the oil–water flow. In the following, we describe the physics-informed neural network in detail.

In the input layer of the physics-informed neural network, input parameters are required. From the analysis in the Appendix, we know that the main controlling factors of the oil–water flow in tight sandstone are the pore size distribution, tortuosity, contact angle, wettability index, oil–water interfacial tension, and slip length. The input parameters should be derived from them. The tortuosity, contact angle, and wettability index of the rock samples are not routine parameters and are difficult to obtain. However, they can be derived indirectly from routine parameters (porosity, permeability, mineral composition) based on physical models. Thus, the required input parameters can be simplified to the pore size distribution, porosity, permeability, and mineral composition.

The physical models for the neural network

As shown in Fig. 1, we combine the neural network with physical models by adding the physical and transformation layers to the neural network.

The first physical model is the gas apparent permeability formula for tight sandstone. As mentioned above, tortuosity is the main controlling factor, but it is not easily available. Fortunately, it can be derived indirectly from porosity and gas apparent permeability based on the gas apparent permeability formula. Due to the strong nonlinear gas flow at low pressure, we use the gas apparent permeability formula accounting for slip flow and Knudsen diffusion developed by Jiang to derive the tortuosity (Jiang et al. 2017). And the tortuosity derived from the gas apparent permeability formula is as follows

$$\left\{ {\begin{array}{*{20}l} {\tau = \frac{{2R_{nt} \mu_{g} \emptyset_{f} }}{{3\rho_{{{\text{avg}}}} K_{om} }}\left( {\frac{{d_{m} }}{{2R_{nt} }}} \right)^{{D_{f} - 2}} \sqrt {\frac{8M}{{\pi RT}}} + \frac{{\emptyset_{f} R_{nt}^{2} }}{{8K_{om} }} \times \left( {1 + \sqrt {\frac{8M}{{\pi RT}}} \frac{{\mu_{g} }}{{R_{{{\text{avg}}}} p_{{{\text{avg}}}} }}\left( {\frac{2}{\alpha } - 1} \right)} \right)} \hfill \\ {R_{nt} = \left( {\frac{1}{N}\sum\limits_{i = 1}^{N} {\left( {\frac{1}{{r_{i} }}} \right)^{{D_{f} - 3}} } } \right)^{{ - \frac{1}{{D_{f} - 3}}}} } \hfill \\ {R_{{{\text{avg}}}} = \left( {\frac{1}{N}\sum\limits_{i = 1}^{N} {\frac{1}{{r_{i} }}} } \right)^{ - 1} } \hfill \\ \end{array} } \right.$$
(1)

where \({r}_{i}\), \({K}_{om}\), and \({D}_{f}\) are the ith pore radius, the gas apparent permeability, and the surface fractal dimension of the pore, respectively. N and \({\varnothing }_{f}\) refer to the total pore number and the flowing porosity. M is the molar mass. \({\mu }_{g}\) is the gas viscosity. T is the temperature of the experiment. R is the gas constant. The pore network analysis of the actual tight sandstone samples shows that the values of \({D}_{f}\) lie between 2.1 and 2.6 (as shown in the figure below), and most of them are 2.4 (Jiang et al. 2017), so \({D}_{f}\) is assumed to be 2.4 in our study. This physical model is added in Physical Layer 1 in the physics-informed neural network.
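To make the two radius averages in Eq. 1 concrete, the following Python sketch evaluates \(R_{nt}\) and \(R_{avg}\) for a list of pore radii. It is only an illustration under stated assumptions: the function name and the sample radii are hypothetical, and the remaining terms of Eq. 1 are not evaluated here.

```python
import numpy as np

def effective_radii(radii, d_f=2.4):
    """Return (R_nt, R_avg) as defined in Eq. 1 for an array of pore radii."""
    r = np.asarray(radii, dtype=float)
    exponent = d_f - 3.0  # negative when 2 < D_f < 3
    # R_nt is a power mean of the radii with exponent (3 - D_f)
    r_nt = np.mean((1.0 / r) ** exponent) ** (-1.0 / exponent)
    # R_avg is the harmonic mean of the radii
    r_avg = 1.0 / np.mean(1.0 / r)
    return r_nt, r_avg

# Hypothetical pore radii in nanometres, for illustration only
radii_nm = np.array([5.0, 20.0, 80.0, 400.0])
r_nt, r_avg = effective_radii(radii_nm)
print(f"R_nt = {r_nt:.2f} nm, R_avg = {r_avg:.2f} nm")
```

For a sample with identical pores both averages reduce to the common radius, while for a broad distribution \(R_{avg}\) is pulled toward the smallest pores more strongly than \(R_{nt}\).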

The second physical model is related to capillary pressure and wettability. Many studies have shown that capillary pressure is mainly determined by interfacial tension, wettability, and pore size. Usually, they satisfy the following relationship

$$Pc = \frac{2\sigma \cos \left( \theta \right)}{r}$$
(2)

where Pc is the capillary pressure. r and σ refer to the inscribed radius and the oil–water interfacial tension, respectively. \(\theta\) represents the oil–water contact angle. Many studies have shown that the contact angle and wettability index strongly correlate with mineral composition (Wang et al. 2022; Wu et al. 2022). Generally, one mineral corresponds to one contact angle (wettability property). Accordingly, the mineral that covers the largest proportion of a pore wall determines the wettability of that pore. In this way, each pore is assigned one mineral characteristic. After analyzing the rock samples of tight sandstone in the Ordos Basin, we find that the tight sandstone usually consists of several mineral grains, such as quartz, clay, dolomite, calcite, and feldspar. The wettability property (contact angle) of each mineral can be measured by experiment. The oil–water interfacial tension is mainly determined by the properties of the fluids in the pores and can also be measured by experiment. Thus, once the pore size distribution and the mineral composition are obtained from the input layer, we can use Eq. 2 to make a preliminary estimate of the capillary pressure. This physical model is used in Physical Layer 1 in the physics-informed neural network.
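As a minimal sketch of how Eq. 2 can be combined with the dominant-mineral rule described above, the Python example below assigns a contact angle to a pore from its most abundant wall mineral and then evaluates the capillary pressure. The contact angles, the interfacial tension, the pore radius, and all function names are hypothetical placeholders, not measured values from this study.

```python
import numpy as np

# Hypothetical contact angles in degrees, for illustration only
CONTACT_ANGLE_DEG = {"quartz": 40.0, "clay": 60.0, "calcite": 110.0}

def dominant_mineral(wall_fractions):
    """Return the mineral that covers the largest share of a pore wall."""
    return max(wall_fractions, key=wall_fractions.get)

def capillary_pressure(radius_m, sigma_n_per_m, mineral):
    """Eq. 2: Pc = 2 * sigma * cos(theta) / r, with theta set by the mineral."""
    theta = np.deg2rad(CONTACT_ANGLE_DEG[mineral])
    return 2.0 * sigma_n_per_m * np.cos(theta) / radius_m

# One pore: 50 nm inscribed radius, assumed interfacial tension of 0.03 N/m
pore_wall = {"quartz": 0.55, "clay": 0.30, "calcite": 0.15}
mineral = dominant_mineral(pore_wall)
pc = capillary_pressure(50e-9, 0.03, mineral)
print(f"{mineral}: Pc = {pc:.3e} Pa")
```

A contact angle above 90° gives a negative value, which corresponds to a pore wall that is wetted by the other phase.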

The third physical model is the transformed formula of the maximum oil saturation. Because of the corner, mixed wet, and water-blocking influence, the water cannot be completely replaced by oil in the pore network. We define the maximum oil saturation at the end of the oil–water flow as \({S}_{max,o}\). And \({S}_{max,o}\) is used to transform the oil (water) saturation to 1.0 (0.0) for the convenience of the prediction of the capillary pressure. The transformed formula is as follows:

$$S_{w}^{\prime } = 1 - \frac{{1 - S_{w} }}{{S_{\max ,o} }}$$
(3)

where \({S}_{w}\) and \({S}_{w}{\prime}\) are the actual and transformed water saturation, respectively. This physical model is also used in Physical Layer 1 in the physics-informed neural network.

The fourth physical model is the flow rate formula for different pores in tight sandstone. Generally, there are many nanopores in the tight sandstone with a multiscale pore space. Afsharpoor and Javadpour have proved that the liquid slip effect in nanopores cannot be neglected (Afsharpoor and Javadpour 2016). At the same time, there is only linear fluid flow in micropores. Therefore, when predicting relative permeability in tight sandstone, the flow rate calculation in nanopores and micropores should be different. The flow rate formulas in nanopores and micropores can be written as follows

$$\left\{ {\begin{array}{*{20}l} {Q_{{{\text{nanopore}}}} = \frac{{A^{2} }}{{\mu_{l} L}}\left[ {a + bL_{sd} + cG + dL_{sd}^{2} + eG^{2} + fL_{sd} G} \right]\Delta p} \hfill \\ {Q_{{{\text{micropore}}}} = \frac{{\pi A^{2} }}{{\mu_{l} L}}\Delta p} \hfill \\ \end{array} } \right.$$
(4)

where \({Q}_{nanopore}\) and \({Q}_{micropore}\) are the flow rate in nanopores and micropores. The coefficients a–f are six fitting constants in the reference (Afsharpoor and Javadpour 2016). A is the cross-sectional area. P refers to the perimeter, \(\Delta p\) is the pressure drop, and L is the duct length. \({\mu }_{l}\) is the viscosity of oil. G is the dimensionless shape factor. \({L}_{sd}\) is dimensionless slip length and can be calculated as follows

$$L_{sd} = \frac{{L_{s} }}{\sqrt A }$$
(5)

where \({L}_{s}\) represents the slip length. It should be noted that numerous experiments and molecular dynamics simulations have shown that the slip length generally ranges from nanometers to tens of nanometers. We assume the slip length is 10 nm in the physics-informed neural network. This physical model is used in Physical Layer 2 in the physics-informed neural network.
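The nanopore branch of Eq. 4 together with Eq. 5 can be sketched in Python as follows. The six coefficients a–f are passed in as an argument because their published values are not reproduced in this section; the numbers used in the example are placeholders chosen only to exercise the code, and the function names and pore geometry are likewise hypothetical.

```python
import math

def dimensionless_slip_length(slip_length, area):
    """Eq. 5: L_sd = L_s / sqrt(A)."""
    return slip_length / math.sqrt(area)

def nanopore_flow_rate(area, length, viscosity, dp, shape_factor,
                       slip_length, coeffs):
    """Nanopore branch of Eq. 4 with fitting constants coeffs = (a, ..., f)."""
    a, b, c, d, e, f = coeffs
    l_sd = dimensionless_slip_length(slip_length, area)
    g = shape_factor
    poly = a + b * l_sd + c * g + d * l_sd**2 + e * g**2 + f * l_sd * g
    return area**2 / (viscosity * length) * poly * dp

# Placeholder coefficients, NOT the published fitting constants
coeffs = (0.02, 0.3, 0.1, 0.0, 0.0, 0.0)
area = 4e-16  # m^2, a hypothetical 20 nm x 20 nm duct
q_slip = nanopore_flow_rate(area, 1e-6, 1e-3, 1e5, 0.06, 10e-9, coeffs)
q_no_slip = nanopore_flow_rate(area, 1e-6, 1e-3, 1e5, 0.06, 0.0, coeffs)
print(q_slip / q_no_slip)
```

With a positive coefficient on \(L_{sd}\), the 10 nm slip length assumed in the text increases the computed flow rate relative to the no-slip case, which is the qualitative effect the slip term is meant to capture.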

The fifth physical model is added in the transformation layer, which processes the output from the training process to obtain the relative permeability for different water saturation. This process consists of two steps: one step is the inverse transform of the water saturation. We have transformed the water saturation to 1.0 in the above physical layer. Thus in this step, the water saturation is inversely transformed by the maximum oil saturation obtained from the output result, as shown in Eq. 6. The other step is to inversely transform the relative permeability, as shown in Eq. 7.

$$S_{w} = 1 - \left( {1 - S_{w}^{\prime } } \right) \times S_{\max ,o}$$
(6)
$$k_{r,o} = \frac{{q_{o,w}^{o} }}{{q_{o} }}$$
(7)

where the range of \({S}_{w}{\prime}\) is from 0 to 1.
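The saturation transform and its inverse can be checked numerically. The sketch below takes Eq. 3 as the algebraic inverse of Eq. 6; the function names and the value of \(S_{max,o}\) are assumptions made for illustration.

```python
import numpy as np

def to_transformed(s_w, s_max_o):
    """Eq. 3: map actual water saturation onto the interval [0, 1]."""
    return 1.0 - (1.0 - np.asarray(s_w, dtype=float)) / s_max_o

def to_actual(s_w_prime, s_max_o):
    """Eq. 6: recover actual water saturation from the transformed value."""
    return 1.0 - (1.0 - np.asarray(s_w_prime, dtype=float)) * s_max_o

s_max_o = 0.7  # hypothetical maximum oil saturation
s_w = np.linspace(1.0 - s_max_o, 1.0, 5)  # reachable water saturations
s_w_prime = to_transformed(s_w, s_max_o)
s_w_back = to_actual(s_w_prime, s_max_o)
print(s_w_prime)
```

The lowest reachable water saturation, \(1 - S_{max,o}\), maps to 0 and full water saturation maps to 1, so every sample shares the same transformed axis regardless of its own \(S_{max,o}\).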

In sum, five physical models are coupled into the neural network by adding physical and transformation layers. These physical models help describe the complex multiphase flow in tight sandstone and improve the accuracy of the neural network.

The algorithm for the physics-informed neural network

After demonstrating the structure of the physics-informed neural network, we will illustrate the algorithm for solving the weights and the threshold of the physics-informed neural network. The weights and the threshold in the physics-informed neural network can be obtained by the improved parallel genetic algorithm (PGA) and the back propagation algorithm (Fig. 2).

Fig. 2

The algorithm flowchart for the physics-informed neural network

In the first step, the input parameters are supplied, and the topology of the physics-informed neural network is determined. In the second step, the input parameters are processed by physical models 1–3, and the outputs of the physical layer are obtained.

In the third step, the weights and the threshold in the physics-informed neural network are initialized randomly and optimized by the improved parallel genetic algorithm (PGA). First, the training data is classified into several subpopulations based on the tight sandstone typing method (Ji et al. 2022). Each subpopulation contains one type of tight sandstone, and different types of tight sandstone have different pore structures and multiphase fluid flow characteristics. This classification increases the diversity among the subpopulations that are processed in parallel. Diversity is essential for the PGA to search for the global optimal solution, so controlling the diversity of the population is a critical way to improve the performance of the improved PGA. In each subpopulation, we increase the randomness of the initial values to avoid falling into a local optimal solution. The stronger the randomness, the less likely the search is to fall into a local optimal solution. We evaluate the similarity of each initial value with the other initial values. If the similarity is strong, the initial value is discarded and a new one is generated. The similarity of each initial value is calculated as follows

$$r\left( {x_{i} } \right) = \sum\limits_{j = 1}^{N} {S\left( {x_{i} ,x_{j} } \right)} ,\quad S\left( {x_{i} ,x_{j} } \right) = \left\{ {\begin{array}{*{20}l} {1,} & {d\left( {x_{i} ,x_{j} } \right) \ge \beta } \\ {0,} & {d\left( {x_{i} ,x_{j} } \right) < \beta } \\ \end{array} } \right.$$
(8)
$$d\left( {x_{i} ,x_{j} } \right) = \mathop \sum \limits_{k = 1}^{l} \sqrt {\left( {x_{i}^{k} - x_{j}^{k} } \right)^{2} }$$
(9)

where \(r\left({x}_{i}\right)\) is the similarity index, and \({x}_{i}\) (\({x}_{j}\)) is the ith (jth) initial value in the subpopulation. \(d\left({x}_{i},{x}_{j}\right)\) is the distance between \({x}_{i}\) and \({x}_{j}\). \(\beta\) is the threshold of the distance, and \(S\left({x}_{i},{x}_{j}\right)\) is the similarity of \({x}_{i}\) and \({x}_{j}\). Then, the selection, crossover, and mutation processes work in each subpopulation independently. After a number of generations, we choose the chromosomes with high fitness in each subpopulation and exchange them among the subpopulations. Each subpopulation receives excellent individuals from other subpopulations, compares their fitness, replaces its individuals with low fitness, and thus forms a new subpopulation. The new subpopulation continues to evolve and migrate until the criteria are reached, as shown in Fig. 2. Finally, we obtain the optimized weights and the threshold in the physics-informed neural network.
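The similarity screening of Eqs. 8 and 9 and the exchange of individuals between subpopulations can be sketched in Python as below. The indicator in `similarity_index` follows Eq. 8 exactly as written. The ring-shaped migration, the unconditional replacement of the lowest-fitness individuals, and all names and sample values are simplifying assumptions for illustration, not the implementation used in this study.

```python
import numpy as np

def pairwise_distance(pop):
    """Eq. 9: d(x_i, x_j) = sum_k |x_i^k - x_j^k| for every pair of rows."""
    return np.abs(pop[:, None, :] - pop[None, :, :]).sum(axis=2)

def similarity_index(pop, beta):
    """Eq. 8 as written: r(x_i) counts the j with d(x_i, x_j) >= beta."""
    return (pairwise_distance(pop) >= beta).sum(axis=1)

def migrate(subpops, fitness_fn, n_migrants=1):
    """Assumed ring migration: each subpopulation receives the best
    individuals of the previous one and overwrites its worst individuals."""
    best = []
    for pop in subpops:
        order = np.argsort([fitness_fn(x) for x in pop])  # ascending fitness
        best.append(pop[order[-n_migrants:]].copy())
    for k, pop in enumerate(subpops):
        order = np.argsort([fitness_fn(x) for x in pop])
        pop[order[:n_migrants]] = best[k - 1]  # k = 0 wraps to the last one
    return subpops

# Three hypothetical initial values with two genes each
pop = np.array([[0.0, 0.0], [0.0, 1.0], [3.0, 3.0]])
r = similarity_index(pop, beta=2.0)

# Two hypothetical subpopulations with one gene; fitness is the gene value
islands = [np.array([[1.0], [5.0]]), np.array([[2.0], [9.0]])]
islands = migrate(islands, fitness_fn=lambda x: float(x[0]))
print(r, [isl.ravel() for isl in islands])
```

In a full PGA, `migrate` would be called every few generations between the independent selection, crossover, and mutation passes of each subpopulation.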

In the fourth step, the back propagation algorithm is utilized to minimize the generated error and determine the final values of the weights and the threshold for the physics-informed neural network. In the training process, the weights are updated in each training iteration using the error generated by the output results of the capillary pressure curve and relative permeability. The training iteration continues until the generated error is smaller than the threshold, as shown in Fig. 2. It should be pointed out that the fourth physical model is inserted into the neural network after the capillary pressure is obtained. The mean squared error (MSE) is a loss function to evaluate the estimate. The MSE for the capillary pressure (\({E}_{Pc}\)) and relative permeability (\({E}_{q}\)) are calculated as follows:

$$E_{Pc} = \frac{1}{2}\mathop \sum \limits_{i} \left( {Pc_{{{\text{in}}}} - Pc_{{{\text{out}}}} } \right)^{2}$$
(10)
$$E_{q} = \frac{1}{2}\left( {\mathop \sum \limits_{i} \left( {q_{{{\text{in}}}} - q_{{{\text{out}}}} } \right)^{2} + \mathop \sum \limits_{i} \left( {S_{{\max ,o,{\text{in}}}} - S_{{\max ,o,{\text{out}}}} } \right)^{2} } \right)$$
(11)

where \(P{c}_{in}\) is the actual (target) capillary pressure, and \(P{c}_{out}\) is the predicted value. \({q}_{in}\) and \({S}_{max,o,in}\) are the actual (target) values of the flux and the maximum oil saturation, and \({q}_{out}\) and \({S}_{max,o,out}\) are the corresponding predicted values.
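The two loss terms in Eqs. 10 and 11 are half sums of squared errors and can be written compactly in Python. The sketch below uses hypothetical function names and made-up numbers, with the "in" quantities treated as targets and the "out" quantities as predictions.

```python
import numpy as np

def loss_pc(pc_target, pc_pred):
    """Eq. 10: half the sum of squared capillary-pressure errors."""
    diff = np.asarray(pc_target, dtype=float) - np.asarray(pc_pred, dtype=float)
    return 0.5 * np.sum(diff**2)

def loss_q(q_target, q_pred, smax_target, smax_pred):
    """Eq. 11: half the summed squared errors of flux and S_max,o."""
    dq = np.asarray(q_target, dtype=float) - np.asarray(q_pred, dtype=float)
    ds = np.asarray(smax_target, dtype=float) - np.asarray(smax_pred, dtype=float)
    return 0.5 * (np.sum(dq**2) + np.sum(ds**2))

# Made-up values, for illustration only
e_pc = loss_pc([1.0, 2.0], [1.0, 4.0])
e_q = loss_q([1.0], [0.0], [0.5], [0.5])
print(e_pc, e_q)
```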

Finally, we can estimate the capillary pressure and relative permeability curves of oil–water flow in tight sandstone with the prediction model generated by the physics-informed neural network. The process can be summarized as follows (Fig. 3). First, four routinely acquired parameters, namely the pore size distribution, porosity, permeability, and mineral composition, are input into the prediction model. It should be noted that these routine parameters can be measured by laboratory experiments (mercury injection, nuclear magnetic resonance, helium porosity, pulse-decay permeability, and X-ray powder diffraction experiments) or obtained from logging data and logging interpretation methods (Shi et al. 2019; Zhang et al. 2020, 2021). Then, the capillary pressure and relative permeability of oil–water flow in tight sandstone are obtained through the prediction model.

Fig. 3

Schematic representation of predicting the capillary pressure and relative permeability curve

Results and discussion

As mentioned earlier, the primary purpose of the physics-informed neural network is to establish a mapping between routine parameters and their capillary pressures (relative permeability). To do so, we first generate a dataset to train the physics-informed neural network.

Dataset construction

It is well known that high-quality datasets positively impact the accuracy and efficiency of neural networks. Thus, before constructing the physics-informed neural network, we first prepare the dataset. The dataset should have the following characteristics. First, it should involve diverse information on the tight sandstones, including the size and shape of pores and throats, the connectivity of the pore space, the mineral composition, etc. Second, the fluid flow properties should also be included in the dataset. Correspondingly, there are two steps to constructing the dataset. In the first step, we generate various random pore networks of tight sandstones based on the classical stochastic network algorithm to prepare the dataset (Idowu and Blunt 2010). Although the digital cores and 3D pore structures of samples can be observed through high-resolution images, the 3D images for tight sandstones with multiscale pore structures cannot be easily obtained. Furthermore, reconstructing 3D digital cores based on high-resolution imaging techniques is time-consuming and expensive. Thus, in this study, we use random pore networks to characterize the pore space in tight sandstone. The petrophysical properties of tight sandstone have been investigated by analyzing the actual rock samples in the reservoir of the Ordos Basin, NW China. The pore size distribution, porosity, and mineral composition of the rock samples are measured by high-pressure mercury injection, helium porosimetry, and X-ray powder diffraction, respectively. The distribution features of the porosity and pore size are illustrated in Fig. 4. The result indicates that most of the samples have porosity below 10%. The pore size distribution varies greatly among rock samples. In particular, most samples are dominated by nanopores, and several samples are dominated by micropores.
In addition, the diversity of element (pore and throat) shapes and the tortuosity of the pore space are also investigated (see Fig. 4). In the next step, the fluid flow properties of the rock samples are analyzed. The permeability of actual rock samples in the reservoir is measured by a pulse-decay permeability instrument, and the result is shown in Fig. 4. Then, we consider the wide variation of the contact angle distribution, interfacial tension, and wettability. All the above petrophysical parameters and fluid flow parameters are taken as constraints to construct many random pore networks based on the stochastic network algorithm, as shown in Fig. 4 (Idowu and Blunt 2010). Then, the capillary pressure and relative permeability are simulated based on the above random pore networks and the quasi-static pore network model (QSPNM), as shown in Fig. 4. It should be emphasized that the closed-form generalized fluid flow equation accounting for the slip effect has been used to study the oil–water flow in nanopores (Afsharpoor and Javadpour 2016). The above steps help ensure that the constructed dataset characterizes the actual oil–water flow in the tight reservoir of the Ordos Basin. Finally, we obtain a dataset with a variety of both petrophysical and two-phase fluid flow characteristics of tight sandstone. In this study, 10,000 groups of data are established, and they are used for training and testing the physics-informed neural network.

Fig. 4

The dataset for the physics-informed neural network (a), the workflow for generating the training data in the dataset for the physics-informed neural network (b)

Validation

In this study, 90% of the data in the constructed dataset is used to train the physics-informed neural network. To make the training dataset as useful as possible, we ensure that the selected data is diverse when sampling the training data from the dataset. The remaining data is used to test the accuracy of the physics-informed neural network. In addition, a k-fold cross-validation scheme has been implemented to examine the performance of the physics-informed neural network and find the optimal hidden layer structure parameters. In our study, ten folds are used, as shown in Fig. 5. During this process, the data in the training set is further divided evenly into ten subsamples. It should be emphasized that the data in each fold is selected randomly from the training data. We can see from Fig. 5b that the data in each fold is evenly distributed. Nine folds are used to train the model, and the remaining fold serves as validation data. This step is repeated ten times, and the averaged correlation coefficient of the ten runs is then calculated to assess the precision of the model. After comparing the accuracy of models with different structural parameters, we determine the optimal hidden layer structure parameters of the physics-informed neural network. The averaged correlation coefficients of the 10 runs for the final model are listed in Table 1. For the final model, the best correlation coefficient between the predicted and test capillary pressure (oil–water relative permeability) is 0.98.
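The data partition described above can be sketched with index arrays in Python. The example assumes the 10,000 groups of data mentioned in the dataset description, a plain random permutation for the 90/10 split (the diversity-aware sampling is not reproduced), and an arbitrary seed; all variable and function names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, an assumption
n_samples = 10_000
indices = rng.permutation(n_samples)

# 90% for training, the remaining 10% held out for testing
n_train = n_samples * 9 // 10
train_idx, test_idx = indices[:n_train], indices[n_train:]

# Divide the training set evenly into ten folds (D1-D10)
folds = np.array_split(train_idx, 10)

def cv_splits(folds):
    """Yield (training indices, validation indices) for each of the k runs."""
    for k in range(len(folds)):
        train = np.concatenate([f for i, f in enumerate(folds) if i != k])
        yield train, folds[k]

splits = list(cv_splits(folds))
print(len(train_idx), len(test_idx), len(splits))
```

Each of the ten runs trains on nine folds and validates on the remaining one, and the correlation coefficients of the ten runs are then averaged to compare candidate hidden layer structures.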

Fig. 5

The k-fold cross-validation method for the physics-informed neural network (a), the capillary pressure curves of the training data in D1-D10 (b)

Table 1 The list of the correlation coefficients for the final network structure. The correlation coefficients include the average \(R^{2}\) of 10 runs and the best training and validation \(R^{2}\)

Then, we use the test data to examine the performance of the final prediction model. Figure 6 demonstrates that there is an accurate correlation between the simulated and test (actual) capillary pressure under different water saturations. To further examine the performance of the physics-informed neural network, we use two different samples, which represent different morphologies and wide ranges of the pore-throat size distribution, to test the physics-informed neural network. Figure 7 compares the predicted results with the corresponding test ones. It can be found that the predicted capillary pressure curves agree very well with the test ones.

Fig. 6

Comparison of estimated capillary pressure and test ones

Fig. 7

Comparison of estimated capillary pressure curves and actual values for two tight sandstone samples

As with the capillary pressure, Fig. 8 compares the relative permeability estimated by the final model with the test data under different water saturations. A close correlation exists between the simulated and test (actual) relative permeability. To further examine the accuracy of the physics-informed neural network, the relative permeability curves of the two different tight sandstone samples are plotted in Fig. 9. The result indicates that the physics-informed neural network provides acceptable results for the oil–water relative permeability prediction.

Fig. 8 Comparison of the estimated relative permeability and test values

Fig. 9 Comparison of estimated relative permeability curves and actual values for two tight sandstone samples

Discussion

To further examine the performance of the physics-informed neural network, we compare it with the traditional pore network model (PNM) and with the conventional ANN method.

First, two tight sandstone samples with different pore size distributions are chosen. The experimental relative permeability of these samples has been published previously (Zeng et al. 2020). Figure 10 compares the relative permeability curves calculated by the physics-informed neural network, the quasi-static pore network model (QSPNM), and the dynamic pore network model (DPNM) with the experimental results. The figure shows that the accuracy of the physics-informed neural network is slightly lower than that of the two pore network models. Table 2 lists the average absolute errors and calculation times of the three methods. DPNM has the highest accuracy, but it takes a vast amount of time. The accuracy of QSPNM is similar to that of DPNM, and its calculation time is shorter than that of DPNM but longer than that of the physics-informed neural network. Moreover, both DPNM and QSPNM require a high-precision 3D image and pore network as input data.

Although the calculation time of QSPNM seems short for one rock sample, our purpose is to predict many rock samples throughout the reservoir, so we further investigate the calculation time for a large number of samples. Table 3 and Fig. 11 give the calculation time and the relative speedup of the physics-informed neural network introduced in this study over QSPNM. The results show that the speedup factor increases approximately linearly with the number of samples. In particular, when the number of samples is 1000, the QSPNM calculation takes 12,651 times longer than the physics-informed neural network prediction.

The absolute error of the physics-informed neural network is slightly larger than that of DPNM and QSPNM, but it is still within the acceptable range. In addition, the physics-informed neural network needs only seconds and four input parameters to predict the relative permeability curves. Therefore, the physics-informed neural network is more suitable for predicting relative permeability curves for reservoir modeling.
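One way to compute an average absolute error between a predicted relative permeability curve and experimental points is sketched below. It is a hedged illustration only: this section does not state how the errors in Table 2 were evaluated, so the linear interpolation of the predicted curve onto the experimental water saturations, the function names, and the numerical values are all assumptions made for the example.

```python
from bisect import bisect_left


def interpolate(sw_curve, kr_curve, sw):
    """Piecewise-linear interpolation of a curve at saturation sw.

    sw_curve must be strictly increasing; values outside its range are
    clamped to the end points (an assumption of this sketch).
    """
    if sw <= sw_curve[0]:
        return kr_curve[0]
    if sw >= sw_curve[-1]:
        return kr_curve[-1]
    j = bisect_left(sw_curve, sw)          # first index with sw_curve[j] >= sw
    x0, x1 = sw_curve[j - 1], sw_curve[j]
    y0, y1 = kr_curve[j - 1], kr_curve[j]
    return y0 + (y1 - y0) * (sw - x0) / (x1 - x0)


def average_absolute_error(sw_pred, kr_pred, sw_exp, kr_exp):
    """Mean of |predicted - measured| evaluated at the experimental saturations."""
    errors = [
        abs(interpolate(sw_pred, kr_pred, s) - k)
        for s, k in zip(sw_exp, kr_exp)
    ]
    return sum(errors) / len(errors)


# Illustrative numbers only; they are NOT taken from Fig. 10 or Table 2.
sw_pred = [0.2, 0.4, 0.6, 0.8]      # water saturation of the predicted curve
kr_pred = [0.0, 0.1, 0.3, 0.6]      # predicted water relative permeability
sw_exp = [0.3, 0.5, 0.7]            # saturations of the measured points
kr_exp = [0.04, 0.22, 0.45]         # measured water relative permeability

aae = average_absolute_error(sw_pred, kr_pred, sw_exp, kr_exp)
print(f"average absolute error: {aae:.4f}")
```

The same function can be applied to each method in turn (physics-informed neural network, QSPNM, DPNM) so that all errors are evaluated at the same experimental saturations.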

Fig. 10 The relative permeability curves for the physics-informed neural network predictions compared to the pore network model for two tight sandstone samples: sample 1 (a) and sample 2 (b). The published experimental measurement data is plotted here to evaluate the prediction results (Zeng et al. 2020)

Table 2 The absolute error and computing time for the physics-informed neural network, QSPNM, and DPNM
Table 3 Comparison of the calculation speed of QSPNM and the prediction speed of the physics-informed neural network
Fig. 11 Speedup results for the physics-informed neural network compared to QSPNM
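The speedup factor is taken here as the ratio of the QSPNM calculation time to the physics-informed neural network prediction time for the same number of samples. The sketch below computes this ratio and fits a straight line to check how close the trend is to linear. The timings are placeholders invented for the example (a fixed per-sample cost for the baseline and a mostly fixed cost for the fast method); they are not the values in Table 3, and the function names are hypothetical.

```python
def speedup_factors(t_baseline, t_fast):
    """Element-wise ratio of baseline time to fast-method time."""
    return [b / f for b, f in zip(t_baseline, t_fast)]


def linear_fit(xs, ys):
    """Least-squares line through (xs, ys); returns slope, intercept and R^2."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Placeholder timings in seconds (assumed for illustration, not measured).
n_samples = [1, 10, 100, 1000]
t_qspnm = [60.0 * n for n in n_samples]          # cost grows with every sample
t_pinn = [5.0 + 0.001 * n for n in n_samples]    # mostly fixed overhead

factors = speedup_factors(t_qspnm, t_pinn)
slope, intercept, r2 = linear_fit(n_samples, factors)
for n, s in zip(n_samples, factors):
    print(f"{n:5d} samples: speedup {s:10.1f}")
print(f"linear fit: slope={slope:.3f}, intercept={intercept:.1f}, R^2={r2:.4f}")
```

Under these assumed costs, the ratio grows almost in proportion to the number of samples, which is the kind of trend reported in Fig. 11; real timings would be substituted for the placeholder lists.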

In the following, we further investigate the difference in performance between the conventional ANN and the physics-informed neural network. The computation times of the physics-informed neural network proposed in our manuscript and the ANN are almost the same; thus, we focus on the difference in precision between the two methods. It should be noted that the conventional ANN and the physics-informed neural network have the same number of hidden layers and neurons.

A Berea sandstone sample and a tight sandstone sample are chosen (Oak 1990; Zeng et al. 2020). Figure 12 shows the relative permeability curves predicted by the conventional ANN and the physics-informed neural network. The figure shows that the relative error of the conventional ANN is larger than that of the physics-informed neural network.

Fig. 12 The relative permeability curves for the physics-informed neural network predictions compared to the conventional neural network for two samples: Berea (a), tight sandstone (b). The published experimental measurement data is plotted here to evaluate the prediction results (Oak 1990; Zeng et al. 2020)

Then, we quantitatively examine how the difference between the conventional ANN and the physics-informed neural network changes with sample heterogeneity. Thirty different samples are selected from the dataset. The variance-mean ratio of the pore size distribution is denoted as Vr, and it reflects the heterogeneity of the tight sandstone. The ratio of the relative error of the conventional ANN to that of the physics-informed neural network is denoted as \(\gamma\). The absolute errors of the physics-informed neural network and the ANN are shown in Fig. 13a, from which it can be seen that the absolute error of the conventional ANN increases with Vr. Furthermore, \(\gamma\) is plotted versus Vr for several tight sandstone samples in Fig. 13b. The figure shows that the difference in performance between the conventional ANN and the physics-informed neural network in predicting the relative permeability curves increases as Vr increases. The red line in Fig. 13 is obtained by fitting the points (black squares). The results confirm that \(\gamma\) is small and increases slowly when Vr is small (< 0.5). When Vr is larger than 0.5, \(\gamma\) increases rapidly with the heterogeneity of the sample. This is because the nonlinear flow in tight sandstone becomes more complex as the heterogeneity of the samples increases, and the conventional ANN is insufficient to deal with such nonlinear problems. Therefore, we can deduce that the physics-informed neural network performs better than the conventional neural network, especially for heterogeneous tight sandstones.

Fig. 13 Absolute errors of the physics-informed neural network and the ANN (a), relationship between \(\gamma\) and Vr (b)
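The two quantities plotted in Fig. 13 can be illustrated with a short sketch. It assumes that Vr is the population variance of the pore sizes divided by their mean, and that \(\gamma\) is the error of the conventional ANN divided by the error of the physics-informed neural network; the units and any normalization of the pore sizes are not specified in this section, and the function names and numbers below are hypothetical.

```python
from statistics import mean, pvariance


def variance_mean_ratio(pore_sizes):
    """Vr: variance of the pore sizes divided by their mean (assumed definition)."""
    return pvariance(pore_sizes) / mean(pore_sizes)


def error_ratio(err_ann, err_pinn):
    """gamma: error of the conventional ANN relative to the physics-informed network."""
    return err_ann / err_pinn


# Illustrative values only (not samples from the dataset of this study).
uniform_sample = [1.0, 1.0, 1.0, 1.0]        # identical pores -> Vr = 0
spread_sample = [0.5, 1.0, 1.5, 2.0]         # wider distribution -> larger Vr

vr_uniform = variance_mean_ratio(uniform_sample)
vr_spread = variance_mean_ratio(spread_sample)
gamma = error_ratio(err_ann=0.06, err_pinn=0.03)

print(f"Vr (uniform) = {vr_uniform:.3f}")
print(f"Vr (spread)  = {vr_spread:.3f}")
print(f"gamma        = {gamma:.2f}")
```

A value of \(\gamma\) above 1 means the conventional ANN has the larger error for that sample; plotting \(\gamma\) against Vr for many samples gives a curve of the kind shown in Fig. 13b.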

It should be emphasized that the prediction model in this study is constructed for tight sandstone in the Ordos Basin. It may not be suitable for other types of rock samples, such as carbonate rocks and shale rocks. Following the approach used here for tight sandstone, the physics-informed neural network can be extended to other rocks by introducing more physical models and training data according to the structural and two-phase flow characteristics of those rocks.

Conclusions

In this study, we developed a novel physics-informed neural network to improve the prediction of the capillary pressure and relative permeability of tight sandstone. Some important conclusions can be summarized as follows:

  1. Five important physical models, including the gas apparent permeability formula, capillary pressure formula, normalized water saturation, and flow rate formula for multiscale pores, are extracted and added into the neural network to improve the accuracy of the prediction model. The results demonstrate that these physical models can help neural networks better predict the complex nonlinear two-phase flow and wettability in tight sandstones.

  2. Unlike the traditional parallel genetic algorithm, the tight sandstone rock typing method and the similarity judgment are used to increase the diversity of the population and optimize the weights and the threshold in the novel physics-informed neural network.

  3. Compared with the quasi-static and dynamic pore network models, the novel physics-informed neural network requires fewer parameters (only four routine parameters) and significantly reduces computational time, making it more appropriate for supplying potential parameters in large-scale reservoir simulations.

  4. The comparison between the physics-informed neural network and the traditional ANN reveals that when the heterogeneity of the tight sandstone increases, the innovative physics-informed neural network exhibits more distinct advantages over the conventional ANN. This is attributed to the increasing complexity of nonlinear flow in tight sandstone as the heterogeneity of the samples intensifies. Nevertheless, the traditional ANN proves to be inadequate for handling such nonlinear challenges.