Abstract
The water retention behavior—a critical factor of unsaturated flow in porous media—can be strongly affected by deformation in the solid matrix. However, it remains challenging to model the water retention behavior with explicit consideration of its dependence on deformation. Here, we propose a data-driven approach that can automatically discover an interpretable model describing the water retention behavior of a deformable porous material, which can be as accurate as non-interpretable models obtained by other data-driven approaches. Specifically, we present a divide-and-conquer approach for discovering a mathematical expression that best fits a neural network trained with the data collected from a series of image-based drainage simulations at the pore-scale. We validate the predictive capability of the symbolically regressed counterpart of the trained neural network against unseen pore-scale simulations. Further, through incorporating the discovered symbolic function into a continuum-scale simulation, we showcase the inherent portability of the proposed approach: The discovered water retention model can provide results comparable to those from a hierarchical multi-scale model, while bypassing the need for sub-scale simulations at individual material points.
Similar content being viewed by others
Avoid common mistakes on your manuscript.
1 Introduction
The water retention behavior of a porous material is the relationship between the matric suction and the degree of saturation.Footnote 1 It governs the hydromechanical behavior of an unsaturated porous material since the capillarity therein induces cohesion, which, in turn, affects the effective stress and material strength. While the water retention behavior has been traditionally characterized through various devices such as pressure plate apparatus, filter paper, or tensiometer, recent advances in imaging technologies have diversified the ways to quantify it. For instance, X-ray computed tomography (CT) provides high-resolution images of the porous material of interest, from which we can detect the spatial distribution of wetting and non-wetting phases in the pores [49, 66] or conduct pore-scale flow simulations [15, 33]. Applications of nuclear magnetic resonance (NMR) enable us to directly measure the water content of soils and rocks, and they can be utilized to determine their pore size distributions and water retention curves [20, 58].
Standard water retention models in the literature (e.g., [14, 25, 39, 67]) were originally developed for unsaturated flow in porous media without deformation. While these standard models have also been employed in coupled hydromechanical modeling of deformable porous media, see, e.g., [13, 16, 17, 29, 32, 35, 38, 71], they are not ideal because they do not account for the dependence of the water retention behavior on the deformation of the solid matrix. In other words, while the standard water retention models assume that the relationship between suction and saturation is unaffected by deformation, a change in the specific volume alters the pore size distribution and the tortuosity of the pore space, thereby changing the water retention behavior.
To overcome the aforementioned limitation for deformable porous media, two types of strategies have been pursued in previous studies. The first type is to develop a physics-inspired phenomenological model that incorporates the impact of solid deformation on water retention characteristics, see, e.g., [27, 47, 62]. The second type of approach is to adopt a hierarchical multi-scale model that performs micro-scale (i.e., pore-scale) simulations on representative volume elements to replace a continuum-scale constitutive law (e.g., [1, 24, 70]). Compared to the former which still partially neglects micro-scale processes, the latter can reflect the multi-scale nature of water retention behavior. Still, however, the latter approach is impractical due to the high computational cost required for incremental constitutive updates at each material point by running sub-scale simulations.
As an alternative, data-driven approaches have recently gained popularity since they can replace micro-scale simulations in representative volume elements with machine learning models trained with experimental or simulation data. Among a variety of existing frameworks, deep neural networks have been the most popular due to their simplicity and capability of capturing complex dependencies on the micro-scale attributes without the need to determine the material parameters explicitly. For instance, previous studies have employed neural networks as universal function approximators that can reproduce nonlinear stress–strain relations [18, 37, 40], permeability [57, 65], and water retention characteristics [31, 43], demonstrating the remarkable predictive capabilities of the trained models.
Nevertheless, the existing data-driven approaches have not yet brought a practical impact on real-world applications, presumably because they often rely on black-box models that are not interpretable. Although there have been several attempts to address this issue of non-interpretability (e.g., [3, 45, 63]), the lack of an analytical expression makes them only partially interpretable while being less accurate than their black-box counterparts. Meanwhile, a symbolic regression approach can yield a completely interpretable data-driven model since it results in a mathematical expression that fits the given dataset. As pointed out in [11], however, this approach is often data-hungry to reach a desired level of accuracy, while the computational cost required to search the possible combinations of symbolic expressions proliferates with the dimensionality of the problem. To address this issue, Bahmani et al. [8] have recently proposed a neural polynomial method to perform a series of symbolic regressions in low-dimensional spaces to capture an evolving yield surface. Regardless, no attempt has been made to discover an interpretable data-driven model of the water retention behavior of a deformable porous material.
This study aims to develop an interpretable machine learning model that can replace sub-scale simulations in a hierarchical multi-scale model without compromising accuracy. Specifically, we propose a divide-and-conquer approach that leads to the discovery of an interpretable symbolic expression describing the water retention characteristics of deformable porous media. For this purpose, we consider multiple sets of points that comprise water retention curves of a certain type of porous material as a training dataset, collected from a series of pore-scale simulations through a pore-morphology-based algorithm. From the acquired dataset, we formulate a regression task by training a multi-layer perceptron in a supervised fashion that yields a black-box function exhibiting a high degree of expressivity. We then train a symbolic regression model via genetic programming to discover a mathematical expression that replicates the neural network function. The proposed framework can be integrated into multi-scale simulations by replacing pore-scale simulations with a discovered model that possesses high degrees of expressivity and interpretability at the same time. To limit the scope of the present work, we shall restrict our attention to the water retention behavior under drying (drainage), without consideration of hysteresis in the water retention behavior.
2 Methodology
In this section, we first summarize the image-based method adopted in this study which enables us to obtain a discretized water retention curve from a digital microstructure (Sect. 2.1). We then present a divide-and-conquer machine learning approach that yields an interpretable data-driven water retention model and trained with a dataset collected from a series of pore-scale simulations (Sect. 2.2). Specifically, our proposed framework trains a black-box neural network that can accurately represent the water retention behavior of a deformable porous material and then performs symbolic regression to discover the interpretable mathematical expression that best fits the learned function. The image-based simulation results and the predictive capability of the resulting data-driven model will be presented later in Sect. 3.
2.1 Image-based sphere insertion method
This study adopts an image-based method to simulate the invasion of the non-wetting fluid (e.g., air phase) into a water-saturated digital porous material to obtain water retention data. The technique is referred to as the image-based sphere insertion method or the pore-morphology-based method, since it mimics the process of fluid infiltration by inserting a set of spheres of radius r into the pore space to approximate the fluid configurations at a given suction s. For brevity, this section only provides a summary of the simulation process, referring the readers to [30, 33, 56] for details on the morphological operations and image processing algorithms.
Consider a three-dimensional binary image where 0 represents the solid while 1 represents the pore at each voxel, which serves as a platform to conduct the pore-scale simulations. We first define the inlet face, which is comprised of a set of pore voxels located on the boundary plane. Then, we apply the Euclidean distance transform on the binary image to replace the value of 1 with the minimum distance to the solid region at each pore voxel. This yields a distance map where the individual voxel values represent the radius of the maximum possible sphere that can be inserted into the center of each pore voxel, which can be converted to the minimum capillary pressure (\(p_c\)) required to invade the corresponding pore space through the Young–Laplace equation:
where \(\gamma _\textrm{aw}\) indicates the air–water interfacial tension and \(\theta\) denotes the contact angle. This implies that we can identify all the regions that can be invaded by the non-wetting fluid from a series of morphological operations. Hence, at a given suction s, we construct a binary mask by thresholding the obtained distance map by \(r_{\text {th}} = 2 \gamma _\textrm{aw} \cos {\theta } / s\), which indicates the regions where the minimum capillary pressure required for an invasion is less than or equal to s. At this point, the voxels within the mask that are disconnected to the inlet face are trimmed, while the remaining connected voxels are morphologically dilated by \(r_{\text {th}}\) to obtain the configuration of the fluids, where we can record the degree of saturation \(S_w\) by computing the volume fraction of the pore space filled by the wetting fluid. By repeating this process at increasing values of s, we can replicate a drainage experiment that yields a water retention curve formed by a set of points in a two-dimensional Euclidean space spanned by s and \(S_w\).
In this study, we consider a number of representative volume elements that, respectively, mimic the digital snapshots of a certain porous material undergoing an isotropic confinement while performing pore-scale simulations therein to obtain a set of points (e.g., \(\lbrace v^i, s^i, S_w^i \rbrace _{i=1}^{N_{\text {data}}}\), where \(v = 1 + e\) is the specific volume, e is the void ratio, and \(N_{\text {data}}\) indicates the total number of points) that comprises a surface in a three-dimensional Euclidean space that describes the water retention behavior of the target material that is deformable. Digital microstructures considered in this work and their simulation results will be presented later in Sect. 3.1.
2.2 Neural network-based symbolic regression
By considering the point cloud \(\lbrace v^i, s^i, S_w^i \rbrace _{i=1}^{N_{\text {data}}}\) as a training dataset, we formulate a supervised learning problem to train a feed-forward neural network counterpart of the water retention function, parameterized by the weights \(\varvec{W}\) and biases \(\varvec{b}\). Based on this setting, the learning problem seeks to minimize the difference between the ground truth and the neural network prediction for given samples, i.e.,
where \(\mathcal {L}\) indicates the mean square error loss and \(f^{\text {NN}}\) is the neural network function. Although the hyperparameters can be optimized to achieve a higher level of efficiency (e.g., [9, 10]), for simplicity, this study considers a typical fully connected neural network comprised of two hidden layers with 20 neurons each with rectified linear unit activation functions \(\text {ReLU}( \bullet ) = \max {(0, \bullet )}\) followed by an output dense layer. Here, we have chosen the ReLU activation function instead of others (e.g., sigmoid or hyperbolic tangent activation functions) to ensure the effectiveness of the backpropagation process in training. In this case, the neural network function can be expressed as
where the size of the weight matrices \(\varvec{W}^{(1)}\), \(\varvec{W}^{(2)}\), and \(\varvec{W}^{(3)}\) are \(20 \times 2\), \(20 \times 20\), and \(1 \times 20\), while the biases \(\varvec{b}^{(1)}\), \(\varvec{b}^{(2)}\), and \(\varvec{b}^{(3)}\) are the vectors of sizes 20, 20, and 1, respectively. This indicates that our neural network contains a total of 501 trainable parameters, which makes it difficult for us to interpret and is hence considered as a black box even though it can achieve a high level of accuracy, owing to its high degree of expressivity [7].
To overcome this issue, this study performs symbolic regression via genetic programming [22, 48] to discover the mathematical expression of the learned neural network function, which is inherently human-interpretable while leveraging the expressive power of a multi-layer perceptron. The symbolic regression algorithm used in this study considers a mathematical expression as a binary tree, which can be constructed from the set of variables, constants, and binary and unary operators. Specifically, as illustrated in Fig. 1, the leaf nodes (blue) of the tree structure contain either an input variable or a constant (e.g., 0.5, 2.5, or x), while the internal nodes (red) can accommodate mathematical operators [e.g., \(+\), −, \(\times\), or \(\exp {( \bullet )}\)]. The use of genetic programming in this work is similar to [7, 55], whereby the combinatorial space of all possible mathematical expressions is searched. It first randomly generates a population of candidate binary trees and evaluates their mean square error losses to measure the fitness of each candidate solution. Then, the algorithm iteratively evolves the population based on the selection (selecting the fittest candidate from a population), crossover (exchanging randomly selected sub-trees of two candidates), and mutation (randomly replacing a sub-tree of a candidate solution with a new tree) operators. This process is repeated until the algorithm discovers a satisfactory mathematical expression (i.e., \(f^{\text {NN+SR}}\)) that best fits the black-box function \(f^{\text {NN}}\).
The schematic of the proposed divide-and-conquer strategy is shown in Fig. 2, where a superposed tilde indicates the degree of saturation predicted via a machine learning model (e.g., either the black-box network or the symbolic expression discovered from it). The main advantage of this framework is that the interpretability enhanced via symbolic regression enables us to better understand the water retention behavior of a target material, without sacrificing the expressive power of a black-box neural network. It also results in an enhancement in the portability of the learned function for continuum-scale partial differential equation solvers regardless of the written programming languages, such that the model can easily be incorporated into almost any kind of existing simulation code. The training results and the discovered mathematical expressions will be reported in Sect. 3.2.
Remark 1
In addition to the use of ReLU activation layers, we normalize the data before the training to avoid problems that may occur during the backpropagation process (e.g., vanishing/exploding gradients). Specifically, a sample \(X^i\) of a measure X is scaled to a unit interval as:
where \(\bar{X}^i\) is the normalized sample point, while \(X_{\text {min}}\) and \(X_{\text {max}}\) are the minimum and the maximum values of the measure X inside the training dataset, respectively, such that all the data used in this paper are normalized within [0, 1] as a preprocessing step.
3 Results and discussion
This section presents the image-based simulation results from a set of randomly generated digital pore structures that serve as a dataset for training a neural network function, from which we extract mathematical equations via symbolic regression. Specifically, Sect. 3.1 first provides details on generating a set of digital microstructures that resembles the snapshots of a porous material subjected to isotropic confinement and then presents a series of discrete water retention curves obtained from the pore-morphology-based simulations. Based on the collected water retention data, Sect. 3.2 focuses on the machine learning models trained via the divide-and-conquer approach and on demonstrating their predictive capability and interpretability. Subsequently, Sect. 3.3 showcases the potential of the interpretable data-driven water retention model by incorporating it into a mixed finite element model. In this work, we have generated digital microstructures and pore-scale simulations therein by utilizing an image analysis toolkit PoreSpy [28] and have implemented the proposed framework with deep learning libraries PyTorch [51] and PySR [21].
3.1 Image-based drainage simulations in deformable porous media
In this study, we consider a set of digital microstructures represented by \(500 \times 500 \times 500\) voxels that have voxel sizes of 2 \(\upmu\)m, which mimics a set of representative volume elements of a highly porous and soft rock at different confinement levels. As in Ávila et al. [5], we generate each microstructure firstly from a \(500 \times 500 \times 500\) matrix of random noise with zeros and ones. To ensure that they share similar microstructural attributes, the images are blurred by the same Gaussian kernel by using the same blobiness parameter of 1.5 and are then binarized by applying thresholding until they reach their own target specific volume. By specifying the target values of specific volume v from 1.8 to 2.4 with an interval of 0.05, a total of 13 different binary images are obtained through this process. Figure 3 shows a subset of obtained digital microstructures with the specific volume specified as \(v = 1.8\) and \(v = 2.4\), respectively. Although their pore structures appear similar, we further investigate their pore size distributions and orientations to confirm that they can be considered the same type of material before subjecting them to pore-scale simulations.
To quantify the topological characteristics of individual digital microstructures, we extract their pore networks based on the methods proposed by [59] and [61]. The obtained pore networks consist of pore chambers (i.e., pore volume segments) connected by pore throats (i.e., capillary tubes), which enables us to investigate the pore size distributions and their orientations. Figure 4a illustrates the pore size distributions obtained from the microstructures with \(v = 1.8\) (red), \(v = 2.1\) (purple), and \(v = 2.4\) (blue), respectively, while Fig. 4b shows the relationship between specific volumes of all 13 microstructures considered in this work and the corresponding mean pore diameters (circular symbols), where the error bars represent \(\pm 2.5\) % of their standard deviations. The results indicate that the pore sizes closely follow the lognormal distributions, similar to those of the typical geological materials [42], and show that the mean pore size tends to increase with increasing v accompanied by small increases in the standard deviation, exhibiting similar trends reported in Penumadu and Dean [52] using mercury intrusion porosimetry on kaolin clay under varying preconsolidation pressures. In addition, similar distributions of pore throat orientations from microstructures with different specific volumes as shown in Fig. 5 corroborate that the generated digital pore structures can be regarded as the same type of material.
We then conduct a series of drainage simulations from the obtained microstructures using the image-based sphere insertion method. As exemplified in Fig. 6, the simulation injects the air phase into the water-saturated digital pore structure at the specified inlet face \(z = 0\), while we set the interfacial tension as \(\gamma _\textrm{aw} = 0.072\) N/m and the contact angle as \(\theta = 0^{\circ }\), following [6, 36]. Specifically, by prescribing zero pore water pressure (\(p_w = 0\)) at the top surface, the numerical simulation is performed by applying a stepwise increment of suction (\(s = p_a - p_w\)) from 0 kPa to 20 kPa, while we record the degree of saturation \(S_w\) at every 1 kPa intervals to construct a discrete water retention curve.
Figure 7a illustrates the discrete water retention curves (circular symbols) obtained from 13 different microstructures considered in this study, fitted by the van Genuchten model [67]:
where \(p_0\), m, and \(S_\textrm{wr}\) are the fitting parameters that are related to the air entry pressure, slope of the curve, and residual saturation, respectively. Although Eq. (5) does not accurately represent the suction-saturation relationship of our material of interest, as depicted in Fig. 7b–d, we can observe strong correlations between the specific volume v and the calibrated fitting parameters that are similar to those discovered by the previous studies [34, 50]: The air entry suction and residual saturation decrease with growing voids, while the water retention curve tends to exhibit steeper slopes with increasing v. The results also support the assumption that the generated digital microstructures are considered the same type, but more importantly, they imply that we require a phenomenological model that describes the relation between the specific volume, suction, and degree of saturation to replicate the water retention behavior of deformable porous media, which can either be handcrafted (e.g., [27]) or discovered through machine learning process (e.g., [31]).
a Water retention curves fitted by [67] and the correlations between specific volume v and the fitting parameters b \(p_0\), (c) m, and (d) \(S_\textrm{wr}\)
3.2 Data-driven discovery of interpretable water retention models
From the water retention data collected from a series of pore-scale simulations, our goal is to discover an interpretable data-driven water retention model for the material of interest based on the divide-and-conquer approach. Since the data recorded from the drainage simulations (i.e., circular symbols in Fig. 7a) can be considered a point cloud in a three-dimensional Euclidean space spanned by v, s, and \(S_w\), as described in Sect. 2.2, we first train a feed-forward neural network that predicts the saturation from the given specific volume and suction until 5000 epochs. The learning curve in Fig. 8a shows that the mean square error loss tends to decrease as the number of epochs increases until the model performance converges after \(\sim\)3000 epochs, while the trained neural network yields a smooth representation of a surface (Fig. 8b) that well fits the data. Even though the trained model is a black box, the results indicate that the neural network represents the water retention behavior better compared to Eq. (5), implying that neural networks are highly expressive models such that they can approximate almost any function of interest, which may not be an easy task for a human [23, 46, 54]. It should be noted that one may introduce additional constraints (e.g., on convexity [26] or monotonicity [4]) for training neural networks to make them physically sound. However, since this study focuses on discovering a phenomenological model, this extension will be considered in the future.
To enhance the interpretability of the learned function \(f^{\text {NN}}\), we now conduct symbolic regression in an offline setting to discover the mathematical expression \(f^{\text {NN+SR}}\) that replicates the neural network. As pointed out in Cogswell et al. [19], one possible way to prevent overfitting in machine learning models is to train them with more data. Since one of the upshots of the divide-and-conquer approach is that we can resample as many data points as desired from the trained neural network, we therefore reconstruct the training dataset for the symbolic regression at this point. Following [60, 68, 69], the resampling is performed on a uniform grid of v and s, while the corresponding values for \(S_w\) are computed from the neural network function. Specifically, we sample 50 points along the V-axis (from 1.8 to 2.4) and 50 points along the S-axis (from 0 kPa to 20 kPa), such that the reconstructed dataset has \(N_{\text {data}} = \text {2500}\). Theoretically, from the dataset generated from a known function, symbolic regression can discover the expression that is nearly identical to the true expression if the specified operators are properly chosen (see Appendix A). However, considering the case where we do not have a priori knowledge, we specify the set of unary operators as \(\lbrace \text {abs}( \bullet ), \text {exp}( \bullet ), \text {sqrt}( \bullet ), \text {inv}( \bullet ) \rbrace\) and the set of binary operators as \(\lbrace +, \times , {^{\hat{\,}}}, / \rbrace\) that can be accommodated in the internal nodes of a binary tree. Here, \(\text {abs}( \bullet ) = | \bullet |\), \(\text {sqrt}( \bullet ) = \sqrt{\bullet }\), and \(\text {inv}( \bullet ) = 1 / \bullet\), while the symbol ’\({^{\hat{\,}}}\)’ denotes the power operation (e.g., \(a {^{\hat{\,}}}b = a^b\)). By setting the number of iterations of the algorithm as 1000, the number of populations as 15, the number of individual binary trees in each population as 33, and the maximum number of nodes in individual trees (i.e., the maximum degrees of complexity) as 40, the CPU time to complete the training takes 3129 s on a laptop equipped with an Apple M2 Max processor (12 cores, 12 threads @ 3.5 GHz) with 64 GB DDR5 memory.
Figure 9 shows the best-fit mathematical expressions discovered by the symbolic regression algorithm depending on the degrees of complexity, where the superposed bar indicates the normalized values that range from 0 to 1 (see Remark 1), while Fig. 10 illustrates the surface representations of the discovered expressions with Complexity = 1, 3, 16, and 38, respectively. The results reveal that the discovered mathematical expression tends to better represent the learned function as the number of nodes in the tree increases, indicating that there is a trade-off between accuracy and simplicity. For instance, an expression from a single-noded tree is a constant function \(\bar{S}_w = 0.57\), which is the simplest expression that the algorithm can discover, results in an inaccurate representation of the learned function even though it can be considered the most accurate for the case where Complexity = 1. On the other hand, the best-fit solution discovered from a tree with Complexity = 38 results in a very accurate prediction with a relatively small mean square error, but it may be a very complex expression that is not very intuitive to interpret. Nevertheless, we underscore that the analytical expressions are always interpretable regardless of their complexity, which not only makes the post hoc analyses easier but also enhances the portability of the learned function, compared to the black-box neural networks. This implies that we may achieve the highest level of accuracy when we choose an expression that exhibits the lowest mean square error among all the potential candidates. Hence, based on the hyperparameters we specified, this study considers the mathematical expression with Complexity = 38 shown in Fig. 9 as the resulting symbolically regressed counterpart (i.e., \(f^{\text {NN+SR}}\)) of the neural network function \(f^{\text {NN}}\).
As a validation example, we conduct image-based drainage simulations within two digital microstructures that are not used during the training processes and compare the results with the predictions made from the trained neural network \(f^{\text {NN}}\) and its best-fit mathematical expression \(f^{\text {NN+SR}}\). Furthermore, as a reference, we have also made an additional set of predictions from a mathematical expression with Complexity = 38, which is directly discovered from the symbolic regression (\(f^{\text {SR}}\)) without neural network pretraining or data augmentation (Fig. 11). Here, all the parameters used for generating the microstructures remain the same as those summarized in Sect. 3.1 except the threshold values, which are specified to reach specific values of V: One microstructure exhibits \(v = 1.825\) which falls into the training data range, whereas the other has \(v = 2.5\) which is outside the range of the specific volume of the training dataset. For the case where we set \(v = 1.825\), as illustrated in Fig. 11a, both neural network-based model (orange curve) and its best-fit mathematical expression obtained from the divide-and-conquer approach (green curve) are capable of reproducing water retention curves that are very close to the image-based simulation results, whereas the model discovered directly from the symbolic regression (purple curve) fails to capture the water retention characteristics for the range of matric suction below the air entry value. More importantly, although the predictions are not as accurate as those made in the interpolation regime, the water retention curves for an unseen digital microstructure with \(v = 2.5\) can still be constructed with reasonable accuracy from the trained neural network (\(f^{\text {NN}}\)) and its best-fit symbolic expression (\(f^{\text {NN+SR}}\)), respectively (Fig. 11b). The results not only confirm the validity of the trained models but also emphasize the predictive capability of the mathematical model discovered by the divide-and-conquer approach, which cannot be achieved directly from a symbolic regression. This demonstrates that the proposed framework enables us to achieve high levels of accuracy and interpretability at the same time.
3.3 Implication for continuum-scale modeling
In this section, we showcase the applicability of the trained model by incorporating it into a continuum-scale simulation that may resolve the high computational cost issues of multi-scale models pointed out in Sect. 1. Specifically, we employ a mixed finite element method as a continuum-scale model, while replacing the micro-scale model at each Gauss point with a mathematical expression discovered from the divide-and-conquer approach (\(f^{\text {NN+SR}}\)) that replicates the pore-scale water retention behavior of the material of interest. The details of the continuum-scale model can be found in Appendix B.
The numerical example considered herein resembles the experiment conducted by Liakopoulous [41]. As illustrated in Fig. 12, the problem domain is a 1-m-tall column that possesses microstructure similar to those generated in Sect. 3.1. The domain is spatially discretized into a structured mesh with an element size of \(h_e = 0.02\) m and we set the time step size as \(\varDelta t = 10\) s, while the specified material properties are summarized in Table 1. By assuming that the column is initially equilibrated with hydrostatic pore water pressure so that it retains fully saturated condition (i.e., \(S_w = 1\)) at \(t = 0^{-}\), the numerical experiment begins at \(t = 0^{+}\) by allowing the water phase to escape from the bottom surface by prescribing the pore water pressure boundary condition as \(\hat{p}_w = 0\) while applying zero flux boundary conditions at all other boundaries.
Figure 13 shows the variations of the pore water pressure \(p_w\), degree of saturation \(S_w\), and the specific volume V along the Y-axis. Similar to the results reported in [41], gravity gradually builds up the negative pore pressure which affects the pore water to migrate toward the bottom end over time (Fig. 13a), accompanied by an increase in the compressive effective stress of the entire column (Fig. 13c), i.e., consolidation. Meanwhile, as illustrated in Fig. 13b, the air phase starts to invade from the top surface once the suction \(s = p_a - p_w\) exceeds the air entry value at the current specific volume, following a certain path on the surface represented by \(f^{\text {NN+SR}}\) that we specified (Fig. 14). In addition, the recorded saturation path of the top surface as shown in Fig. 14 suggests that a density-independent phenomenological model may not reflect the water retention behavior of a deformable porous material, while it underscores that an interpretable data-driven model enables us to capture the multi-scale nature of unsaturated flow with a computational cost similar to that of a single-scale finite element model.
4 Conclusion
In this work, we have proposed a framework for data-driven discovery of an interpretable model for the water retention behavior of deformable porous media. A divide-and-conquer approach has been developed to discover a data-driven water retention model that overcomes the accuracy-interpretability dilemma by leveraging the expressive power of a neural network and the interpretability of a symbolic expression. The proposed approach has been validated against the data generated from a series of image-based pore-scale simulations which resembles the water retention data collected from a deformable porous material. Unlike a typical neural network that yields a non-interpretable black-box function, the genetic programming algorithm can provide an interpretable symbolic expression that satisfactorily replicates the learned function. We have shown that both the trained neural network and its symbolically regressed counterpart are capable of reproducing water retention curves of digital microstructures unused in the training processes with reasonable accuracy. We have also demonstrated that thanks to its inherent portability, the discovered water retention model can easily be incorporated into continuum-scale simulations, revealing the potential of capturing the multi-scale nature of unsaturated flow in a deformable porous material without significant computational cost. Future works include an extension of the proposed framework to capture hysteresis in water retention and an improvement of the trustworthiness through the use of a physics-constrained layer or loss function.
Data availability statement
The data that support the findings of this study are available from the corresponding author upon reasonable request.
Notes
In this work, we focus on non-adsorptive porous media (e.g., granular materials) in which the suction is entirely due to capillarity.
References
Aarnes JE, Krogstad S, Lie K-A (2006) A hierarchical multiscale method for two-phase flow based upon mixed finite elements and nonuniform coarse grids. Multiscale Model Simul 5(2):337–363
Abhyankar S, Brown J, Constantinescu EM, Ghosh D, Smith BF, Zhang H (2018) PETSc/TS: a modern scalable ODE/DAE solver library. arXiv preprint arXiv:1806.01437
Agarwal R, Melnick L, Frosst N, Zhang X, Lengerich B, Caruana R, Hinton GE (2021) Neural additive models: interpretable machine learning with neural nets. Adv Neural Inf Process Syst 34:4699–4711
Altendorf EE, Restificar AC, Dietterich TG (2012) Learning from sparse data by exploiting monotonicity constraints. arXiv preprint arXiv:1207.1364
Ávila J, Pagalo J, Espinoza-Andaluz M (2022) Evaluation of geometric tortuosity for 3D digitally generated porous media considering the pore size distribution and the A-star algorithm. Sci Rep 12(1):19463
Bachmann J, Woche S, Goebel M-O, Kirkham M, Horton R (2003) Extended methodology for determining wetting properties of porous media. Water Resour Res 39(12):1353
Bahmani B, Suh HS, Sun W (2023) Discovering interpretable elastoplasticity models via the neural polynomial method enabled symbolic regressions. arXiv preprint arXiv:2307.13149
Bahmani B, Suh HS, Sun W (2024) Discovering interpretable elastoplasticity models via the neural polynomial method enabled symbolic regressions. Comput Methods Appl Mech Eng 422:116827
Bergstra J, Bardenet R, Bengio Y, Kégl B (2011) Algorithms for hyper-parameter optimization. Adv Neural Inf Process Syst 24:2546–2554
Bergstra J, Bengio Y (2012) Random search for hyper-parameter optimization. J Mach Learn Res 13(2):281–305
Bomarito G, Townsend T, Stewart K, Esham K, Emery J, Hochhalter J (2021) Development of interpretable, data-driven plasticity models with symbolic regression. Comput Struct 252:106557
Borja RI (2006) On the mechanical energy and effective stress in saturated and unsaturated porous continua. Int J Solids Struct 43(6):1764–1786
Borja RI, Choo J, White JA (2016) Rock moisture dynamics, preferential flow, and the stability of hillside slopes. Multi-Hazard Approaches to Civil Infrastructure Engineering, pp 443–464
Brooks R, Corey A (1964) Hydraulic properties of porous media. hydrology paper no. 3. Civil Engineering Department, Colorado State University, Fort Collins, CO
Bultreys T, De Boever W, Cnudde V (2016) Imaging and image-based fluid transport modeling at the pore scale in geological materials: a practical introduction to the current state-of-the-art. Earth Sci Rev 155:93–128
Callari C, Abati A (2009) Finite element methods for unsaturated porous solids and their application to dam engineering problems. Comput Struct 87(7–8):485–501
Choo J (2018) Large deformation poromechanics with local mass conservation: an enriched Galerkin finite element framework. Int J Numer Meth Eng 116(1):66–90
Chung I, Im S, Cho M (2021) A neural network constitutive model for hyperelasticity based on molecular dynamics simulations. Int J Numer Meth Eng 122(1):5–24
Cogswell M, Ahmed F, Girshick R, Zitnick L, Batra D (2015) Reducing overfitting in deep networks by decorrelating representations. arXiv preprint arXiv:1511.06068
Costabel S, Yaramanci U (2013) Estimation of water retention parameters from nuclear magnetic resonance relaxation time distributions. Water Resour Res 49(4):2068–2079
Cranmer M (2023) Interpretable machine learning for science with pysr and symbolicregression. jl. arXiv preprint arXiv:2305.01582
Cranmer M, Sanchez Gonzalez A, Battaglia P, Xu R, Cranmer K, Spergel D, Ho S (2020) Discovering symbolic models from deep learning with inductive biases. Adv Neural Inf Process Syst 33:17429–17442
Cybenko G (1989) Approximation by superpositions of a sigmoidal function. Math Control Signals Syst 2(4):303–314
Efendiev Y, Ginting V, Hou T, Ewing R (2006) Accurate multiscale finite element methods for two-phase flow simulations. J Comput Phys 220(1):155–174
Fredlund DG, Xing A (1994) Equations for the soil-water characteristic curve. Can Geotech J 31(4):521–532
Fuhg JN, van Wees L, Obstalecki M, Shade P, Bouklas N, Kasemer M (2022) Machine-learning convex and texture-dependent macroscopic yield from crystal plasticity simulations. Materialia 23:101446
Gallipoli D, Wheeler S, Karstunen M (2003) Modelling the variation of degree of saturation in a deformable unsaturated soil. Géotechnique 53(1):105–112
Gostick JT, Khan ZA, Tranter TG, Kok MD, Agnaou M, Sadeghi M, Jervis R (2019) Porespy: a python toolkit for quantitative analysis of porous media images. J Open Sour Softw 4(37):1296
Griffiths D, Lu N (2005) Unsaturated slope stability analysis with steady infiltration or evaporation using elasto-plastic finite elements. Int J Numer Anal Meth Geomech 29(3):249–267
Hazlett R (1995) Simulation of capillary-dominated displacements in microtomographic images of reservoir rocks. Transp Porous Media 20:21–35
Heider Y, Suh HS, Sun W (2021) An offline multi-scale unsaturated poromechanics model enabled by self-designed/self-improved neural networks. Int J Numer Anal Meth Geomech 45(9):1212–1237
Heider Y, Sun W (2020) A phase field framework for capillary-induced fracture in unsaturated porous media: Drying-induced vs. hydraulic cracking. Comput Methods Appl Mech Eng 359:112647
Hilpert M, Miller CT (2001) Pore-morphology-based simulation of drainage in totally wetting porous media. Adv Water Resour 24(3–4):243–255
Huang S, Barbour S, Fredlund D (1998) Development and verification of a coefficient of permeability function for a deformable unsaturated soil. Can Geotech J 35(3):411–425
Ip SC, Choo J, Borja RI (2021) Impacts of saturation-dependent anisotropy on the shrinkage behavior of clay rocks. Acta Geotech 16:3381–3400
Jang J, Santamarina JC (2014) Evolution of gas saturation and relative permeability during gas production from hydrate-bearing sediments: gas invasion vs. gas nucleation. J Geophys Res: Solid Earth 119(1):116–126
Javadi A, Tan T, Zhang M (2003) Neural network for constitutive modelling in finite element analysis. Comput Assist Mech Eng Sci 10(4):523–530
Jha B, Juanes R (2014) Coupled modeling of multiphase flow and fault poromechanics during geologic co2 storage. Energy Procedia 63:3313–3329
Kosugi K (1994) Three-parameter lognormal distribution model for soil water retention. Water Resour Res 30(4):891–901
Lefik M, Schrefler BA (2003) Artificial neural network as an incremental non-linear constitutive model for a finite element code. Comput Methods Appl Mech Eng 192(28–30):3265–3283
Liakopoulos AC (1964) Transient flow through unsaturated porous media. University of California, Berkeley
Lindquist WB, Venkatarangan A, Dunsmuir J, Wong T-F (2000) Pore and throat size distributions measured from synchrotron X-ray tomographic images of fontainebleau sandstones. J Geophys Res: Solid Earth 105(B9):21509–21527
Liu S, Zolfaghari A, Sattarin S, Dahaghi AK, Negahban S (2019) Application of neural networks in multiphase flow through porous media: predicting capillary pressure and relative permeability curves. J Petrol Sci Eng 180:445–455
Logg A, Mardal K-A, Wells G (2012) Automated solution of differential equations by the finite element method: the FEniCS book, vol 84. Springer Science and Business Media
Lou Y, Caruana R, Gehrke J, Hooker G (2013) Accurate intelligible models with pairwise interactions. In: Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp 623–631
Lu Z, Pu H, Wang F, Hu Z, Wang L (2017) The expressive power of neural networks: a view from the width. Adv Neural Inf Process Syst 30:6232–6240
Mašín D (2010) Predicting the dependency of a degree of saturation on void ratio and suction using effective stress principle for unsaturated soils. Int J Numer Anal Meth Geomech 34(1):73–90
McConaghy T (2011) Ffx: Fast, scalable, deterministic symbolic regression technology. Genet Program Theory Pract IX, pp 235–260
Milatz M, Andò E, Viggiani GC, Mora S (2022) In situ X-ray CT imaging of transient water retention experiments with cyclic drainage and imbibition. Open Geomech 3:1–33
Nuth M, Laloui L (2008) Advances in modelling hysteretic water retention curve in deformable soils. Comput Geotech 35(6):835–844
Paszke A, Gross S, Massa F, Lerer A, Bradbury J, Chanan G, Killeen T, Lin Z, Gimelshein N, Antiga L et al (2019) Pytorch: an imperative style, high-performance deep learning library. Adv Neural Inf Process Syst 32:8026-8037
Penumadu D, Dean J (2000) Compressibility effect in evaluating the pore-size distribution of kaolin clay using mercury intrusion porosimetry. Can Geotech J 37(2):393–405
Pinder GF, Gray WG (2008) Essentials of multiphase flow and transport in porous media. Wiley Online Library, Hoboken
Raghu M, Poole B, Kleinberg J, Ganguli S, Sohl-Dickstein J (2017) On the expressive power of deep neural networks. In: International conference on machine learning, pp 2847–2854. PMLR
Schmidt M, Lipson H (2009) Distilling free-form natural laws from experimental data. Science 324(5923):81–85
Schulz VP, Wargo EA, Kumbur EC (2015) Pore-morphology-based simulation of drainage in porous media featuring a locally variable contact angle. Transp Porous Media 107:13–25
Singh VK, Kumar D, Kashyap P, Singh PK, Kumar A, Singh SK (2020) Modelling of soil permeability using different data driven algorithms based on physical properties of soil. J Hydrol 580:124223
Stingaciu L, Weihermüller L, Haber-Pohlmeier S, Stapf S, Vereecken H, Pohlmeier A (2010) Determination of pore size distribution and hydraulic properties using nuclear magnetic resonance relaxometry: a comparative study of laboratory methods. Water Resour Res 46(11):W11510
Suh HS, Kang DH, Jang J, Kim KY, Yun TS (2017) Capillary pressure at irregularly shaped pore throats: implications for water retention characteristics. Adv Water Resour 110:51–58
Suh HS, Kweon C, Lester B, Kramer S, Sun W (2023) A publicly available pytorch-abaqus umat deep-learning framework for level-set plasticity. Mech Mater 184:104682
Suh HS, Yun TS (2018) Modification of capillary pressure by considering pore throat geometry with the effects of particle shape and packing features on water retention curves for uniformly graded sands. Comput Geotech 95:129–136
Sun D, Sheng D, Xiang L, Sloan SW (2008) Elastoplastic prediction of hydro-mechanical behaviour of unsaturated soils under undrained conditions. Comput Geotech 35(6):845–852
Sun X, Bahmani B, Vlassis NN, Sun W, Xu Y (2022) Data-driven discovery of interpretable causal relations for deep learning material laws with uncertainty propagation. Granular Matter 24:1–32
Szymkiewicz A (2012) Modelling water flow in unsaturated porous media: accounting for nonlinear permeability and material heterogeneity. Springer Science and Business Media
Tembely M, AlSumaiti AM, Alameri W (2020) A deep learning perspective on predicting permeability in porous media from network modeling to direct simulation. Comput Geosci 24:1541–1556
Tracy SR, Daly KR, Sturrock CJ, Crout NM, Mooney SJ, Roose T (2015) Three-dimensional quantification of soil hydraulic properties using x-ray computed tomography and image-based modeling. Water Resour Res 51(2):1006–1022
van Genuchten MT (1980) A closed-form equation for predicting the hydraulic conductivity of unsaturated soils. Soil Sci Soc Am J 44(5):892–898
Vlassis NN, Sun W (2021) Sobolev training of thermodynamic-informed neural networks for interpretable elasto-plasticity models with level set hardening. Comput Methods Appl Mech Eng 377:113695
Vlassis NN, Sun W (2022) Component-based machine learning paradigm for discovering rate-dependent and pressure-sensitive level-set plasticity models. J Appl Mech 89(2):021003
Wang K, Sun W (2019) An updated lagrangian LBM-DEM-FEM coupling model for dual-permeability fissured porous media with embedded discontinuities. Comput Methods Appl Mech Eng 344:276–305
Wang W, Regueiro R, McCartney J (2015) Coupled axisymmetric thermo-poro-mechanical finite element analysis of energy foundation centrifuge experiments in partially saturated silt. Geotech Geol Eng 33:373–388
White JA, Borja RI (2011) Block-preconditioned Newton-Krylov solvers for fully coupled flow and geomechanics. Comput Geosci 15:647–659
Acknowledgements
This work is primarily supported by the National Research Foundation of Korea (NRF) grants funded by the Korean government (MIST) (Nos. RS-2023-00209799 and 2023R1A2C2003534) and the start-up grant from Case Western Reserve University. Xiong (Bill) Yu also appreciates partial support from the US. National Science Foundation (NSF) under Award Number CMMI-0846475 for related research activities.
Funding
Open Access funding enabled and organized by KAIST.
Author information
Authors and Affiliations
Contributions
HSS: Conceptualization, Methodology, Software, Validation, Formal analysis, Investigation, Data Curation, Writing—Original Draft, Supervision, Project Administration. JYS: Validation, Formal analysis, Investigation. YK: Validation, Formal analysis, Investigation. XY: Formal analysis, Investigation, Supervision. JC: Methodology, Formal analysis, Investigation, Writing—Original Draft, Writing—-Review and Editing, Funding Acquisition.
Corresponding authors
Ethics declarations
Conflict of interest
The authors declare no competing interests.
Appendices
Appendices
1.1 Verification exercise: discovery of porosity-permeability relationship
This section verifies whether the symbolic regression algorithm adopted in this study can discover a mathematical expression that closely resembles the benchmark function generated from synthetic data. We consider a Kozeny–Carman-type equation that relates the permeability (\(k_w\)) to the porosity (\(\phi\)) at the laminar flow regime as a benchmark function:
where C is a parameter that is related to the geometrical and topological characteristics of the pore space. By setting \(C = 10^{-12} \text { m}^2\), we sample a total of 30 points along the \(\phi\)-axis that are evenly spaced from 0.2 to 0.8, and compute their corresponding permeability values from Eq. (A1) such that we can construct the training dataset \(\lbrace \phi ^i, k_w^i \rbrace _{i=1}^{30}\). While utilizing the hyperparameters identical to those used in Sect. 3.2, we specify the set of unary operators as \(\lbrace \text {square}(\bullet ), \text {cube}(\bullet ) \rbrace\) where \(\text {square}(\bullet ) = \bullet ^2\) and \(\text {cube}(\bullet ) = \bullet ^3\), and the set of binary operators as \(\lbrace -, \times , / \rbrace\), so that the algorithm can explore the operator space where Eq. (A1) can be recovered.
As illustrated in Fig. 15, we observe the same trend that can be found in Sect. 3.2, while the best-fit symbolic expressions discovered during the training are summarized in Table 2. The result shows that the discovered expressions (black curves) tend to fit the benchmark function (red symbols) better with increasing degrees of complexity. More importantly, the expression with Complexity = 14 contains \(\bar{\phi }^3\) in its numerator while \(\bar{\phi }^2\) in the denominator, which nearly resembles Eq. (A1) and exhibits a very low mean square error of the order of \(10^{-8}\). This highlights that the symbolic regression algorithm with an operator space specified from a priori knowledge is capable of recovering the solutions that are similar to the given mathematical expression.
1.2 Finite element formulation for unsaturated deformable porous media
This study considers an unsaturated porous material as a three-phase continuum mixture where the solid (s), water (w), and air (a) phase constituents may coexist simultaneously, occupying fractions of the volume of the same material point \(\mathcal {P}\). Based on this setting, by letting \(\rho _s\), \(\rho _w\), and \(\rho _a\) denote the intrinsic mass densities of the solid, water, and air, respectively, the total mass density of the entire mixture \(\rho\) can be defined as the sum of the partial mass densities (\(\rho ^{\alpha }\)) of each phase constituents (\({\alpha } = \lbrace s, w, a \rbrace\)), i.e.,
where \(\phi ^{\alpha } = \textrm{d} V_{\alpha } / \textrm{d}V\) indicates the volume fraction of the \(\alpha\)-phase constituent while \(\textrm{d}V = \textrm{d}V_s + \textrm{d}V_w + \textrm{d}V_a\). By assuming that the intrinsic density of the air phase is negligible (i.e., \(\rho _a = 0\)), Eq. (B1) can be further simplified as \(\rho = (1 - \phi ) \rho _s + \phi S_w \rho _w\), where \(\phi = (\textrm{d} V_w + \textrm{d} V_a) / \textrm{d}V\) is the porosity and \(S_w = \textrm{d} V_w / (\textrm{d} V_w + \textrm{d} V_a)\) is the degree of saturation. This passive-gas assumption leads to a pseudo-three-phase formulation in which the pore air pressure is always zero and hence yields the following set of equations that govern the continuum behavior of the unsaturated porous material [12, 72]:
where \(\varvec{\sigma }\) is the total stress, \(\varvec{u}\) is the displacement of the solid matrix, \(\varvec{g}\) is the gravitational acceleration, \(v = (1 - \phi )^{-1}\) is the specific volume, and \(\varvec{w}\) is the Darcy velocity, while the superposed dot indicates the material time derivative following the solid motion. Another necessary assumption to arrive at this particular model is the incompressibility of the phase constituents, such that the bulk moduli \(K_s = K_w \approx \infty\). This implies that the Biot coefficient is considered equivalent to 1, which leads to a particular form for the stress decomposition: \(\varvec{\sigma } = \varvec{\sigma }' - p_a \varvec{I} + S_w s \varvec{I} = \varvec{\sigma }' - S_w p_w \varvec{I}\), where \(\varvec{\sigma }'\) is the effective stress, \(s = p_a - p_w\) is the suction, and \(p_w\) is the pore water pressure. Here, we adopt a linear elastic effective stress–strain relationship while using a generalized Darcy’s law to relate the Darcy velocity to the pore water pressure:
where \(\varvec{\varepsilon } = (\nabla {\varvec{u}} + \nabla {^{\text {T}} \varvec{u}})/2\) is the infinitesimal strain tensor, E is Young’s modulus, \(\nu\) is Poisson’s ratio, \(k_r \in [0, 1]\) is the relative permeability, \(k_w\) is the intrinsic permeability, and \(\mu _w\) is the dynamic viscosity of water. In this work, we use the porosity–suction–saturation relation discovered via neural network-based symbolic regression (Sect. 3.2), while adopting a power-type relative permeability model \(k_r = S_w^3\) [53, 64] for simplicity.
The initial boundary value problem can be constructed by considering a continuum mixture that occupies a domain \(\mathcal {B}\) whose boundary \(\partial \mathcal {B}\) is composed of Dirichlet boundaries (displacement boundary \(\partial \mathcal {B}_u\) and pore water pressure boundary \(\partial \mathcal {B}_p\)) and Neumann boundaries (traction boundary \(\partial \mathcal {B}_t\) and water flux boundary \(\partial \mathcal {B}_q\)) that satisfies \(\partial \mathcal {B} = \overline{\partial \mathcal {B}_u \cup \partial \mathcal {B}_t} = \overline{\partial \mathcal {B}_p \cup \partial \mathcal {B}_q}\) and \(\emptyset = \partial \mathcal {B}_u \cap \partial \mathcal {B}_t = \partial \mathcal {B}_p \cap \partial \mathcal {B}_q\). Specifically, the prescribed boundary conditions read:
where \(\varvec{n}\) indicates the outward-oriented unit normal on the boundary surface. To close the problem, the initial conditions are imposed as follows:
at time \(t = 0\). Then, we consider the trial spaces \(V_u\) and \(V_p\) for the solution variables as:
where \(\text {n}_\text {dim}\) is the spatial dimension while \(H^1\) indicates the Sobolev space of degree 1. Similarly, the corresponding admissible spaces for Eq. (B7) can be defined as,
By applying the weighted residual procedure, the weak statements for Eqs. (B2) and (B3) are to find \(\lbrace \varvec{u}, p_w \rbrace \in V_u \times V_p\) such that for all \(\lbrace \varvec{\eta }, \xi \rbrace \in V_{\eta } \times V_{\xi }\), \(G_u(\varvec{u}, p_w) = G_p(\varvec{u}, p_w) = 0\), where:
In this study, we adopt the standard Galerkin approach for spatial approximation of the problem while using an implicit backward Euler time integration scheme. The formulation is implemented with the finite element package FEniCS with PETSc scientific computational toolkit [2, 44].
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Suh, H.S., Song, J.Y., Kim, Y. et al. Data-driven discovery of interpretable water retention models for deformable porous media. Acta Geotech. 19, 3821–3835 (2024). https://doi.org/10.1007/s11440-024-02322-y
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11440-024-02322-y