1 Introduction

Water flooding is a major oil production method in the petroleum industry. The idea behind this process, which is generally applied as a secondary phase when the original energy of the reservoir has been exhausted, is to use water injection wells to generate an immiscible piston-like oil bank and push it toward the producers. However, the flow of two immiscible fluids with different viscosities in a porous medium gives rise to the so-called viscous fingering instability. The early work by Buckley and Leverett (1942) pioneered the analysis of fluid displacements in this case. Viscous fingering is an intrinsically nonlinear phenomenon that substantially affects the performance of the recovery. A direct measure of this performance, under the effect of nonlinear instabilities, is the arrival time of the injected fluid at the production well, which is known as the water breakthrough. An early breakthrough time indicates the presence of instabilities and thus also a reduction in the estimated ultimate recovery. These nonlinear instabilities are affected by the reservoir permeability distribution, fluid viscosities, capillary effects, dispersion and gravity. Thus, even when the reservoir is perfectly homogeneous (which is never encountered in practical situations), instabilities can still develop.

After the work of Buckley and Leverett, many researchers advocated the need to study the problem of fluid displacement in porous media by combining theoretical, numerical and experimental approaches. Historically, studies have focused on the recovery efficiency and on the estimation of the breakthrough time. Several studies considered the conditions under which viscous fingering appears, using tools of linear instability analysis, such as those of Koval (1963), Heller (1966), Peters et al. (1984), Chikhliwala and Yortsos (1985) and Guzman and Fayers (1997). Another approach is to numerically simulate the system of equations that governs the dynamics of the system. In this way, solutions to the problem can be obtained for a given permeability field and geometry, thus extending theoretical studies to the full nonlinear case as well as to more realistic scenarios (Jha et al. 2011; Nicolaides et al. 2015; Riaz and Tchelepi 2016). With the growth of computing power, problems tackled using this approach have become larger in both space and time resolution. Pioneering work in this area was done by Peaceman and Rachford (1962), who solved the system of equations using a finite difference scheme. More recent works (Juanes 2005) tackle the problem using multiscale methods. Also, experimental studies have obtained data on the fluid in the porous medium using tomographic techniques [see, e.g., the work by Siddiqui et al. (1996)]. As these experimental studies are hard to perform, direct numerical simulations remain important for studying fluid displacement, as they give access to datasets and information that may be hard to obtain otherwise.

Using data from the numerical simulations, it is possible to apply different reduction techniques to decrease the dimensionality of the problem. The main goals of these techniques are: (1) to identify dominant modes in the dynamics, and to extract relevant information from those modes, (2) to compress the datasets and (3) to build a basis of solutions that can be used to derive a reduced system of equations for the evolution of the most energetic modes, neglecting contributions from modes with lower energy. In general, these techniques have a wide spectrum of applications, which in the oil industry range from seismic data reconstruction (Zhao and Song 2012) and the prediction of instabilities in oil well perforations (Lin et al. 2018) to, more recently, the extraction of principal components from measured data using machine learning techniques. Pioneering work in the application of these so-called empirical mode decompositions to flow in a porous medium was that of Gharbi et al. (1997). In that work, the authors used a Karhunen–Loeve decomposition with the goal of characterizing the fluid displacement in the medium. The study showed that such a decomposition can be useful to identify coherent spatial structures in the flow and that their behavior can be correctly captured by a finite set of modes which define a reduced basis. The same decomposition, in conjunction with machine learning techniques, was applied to study the fingering process in porous media (Smaoui and Gharbi 2000). Other works, such as those of Walton et al. (2013), considered the applicability of radial basis functions. Another (directly related) empirical decomposition is given by the proper orthogonal decomposition (POD), introduced by Berkooz et al. (1993) to identify coherent structures in turbulent flow. After this work, many authors used this technique for a large variety of problems, from the study of solar magnetic fields (Mininni et al. 2004) to the analysis of multiphase displacements in porous media (Ghasemi and Gildin 2015).

In this work, we numerically solve the equations for multiphase flow in a porous medium using the finite volume method, and we study viscous fingering as the flow passes through a fixed number of obstacles in a channel-like geometry. The obstacles were randomly placed in order to trigger the onset of viscous instabilities. We consider two phases (oil and water, to mimic the conditions of water flooding in oil reservoirs), and we vary the ratio of viscosities of the two fluids as well as the number of obstacles. Using these data, we define several characteristic lengths based on the probability density function (PDF) of the flow correlation length. We then perform a POD of the data to obtain an expansion of the datasets into a basis of empirical modes. In practice, PODs are often used to construct a surrogate model to improve efficiency in computationally intensive cases, such as uncertainty quantification and optimization. In this light, the results presented in this work focus on the first step toward the construction of such reduced-order models, showing conditions for the convergence and useful metrics for its characterization. The main objective of this work is thus to characterize the behavior of the empirical modes and their geometrical properties, and in particular, to study the convergence of a truncated series of modes to the actual data as the parameters in the simulations are varied. Our main findings are that: (1) Properly defined correlation lengths, over the entire data as well as over individual modes of the POD, can be used to characterize viscous fingering and to identify the time of its onset. (2) The convergence of the truncated series of POD modes varies non-trivially with the ratio of viscosities, with the case with smaller viscosity ratio (and thus with less viscous fingering) requiring more modes to capture the flow profiles at fixed error, while cases with larger viscosity ratios (and thus with small-scale structures arising from the fingering) require fewer modes. The results have implications for proposed methods to decrease the dimensionality of the problem by deriving reduced dynamical models using truncated series of empirical modes, as the determination of the number of modes required to approximate the solutions with fixed error depends on the speed of convergence of the decomposition.

2 Theory and governing equations

In this work, we solve the saturation and pressure field equations in the absence of gravity (Peaceman 2000), given by

$$\emptyset \frac{{\partial S_{\text{o}} }}{\partial t} = \nabla .\left[ {\left( {\frac{{Kk_{\text{ro}} }}{{\mu_{\text{o}} }}} \right)\nabla p_{\text{o}} } \right] + \frac{{q_{\text{o}} }}{{\rho_{\text{o}} }} ,$$
(1)
$$\emptyset \frac{{\partial S_{\text{w}} }}{\partial t} = \nabla .\left[ {\left( {\frac{{Kk_{\text{rw}} }}{{\mu_{\text{w}} }}} \right)\nabla p_{\text{w}} } \right] + \frac{{q_{\text{w}} }}{{\rho_{\text{w}} }} ,$$
(2)

where \(\emptyset\) is the porosity field; \(K\) is the absolute permeability field; \(S\) is the saturation field; \(k_{\text{r}}\) is the relative permeability field; \(\mu\) is the viscosity; \(p\) is the pressure; \(q\) is the mass injection rate; \(\rho\) is the density; and subscripts w and o represent the water and oil phases, respectively. Note that there is no term associated with compressibility, as there is no gaseous phase in the system. The two saturation fields are related by

$$S_{\text{o}} + S_{\text{w}} = 1.$$
(3)

It is useful to introduce the phase mobility for oil and water as

$$\lambda_{{{\text{o}} }} = \frac{{Kk_{\text{ro}} }}{{\mu_{\text{o}} }},$$
(4)
$$\lambda_{{{\text{w}} }} = \frac{{Kk_{\text{rw}} }}{{\mu_{\text{w}} }}.$$
(5)

We can also define the total mobility \(\lambda_{\text{t}}\) and the difference mobility \(\lambda_{\text{d}}\), respectively, as

$$\lambda_{\text{t}} = \lambda_{\text{o}} + \lambda_{\text{w}} ,$$
(6)
$$\lambda_{\text{d}} = \lambda_{\text{o}} - \lambda_{\text{w}} .$$
(7)

In order to rearrange Eqs. (1) and (2), we define the average pressure \(p_{\text{avg}}\) and the capillary pressure \(p_{\text{c}}\) as

$$p_{\text{avg}} = \left( {p_{\text{o}} + p_{\text{w}} } \right)/2,$$
(8)
$$p_{\text{c}} = p_{\text{o}} - p_{\text{w}} .$$
(9)

We can also define the velocity of each phase as

$$v_{\text{o}} = - \nabla p_{\text{o}} \lambda_{\text{o}} ,$$
(10)
$$v_{\text{w}} = - \nabla p_{\text{w}} \lambda_{\text{w}} .$$
(11)

The total velocity is then

$$v_{\text{t}} = v_{\text{o}} + v_{\text{w}} .$$
(12)

Finally, we can define the volumetric rates of injection per unit of volume \(Q_{\text{w}}\) and \(Q_{\text{o}}\) as

$$Q_{\text{w}} = \frac{{q_{\text{w}} }}{{\rho_{\text{w}} }},\quad Q_{\text{o}} = \frac{{q_{\text{o}} }}{{\rho_{\text{o}} }}.$$
(13)

Equation (3) allows us to reduce the variables to a single saturation profile \(S\), which in our case will correspond to the water saturation field, i.e., \(S = S_{\text{w}}\). Using these new variables, Eqs. (1) and (2) can be written as

$$0 = \nabla .\left( {\lambda_{\text{t}} \nabla p_{\text{avg}} } \right) + \frac{1}{2}\nabla .\left( {\lambda_{\text{d}} \nabla p_{\text{c}} } \right) + Q_{\text{w}} + Q_{\text{o}} ,$$
(14)
$$- \emptyset \frac{\partial S}{\partial t} = \nabla .\left( {f_{\text{w}} \lambda_{\text{o}} \nabla p_{\text{c}} } \right) + \nabla .\left( {f_{\text{w}} \boldsymbol{v}_{\text{t}} } \right) + Q_{\text{w}} ,$$
(15)

where \(f_{\text{w}}\) is the so-called ratio of the wetting phase to total mobility, or fractional flow, and is given by

$$f_{\text{w}} \left( S \right) = \frac{{\lambda_{\text{w}} }}{{\lambda_{\text{w}} + \lambda_{\text{o}} }}.$$
(16)
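As a brief check of how the pressure equation, Eq. (14), arises, one can add Eqs. (1) and (2) and use Eq. (3), so that the time derivatives cancel (for a porosity that does not change in time). Writing the phase pressures in terms of Eqs. (8) and (9) as \(p_{\text{o}} = p_{\text{avg}} + p_{\text{c}} /2\) and \(p_{\text{w}} = p_{\text{avg}} - p_{\text{c}} /2\), the sum of the two flux terms becomes

$$\lambda_{\text{o}} \nabla p_{\text{o}} + \lambda_{\text{w}} \nabla p_{\text{w}} = \left( {\lambda_{\text{o}} + \lambda_{\text{w}} } \right)\nabla p_{\text{avg}} + \frac{1}{2}\left( {\lambda_{\text{o}} - \lambda_{\text{w}} } \right)\nabla p_{\text{c}} = \lambda_{\text{t}} \nabla p_{\text{avg}} + \frac{1}{2}\lambda_{\text{d}} \nabla p_{\text{c}} ,$$

whose divergence, together with the source terms written as in Eq. (13), gives Eq. (14).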

Defining the mobility ratio \(M\) as

$$M = \frac{{\lambda_{\text{o}} }}{{\lambda_{\text{w}} }} = \frac{{k_{\text{ro}} \mu_{\text{w}} }}{{k_{\text{rw}} \mu_{\text{o}} }},$$
(17)

the fractional flow becomes

$$f_{\text{w}} \left( S \right) = \frac{1}{1 + M}.$$
(18)

To solve these equations, we need to specify functions for the relative permeabilities \(k_{\text{ro}}\) and \(k_{\text{rw}}\). It is known that inhomogeneities reduce the performance of oil recovery (Khataniar and Peters 1992) and that fingering instabilities depend on the permeability distribution (Giordano et al. 1995). Based on these studies, in the present study we consider the following dependence of each relative permeability on the saturation

$$k_{\text{ro}} = \left( {1 - S} \right)^{2} ,$$
(19)
$$k_{\text{rw}} = S^{2} .$$
(20)

Although this is an arbitrary parameterization, it is frequently used. From these permeabilities, we can rewrite \(f_{\text{w}}\) as

$$f_{\text{w}} \left( S \right) = \frac{{S^{2} }}{{S^{2} + \left( {1 - S} \right)^{2} /m}},$$
(21)

where \(m\) is the viscosity ratio

$$m = \frac{{\mu_{\text{o}} }}{{\mu_{\text{w}} }}.$$
(22)
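As an illustration of Eqs. (19) to (22), the fractional flow can be evaluated with a few lines of Python. This is only a minimal sketch: the function names `relative_permeabilities` and `fractional_flow` are hypothetical and are not taken from the code used in this work.

```python
def relative_permeabilities(s):
    """Return (k_ro, k_rw) from Eqs. (19) and (20) for water saturation s."""
    return (1.0 - s) ** 2, s ** 2


def fractional_flow(s, m):
    """Fractional flow f_w(S) of Eq. (21) for viscosity ratio m = mu_o / mu_w."""
    k_ro, k_rw = relative_permeabilities(s)
    return k_rw / (k_rw + k_ro / m)


if __name__ == "__main__":
    # Viscosity ratios considered in the simulations described later in the text.
    for m in (5, 10, 15):
        print(m, [round(fractional_flow(i / 4, m), 3) for i in range(5)])
```

At fixed saturation, a larger viscosity ratio gives a larger fractional flow of water, which is the mechanism by which \(m\) enters the saturation equation.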

Moissis et al. (1993) showed that the viscosity ratio plays an important role in controlling viscous fingering. To solve these equations numerically, we use dimensionless quantities based on a characteristic length \(L_{0}\), a characteristic volumetric injection rate \(Q_{0}\) and a characteristic density \(\rho_{0}\). Dimensionless quantities are defined from dimensional quantities (denoted with a tilde) as

$$L_{x} = \frac{{\tilde{L}_{x} }}{{L_{0} }},\quad L_{y} = \frac{{\tilde{L}_{y} }}{{L_{0} }},$$
(23)
$$Q_{\text{w}} = \frac{{\tilde{Q}_{\text{w}} }}{{Q_{0} }},\quad Q_{\text{o}} = \frac{{\tilde{Q}_{\text{o}} }}{{Q_{0} }},$$
(24)
$$\rho = \frac{{\tilde{\rho }}}{{\rho_{0} }},$$
(25)
$$t = \tilde{t}Q_{0} ,$$
(26)
$$p_{\text{avg}} = \frac{{\tilde{p}_{\text{avg}} }}{{\rho_{0} L_{0}^{2} Q_{0}^{2} }},\quad p_{c} = \frac{{\tilde{p}_{\text{c}} }}{{\rho_{0} L_{0}^{2} Q_{0}^{2} }},$$
(27)
$$v_{x} = \frac{{\tilde{v}_{x} }}{{L_{0} Q_{0} }},\quad v_{y} = \frac{{\tilde{v}_{y} }}{{L_{0} Q_{0} }},$$
(28)
$$\lambda_{\text{t}} = \tilde{\lambda }_{\text{t}} \rho_{0} Q_{0} ,\quad \lambda_{\text{d}} = \tilde{\lambda }_{\text{d}} \rho_{0} Q_{0} .$$
(29)

With these choices, dimensional quantities can be obtained by fixing values for \(L_{0}\), \(Q_{0}\) and \(\rho_{0}\), and by multiplying all dimensionless quantities in the next sections by their corresponding factor.
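The conversion back to dimensional quantities amounts to inverting Eqs. (23) to (29). The following sketch collects the corresponding factors; the function name `to_dimensional` and the labels used as dictionary keys are hypothetical, and any numerical values passed for \(L_{0}\), \(Q_{0}\) and \(\rho_{0}\) are the user's choice.

```python
def to_dimensional(kind, value, l0, q0, rho0):
    """Convert a dimensionless quantity to dimensional form by inverting Eqs. (23)-(29)."""
    factors = {
        "length": l0,                          # Eq. (23)
        "rate": q0,                            # Eq. (24)
        "density": rho0,                       # Eq. (25)
        "time": 1.0 / q0,                      # Eq. (26)
        "pressure": rho0 * l0 ** 2 * q0 ** 2,  # Eq. (27)
        "velocity": l0 * q0,                   # Eq. (28)
        "mobility": 1.0 / (rho0 * q0),         # Eq. (29)
    }
    return value * factors[kind]
```

For example, `to_dimensional("time", t, l0, q0, rho0)` returns \(t/Q_{0}\), which is why times in the following sections are quoted in units of \(1/Q_{0}\).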

3 Methods

3.1 Simulations and domain geometry

We solved Eqs. (14) and (15) numerically. To this end, we implemented a code using an implicit pressure, explicit saturation (IMPES) scheme. The effectiveness of this class of methods is discussed in Chen et al. (2004). Spatial discretization was done using a first-order finite volume method (FVM), while time integration was done using a first-order Euler method. The code was written in C and parallelized using OpenMP. To properly capture fine structures associated with the fingering instability, a spatial resolution of \(N_{x} \times N_{y} = 1000 \times 500\) grid points was used (in a two-dimensional channel with 2:1 aspect ratio), and the time step was determined by the CFL condition.
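To illustrate the explicit part of such a scheme, the following Python sketch advances the saturation in a one-dimensional analogue of Eq. (15). It is not the two-dimensional C code used in this work: it assumes zero capillary pressure, unit porosity, no source terms inside the domain and a uniform positive total velocity (so that the implicit pressure solve is trivial), and all function names and default values are hypothetical.

```python
def fractional_flow(s, m):
    """Fractional flow of Eq. (21) for viscosity ratio m."""
    return s * s / (s * s + (1.0 - s) ** 2 / m)


def max_wave_speed(m, v_t, samples=1000):
    """Estimate max |v_t df_w/dS| from secant slopes on a uniform grid in S."""
    ds = 1.0 / samples
    slopes = (
        (fractional_flow((i + 1) * ds, m) - fractional_flow(i * ds, m)) / ds
        for i in range(samples)
    )
    return abs(v_t) * max(slopes)


def explicit_step(s, m, v_t, dx, dt, s_inlet=1.0):
    """One Euler step with first-order upwind finite volume fluxes (v_t > 0)."""
    flux_in = v_t * fractional_flow(s_inlet, m)  # water flux through the inlet face
    updated = []
    for value in s:
        flux_out = v_t * fractional_flow(value, m)  # upwind flux leaving this cell
        updated.append(value - dt / dx * (flux_out - flux_in))
        flux_in = flux_out  # the flux leaving this cell enters the next one
    return updated


def run(n_cells=100, length=1.0, m=10.0, v_t=1.0, t_end=0.2, cfl=0.5):
    """Integrate from S = 0 everywhere until t >= t_end; return (S, elapsed time)."""
    dx = length / n_cells
    dt = cfl * dx / max_wave_speed(m, v_t)  # time step limited by the CFL condition
    s, t = [0.0] * n_cells, 0.0
    while t < t_end:
        s = explicit_step(s, m, v_t, dx, dt)
        t += dt
    return s, t
```

Because the update is written in flux form, the water volume in the domain changes only through the inlet and outlet fluxes, which is the property that makes finite volume discretizations attractive for this kind of problem.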

As mentioned in the Introduction, the geometry considered is that of a channel with no flux in the \(Y\) direction over the horizontal boundaries. In the vertical boundaries, we set the left border as the injection well, and the right border as the production well. At the left border, we set a constant rate of injected water. Thus, the water and oil flowing through the right border adjust to satisfy mass conservation. In all simulations, we used a homogeneous permeability field, except in a region near the injection area where obstacles are present, with the purpose of perturbing the flow and triggering the fingering process. At \(t = 0\), the value of \(S_{\text{w}}\) is zero for the entire reservoir except on the left side boundary and its vicinity, where the saturation profile has a value \(S_{\text{w}} = 1\) at \(x = 0\) for all values of \(y\), and decays smoothly, reaching \(S_{\text{w}} = 0\) at \(x = 0.05L_{0}\). The oil saturation field is thus \(S_{\text{o}} = 1 - S_{\text{w}}\), and as a result, at the beginning of the simulation it is close to \(1\) in most of the domain. It is worth noting that as we are using a homogeneous permeability field in the whole reservoir (with the exception of the obstacles close to the left boundary), the fingering will not be too strong, and we will not observe the stronger patterns that can typically appear in the inhomogeneous case [as an example, for studies of the dependence of the fingering pattern on lateral permeability, see Waggoner et al. (1992)].

As already mentioned, in order to characterize the scale of the fingering (and later, to compare POD decompositions with varying number of obstacles) we set a group of obstacles near the injection well. These obstacles have a circular section, with permeability field inside them equal to zero. The configuration gives a simplified conceptualization of heterogeneity in oil recovery problems, representing low-permeable zones. For practical purposes, this method is similar to introducing a small perturbation in the saturation field. In previous works (Christie and Bond 1987; Araktingi and Orr 1993), it was shown that under certain conditions, such as in the presence of an uncorrelated permeability field, the final behavior of the flow is independent of the method used to excite the instabilities. Thus, we choose the position of the center of the obstacles in the \(X\) direction to be the same for all the obstacles, and we do not vary it between simulations. The distances between the centers of the obstacles in the \(Y\) direction are given by \(L_{0} /\left( {1 + {{obs}}} \right) + \delta_{i,i + 1}\), where \({{obs}}\) is the number of obstacles, and \(\delta_{i,i + 1}\) is a random displacement between the obstacle \(i\) and the obstacle \(i + 1\). The random displacement \(\delta_{i,i + 1}\) has a normal distribution centered around zero, and thus has small values that can be positive or negative. The radius of these obstacles also follows a normal distribution with mean \(r_{\text{mean}} = 0.05L_{0}\). In Fig. 1, we show the detail of the permeability maps used in all simulations. Figure 1a shows the whole computational domain for the simulations with five obstacles, while Fig. 1b–e only shows the permeability maps in the subregions with obstacles, respectively, for the simulations with 1, 2, 3 and 4 obstacles. Instabilities start soon after the injected phase goes through the region with the obstacles.
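The construction of the permeability map can be sketched as follows. The position of the obstacles in the \(X\) direction, the standard deviations of the two normal distributions, the random seed, the grid size and the choice of measuring the first separation from the wall at \(y = 0\) are all assumptions made for illustration, as they are not specified in the text; the function names are hypothetical.

```python
import random


def place_obstacles(n_obs, l0=1.0, x_center=0.1, r_mean=0.05,
                    r_std=0.005, gap_std=0.01, seed=0):
    """Return a list of (x, y, radius) for n_obs circular obstacles.

    x_center, r_std, gap_std and seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    spacing = l0 / (1 + n_obs)  # mean separation in the Y direction
    obstacles, y = [], 0.0
    for _ in range(n_obs):
        y += spacing + rng.gauss(0.0, gap_std * l0)  # random displacement delta
        radius = rng.gauss(r_mean * l0, r_std * l0)
        obstacles.append((x_center * l0, y, radius))
    return obstacles


def permeability_map(obstacles, nx=200, ny=100, lx=2.0, ly=1.0):
    """Return K[i][j] at cell centers: 0 inside an obstacle, 1 elsewhere."""
    dx, dy = lx / nx, ly / ny
    k = [[1.0] * ny for _ in range(nx)]
    for i in range(nx):
        for j in range(ny):
            x, y = (i + 0.5) * dx, (j + 0.5) * dy
            if any((x - ox) ** 2 + (y - oy) ** 2 <= r * r for ox, oy, r in obstacles):
                k[i][j] = 0.0
    return k
```

Fixing the seed makes each permeability map reproducible, so that runs with different viscosity ratios can share the same obstacle configuration.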

Fig. 1

a Permeability map in the whole computational domain XY for a simulation with five obstacles. On the left side, the obstacles with circular shape and zero permeability can be seen. In the rest of the domain, the permeability is uniform and equal to one. b–e show only the subregion containing the obstacles, respectively, for simulations with 1, 2, 3 and 4 obstacles

In different simulations below, the parameters that we vary are the number of obstacles (\({{obs}}\)) and the viscosity ratio \(m\). For the viscosity ratio, we use either \(m = 5\), \(m = 10\) or \(m = 15\). For each of these values of \(m\), we then perform five simulations with 1, 2, 3, 4 or 5 obstacles. The incoming flow has a fixed value \(Q_{\text{w}} = Q_{0}\), and the capillary pressure \(p_{\text{c}}\) is zero. As mentioned above, the permeability \(K\) is equal to unity in the entire reservoir except in the obstacles. The porosity \(\emptyset\) is also equal to unity in the entire reservoir. Finally, the viscosity of the water is set to \(\mu_{\text{w}} = 1\) (in dimensionless units as discussed above). The data from the simulation are saved at regular intervals of \(\Delta t = 0.2/Q_{0}\). A full list of the simulations and of the parameters we vary is shown in Table 1, where ‘Name’ indicates the label used for each simulation, m is the viscosity ratio, and ‘obs’ indicates the number of obstacles distributed randomly in each run. Note the simulations are separated in three groups, each with the same value of m and varying number of obstacles.
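The parameter space described above can be enumerated in a few lines. In this sketch the assignment of labels to parameter pairs is an assumption: it is inferred from the run labels quoted in the figure captions (runs I, VI and XI with one obstacle; runs III, VIII and XIII with three obstacles; runs VI to X with \(m = 10\)), and the names `RUNS` and `build_runs` are hypothetical.

```python
VISCOSITY_RATIOS = (5, 10, 15)
OBSTACLE_COUNTS = (1, 2, 3, 4, 5)
ROMAN = ("I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX", "X",
         "XI", "XII", "XIII", "XIV", "XV")


def build_runs():
    """Map each run label to its (m, obs) pair, grouped by viscosity ratio."""
    pairs = [(m, obs) for m in VISCOSITY_RATIOS for obs in OBSTACLE_COUNTS]
    return dict(zip(ROMAN, pairs))


RUNS = build_runs()
```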

Table 1 List of the parameters used for different runs considered in this study

3.2 Proper orthogonal decomposition

The POD (Berkooz et al. 1993) has the main goal of projecting a high-dimensional dataset (usually obtained from numerical simulations, field observations or experiments) onto an optimal basis of orthogonal modes. The expansion is optimal in the sense that the modes are ordered in decreasing energy (where the energy is associated with the power contained in that mode), and with the fastest possible convergence. As the basis obtained from the POD is empirical and specific for a given dataset, the POD is useful to identify coherent structures and to find bifurcations in the dynamics of the problem. In particular, the POD has often been used as a first step to construct reduced systems of ordinary differential equations from a set of partial differential equations (PDEs), by doing a Galerkin projection of the PDEs onto the empirical basis truncated to a few modes.

Given a scalar field \(u\left( {x,t} \right)\), we can decompose it into orthonormal modes using the POD. As a result, we obtain a basis of spatial modes \(u^{\prime}_{i} \left( x \right)\), and a set of orthogonal temporal coefficients \(a_{i} \left( t \right)\), such that

$$u\left( {x,t} \right) = \mathop \sum \limits_{i = 1}^{N} a_{i} \left( t \right)u^{\prime}_{i} \left( x \right).$$
(30)

As already mentioned, the scalar field \(u\left( {x,t} \right)\) is usually obtained from experiments or simulations. In our case, the dataset will be the result of the numerical simulations described in Sect. 3.1. For the moment, we can think of the scalar field as a discrete set of arrays

$$u(x, t) = \{ u^{1} , u^{2} , u^{3} , \ldots , u^{N} \} ,\quad {\text{with}}\;u^{i} \in {\mathbb{R}}^{{N_{x} \times N_{y} }} ,$$
(31)

where \(u^{i} \left( {i = 1, 2, 3, \ldots ,N} \right)\) are snapshots of the scalar field at fixed times, and \(N\) is the total number of time steps in the dataset. Given \(u\left( {x,t} \right)\), we can build a linear operator \(u:H(X) \to H(T)\) (with \(H\) a Hilbert space) as

$$\forall \varphi \in H\left( X \right),\quad \left( {u\varphi } \right)\left( t \right) = \int_{X} u\left( {x,t} \right)\varphi \left( x \right){\text{d}}x.$$
(32)

The adjoint operator \(u^{*}:H(T) \to H(X)\) becomes

$$\forall \psi \in H\left( T \right),\quad \left( {u^{*}\psi } \right)\left( x \right) = \int_{T} \overline{{u\left( {x,t} \right)}} \psi \left( t \right){\text{d}}t,$$
(33)

where the overline denotes the complex conjugate. Assuming \(u\) is a compact operator, we can rewrite these relations as an eigenvalue problem for eigenfunctions \(\varphi_{k}\) and \(\psi_{k}\), both with eigenvalue \(\alpha_{k}^{2}\), given by

$$\begin{aligned} \left( {u^{*}u\varphi_{k} } \right)\left( x \right) & = \alpha_{k}^{2} \varphi_{k} \left( x \right), \\ \left( {uu^{*}\psi_{k} } \right)\left( t \right) & = \alpha_{k}^{2} \psi_{k} \left( t \right). \\ \end{aligned}$$
(34)

The scalar field \(u\left( {x,t} \right)\) can then be expanded in terms of the solutions to Eq. (34) as

$$u\left( {x,t} \right) = \mathop \sum \limits_{k = 1}^{N} \alpha_{k} \overline{{\varphi_{k} \left( x \right)}} \psi_{k} \left( t \right).$$
(35)

Here, \(\alpha_{k}\) are the amplitudes of each mode (with \(\alpha_{k}^{2}\) associated with their energies), and \(\varphi_{k} \left( x \right)\) and \(\psi_{k} \left( t \right)\) are the empirical orthogonal modes which spectrally decompose \(u\left( {x,t} \right)\). The spatial modes \(\varphi_{k} \left( x \right)\) are usually called the ‘topos,’ while the temporal eigenfunctions \(\psi_{k} \left( t \right)\) are known as ‘cronos.’

To apply this decomposition to experimental or numerical saturation data (which are discrete in space and in time), we can use the so-called snapshot method as described by Sirovich (1987), which in practice is a decomposition in the time domain. In this work, the saturation field \(S\) corresponds to the results of the numerical simulations described in Sect. 3.1, saved as \(N\) snapshots \(S^{i}\) at regular time intervals \(\Delta t\). Thus, Eq. (31) can be rewritten as

$$S\left( {x,y, t} \right) = \left\{ { S^{1} , S^{2} , S^{3} , \ldots , S^{N} } \right\} ,$$
(36)

where \(S^{i} = S^{i} \left( {x,y} \right)\) (i.e., it is the saturation at fixed time and as a function of the two spatial coordinates). The base saturation is then defined as the average of the ensemble,

$$S_{0} \left( {x,y} \right) = \frac{1}{N}\mathop \sum \limits_{i = 1}^{N} S^{i} \left( {x,y} \right).$$
(37)

We can use the base saturation to rewrite each of the snapshots \(S^{i}\) as a mean field plus fluctuations, as

$$S^{i} \left( {x,y} \right) = S_{0} \left( {x,y} \right) + \mathop {S^{i} }\nolimits^{{\prime }} \left( {x,y} \right).$$
(38)

We then define the correlation matrix between any two snapshots p and q as

$$C_{pq} = \frac{1}{N}\left\langle {\mathop {S^{p} }\nolimits^{{\prime }} \left( {x,y} \right) \mathop {S^{q} }\nolimits^{{\prime }} \left( {x,y} \right)} \right\rangle ,$$
(39)

where the brackets denote the average over space. Using this matrix, the equations above are equivalent to solving the eigenvalue problem given by

$$\varvec{Cb}^{i} = \lambda_{i} \varvec{b}^{i} .$$
(40)

The eigenvectors of this equation are the temporal coefficients of the expansion \(\varvec{b}^{i} = \left( {b_{1}^{i} , \ldots ,b_{N}^{i} } \right)\), which are equivalent to the coefficients \(a_{i} \left( t \right)\) defined above. The spatial modes can then be obtained as

$$\varphi_{i} \left( {x,y} \right) = \frac{1}{{N\lambda_{i} }}\mathop \sum \limits_{k = 1}^{N} b_{k}^{i} \left( {S^{k} \left( {x,y} \right) - S_{0} } \right).$$
(41)

The equivalent of Eq. (30), giving the reconstruction of the saturation field at any point and at any time, finally becomes

$$S\left( {x,y,t} \right) = \mathop \sum \limits_{i = 1}^{N} b^{i} \left( t \right)\varphi_{i} \left( {x,y} \right),$$
(42)

where for brevity we replaced the index k (corresponding to the discrete time snapshot) by the time label t. The larger the number of modes we use in the reconstruction, the more accurate it will be.
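The snapshot procedure of Eqs. (37) to (41) can be sketched in pure Python as follows. Several choices here are assumptions made for illustration: the eigenvalue problem of Eq. (40) is solved with a simple power iteration with deflation (adequate only for small examples with well-separated eigenvalues), the eigenvectors are normalized so that \(|\varvec{b}^{i}|^{2} = N\lambda_{i}\) (which makes the spatial modes orthonormal under the space average), and the mean field \(S_{0}\) is added back explicitly in the reconstruction because the modes are built from the fluctuations of Eq. (38). The function names are hypothetical.

```python
import math


def snapshot_pod(snapshots, n_modes, iterations=200):
    """Snapshot POD of N flattened fields; returns (mean, eigenvalues, coeffs, modes)."""
    n, size = len(snapshots), len(snapshots[0])
    mean = [sum(s[j] for s in snapshots) / n for j in range(size)]      # Eq. (37)
    fluct = [[s[j] - mean[j] for j in range(size)] for s in snapshots]  # Eq. (38)
    # Correlation matrix of Eq. (39); the bracket is a mean over grid points.
    corr = [[sum(a * b for a, b in zip(fluct[p], fluct[q])) / (size * n)
             for q in range(n)] for p in range(n)]
    eigenvalues, coeffs, modes = [], [], []
    for _ in range(n_modes):
        b = [1.0 + 0.1 * k for k in range(n)]  # deterministic start vector
        for _step in range(iterations):        # power iteration on the deflated matrix
            w = [sum(corr[p][q] * b[q] for q in range(n)) for p in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            b = [x / norm for x in w]
        cb = [sum(corr[p][q] * b[q] for q in range(n)) for p in range(n)]
        lam = sum(x * y for x, y in zip(b, cb)) / sum(x * x for x in b)  # Eq. (40)
        if lam <= 1e-14:
            break  # remaining modes carry negligible energy (absolute threshold)
        for p in range(n):                     # deflation: remove the converged eigenpair
            for q in range(n):
                corr[p][q] -= lam * b[p] * b[q]
        scale = math.sqrt(n * lam)             # assumed normalization |b|^2 = N * lambda
        b_scaled = [scale * x for x in b]
        phi = [sum(b_scaled[k] * fluct[k][j] for k in range(n)) / (n * lam)
               for j in range(size)]           # Eq. (41)
        eigenvalues.append(lam)
        coeffs.append(b_scaled)
        modes.append(phi)
    return mean, eigenvalues, coeffs, modes


def reconstruct(mean, coeffs, modes, k, n_modes):
    """Approximate snapshot k from the mean field plus the first n_modes modes."""
    field = list(mean)
    for b, phi in zip(coeffs[:n_modes], modes[:n_modes]):
        for j in range(len(field)):
            field[j] += b[k] * phi[j]
    return field
```

Comparing `reconstruct(..., n_modes)` with the original snapshot for increasing `n_modes` gives the truncation error whose convergence is studied in this work. For datasets of realistic size, the power iteration would normally be replaced by a library eigensolver for symmetric matrices, such as `numpy.linalg.eigh`.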

Once the decomposition is obtained in this way, it can also be used to project the governing equations into a few modes (the modes that dominate the dynamics), allowing for a reduction in the dimensionality of the problem. In other words, the set of PDEs in Sect. 2 can be truncated to a finite number of ordinary differential equations. In the Appendix, we describe the procedure used for that purpose.

4 Analysis

4.1 Global evolution

Figure 2a shows a typical saturation profile for \(m = 5\) and one obstacle at time \(t = 53/Q_{0}\) (run I). Water is injected from the left, pushing the oil phase to the right. As the water phase passes through the obstacles, it can develop a viscous fingering instability which, in all runs, appears for times larger than \(t \approx 17/Q_{0}\). A vertical cut across any fixed value of \(L_{x}\) to the right of the obstacle then results in a saturation profile which is inhomogeneous and displays alternating maxima and minima.

Fig. 2

a Water saturation profile for m = 5 and one obstacle at time \(t = 53/Q_{0}\) (run I). Light colors indicate maximum value of saturation while dark colors indicate a minimum value. As an illustration, white and gray arrows along a vertical cut of Sw (water cut) show regions with saturation above and below the average. b A vertical profile of the saturation along the vertical line indicated in a. Gray and black arrows correspond to the same regions as, respectively, gray and white arrows in a

With the aim of characterizing the development of viscous fingering along the flow direction, we can look at characteristic scales in the flow and at their time evolution. A usual definition for the characteristic scale of a flow is given by the correlation length \(L_{\text{corr}}^{S}\), which here we compute in the \(Y\) direction. Simply put, the length \(L_{\text{corr}}^{S}\) is obtained from the correlation function of the saturation field \(S = S_{\text{w}}\), as a function of \(y\) and averaged over all values of \(x\). Figure 3 shows this length for simulations with \(m = 10\) and varying number of obstacles. As expected, the correlation length is dominated by the large-scale structures. While at early times the behavior of \(L_{\text{corr}}^{S}\) is the same for all runs (before the water phase passes through the obstacles), at later times \(L_{\text{corr}}^{S}\) depends on the number of obstacles, reaching a value that decreases with increasing number of obstacles. To have a better understanding of how the correlation length varies as a function of the number of obstacles, we average \(L_{\text{corr}}^{S}\) over time using a temporal range over which it remains approximately constant (see the squared region in Fig. 3). Figure 4 shows the values obtained from the average, \(\langle L_{\text{corr}}^{S} \rangle\), as a function of the mean distance between obstacles, for \(m = 5\), \(m = 10\) and \(m = 15\). For large distances between obstacles (i.e., for few obstacles in the domain), \(\langle L_{\text{corr}}^{S} \rangle\) converges to the same value independently of \(m\), confirming that \(\langle L_{\text{corr}}^{S} \rangle\) is dominated by the contribution of the large-scale structure in the flow associated with the distribution of the obstacles. However, for larger numbers of obstacles (or, equivalently, for smaller distances between obstacles) \(\langle L_{\text{corr}}^{S} \rangle\) also depends on the value of \(m\) (taking smaller values for larger values of \(m\)) and thus appears to become sensitive to the contribution of the small-scale structures associated with the fingering instability.
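One possible way of computing such a correlation length from a single snapshot is sketched below. The precise definition is an assumption, since the text only states that the length is obtained from the correlation function in \(y\) averaged over \(x\): here the mean along \(y\) is removed from each column, the autocorrelation is estimated at each lag, and the normalized correlation function is integrated up to its first zero crossing. The function name is hypothetical.

```python
def correlation_length(field, dy):
    """Integral correlation length along y of field[i][j] (i: x index, j: y index)."""
    ny = len(field[0])
    corr = [0.0] * ny
    for column in field:
        mean = sum(column) / ny
        fluct = [v - mean for v in column]
        for r in range(ny):
            # unbiased estimate of the autocorrelation at lag r for this column
            corr[r] += sum(fluct[j] * fluct[j + r] for j in range(ny - r)) / (ny - r)
    if corr[0] <= 0.0:
        return 0.0  # no fluctuations: the correlation function is undefined
    rho = [c / corr[0] for c in corr]  # the average over x cancels in this ratio
    length, r = 0.0, 0
    while r + 1 < ny and rho[r + 1] > 0.0:
        length += 0.5 * (rho[r] + rho[r + 1]) * dy  # trapezoidal rule
        r += 1
    return length
```

With this definition, a field dominated by structures of a given wavelength has a correlation length proportional to that wavelength, so halving the typical separation between structures roughly halves the result.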

Fig. 3

Correlation length \(L_{\text{corr}}^{S}\) for different numbers of obstacles for \(m = 10\) (for runs VI, VII, VIII, IX and X) and as a function of time. The squared region indicates the area where we average each curve

Fig. 4

Average correlation length \(\langle L_{\text{corr}}^{S} \rangle\) as a function of the mean distance between obstacles \(L_{0} /{{obs}}\)

This correlation length will be useful to identify the characteristic scale of each POD mode in Sect. 4.2. However, to study the fingering instability from global data, we need a different definition for a characteristic scale that is more sensitive to the small-scale correlations associated with the viscous fingers. We thus introduce a ‘cross-length’ \(L_{\text{cross}}^{S}\), also in the \(Y\) direction and over the saturation field \(S\). We will see that this quantity is more useful to quantify the size of small-scale structures, and therefore, to identify the onset of the instability. The length \(L_{\text{cross}}^{S}\) is defined as the distance we have to move in the \(Y\) direction to cross a given value of the saturation. The detailed procedure is as follows: (1) At each time, we compute the mean saturation \(\bar{S}\). (2) For each value of \(x\), we move across the \(Y\) direction and compute the distance between all points with \(S = \bar{S}\), as shown in Fig. 2b (i.e., we compute the distance between ‘crossings’ of \(S\) with its mean value). (3) The process is repeated for different times in the simulation. (4) Finally, we compute the PDF of all these distances. From the PDF, we can compute different moments (as, e.g., the mean cross-length, its median or its deviation).
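The four steps above can be sketched as follows. This is an illustration under stated assumptions rather than the implementation used in this work: distances are taken between consecutive crossings along each vertical cut, crossings are located by linear interpolation between grid points, and the PDF is estimated with a uniform histogram whose number of bins is arbitrary. The function names are hypothetical.

```python
from statistics import median


def cross_lengths(field, dy):
    """Distances along y between consecutive crossings of the field with its mean.

    field[i][j] holds the saturation at x index i and y index j for one snapshot.
    """
    ny = len(field[0])
    mean = sum(sum(column) for column in field) / (len(field) * ny)  # step (1)
    distances = []
    for column in field:                                             # step (2)
        crossings = []
        for j in range(ny - 1):
            d0, d1 = column[j] - mean, column[j + 1] - mean
            if (d0 >= 0.0) != (d1 >= 0.0):
                # linear interpolation of the crossing between points j and j + 1
                crossings.append((j + d0 / (d0 - d1)) * dy)
        distances.extend(b - a for a, b in zip(crossings, crossings[1:]))
    return distances


def cross_length_statistics(snapshots, dy, bins=20):
    """Pool the cross-lengths of all snapshots; return (median, histogram PDF)."""
    pooled = [d for field in snapshots for d in cross_lengths(field, dy)]  # step (3)
    if not pooled:
        return None, []
    width = max(pooled) / bins
    counts = [0] * bins
    for d in pooled:
        counts[min(int(d / width), bins - 1)] += 1
    pdf = [c / (len(pooled) * width) for c in counts]                      # step (4)
    return median(pooled), pdf
```

Calling `cross_length_statistics` on the snapshots of a single time, or of a short time window, gives the time-dependent statistics of the cross-length.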

The PDFs of \(L_{\text{cross}}^{S}\) for simulations with three obstacles and viscosity ratios \(m = 5\), \(m = 10\) and \(m = 15\) are shown in Fig. 5. The arrows in each panel indicate the separation distance between obstacles, as well as between the obstacles and the wall of the computational domain (note that obstacles are distributed randomly, so the separation is not the same for all obstacles). The peaks in the squared region remain approximately the same as \(m\) is varied; they correspond to large values of \(L_{\text{cross}}^{S}\), and they lie close to the distance between obstacles. Thus, we can conclude that this part of the PDF of \(L_{\text{cross}}^{S}\) is associated with large-scale geometrical features in our domain. However, for smaller values of \(L_{\text{cross}}^{S}\) (see the white region on the left of Fig. 5) the shape of the PDF changes as \(m\) is increased. In particular, for larger values of \(m\), the PDFs display a broader peak around \(L_{\text{cross}}^{S} \approx 0.05/{obs}\) (where \({{obs}}\) is the number of obstacles). As the runs with larger values of \(m\) display more viscous fingering, we will see that these changes in the peak of the PDF are associated with the growth of small-scale structures in the flow, resulting from the fingering instability.

Fig. 5

Probability density function (PDF) of the cross-length \(L_{\text{cross}}^{S}\) for simulations with three obstacles, and with different values of m: a \(m = 5\), b \(m = 10\) and c \(m = 15\) (i.e., respectively, for runs III, VIII and XIII). The squared region indicates values of the cross-length (and in particular, two groups of peaks in the histograms) associated with the problem geometry. The arrows indicate the sizes of the regions between the obstacles

From the observation in Fig. 5 that a part of the PDF of \(L_{\text{cross}}^{S}\) is sensitive to the value of \(m\), we can build a characteristic length associated with the instability as the median of \(L_{\text{cross}}^{S}\), denoted as \(\overline{{L_{\text{cross}}^{S} }}\). Figure 6 shows this quantity as a function of time, first for simulations with fixed \(m = 10\) and with different numbers of obstacles in Fig. 6a. The median of the cross-length decreases with the number of obstacles, just as the correlation length does. However, a sudden drop can also be seen at intermediate times in all cases. Figure 6b shows a detail of the median for simulations with just one obstacle, with \(m = 5\), \(m = 10\) and \(m = 15\). In the simulation with \(m = 5\), \(\overline{{L_{\text{cross}}^{S} }}\) remains constant in time once the flow of water passes through the obstacles. However, for all other values of \(m\), \(\overline{{L_{\text{cross}}^{S} }}\) again shows a sudden drop in its value at later times. Visual exploration of the simulations indicates that the time of the drop corresponds to the onset of the viscous fingering instability. This can be understood, as small-scale correlations appear in the data once the fingering starts to grow (see Fig. 7).

Fig. 6

Median of the cross-length, \(\overline{{L_{\text{cross}}^{S} }}\), as a function of time, in a simulations with different numbers of obstacles and \(m = \text{10}\) (runs VI, VII, VIII, IX and X) and b simulations with only one obstacle and three different values of m (runs I, VI and XI). The shaded regions in b indicate the moment when \(\overline{{L_{\text{cross}}^{S} }}\) suddenly falls for the two simulations with viscous fingering

Fig. 7

a Saturation profiles for \(m = 10\) and one obstacle (run VI). The dotted squared region indicates a region where the fingering phenomenon is starting. b Detail of the dotted squared region in a at \(t_{0}\). c Later evolution of the area shown in b at \(t_{1}\), when the fingering has developed. The times \(t_{0}\) and \(t_{1}\) are defined in the text

To further illustrate this effect, the shaded regions in Fig. 6b define two times for each simulation: the left boundary of the region, corresponding to the beginning of the drop in \(\overline{{L_{\text{cross}}^{S} }}\), defines a time \(t_{0}\), and the right boundary defines another time \(t_{1}\). Figure 7 shows the saturation field at these two times in the simulation with one obstacle and \(m = 10\). It can be seen in this figure that structures associated with fingering start to appear at time \(t_{0}\) and have developed by \(t_{1}\).

4.2 POD analysis

Applying the POD method to the simulations, we can decompose them into orthogonal modes, study each mode separately and reconstruct the saturation field using a reduced number of modes (both topos and chronos), as discussed in the Introduction and in Sect. 3.2. The elements of the basis, the spatial functions \(\varphi_{i}\) and the temporal coefficients \(\varvec{b}^{i}\), were obtained using the snapshot method from the correlation matrix \(C_{pq}\). Figure 8 shows a simulation with \(m = 10\) for one obstacle, together with reconstructions using 10, 20 and 30 modes, respectively. As expected, if we increase the number of modes, the reconstruction is more accurate.
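As an illustration of the snapshot method mentioned above, the following is a minimal sketch in Python with NumPy. It assumes that each snapshot of the saturation field is flattened into one row of a matrix, that the correlation matrix is normalized by the number of snapshots, and that no temporal mean is subtracted; these conventions, as well as the function names, are assumptions made here and may differ from those of Sect. 3.2.

```python
import numpy as np

def snapshot_pod(snapshots):
    """Snapshot POD of an array of shape (n_times, n_points).

    Returns the eigenvalues of the correlation matrix (mode energies),
    the orthonormal spatial modes (topos, one per row) and the temporal
    coefficients (chronos, one column per mode).
    """
    A = np.asarray(snapshots, dtype=float)
    C = A @ A.T / A.shape[0]                # correlation matrix C_pq between snapshots
    eigvals, eigvecs = np.linalg.eigh(C)    # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]       # sort modes by decreasing energy
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > 1e-12 * eigvals[0]     # drop numerically null modes
    eigvals, eigvecs = eigvals[keep], eigvecs[:, keep]
    topos = eigvecs.T @ A                   # spatial modes as combinations of snapshots
    topos /= np.linalg.norm(topos, axis=1, keepdims=True)
    chronos = A @ topos.T                   # projection of each snapshot on each topo
    return eigvals, topos, chronos

def reconstruct(topos, chronos, k):
    """Reconstruction of all snapshots using only the first k modes."""
    return chronos[:, :k] @ topos[:k]
```

With this convention, `reconstruct(topos, chronos, k)[p]` is the truncated approximation to snapshot `p`, which can be reshaped to the original grid for comparison with the simulation.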

Fig. 8

Saturation profiles at time \(t = 48/Q_{0}\) when the fingering process is developed for runs with one obstacle and \(m = 10\) (run VI): a shows the profile given by the simulation, while b–d are reconstructions using different numbers of modes: 10, 20 and 30 modes, respectively

We need a criterion to estimate how accurate the reconstruction is for a given number of modes. To this end, we calculate the energy spectrum as a function of the mode number (i.e., the energy contained in each POD mode). Figure 9a shows this normalized spectrum for runs with \(m = 5\), \(m = 10\) and \(m = 15\) as a function of the mode number \(k\), for simulations with three obstacles. To estimate the error made in the reconstruction as a function of \(k\), we then calculated the cumulative energy up to the \(k\)th mode. This energy quantifies the fraction of the total energy captured by a reconstruction with \(k\) modes, while the residual is the energy lost; it is a usual measure of the quality of the reconstruction in PODs. The result is shown in Fig. 9b. The green line indicates, as a reference, a fixed value of energy of \(0.98 E_{0}\) (where \(E_{0}\) is the total energy in the original field). This means that, in this case, we need five modes to reconstruct (up to an error in the energy of 2%) the simulation with \(m = 15\). Note that the larger the viscosity ratio \(m\), the smaller the number of modes we need to attain the same accuracy. The results shown in Fig. 9 were obtained for simulations with three obstacles; however, the results are similar regardless of the number of obstacles.
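The cumulative-energy criterion can be sketched as follows. The sketch assumes that the energy of each mode is the squared singular value of the snapshot matrix (equivalent, up to normalization, to the eigenvalues of the correlation matrix), and the function names are hypothetical. Under this assumption, the mean squared error of a truncated reconstruction, normalized by the mean squared amplitude of the field, equals the residual energy fraction, so both measures carry the same information.

```python
import numpy as np

def pod_energy(snapshots):
    """Fraction of the total energy E_0 contained in each POD mode."""
    A = np.asarray(snapshots, dtype=float)
    s = np.linalg.svd(A, compute_uv=False)  # singular values, in descending order
    energy = s**2
    return energy / energy.sum()

def modes_needed(energy_fraction, threshold=0.98):
    """Smallest number of modes whose cumulative energy reaches the threshold."""
    cumulative = np.cumsum(energy_fraction)
    # searchsorted gives the first index where cumulative >= threshold
    return int(np.searchsorted(cumulative, threshold) + 1)

def normalized_mse(field, reconstruction):
    """Mean squared error normalized by the mean squared amplitude of the field."""
    field = np.asarray(field, dtype=float)
    return np.mean((field - reconstruction) ** 2) / np.mean(field**2)
```

For a snapshot matrix `A`, `modes_needed(pod_energy(A))` gives the number of modes required to retain 98% of the energy under these assumptions.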

Fig. 9

a Energy of each mode as a function of the mode number, for three different values of the viscosity ratio, and for runs with three obstacles. b Cumulative energy as a function of the number of modes used in the reconstruction. The horizontal green line indicates a value of 98% of \(E_{0}\) only as a reference. The mean squared error normalized by the mean squared amplitude of the concentration for the cases with \(m = 10\) is 0.00443 when five modes are used for the reconstruction, 0.00075 when 20 modes are used and 0.00041 when 30 modes are used. For a fixed number of modes (10 modes in the reconstruction), the normalized mean squared error is 0.00332 for \(m = 5\), 0.00194 for \(m = 10\) and 0.00038 for \(m = 15\)

To explicitly show the dependence on \(m\) of the number of modes needed to reach 98% of \(E_{0}\), we plotted this number in Fig. 10 as a function of \(m\). The decaying behavior as \(m\) grows is counter-intuitive. We know that when \(m\) increases, the structures associated with viscous fingering also grow, and that more small-scale features are present in the flow. Thus, a larger number of POD modes could be expected to be required to reconstruct the field with a prescribed error when \(m\) is increased, unlike what is observed in the figure.

Fig. 10

Number of modes needed to reconstruct the field of water saturation \(S\) with \(0.98E_{0}\), for \(m = 5\), 10 and 15. The values on this graph are independent of the number of obstacles

The reason for this behavior is as follows: As \(m\) increases, the number and amplitude of the structures associated with the fingering indeed increase. However, when \(m\) increases, the gradient separating the water phase and the oil phase also becomes smoother. This permits a better reconstruction using fewer modes. Figure 11 shows a profile of the advancing front for simulations with one obstacle and with \(m = 5\), \(m = 10\) and \(m = 15\), respectively. Each panel shows the advancing front in the simulation, and the reconstruction of the same front performed with 8, 16, 32 and 64 modes. We can see that as the number of modes increases, the reconstruction becomes a better approximation to the advancing front in the simulation. However, for the sharper front in the simulation with \(m = 5\), more modes are required to get a reasonable approximation: While the reconstruction with 64 modes in the simulation with \(m = 15\) is very close to the actual profile, the same number of modes displays strong fluctuations near the front in the \(m = 5\) case.
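The effect of the front sharpness on the convergence of the POD can be illustrated with a toy problem that is not one of the simulations in this work: a one-dimensional front with a hyperbolic tangent profile advancing at constant speed. The profile shape, the widths, the grid sizes and the function names below are all assumptions chosen only for illustration.

```python
import numpy as np

def front_snapshots(width, n_x=400, n_t=100):
    """Synthetic 1D front of given width moving from x = 0.2 to x = 0.8."""
    x = np.linspace(0.0, 1.0, n_x)
    x_front = np.linspace(0.2, 0.8, n_t)
    # saturation close to 1 behind the front and close to 0 ahead of it
    return 0.5 * (1.0 - np.tanh((x[None, :] - x_front[:, None]) / width))

def modes_for_energy(snapshots, threshold=0.98):
    """Number of POD modes needed to capture the given energy fraction."""
    s = np.linalg.svd(snapshots, compute_uv=False)
    cumulative = np.cumsum(s**2) / np.sum(s**2)
    return int(np.searchsorted(cumulative, threshold) + 1)

n_sharp = modes_for_energy(front_snapshots(width=0.01))   # sharp front
n_smooth = modes_for_energy(front_snapshots(width=0.1))   # smooth front
print("modes needed (sharp, smooth):", n_sharp, n_smooth)
```

In this toy setting, the sharper front requires more modes than the smooth one to reach the same energy fraction, in qualitative agreement with the trend described above for the smaller values of \(m\).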

Fig. 11

Horizontal profile of the advancing front of the saturation field. The continuous line corresponds to the profile obtained from the simulation, while the others are reconstructions with different numbers of modes (see inset): a case with \(m = 5\) (run V), b \(m = 10\) (run X) and c \(m = 15\) (run XV). On the \(X\) axis, \(x_{0}\) is the position of the front at time \(t = 20/Q_{0}\), such that the front is centered in the middle of the figure in all cases

In the same way as we calculated the correlation length over the saturation field, \(L_{\text{corr}}^{S}\), we can now calculate the correlation length over each topo of a given decomposition, \(L_{\text{corr}}^{t}\). Figure 12 shows the correlation length in the \(Y\) direction as a function of the mode number \(k\) (i.e., for the \(k\)th topo), for different numbers of obstacles and for \(m = 15\). We can see that this correlation length is also ordered by the number of obstacles, decreasing as the number of obstacles increases. Between \(k = 50\) and \(k = 80\), and especially for low numbers of obstacles, the function is compatible with a power law, which suggests that for a range of scales the viscous fingering process may become independent of the scale and self-similar. A certain scale independence of the process can be expected from the dynamic equations in Sect. 2, which, when made dimensionless using the length \(L_{0}\) and all other units in Eqs. (23) to (29), are not explicitly dependent on the length \(L_{0}\) nor on any dimensionless number based on \(L_{0}\).
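One simple way to check the compatibility with a power law in a given range of \(k\) is a least-squares fit of a straight line in log-log coordinates. The sketch below assumes hypothetical input arrays `k` and `L_corr` (the mode numbers and the correlation length of each topo), and uses the window between \(k = 50\) and \(k = 80\) quoted above as the default; it is not necessarily how the reference slope in this work was chosen.

```python
import numpy as np

def fit_power_law(k, L_corr, k_min=50, k_max=80):
    """Fit L_corr ~ prefactor * k**exponent for k_min <= k <= k_max.

    Returns (exponent, prefactor) from a linear fit of log(L_corr)
    against log(k); non-positive values of L_corr are excluded.
    """
    k = np.asarray(k, dtype=float)
    L_corr = np.asarray(L_corr, dtype=float)
    mask = (k >= k_min) & (k <= k_max) & (L_corr > 0.0)
    slope, intercept = np.polyfit(np.log(k[mask]), np.log(L_corr[mask]), 1)
    return slope, np.exp(intercept)
```

A fitted exponent that is stable when the window limits are varied slightly gives more support to the power-law interpretation than a single fit.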

Fig. 12

Correlation length of each topo, \(L_{\text{corr}}^{t}\), as a function of the mode number k (for runs with \(m = 15\)). Each line corresponds to the decomposition of a simulation with different numbers of obstacles (see inset). A power law \(\sim\, {k}^{ - 7/10}\) is indicated only as a reference

Figure 13 further shows \(L_{\text{corr}}^{t}\) as a function of \(k\) for a fixed number of obstacles, and for different values of \(m\). Figure 13a corresponds to \({obs} = 1\), while Fig. 13b corresponds to \({obs} = 5\). Note that as \(m\) increases, the maximum of the corresponding curve (i.e., the maximum correlation length per mode) becomes smaller in both cases (i.e., for both \({obs} = 1\) and 5). This behavior indicates that the spatial representation of the viscous fingering seen in the simulations also emerges in the individual topos associated with these runs, and in particular, in their correlation lengths. In Fig. 13, it can also be seen that the maximum correlation length takes place in topos with smaller \(k\) as \(m\) increases. This effect is illustrated in more detail in Fig. 14, which shows the mode number \(k\) corresponding to the maximum of \(L_{\text{corr}}^{t}\) as a function of \(m\) in Fig. 14a, as well as the average value of \(L_{\text{corr}}^{t}\) as a function of the distance between obstacles in Fig. 14b.

Fig. 13

Correlation length of each topo, \(L_{\text{corr}}^{t}\), as a function of the mode number k and for different values of \(m\). a Corresponds to runs with one obstacle, and b corresponds to cases with five obstacles

Fig. 14

a Topo for which the correlation length \(L_{\text{corr}}^{t}\) is maximum, as a function of \(m\). Each line corresponds to simulations with a fixed number of obstacles (see inset). b Average value of \(L_{\text{corr}}^{t}\) as a function of the average distance between obstacles \(L_{0}/{obs}\). Each line corresponds to runs with fixed values of \(m\) (as indicated in the inset)

In Sect. 4.1, we also introduced the cross-length over the saturation field \(S\), as a way to quantify the onset of the fingering process. We can now do the same analysis over each spatial mode, by computing a cross-length \(\overline{{L_{\text{cross}}^{t} \left( k \right)}}\) for each topo (each labeled by the index \(k\)). The length \(\overline{{L_{\text{cross}}^{t} \left( k \right)}}\) is calculated following the same procedure as before for the \(\overline{{L_{\text{cross}}^{S} }}\) length, only for each topo instead of for the total saturation \(S\). This will allow us to identify which individual spatial modes capture the fingering process, and which ones are associated with the large-scale (and smoother) flow. Figure 15 shows this length as a function of the mode number \(k\) for the different simulations. In Fig. 15a, we present the value of the cross-length for multiple simulations with \(m = 10\) and with different numbers of obstacles. The dashed lines mark the values of \(\overline{{L_{\text{cross}}^{S} }}\) obtained in Sect. 4.1 for the entire saturation field \(S\) before the fingering process starts (for each simulation). In other words, they indicate the characteristic cross-length of the bulk (large-scale) flow. Note that the first three modes in all cases have cross-lengths similar to those obtained for the entire saturation field. This indicates that the first spatial modes of the POD are associated with the large-scale flow. For \(k = 4\) and larger, \(\overline{{L_{\text{cross}}^{t} \left( k \right)}}\) quickly drops to values that are comparable to those seen in Fig. 6 (for \(\overline{{L_{\text{cross}}^{S} }}\)) after the fingering starts. Thus, these modes have cross-lengths comparable to those of the viscous fingers and are the modes that capture the dynamics of this process. Figure 15b shows how this changes as we change \(m\) from 5 to 15, for only one obstacle. The behavior of \(\overline{{L_{\text{cross}}^{t} \left( k \right)}}\) is similar to the one already described, but the number of modes associated with the large-scale flow decreases as \(m\) increases.

Fig. 15

a Cross-length median \(\overline{{L_{\text{cross}}^{t} \left( k \right)}}\) as a function of the mode number \(k\) in simulations with \(m = 10\) and with different numbers of obstacles. The dashed horizontal lines indicate the length \(\overline{{L_{\text{cross}}^{S} }}\) obtained in Sect. 4.1 (see Fig. 6) for the complete saturation field \(S\) before the appearance of the fingering (i.e., the cross-length of the large-scale flow). b Length \(\overline{{L_{\text{cross}}^{t} \left( k \right)}}\) as a function of the mode number, but now for simulations with \(m = 5\), \(m = 10\) and \(m = 15\) and for only one obstacle. The dashed line indicates the length \(\overline{{L_{\text{cross}}^{S} }}\) measured for the entire saturation field before the fingering process started

5 Conclusions

We presented numerical simulations of the fingering instability in multiphase flow in porous media, varying the ratio of viscosities between the oil and water phases, as well as the number of obstacles used to trigger the instability. Our main goal was to characterize the process of viscous fingering using global correlation lengths, as well as an empirical mode decomposition. To this end, we studied the evolution of the saturation correlation length \(L_{\text{corr}}^{S}\) and introduced a new definition of a characteristic length based on the distance between crossings of the saturation with its mean value, the cross-length \(L_{\text{cross}}^{S}\). We also performed a POD of the simulations and studied its convergence, as well as the correlation length \(L_{\text{corr}}^{t}\) and cross-length \(L_{\text{cross}}^{t}\) of each topo.

We showed that, when computed on the entire (non-decomposed) saturation field, the correlation length \(L_{\text{corr}}^{S}\) is dominated by the number of obstacles in the flow, while the median of the cross-length, \(\overline{{L_{\text{cross}}^{S} }}\), is a better indicator of the onset of the fingering instability. However, when applied to each individual mode of the POD, the correlation length \(L_{\text{corr}}^{t}\) also becomes sensitive to the growth of small-scale features associated with the instability. Moreover, the cross-length of each topo, \(L_{\text{cross}}^{t}\), can also be used to distinguish modes associated with large- and small-scale features in the flow.

We showed that when attempting a reconstruction of the saturation field using a finite number of POD modes, the convergence depends non-trivially on the value of the viscosity ratio \(m\). While in all cases a small number of modes suffices to obtain the saturation to a good approximation (with errors smaller than 2% or 5%), more modes are required for small values of \(m\) (which do not display strong viscous fingering) than in the cases with larger values of \(m\) (which display fingering, and thus, small-scale features). This result is associated with the sharpness of the boundary between water and oil in the former case, which requires more modes to capture the stronger gradients in the saturation field. This result is also important when reduced dynamical systems for multiphase flow are derived from empirical modes, e.g., by truncating the governing equations to a few POD modes, as the number of modes required to properly approximate the solutions depends on the speed of convergence of the series, and thus on the viscosity ratio.