1 Introduction

Simulation of multiphase flow in geological porous media forms the basis for geotechnical applications such as \(\hbox {CO}_2\) storage and petroleum recovery. The governing equations are comprised of strongly coupled equations, typically formulated as a system of partial differential equations. Nonlinear conservation laws and constitutive relations, heterogeneous parameter fields, and multiscale effects all contribute to the complexity of the problem. Thus, resolving all relevant features of the problem may require grids of unrealistically high resolution, resulting in a larger number of unknowns and equations than available computing resources can handle. Consequently, simplifications such as upscaling of the finest scales or neglecting physical features are common, resulting in less accurate solutions.

Solution approaches to simulating nonlinear problems in general, and flow in porous media in particular, often share the following generic structure. The nonlinear system of equations is resolved by some variant of Newton’s method, whereafter a preconditioning technique is applied to the resulting linear system, and the preconditioned linear system is solved by a Krylov iterative method. Within this framework, the focus has largely been on the development of better preconditioners for the linearized system, such as multiscale methods (Juanes and Tchelepi 2008; Nordbotten 2009), two-stage preconditioners (Cao et al. 2005; Stüben et al. 2007; Wallis et al. 1985), sequential splitting strategies (Watts 1986), and domain decomposition (DD) methods (Quarteroni and Valli 1999; Smith et al. 1996; Toselli and Widlund 2005). It should, however, be noted that some multiscale methods also take the full nonlinear problem into account, to a varying degree (Jenny et al. 2006; Juanes and Tchelepi 2008). The aim of these and other linear preconditioners is to lower the number of linear iterations needed, thus increasing the computational efficiency of the code.

Fundamentally, simulation of flow in porous media will, even in the presence of efficient solution approaches, be applied to upscaled parameters. For linear or linearized problems, this has led to an increased understanding of the connections between efficient upscaling (or multiscale treatment) and solvers. In particular, domain decomposition strategies have been shown to be particularly closely related to multiscale methods (Brandt and Livne 1984; Graham et al. 2007; Nordbotten and Bjørstad 2008). These concepts can be extended in the linear case toward bridging the gap between upscaling and solvers (Keilegavlen and Nordbotten 2015).

Our motivation is to achieve a similar approach for nonlinear flows in porous media, wherein upscaling can be seen as a component of the nonlinear solver. In order to make progress toward this goal, it is necessary to establish the performance of nonlinear domain decomposition type solvers for porous media flow problems. This is the purpose of the current paper.

Existing strategies that target the nonlinear solver in order to lower the number of Newton iterations include cascade and reordering methods (Appleyard and Cheshire 1982; Kwok and Tchelepi 2007; Natvig and Lie 2008) and similar adaptive localization methods (Younis et al. 2010) and nonlinear preconditioning (Cai and Keyes 2002). Our interest is in the applicability of domain decomposition type strategies for preconditioning the Newton iterations. Such an approach has been considered in the context of computational fluid dynamics by Cai and Keyes (2002). There, the framework of additive Schwarz preconditioned inexact Newton (ASPIN) was introduced as a single-level nonlinear preconditioner. A two-level version was introduced in Cai et al. (2002). Single-level ASPIN has been shown by the authors to be a promising approach for resolving multiphase flow in porous media (Skogestad et al. 2013). There it was shown that the Newton iterations could be localized around an advancing front of unstable displacement, whereas the corresponding linear preconditioner does not localize computations at all. While these results were promising, they also indicated the need for a coarse solver in order to get good scalability with respect to the number of subdomains.

In this study, we will introduce a two-level nonlinear preconditioner for flow in porous media, based on the ASPIN method. Our emphasis is to derive the method to be consistent with a wide range of coarsening and reconstruction operators, recognizing the nonlinear nature of existing upscaling methods (Farmer 2002). We furthermore recognize that upscaled equations for flow in porous media are frequently still nonlinear; thus, we will consistently deal with a nonlinear coarse operator. The general formulation of the coarsening and reconstruction operators, as well as the emphasis on a nonlinear coarse equation, sets this work apart from earlier work (Cai and Keyes 2002; Cai et al. 2002; Hwang and Cai 2007; Marcinkowski and Cai 2005).

We support the method development with a comprehensive numerical investigation, which to our knowledge represents the first robust verification of two-level nonlinear preconditioning for multiphase flow in porous media.

The rest of this paper is organized as follows. In Sect. 2, we outline the physical problem, including modeling assumptions and discretization methods. Section 3 contains general background on nonlinear domain decomposition preconditioning. In Sect. 4, we derive the new physics-based preconditioner. In Sect. 5, the new method is verified through a scalability study. Section 6 summarizes the paper.

2 Model Problem

We consider two-phase flow problem in porous media as our model problem, representing, e.g., an oil-water system (Aziz and Settari 1979; Chen et al. 2006; Nordbotten and Celia 2012).

2.1 Continuous Problem

The phases are assumed to be immiscible and incompressible, and capillary effects are ignored. The effects of these simplifications on the results reported herein are discussed toward the end of Sect. 5. This system is described by nonlinear conservation laws for phase masses, which can be rearranged as a coupled system of an elliptic and a hyperbolic partial differential equation,

$$\begin{aligned}&-\nabla \cdot \left( k\left( \nabla p + \left( \rho _n \lambda _n + \rho _w \lambda _w \right) g \nabla z \right) \right) = q_T, \end{aligned}$$
(1)
$$\begin{aligned}&\phi \frac{\partial s_{w}}{\partial t} + \nabla \cdot \left( \mathbf {v}_w \right) = q_w. \end{aligned}$$
(2)

These equations are solved for pressure p and the saturation of one of the two phases, say \(s_w\). The subscripts w and n refer to wetting and nonwetting phases, respectively, and subscript T is used to denote quantities that are summed over the phases. The tensor k represents permeability, \(\lambda _\alpha \) represents phase mobility, and \(\rho _\alpha \) is the mass density of phase \(\alpha \). The porosity of the medium is denoted by \(\phi \), source terms are represented by q, and the vertical coordinate is represented by z. The flux \(\mathbf {v}_\alpha \) in the latter equation is given by Darcy’s law,

$$\begin{aligned} \mathbf {v}_\alpha = - \lambda _\alpha k \left( \nabla p + \rho _\alpha g \nabla z \right) . \end{aligned}$$
(3)

The system needs to be closed by additional constitutive relations, which are obtained by requiring that the phases fill the pore space

$$\begin{aligned} s_w + s_n = 1, \end{aligned}$$
(4)

and letting the phase mobilities be a function of phase saturation,

$$\begin{aligned} \lambda _\alpha = \lambda _\alpha (s_\alpha ), \quad \alpha = w,n. \end{aligned}$$
(5)

Equations (12) are coupled and highly nonlinear, depending on further assumptions and parameter choices. They can be written on residual form by defining the nonlinear residual

$$\begin{aligned} F_c(u) = \left[ \begin{array}{l} F_p(u)\\ F_{s_w}(u) \end{array}\right] = \left[ \begin{array}{c} - \nabla \cdot \left( k \left( \nabla p + \left( \rho _n \lambda _n + \rho _w \lambda _w \right) g \nabla z \right) \right) - q_T \\ \phi \frac{\partial s_w}{\partial t} + \nabla \cdot (\mathbf {v}_w) - q_w \end{array} \right] , \end{aligned}$$
(6)

where \(u = (p,s_w)^T\), and require

$$\begin{aligned} F_c(u) = 0. \end{aligned}$$
(7)

The subscript c indicates that this is the continuous form of the equations.

2.2 Discrete Problem

We have confined our study to two spatial dimensions. For simplicity, we use a Cartesian grid with equidistant grid points in each dimension. The elliptic equation above is discretized using a two-point flux control volume method. The hyperbolic equation employs an upstream weighted scheme.

There are different available solution strategies for coupled nonlinear elliptic/hyperbolic problems. A common approach in the reservoir simulation community is implicit pressure, explicit saturation (IMPES), where the equations are solved sequentially. Another approach is the fully implicit method (FIM), which has much better stability properties than IMPES, thus allowing for longer time steps. Here, the equations are solved simultaneously, forming a nonlinear system of equations. Other approaches also are available. We use the FIM approach in this study, with a backward Euler time integration. This gives us a nonlinear system,

$$\begin{aligned} F(u) = 0, \end{aligned}$$
(8)

to solve on each time step. F(u) is the discretized version of \(F_c(u)\). This nonlinear system is linearized using Newton’s method, which solves the discretized nonlinear system iteratively until convergence, solving a linear system on each iteration step. That is, at step n, solve the linear system

$$\begin{aligned} J(u^{(n)}) \varDelta u^{(n)} = - F(u^{(n)}), \end{aligned}$$
(9)

and obtain a new solution

$$\begin{aligned} u^{(n+1)} = u^{(n)} + \varDelta u^{(n)}. \end{aligned}$$
(10)

Here, J(u) is the Jacobian of F(u). The linear system is normally solved using an iterative Krylov method, typically GMRES (Saad and Schultz 1986), which we also use here. In practice, the linear iteration is stopped before reaching the exact solution. This gives rise to the class of inexact Newton methods (Dembo et al. 1982; Dennis and Schnabel 1983), which typically combine truncated iterations with some safeguarding method to enlarge the convergence basin of the method. In general, Newton’s method requires a ’sufficiently good’ initial guess to converge.

3 Nonlinear Domain Decomposition Preconditioning

This section will provide the background for our preconditioning method, starting with a brief background on domain decomposition methods, before discussing nonlinear domain decomposition preconditioning (Cai and Keyes 2002).

3.1 General Background on Domain Decomposition

The modern use of domain decomposition (DD) methods arose in the 1980s (Quarteroni and Valli 1999; Smith et al. 1996; Toselli and Widlund 2005). The main idea behind DD is to split a large problem into smaller subproblems, which may be substantially cheaper to solve than the full problem. DD methods may be applied as iterative methods or as preconditioners to the global problem.

One of the most important classes of DD methods is the Schwarz methods (Lions 1988; Schwarz 1870), which include the additive (AS; Dryja and Widlund 1987) and the multiplicative (MS) Schwarz methods. AS has the advantage that all the subproblems may be solved in parallel, as only information from the previous iteration step is used. MS, on the other hand, incorporates information from the current iteration step, which limits the parallelism. It is often necessary to include a coarse level for improving global communication, which may otherwise lead to rapidly increasing computational costs as the number of subdomains increases.

3.2 Compression and Reconstruction

Keeping in mind that we ultimately aim to reduce the gap between upscaling and preconditioning, we will herein adopt the terminology and notation of the heterogeneous multiscale method (HMM) (Weinan and Engquist 2003). Starting from a fine-scale description of the system, represented by v, we consider a compression operator \(\mathcal {C}\) which maps to a coarse scale representation, \(V = \mathcal {C} v\). A reconstruction operator \(\mathcal {R}\) may then be defined for the interpolates back to the fine scale \(\hat{v} = \mathcal {R} V\). Naturally, the reconstruction cannot be unique due to the sparser nature of the coarse representation. Thus, in general it will be the case that \(\mathcal {R}\mathcal {C} \ne \mathcal {I}\), where \(\mathcal {I}\) is the identity operator. On the other hand, we require that the operators are consistent, in the sense that the reconstructed fine scale is compressed back to the same coarse scale as the original coarse scale, that is \(\mathcal {C}\mathcal {R} = \mathcal {I}\).

The concept of a coarse scale can easily be extended to include subdomains. Denoting the domain by \(\varOmega \), split into N possibly overlapping subdomains \(\varOmega _i, i = 1,\ldots , N\), the compression operator \(\mathcal {C}_i\) corresponds to restriction from \(\varOmega \) to \(\varOmega _i\) and the reconstruction operator \(\mathcal {R}_i\) to expansion from \(\varOmega _i\) to \(\varOmega \). For subdomains, both compression operators and reconstruction operators are usually linear. We will represent linear operators using normal typeface, \(C_i\) and \(R_i\). The linear subdomain operators are typically represented by matrices that extract rows and/or columns and expand by filling in zeros, respectively. Often, the reconstruction will be the transpose of the compression, \(R_i = C_i^T\).

On the coarse scale, it is also common to employ linear interpolation and extrapolation operators and, however, we do not make any assumptions on the forms of the coarse operators \(\mathcal {C}_0\) and \(\mathcal {R}_0\) at this point.

3.3 Linear Domain Decomposition Preconditioning

Different variants of AS are commonly used as preconditioners for the linear system arising from Newton’s method. For a linear system \(Ax=b\), a preconditioner P may be applied to obtain the preconditioned linear system,

$$\begin{aligned} P(b-Ax)=0. \end{aligned}$$
(11)

The linear AS preconditioner is on the form

$$\begin{aligned} P = \sum _{i = 0}^N R_i (C_i A R_i)^{-1} C_i, \end{aligned}$$
(12)

where \(C_i\) is the compression from the global domain to the subdomain, or to the coarse scale for \(i=0\), and \(R_i\) is the corresponding reconstruction. We note that the two-level method outlined above can be extended to a multigrid solver by introducing more grid levels.

3.4 Nonlinear Domain Decomposition Preconditioning

While the strategy of applying domain decomposition or multigrid techniques inside the Newton iteration is common in practical reservoir simulations, the approach separates the treatment of nonlinearities and multiscale physics. This removes the possibility of amending the nonlinear solvers with insights gained from nonlinear upscaling. As an alternative approach, we here consider the application of AS as a nonlinear preconditioner, in the framework of the additive Schwarz preconditioned inexact Newton (ASPIN) method (Cai and Keyes 2002).

The remainder of this section will be devoted to an overview of ASPIN, as it forms an important basis for the method we will present in the next section. We first describe the basic single-level version of the method before outlining different strategies for the coarse solver.

3.4.1 Single-Level ASPIN

In the following, we will give a brief overview of single-level ASPIN, which is presented in more detail in An (2005), Cai and Keyes (2002). The nonlinear system (8) is to be solved on the domain \(\varOmega \), given some initial and boundary conditions. The nonlinearly preconditioned problem may be written as

$$\begin{aligned} \mathcal {F}^{(1)}(u) = 0, \end{aligned}$$
(13)

where the superscript indicates a one-level preconditioner. The preconditioned residual \(\mathcal {F}^{(1)}(u)\) is obtained using nonlinear domain decomposition, that is, by solving locally restricted versions of (8) on each of the subdomains \(\varOmega _i\), given boundary conditions based on the current state in the neighboring subdomains. In a porous media flow setting, it is natural to impose Dirichlet conditions on the pressure, and to calculate fluxes using the upstream saturations on the internal boundaries.

The nonlinear subproblems are solved with respect to the local update \(\delta _i\) to the global solution, that is,

$$\begin{aligned} F_{i}(u - \delta _i(u)) = 0, \quad i = 1 \ldots N. \end{aligned}$$
(14)

All the updates are then summed to form a global vector, which is the preconditioned residual,

$$\begin{aligned} \mathcal {F}^{(1)}(u) = \sum _{i = 1}^{N} \delta _{i}(u). \end{aligned}$$
(15)

For simplicity of exposition, handling of overlaps is not taken into account in the above and following formulas that express summing over subdomains. In practice, each element in the overlaps is weighted using some partition of unity. A natural choice in the current physical setting is averaging of pressures and upstream weighting of saturations. After forming the nonlinear residual from the local updates, an inexact Newton step on the preconditioned residual Eq. (13) is performed. This forms the outer or global iteration. The above process is repeated on each outer iteration step.

The updates, \(\delta _i\), will generally decrease as the global state converges and are also a measure on how much the state is changing in the corresponding subdomain. If nothing changes within a subdomain over a time step, the local nonlinear problem is trivial and the update is zero. This provides automatic localization of computations in the solver, without changing the grid or applying heuristic methods for allocation of computational resources. On the other hand, this may in the worst case lead to load balancing issues and, however in the case of many more subdomains than processors, which is a common scenario, this may be dealt with by estimating expected loads per subdomain.

The Jacobian \(\mathcal {J}^{(1)}(u)\) of the preconditioned residual \(\mathcal {F}^{(1)}(u)\) may be found by differentiating all the terms in the sum in Eq. (15). Each term then takes the following form (An 2005; Cai and Keyes 2002)

$$\begin{aligned} \delta _{i}^\prime (u) = R_i \left( C_i J(u-\delta _{i}(u)) R_i \right) ^{-1} C_i J(u - \delta _{i}(u)). \end{aligned}$$
(16)

Here we denote the Jacobian of the nonpreconditioned problem F(u) as J(u), after which the preconditioned Jacobian can be written as

$$\begin{aligned} \mathcal {J}^{(1)}(u) = \sum _{i = 1}^{N} \delta _{i}^\prime (u). \end{aligned}$$
(17)

As the calculation of this matrix involves many local and global Jacobian evaluations at different points, it is not practical to work with (17) directly. The calculation of \(\mathcal {J}^{(1)}(u)\) is commonly simplified (An 2005; Cai and Keyes 2002) by assuming that u is close to the exact solution of the problem, \(u^*\). Then, if J(u) and \(\delta _i(u)\) are continuous, we have that \(J( u - \delta _i(u)) \approx J(u)\), and

$$\begin{aligned} \delta ^\prime _{i}(u) \approx \tilde{\delta }^\prime _{i}(u) = R_i \left( C_i J(u) R_i \right) ^{-1} C_i J(u), \end{aligned}$$
(18)

which leads to the approximated preconditioned Jacobian

$$\begin{aligned} \tilde{\mathcal {J}}^{(1)}(u) = \sum _{i = 1}^{N} \tilde{\delta }_{i}^\prime (u). \end{aligned}$$
(19)

This approximation only involves the left-hand side of the linear system in the Newton iterations. However, it yields the exact same Jacobian as if the linear additive Schwarz method was applied instead of ASPIN, which is convenient from an implementation perspective. We note that in the presence of counter-current flow due to gravitational or capillary forces, the upstream weighting of mobilities implies that the assumption of continuity of J(u) and \(\delta _i(u)\) does not hold. In these cases, we expect that other approaches (e.g., Li and Tchelepi 2015), which provide better treatment of numerical discontinuities and thus improved Newton convergence, also will be beneficial for ASPIN in terms of justifying Eq. (19).

3.4.2 Previous Work on Two-Level ASPIN

Two-level ASPIN was first introduced in Cai et al. (2002), where the exact coarse solution \(U^*\) was assumed to be known uniquely through a preprocessing step. While in general the coarse solution cannot be assumed a priori known, it may be available if a suitably accurate upscaled method exists. The preconditioned residual then takes the form

$$\begin{aligned} \mathcal {F}^{(2)}(u) = T_0(u) - T_0(u^*) + \mathcal {F}^{(1)}(u), \end{aligned}$$
(20)

where the coarse function \(T_0(u)\) is defined by

$$\begin{aligned} T_0(u) = R_0 T^C(u), \end{aligned}$$
(21)

and \(T^C(u)\) is found by solving the coarse nonlinear system

$$\begin{aligned} F_0(T^C(u)) = C_0 f(u). \end{aligned}$$
(22)

A variant of two-level ASPIN with a linear coarse solver was introduced in Hwang and Cai (2007). The linear coarse system is obtained by linearizing the nonlinear coarse problem of Cai et al. (2002) using a first-order Taylor approximation around \(u^*\). This is computationally cheaper than solving a nonlinear coarse system and was claimed to have similar scalability and robustness properties as the original two-level ASPIN method.

In contrast to previous work, we are interested in problems where the coarse equations are not known a priori, but are obtained as part of the solution process. This is of particular importance for flows in porous media, where upscaling methods frequently only have a limited range of applicability (Farmer 2002; Court et al. 2012). Furthermore, we will retain the full nonlinear coarse equations, as it is known from problems where upscaled equations exist that these equations are frequently nonlinear (see, e.g., Nordbotten and Celia 2012). Also, while the coarse solution is assumed to be known a priori in Cai et al. (2002) and related works, we will not need any explicit representation of it at all in our approach, as will be seen in the following section.

4 Nonlinear Two-Level Preconditioning

Single-level ASPIN provides automatic localization of computations to physically challenging subdomains (Skogestad et al. 2013), and the domain may be partitioned in ways that honor structures in the parameter field, but the form of the preconditioners and solvers is the same regardless of the nature of the problem. Our interest is in providing coarse spaces which are conceptually equivalent to the subdomains structures in that they can be formulated in a framework of compressions and reconstruction operators. However, we will allow for an extended class of reconstruction operators which may incorporate physical insight into the construction of the coarse system.

At the same time, when facing a specific physical problem, the existence of successfully upscaled equations to a similar problem may provide the user with knowledge about the dominating physical processes in the system. If this knowledge could be utilized by the solver, in a consistent way, a considerable potential for computational savings could be realized. The potential largely lies in the reconstruction operators, which in a more general form than the linear operators that are used in standard two-level ASPIN may allow for physics-based preconditioners that incorporate user intuition, and ultimately bridges the gap between nonlinear preconditioning and upscaling.

This section is devoted to the presentation of a new coarse solver that exploits more of the physics in the model while keeping good scalability properties. This solver will form the coarse component of a nonlinear preconditioner based on the ASPIN framework presented in the previous section. For the sake of simplicity, we have chosen to let the coarse grid resolution equal the subdomain resolution so that one coarse cell corresponds to exactly one subdomain and vice versa. In general, these may be different.

4.1 Generalized Reconstruction Operators

In the following, we do not make any a priori assumptions on the form of the compression and reconstruction operators between the coarse and fine levels. This is especially relevant for the reconstruction operator, since the choice of this operator directly determines the fine-scale representation of the coarse solution. The compression operator will typically be some sort of averaging or subsampling operator, as is common in the literature, and hence the focus will be on the reconstruction operator. The motivation here is to develop reconstructions that go beyond simple linear prolongation, allowing for incorporating features of the physical system in the reconstruction, which in limiting cases essentially may provide the true solution from solving only the coarse system. Examples of nonlinear reconstruction operators in porous media applications can be found in the context of vertically integrated models (Nordbotten and Celia 2012).

We formulate the coarse problem the same way as the subdomain problems (14), that is: We do not make the assumption that the true coarse solution is known, but rather derive a coarse system of equations which are consistent with the compression and reconstruction operators.

The coarse problem is then formulated as

$$\begin{aligned} F_0(u - \delta _0(u)) = 0, \end{aligned}$$
(23)

where the coarse problem is simply defined as \(F_0 = \mathcal {C} F\). Thus, we can formulate the two-level method exactly the same way as the single-level method, except that counters start at 0 instead of 1, with 0 indicating the coarse level,

$$\begin{aligned} \mathcal {F}^{(2a)}(u) = \sum _{i = 0}^{N} \delta _{i}(u). \end{aligned}$$
(24)

Note that the original two-level ASPIN method also can be stated like this by setting \(\delta _0(u) = T_0(u) - T_0(u^*)\). However, the underlying assumptions for the coarse contribution are different in the two cases.

An important aspect of treating the coarse scale directly in the framework of updates is that the coarse solution itself is not needed at all, since the output of the nonlinear coarse solvers, \(\delta _0(u)\), is the update to the current fine-scale state as reconstructed from the coarse update. This is a favorable feature, as the underlying fine-scale structure of the solution from the previous global fine-scale step is retained. The distribution of the update from the coarse solver is then determined by the form of the reconstruction operator.

For the solvability of Eq. (23), we need to specify the relationship between a low-dimensional coarse variable and the update, which is given by the reconstruction operator

$$\begin{aligned} \delta _0(u) = \mathcal {R}_\varDelta (\varDelta U;u), \end{aligned}$$
(25)

Using the reconstruction operator together with Eq. (23), we can write the coarse problem explicitly as:

$$\begin{aligned} \mathcal {C} F(u - \mathcal {R}_\varDelta (\varDelta U;u)) = 0. \end{aligned}$$
(26)

The Jacobian \(J_0(\varDelta U)\) of this system then can be found by using the chain rule on this expression,

$$\begin{aligned} J_0(\varDelta U) = -\mathcal {C} J(u) \frac{\partial \mathcal {R}_\varDelta }{\partial \varDelta U}. \end{aligned}$$
(27)

It is evident from this formula that the complexity of the coarse scale Jacobian largely depends on the complexity of the reconstruction operator.

4.2 Relating Coarse and Local Solvers

While it is natural to formulate the coarse system in the same notation as the subdomain solves, they still inherit disparate characteristics, as the subdomain solves are local while the coarse solver is global. This is naturally reflected in the preconditioning. Indeed an important feature of the additive Schwarz method is that all the subproblems, including the coarse problem, may be solved in parallel, since there is no dependence on the state on the current iteration step in the method. However, here it is desirable to solve the coarse system in a separate step before solving the local fine-scale problems, in order to take advantage of the information contained therein. This can be done by incorporating elements of multiplicative Schwarz, which uses the latest available information for each subdomain solve, such that the kth subdomain solver uses subdomain information from the latest step for the \(k-1\) first subdomains and from the previous step for the other subdomains. A multiplicative counterpart of ASPIN was demonstrated in Ernst et al. (2009).

By solving the coarse problem first on each global iteration step, the coarse scale update may be transferred to the subdomains before starting the subdomain solves. Then the local fine-scale problems may be solved, using the updated coarse scale state. This approach is a hybrid of the additive and multiplicative methods, with a multiplicative update from coarse to fine-scale and additive subdomains solves. The preconditioned residual then takes the form

$$\begin{aligned} \mathcal {F}^{(2m)}(u) = \delta _0(u) + \sum _{i=1}^N \delta _i(u-\delta _0(u)), \end{aligned}$$
(28)

and the Jacobian of the preconditioned system is the derivative of this,

$$\begin{aligned} \tilde{\mathcal {J}}^{(2m)}(u) = \tilde{\delta _0^\prime }(u) + \sum _{i=1}^N \tilde{\delta _i^\prime }(u-\delta _0(u)), \end{aligned}$$
(29)

where the tilde denotes the same approximation as in Eq. (18). The first term is the derivative coarse update, and the sum on the right-hand side of Eq. (29) is the single-level Jacobian evaluated at the updated state after the coarse solve, \(\tilde{\mathcal {J}}^{(1)}(u - \delta _0(u))\).

4.3 Algorithm

The two-level method described in this section may be summarized in the form of an algorithm. Note that the one-level method is obtained by skipping the coarse step 1(a) and looping from 1 instead of 0.

  1. 1.

    Compute the nonlinear residual \(\mathcal {F}^{(2m)}(u^{(n)})\):

    1. (a)

      Find the coarse update \(\delta _0^{(n)} = \delta _{0}(u^{(n)})\) by solving the coarse nonlinear system

      $$\begin{aligned} F_{0}(u^{(n)} - \delta _0^{(n)}) = 0, \end{aligned}$$
      (30)

      starting from initial guess \(\delta _0^{(n)} = 0\).

    2. (b)

      Terminate and proceed to next time step if \(F((u^{(n)} - \delta _0^{(n)}) < \mathrm {tolerance}\) (set \(u^{(n+1)} = u^{(n)} - \delta _0^{(n)}\))

    3. (c)

      Find local updates \(\delta _i^{(n)} = \delta _{i}(u^{(n)} - \delta _0^{(n)})\) by solving the local nonlinear systems

      $$\begin{aligned} F_{i}(u^{(n)} - \delta _0^{(n)} - \delta _i^{(n)}) = 0, \quad i = 1,\ldots ,N \end{aligned}$$
      (31)

      starting from initial guess \(\delta _i^{(n)} = 0\).

    4. (d)

      Form the global residual

      $$\begin{aligned} \mathcal {F}^{(2m)}(u^{(n)}) = \sum _{i = 0}^N \delta _i^{(n)}. \end{aligned}$$
      (32)
  2. 2.

    Find the updated global solution \(u^{(n+1)}\):

    1. (a)

      Find the global Newton direction \(s^{(n)}\) by solving

      $$\begin{aligned} \tilde{\mathcal {J}}^{(2m)}(u^{(n)}) s^{(n)} = -\mathcal {F}^{(2m)}(u^{(n)}), \end{aligned}$$
      (33)

      where \(\tilde{\mathcal {J}}^{(2m)}(u^{(n)})\) is the Jacobian of \(\mathcal {F}^{(2m)}(u^{(n)})\), inexactly using a Krylov subspace iterative method

    2. (b)

      Compute the new approximate solution

      $$\begin{aligned} u^{(n+1)} = u^{(n)} + s^{(n)}. \end{aligned}$$
      (34)
  3. 3.

    Terminate and proceed to next time step if \(F(u^{(n)}) < \mathrm {tolerance}\).

  4. 4.

    Go to 1.

Note that as the Jacobian in the linearized system in step 2(a) above involves the new coarse solver, the effect of the coarse component can be expected to be seen also on the performance on the linear solver. The stopping condition 1(b) gives the option of stopping the iteration before entering the fine-scale solves if the reconstructed coarse updates provide a sufficiently accurate fine-scale solution.

5 Validation of Coarse Solver

In this section, we establish the applicability of two-level nonlinear preconditioning for multiphase flow in porous media. This forms an essential prerequisite for further work in this direction. Specifically, we test two aspects of the scalability properties of our ASPIN solver: Weak scalability is tested by measuring solver performance as the number of subdomains with fixed size increases, thus increasing the problem size. Conversely, strong scalability is tested by increasing the number of subdomains for a problem of fixed size.

Throughout this section, we will consider the simplest possible coarse compression and reconstruction operators: The compression operator is taken as the subdomain average of both the pressure and the saturation equation, while the reconstruction operator is taken as the constant update, again of both pressure and saturation. Thus, the coarse system is a fully coupled nonlinear system, with the coarse updates of pressures and saturations as primary variables. As both the fine-scale system and the compression and reconstruction are conservative, the coarse scale equations express conservation on the coarse cells. Note that despite these operators being linear, Sect. 4 shows that the resulting coarse system nevertheless inherits the nonlinear nature of the fine-scale equations.

5.1 Simulation Setup

For these scalability tests, 2D simulations based on permeability and porosity fields from the 10th SPE Comparative Solution Project (Christie and Blunt 2001) are used. These fields represent near-shore sand layers (layers 1–35) and channelized layers (layers 36–85). A few sample layers are illustrated in Fig. 1. We have picked 35 layers from each section (layers 1–70) for the scaling tests. For reference, a homogeneous permeability field was also considered. For each of the layers, we have simulated unstable displacement with gravity effects, for instance representing water injection into a oil-filled reservoir which is tilted with respect to the horizontal plane. There is an injection well at the top boundary and a production well at the bottom boundary, both with a constant rate and placed in the cells with highest permeability along the respective boundaries. The density ratio is 0.8, and the viscosity ratio is 1/100, with the less viscous fluid displacing the more viscous fluid. Gravity is chosen to make the gravity number (Yortsos 1995),

$$\begin{aligned} N_g = \frac{l_y k g (\rho _w - \rho _n)}{l_x \mu _n q_T}, \end{aligned}$$
(35)

close to 1. Here, \(l_x\) and \(l_y\) are the spatial dimensions of the domain. The gravity vector points in the negative y direction. This choice assures that neither gravity nor any other physical feature is dominating.

Fig. 1
figure 1

SPE10, examples of sand (left) and channelized (right) permeability layers (layers 20 and 67, respectively). Top Permeability fields; Bottom saturation profiles after 0.1 pvi with injection at left, production at right and gravity pointing rightward

The time step is initially set to \(\varDelta t_G = t_{max}\) and is halved if a Newton iteration (global, local, or coarse) does not converge within a prescribed number of steps. This limit is set to 10 steps in all the simulations presented here. If a time step of length \(\varDelta t < \varDelta t_G\) is successful, the next step length is doubled. In our previous work on single-level ASPIN Skogestad et al. (2013), it was shown that the ASPIN method outperformed linear domain decomposition, requiring both a lower number of global Newton iterations and fewer time steps in the simulation. As discussed in the next subsection, we expect the two-level method to be as good as the single-level ASPIN, and hence keep the advantage over traditional Newton approaches.

For the sake of simplicity, no overlap between the subdomains has been used in these simulations. The global fine-scale Newton iteration, which is the outermost iteration loop, has an absolute tolerance of \(10^{-6}\) on the residual error. The local and coarse Newton iterations and the GMRES iteration used for solving the global linear systems all have an absolute tolerance of \(10^{-7}\). Local and coarse linear systems are solved using a direct solver, more precisely the built-in backslash operator in MATLAB.

We have employed a numerical implementation of the derivative of the reconstruction operator, as this makes it simple to switch between different operators. The implementation of the Jacobian J(u) itself is based on analytical expressions obtained by differentiating \(F_c(u)\).

For each test case, the problem is solved using one-level ASPIN and the new two-level method presented above. As a reference, the problems are also solved without any preconditioner.

Of the 4899 simulations in the test suite, 23 stagnated, typically due to failure to converge within the maximum allowed number of iterations. Our experience is that stagnation is associated with poorly connected flow paths between the injector and producer, as exemplified for both the sand and channelized layers in Fig. 1. This must be seen in the context of the rectangular structure of the subdomains used in this study, which as a consequence do not conform to the permeability field. This leads to subdomains with highly heterogeneous properties, in particular for the channelized layers.

It is well known that for optimal performance of two-level methods, either enhanced coarse spaces or unstructured aggregation needs to be applied (see, e.g., Scheichl and Vainikko (2007)). The use of either such strategy would add significant complexity to the current study and would detract from the focus on the nonlinear flow effects. As a consequence, we have instead chosen to remove all simulations associated with data points (defined by permeability field and subdomain resolution) where any method stagnates.

5.2 Performance Indicators

To validate our nonlinear preconditioner, two comparisons are natural to make. First, we expect a considerable performance enhancement of the preconditioned systems compared to simulations with no preconditioner. Of more interest is a comparison between one- and two-level methods. As discussed in Sect. 4.1, the coarse solver acts on both the nonlinear and linear residual. As the cost of the global linear system is dominated by the global couplings in the elliptic part of the governing equations, the introduction of a coarse component in the linear preconditioner should improve the performance of the global GMRES iterations. To verify this, we measure the number of global GMRES iterations per time step, averaged over the sand and channelized layers, respectively.

While the linearized version of the governing equations has a strong nonlocal component in the form of the elliptic pressure equation, the nonlinearities arise from the hyperbolic components of the system and are thus predominantly local in nature. For the problem setup and coarse system considered in this section, there is therefore only a moderate gain by considering a nonlinear (as opposed to linear) coarse solver. We nevertheless expect the two-level preconditioner to be as good as the one-level version in terms of nonlinear iterations, and we test this by measuring the number of global Newton solves per time step, again considering separate averages over sand and channelized layers.

5.3 Weak Scalability

For each of the preconditioning methods, the simulations are run on seven different subdomain configurations, ranging from 2 to 320 subdomains, with subdomain size fixed at 15*11 cells. For domain sizes larger than one SPE10 layer (60*220 cells), the permeability and porosity fields are mirrored around the outer edges to create larger domains with similar properties. Note that since the smaller domain sizes only contain a portion of the whole SPE layer, the properties of the problem may change significantly during a single scalability test. Also note that in the case of no preconditioner, the concept of subdomains makes no sense, but the total problem size is increased correspondingly.

Fig. 2
figure 2

Weak scaling performance for homogeneous, SPE10 sand and channel layers. 1*2, 1*4, 2*5, 2*10, 3*11, 4*20, 8*40 subdomains. Left linear problem, GMRES iterations versus number of subdomains; middle nonlinear problem, GMRES iterations per time step versus number of subdomains; right nonlinear case, global fine-scale Newton iterations per time step versus number of subdomains. Black no preconditioning; blue one-level ASPIN; red two-level ASPIN. Continuous lines homogeneous permeability; dash-dotted lines SPE10 sand layers; dashed lines SPE10 channel layers. Error bars represent the standard deviation in the results

Since we expect the performance enhancement introduced by the coarse level to be mainly associated with the linear iterations, we first consider only the elliptic part of the model problem, Eq. (1). This is a linear problem, and the test will serve as a benchmark for the more realistic tests which follow. The simulations without a preconditioner become prohibitively time-consuming for large problems, and thus the largest domain size omitted. For more realistic and challenging tests, we perform the same tests on the full nonlinear time-dependent problem, simulating an injection of 0.1 pvi for each test. Here even the one-level preconditioner becomes prohibitively time-consuming for the largest simulations.

The results of the simulations are summarized in Fig. 2. As expected, both the one- and two-level ASPIN methods outperform the case of no preconditioner in all metrics considered. For the preconditioned solvers, we observe that as expected, the performance of the linear solver is significantly improved by the introduction of a coarse solver. For both the linear and nonlinear flow problems, this effect is clear for the homogeneous permeability and the sand layers in SPE10. In all these cases, the number of GMRES iterations per time step is virtually independent of the number of subdomains. The improvement is less for the channelized layers, where the suboptimal coarse grids cause the number of iterations to increase with the number of subdomains. In all cases, the introduction of a coarse solver is a substantial improvement compared to the one-level preconditioner, which experiences an increasing number of iteration even for a linear problem on a homogeneous permeability field.

In terms of nonlinear solves, the introduction of a coarse component does not introduce significant changes in performance; both preconditioners perform well in terms of the number of Newton iterations per time step. This confirms the hypothesis that for the present system, the nonlinearities are treated locally, and the coarse nonlinear solver has minor effect on the number of Newton iterations.

5.4 Strong Scalability

To complement the test of weak scalability test, we also investigate the performance under strong scalability. In these tests, we have abandoned the unpreconditioned method, as the simulations would not change with number of subdomains. As can be seen in Fig. 3, the number of linear iterations needed with the two-level preconditioner is more or less independent of the subdomain configuration, confirming the finding from the weak scalability test. This is in contrast with the one-level preconditioner, which experiences an increase in the number of iterations as the coarse partitioning becomes finer. The two-level method scales well also with respect to Newton iterations, whereas the one-level method has some scalability issues. In fact, the number of iterations needed seems to have a slightly decreasing trend for higher numbers of subdomains. With the combination of a partitioning strategy that does not recognize geological features and the reconstruction of the coarse updates used here, the saturation is smeared out over entire subdomains, even to single-phase regions. However, as the subdomain resolution becomes finer, this effect becomes smaller, which may explain the somewhat surprisingly good result. Thus, the potential for improving the result by introducing more physically sound partitioning strategies and reconstructions seems to be larger for the coarser partitionings. An exception to this pattern is the nonlinear iterations trend for the homogeneous case. This makes sense, as the flow in this case will be more spread out across the domain in the absence of natural pathways generated by heterogeneities in the permeability field. As the one-level method here is the same as is referred to as ’nonlinear preconditioner’ in Sect. 4.4 of (Skogestad et al. 2013), these findings also complement that study by directly addressing the scalability issues raised there, in terms of both linear and nonlinear iterations. The rectangles in Figs. 2 and 3 correspond to the intersection of the curves, that is, exactly one SPE10 layer with 80 subdomains. Figure 4 compares the residuals of the saturation after the coarse step for a coarse (top) and a fine (bottom) subdomain resolution, at the same time step as shown in Fig. 1. It is evident that the finer resolution increases the quality of the reconstructed coarse updates, and the effect of the linear reconstruction that in practice smears out the saturation update over the subdomains is clearly visible in the top figures.

Fig. 3
figure 3

Strong scaling performance for homogeneous, SPE10 sand and channel layers. 1*4, 2*4, 2*10, 4*10, 4*20, 6*20, 10*20, 10*44 subdomains. Left Linear problem, GMRES iterations versus number of subdomains; middle nonlinear problem, GMRES iterations per time step versus number of subdomains; right nonlinear case, global fine-scale Newton iterations per time step versus number of subdomains. Black No preconditioning; blue one-level ASPIN; red two-level ASPIN. Continuous lines homogeneous permeability; dash-dotted lines SPE10 sand layers; dashed lines SPE10 channel layers. Error bars represent the standard deviation in the results

Fig. 4
figure 4

SPE10 layers 20 (sand, left) and 67 (channels, right). Top logarithm of saturation residual after coarse solve, 1*4 subdomains; bottom logarithm of saturation residual after coarse solve, 10*44 subdomains

5.5 Extensions to More General Problems

At this point, comments should be made on the extension of the methods to more general physical setups, specifically to 3D simulations and the introduction of capillary forces and compressibility. In principle, the extension of the nonlinear preconditioning to three dimensions can be carried out in a straightforward manner. However, we expect the need for unstructured aggregation of the coarse domains to be more pronounced when the third spatial dimension is introduced. Even more than for the two-dimensional simulations, it is therefore desirable that the coarse domains are constructed by graph partitioning methods (Karypis and Kumar 1998).

With regard to additional physical processes, capillary forces will smoothen the saturation field and should therefore make the simple reconstruction operators applied herein more accurate. If a more general reconstruction is applied, capillary forces should be included when appropriate. As previously noted, capillary forces may also introduce counter-current flow, bringing the validity of the approximation (19) into question; the practical impact of this is not clear. The nonlinearities introduced by compressibility should not pose major difficulties for the nonlinear solvers, as they tend to be either rather weak, as in the case of \(\mathrm {CO}_2\) sequestration, or localize in space, and thus be well suited for the local solves.

6 Summary

A new two-level nonlinear domain decomposition framework based on the ASPIN method has been proposed and demonstrated through computational examples. The approach is generally applicable to nonlinear problems; however, the emphasis of this study is to confirm the applicability to multiphase flow in porous media. Our numerical examples verify that the coarse solver improves the scalability properties of the method compared with single-level preconditioning.

This study focuses on establishing the suitability of nonlinear multilevel preconditioning for porous media. Further improvements can be expected by combining this preconditioning with dedicated approaches for multiphase flow in porous media, such as reordering (Kwok and Tchelepi 2007; Natvig and Lie 2008) and CPR (Wallis et al. 1985). It is also natural to consider a posteriori error estimators along the lines presented in Vohralík and Wheeler (2013) in order to optimize stopping criteria for the different iterative processes involved.

By introducing a framework that allows for nonlinear reconstruction operators, and the derivation of an explicit coarse equation, this work forms a first step in a larger effort to understand the relationship between upscaling and nonlinear solvers for multiphase flow in porous media.