1 Introduction

In this note, we consider a three-field formulation of the time-dependent Biot equations describing flow through an isotropic, porous and linearly elastic medium, reading as: find the elastic displacement u, the Darcy flux z and the (negative) fluid pressure p such that

$$\begin{aligned} - {\text {div}}\sigma (u) - \alpha \nabla p= & {} f, \end{aligned}$$
(1.1a)
$$\begin{aligned} \frac{1}{\kappa } z - \nabla p= & {} g, \end{aligned}$$
(1.1b)
$$\begin{aligned} \alpha {\text {div}}\partial _t u + {\text {div}}z - c_0 \, \partial _t p= & {} s, \end{aligned}$$
(1.1c)

for a given body force f, source s, and given g (typically \(g = 0\)) over a domain \(\varOmega \subset \mathbb {R}^d\) (\(d = 1, 2, 3\)). The expression \(\sigma (u)\) denotes the isotropic elastic stress tensor, \(\sigma (u) = 2 \mu \varepsilon (u) + \lambda \, \mathrm {tr}(\varepsilon (u)) \, I_d\), where \(\mathrm {tr}\) is the matrix trace and \(I_d\) is the \(d \times d\) identity matrix. The material parameters are the elastic Lamé parameters \(\mu \) and \(\lambda \), the Biot-Willis coefficient \(\alpha \), the storage coefficient \(c_0 \ge 0\) and the hydraulic conductivity \(\kappa = K/\mu _f > 0\), in which K is the material permeability, and \(\mu _f\) is the fluid viscosity. Moreover, \(\varepsilon \) denotes the (row-wise) symmetric gradient, \({\text {div}}\) is the divergence, \(\nabla \) is the gradient, and \(\partial _t\) denotes the (continuous) time-derivative.
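As a small illustration of the stress definition, the following Python sketch evaluates \(\sigma (u)\) from a given displacement gradient. It assumes the standard isotropic form with the factor \(2\mu \) (consistent with \(\nu = 2\mu \) in (1.3)); the function names and the sample gradients are hypothetical and chosen only for illustration.

```python
# Minimal sketch: isotropic linear-elastic stress from a displacement gradient.
# Assumption: sigma = 2*mu*eps + lam*tr(eps)*I; helper names are illustrative only.

def symmetric_gradient(grad_u):
    """Return eps = (grad_u + grad_u^T) / 2 for a square nested-list matrix."""
    d = len(grad_u)
    return [[0.5 * (grad_u[i][j] + grad_u[j][i]) for j in range(d)] for i in range(d)]

def elastic_stress(grad_u, mu, lam):
    """Return sigma(u) = 2*mu*eps(u) + lam*tr(eps(u))*I as a nested list."""
    d = len(grad_u)
    eps = symmetric_gradient(grad_u)
    trace = sum(eps[i][i] for i in range(d))
    return [[2.0 * mu * eps[i][j] + (lam * trace if i == j else 0.0)
             for j in range(d)] for i in range(d)]

# Simple shear: tr(eps) = 0, so lam does not enter the stress.
shear = [[0.0, 1.0], [0.0, 0.0]]
sigma_shear = elastic_stress(shear, mu=1.0, lam=100.0)

# Uniform expansion: tr(eps) = d, so the lam-term dominates when lam >> mu.
expansion = [[1.0, 0.0], [0.0, 1.0]]
sigma_expansion = elastic_stress(expansion, mu=1.0, lam=100.0)
```

The second case hints at why large \(\lambda \) is delicate: only the volumetric part of the deformation is penalized by \(\lambda \).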

The three-field formulation (1.1a)–(1.1c) combines one scalar, time-dependent partial differential equation and two stationary, vector partial differential equations. This combination of time-dependent and time-independent equations can lead to non-trivial issues when considering discretizations of the time derivative; as a result, several splitting schemes have been proposed [7, 11, 18, 21, 32]. In this manuscript we will focus on a monolithic approach, namely a straightforward backward Euler scheme, where all unknowns are solved for simultaneously. For monolithic time discretization schemes, robustness with respect to material parameters in spatial discretizations of (1.1) is a central concern and has been the topic of several recent investigations; cf. e.g. [17, 19, 20, 30]. A notable difficulty, both practically and theoretically, is that the parameter \(\lambda \) may be very large, while \(\kappa \) may be very small. The former corresponds to the (nearly) incompressible regime, while the latter corresponds to the (nearly) impermeable regime. Special care is required in the formulation and analysis of discretizations of (1.1) to retain stability and convergence within these parameter ranges.

Thus far, authors have analyzed mixed discretizations of (1.1) in the nearly incompressible and nearly impermeable parameter regimes separately. For instance, a mixed discretization based on a total-pressure formulation [21,22,23,24, 28] has been well studied and addresses the case of \(\lambda \rightarrow \infty \). In the context of vanishingly small hydraulic conductivity, the concept of a Stokes–Biot stable discretization has emerged [17, 22, 27, 30] as a guide for the design of discrete schemes that retain their convergence properties as \(\kappa \rightarrow 0\).

Remark 1.1

It is worth noting that the term Stokes–Biot stability refers, in the contemporary literature, to a particular type of dual inf-sup condition. It does not allude to an interface problem; that is, one should not confuse this term with that of a ‘Stokes-Darcy problem’, which refers to coupled Stokes and Darcy flow at an interface.

1.1 An intuition for the Stokes–Biot stability condition

To motivate an intuitive view on the current notion of Stokes–Biot stability, we begin by considering a three-field variational formulation of a related system of (time-independent) equations: find \(u \in U\), \(z \in W\), and \(p \in Q\) such that

$$\begin{aligned} \left( \sigma (u), \varepsilon (v)\right) + \left( {\text {div}}v, p\right)&= \left( f, v\right)&\quad \forall \,v \in U, \end{aligned}$$
(1.2a)
$$\begin{aligned} \tau \kappa ^{-1} \left( z, w\right) + \tau \left( {\text {div}}w, p\right)&= \tau \left( g, w\right)&\quad \forall \,w \in W, \end{aligned}$$
(1.2b)
$$\begin{aligned} \left( {\text {div}}u, q\right) + \left( \tau {\text {div}}z, q\right) - \left( c_0 p, q\right)&= \left( \tau s + {\text {div}}\bar{u} - c_0 \bar{p}, q\right) ,&\quad \forall \,q \in Q, \end{aligned}$$
(1.2c)

for given \(f, g, s, \bar{u}, \bar{p}\) and with \(\left( \cdot , \cdot \right) \) denoting the standard \(L^2\)-inner product over the domain \(\varOmega \). The continuous formulation (1.2) is representative of the equations resulting from an implicit Euler time discretization of (1.1) with time step \(\tau > 0\) and a prescribed set of homogeneous boundary conditions. To continue, set \(\tau = 1\); in this case we refer to (1.2) as (a mixed variational formulation of) a steady equation of Biot type; that is, the left-hand side is free of any time derivatives. The system (1.2) forms a generalized saddle-point system which can be informally related [27] to a stand-alone Stokes-like system and a stand-alone mixed Darcy system. For the former, multiply (1.2b) by \(\kappa \), take \(\kappa = c_0 = 0\) and assume \(s = {\text {div}}\bar{u} = 0\). If \((u, z, p)\) solves (1.2) under these conditions, then \(z = 0\) almost everywhere and (1.2) reduces to: find \(u \in U\) and \(p \in Q\) such that

$$\begin{aligned} \left( \nu \varepsilon (u), \varepsilon (v)\right) + \left( {\text {div}}v, p\right)&= \left( f, v\right) , \end{aligned}$$
(1.3a)
$$\begin{aligned} \left( {\text {div}}u, q\right)&= \left( {\text {div}}\bar{u}, q\right) = 0, \end{aligned}$$
(1.3b)

for all \(v \in U\) and \(q \in Q\), with \(\nu = 2 \mu \). On the other hand, if \(c_0 = 0\) and the solution \((u, z, p)\) to (1.2) satisfies \({\text {div}}u = 0\), then \((z, p)\) solves the mixed Darcy problem: find \(z\in W\) and \(p \in Q\) such that

$$\begin{aligned} \left( \kappa ^{-1} z, w\right) + \left( {\text {div}}w, p\right)&= \left( g, w\right) , \end{aligned}$$
(1.4a)
$$\begin{aligned} \left( {\text {div}}z, q\right)&= \left( \tilde{s}, q\right) , \end{aligned}$$
(1.4b)

for all \(w \in W\) and \(q \in Q\) for given \(g, \tilde{s}\). These observations hint at a close relationship between the Stokes equations, Darcy equations and the (steady) Biot-like system (1.2). With this background, the Stokes–Biot stability concept [17, 22, 27, 30] introduces two conditions for finite element discretizations \(U_h \times W_h \times Q_h\) of (1.1) or (1.2):

  (i) the displacement-pressure pairing \(U_h \times Q_h\) is a stable pair, in the sense of Babuška-Brezzi [6], for the incompressible Stokes equations (1.3),

  (ii) the flux-pressure pairing \(W_h \times Q_h\) is a stable pair for the mixed Darcy problem (1.4).

1.2 The Darcy assumption of Stokes–Biot stability

Stokes–Biot stable discrete schemes should retain their convergence properties even when \(\kappa \rightarrow 0\). Indeed, a-priori error estimates, in appropriate parameter-dependent norms, have been advanced for both non-conforming [17, 20, 22] and conforming [30] discretizations of (1.1) or (1.2) satisfying the Stokes–Biot conditions (i) and (ii). Consider a numerical test with two closely related choices of discrete spaces: the finite element pairings \(P_2^d\times RT_0 \times DG_0\) (product space of continuous piecewise quadratic vector fields, lowest order Raviart-Thomas elements and piecewise constants) and \(P_1^d \times RT_0 \times DG_0\). The former pairing satisfies conditions (i) and (ii) above (for given \(\kappa > 0\)), and is observed to converge even for \(\kappa \ll 1\); see e.g. [30] or Table 1a. The latter pairing, which violates condition (i), can easily fail to converge when \(\kappa \) is sufficiently small (cf. [30, Table 2.1] or [27, Section 6]). This numerical observation demonstrates that condition (i), Stokes stability, is indeed integral to discretizations of (1.1) that retain their convergence behaviour as \(\kappa \rightarrow 0\).

Table 1 Relative approximation errors for the displacement \(\Vert \tilde{u} - u_h\Vert _1/\Vert \tilde{u}\Vert _1\) (top three rows in each table), pressure \(\Vert \tilde{p} - p_h \Vert /\Vert \tilde{p} \Vert \) (middle three rows) and flux \(\Vert \tilde{z} - z_h\Vert _{{\text {div}}}/\Vert \tilde{z}\Vert _{{\text {div}}}\) (bottom three rows) for varying \(\kappa \) on a series of uniform meshes \(\mathcal {T}_h\) with mesh size h

The importance of the Stokes stability condition is not surprising from a theoretical perspective. In the early Stokes–Biot literature, condition (i) plays a formative role [30] in showing that Euler–Galerkin discretizations of (1.1) remain inf-sup stable as \(\kappa \rightarrow 0\). Conversely, the Darcy stability condition (ii) is used to construct a projection that facilitates an a-priori analysis; the condition is not used in the stability argument. This raises the question: is condition (ii) necessary to guarantee convergence as \(\kappa \rightarrow 0\)? This question is important because the Darcy stability condition can easily fail to hold uniformly in \(\kappa \ll 1\), thereby placing the previous analytic projection technique on questionable grounds. This observation was implicitly noted by other authors; cf. for instance [17, Rmk. 5]. More precisely, the continuous mixed Darcy problem (1.4) does not satisfy the Babuška-Brezzi conditions [6] with bounds independent of \(0 < \kappa \ll 1\) in the standard \(H({\text {div}}) \times L^2\) norm. To compensate, permeability-weighted flux and pressure norms, such as \(\kappa ^{-1/2} H({\text {div}})\times \kappa ^{1/2} L^2\), have been suggested as viable alternatives [17]. However, resorting to a permeability-weighted pressure space is not entirely satisfactory; the relation between (1.2) and the Stokes equations (1.3), resulting from \(\kappa \rightarrow 0\), points to \(p \in L^2\) rather than \(p \in \kappa ^{1/2} L^2\).

Moreover, numerical experiments demonstrate convergence of the pressure in the \(L^2\)-norm even for diminishing \(\kappa \); see e.g. Table 1a for the pairing \(P_2^d \times RT_0 \times DG_0\). Conversely, consider the pairing \(P_2^d \times P_1^d \times DG_0\), which violates the Darcy condition (ii) for any \(\kappa > 0\) and thus does not satisfy the Stokes–Biot stability conditions. Nevertheless, numerical experiments with this pairing, see Table 1b, show the hallmark of Stokes–Biot stable schemes. That is, they appear stable for small \(\kappa \), with the displacement and pressure errors converging at rates comparable to those for \(P_2^d \times RT_0 \times DG_0\); this behaviour even holds when \(c_0 = 0\). These observations call into question the precise role of the Darcy stability assumption in conforming mixed finite element discretizations of (1.1) or (1.2).

1.3 Stokes–Biot stability revisited

In this manuscript, we advance a theoretical point: namely, that a full Darcy inf-sup assumption is not necessary and can be relaxed, at least in the case of conforming Euler–Galerkin discretizations of (1.1) or (1.2). Instead, we will see that the following two assumptions are key:

  (I) the displacement-pressure pairing \(U_h \times Q_h\) is a stable pair for the incompressible Stokes equations (1.3); and

  (II) the inclusion \({\text {div}}W_h \subseteq Q_h\) holds.
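Condition (II) is algebraic and, for the element families named in this note, can be read off from polynomial degrees. The following Python sketch is an illustration only: it assumes \(Q_h = DG_k\) on the same mesh as \(W_h\), uses the standard facts that \({\text {div}}\, RT_k\) is piecewise of degree k (with \(RT_0\) the lowest order) while the divergence of a degree-k vector Lagrange or BDM field is piecewise of degree \(k-1\), and the function and label names are hypothetical.

```python
# Sketch: degree-based check of condition (II), div W_h ⊆ Q_h, with Q_h = DG_k.
# The family labels and function names are illustrative, not taken from the paper.

def divergence_degree(family, degree):
    """Piecewise polynomial degree of div w for w in the given flux space.

    Uses the lowest-order-is-zero convention for Raviart-Thomas (RT_0).
    """
    if family == "RT":            # div RT_k consists of piecewise P^k functions
        return degree
    if family in ("P", "BDM"):    # full degree-k vector polynomials lose one degree
        return degree - 1
    raise ValueError(f"unknown flux family: {family}")

def satisfies_div_inclusion(family, flux_degree, pressure_degree):
    """True if div W_h is contained in DG_{pressure_degree}."""
    return divergence_degree(family, flux_degree) <= pressure_degree

checks = {
    "RT_0 x DG_0": satisfies_div_inclusion("RT", 0, 0),
    "P_1^d x DG_0": satisfies_div_inclusion("P", 1, 0),
    "BDM_1 x DG_0": satisfies_div_inclusion("BDM", 1, 0),
    "P_2^d x DG_0": satisfies_div_inclusion("P", 2, 0),
}
```

In this sketch the flux-pressure pair \(P_1^d \times DG_0\) passes the check even though it is not Darcy stable, whereas \(P_2^d \times DG_0\) fails it. Passing the check says nothing about condition (I), which has to be verified separately.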

We return to, and formalize, these minimal Stokes–Biot stability conditions in Sect. 4. In practice, the class of minimally Stokes–Biot stable discretizations is a superset of the Stokes–Biot stable discretizations; one could then naturally consider dropping the distinction and, instead, viewing Stokes–Biot stability from this alternative point of view. We will show that the relaxed conditions produce schemes that retain their stability and convergence properties, in appropriate norms, as \(\kappa \rightarrow 0\); motivated by the literature in applied porous-media modeling, we also note that this holds true for applications where \(0 \le c_0 < 1\) is chosen independently of other parameters.

Our primary purpose in this manuscript is theoretical in nature. After introducing the relaxed conditions, we comment on the inf-sup stability and advance an a-priori analysis that does not employ a Galerkin projection technique, thus avoiding either an implicit dependence on \(\kappa ^{-1}\) in any projection estimates or the problematic question of uniform inf-sup Darcy stability as \(\kappa \rightarrow 0\). Unlike some previous convergence analyses, we will conduct our estimates in the full norm used [17, 30] for the inf-sup stability. In particular, we introduce the norms in which Euler–Galerkin schemes satisfying the relaxed conditions are well posed, and we show that the corresponding a priori convergence rates hold in the limit as \(\kappa \rightarrow 0\) and coincide with the canonically expected rates for well-known mixed three-field finite element paradigms (e.g. first order for discretizations using linear or Raviart–Thomas type flux approximations, etc.). Our objective, in clarifying these nuanced issues, is to establish a more consistent theory of Stokes–Biot stable schemes and to demonstrate an alternative, but standard, approach for their convergence analysis; such a view may also lead to downstream advances in the design of more efficient numerical schemes. The remainder of this manuscript is organized as follows: Sect. 2 describes basic spaces and notation that will be used throughout; Sect. 3 overviews the current view of Stokes–Biot stability [17, 22, 27, 30]; Sect. 4 introduces a slight relaxation of the Stokes–Biot stability conditions and recalls a well-posedness argument for Euler–Galerkin discrete schemes; Sect. 5 is aimed at a priori estimates for discretizations satisfying the relaxed conditions; finally, Sect. 6 presents a numerical example demonstrating the retention of convergence behaviour as \(\kappa \rightarrow 0\).

Remark 1.2

In this manuscript, we are concerned with discretizations that retain their stability and convergence as \(\kappa \rightarrow 0\). The case of \(\lambda \rightarrow \infty \) has been investigated separately [21,22,23,24, 28] by introducing a total pressure, \(\hat{p} = \lambda \nabla \cdot u - p\), to achieve robustness with respect to \(\lambda \) when \(\kappa \approx 1\) is assumed. This view is similar to Herrmann’s method [16], where a ‘solid pressure’ term \(p_s = \lambda \nabla \cdot u\) is introduced for elasticity systems in primal form with \(\mu \ll \lambda \). One may wonder if these methods can be brought together in a conforming setting. This has not yet been investigated in the literature, and it is not the question we investigate in this manuscript; our current focus is to further the understanding of Stokes–Biot stable discretizations.

2 Notation and preliminaries

2.1 Sobolev spaces and norms

Let \(\varOmega \subset \mathbb {R}^d\) for \(d = 1, 2, 3\) be an open and bounded domain with piecewise \(C^2\) boundary [26, 31, 35]. We will consider discretizations of \(\varOmega \) by simplicial complexes of order d. All triangulations, \(\mathcal {T}_h\) of \(\varOmega \), will be assumed to be shape regular with the maximal element diameter, also referred to as the mesh resolution or mesh size, of \(\mathcal {T}_h\) denoted by h.

We let \(L^2(\varOmega ; \mathbb {R}^d)\), \(H({\text {div}}, \varOmega )\) and \(H^1(\varOmega ; \mathbb {R}^d)\) denote the standard Sobolev spaces of square-integrable fields over \(\varOmega \), fields with square-integrable divergence, and fields with square-integrable gradient, respectively, and define the associated standard norms

$$\begin{aligned}&\Vert f \Vert ^2 = \left( f, f\right) , \\&\Vert f \Vert _{1}^2 = \left( f, f\right) _1 = \left( f, f\right) + \left( \nabla f, \nabla f\right) , \\&\Vert f \Vert _{{\text {div}}}^2 = \left( f, f\right) _{{\text {div}}} = \left( f, f\right) + \left( {\text {div}}f, {\text {div}}f\right) . \end{aligned}$$

Here \(\left( \cdot , \cdot \right) \) denotes the standard \(L^2(\varOmega )\)-inner product. We will frequently drop the arguments \(\varOmega \) and \(\mathbb {R}^d\) from the notation when the meaning is clear from the context. The notation \(H^1_{\varGamma }(\varOmega )\) represents those functions in \(H^1\) with zero trace on \(\varGamma \subseteq \partial \varOmega \). Similarly, \(H_{\varGamma }({\text {div}}, \varOmega )\) denotes fields in \(H({\text {div}}, \varOmega )\) with zero (normal) trace on \(\varGamma \subseteq \partial \varOmega \) in the appropriate sense [4]. We also define the standard space of square-integrable functions with zero average:

$$\begin{aligned} L^2_0(\varOmega ) = \left\{ p\in L^2(\varOmega ) \quad | \quad \int _{\varOmega } p \, \mathrm {d} x= 0 \right\} . \end{aligned}$$

We will also use parameter-weighted norms. For a Banach space X and real parameter \(\alpha > 0\), the space \(\alpha X\) signifies X equipped with the \(\alpha \)-weighted norm \(\Vert f \Vert _{\alpha X} = \alpha \Vert f \Vert _{X}\). Finally, for a coercive and continuous bilinear form \(a : V \times V \rightarrow \mathbb {R}\), we will also write

$$\begin{aligned} \Vert v \Vert _{a}^2 = a(v, v). \end{aligned}$$

2.2 Intersections and sums of Hilbert spaces

Let \(X \subset Z\) and \(Y \subset Z\) be two Hilbert spaces with a common ambient Hilbert space Z. The intersection space, denoted \(X \cap Y\), is a Hilbert space with norm

$$\begin{aligned} \Vert x\Vert _{X \cap Y}^2 = \Vert x\Vert ^2_X + \Vert x\Vert ^2_Y . \end{aligned}$$

For instance, to illustrate our notation, the norm on the intersection space \(\kappa ^{-1/2} L^2 \cap H({\text {div}})\) is given by

$$\begin{aligned} \Vert v \Vert _{\kappa ^{-1/2} L^2 \cap H({\text {div}})}^2 = \Vert v \Vert _{\kappa ^{-1/2} L^2}^2 + \Vert v \Vert _{H({\text {div}})}^2 = \kappa ^{-1} \Vert v \Vert _{L^2}^2 + \Vert v \Vert _{H({\text {div}})}^2 . \end{aligned}$$

The sum space \(X+Y\) is the set \(\left\{ z=x+y \,\,|\,\, x\in X,\, y\in Y \right\} \) equipped with the norm

$$\begin{aligned} \Vert z\Vert ^2_{X + Y} = \inf _{\begin{array}{c} z=x+y \\ x \in X, y \in Y \end{array}}\Vert x\Vert _X^2 + \Vert y\Vert _Y^2, \end{aligned}$$

and is also a Hilbert space. See e.g. [3, Ch. 2] for a further discussion of sum and intersection spaces.
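The weighted, intersection and sum norms above can be made concrete in a finite-dimensional setting. The following Python sketch is a toy illustration under a loud assumption: both norms are taken to be diagonal in a common orthonormal basis (for instance an eigenbasis of a Laplace-type operator with hypothetical eigenvalues \(\lambda _i\)), so that \(\Vert v \Vert _X^2 = \sum _i w^X_i v_i^2\) and \(\Vert v \Vert _Y^2 = \sum _i w^Y_i v_i^2\). In that case the infimum in the sum norm can be evaluated mode by mode.

```python
import math

def intersection_norm(coeffs, weights_x, weights_y):
    """||v||_{X ∩ Y} for norms that are diagonal in a common basis."""
    nx = sum(w * c * c for w, c in zip(weights_x, coeffs))
    ny = sum(w * c * c for w, c in zip(weights_y, coeffs))
    return math.sqrt(nx + ny)

def sum_norm(coeffs, weights_x, weights_y):
    """||z||_{X + Y} for diagonal norms (all weights assumed positive).

    Minimising wx*(z - y)^2 + wy*y^2 over y gives y = wx*z/(wx + wy),
    with minimum value wx*wy/(wx + wy) * z^2 in each mode.
    """
    return math.sqrt(sum(wx * wy / (wx + wy) * c * c
                         for wx, wy, c in zip(weights_x, weights_y, coeffs)))

# Hypothetical data: four modes with Laplace-type eigenvalues.
kappa = 1.0e-4
eigenvalues = [1.0, 10.0, 100.0, 1.0e6]
coeffs = [1.0, 0.5, 0.25, 0.01]

l2_weights = [1.0 for _ in eigenvalues]                     # squared L^2 norm
h1_weights = [kappa * (1.0 + lam) for lam in eigenvalues]   # squared kappa^{1/2} H^1 norm

l2_norm = math.sqrt(sum(c * c for c in coeffs))
pressure_sum_norm = sum_norm(coeffs, l2_weights, h1_weights)
```

Since \(w^X w^Y / (w^X + w^Y) \le w^X\) in every mode, the sum norm never exceeds the \(X\)-norm; in this toy setting that is the statement \(\Vert p \Vert _{L^2 + \kappa ^{1/2} H^1} \le \Vert p \Vert \).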

2.3 Operators

For a given time step size \(\tau \), times \(t^{m-1}\) and \(t^m\) and fields \(u^m \approx u(t^m)\) and \(u^{m-1} \approx u(t^{m-1})\), we will make use of a discrete derivative notation

$$\begin{aligned} \partial _{\tau } u^m = \frac{u^m - u^{m-1}}{\tau }. \end{aligned}$$
(2.1)
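As a quick numerical illustration of (2.1), the Python sketch below applies the discrete derivative to the hypothetical test field \(u(t) = t^2\) at \(t = 1\), where the exact derivative is 2. The function name and the test field are assumptions made only for this example.

```python
def backward_difference(u_m, u_prev, tau):
    """Discrete derivative (2.1): (u^m - u^{m-1}) / tau."""
    return (u_m - u_prev) / tau

def u(t):
    """Hypothetical test field u(t) = t^2, with exact derivative 2 at t = 1."""
    return t * t

# Halving the time step should roughly halve the error (first-order accuracy).
errors = []
for tau in (0.1, 0.05, 0.025):
    approx = backward_difference(u(1.0), u(1.0 - tau), tau)
    errors.append(abs(approx - 2.0))
```

For this particular field the discrete derivative equals \(2 - \tau \), so the error is \(\tau \) up to rounding, in line with the first-order accuracy of the implicit Euler scheme.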

2.4 Finite element spaces

Now, suppose that \(\varOmega \subset \mathbb {R}^d\) is polygonal and let \(C^k(\varOmega )\) denote the space of k-times continuously differentiable functions defined on \(\varOmega \). Let \(D \subseteq \varOmega \) and let \(P^k(D) \subset C^{\infty }(D)\) denote the set of polynomials of total degree at most k defined on D. Let \(\mathcal {T}_h\) be a simplicial triangulation of \(\varOmega \) and let \(T\in \mathcal {T}_h\) be any simplex; we denote the restriction of a function f to \(T\in \mathcal {T}_h\) by \(f_T\). The notation for the Lagrange elements of order k used here is then

$$\begin{aligned} P_k(\mathcal {T}_h) = \left\{ f \in C^0(\varOmega ) \quad \vert \quad f_{T} \in P^k(T), \quad \forall \,T \in \mathcal {T}_h \right\} . \end{aligned}$$
(2.2)

The notation \(P_k^d(\mathcal {T}_h)\) will be used to represent the d-dimensional (vector) Lagrange spaces in \(\mathbb {R}^d\). The discontinuous Galerkin spaces of order k relax the overall continuity requirement of the Lagrange finite element spaces; they are defined by

$$\begin{aligned} DG_k(\mathcal {T}_h) = \left\{ f \in L^2(\varOmega ) \quad \vert \quad f_{T} \in P^k(T) \quad \forall \,T \in \mathcal {T}_h \right\} . \end{aligned}$$
(2.3)

A comprehensive discussion on Lagrange and discontinuous Galerkin elements and their interpolation properties can be found in e.g. [8] and [29] respectively. We will also make use of the Brezzi-Douglas-Marini and Raviart–Thomas finite element spaces [4, Sec. 2.3]. Throughout the rest of the manuscript we use the notation \(P_k\), \(P_k^d\), \(DG_k\), \(BDM_k\) and \(RT_k\) in reference to the spaces defined above; that is, we drop the additional mesh domain specification.
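To make the difference between (2.2) and (2.3) tangible, the following Python sketch counts degrees of freedom. It relies on the standard fact that the polynomials of total degree at most k in d variables form a space of dimension \(\binom{k+d}{d}\); the Lagrange count is given only for a one-dimensional mesh, and all function names are hypothetical.

```python
from math import comb

def dim_polynomials(k, d):
    """Dimension of the polynomials of total degree at most k in d variables."""
    return comb(k + d, d)

def dim_dg(k, d, n_cells):
    """Dimension of DG_k: one independent copy of P^k per simplex."""
    return n_cells * dim_polynomials(k, d)

def dim_lagrange_1d(k, n_cells):
    """Dimension of P_k (k >= 1) on a 1D mesh: shared vertices plus k-1 interior nodes per cell."""
    return (n_cells + 1) + n_cells * (k - 1)

# On a 1D mesh with 4 cells, continuity removes one unknown per interior vertex.
n_cells = 4
dg1 = dim_dg(1, 1, n_cells)          # 2 unknowns per cell
p1 = dim_lagrange_1d(1, n_cells)     # one unknown per vertex
```

The difference `dg1 - p1` equals the number of interior vertices, which is exactly the number of continuity constraints that \(DG_k\) relaxes in this one-dimensional setting.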

2.5 Boundary and initial conditions

General boundary conditions for (1.1) start by considering two distinct, non-overlapping partitions of the \(d-1\) dimensional boundary \(\partial \varOmega \). The first, corresponding to the displacement, is \(\partial \varOmega = \overline{\varGamma _c} \cup \overline{\varGamma _t}\) and the second, corresponding to the pressure, is denoted \(\partial \varOmega = \overline{\varGamma _p} \cup \overline{\varGamma _f}\); the non-overlapping condition means \(\varGamma _c \cap \varGamma _t = \emptyset \) and \(\varGamma _p \cap \varGamma _f = \emptyset \). The general form of the typical boundary conditions is then

$$\begin{aligned} \begin{array}{llcrr} u = 0, &{} \text { on } \varGamma _c, &{} \text { and } &{} z \cdot n =0, &{} \text { on } \varGamma _f,\\ p = 0, &{} \text { on } \varGamma _p, &{} \text { and } &{} \hat{\sigma }(u,p)\cdot n = 0 &{} \text { on } \varGamma _t, \end{array} \end{aligned}$$
(2.4)

where \(\hat{\sigma }(u,p) = \sigma (u) + p\,I_d\) and \(I_d\) is the \(d\times d\) identity matrix. We will consider the simplification of these boundary conditions that was studied in the defining work on Stokes–Biot stable discretizations [17, 19, 26, 30]. These conditions take \(\varGamma _f = \varGamma _c\) and \(\varGamma _p = \varGamma _t\) with the \(d-1\) dimensional Lebesgue measure \(|\varGamma _c| > 0\). Thus we have

$$\begin{aligned} \begin{array}{llcrr} u = 0, &{} \text { on } \varGamma _c, &{} \text { and } &{} z \cdot n =0, &{} \text { on } \varGamma _c,\\ p = 0, &{} \text { on } \varGamma _t, &{} \text { and } &{} \sigma (u)\cdot n = 0 &{} \text { on } \varGamma _t, \end{array} \end{aligned}$$
(2.5)

Let \(\eta (x,t)\) denote the fluid content with equation

$$\begin{aligned} \eta (x,t) = c_0 p(x,t) + {\text {div}}u (x,t). \end{aligned}$$

We follow [31] and remark that, under appropriate regularity assumptions on the sources and initial data (i.e. source data in \(C^{\alpha }(0,T;(\varOmega ))\), boundary data in \(C^{\alpha }(0,T;L^2(\partial \varOmega ))\), initial fluid content \(\eta (x,0)\in L^2(\varOmega )\), etc., where \(\alpha \) here denotes a Hölder exponent in time and not the Biot-Willis coefficient), there exists a unique solution to (1.1) satisfying the boundary conditions [31, 35].

Remark 2.1

A full discussion on regularity details for the source, initial and boundary data can be found in [35, Theorem 1], and [31, Sect. 3 and 4]. We also note that the boundary conditions (2.5) reflect a restriction that may not be practical for many applications. These boundary conditions coincide with those initially considered in the Stokes–Biot stability literature (e.g. [30]) and allow the key ideas behind the Stokes–Biot (respectively, minimal Stokes–Biot) conditions, discussed in Sect. 3 (respectively, Sect. 4), to be discussed simply. A discussion of more general conditions can be found in Sect. 7, and in e.g. [17].

2.6 Material parameters

To facilitate the analysis here, we will assume that the material parameters of (1.1a)–(1.1c), i.e. \(\mu \), \(\lambda \), \(\alpha \), \(\kappa \), and \(c_0\), are constant in space and time. For simplicity and without loss of generality we set \(\alpha = 1\). This view can either be interpreted literally or as having divided (1.1a)–(1.1c) through by \(\alpha \) to obtain rescaled material parameters. Moreover, one need not look far [12, 13, 17, 25, 27, 33, 34] to find applications where \(\kappa \) is small, and the storage coefficient \(c_0\) varies over a wide range of values in the presence of only modest choices of \(\lambda \). For instance, the literature contains examples of low hydraulic conductivities where both \(\lambda \) and \(c_0\) are approximately unity [17]; in various soft tissues, \(\lambda \approx 10^2\) and \(c_0 \approx 10^{-5}\) have been used [12, 13], in addition to \(\lambda \approx 10^1\) or \(\lambda \approx 10^3\) with \(c_0 \approx 10^{-10}\) [25, 33], and even \(c_0 = 0\) [27, 34]. This wide variation in \(c_0\), while \(\lambda \) remains modest, can be due to several reasons: an ad-hoc modeling assumption; a desire to simplify numerical methods when storage coefficients are near the limits of computing precision; or the fact that, especially in biological applications, measurements for certain parameters may be unavailable and values are often estimated, chosen, or substituted from those of a similar biological regime for which reasonable parameter estimates are available.

This manuscript is only concerned with Stokes–Biot stable discretizations; these discretizations are designed to retain their stability and convergence properties in the presence of diminished hydraulic conductivity. Given the wide variety of storage coefficients which appear, in the applied literature, in the presence of values for \(\lambda \in [10^1,10^3]\) we take the view here that \(c_0\) and \(\lambda \) are independent parameters; this is not to assert that the linear poroelasticity theory does not imply that \(\lambda \rightarrow \infty \) as \(c_0 \rightarrow 0\). Rather, we do this to make a secondary, strictly-numerical observation: that Stokes–Biot stable schemes, and the relaxation we propose herein, also retain their stability and convergence properties as \(\kappa \rightarrow 0\) for every fixed choice of \(0 \le c_0 < 1\). We will therefore assume that \(0 < \kappa \le 1\) and \(0 \le c_0 < 1\) are fixed, but otherwise arbitrary, constants.

Remark 2.2

The bilinear forms defined in Sect. 3 are parameter-dependent. Thus, the arguments advanced in this manuscript may potentially be extended to parameters that vary in space or time, provided they satisfy suitable regularity requirements to justify the requisite manipulations. As parameter, or data, regularity is not the focus of the current work, we do not take up this issue herein; we set the topic aside and consider constant parameters (i.e. constant \(\mu \), \(\lambda \) and arbitrary but fixed \(0<\kappa \le 1\), and \(0\le c_0 < 1\)).

3 The Stokes–Biot stability conditions for conforming Euler–Galerkin schemes

Combining the nature of (1.1) with the boundary conditions (2.5), we define the spaces

$$\begin{aligned} U = H^1_{\varGamma _c}(\varOmega ), \quad W = H_{\varGamma _c}({\text {div}}, \varOmega ), \quad Q = L^2(\varOmega ). \end{aligned}$$
(3.1)

We consider the following variational formulation of (1.1) over the time interval (0, T]: for a.e. \(t \in (0, T]\), find the displacement u, flux z and pressure p such that \(u(t) \in U\), \(z(t) \in W\) and \(p(t) \in Q\) satisfy

$$\begin{aligned} a(u, v) + b(v,p)&= \left( f, v\right)&\quad \forall \,v \in U, \end{aligned}$$
(3.2a)
$$\begin{aligned} c(z, w) + b(w,p)&= \left( g, w\right)&\quad \forall \,w \in W, \end{aligned}$$
(3.2b)
$$\begin{aligned} b(\partial _t u, q) + b(z, q) - d(\partial _t p, q)&= \left( s, q\right)&\quad \forall \,q \in Q. \end{aligned}$$
(3.2c)

The bilinear forms in (3.2) are given by:

$$\begin{aligned} \begin{aligned}&a(u, v) = \left( \sigma (u), \varepsilon (v)\right) , \quad b(u, q) = \left( {\text {div}}u, q\right) , \\&c(z, w) = \left( \kappa ^{-1} z, w\right) , \quad d(p, q) = \left( c_0 p, q\right) . \end{aligned} \end{aligned}$$
(3.3)

As noted in [30], the existence and uniqueness of a solution \((u, z, p)\) to (3.2), with continuous dependence on f, g and s, has been established by previous authors [26, 31, 35].

Remark 3.1

If Dirichlet conditions are imposed for the displacement on the entire boundary (i.e. if \(\varGamma _c = \partial \varOmega \)), then the pressure is only determined up to a constant and we instead let \(Q = L_0^2\).

3.1 An Euler–Galerkin discrete scheme

Following [30] we consider Euler–Galerkin discretizations of (3.2), i.e. conforming finite element spaces in space and an implicit Euler scheme in time. Let \(0 = t^0< t^1< \cdots < t^N=T\) be a uniform partition of the time interval [0, T]. The constant time step is then \(\tau = t^m - t^{m-1}\). For the function \(f(t, x)\), evaluation at \(t^m\) is denoted by \(f^m = f(t^m, x)\), and similarly for g and s. We define conforming discrete spaces

$$\begin{aligned} U_h \subset U, \quad W_h \subset W, \quad Q_h \subset Q. \end{aligned}$$
(3.4)

The Euler–Galerkin discrete scheme of Biot’s equations then reads as follows: for each time iterate \(m \in \left\{ 1, 2, \ldots , N \right\} \), given \(f^m\), \(g^m\), \(s^m\), \({\text {div}}u_h^{m-1}\), and, if \(c_0 > 0\), \(p_h^{m-1}\), we seek \((u_h^m, z_h^m, p_h^m) \in U_h\times W_h\times Q_h\) such that

$$\begin{aligned} a(u_h^m, v) + b(v, p_h^m)&= \left( f^m, v\right) , \end{aligned}$$
(3.5a)
$$\begin{aligned} \tau c(z_h^m, w) + \tau b(w,p_h^m)&= \tau \left( g^m, w\right) , \end{aligned}$$
(3.5b)
$$\begin{aligned} b(\partial _{\tau } u_h^m, q) + b(z_h^m, q) - d(\partial _{\tau } p_h^m, q)&= \left( s^m, q\right) , \end{aligned}$$
(3.5c)

for all \(v \in U_h\), \(w \in W_h\) and \(q \in Q_h\), and where we have made use of the discrete derivative notation (2.1).
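The link between (3.5c) and the steady form (1.2c) can be made explicit. As a short worked step, assuming that the previous iterates \(u_h^{m-1}\) and \(p_h^{m-1}\) play the roles of \(\bar{u}\) and \(\bar{p}\), multiply (3.5c) by \(\tau \), insert the definition (2.1) and the bilinear forms (3.3), and move the known terms to the right-hand side:

$$\begin{aligned} b(u_h^m - u_h^{m-1}, q) + \tau \, b(z_h^m, q) - d(p_h^m - p_h^{m-1}, q)&= \tau \left( s^m, q\right) , \\ \left( {\text {div}}u_h^m, q\right) + \left( \tau {\text {div}}z_h^m, q\right) - \left( c_0 p_h^m, q\right)&= \left( \tau s^m + {\text {div}}u_h^{m-1} - c_0 p_h^{m-1}, q\right) . \end{aligned}$$

The second line agrees term by term with (1.2c). It also shows why only \({\text {div}}u_h^{m-1}\) and, when \(c_0 > 0\), \(p_h^{m-1}\) are needed from the previous time level.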

3.2 The Stokes–Biot stability conditions

The Stokes–Biot stability conditions were introduced independently, in slightly different contexts, by several authors [17, 22, 27, 30] and guide the selection of discrete spaces, \(U_h \times W_h \times Q_h\), for (3.5). For reference, we recall a succinct statement of the (conforming) Stokes–Biot stability conditions, used in analogous forms by all original authors [17, 22, 30]:

Definition 3.1

(c.f. [30, Defn. 3.1]) The discrete spaces \(U_h \subset U\), \(W_h \subset W\) and \(Q_h \subset Q\) are called a Stokes–Biot stable discretization if and only if the following conditions are satisfied:

  (i) The bilinear form a, as defined by (3.3), is bounded and coercive on \(U_h\);

  (ii) The pairing \((U_h, Q_h)\) is Stokes stable;

  (iii) The pairing \((W_h,Q_h)\) is Darcy (Poisson) stable.

We remark that the discretizations considered in [17, 22] were non-conforming. The Stokes and Darcy stability assumptions of Definition 3.1 mean, more precisely, that the relevant discrete spaces are stable in the (discrete) Babuška-Brezzi sense [4, 6] for the discrete Stokes and Darcy problems, respectively. We will now examine the Darcy stability condition more closely.

3.3 The Darcy stability condition

The discrete Darcy problem reads as: find \((z_h, p_h) \in W_h \times Q_h\) such that (1.4) holds for all \(w \in W_h\) and \(q \in Q_h\). Assume that \(W_h \subset W\) and \(Q_h \subset Q\) are equipped with norms \(\Vert \cdot \Vert _{W}\) and \(\Vert \cdot \Vert _{Q}\), respectively. The space \(W_h \times Q_h\) is Darcy stable in the (discrete) Babuška-Brezzi sense if the discrete Babuška-Brezzi conditions are satisfied; in particular, if there exist constants \(\alpha > 0\) and \(\beta > 0\), independent of h, such that

$$\begin{aligned}&c(w, w) \ge \alpha \Vert w\Vert _{W}^2 \quad \forall \,w \in \ker b = \{ w \in W_h\, | \, b(w, q) = 0 \, \forall \,q \in Q_h\}, \end{aligned}$$
(3.6)
$$\begin{aligned}&\inf _{q \in Q_h} \sup _{w \in W_h} \frac{b(w, q)}{\Vert w \Vert _W \Vert q \Vert _Q} \ge \beta > 0, \end{aligned}$$
(3.7)

with b and c as defined by (3.3). It is also assumed that b and c are continuous over \(W \times Q\) and \(W \times W\) with respect to the relevant norms; i.e. there exist constants \(C_b > 0\) and \(C_c > 0\), independent of h, such that

$$\begin{aligned} b(v, q) \le C_b \Vert v \Vert _W \Vert q \Vert _Q, \quad c(v, w) \le C_c \Vert v \Vert _W \Vert w \Vert _W, \end{aligned}$$
(3.8)

for all \(v, w \in W\), \(q \in Q\).

The assumption of discrete Darcy stability, and thus the existence of solutions to the discrete Darcy problem, has been used to define Galerkin projectors for use in the a-priori analysis of the Biot equations (3.5) (cf. for instance [30, Sect. 4.2]). Given \(z(t) \in W\) and \(p(t) \in Q\) solving the continuous Biot equations (3.2), the projections \(\varPi _{W_h} z(t)\) and \(\varPi _{Q_h} p(t)\) solve the discrete Darcy problem (1.4) for all \(w \in W_h\), \(q \in Q_h\) with right-hand sides given by \(\left( g, w\right) = c(z(t), w) + b(w, p(t))\) and \(\left( \tilde{s}, q\right) = b(z(t), q)\). For an a-priori analysis based on such a Galerkin-projection approach to be optimal, including in the limit as \(\kappa \rightarrow 0\), the continuity constants \(C_b, C_c\) and the Babuška-Brezzi stability constants \(\alpha , \beta \) must be independent of \(0 < \kappa \le 1\).
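A short worked example indicates why this requirement is delicate. As a sketch, assume \(d \ge 2\) and that W carries the natural, unweighted \(H({\text {div}})\) norm. For any divergence-free \(w \in W\), the definition of c in (3.3) gives

$$\begin{aligned} c(w, w) = \kappa ^{-1} \Vert w \Vert ^2 = \kappa ^{-1} \Vert w \Vert _{{\text {div}}}^2 , \end{aligned}$$

so any continuity constant in (3.8) must satisfy \(C_c \ge \kappa ^{-1}\), which is unbounded as \(\kappa \rightarrow 0\). By contrast, in the weighted norm of \(\kappa ^{-1/2} L^2 \cap H({\text {div}})\) from Sect. 2.2, the Cauchy-Schwarz inequality yields

$$\begin{aligned} c(v, w) = \kappa ^{-1} \left( v, w\right) \le \Vert v \Vert _{\kappa ^{-1/2} L^2} \Vert w \Vert _{\kappa ^{-1/2} L^2} \le \Vert v \Vert _{\kappa ^{-1/2} L^2 \cap H({\text {div}})} \Vert w \Vert _{\kappa ^{-1/2} L^2 \cap H({\text {div}})} , \end{aligned}$$

so that \(C_c = 1\) suffices for every \(\kappa > 0\).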

Attaining \(\kappa \)-independent continuity and stability constants is non-trivial for the Darcy problem, and the norms that are selected for W and Q play a vital role. For instance, the standard pairing \(H({\text {div}}) \times L^2\) with the natural norms is not appropriate as e.g. c is not continuous with respect to the \(H({\text {div}})\) norm: the continuity bound \(C_c\) depends on \(\kappa \). However, the following pairings for \(W \times Q\) are all meaningful for (1.4) or its dual \(L^2 \times H^1\) formulation:

  (A) \(\left( \kappa ^{-1/2} L^2 \cap H({\text {div}}) \right) \times \left( L^2 + \kappa ^{1/2} H^1 \right) \)

  (B) \(\kappa ^{-1/2} H({\text {div}}) \times \kappa ^{1/2} L^2\)

  (C) \(\kappa ^{-1/2} L^2 \times \kappa ^{1/2} H^1\)

In particular, the inf-sup condition (3.7) holds with inf-sup constant \(\beta \) independent of \(\kappa \) for each of these pairings. We remark that \(\Vert p\Vert _{L^2 + \kappa ^{1/2} H^1} \le \Vert p\Vert \) and \(\Vert p\Vert _{\kappa ^{1/2} L^2} \le \Vert p\Vert \) for \(\kappa \le 1\). The \(\kappa \)-independent inf-sup condition for (A) was recently shown in [2]; the inf-sup condition for (B) follows directly by a scaling of the flux by \(\kappa ^{-1/2}\) and the pressure by \(\kappa ^{1/2}\). Finally, the inf-sup condition for (C) follows directly from Poincaré’s inequality with a similar scaling as in (B). The boundedness of \(b(z, p)\) can be established for each of the pairings above. The pairing (C) corresponds to the \(L^2\times H^1\) formulation of the mixed Darcy problem, i.e. \(b(z,p) = \left( z, \nabla p\right) \) with \(z\in W = L^2\) and \(p\in Q=H^1\), but boundedness is proved in the same manner as for (B). In the case of (B), applying Cauchy-Schwarz and the weighted norm definitions immediately gives

$$\begin{aligned} |b(z,p)| \le \Vert {\text {div}}z \Vert \Vert p \Vert \le \Vert z \Vert _{H({\text {div}})} \Vert p \Vert = \left( \kappa ^{-1/2}\Vert z \Vert _{H({\text {div}})}\right) \left( \kappa ^{1/2}\Vert p \Vert \right) . \end{aligned}$$

The case of (A) is complicated by the definition of the sum norm on the pressure space Q, and a one-line argument is not possible without additional context; see [2] for details.

Options (A) and (B) above fit naturally with the variational formulation of (3.5) and spaces (3.1). In the following, we suggest that a natural norm for the Darcy flux is

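$$\begin{aligned} \Vert z \Vert _{W}^2 = \tau \kappa ^{-1} \Vert z \Vert ^2 + \tau ^2 \Vert {\text {div}}z \Vert ^2, \end{aligned}$$
(3.9)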

which is equivalent to the norm of the flux in (A) above for the relevant range of \(\kappa \) when \(\tau > 0\). However, both options (A) and (B) have disadvantages. For (B), the pressure norm (on Q) becomes progressively weaker as \(\kappa \) nears 0 while the norm of the flux divergence (on W) is unnecessarily large compared with e.g. (3.9). The primary drawback to using (A) is that the pressure norm is implicitly defined. This fact means that an a-priori analysis based on the method of projections is more complex to carry out in practice; it is not clear that standard analytic techniques, e.g. in [17, 19, 22, 30] among others, could be used directly when the norm of \(L^2 + \kappa ^{1/2} H^1\) is chosen for the pressure space.

We will argue instead that an a-priori analysis of (3.5) based on the use of a Galerkin projection of the form (1.4) is not necessary, thus removing the need for an explicit uniform-in-\(\kappa \) Darcy stability condition on \((W_h,Q_h)\). Neither (3.7) nor, more generally, the saddle-point stability of (1.4) plays a role in the well-posedness of (3.5). Condition (iii) of Definition 3.1 will thus be replaced by a less restrictive condition. An important consequence of relaxing the uniform-in-\(\kappa \) Darcy stability hypothesis is that the standard \(L^2\)-norm on Q can, and will, be used.

4 Minimal Stokes–Biot stability

In this section we state the definition of minimal Stokes–Biot stability and recall a previous inf-sup condition in the spirit of the Banach-Nečas-Babuška theorem. In particular, the minimal Stokes–Biot stability conditions (c.f. Definition 4.1) relinquish the Darcy stability assumption in favor of a containment condition. In practice, this containment condition is satisfied by discrete flux-pressure pairings that are Darcy stable, but also by other discrete spaces that are not stable pairings for the mixed Darcy problem. Throughout this section we assume that U, W and Q are defined by (3.1). The norm on U is taken to be the usual \(H^1(\varOmega )\)-norm \(\Vert \cdot \Vert _1\), the norm on Q is the standard \(L^2\)-norm \(\Vert \cdot \Vert \), while the norm on W is the weighted norm defined by (3.9). The norm (3.9) was first introduced in [19, Sect. 3.1]. The bilinear forms a, b, c and d are as defined by (3.3).

4.1 Minimal Stokes–Biot conditions

We now introduce our set of minimal Stokes–Biot stability conditions. For clarity and completeness (rather than e.g. brevity), we include the precise stability conditions in the definition here. In essence, between Definitions 3.1 and 4.1, only condition (iii) changes.

Definition 4.1

A family of conforming discrete spaces \(\{ U_h \times W_h \times Q_h \}_h\) with \(U_h \subset U\), \(W_h \subset W\) and \(Q_h \subset Q\) is called minimally Stokes–Biot stable if and only if

  (i) The bilinear form a is continuous and coercive on \(U_h \times U_h\); i.e. there exist constants \(C_a > 0\) and \(\gamma _a > 0\), independent of h, such that

    $$\begin{aligned} a(u, u) \ge \gamma _a \Vert u \Vert _{1}^2, \quad a(u, v) \le C_a \Vert u \Vert _{1} \Vert v \Vert _{1}, \quad \forall \,u, v \in U_h. \end{aligned}$$
    (4.1)

  (ii) The pairings \(\{ U_h \times Q_h \}_h\) are Stokes stable in the discrete Babuška-Brezzi sense [5, 6]; i.e. in particular there exists an inf-sup constant \(\beta _S > 0\), independent of h, such that

    $$\begin{aligned} \inf _{q \in Q_h} \sup _{v \in U_h} \frac{b(v, q)}{\Vert v \Vert _{1} \Vert q \Vert } \ge \beta _S > 0. \end{aligned}$$
    (4.2)

  (iii) \({\text {div}}W_h \subseteq Q_h\) for each h.

The classical flux-pressure pairings, e.g. \(RT_k \times DG_k\) or \(BDM_{k+1}\times DG_k\) for \(k=0,1,2,\ldots \), satisfying Definition 3.1(iii) also satisfy the conditions of minimal Stokes–Biot stability; in particular Definition 4.1(iii). However, the minimal Stokes–Biot condition also includes discretizations which are not encompassed by Definition 3.1. For instance, flux-pressure pairings where the flux is taken from the space of continuous Lagrange polynomials can satisfy Definition 4.1 while not satisfying Definition 3.1. An illustration of this can be found in the family of discretizations where the displacement-pressure pairings are of Scott-Vogelius type; these either have the form \(P^d_k \times RT_m \times DG_{k-1}\) or \(P^d_k \times P^d_m \times DG_{k-1}\) where \(k \ge 4\) and \(0\le m \le k-1\). The flux-pressure pairings \(RT_m \times DG_{k-1}\), for \(m < k-1\), and \(P^d_m \times DG_{k-1}\), for \(m\le k-1\), are not Darcy stable but do satisfy the minimal Stokes–Biot stability containment condition of Definition 4.1(iii).

A more pragmatic example is the discretization \(P^d_2 \times RT_0 \times DG_0\). This discretization is both Stokes–Biot stable and minimally Stokes–Biot stable; of note is that \(P^d_2 \times P^d_1 \times DG_0\) is not Stokes–Biot stable but is minimally Stokes–Biot stable. The \(P^d_2\times RT_0 \times DG_0\) discretization is a prototype for the minimal-dof displacement enrichment of a \(P^d_1\times RT_0 \times DG_0\) approach studied in [30]. The comparison between \(P^d_2 \times RT_0 \times DG_0\) and \(P^d_2 \times P^d_1 \times DG_0\) serves as a motivation for Definition 4.1, and will be studied in Sect. 6. A further discussion of spaces that satisfy the minimal Stokes–Biot stability condition is given in Sect. 7.
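The containment condition of Definition 4.1(iii) can be screened by a simple degree count. The following sketch assumes simplicial meshes and the usual conventions that \({\text {div}}RT_m\) consists of piecewise polynomials of degree m (lowest order \(m = 0\)), while the divergence of \(BDM_m\) or of continuous vector Lagrange fields of degree \(m \ge 1\) consists of piecewise polynomials of degree \(m-1\). The function names are hypothetical, and the check says nothing about the Stokes stability of \(U_h \times Q_h\).

```python
def divergence_degree(family: str, degree: int) -> int:
    """Polynomial degree of div(W_h) for a flux family on simplices.

    Assumed conventions (not taken from the text): 'RT' is numbered from
    lowest order 0, so div RT_m is piecewise P_m; 'BDM' and continuous
    vector Lagrange 'P' of degree m >= 1 have piecewise P_{m-1} divergence.
    """
    if family == "RT":
        return degree
    if family in ("BDM", "P"):
        if degree < 1:
            raise ValueError("BDM and P require degree >= 1")
        return degree - 1
    raise ValueError(f"unknown flux family: {family}")


def satisfies_containment(family: str, flux_degree: int, pressure_degree: int) -> bool:
    """Degree-count test for div W_h contained in Q_h = DG_{pressure_degree}."""
    return divergence_degree(family, flux_degree) <= pressure_degree


# Pairings written as (family, flux degree, DG pressure degree).
examples = {
    "RT_0 x DG_0": ("RT", 0, 0),
    "P_1 x DG_0": ("P", 1, 0),
    "BDM_1 x DG_0": ("BDM", 1, 0),
    "RT_1 x DG_3": ("RT", 1, 3),  # Scott-Vogelius type with k = 4, m = 1
    "RT_1 x DG_0": ("RT", 1, 0),  # divergence too rich for the pressure space
}
results = {name: satisfies_containment(*spec) for name, spec in examples.items()}
```

Under these assumptions only the last pairing fails the degree count; passing the count is a necessary screening step, not a proof of minimal Stokes–Biot stability.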

4.2 An inf-sup condition for minimal Stokes–Biot stable Euler–Galerkin schemes

The variational problem (3.5) can be shown to satisfy a requirement of the Banach-Nečas-Babuška theorem with respect to the weighted norm (3.9) and Definition 4.1. In fact, this result was proved in [19].

Proposition 4.1

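(Theorem 1, [19]) Let \(\Vert (u_h,w_h,q_h) \Vert _{UWQ}\) be defined by

$$\begin{aligned} \Vert (u_h,w_h,q_h) \Vert _{UWQ}^2 = \Vert u_h \Vert _{1}^2 + \Vert w_h \Vert _{W}^2 + \Vert q_h \Vert ^2, \end{aligned}$$

where \(\Vert \cdot \Vert _{W}\) is defined by (3.9). Define a composite bilinear form, on \(U_h\times W_h \times Q_h\) and corresponding to (3.5), by the formula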

$$\begin{aligned} \mathcal {B}(u_h,z_h,p_h;v_h,r_h,q_h)&= a(u_h,v_h) + b(v_h,p_h) + \tau ~c(z_h,r_h) \\ {}&\quad + \tau ~b(r_h,p_h) + b(u_h,q_h) + \tau ~b(z_h,q_h) - d(p_h,q_h) \end{aligned}$$

Suppose \(U_h \times W_h \times Q_h\) satisfy the assumptions of Definition 4.1. Then \(\mathcal {B}\) is continuous and there exists a constant \(\gamma > 0\), independent of \(\kappa \) and \(c_0\), such that

$$\begin{aligned} \sup _{(v_h,r_h,q_h)\in U_h\times W_h \times Q_h} \frac{\mathcal {B}(u_h,z_h,p_h;v_h,r_h,q_h)}{\Vert (v_h,r_h,q_h) \Vert _{UWQ}} \ge \gamma \Vert (u_h,z_h,p_h) \Vert _{UWQ} \end{aligned}$$

Proof

The proof follows from the arguments in [19, Theorem 1]. \(\square \)

Remark 4.1

Work by previous authors [17, 19] shows that the assumptions of Definition 4.1 were nascent in the literature. The proof [19] of Proposition 4.1 is independent of \(c_0 \ge 0\) and does not invoke Darcy stability, but it does use condition (iii) of Definition 4.1. Another version of Proposition 4.1 was proved, independently, in [17, Theorem 3.2, Case I]; the proof, once more, is independent of \(c_0\) and does not assume that the divergence maps the flux space surjectively onto the pressure space (i.e. Darcy stability). The case \(U = H_0^1\) and \(Q = L_0^2\) is also discussed therein. The arguments of [17, Theorem 3.2, Case I] follow similarly to those of [19, Theorem 2].

Corollary 4.1

Assume that the assumptions of Definition 4.1 hold; then (3.5) is well posed.

Proof

The Banach-Nečas-Babuška theorem [8, Theorem 2.6], applied to (3.5), requires that two conditions are satisfied. The first condition is that of Proposition 4.1, which has been proved, independently, by several authors. The second condition, which remains to be verified, is that if an element \((v_h,r_h,q_h)\in U_h\times W_h \times Q_h\) is such that

$$\begin{aligned} \mathcal {B}(u_h,z_h,p_h;v_h,r_h,q_h) = 0,\quad \forall \,(u_h,z_h,p_h) \in U_h\times W_h \times Q_h, \end{aligned}$$

then \(v_h = r_h = q_h = 0\) must follow. To show that this condition also holds true, fix \((v_h,r_h,q_h) \in U_h\times W_h \times Q_h\) and suppose that the above identity holds; we need to show that, in this case, \(v_h = r_h = q_h = 0\). Towards this end we consider two cases: \(c_0 > 0\) and \(c_0 = 0\). For the first case, select \(u_h = v_h\), \(z_h = r_h\) and \(p_h = -q_h\), and use (3.9), the hypothesis above and (3.3), to get

$$\begin{aligned} \mathcal {B}(v_h,r_h,-q_h;v_h,r_h,q_h) = a(v_h,v_h) + \frac{\tau }{\kappa } \Vert r_h \Vert ^2 + c_0 \Vert q_h \Vert ^2 = 0. \end{aligned}$$

Coercivity (c.f. (4.1)) gives \(\gamma _a \Vert v_h \Vert _1^2 \le a(v_h,v_h)\) and \(v_h = r_h = q_h = 0\) follows. For the second case, assume that \(c_0 = 0\). The Stokes stability assumption (Definition 4.1(ii)) implies that (e.g. [5, p. 136]) there exists \(y_h \in U_h\) such that

$$\begin{aligned} \left( {\text {div}}y_h, q_h\right)&= \Vert q_h \Vert ^2, \end{aligned}$$
(4.3)
$$\begin{aligned} \beta _S \Vert y_h \Vert _{1}&\le \Vert q_h \Vert , \end{aligned}$$
(4.4)

where \(\beta _S\) is the Stokes inf-sup constant of (4.2). Let \(\delta \ge 0\) be a yet-undetermined, but fixed, constant and choose \(u_h = v_h + \delta y_h\), \(z_h = r_h\), and \(p_h = -q_h\). With these choices, and (3.3), we have

$$\begin{aligned} \mathcal {B}(v_h + \delta y_h,r_h,-q_h;v_h,r_h,q_h) = a(v_h,v_h) + \delta a(y_h,v_h) + \frac{\tau }{\kappa }\Vert r_h \Vert ^2 + \delta \Vert q_h \Vert ^2 = 0. \end{aligned}$$

The coercivity and continuity assumptions (c.f. (4.1)), together with Cauchy’s inequality with epsilon, (4.4) and a gathering of like terms, give

$$\begin{aligned} \left( \gamma _a - \delta C_a \epsilon \right) \Vert v_h \Vert _{1}^2 + \frac{\tau }{\kappa } \Vert r_h \Vert ^2 + \delta \left( 1- \frac{C_a}{4 \beta _S^2 \epsilon } \right) \Vert q_h \Vert ^2 \le 0. \end{aligned}$$
(4.5)

We can now select the appropriate constants \(\delta \) and \(\epsilon \), as e.g.

$$\begin{aligned} \epsilon = 2 \frac{C_a}{4 \beta _S^2}> 0, \quad \delta = \frac{\gamma _a \beta _S^2}{C_a^2} > 0, \end{aligned}$$
(4.6)

from which it follows that

$$\begin{aligned} \frac{\gamma _a}{2} \Vert v_h \Vert _{1}^2 + \frac{\tau }{\kappa } \Vert r_h \Vert ^2 + \frac{1}{2} \frac{\gamma _a \beta _S^2}{C_a^2} \Vert q_h \Vert ^2 \le 0, \end{aligned}$$
(4.7)

and thus \(v_h = r_h = q_h = 0\). Thus, the second condition of the Banach-Nečas-Babuška theorem [8, Theorem 2.6] holds, regardless of \(c_0\), and the result follows. \(\square \)
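The choice of constants (4.6) can be checked by direct substitution into (4.5). The sketch below does this in exact rational arithmetic for made-up values of \(\gamma _a\), \(C_a\) and \(\beta _S\); the function name is hypothetical. It evaluates the coefficients of \(\Vert v_h \Vert _1^2\) and \(\Vert q_h \Vert ^2\) in (4.5), which should come out as \(\gamma _a / 2\) and \(\gamma _a \beta _S^2 / (2 C_a^2)\), both strictly positive.

```python
from fractions import Fraction


def coefficients_in_4_5(gamma_a, C_a, beta_S):
    """Return the coefficients of ||v_h||_1^2 and ||q_h||^2 in (4.5)
    for the choice of epsilon and delta made in (4.6)."""
    epsilon = 2 * C_a / (4 * beta_S**2)
    delta = gamma_a * beta_S**2 / C_a**2
    coeff_v = gamma_a - delta * C_a * epsilon
    coeff_q = delta * (1 - C_a / (4 * beta_S**2 * epsilon))
    return coeff_v, coeff_q


# Illustrative (made-up) constants; exact rationals avoid round-off.
gamma_a, C_a, beta_S = Fraction(1, 3), Fraction(5), Fraction(1, 7)
coeff_v, coeff_q = coefficients_in_4_5(gamma_a, C_a, beta_S)
```

Since every coefficient on the left-hand side is positive while the sum is non-positive, each norm must vanish, which is the step used to conclude the proof.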

5 A priori error estimates for minimally Stokes–Biot stable schemes

In this section, we derive a-priori error estimates for the Euler–Galerkin discrete Biot equations (3.5) using the assumptions of Definition 4.1. The final result is summarized in Proposition 5.2 of Sect. 5.4. We will assume the point of view of minimal Stokes–Biot stability as defined by Definition 4.1 and that \(U_h\) contains the continuous nodal Lagrange elements \(P_r\) for some \(r \ge 1\). We begin by establishing basic assumptions on the spaces \(U_h, W_h\) and \(Q_h\), and define projection operators in Sect. 5.1.

5.1 Projections and approximability

As in the previous section, let U, W, Q be given by (3.1) with norms \(\Vert \cdot \Vert _1\), \(\Vert \cdot \Vert _W\) (cf. (3.9)) and \(\Vert \cdot \Vert \), respectively. Assume that the discrete spaces \(U_h \times W_h \times Q_h\) satisfy the assumptions of Definition 4.1. We denote the (continuous) solutions to (3.2) at time \(t^m\) by \((u^m, z^m, p^m)\) for \(m = 1, 2, \ldots , N\) while \((u_h^m, z_h^m, p_h^m)\) represent the solutions of the discrete problem (3.5). For use in the subsequent error analysis, we make basic assumptions on the spaces \(U_h, W_h\) and \(Q_h\), and define projection operators \(\varPi _{U_h}: U \rightarrow U_h\), \(\varPi _{W_h} : W\rightarrow W_h\) and \(\varPi _{Q_h}: Q\rightarrow Q_h\) as follows.

\(Q_h\):

Define \(\varPi _{Q_h}\) to be the standard \(L^2\)-projection into \(Q_h\). Then

$$\begin{aligned} \Vert q - \varPi _{Q_h} q \Vert \lesssim \inf _{q_h \in Q_h}\Vert q - q_h\Vert , \end{aligned}$$

for all \(q \in Q\). If \(Q_h\) contains piecewise polynomials of order \(k = k_Q \ge 0\), then in particular

$$\begin{aligned} \Vert q - \varPi _{Q_h}q \Vert \lesssim h^{k_Q+1} \Vert q \Vert _{k_Q+1}, \quad \forall \,q \in H^{k_Q+1}. \end{aligned}$$
(5.1)
\(W_h\):

Assume that \(W_h\) contains (at least) piecewise polynomial (vector) fields of order \(k = k_W \ge 0\). We assume the existence of a generic discrete interpolant \(\varPi _{W_h}: W \rightarrow W_h\) satisfying either

$$\begin{aligned} \Vert w - \varPi _{W_h} w \Vert \lesssim h^{k_W + 1} \Vert w \Vert _{k_W + 1} \quad \text {and} \quad \Vert {\text {div}}(w - \varPi _{W_h} w) \Vert \lesssim h^{k_W+1}\Vert {\text {div}}w \Vert _{k_W+1}, \end{aligned}$$
(5.2)

for \(w \in H^{k_W+2}\), or

$$\begin{aligned} \begin{array}{lcr} \Vert w - \varPi _{W_h}w \Vert \lesssim h^{k_W+1} \Vert w \Vert _{k_W+1}&\text {and}&\Vert w - \varPi _{W_h} w \Vert _{1} \lesssim h^{k_W}\Vert w \Vert _{k_W+1}, \end{array} \end{aligned}$$
(5.3)

for \(w \in H^{k_W+1}(\varOmega )\). The estimates (5.2) are characteristic of a Raviart-Thomas type, \(RT_{k}\) (\(k = 0, 1, 2, \ldots \)), interpolant whereas (5.3) could correspond to a continuous Lagrange interpolant of order \(k \ge 1\) [8].

\(U_h\):

Following [30], we define \(\varPi _{U_h} : U \rightarrow U_h\) as a modified elliptic projection satisfying for \(u \in U\):

$$\begin{aligned} a(\varPi _{U_h} u, v) = a(u, v) + b(v, q - \varPi _{Q_h} q) \quad \forall \,v \in U_h, \end{aligned}$$
(5.4)

where \(q \in Q\) is given and will, in practice, be selected as the exact pressure solution to (3.2) at given times. Assume that \(U_h\) contains (at least) continuous piecewise polynomial (vector) fields of order \(k_U \ge 1\). There then exists an interpolant, \(I^{k_U}: U \rightarrow U_h\), such that

$$\begin{aligned} \Vert u - I^{k_U} u \Vert _{1} \lesssim h^{k_U} \Vert u \Vert _{k_U+1} \end{aligned}$$

for all \(u \in H^{k_U+1}\), c.f. e.g. [8]. Then for \(u \in U\) we have

$$\begin{aligned} \Vert u- \varPi _{U_h} u \Vert _{1}\le & {} \Vert u-I^{k_U} u \Vert _{1} + \Vert I^{k_U} u - \varPi _{U_h}u \Vert _{1} \lesssim h^{k_U} \Vert u \Vert _{k_U+1} \\&+ \Vert I^{k_U} u -\varPi _{U_h}u \Vert _{1}. \end{aligned}$$

Assumption (i) of Definition 4.1 and (5.4) with \(v = \varPi _{U_h} u - I^{k_U} u\) imply that

$$\begin{aligned} \gamma _a \Vert \varPi _{U_h} u - I^{k_U} u \Vert _{1}^2&\le a(\varPi _{U_h} u - I^{k_U} u,\varPi _{U_h} u - I^{k_U} u) \\&= a(u - I^{k_U} u, \varPi _{U_h} u - I^{k_U} u) \\&\quad + b(\varPi _{U_h} u - I^{k_U} u, q -\varPi _{Q_h}q)\\&\le \Vert \varPi _{U_h} u - I^{k_U} u \Vert _{1} \left( C_a \Vert u - I^{k_U} u \Vert _{1} + \Vert q - \varPi _{Q_h}q \Vert \right) . \end{aligned}$$

Combining the above with assumption (5.1) gives

$$\begin{aligned} \Vert u - \varPi _{U_h} u \Vert _{1} \lesssim h^{k_U} \Vert u \Vert _{k_U + 1} + h^{k_Q+1}\Vert q \Vert _{k_Q+1}, \end{aligned}$$
(5.5)

where \(q \in Q\) is the fixed function defining the elliptic projection (5.4).

5.2 Interpolation notation and identities

Following standard notation [17, 22, 30], the error at time \(t^m>0\) can be decomposed into interpolation errors \(\rho \) and approximation errors e:

$$\begin{aligned} \begin{aligned} u^m - u^m_h&= \left( u^m - \varPi _{U_h} u^m\right) - \left( u_h^m - \varPi _{U_h} u^m\right) = \rho _u^m - e_u^m \\ z^m - z^m_h&= \left( z^m - \varPi _{W_h} z^m\right) - \left( z_h^m - \varPi _{W_h} z^m\right) = \rho _z^m - e_z^m\\ p^m - p^m_h&= \left( p^m - \varPi _{Q_h} p^m\right) - \left( p_h^m - \varPi _{Q_h} p^m\right) = \rho _p^m - e_p^m. \end{aligned} \end{aligned}$$
(5.6)

The interpolation errors satisfy the following identities. Since \({\text {div}}W_h \subseteq Q_h\) and by the definition of the \(L^2\)-projection \(\varPi _{Q_h}\), we have that

$$\begin{aligned} b(w, \rho _p^m) = \left( {\text {div}}w, p^m - \varPi _{Q_h} p^m\right) = 0 \quad \forall \,w \in W_h. \end{aligned}$$
(5.7)

Similarly, by the definition of \(\varPi _{Q_h}\),

$$\begin{aligned} d(\partial _{\tau } \rho _p^m, q) = c_0 \left( \partial _{\tau } \rho _p^m, q\right) = 0 \quad \forall \,q \in Q_h, \end{aligned}$$
(5.8)

where we recall the discrete derivative notation (2.1). Finally, (5.4) directly gives

$$\begin{aligned} a(\rho _{u}^m, v) + b(v, \rho _{p}^m) = 0, \quad \forall \,v \in U_h. \end{aligned}$$
(5.9)

Taking the difference between the continuous Eq. (3.2) and the discrete scheme (3.5), after multiplying (3.2b) by \(\tau \), and using the cancellations (5.7)–(5.9), yields the following error equations at \(t^m\): \((e_u^m, e_z^m, e_p^m)\) satisfies

$$\begin{aligned} a(e_u^m, v) + b(v, e_p^m)&= 0,&\forall \,v \in U_h, \end{aligned}$$
(5.10a)
$$\begin{aligned} \tau c(e_z^m, w) + \tau b(w, e_p^m)&= \tau c(\rho _z^m, w),&\forall \,w \in W_h, \end{aligned}$$
(5.10b)
$$\begin{aligned} b(\partial _{\tau }e_u^m, q) + b(e_z^m, q) - d(\partial _{\tau }e_p^m, q)&= \left( R^m, q\right)&\forall \,q \in Q_h, \end{aligned}$$
(5.10c)

where

$$\begin{aligned} R^m = {\text {div}}(\partial _t u^m - \partial _{\tau } u^m) + {\text {div}}(\partial _{\tau }\rho _u^m) + {\text {div}}\rho _z^m + c_0(\partial _t p^m - \partial _{\tau }p^m) , \end{aligned}$$
(5.11)

by way of the general identity

$$\begin{aligned} \partial _t u^m - \partial _{\tau } u_h^m = \partial _t u^m - \partial _{\tau } u^m + \partial _{\tau }\rho _{u}^m - \partial _{\tau } e_u^m, \end{aligned}$$
(5.12)

and similarly for p.

5.3 Discrete approximation error estimates

In this section we estimate the discrete errors described by (5.10) in their respective norms; that is, \(\Vert e_u^m \Vert _{1}\), \(\Vert e_z^m \Vert _{W}\) and \(\Vert e_p^m \Vert \). In contrast to e.g. [30], we do not make use of the restrictive uniform-in-\(\kappa \) Darcy stability assumption. In turn, the error equations require a more technical analysis, and we have adapted related methods originally used to study the case of fixed \(\kappa \) [22] and of vanishing storage coefficient \(c_0\). Despite the more technical approach, the resulting estimates presented in Proposition 5.2 are directly comparable to related results in the literature; c.f. [17, Lem. 3], [22, Thm. 4.1] and [30, Thm 4.6]. We conclude that the concept of minimal Stokes–Biot stability provides analogous error estimates for a more general set of conforming discrete spaces than the original Stokes–Biot stability concept.

During the course of the analysis we will make use of the following inequality.

Lemma 5.1

[22, Lemma 3.2] Suppose that A, B, C \(>0\) and \(D\ge 0\) satisfy

$$\begin{aligned} A^2 + B^2 \le CA + D. \end{aligned}$$

Then either \(A+B \le 4C\) or \(A+B \le 2\sqrt{D}\) holds.
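Lemma 5.1 lends itself to a quick numerical sanity check. The sketch below, a randomized test under assumed sampling ranges and with hypothetical function names, draws admissible A, B, C, D, keeps the samples satisfying the hypothesis \(A^2 + B^2 \le CA + D\), and counts how often both alternatives of the conclusion fail. It is an illustration only and does not replace the proof in [22].

```python
import math
import random


def lemma_5_1_holds(A: float, B: float, C: float, D: float, tol: float = 1e-12) -> bool:
    """Check the conclusion of Lemma 5.1 whenever its hypothesis is met."""
    if A * A + B * B > C * A + D:
        return True  # hypothesis violated, nothing to check
    total = A + B
    return total <= 4 * C + tol or total <= 2 * math.sqrt(D) + tol


def count_violations(samples: int = 100_000, seed: int = 0) -> int:
    """Count random samples for which the lemma's conclusion fails."""
    rng = random.Random(seed)
    violations = 0
    for _ in range(samples):
        A, B, C = (rng.uniform(1e-3, 10.0) for _ in range(3))
        D = rng.uniform(0.0, 100.0)
        if not lemma_5_1_holds(A, B, C, D):
            violations += 1
    return violations


violations = count_violations()
```

No violations are expected: if \(CA \ge D\) the hypothesis gives \(A^2 + B^2 \le 2CA\), hence \(A \le 2C\) and \(B \le C\), while if \(CA < D\) it gives \(A^2 + B^2 \le 2D\), hence \(A + B \le 2\sqrt{D}\).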

Proposition 5.1

Suppose that \(U_h \times W_h \times Q_h\) is minimally Stokes–Biot stable (by satisfying the assumptions of Definition 4.1). Then, the discrete approximation errors \((e_u^m, e_z^m, e_p^m)\) described by (5.10) satisfy the inequality:

(5.13)

with inequality constant depending on \(C_a\), \(\gamma _a^{-1}\) and where

$$\begin{aligned} C_{\tau }^m \equiv \int _{0}^{t^m} \Vert {\text {div}}\rho _z \Vert + \Vert \rho _{\partial _t u} \Vert _{1} + \tau \left( c_0 \Vert \partial _{tt}p \Vert + \Vert \partial _{tt}u \Vert _{1} \right) \, \mathrm {d} s. \end{aligned}$$

Proof

In an analogous fashion as for Proposition 4.1, multiplying (5.10c) by \(\tau \), selecting \(v = e_u^m - e_u^{m-1}\), \(w = e_z^m\), and \(q = -e_p^m\) in (5.10) and summing gives

$$\begin{aligned} \begin{aligned} a(e_u^m - e_u^{m-1}, e_u^m) + \tau c(e_z^m,e_z^m) + d(e_p^m - e_p^{m-1},e_p^m) = \tau c(\rho _z^m,e_z^m) - \tau \left( R^m, e_p^m\right) . \end{aligned} \end{aligned}$$
(5.14)

For any (continuous) symmetric bilinear form a with induced norm \(\Vert \cdot \Vert _{a}\) we have the inequality [9]

$$\begin{aligned} \frac{1}{2}\left( \Vert \chi \Vert _{a}^2 - \Vert \chi -\xi \Vert _{a}^2\right) \le a(\xi ,\chi ). \end{aligned}$$
(5.15)

Using the above, and the symmetry of both \(a(\cdot ,\cdot )\) and \(d(\cdot ,\cdot )\), it follows that the left-hand side of (5.14) is bounded below by

$$\begin{aligned} \frac{1}{2} \Vert e_u^m \Vert _{a}^2 - \frac{1}{2} \Vert e_u^{m-1} \Vert _{a}^2 + \tau \Vert e_z^m \Vert _{c}^2 + \frac{1}{2} \Vert e_p^m \Vert _{d}^2 - \frac{1}{2} \Vert e_p^{m-1} \Vert _{d}^2. \end{aligned}$$
(5.16)
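The inequality (5.15) behind the lower bound (5.16) follows from the identity \(a(\xi ,\chi ) - \frac{1}{2}\left( \Vert \chi \Vert _{a}^2 - \Vert \chi -\xi \Vert _{a}^2\right) = \frac{1}{2}\Vert \xi \Vert _{a}^2\), valid for symmetric a. As a minimal sketch, the code below checks this identity for random vectors, with a small symmetric positive definite matrix standing in for the form a; the matrix and the function names are assumptions made for illustration.

```python
import random


def bilinear(M, x, y):
    """Evaluate a(x, y) = x^T M y for a small dense matrix M."""
    return sum(x[i] * M[i][j] * y[j] for i in range(len(x)) for j in range(len(y)))


def gap_in_5_15(M, xi, chi):
    """Return a(xi, chi) - (||chi||_a^2 - ||chi - xi||_a^2) / 2."""
    diff = [c - x for c, x in zip(chi, xi)]
    lower = 0.5 * (bilinear(M, chi, chi) - bilinear(M, diff, diff))
    return bilinear(M, xi, chi) - lower


# A symmetric positive definite matrix standing in for the form a (made up).
M = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
rng = random.Random(1)
gaps = []
for _ in range(1000):
    xi = [rng.uniform(-1.0, 1.0) for _ in range(3)]
    chi = [rng.uniform(-1.0, 1.0) for _ in range(3)]
    # By symmetry the gap equals a(xi, xi) / 2, which is nonnegative.
    gaps.append((gap_in_5_15(M, xi, chi), 0.5 * bilinear(M, xi, xi)))
```

With \(\xi = e_u^m - e_u^{m-1}\) and \(\chi = e_u^m\), so that \(\chi - \xi = e_u^{m-1}\), this is exactly the step that produces the telescoping terms in (5.16).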

On the other hand, Cauchy-Schwarz and Young’s inequality give

$$\begin{aligned} |\tau c(\rho _z^m,e_z^m)| \le \frac{\tau }{2} \Vert \rho _z^m \Vert _{c}^2 + \frac{\tau }{2} \Vert e_z^m \Vert _{c}^2. \end{aligned}$$
(5.17)

From the Stokes stability assumption (4.2) and (5.10a) we have the estimate

$$\begin{aligned} \beta _S \Vert e_p^m \Vert \le \sup _{v \in U_h} \frac{b(v, e_p^m)}{\Vert v \Vert _{1}} = \sup _{v \in U_h}\frac{-a(e_u^m,v)}{\Vert v \Vert _{1}} \le C_a \Vert e_u^m \Vert _{1}. \end{aligned}$$
(5.18)

Then, Cauchy-Schwarz, (5.18) and the coercivity of a give

$$\begin{aligned} \tau |\left( R^m, e_p^m\right) | \le C_a \beta _S^{-1} \gamma _a^{-1/2} \tau \Vert R^m \Vert \Vert e_u^m \Vert _{a}. \end{aligned}$$
(5.19)

Combining (5.16)–(5.19) yields

$$\begin{aligned} \Vert e_u^m \Vert _{a}^2 - \Vert e_u^{m-1} \Vert _{a}^2 + \tau \Vert e_z^m \Vert _{c}^2 + \Vert e_p^m \Vert _{d}^2 - \Vert e_p^{m-1} \Vert _{d}^2 \lesssim \tau \left( \Vert \rho _z^m \Vert _{c}^2 + \Vert R^m \Vert \Vert e_u^m \Vert _{a} \right) . \end{aligned}$$
(5.20)

with inequality constant depending on \(C_a \beta _S^{-1} \gamma _a^{-1/2}\).

Estimate of \(\Vert e_u^m \Vert _{a}\): Following a technique from [22], let J be the integer index where \(\Vert e_u^m \Vert _{a}\) (for \(m = 1, \ldots , N\)) attains its maximal value. Summing (5.20) from \(m=1\) to \(m=J\), using the maximality assumption, and re-arranging terms yields

$$\begin{aligned} \Vert e_u^J \Vert _{a}^2 + \tau \sum \limits _{m=1}^{J}\Vert e_z^m \Vert _{c}^2 + \Vert e_p^J \Vert _{d}^2 \lesssim \Vert e_u^0 \Vert _{a}^2 + \Vert e_p^0 \Vert _{d}^2 + \sum \limits _{m=1}^{J} \tau \Vert \rho _z^m \Vert _{c}^2 + \sum \limits _{m=1}^{J} \tau \Vert R^m \Vert \Vert e_u^J \Vert _{a}. \end{aligned}$$
(5.21)

We can apply Lemma 5.1 to (5.21) by taking \(A = \Vert e_u^J \Vert _{a}\), \(B = \Vert e_p^J \Vert _{d}\) and dropping the additional left-hand side term; then we choose

$$\begin{aligned} C = \sum \limits _{m=1}^{J}\tau \Vert R^m \Vert , \quad D = \Vert e_u^0 \Vert _{a}^2 + \Vert e_p^0 \Vert _{d}^2 +\sum \limits _{m=1}^{J}\tau \Vert \rho _z^m \Vert _{c}^2, \end{aligned}$$

and, provided appropriate temporal regularity of the exact solution, have

$$\begin{aligned} \sum \limits _{m=1}^{J} \tau \Vert \rho _z^m \Vert _{c}^2 \lesssim \int _{0}^{t^J} \Vert \rho _z \Vert _{c}^2 \, \mathrm {d} s. \end{aligned}$$

Lemma 5.1, with the above and the triangle inequality, implies

$$\begin{aligned} \Vert e_u^J \Vert _{a} + \Vert e_p^J \Vert _{d} \lesssim \Vert e_u^0 \Vert _{a} + \Vert e_p^0 \Vert _{d} + \sum \limits _{m=1}^{J}\tau \Vert R^m \Vert + \left( \int _{0}^{t^J} \Vert \rho _z \Vert _{c}^2\right) ^{1/2}. \end{aligned}$$
(5.22)

Bound of \(\tau \Vert R^m \Vert \): We now develop a bound for the terms \(\tau \Vert R^m \Vert \); c.f. (5.11). From the fundamental theorem of calculus and integration by parts we have the general result

$$\begin{aligned} \partial _t f^m - \partial _{\tau }f^m = \frac{1}{\tau }\int _{t^{m-1}}^{t^m} (s-t^{m-1})\partial _{tt}f(s) \, \mathrm {d} s\end{aligned}$$

for any \(m = 1, \ldots , N\), assuming sufficient temporal regularity of the field f. We therefore, again under the assumption of sufficient spatial and temporal regularity, have the inequalities

$$\begin{aligned} c_0 \Vert \partial _t p^m - \partial _{\tau }p^m \Vert&\le \int _{t^{m-1}}^{t^m} c_0 \Vert \partial _{tt} p \Vert \, \mathrm {d} s \end{aligned}$$
(5.23)
$$\begin{aligned} \Vert {\text {div}}\left( \partial _t u^m - \partial _{\tau }u^m\right) \Vert&\le \int _{t^{m-1}}^{t^{m}} \Vert \partial _{tt}u \Vert _{1} \, \mathrm {d} s, \end{aligned}$$
(5.24)

which control the first and fourth terms of \(\Vert R^m \Vert \).
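The remainder identity for \(\partial _t f^m - \partial _{\tau }f^m\) used above can be verified on a simple test field. The sketch below takes \(f(t) = t^3\) (an arbitrary choice) on one time step and evaluates both sides; the integral is computed with Simpson's rule, which is exact here because the integrand is quadratic. The function names and the step \([0.5, 0.75]\) are assumptions made for illustration.

```python
def simpson(g, a: float, b: float) -> float:
    """Simpson's rule on [a, b]; exact for polynomials up to degree three."""
    mid = 0.5 * (a + b)
    return (b - a) / 6.0 * (g(a) + 4.0 * g(mid) + g(b))


def remainder_identity_sides(f, df, d2f, t_prev: float, t_now: float):
    """Return both sides of
    d_t f^m - d_tau f^m = (1/tau) * int_{t^{m-1}}^{t^m} (s - t^{m-1}) f''(s) ds."""
    tau = t_now - t_prev
    lhs = df(t_now) - (f(t_now) - f(t_prev)) / tau
    rhs = simpson(lambda s: (s - t_prev) * d2f(s), t_prev, t_now) / tau
    return lhs, rhs


# Test field f(t) = t^3; the integrand (s - t_prev) * 6 s is quadratic.
lhs, rhs = remainder_identity_sides(
    lambda t: t**3, lambda t: 3.0 * t**2, lambda t: 6.0 * t, t_prev=0.5, t_now=0.75
)
# Crude bound as in (5.23)-(5.24): |lhs| <= int |f''| ds = 3 (t_now^2 - t_prev^2).
bound = 3.0 * (0.75**2 - 0.5**2)
```

The crude bound uses only \((s - t^{m-1})/\tau \le 1\) on the integration interval, which is the same step that leads from the identity to (5.23) and (5.24).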

For the second term of \(R^m\) we have \(\Vert {\text {div}}\partial _\tau \rho _u^m \Vert \le \Vert \partial _\tau \rho _u^m \Vert _{1}\). Rearranging the terms of \(\partial _{\tau }\rho _{u}^m\), applying the fundamental theorem of calculus and using the commutation of the time derivative with the elliptic projection (5.4) yields

$$\begin{aligned} \Vert \partial _{\tau }\rho _{u}^m \Vert _{1} = \Vert \frac{u^m - u^{m-1}}{\tau } - \frac{\varPi _{U_h}u^m - \varPi _{U_h} u^{m-1}}{\tau } \Vert _{1} \le \frac{1}{\tau } \int _{t^{m-1}}^{t^m} \Vert \rho _{\partial _t u} \Vert _{1} \, \mathrm {d} s. \end{aligned}$$
(5.25)

For the third term of \(R^m\) we have, again up to sufficient temporal regularity of the exact solution, that

$$\begin{aligned} \sum \limits _{m=1}^{J}\tau \Vert {\text {div}}\rho _z^m \Vert \lesssim \int _{0}^{t^J} \Vert {\text {div}}\rho _z \Vert \, \mathrm {d} s. \end{aligned}$$
(5.26)

Summarizing, (5.23)–(5.26) thus yield

$$\begin{aligned} \begin{aligned} \sum _{m=1}^J \tau \Vert R^m \Vert \lesssim&\int _{0}^{t^J} \Vert {\text {div}}\rho _z \Vert + \Vert \rho _{\partial _t u} \Vert _{1} + \tau \left( c_0 \Vert \partial _{tt}p \Vert + \Vert \partial _{tt}u \Vert _{1} \right) \, \mathrm {d} s\equiv C_{\tau }^J. \end{aligned} \end{aligned}$$
(5.27)

And so, the estimate (5.22) becomes

$$\begin{aligned} \Vert e_u^J \Vert _{a} + \Vert e_p^J \Vert _{d} \lesssim \Vert e_u^0 \Vert _{a} + \Vert e_p^0 \Vert _{d} + \left( \int _{0}^{t^J} \Vert \rho _z \Vert _{c}^2 \, \mathrm {d} s\right) ^{1/2} + C_\tau ^J \end{aligned}$$
(5.28)

Clearly, by Definition 4.1(i), this also gives a bound for \(\Vert e_u^m \Vert _{1}\) (depending on \(\gamma _a^{-1}\)) for \(m = 1, \ldots , N\).

Estimate of \(\Vert e_p^m \Vert \): The norm \(\Vert e_p^J \Vert _{d}\) in e.g. (5.28) vanishes in the limit as \(c_0 \rightarrow 0\). An alternative bound for \(\Vert e_p^m \Vert \) can be derived from the Stokes stability assumption, Definition 4.1(ii). In particular, using (5.18) and (5.28) it follows that for each \(1 \le m \le N\):

$$\begin{aligned} \Vert e_p^m \Vert \lesssim \Vert e_u^m \Vert _{1} \lesssim \Vert e_u^J \Vert _{a}, \end{aligned}$$
(5.29)

with inequality constant \(C_a \beta _S^{-1} \gamma _a^{-1/2}\) and where J is the index where \(\Vert e_u^m \Vert _{a}\) is maximal. Thus \(\Vert e_p^m \Vert \) can be bounded by the right-hand side of (5.28), independently of \(c_0\).

Estimate of \(\tau \Vert e_z^m \Vert _{c}^2\): In order to estimate the flux error in the norm defined by (3.9), i.e. \(\Vert e_z^m \Vert _{W}\), it will be advantageous to consider the constituents separately; e.g. \(\tau \Vert e_z^m \Vert _{c}^2\) and \(\tau ^2 \Vert {\text {div}}e_z^m \Vert ^2\).

We begin by considering the first component and again argue based on maximality. Taking the difference of the error equation (5.10a) at time levels m and \(m-1\) and dividing by \(\tau \) gives

$$\begin{aligned} a(\partial _{\tau } e_u^m, v) + b(v, \partial _{\tau } e_p^m) = 0 \quad \text {for } v \in U_h. \end{aligned}$$
(5.30)

Similarly, taking the difference of (5.10b) at time levels m and \(m-1\) and dividing by \(\tau ^2\) gives

$$\begin{aligned} c(\partial _{\tau } e_z^m, w) + b(w, \partial _{\tau } e_p^m) = c(\partial _{\tau }\rho _{z}^m, w) \quad \text {for } w \in W_h. \end{aligned}$$

Choose \(v = \partial _{\tau } e_u^m\), \(w = e_z^m\) in the above as well as \(q = - \partial _{\tau } e_p^m\) in (5.10c); summing these three equations, using Cauchy-Schwarz on the right-hand side, and coercivity on the left-hand side gives

$$\begin{aligned} \gamma _a \Vert \partial _{\tau } e_u^m \Vert _{1}^2 + \Vert \partial _{\tau } e_p^m \Vert _{d}^2 + c(\partial _{\tau } e_z^m, e_z^m) \le \Vert \partial _{\tau }\rho _z^m \Vert _{c}\Vert e_z^m \Vert _{c} + \Vert R^m \Vert \Vert \partial _{\tau } e_p^m \Vert . \end{aligned}$$

From Definition 4.1(ii) and (5.30) we have that \(\Vert \partial _{\tau }e_p^m \Vert _{} \le C_a \beta _S^{-1} \Vert \partial _{\tau }e_u^m \Vert _{1}\) by the analogue of (5.18). Using this on the right-most term of the above, alongside Cauchy’s inequality with epsilon and choosing epsilon appropriately, gives

$$\begin{aligned} \Vert \partial _{\tau }e_u^m \Vert _{1}^2 + \Vert \partial _{\tau } e_p^m \Vert _{d}^2 + c(\partial _{\tau } e_z^m, e_z^m) \lesssim \Vert \partial _{\tau }\rho _z^m \Vert _{c} \Vert e_z^m \Vert _{c} + \Vert R^m \Vert ^2, \end{aligned}$$

with inequality constant depending on \(C_a \beta _S \gamma _a^{-1}\). Dropping the positive displacement and pressure left-hand side terms, multiplying both sides by \(\tau \), and using the symmetry of c together with the inequality (5.15) gives

$$\begin{aligned} \Vert e_z^m \Vert _{c}^2 - \Vert e_z^{m-1} \Vert _{c}^2 \lesssim \tau \Vert \partial _{\tau } \rho _{z}^{m} \Vert _{c} \Vert e_z^m \Vert _{c} + \tau \Vert R^m \Vert ^2. \end{aligned}$$

Let M be the index where \(\Vert e_z^m \Vert _{c}^2\) achieves its maximum for \(1 \le m \le N\). Summing the above from \(m=1\) to \(m=M\), using the maximality of \(\Vert e_z^M \Vert _{c}\), multiplying both sides by \(\tau \) and re-arranging yields

$$\begin{aligned} \tau \Vert e_z^M \Vert _{c}^2 \lesssim \tau \Vert e_z^0 \Vert _{c}^2 + \tau \left( \sum \limits _{m=1}^{M} \tau \Vert \partial _{\tau }\rho _{z}^m \Vert _{c}\right) \Vert e_z^M \Vert _{c} + \sum \limits _{m=1}^{M} \left( \tau \Vert R^m \Vert \right) ^2. \end{aligned}$$
(5.31)

By the fundamental theorem of calculus, we have

$$\begin{aligned} \tau \left( \Vert \partial _{\tau }\rho _z^m \Vert _{c} \right) = \Vert \rho _z^m - \rho _z^{m-1} \Vert _{c} \le \int _{t^{m-1}}^{t^m} \Vert \rho _{\partial _t z} \Vert _{c} \, \mathrm {d} s. \end{aligned}$$

Applying Hölder’s inequality on the right-most term, above, gives

$$\begin{aligned} \int _{t^{m-1}}^{t^m} \Vert \rho _{\partial _t z} \Vert _{c} \, \mathrm {d} s \le \left( \int _{t^{m-1}}^{t^m} 1\, \mathrm {d} s\right) ^{1/2} \left( \int _{t^{m-1}}^{t^m} \Vert \rho _{\partial _t z} \Vert _{c}^2 \, \mathrm {d} s\right) ^{1/2} \end{aligned}$$

so that

$$\begin{aligned} \tau \left( \Vert \partial _{\tau }\rho _z^m \Vert _{c} \right) \le \tau ^{1/2}\left( \int _{t^{m-1}}^{t^m} \Vert \rho _{\partial _t z} \Vert _{c}^2 \, \mathrm {d} s\right) ^{1/2}. \end{aligned}$$
(5.32)

Inserting (5.32) and (5.27) into (5.31), using Young’s inequality on the second term on the right-hand side and rearranging yields

$$\begin{aligned} \tau \Vert e_z^M \Vert _{c}^2&\lesssim \tau \Vert e_z^0 \Vert _{c}^2 + \tau \left( \tau \sum \limits _{m=1}^{M}\Vert \partial _{\tau }\rho _{z}^m \Vert _{c}\right) ^2 + \sum \limits _{m=1}^{M} \left( \tau \Vert R^m \Vert \right) ^2, \nonumber \\&\lesssim \tau \Vert e_z^0 \Vert _{c}^2 + \tau ^2 \int _{0}^{t_M} \Vert \rho _{\partial _t z} \Vert _{c}^2 \, \mathrm {d} s+ \left( C_{\tau }^M \right) ^2 \end{aligned}$$
(5.33)

Estimate of \(\tau ^2\Vert {\text {div}}e_z^m \Vert _{}\):

Now we estimate the second, and final, term in the flux norm (3.9). Let K denote the index where \(\Vert {\text {div}}e_z^K \Vert \) is maximal. Using Definition 4.1(iii), and selecting \(q = \tau {\text {div}}e_z^K\) in the error equation (5.10c) for \(m = K\) yields

$$\begin{aligned}&\left( {\text {div}}(e_u^K - e_u^{K-1}), {\text {div}}e_z^K\right) + \tau \left( {\text {div}}e_z^K, {\text {div}}e_z^K\right) - \left( c_0 (e_p^K - e_p^{K-1}), {\text {div}}e_z^K\right) \\&\quad = \tau \left( R^K, {\text {div}}e_z^K\right) . \end{aligned}$$

Thus, re-arranging terms, using Cauchy-Schwarz and the triangle inequalities, and dividing by \(\Vert {\text {div}}e_z^K \Vert \) gives

$$\begin{aligned} \begin{aligned} \tau \Vert {\text {div}}e_z^K \Vert&\lesssim \Vert e_u^K \Vert _{1} + \Vert e_u^{K-1} \Vert _{1} + c_0 \Vert e_p^K \Vert + c_0 \Vert e_p^{K-1} \Vert + \tau \Vert R^K \Vert \\&\lesssim \Vert e_u^J \Vert _{1} + \tau \Vert R^K \Vert , \end{aligned} \end{aligned}$$

where the last inequality follows from the majorization of the terms \(e_u^K\), \(e_p^K\), \(e_u^{K-1}\), \(e_p^{K-1}\) by the maximum \(e_u^J = \max \limits _{j=1,2,\ldots ,N}e_u^{j}\) and inequality (5.29). Noting that

$$\begin{aligned} \left( \Vert e_u^J \Vert _{1} + \tau \Vert R^K \Vert \right) ^{2} \lesssim \Vert e_u^J \Vert _{1}^2 + \tau ^2 \Vert R^K \Vert ^2 \lesssim \Vert e_u^J \Vert _{1}^2 + \sum _{m=1}^{K} \tau ^2 \Vert R^m \Vert ^2, \end{aligned}$$

and employing (5.28), (5.27) and taking \(I=\max \left\{ J,K\right\} \) then gives

$$\begin{aligned} \tau ^2 \Vert {\text {div}}e_z^K \Vert ^2 \lesssim \Vert e_u^0 \Vert _{1}^2 + \Vert e_p^0 \Vert _{d}^2 + \int _{0}^{t_I} \Vert \rho _z \Vert _{c}^2 \, \mathrm {d} s+ \left( C_{\tau }^I \right) ^2 . \end{aligned}$$
(5.34)

Finally, to establish (5.13), combine the definition of the weighted flux norm (3.9) with (5.28), (5.29), (5.33), and (5.34), and use the fact that the integral from 0 to T majorizes all of the time-integral right-hand sides of the summed expressions. \(\square \)

5.4 Convergence estimates

To specialize the general results of Proposition 5.1 we will first suppose that the exact solutions to (3.2a)-(3.2c) satisfy suitable regularity assumptions. Moreover, we assume the interpolants, discussed in Sect. 5.1, satisfy approximation inequalities of a certain order. Towards that end let \(U_h \times W_h \times Q_h\) satisfy the assumptions of Definition 4.1. For a reflexive Banach space X, a time interval \((a, b) \subseteq \mathbb {R}\) and a measurable \(f: (a,b) \rightarrow X\) we define the canonical space-time norm [10]

$$\begin{aligned} \Vert f \Vert _{L^p(a,b;X)} = \left( \int _{a}^{b} \Vert f(s) \Vert _{X}^p \, \mathrm {d} s\right) ^{1/p}. \end{aligned}$$
(5.35)

As in the case of spatial derivatives, the usual Sobolev notation \(f \in H^r(a, b; X)\) means that \(f \in L^2(a, b; X)\) and that \(\partial _t f\), \(\partial ^2_t f\), \(\ldots \), \(\partial ^r_t f\) are also in \(L^2(a, b; X)\). In the sections that follow we will sometimes use the abbreviations \(\Vert f \Vert _{L^2 X}\) or \(\Vert f \Vert _{H^r X}\) to signify (5.35).
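
As a rough illustration of how the usual \(L^p(a,b;X)\) norm (with integrand \(\Vert f(s) \Vert _X^p\)) might be approximated from time samples, the following sketch applies the composite trapezoidal rule to precomputed spatial norms. The function name `bochner_norm` and the uniform sampling are assumptions made for this example only; they are not part of the analysis.

```python
def bochner_norm(spatial_norms, a, b, p=2.0):
    """Approximate ||f||_{L^p(a,b;X)} from samples of ||f(t_i)||_X.

    spatial_norms[i] is assumed to hold ||f(t_i)||_X at uniformly spaced
    times t_i = a + i*(b - a)/n, i = 0, ..., n (an assumption of this sketch).
    """
    n = len(spatial_norms) - 1
    if n < 1:
        raise ValueError("need at least two time samples")
    dt = (b - a) / n
    # Integrand of the Bochner norm: ||f(s)||_X^p
    vals = [v ** p for v in spatial_norms]
    # Composite trapezoidal rule in time
    integral = dt * (0.5 * vals[0] + sum(vals[1:-1]) + 0.5 * vals[-1])
    return integral ** (1.0 / p)
```

For example, a function whose spatial norm is constant in time, equal to 2 on \((0,1)\), has \(L^2(0,1;X)\) norm 2, which the trapezoidal rule reproduces exactly.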

Proposition 5.2

Suppose the assumptions of Proposition 5.1 hold. Let \(k \ge 0\) be the greatest integer such that the orthogonal projection, \(\varPi _{Q_h}: Q \rightarrow Q_h\), satisfies (5.1). Suppose \(r \ge 1\) is the maximal integer such that \(P_r\), the space of continuous Lagrange polynomials of order r, is contained in \(U_h\); suppose an interpolation, from W to \(W_h\), satisfying either (5.2) or (5.3) exists and let \(s > 0\) be the maximal integer satisfying the respective inequality. Suppose that the exact solutions to (3.2a)-(3.2c) satisfy the regularity assumptions

$$\begin{aligned} \begin{array}{lll} u(t) \in L^{\infty }((0,T];H^{r+1}\cap U) &{} \partial _t u\in L^{1}((0,T];H^{r+1}\cap U) &{} \partial _{tt}u\in L^{1}((0,T];H^1)\\ z(t) \in L^{\infty }((0,T];H^{s+1}\cap W)\cap L^{\infty }((0,T];H^{s+1}_{\kappa ^{-1}}\cap W) &{} \partial _t z \in L^2((0,T];H^{s+1}_{\kappa ^{-1}}\cap W)\\ p(t) \in L^{\infty }((0,T];H^{k+1}\cap Q) &{} \partial _{tt}p \in L^1((0,T];L^1), \end{array} \end{aligned}$$

and that the initial iterates, \((u_h^0,z_h^0,p_h^0)\), satisfy the estimates

$$\begin{aligned}&\Vert u(0)-u_h^0 \Vert _{1} + \tau ^{1/2} \Vert z(0)-z_h^0 \Vert _{c} + \Vert p(0)-p_h^0 \Vert _{d} \nonumber \\&\quad \lesssim \,\, h^{r}\Vert u(0) \Vert _{H^{r +1}} + \tau ^{1/2}h^{s+1}\Vert z(0) \Vert _{\kappa ^{-1}H^{s + 1}} + h^{k+1}\Vert p(0) \Vert _{H^{k + 1}}, \end{aligned}$$
(5.36)

consistent with the projections of Sect. 5.1. Then for \(c = \min \{k,r,s\}\) we have

(5.37)

where \(M_1\) and \(M_2\) are given by

$$\begin{aligned} M_1&= h^{r-c}\left( \Vert \partial _t u \Vert _{L^1H^{r+1}} + \Vert u \Vert _{L^{\infty }H^{r+1}}\right) + h^{s-c}\left( \Vert z \Vert _{L^1H^{s+1}}\right. \\&\quad + \left. (h+\tau ^{1/2})\Vert z \Vert _{L^{\infty }H^{s+1}_{\kappa ^{-1}}} \right) + h^{k-c}\Vert p \Vert _{L^{\infty }H^{k+1}}\\ M_2&= c_0 \Vert \partial _{tt} p \Vert _{L^1L^1} + \Vert \partial _{tt}u \Vert _{L^1H^1} + h^{s+1}\Vert \partial _t z \Vert _{L^2H^{s+1}_{\kappa ^{-1}}} + h^s\Vert z \Vert _{L^{\infty }H^{s+1}} \end{aligned}$$

Proof

First, note that since \(\varPi _{Q_h}\) satisfies (5.1) and since \(P_r \subset U_h\) then, according to the argument directly preceding (5.5), the inequality (5.5) holds. Using the triangle inequality,

$$\begin{aligned} \Vert e_u^0 \Vert _{1} + \tau ^{1/2} \Vert e_z^0 \Vert _{c} + \Vert e_p^0 \Vert _{d}&\le \Vert u(0)-u_h^0 \Vert _{1} + \tau ^{1/2}\Vert z(0)-z_h^0 \Vert _{c} \\&\quad + \Vert p(0)-p_h^0 \Vert _{d} + \Vert \rho _u^0 \Vert _{1} + \Vert \rho _z^0 \Vert _{c} + \Vert \rho _p^0 \Vert _{d}, \end{aligned}$$

along with (5.36) and the projection estimates of Sect. 5.1, applied to the last three terms above, gives

$$\begin{aligned} \Vert e_u^0 \Vert _{1} + \tau ^{1/2} \Vert e_z^0 \Vert _{c} + \Vert e_p^0 \Vert _{d}\lesssim & {} \,\, h^{r}\Vert u(0) \Vert _{H^{r +1}} + \tau ^{1/2}h^{s+1}\Vert z(0) \Vert _{\kappa ^{-1}H^{s + 1}} \nonumber \\&+ h^{k+1}\Vert p(0) \Vert _{H^{k + 1}}. \end{aligned}$$
(5.38)

Then (5.37) follows from the triangle inequality, with respect to the error decompositions (5.6), along with: the discrete error estimates (5.13); discrete initial iterate error estimates (5.38); and interpolation estimates (5.1)–(5.4). \(\square \)

Remark 5.1

Further assumptions on the discrete spaces, beyond the minimal Stokes–Biot stability of Definition 4.1, can lead to slightly different versions of Proposition 5.2. For instance, if \((W_h,Q_h)\) are such that the usual Raviart-Thomas type projection commutation relation

$$\begin{aligned} \left( {\text {div}}z - {\text {div}}\varPi _{W_h}z,q_h\right) = 0 ,\quad \text {for all } q_h \in Q_h, \end{aligned}$$

holds for each \(z\in W\) then \(({\text {div}}\rho _z,q_h) = 0\) so that, for instance, the contribution \(\Vert z \Vert _{L^1 H^{s+1}}\) vanishes from \(M_1\); this term arises from \(\Vert {\text {div}}\rho _z \Vert _{}\) in (5.13). This observation is used in [30] where \(W_h = RT_0\) is fixed.

6 Numerical experiments

Turning to numerical evaluation of our theoretical findings, we investigate the stability and numerical convergence properties, for \(0 < \kappa \ll 1\) and \(0 \le c_0 \le 1\), of two mixed finite element pairings:

  (i) \(U_h \times W_h \times Q_h = P^2_2 \times RT_0 \times DG_0\), and

  (ii) \(U_h \times W_h \times Q_h = P^2_2 \times P^2_1 \times DG_0\).

The first discretization is a canonical choice from the original view of conforming Stokes–Biot [19, 30] stability whereas the second choice is only minimally Stokes–Biot stable. We define a manufactured, smooth exact solution set over the unit square \(\varOmega = [0,1] \times [0,1]\), with coordinates \(x = (x_1, x_2) \in \varOmega \), given by

$$\begin{aligned}&u(t, x) = \begin{pmatrix} t\sin (\pi x_1) \sin (\pi x_2) \\ 2 t \sin (3 \pi x_1) \sin (4\pi x_2) \end{pmatrix},\\&p(t, x) = (t+1) \left( \left( (x_1 - 1) x_1 (x_2 - 1) x_2 \right) ^2 - \frac{1}{900} \right) . \end{aligned}$$

for \(t \in (0, T)\), \(T = 1.0\). These solutions satisfy the homogeneous boundary conditions \(u_{|\partial \varOmega } = 0\) and \(z_{|\partial \varOmega }\cdot n = 0\) where n is the outward boundary normal to the unit square. By construction \(p(t, \cdot ) \in L^2_0(\varOmega )\) for each t. For each discretization we consider three parameter scenarios: vanishing storage (\(c_0 = 0\)) with diminishing hydraulic conductivity (\(\kappa \rightarrow 0\)); fixed storage (\(c_0 = 1\)) with diminishing hydraulic conductivity (\(\kappa \rightarrow 0\)); and fixed hydraulic conductivity (\(\kappa = 1.0\)) with vanishing storage (\(c_0 \rightarrow 0\)). For simplicity, we here consider unit Lamé parameters: \(\mu = \lambda = 1.0\). We let the time step size \(\varDelta t = T = 1.0\) as the test case is linear in time. For solving (3.2) numerically, we used the FEniCS finite element software suite [1]. The zero average-value condition on the pressure is enforced via a single real Lagrange multiplier. Linear systems were solved using MUMPS.
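
The constant \(1/900\) in the manufactured pressure is what makes \(p(t,\cdot )\) mean-free on the unit square. The following sketch, written in plain Python and independent of the FEniCS code used for the experiments, checks this exactly with rational arithmetic and also verifies numerically that the manufactured displacement vanishes on \(\partial \varOmega \). All function names are illustrative assumptions.

```python
from fractions import Fraction
import math

def mean_of_squared_bubble():
    """Exact integral of ((x1-1) x1 (x2-1) x2)^2 over the unit square."""
    # 1D factor: int_0^1 x^2 (1-x)^2 dx = int_0^1 (x^2 - 2 x^3 + x^4) dx
    one_d = Fraction(1, 3) - Fraction(2, 4) + Fraction(1, 5)
    # The integrand is a tensor product, so the 2D integral is the square
    return one_d * one_d

def p_exact(t, x1, x2):
    """Manufactured pressure from the text."""
    return (t + 1) * (((x1 - 1) * x1 * (x2 - 1) * x2) ** 2 - 1.0 / 900.0)

def u_exact(t, x1, x2):
    """Manufactured displacement from the text."""
    return (t * math.sin(math.pi * x1) * math.sin(math.pi * x2),
            2 * t * math.sin(3 * math.pi * x1) * math.sin(4 * math.pi * x2))

def midpoint_mean(f, n=100):
    """Midpoint-rule approximation of the integral of f over [0,1]^2."""
    h = 1.0 / n
    return sum(f((i + 0.5) * h, (j + 0.5) * h)
               for i in range(n) for j in range(n)) * h * h

def max_boundary_displacement(t, n=50):
    """Largest |u_i| sampled on the four edges of the unit square."""
    pts = [k / n for k in range(n + 1)]
    edge = [(0.0, s) for s in pts] + [(1.0, s) for s in pts] \
         + [(s, 0.0) for s in pts] + [(s, 1.0) for s in pts]
    return max(abs(c) for (x1, x2) in edge for c in u_exact(t, x1, x2))
```

Since the unit square has area one, the exact value `Fraction(1, 900)` returned by `mean_of_squared_bubble()` is precisely the shift subtracted in the definition of \(p\).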

In Sects. 6.1 and 6.2 we examine the numerical errors for a fixed, minimally Stokes–Biot stable discretization on a series of uniform meshes, \(\mathcal {T}_h\), with mesh size h. Each of the error tables in Sects. 6.1 and 6.2 follows the same general format. The relative displacement errors \(\Vert \tilde{u}(T) - u_h(T)\Vert _1/\Vert \tilde{u}(T)\Vert _1\) are reported in the first set of table rows; the relative pressure errors \(\Vert \tilde{p}(T) - p_h(T) \Vert /\Vert \tilde{p}(T) \Vert \) appear in the second set of table rows; and the relative flux errors appear in the final set of table rows. The last column of each table (‘Rate’) denotes the order of convergence computed from the last two values in each row. For each investigation, either \(c_0\) or \(\kappa \) varies while the other is fixed; results for the varying parameter are reported for the values \(10^{-r}\) for \(r=0,4,8\) and 12, but intermediate results identical to the previous case are suppressed. For instance, if \(\kappa =10^0\), \(\kappa =10^{-4}\), and \(\kappa = 10^{-8}\) all yield the same errors for a given quantity, then only the errors for \(\kappa =10^0\) and \(\kappa = 10^{-12}\) are reported in the corresponding row. In many instances, the displacement errors correspond directly to those of a previous case; in this event, we refer to the appropriate table.
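
A ‘Rate’ entry of this kind is commonly obtained from two consecutive errors as \(\log (e_1/e_2)/\log (h_1/h_2)\). The sketch below assumes this standard formula; the text does not spell out the computation, and the function name and the numbers in the usage note are illustrative only, not values from the tables.

```python
import math

def convergence_rate(err_coarse, err_fine, h_coarse, h_fine):
    """Observed order of convergence from errors on two meshes.

    Assumes err ~ C * h**rate, so that
    rate = log(err_coarse / err_fine) / log(h_coarse / h_fine).
    """
    if min(err_coarse, err_fine, h_coarse, h_fine) <= 0:
        raise ValueError("errors and mesh sizes must be positive")
    return math.log(err_coarse / err_fine) / math.log(h_coarse / h_fine)
```

For instance, a hypothetical error that drops from \(4 \times 10^{-3}\) to \(10^{-3}\) when h is halved from 1/64 to 1/128 corresponds to an observed rate of 2.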

6.1 Convergence of a Stokes–Biot stable pairing

We first consider the convergence properties for the pairing \(U_h \times W_h \times Q_h = P^2_2(\mathcal {T}_h) \times RT_0(\mathcal {T}_h) \times DG_0(\mathcal {T}_h)\). This discretization satisfies the minimal Stokes–Biot stability conditions and also the Darcy stability condition when \(\kappa \) is uniformly bounded below. We note, however, that the Darcy stability condition fails to hold [2] uniformly as \(\kappa \) tends to zero but that Definition 4.1(iii) does indeed hold regardless of \(\kappa \). As discussed in the previous section, we report on the relative approximation errors for the displacement, pressure, and flux for a series of uniform meshes \(\mathcal {T}_h\) with mesh size h. The exact solutions \(\tilde{u}, \tilde{p}, \tilde{z}\) were represented by continuous piecewise cubic interpolants in the error computations.

6.1.1 Vanishing storage \(c_0 = 0\), varying conductivity \(0 < \kappa \le 1\)

(see: Table 2) We observe that the displacement error converges at the expected and optimal rate (2) for \(\kappa \) ranging from 1 down to \(10^{-12}\). Overall, the displacement errors remain essentially unchanged as \(c_0\) and \(\kappa \) vary. (We therefore do not report or discuss these further here.) The behaviour of the flux and pressure errors is less regular. The flux and pressure approximation errors increase as \(\kappa \) decreases, but seem to stabilize, i.e. they do not increase substantially further from \(\kappa = 10^{-4}\) to \(10^{-8}\) and to \(10^{-12}\). Moreover, for each \(\kappa \), the pressure and flux errors decrease with decreasing mesh size. Indeed, for \(h = 1/128\), the pressure errors are of similar magnitude for the range of hydraulic conductivities (\(\kappa \)) tested. For a comparison to a minimally Stokes–Biot stable analogue, see Sect. 6.2.1 and Table 5.

6.1.2 Fixed storage \(c_0 = 1\), varying conductivity \(0 < \kappa \le 1\)

(see: Table 3) For this case, we again observe that the flux and pressure approximation errors increase as \(\kappa \) decreases, but seem to stabilize and not increase substantially further from \(\kappa = 10^{-4}\) to \(10^{-8}\) and \(10^{-12}\). Again, for each \(\kappa \), the pressure and flux errors decrease with decreasing mesh size and, for \(h = 1/128\), the pressure errors are nearly identical for the range of hydraulic conductivities (\(\kappa \)) tested. For comparison, see Sect. 6.2.2 and Table 6.

6.1.3 Fixed conductivity \(\kappa = 1\), varying storage \(0 \le c_0 \le 1\)

(see: Table 4) For this case, we observe nearly uniform behaviour as \(c_0\) decreases. The pressure and flux errors are similar for the range of storage coefficients (\(c_0\)) considered, and converge at the optimal and expected rate (1). For comparison, see Sect. 6.2.3 and Table 7.

Table 2 Vanishing storage coefficient \(c_0 = 0\), varying conductivity \(0 < \kappa \le 1\) for the (minimally) Stokes–Biot stable pairing \(P^2_2(\mathcal {T}_h) \times RT_0(\mathcal {T}_h) \times DG_0(\mathcal {T}_h)\)
Table 3 Fixed storage coefficient \(c_0 = 1\), varying conductivity \(0 < \kappa \le 1\) for the (minimally) Stokes–Biot stable pairing \(P^2_2(\mathcal {T}_h) \times RT_0(\mathcal {T}_h) \times DG_0(\mathcal {T}_h)\)
Table 4 Fixed hydraulic conductivity \(\kappa = 1\), varying storage \(0 < c_0 \le 1\) for the (minimally) Stokes–Biot stable pairing \(P^2_2(\mathcal {T}_h) \times RT_0(\mathcal {T}_h) \times DG_0(\mathcal {T}_h)\)

6.2 Convergence of a minimally Stokes–Biot stable pairing

We now turn to consider the convergence properties for the pairing \(U_h \times W_h \times Q_h = P^2_2(\mathcal {T}_h) \times P^2_1(\mathcal {T}_h) \times DG_0(\mathcal {T}_h)\) and again report on the relative approximation errors for the displacement, pressure and flux. This pairing does not satisfy a Darcy stability condition, for any value of \(\kappa \), as advanced in the original Stokes–Biot stability criteria; it does satisfy the minimally Stokes–Biot criterion of Definition 4.1. Numerical results for this minimally Stokes–Biot stable discretization, for the three paradigms considered in Sect. 6.1, are presented in Sects. 6.2.1–6.2.3 alongside specific comparisons to the standard Stokes–Biot stable case.

The results of this comparison supply computational evidence that Definition 3.1(iii) can be replaced by Definition 4.1(iii) while retaining the convergence properties first observed in [17, 30]. Since the Darcy stability of Definition 3.1(iii) is not satisfied [2] uniformly in \(\kappa \), our observations strongly suggest that the minimal Stokes–Biot stability assumptions, specifically Definition 4.1(iii), are, in fact, the key component for discretizations that retain their convergence properties as \(\kappa \) tends to zero.

6.2.1 Vanishing storage \(c_0 = 0\), varying conductivity \(0 < \kappa \le 1\)

(see: Table 5) Comparing Table 5 with Table 2, we observe that the performance of the two element pairings is almost surprisingly similar. Again, the displacement converges at the optimal and expected rate (2); the pressure and flux errors increase with decreasing \(\kappa \), but stabilize, and converge with decreasing mesh size. We further observe that the relative errors for the flux for this element pairing are smaller than for the \(P^2_2 \times RT_0 \times DG_0\) case (bottom rows). For a comparison to a discretization satisfying Darcy stability (though not uniformly in \(\kappa \)) see Sect. 6.1.1 and Table 2.

6.2.2 Fixed storage \(c_0 = 1\), varying conductivity \(0 < \kappa \le 1\)

(see: Table 6) Comparing Table 6 with Table 3, we again observe highly comparable performance. The observations made for the \(P^2_2 \times RT_0 \times DG_0\) case thus also apply for \(P^2_2 \times P^2_1 \times DG_0\). For comparison, see Sect. 6.1.2 and Table 3.

6.2.3 Fixed conductivity \(\kappa = 1\), varying storage \(0 \le c_0 \le 1\)

(see: Table 7) For this case, we observe convergence rates similar to those in, e.g., Table 4. The pressure error increases very moderately with decreasing \(c_0\) (it doubles as \(c_0\) is reduced by 12 orders of magnitude), but both the pressure and flux converge at the optimal and expected rate (1). For comparison, see Sect. 6.1.3 and Table 4.

Table 5 Vanishing storage coefficient \(c_0 = 0\), varying conductivity \(0 < \kappa \le 1\) for the minimally Stokes–Biot stable pairing \(P^2_2(\mathcal {T}_h) \times P^2_1(\mathcal {T}_h) \times DG_0(\mathcal {T}_h)\)
Table 6 Fixed storage coefficient \(c_0 = 1\), varying conductivity \(0 < \kappa \le 1\) for the minimally Stokes–Biot stable pairing \(P^2_2(\mathcal {T}_h) \times P^2_1(\mathcal {T}_h) \times DG_0(\mathcal {T}_h)\)
Table 7 Fixed hydraulic conductivity \(\kappa = 1\), varying storage \(0 < c_0 \le 1\) for the minimally Stokes–Biot stable pairing \(P^2_2(\mathcal {T}_h) \times P^2_1(\mathcal {T}_h) \times DG_0(\mathcal {T}_h)\)

7 Conclusion

The important concept of Stokes–Biot stability, introduced independently by [19, 22, 27, 30], has proven a practical key to the selection of conforming Euler–Galerkin discretizations of Biot’s equations (1.1) that retain their convergence properties as the hydraulic conductivity (\(0<\kappa \)) becomes arbitrarily small. The novel contributions of this manuscript are primarily theoretical in nature; we have shown that the Stokes–Biot stability perspective can be, formally, relaxed and we have introduced the notion of a minimally Stokes–Biot stable Euler–Galerkin discretization. The stability of minimally Stokes–Biot stable schemes is independent of both \(c_0\) and \(\kappa \) (c.f.  Sect. 4.2, [19, Theorem 1] and [17, Theorem 3.2, Case I]), and we have presented a convergence analysis in this context.

In particular, we differ from previous authors [30] by carrying out our convergence analysis without the use of a Galerkin projection based on the Darcy problem. In doing so, we are able to depart from both the Darcy stability assumption, in general, and any questions regarding the appropriate norms for uniform-in-\(\kappa \) Darcy stability. In fact, an analysis based on a uniform-in-\(\kappa \) Darcy stability assumption should take into account a pressure-space norm exhibiting one of the forms discussed in Section 3.3; namely \(L^2 + \kappa ^{1/2} H^1\), \(\kappa ^{1/2}L^2\) or \(\kappa ^{1/2}H^1\). Each of these pressure norms has related difficulties compared with the usual pressure \(L^2\) norm used here. First, it is not entirely clear to the authors that the \(L^2+\kappa ^{1/2} H^1\) norm can be treated with the otherwise-standard arguments presented here and in related work [17, 19, 22, 30]. Second, the \(\kappa ^{1/2}L^2\) and \(\kappa ^{1/2}H^1\) weightings both degenerate as \(\kappa \) becomes small and must be balanced by an appropriate displacement norm so that Stokes stability (Definition 3.1(i), (ii) and Definition 4.1(i), (ii)) holds uniformly in \(\kappa \) as well. In contrast, our arguments bring together many standard techniques and make definitively clear, by dispensing with Darcy stability, that one need not consider any \(\kappa \)-weighted norms for the pressure space in order to ensure stability and approximation as \(\kappa \) diminishes. Moreover, the convergence analysis presented here is the first instance, of which we are aware, of an analysis carried out in the context of the full norm used to prove the Banach-Nečas-Babuška stability (c.f.  Sect. 4.2) of the Euler–Galerkin discretization (3.5). Thus, as neither the current convergence analysis, nor the previously-established BNB stability result, relies on a Darcy stability assumption, Proposition 5.2 solidifies, and generalizes, previous convergence estimates [30].
The concept of minimal Stokes–Biot stability therefore broadens the original view of Stokes–Biot stability to include alternative spaces that may not be Darcy stable; even for a fixed choice of \(\kappa \).

7.1 Further observations and practical considerations

The primary contribution of the current work is theoretical in nature; we have, in practice, removed the Darcy restriction for Stokes–Biot stability and demonstrated an alternative convergence analysis in this context. Nevertheless, practical questions regarding the suitability of both Stokes–Biot and minimally Stokes–Biot stable approaches, solving (3.2), can be asked. In particular, we now briefly discuss: the drawbacks of Stokes–Biot and minimally Stokes–Biot stable discretizations; what computational advantages, if any, are granted by the minimal Stokes–Biot perspective; and alternatives to the boundary conditions (2.5).

Both Stokes–Biot and minimally Stokes–Biot stable discretizations are not without their drawbacks. The requirements of both Definition 3.1 and Definition 4.1 are general; however, in practice, both approaches typically make use of discontinuous pressures. This theme is present in the literature for both conformal and non-conformal discretizations. In practice, the need for a discontinuous pressure space imposes restrictions on the choice of elements. In this manuscript we have used \(P_2^d \times RT_0 \times DG_0\) and \(P_2^d \times P_1^d \times DG_0\) discretizations to illustrate a simple comparison (c.f. Sect. 6) via numerical experiments in 2D. In two dimensions, as we discussed in Sect. 4, one could also consider pairings of Scott-Vogelius type, i.e. \(P_k^d \times RT_m \times DG_{k-1}\) or \(P_k^d \times P_m^d \times DG_{k-1}\) where \(k \ge 4\). In the context of minimally Stokes–Biot stable triples, we have that the flux space degree can be chosen as \(0<m \le k-1\) in both cases; from the original Stokes–Biot point of view, one would require \(m=k-1\) for the case of Raviart-Thomas elements and polynomial fluxes would not be admissible at all. In 3D, one could also consider extensions of the Scott-Vogelius elements [15], the enriched cubic displacement element and piecewise constant pressures introduced by Guzmán and Neilan [14], the bubble-enriched continuous linear displacement element and piecewise constant pressures as in [30], or the related bubble-enriched continuous quadratic elements with discontinuous linear pressures [8]; these spaces could be considered alongside fluxes of Raviart-Thomas, Brezzi-Douglas-Marini, and Lagrange type. One may also consider quadrilateral meshes by using the Stokes pairing [8] given by \(Q^2_2 \times DG_1\) along with, for instance, \(RT_m\) (\(m=0\) or 1), \(BDM_k\) or \(P_k^d\) (\(k=1\) or 2) fluxes.

It is practical to note that minimally Stokes–Biot stable discretizations do not necessarily confer a computational advantage over those with Darcy stable (for fixed \(\kappa \)) flux-pressure pairings when equal-order fluxes are selected. That is to say, for instance, that the \(RT_k\) fluxes will have fewer DOFs than the alternative \(P^d_k\) fluxes discussed in this manuscript. However, minimal Stokes–Biot stability makes it clear that one can lower the order of the flux space without adversely impacting the stability and convergence of the method. One could interpret this as a form of ‘computational advantage’ of minimal Stokes–Biot. Overall, however, this is not the important point of minimal Stokes–Biot stability. The important points are that: Darcy stability is not necessary; typical ‘Darcy stable pairings’ satisfy the minimal Stokes–Biot criteria; and that approximation in both contexts yields strikingly similar results. Indeed, numerical experiments (Sect. 6) show similar errors both with (Table 2–Table 4) and without (Table 5–Table 7) a Darcy stability assumption; even as \(\kappa \) becomes very small. Moreover, we would not expect an improvement in results if the norms of the flux and pressure were altered to provide for a uniform-in-\(\kappa \) Darcy stability condition (c.f. (A) and (B) of Sect. 3.3); this is due to the fact that approximation of the pressure in the \(L^2 + \kappa ^{1/2} H^1\) norm is similar to that of the \(L^2\) norm while approximation in the other option, \(\kappa ^{1/2} L^2\), degrades as \(\kappa \) becomes small. Thus, the tenets of minimal Stokes–Biot stability (Definition 4.1) provide an approximation of the pressure in the most sensible norm; that is, the \(L^2\) norm is a fortuitous choice for convergence analysis, assures the proper context for Stokes stability, and does not degrade as \(\kappa \) becomes small.
Our conclusion is that one can think, instead, in terms of Definition 4.1 (iii) when designing, or analyzing, approaches for Biot when \(\kappa \rightarrow 0\). This important point could certainly impact the design, or choice, of discretizations that do in fact confer a computational advantage over those where Darcy stability is a requirement.

Finally, we close with a brief revisitation of the boundary conditions discussed in Sect. 2.5. Extending (2.5) to inhomogeneous data is not a concern; essential boundary conditions can be lifted by selecting a particular solution, and natural boundary conditions yield right-hand side terms that vanish in the error equations (5.10) and do not alter the stability arguments (Sect. 4). However, it is valid to note that the assumption that \(\varGamma _f = \varGamma _c\) and \(\varGamma _t = \varGamma _p\), in (2.4), may not be practical for problems of interest. The conditions (2.5) were considered in the original Stokes–Biot, or motivating, literature [19, 30], which led to their adoption here. The advantage of the boundary conditions (2.5) is that they provide for an overall discussion that scopes naturally between the case where \(\varGamma _c \ne \partial \varOmega \) and \(\varGamma _c = \partial \varOmega \) provided that \(|\varGamma _c| > 0\) is assumed. In particular, as mentioned in Remark 3.1, if \(\varGamma _c = \partial \varOmega \) then conditions (2.5) imply \(\varGamma _t = \emptyset \), the variational forms (3.2) and (3.5) are unchanged, and all results discussed hold when \(Q = L_0^2(\varOmega )\) is selected, instead, in (3.1).

It is reasonable to ask what other boundary condition configurations can be considered, and under what conditions. First, we note that the requirements that \(\varGamma _c \cap \varGamma _t = \emptyset \) and \(\varGamma _f \cap \varGamma _p = \emptyset \) arise from the early work on well-posedness for Biot [31]. Moreover, the requirement that the measure of the clamped displacement boundary is positive, i.e. that \(|\varGamma _c| > 0\), provides the coercive property \(a(u,u) \ge \gamma _a \Vert u \Vert _{1}^2\) needed by both Definition 3.1 (Stokes–Biot stability) and Definition 4.1 (minimal Stokes–Biot stability); this is therefore a strict requirement of the proposed method. However, if both \(|\varGamma _c| > 0\) and \(|\varGamma _t| > 0\) then, as noted in [18] and used in [22], the conditions (2.5) can be relaxed to those of (2.4). In this case, the requirements of both Definition 3.1 and Definition 4.1 can be satisfied with \(Q = L^2(\varOmega )\) in (3.1); the variational formulations (3.2) and (3.5) are, again, unaltered and the results of the manuscript follow analogously. It is true, as discussed above, that restrictive boundary conditions, such as (2.5), are needed when \(\varGamma _c = \partial \varOmega \) in order to ensure the tenets of both Definition 3.1 and Definition 4.1; this is not an additional imposition of minimal Stokes–Biot stability (Definition 4.1) but rather of the Stokes–Biot perspective in general.