1 Introduction

The construction and mathematical analysis of finite element approximations of models of non-Newtonian fluids have been a subject of active research in recent years. Some of the most general results in this direction concern the convergence of mixed finite element approximations of models of incompressible fluids with implicit constitutive laws relating the Cauchy stress tensor to the symmetric velocity gradient (cf. [16, 34] and [19]). Motivated by the groundbreaking contributions of Cohen, Dahmen and DeVore [13, 14] and Binev, Dahmen and DeVore [8] concerning the convergence of adaptive algorithms for linear elliptic problems, progress, albeit much more limited in both scope and extent, has also been made on the analysis of adaptive finite element approximations of implicitly constituted non-Newtonian fluid flow models (cf. [27]).

Upon decomposing the Cauchy stress tensor into its traceless part, called the deviatoric stress tensor or shear-stress tensor, and its diagonal part, called the mean normal stress, models of incompressible fluids typically involve the velocity of the fluid, \({\varvec{u}}\), its pressure, p, and the shear-stress tensor, \(\varvec{{{\mathcal {S}}}}\). For Newtonian fluids the shear-stress tensor is a scalar multiple of the symmetric velocity gradient. The finite element approximation of Newtonian fluids is therefore usually performed in the velocity–pressure formulation. For non-Newtonian fluids on the other hand the situation is more involved, because the shear-stress tensor exhibits nonlinear dependence as a function of the symmetric velocity gradient, and the functional relationship between the shear-stress tensor and the symmetric velocity gradient may even be completely implicit and multi-valued. For power-law fluids, such as the ones considered in this work, the shear-stress tensor exhibits power-law type growth as a function of the symmetric velocity gradient, the simplest instance of which results in an r-Laplace type operator in the balance of linear momentum equation, with a power-law exponent \(r \in (1,\infty )\); for \(r=2\), corresponding to a Newtonian fluid, the operator is linear, the Laplace operator. From a mathematical point of view, in the presence of a convection term in the balance of linear momentum equation in the model, the lower the value of r the more difficult the problem is to analyse. The existence of solutions for small values of r was first proved in [20], where an Acerbi–Fusco type Lipschitz truncation was used in conjunction with Minty’s method from monotone operator theory; thus, weak solutions were shown to exist for \(r>\frac{2d}{d+2}\) in \(d\ge 2\) space dimensions.

Finite element approximations of problems with power-law rheology have been extensively studied, including stabilised (or variational-multiscale) methods (cf. [1, 11], for example) and local discontinuous Galerkin methods (see [28], for example). The relevant literature is vast and it is beyond the scope of this work to provide an exhaustive survey of the various contributions; the interested reader may wish to consult [30], for example. Concerning implicitly constituted models, in the recent papers [16, 34] the convergence of generic inf-sup stable velocity/pressure-based mixed finite element methods was proved for \(r>\frac{2d}{d+1}\), while convergence for the full range, \(r>\frac{2d}{d+2}\), was shown only in the case of finite element methods where the velocity space consists of pointwise divergence-free functions. The reason for this dichotomy is that in the case of velocity approximations that are discretely divergence-free only, as is the case in generic inf-sup stable mixed finite element methods, the finite element approximation \(\nabla \cdot ({\varvec{u}}_h \otimes {\varvec{u}}_h)\) of the convection term \(\nabla \cdot ({\varvec{u}}\otimes {\varvec{u}})\) does not vanish when tested with \({\varvec{u}}_h\), and it needs to be skew-symmetrized (cf. [35]) for this to happen. While in the case of the Navier–Stokes equations (corresponding to \(r=2\)) membership of the velocity field in the natural function space for weak solutions, \(W^{1,2}_0(\Omega )^d\), ensures that the convection term and its skew-symmetric modification can be bounded by the same expression using Hölder’s inequality, this is not the case for the power-law model under consideration here for the entire range \(r>\frac{2d}{d+2}\) for which weak solutions to the problem are known to exist.
In fact, in the case of non-Newtonian power-law models the natural function space for the velocity field is \(W^{1,r}_0(\Omega )^d\), and while the original convection term can be bounded in terms of the \(W^{1,r}(\Omega )^d\) norm for all \(r>\frac{2d}{d+2}\), for the skew-symmetric modification of the convection term, whose use is essential so as to be able to derive an energy inequality for discretely divergence-free velocity fields, this can only be achieved for the limited range \(r > \frac{2d}{d+1}\). This was precisely the bottleneck encountered in [16] for discretely divergence-free velocity approximations, resulting in the reduction of the range of r from the maximal range \(r>\frac{2d}{d+2}\) for which weak solutions are known to exist, to \(r>\frac{2d}{d+1}.\)

The advantage of pointwise divergence-free finite element methods over discretely divergence-free finite element methods is therefore that, besides the physical consistency they provide, there is no need to rewrite the convection term in a skew-symmetric form. The topic of divergence-free finite element spaces has been treated extensively in the literature, most commonly presenting pairs of spaces for which the divergence of the velocity space is a subspace of, or equal to, the pressure space. For example, the early Scott–Vogelius element [33] (analysed recently in [25]) uses \(W^{1,2}(\Omega )^d\)-conforming piecewise polynomials of degree k for the velocity, while discontinuous piecewise polynomials of degree \(k-1\) are used for the pressure. The stability of this pair requires either special meshes, or a high-enough degree k (for example, \(k\ge 4\) is needed in [25], and \(k=1\) is only allowed in very special cases such as those described in Remark 5). Another possibility is to relax the continuity requirements and consider a discontinuous Galerkin method, as was done, for example, in [12], or to relax only the tangential continuity of the approximate velocity on faces of elements while still preserving its continuity in the direction of the normal to faces of elements, thus using \(H({\mathop {\mathrm {div}\,}}\!;\Omega )\)-conforming methods, as was the case in [32], for example. In this latter case the viscous term (defined as the divergence of the shear-stress \({{\mathcal {S}}})\) needs to be modified, for stability reasons, by adding terms controlling the jumps and averages of the velocity into the formulation, with, obviously undesirable, extra complications if the viscous term in the balance of linear momentum equation has a more complex structure, as is the case for the power-law model considered herein.

The recent works [2, 3] offer a way of preserving the advantages of a pointwise divergence-free approximation to the velocity field while working with the, computationally simplest, lowest order \(W^{1,2}(\Omega )^d/L^2(\Omega )\)-conforming velocity/pressure pair, namely \({{\mathbb {P}}}_1^d\times {{\mathbb {P}}}_0^{\mathrm{disc}}\). The key idea in those works can be summarised as follows: the discrete continuity equation contains a stabilising term based on the jumps of the discrete pressure. As the jumps of the pressure are constant along element faces, there exists a unique Raviart–Thomas field such that its normal component is equal to the jumps. This field can be built at no extra computational cost, and then the continuity equation can be rewritten as a standard continuity equation, but for a modified velocity field, which is now solenoidal. The finite element method then involves replacing the original discrete velocity field \({\varvec{u}}_h\) with the new, now solenoidal, modified velocity field in the convection term. This facilitates the proofs of stability and convergence of the resulting finite element method without the need to rewrite the convection term in a skew-symmetric form. Our aim here is to apply this idea to a problem in non-Newtonian fluid mechanics. As a first step in this direction, we have chosen an explicit constitutive law with power-law rheology. Even though this is the simplest constitutive law, it has been shown experimentally to faithfully reproduce many situations of physical interest (see the discussion in [23], and the experimental results in, e.g., [26]); we therefore believe that it is a representative model for exemplifying the applicability of the proposed method in a mathematically nontrivial and physically relevant setting. 
Since the convection term does not need to be rewritten in a skew-symmetric form, the resulting method can now be proved to be stable and convergent to a weak solution for the whole range \(r>\frac{2d}{d+2}\) of the power-law index for which weak solutions to the model are known to exist. In addition, the sequence of numerical approximations is shown to converge strongly, and this strong convergence result is, to the best of our knowledge, a new contribution even in the, very special, Newtonian case (\(r=2\)).

The rest of the manuscript is organised as follows. A section on preliminaries, containing the necessary notational conventions, basic definitions and results, the finite element spaces, the lifting operator, the definition of the stabilising form, and properties of the discrete Lipschitz truncation method that we use, is presented following this Introduction. An important ingredient enabling the use of the discrete Lipschitz truncation technique is a discrete inf-sup condition that is given in the Appendix. The finite element method is presented in Sect. 3, where we also show a uniform boundedness result for the sequence of approximations. Based on this and results pertaining to the discrete Lipschitz truncation, in Sect. 4 the convergence of the discrete solution to a weak solution of the model problem is proved using a compactness argument. Finally, some conclusions are drawn and potential future extensions are indicated.

2 Preliminaries

2.1 Notation and the Problem of Interest

We use standard notation for Sobolev spaces. In particular, for \(D\subset {{\mathbb {R}}}^d\), \(d=2,3\) and \(s\in [1,+\infty )\), we denote by \(W^{k,s}_0(D)\) the closure of \(C_0^\infty (D)\) with respect to the \(W^{k,s}(D)\) norm, \(W^{1,\infty }_0(\Omega ):= W^{1,1}_0(\Omega ) \cap W^{1,\infty }(\Omega )\), and by \(L^s_0(D)\) the space of functions in \(L^s(D)\) with zero integral mean-value. The norm in \(L^s(D)\) is denoted by \(\Vert \cdot \Vert _{0,s,D}^{}\); when \(s=2\) we shall use the simpler notation \(\Vert \cdot \Vert _{0,D}^{}\), and the inner product in \(L^2(D)\) will be denoted by \((\cdot ,\cdot )_D^{}\). For \(k\ge 0\), the norm (seminorm) in \(W^{k,s}(D)\) is denoted by \(\Vert \cdot \Vert _{k,s,D}^{}\) (\(|\cdot |_{k,s,D}^{}\)). Moreover, for \(s \in (1,\infty )\), the space \(W^{-1,s'}(D)\) is the dual of \(W^{1,s}_0(D)\) with duality pairing denoted by \(\langle \cdot ,\cdot \rangle _D^{}\). Here, \(s'\) denotes the Hölder conjugate of s, defined by \(\frac{1}{s} + \frac{1}{s'}=1\). We also denote by \(W^s({\mathop {\mathrm {div}\,}}\!;D)\) the space of functions in \(L^s(D)^d\) whose distributional divergence belongs to \(L^s(D)\), and by \(W_0^{s}({\mathop {\mathrm {div}\,}}\!;D)\) the set of elements in \(W^s({\mathop {\mathrm {div}\,}}\!;D)\) whose normal trace on \(\partial D\) is zero. In the above inner products and norms we do not make a distinction between scalar- and vector- or tensor-valued functions.

Let \(\Omega \subset {{\mathbb {R}}}^d, d=2,3,\) be an open, bounded, polyhedral domain with a Lipschitz boundary. In this work we treat the problem with power-law rheology: given \(r\in (1,\infty )\) and a right-hand side \({\varvec{f}}\in W^{-1,r'}(\Omega )^d\), find the velocity \({\varvec{u}}\), the pressure p, and the shear-stress tensor \(\varvec{{{\mathcal {S}}}}\) satisfying

$$\begin{aligned} \left\{ \begin{array}{rcll} -{\mathop {\mathrm {div}\,}}\varvec{{{\mathcal {S}}}} +{\mathop {\mathrm {div}\,}}({\varvec{u}}\otimes {\varvec{u}})+\nabla p &{} = &{} {\varvec{f}}&{} \text {in }\;\Omega ,\\ {\mathop {\mathrm {div}\,}}{\varvec{u}}&{} = &{} 0 &{} \text {in }\;\Omega ,\\ {\varvec{u}}&{} = &{} {\varvec{0}} &{} \text {on }\partial \Omega . \end{array} \right. \end{aligned}$$
(2.1)

There are many possible choices for the constitutive law linking \(\varvec{{{\mathcal {S}}}}\) and the velocity \({\varvec{u}}\). In this work we have chosen the power-law description \(\varvec{{{\mathcal {S}}}} = 2\eta \, |\nabla {\varvec{u}}|^{r-2}\nabla {\varvec{u}}\), where \(\eta >0\) is a reference viscosity. In order to simplify matters we will suppose that \(\eta =\frac{1}{2}\), although it should be kept in mind that, to maintain physical consistency, the actual reference value should be retained. Similarly, in physically realistic models the gradient of the velocity is usually replaced by the symmetric velocity gradient \(\varepsilon ({\varvec{u}}):=\frac{1}{2}(\nabla {\varvec{u}}+\nabla {\varvec{u}}^t)\). The results obtained in this paper can be extended, with minor modifications based on Korn’s inequality, to that case as well, so for the sake of simplicity of the exposition we shall proceed with the constitutive relation \(\varvec{{{\mathcal {S}}}} = 2\eta \, |\nabla {\varvec{u}}|^{r-2}\nabla {\varvec{u}}\) (with \(\eta = \frac{1}{2}\)) instead of \(\varvec{{{\mathcal {S}}}} = 2\eta \, |\varepsilon ({\varvec{u}})|^{r-2}\varepsilon ({\varvec{u}})\).
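To make the constitutive relation concrete, the following minimal Python sketch evaluates \(\varvec{{{\mathcal {S}}}} = 2\eta \, |\nabla {\varvec{u}}|^{r-2}\nabla {\varvec{u}}\) for a given velocity gradient. The function names are hypothetical, \(|\cdot |\) is assumed to be the Frobenius norm, and setting the stress to zero at a vanishing gradient when \(r<2\) is our convention for the sketch (justified by \(|\varvec{{{\mathcal {S}}}}| = 2\eta |\nabla {\varvec{u}}|^{r-1}\rightarrow 0\)).

```python
import math

def frobenius_norm(G):
    """Frobenius norm |G| of a d x d matrix given as nested lists."""
    return math.sqrt(sum(g * g for row in G for g in row))

def power_law_stress(G, r, eta=0.5):
    """Return S = 2*eta*|G|^(r-2) * G for a velocity gradient G.

    For r < 2 the factor |G|^(r-2) is singular at G = 0; since
    |S| = 2*eta*|G|^(r-1) tends to 0 as G -> 0, we set S(0) = 0.
    """
    norm = frobenius_norm(G)
    if norm == 0.0:
        return [[0.0 for _ in row] for row in G]
    factor = 2.0 * eta * norm ** (r - 2.0)
    return [[factor * g for g in row] for row in G]

G = [[0.0, 1.0], [2.0, 0.0]]                # sample 2 x 2 velocity gradient
S_newtonian = power_law_stress(G, r=2.0)    # r = 2: S coincides with G
S_thinning = power_law_stress(G, r=1.5)     # r < 2: shear-thinning response
```

With \(\eta =\frac{1}{2}\) and \(r=2\) the map is the identity, which is the Newtonian case; for other values of r the stress is positively homogeneous of degree \(r-1\) in the gradient.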

In order to state the weak formulation of (2.1) we need to present a few additional ingredients associated with the exponent in the constitutive law relating \(\varvec{{{\mathcal {S}}}}\) and \({\varvec{u}}\). For \(r\in (1,\infty )\), let us define the associated critical exponent \({\tilde{r}}\) as follows:

$$\begin{aligned} {\tilde{r}}:=\min \left\{ r',\frac{r^\star }{2}\right\} ,\quad \text {where}\quad r^\star :=\left\{ \begin{array}{cl} \infty &{} \text {if}\; r\ge d, \\ \dfrac{dr}{d-r} &{} \text {otherwise}. \end{array}\right. \end{aligned}$$
(2.2)

Remark 1

With the definition (2.2) of \({\tilde{r}}\), the space \(W^{1,r}(\Omega )\) is continuously embedded in \(L^{r^\star }(\Omega )\) if \(r<d\) and in \(L^s(\Omega )\), for every \(s<\infty \), if \(r\ge d\) (see, e.g., [10, Corollary 9.14]). Then, in particular, \(W^{1,r}(\Omega )\) is continuously embedded in \(L^{2{\tilde{r}}}(\Omega )\) and there exists a \(C>0\) such that

$$\begin{aligned} \Vert v\Vert _{0,2{\tilde{r}},\Omega }^{}\le C\,\Vert v\Vert _{1,r,\Omega }^{}\qquad \forall \, v\in W^{1,r}(\Omega ). \end{aligned}$$
(2.3)

Moreover, the value of \({\tilde{r}}\) exhibits two different regimes, as can be seen in Fig. 1, where its range of values is depicted. We will distinguish between \({\tilde{r}}\le 2\) and \({\tilde{r}} > 2\). The latter case occurs for \(r\in \big ( \frac{4d}{d+4},2\big )\) and the maximum value of \({\tilde{r}}\) is attained when \(r'=\frac{r^\star }{2}\), at which point we have the following values:

$$\begin{aligned} r=\frac{3d}{d+2} = \left\{ \begin{array}{ll} \frac{3}{2} &{} \text {if}\; d=2,\\ ~ \\ \frac{9}{5} &{} \text {if}\; d=3, \end{array} \right. \qquad \text {and}\qquad {\tilde{r}}_{\mathrm{max}}^{} = \frac{3d}{2d-2} = \left\{ \begin{array} {ll} 3 &{} \text {if}\; d=2,\\ \frac{9}{4} &{} \text {if}\; d=3. \end{array} \right. \end{aligned}$$
Fig. 1 Values of \({\tilde{r}}\) (defined in (2.2)) and \(\alpha (r)\) (defined in (2.23)) for the cases \(d=2\) (left) and \(d=3\) (right)
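The values quoted in Remark 1 can be checked directly from definition (2.2). The sketch below is only an illustration: the function names are ours, and exact rational arithmetic is used so that the comparisons are free of rounding error.

```python
from fractions import Fraction
import math

def hoelder_conjugate(r):
    """Hoelder conjugate r' defined by 1/r + 1/r' = 1."""
    return r / (r - 1)

def sobolev_exponent(r, d):
    """Exponent r* from (2.2): infinity if r >= d, else d*r/(d - r)."""
    return math.inf if r >= d else d * r / (d - r)

def critical_exponent(r, d):
    """Critical exponent from (2.2): the minimum of r' and r*/2."""
    return min(hoelder_conjugate(r), sobolev_exponent(r, d) / 2)

for d in (2, 3):
    r_peak = Fraction(3 * d, d + 2)   # value of r at which r' = r*/2
    print(d, r_peak, critical_exponent(r_peak, d))
```

At \(r=\frac{3d}{d+2}\) the two arguments of the minimum coincide, which gives the maximal value \(\frac{3d}{2d-2}\); at \(r=\frac{4d}{d+4}\) the function returns exactly 2, the left endpoint of the regime \({\tilde{r}}>2\).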

With this choice of the shear-stress tensor \({{\mathcal {S}}}\), the weak formulation of (2.1) is as follows: find \({\varvec{u}}\in W^{1,r}_0(\Omega )^d\) and \( p\in L^{{\tilde{r}}}_0(\Omega )\) such that

$$\begin{aligned} (|\nabla {\varvec{u}}|^{r-2}\nabla {\varvec{u}},\nabla {\varvec{v}})_\Omega ^{}-({\varvec{u}}\otimes {\varvec{u}},\nabla {\varvec{v}})_\Omega ^{}-(p,{\mathop {\mathrm {div}\,}}{\varvec{v}})_\Omega ^{}&= \langle {\varvec{f}},{\varvec{v}}\rangle _\Omega ^{}\qquad&\forall \,{\varvec{v}}\in W^{1,{\tilde{r}}'}_0(\Omega )^d, \end{aligned}$$
(2.4)
$$\begin{aligned} (q,{\mathop {\mathrm {div}\,}}{\varvec{u}})_\Omega ^{}&= 0\qquad&\forall \, q\in L^{r'}_0(\Omega ). \end{aligned}$$
(2.5)

Remark 2

In order for the variational formulation (2.4), (2.5) to be meaningful it is necessary that \({\varvec{u}}\otimes {\varvec{u}}\in L^{{{\tilde{r}}}}(\Omega )^{d \times d}\) with \({{\tilde{r}}}>1\), which necessitates that \(r> \frac{2d}{d+2}\), and under this condition the existence of a solution to (2.4), (2.5) has been proved (see [17]). Thus, for the rest of this work we will assume that \(r> \frac{2d}{d+2}\).
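The threshold in Remark 2 can be recovered from (2.2) by a short computation, sketched here for \(r<d\) (for \(r\ge d\) we have \(r^\star =\infty \) and hence \({\tilde{r}}=r'>1\) automatically). Since \(r'>1\) for every \(r\in (1,\infty )\), the condition \({\tilde{r}}>1\) is equivalent to \(\frac{r^\star }{2}>1\), and

$$\begin{aligned} \frac{r^\star }{2}>1 \quad \Longleftrightarrow \quad \frac{dr}{d-r}>2 \quad \Longleftrightarrow \quad dr>2d-2r \quad \Longleftrightarrow \quad r>\frac{2d}{d+2}. \end{aligned}$$

In that case, since \(|{\varvec{u}}\otimes {\varvec{u}}| = |{\varvec{u}}|^2\), the embedding (2.3) applied componentwise gives, with a constant C that may also depend on d,

$$\begin{aligned} \Vert {\varvec{u}}\otimes {\varvec{u}}\Vert _{0,{\tilde{r}},\Omega }^{} = \Vert {\varvec{u}}\Vert _{0,2{\tilde{r}},\Omega }^{2}\le C\,\Vert {\varvec{u}}\Vert _{1,r,\Omega }^{2}. \end{aligned}$$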

Another fundamental ingredient in the proof of existence of solutions to (2.4), (2.5) is the following inf-sup condition (for a proof, see [21]): for \(s,s'\in (1,+\infty )\) satisfying \(\frac{1}{s}+\frac{1}{s'}=1\), there exists a constant \(\beta _s^{}>0\) such that

$$\begin{aligned} \sup _{{\varvec{v}}\in W^{1,s'}_0(\Omega )^d\setminus \{{\varvec{0}}\}}\frac{( q,{\mathop {\mathrm {div}\,}}{\varvec{v}})_\Omega ^{}}{|{\varvec{v}}|_{1,s',\Omega }^{}}\ge \beta _s^{}\Vert q\Vert _{0,s,\Omega }^{}\qquad \forall \, q\in L^{s}_0(\Omega ). \end{aligned}$$

2.2 Finite Element Spaces and Preliminary Results

Let \(\{{{\mathscr {T}}}_h^{}\}_{h>0}^{}\) be a shape-regular family of triangulations of \({\overline{\Omega }}\) consisting of closed simplices K of diameter \(h_K^{}\le h:=\max \{ h_K^{}:K\in {{\mathscr {T}}}_h^{}\}\). To avoid technical difficulties we will suppose that the family of triangulations is quasi-uniform. For reasons that will become apparent later, in the proof of convergence of the finite element method we will distinguish between the cases \(r\ge \frac{3d}{d+2}\) and \(r\in \big ( \frac{2d}{d+2}, \frac{3d}{d+2}\big )\). To cover the latter case (and for that purpose only) we need to make the following assumption on the mesh:

Assumption (A1). The triangulation \({{\mathscr {T}}}_h^{}\) is the result of performing one (for \(d=2\)), or two (for \(d=3\)), red refinement(s) of a, coarser, shape-regular triangulation \({{\mathscr {T}}}_{H}^{}\).

Remark 3

We note that when \(d=2\) red refinement of a simplex (triangle) amounts to dividing the triangle into four mutually congruent subtriangles, which are all similar to the original triangle, using the midpoints of the edges of the original triangle. For \(d=3\) red refinement of a simplex (tetrahedron) is performed in two stages. First, using the midpoints of the edges the tetrahedron is divided into four congruent subtetrahedra, which are all similar to the original tetrahedron, and an octahedron. Then in a second step the octahedron is further divided into four subtetrahedra using one of its three diagonals (see [9, Chapter 8] for more details and properties of simplicial refinements).
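For \(d=2\), the red refinement described in Remark 3 can be sketched in a few lines of Python. This is an illustration only: the helper names are ours, vertices are stored as pairs of exact rationals, and the three-dimensional case (which requires splitting an octahedron) is not covered.

```python
from fractions import Fraction

def midpoint(p, q):
    """Midpoint of the segment joining the points p and q."""
    return tuple((a + b) / 2 for a, b in zip(p, q))

def area(tri):
    """Area of a triangle given by three vertices in the plane."""
    (x0, y0), (x1, y1), (x2, y2) = tri
    return abs((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)) / 2

def squared_edge_lengths(tri):
    """Sorted squared edge lengths; equal lists mean congruent triangles."""
    a, b, c = tri
    dist2 = lambda p, q: sum((s - t) ** 2 for s, t in zip(p, q))
    return sorted([dist2(a, b), dist2(b, c), dist2(c, a)])

def red_refine(tri):
    """Split a triangle into four subtriangles using its edge midpoints."""
    a, b, c = tri
    ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
    return [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]

parent = tuple((Fraction(x), Fraction(y)) for x, y in [(0, 0), (4, 0), (1, 3)])
children = red_refine(parent)
```

Each child has one quarter of the parent's area and edge lengths equal to half of those of the parent, so the four children are mutually congruent and similar to the parent. The three edge midpoints become nodes of the refined mesh lying in the interior of the parent's edges.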

We will denote the (closed) elements contained in \({{\mathscr {T}}}_H^{}\) (referred to, in some instances, as macro-elements) by M.

Remark 4

  (i) By letting \(H:=\max \{ {\mathop {\mathrm {diam}\,}}(M):M\in {{\mathscr {T}}}_H^{}\}\), clearly, \(H\le Ch\), where C does not depend on h. In fact, \(C=2\) for \(d=2\) and \(C=4\) for \(d=3\).

  (ii) Under \({\mathop {\mathrm {Assumption~(A1)}}}\), for every \({\tilde{F}}\), a facet of \(M\in {{\mathscr {T}}}_H^{}\), there exists at least one node of \({{\mathscr {T}}}_h^{}\) that belongs to the interior of \({\tilde{F}}\). In fact, this last remark is the main reason why \({\mathop {\mathrm {Assumption~(A1)}}}\) has been made on the meshes. In particular, \({{\mathscr {T}}}_h^{}\) could also result from first making a barycentric refinement of each facet of \({{\mathscr {T}}}_H^{}\) and then building a conforming triangulation of \({{\overline{\Omega }}}\). For ease of exposition we shall simply adopt \({\mathop {\mathrm {Assumption~(A1)}}}\) in what follows.

In the triangulation \({{\mathscr {T}}}_h^{}\) we shall use the following notation:

  • \({{\mathscr {F}}}_h^{}\) : the set of all facets F (edges in 2D and faces in 3D) of the triangulation \({{\mathscr {T}}}_h^{}\), with diameter \(h_F^{}:={\mathop {\mathrm {diam}\,}}(F)\). The set of internal facets is denoted by \({{\mathscr {F}}}_I^{}\) and those on the boundary of \(\Omega \) are denoted by \({{\mathscr {F}}}_{\partial }^{}\), so \({{\mathscr {F}}}_h^{}={{\mathscr {F}}}_I^{}\cup {{\mathscr {F}}}_\partial ^{}\);

  • For every \(M\in {{\mathscr {T}}}_H^{}\) we denote by \({{\mathscr {F}}}_I^{}(M)\) the set of facets of \({{\mathscr {T}}}_h^{}\) whose interior lies in the interior of M;

  • For \(F\in {{\mathscr {F}}}_h^{}\) and \(K\in {{\mathscr {T}}}_h^{}\) we define the neighbourhoods

    $$\begin{aligned} \omega _F^{}:=\{K\in {{\mathscr {T}}}_h^{}:F\in {{\mathscr {F}}}_K^{}\},\quad \quad \omega _K^{}:=\{K'\in {{\mathscr {T}}}_h^{}:K\cap K'\not =\emptyset \}; \end{aligned}$$
    (2.6)
  • For each facet \(F\in {{\mathscr {F}}}_I^{}\) and every piecewise regular function q, we denote by \(\llbracket q\rrbracket _F^{}\) the jump of q across F;

  • For \(\ell \ge 0\) we denote by \({{\mathbb {P}}}_\ell ^{}(K)\) the space of polynomials defined on K of total degree smaller than, or equal to, \(\ell \), and introduce the following finite element spaces:

    $$\begin{aligned} {\varvec{V}}_h^{}&:=\{ {\varvec{v}}_h^{}\in C^0({\overline{\Omega }})^d \,:\, {\varvec{v}}_h^{}|_K^{}\in {{\mathbb {P}}}_1^{}(K)^d\;,\;\forall \, K\in {{\mathscr {T}}}_h^{}\;,\;{\varvec{v}}_h^{}|_{\partial \Omega }^{}={\varvec{0}}\}, \end{aligned}$$
    (2.7)
    $$\begin{aligned} {{\mathscr {Q}}}_h^{}&:= \{ q_h^{}\in L^1_0(\Omega )\,:\, q_h^{}|_K^{}\in {{\mathbb {P}}}_0^{}(K)\;,\;\forall \, K\in {{\mathscr {T}}}_h^{}\}, \end{aligned}$$
    (2.8)
    $$\begin{aligned} {{\mathscr {Q}}}_H^{}&:= \{ q_H^{}\in L^1_0(\Omega )\,:\, q_H^{}|_M^{}\in {{\mathbb {P}}}_0^{}(M)\;,\;\forall \, M\in {{\mathscr {T}}}_H^{}\}. \end{aligned}$$
    (2.9)

Remark 5

\({\mathop {\mathrm {Assumption~(A1)}}}\) raises the question whether the space \({{\mathbb {P}}}_1^d\times {{\mathbb {P}}}_0^{}\) itself is stable on carefully constructed meshes. Some results are known in this direction. For example, in two space dimensions, this pair is inf-sup stable on Powell–Sabin meshes [38] provided that the pressure space \({{\mathbb {P}}}_0^{}\) is slightly modified: the resulting \({{\mathbb {P}}}_1^{d} \times \tilde{{{\mathbb {P}}}}_0^{}\) velocity–pressure pair, where \(\tilde{{{\mathbb {P}}}}_0^{}\) is a subset of \({{\mathbb {P}}}_0^{}\), is inf-sup stable on Powell–Sabin meshes. In the recent work [24] local inf-sup stability is proved for this element in barycentrically refined meshes (also known as the Alfeld split [29], and a Hsieh–Clough–Tocher triangulation, see the references quoted in [38, p. 461]). This is then used to build enriched elements that are divergence-free (although the velocity space contains quadratic face bubbles with constant divergence). For three-dimensional meshes, for the Alfeld split the lowest order inf-sup stable pair is \({{\mathbb {P}}}_4^3\times {{\mathbb {P}}}_3^{\mathrm{disc}}\) (cf. [37]), while for the Powell–Sabin split the lowest order inf-sup stable pair is \({{\mathbb {P}}}_2^3\times {{\mathbb {P}}}_1^{\mathrm{disc}}\) [39]. However, for the case considered in this paper, that is, taking the \({{\mathbb {P}}}_1^d\times {{\mathbb {P}}}_0^{}\) pair on general shape-regular meshes, stabilisation is a necessity. In addition, it is important to note that the papers cited above concern the Newtonian case only and are mostly focused on the Stokes equations. The analysis of some of those alternatives in the case of non-Newtonian flow models treated in the present work has not been carried out so far, and it will constitute a topic of future research.

Using the finite element spaces defined in (2.7)–(2.9), we denote by \(S_h^{}:W^{1,r}_0(\Omega )^d\rightarrow {\varvec{V}}_h^{}\) the Scott–Zhang interpolation operator and by \(\Pi _h^{}:L^1_0(\Omega )\rightarrow {{\mathscr {Q}}}_h^{}\), \(\Pi _H^{}:L^1_0(\Omega )\rightarrow {{\mathscr {Q}}}_H^{}\) the projections defined by (see, e.g., [18]):

$$\begin{aligned} \Pi _h^{}q|_K^{}&= \frac{(q,1)_K^{}}{|K|}\qquad \forall \, K\in {{\mathscr {T}}}_h^{}, \nonumber \\ \Pi _H^{}q|_M^{}&= \frac{(q,1)_M^{}}{|M|}\qquad \forall \, M\in {{\mathscr {T}}}_H^{}. \end{aligned}$$
(2.10)
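As an illustration of (2.10), the following Python sketch computes the elementwise averages of a function on a two-triangle mesh of the unit square. Everything in it is an assumption made for the example: the mesh, the sample function, the helper names, and the use of the edge-midpoint quadrature rule (which is exact for polynomials of degree at most two on a triangle).

```python
from fractions import Fraction as F

def midpoint(p, q):
    """Midpoint of the segment joining the points p and q."""
    return tuple((a + b) / 2 for a, b in zip(p, q))

def area(tri):
    """Area |K| of a triangle K given by its three vertices."""
    (x0, y0), (x1, y1), (x2, y2) = tri
    return abs((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)) / 2

def element_integral(q, tri):
    """Approximate (q, 1)_K by the edge-midpoint rule on the triangle K."""
    a, b, c = tri
    mids = [midpoint(a, b), midpoint(b, c), midpoint(c, a)]
    return area(tri) / 3 * sum(q(x, y) for x, y in mids)

def project_piecewise_constant(q, mesh):
    """Return the list of element averages (q, 1)_K / |K|, one per element."""
    return [element_integral(q, K) / area(K) for K in mesh]

# Unit square split along its diagonal into two triangles.
mesh = [
    ((F(0), F(0)), (F(1), F(0)), (F(1), F(1))),
    ((F(0), F(0)), (F(1), F(1)), (F(0), F(1))),
]
q = lambda x, y: x - F(1, 2)          # sample function with zero mean on the square
Pi_q = project_piecewise_constant(q, mesh)
mean_of_projection = sum(area(K) * v for K, v in zip(mesh, Pi_q))
```

For this affine q the element averages are \(\frac{1}{6}\) and \(-\frac{1}{6}\), and the area-weighted mean of the projection vanishes, so the projection of a function with zero mean again has zero mean.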

\(\Pi _h\) and \(\Pi _H\) are simply linear projectors onto the linear subspaces of \(L^1_0(\Omega )\) consisting of all piecewise constant functions defined on the triangulations \({{\mathscr {T}}}_h\) and \({{\mathscr {T}}}_H\), respectively, with vanishing integral average on \(\Omega \). These operators satisfy (see [18]):

$$\begin{aligned}&\lim _{h\rightarrow 0} S_h^{}{\varvec{v}}={\varvec{v}}\quad \text {strongly in}\; W^{1,s}_0(\Omega )^d\,\quad \text {for all}\; {\varvec{v}}\in W^{1,s}_0(\Omega )^d\quad \text {and all}\; s \in [1,\infty ), \nonumber \\&\lim _{H\rightarrow 0} \Pi _H^{}q=\lim _{h\rightarrow 0} \Pi _h^{}q = q\quad \text {strongly in}\; L^{s}_0(\Omega )\,\quad \text {for all}\; q\in L^{s}_0(\Omega )\quad \text {and all}\; s\in [1,\infty ). \end{aligned}$$
(2.11)

The following result, whose proof can be carried out using the techniques presented in [18, Lemma 2.23], will be fundamental in the derivation (and analysis) of the proposed finite element method: for every \(s\in (1,\infty )\) there exists a constant \(C_s^{}>0\), independent of h, such that

$$\begin{aligned} \Vert q_h^{}-\Pi _H^{}(q_h^{})\Vert _{0,s,M}^{}\le C_s^{}\left\{ \sum _{F\in {{\mathscr {F}}}_I^{}(M)}h_F^{}\Vert \llbracket q_h^{}\rrbracket \Vert _{0,s,F}^s \right\} ^{\frac{1}{s}}, \end{aligned}$$
(2.12)

for all \(M\in {{\mathscr {T}}}_H^{}\), all \(q_h^{}\in {{\mathscr {Q}}}_h^{}\), and all \(h>0\).

We now recall three inequalities that will be useful in what follows. Let \(s\in (1,\infty )\), \(F\in {{\mathscr {F}}}_h^{}\) and \(K\in \omega _F^{}\). The following local trace inequality is a corollary of the multiplicative trace inequality proved in [18, Lemma 12.15]:

$$\begin{aligned} \Vert v\Vert _{0,s,F}^{} \le C\, (h_F^{-\frac{1}{s}}\Vert v\Vert _{0,s,K}^{}+h_F^{1-\frac{1}{s}}\Vert \nabla v\Vert _{0,s,K}^{}). \end{aligned}$$
(2.13)

In addition, we recall the following local inverse inequality (see, e.g., [18, Lemma 12.1]): for all \(m,\ell \in {{\mathbb {N}}}, m\le \ell \) and all \(t,s\in [1,+\infty ]\), there exists a constant C, independent of h, such that

$$\begin{aligned} \Vert q\Vert _{\ell ,t,K}^{}\le Ch_K^{m-\ell +d\left( \frac{1}{t}-\frac{1}{s}\right) }\Vert q\Vert _{m, s,K}^{}, \end{aligned}$$
(2.14)

for every polynomial function q defined on K. A global version of this inequality can also be derived using the quasi-uniformity of the mesh family. Finally, for \(1<s\le {\tilde{s}}\le \infty \), a set of indices \({{\mathcal {I}}}\), and any vector \((x_i^{})_{i\in {{\mathcal {I}}}}^{}\in \ell ^{{\tilde{s}}}({{\mathcal {I}}})\), the following inequality holds (see [15, Proposition 3.4(a)] for its proof):

$$\begin{aligned} \left\{ \sum _{i\in {{\mathcal {I}}}}x_i^{{\tilde{s}}}\right\} ^{\frac{1}{{\tilde{s}}}}\le \left\{ \sum _{i\in {{\mathcal {I}}}}x_i^{s}\right\} ^{\frac{1}{s}}. \end{aligned}$$
(2.15)
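Inequality (2.15) states that the \(\ell ^s\) norm of a fixed vector does not increase as the exponent grows. A quick numerical check in Python is sketched below; the function name and the sample vector are ours, absolute values of the entries are taken, and the case of an infinite exponent is interpreted as the maximum norm.

```python
import math

def lp_norm(x, s):
    """Return the l^s norm of the finite sequence x (maximum norm if s = inf)."""
    if math.isinf(s):
        return max(abs(xi) for xi in x)
    return sum(abs(xi) ** s for xi in x) ** (1.0 / s)

x = [3.0, -4.0, 1.0]
exponents = [1.5, 2.0, 3.0, math.inf]      # increasing exponents
norms = [lp_norm(x, s) for s in exponents]
# (2.15): the norms are non-increasing in the exponent, up to rounding.
is_monotone = all(b <= a + 1e-12 for a, b in zip(norms, norms[1:]))
```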

We also note that under \({\mathop {\mathrm {Assumption~(A1)}}}\) the spaces \({\varvec{V}}_h^{}\) and \({{\mathscr {Q}}}_H^{}\) satisfy the following discrete inf-sup condition: for any \(s\in (1,\infty )\) there exists a constant \(\beta _s^{}>0\), independent of h, such that for all \(q_{H}^{}\in {{\mathscr {Q}}}_{H}^{}\) the following inequality holds:

$$\begin{aligned} \sup _{{\varvec{v}}_h^{}\in {\varvec{V}}_h^{}\setminus \{{\varvec{0}}\}}\frac{(q_{H}^{},{\mathop {\mathrm {div}\,}}{\varvec{v}}_h^{})_\Omega ^{}}{|{\varvec{v}}_h^{}|_{1,s',\Omega }^{}}\,\ge \, \beta _s^{}\Vert q_{H}^{}\Vert _{0,s,\Omega }^{}. \end{aligned}$$
(2.16)

The proof of this result, to the best of our knowledge, has not been given previously and thus we report it in the Appendix. It is based on the construction of a Fortin operator \({{\mathscr {I}}}:W^{1,s'}_0(\Omega )^d\rightarrow {\varvec{V}}_h^{}\) satisfying

$$\begin{aligned} \big (q_H^{},{\mathop {\mathrm {div}\,}}({\varvec{v}}-{{\mathscr {I}}}({\varvec{v}}))\big )_\Omega ^{}&= 0\qquad&\text {for all}\; q_H^{}\in {{\mathscr {Q}}}_H^{}\quad \text {and all}\;{\varvec{v}}\in W^{1,s'}_0(\Omega )^d ,\, \end{aligned}$$
(2.17)
$$\begin{aligned} {{\mathscr {I}}}{\varvec{v}}&\rightarrow {\varvec{v}}\qquad&\text {strongly in}\; W^{1,s'}_0(\Omega )^d \;\text {as}\; h\rightarrow 0. \end{aligned}$$
(2.18)

In addition, (2.16) guarantees the existence of a non-trivial subspace of discretely divergence-free functions

$$\begin{aligned} {\varvec{V}}_{h,{\mathop {\mathrm {div}\,}}}^{}:= \{{\varvec{v}}_h^{}\in {\varvec{V}}_h^{}: (q_H^{},{\mathop {\mathrm {div}\,}}{\varvec{v}}_h^{})_\Omega ^{} = 0\;\text { for all}\; q_H^{}\in {{\mathscr {Q}}}_H^{}\}. \end{aligned}$$

2.3 Results Related to the Discrete Lipschitz Truncation

In the convergence proof given below we will need the following two results. These are known as discrete Lipschitz truncation and divergence-free discrete Lipschitz truncation, respectively. Their proofs are omitted since they are essentially a rewriting of Corollary 17 and the proof on pages 1006–1007 in [16] (see also [36, Lemmas 2.29 and 2.30]). We begin by recalling that for \(v \in L^1({{\mathbb {R}}}^d)\) the Hardy–Littlewood maximal function is defined by

$$\begin{aligned} M(v)({\varvec{x}}):= \sup _{R>0} \frac{1}{|B_R({\varvec{x}})|} \int _{B_R({\varvec{x}})} |v({\varvec{y}})|\,\text {d}{\varvec{y}}, \end{aligned}$$

where \(B_R({\varvec{x}}) \subset {{\mathbb {R}}}^d\) is a ball of radius \(R>0\) centred at \({\varvec{x}}\in {{\mathbb {R}}}^d\). For \({\varvec{v}}\in W^{1,1}({{\mathbb {R}}}^d)^d\), we define \(M({\varvec{v}}):=M(|{\varvec{v}}|)\) and \(M(\nabla {\varvec{v}}):=M(|\nabla {\varvec{v}}|)\). Let \({\varvec{v}}\in W^{1,1}_0(\Omega )^d\) and extend \({\varvec{v}}\) by \({{\mathbf {0}}}\) outside \(\Omega \) to the whole of \({{\mathbb {R}}}^d\), resulting in a function (still denoted by) \({\varvec{v}}\in W^{1,1}({{\mathbb {R}}}^d)^d\). For a fixed \(\lambda >0\), we then define

$$\begin{aligned} {{\mathcal {U}}}_\lambda ({\varvec{v}}):=\{ {\varvec{x}}\in {{\mathbb {R}}}^d\,:\, M(\nabla {\varvec{v}})({\varvec{x}}) > \lambda \}.\end{aligned}$$
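To build intuition for the maximal function and the superlevel set \({{\mathcal {U}}}_\lambda \), the Python sketch below works with a one-dimensional grid analogue. It is purely illustrative and is not the construction used in the paper: balls are replaced by symmetric index windows, the samples play the role of \(|\nabla {\varvec{v}}|\), values outside the grid are taken to be zero (mimicking the extension by zero), and the function names are hypothetical.

```python
from fractions import Fraction

def discrete_maximal_function(samples):
    """Grid analogue of M: the largest average of |samples| over windows
    [i - k, i + k] centred at each index i, with zero extension."""
    n = len(samples)
    a = [abs(Fraction(v)) for v in samples]
    result = []
    for i in range(n):
        best = Fraction(0)
        # k = n - 1 already covers the whole grid; wider windows only
        # enlarge the denominator and cannot increase the average.
        for k in range(n):
            lo, hi = max(0, i - k), min(n - 1, i + k)
            window_sum = sum(a[lo:hi + 1])   # entries outside the grid are zero
            best = max(best, window_sum / (2 * k + 1))
        result.append(best)
    return result

def superlevel_set(values, lam):
    """Indices at which the maximal function exceeds the threshold lam."""
    return [i for i, m in enumerate(values) if m > lam]

M = discrete_maximal_function([0, 0, 4, 0, 0])
U = superlevel_set(M, 1)
```

For a single spike the maximal function decays away from the spike, so the superlevel set is a neighbourhood of it; raising the threshold shrinks this set, which is the mechanism exploited when the function is modified only on a small "bad" set.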

Now, recall the definition of \(\omega _K^{}\) given in (2.6). For \({\varvec{v}}_h^{} \in {\varvec{V}}_h^{}\) and \(\lambda >0\) we define

$$\begin{aligned} \Omega ^h_\lambda ({\varvec{v}}_h^{}):=\text{ int }\left( \bigcup \{\omega ^{}_K\,:\, K \in {{\mathscr {T}}}_h\;\, \text{ with }\;\, K \cap {{\mathcal {U}}}_\lambda ({\varvec{v}}_h^{}) \ne \emptyset \}\right) . \end{aligned}$$

Lemma 6

Let \(s\in (1,\infty )\). Let us suppose that \({\varvec{v}}_h^{}\in {\varvec{V}}_h^{}\) for all \(h>0\) and \({\varvec{v}}_h^{} \rightharpoonup {\varvec{0}}\) weakly in \(W^{1,s}_0(\Omega )^d\) as \(h\rightarrow 0\). Then, there exist

  • A double sequence \(\{\lambda _{h,j}^{}\}_{h>0,j\in {{\mathbb {N}}}}^{}\subseteq {{\mathbb {R}}}\) such that \(\lambda _{h,j}^{}\in [2^{2^j},2^{2^{j+1}-1}]\) for all \(h>0\), \(j\in {{\mathbb {N}}}\);

  • A double sequence of open sets \({{\mathscr {B}}}_{h,j}^{}:= \Omega ^h_{\lambda _{h,j}}({\varvec{v}}_h)\subseteq \Omega \), for \(h>0\) and \(j\in {{\mathbb {N}}}\);

  • A double sequence \(\{{\varvec{v}}_{h,j}^{}\}_{h>0,j\in {{\mathbb {N}}}}^{}\subseteq W^{1,\infty }_0(\Omega )^d\) with \({\varvec{v}}_{h,j}^{}\in {\varvec{V}}_h^{}\) for all \(j\in {{\mathbb {N}}}\) and all \(h>0\);

satisfying the following properties:

  1. i.

    \({\varvec{v}}_{h,j}^{}={\varvec{v}}_h^{}\) in \(\Omega \setminus {{\mathscr {B}}}_{h,j}^{}\) for all \(j\in {{\mathbb {N}}}\) and all \(h>0\);

  2. ii.

    There exists a \(c(s)>0\) such that

    $$\begin{aligned} \Vert \lambda _{h,j}^{}\mathbb {1}_{{{\mathscr {B}}}_{h,j}^{}}\Vert _{0,s,\Omega }^{}\le c(s)\,2^{-\frac{j}{s}}\qquad \,\forall \, h>0,\;\; j\in {{\mathbb {N}}}; \end{aligned}$$
    (2.19)
  3. iii.

    There exists a \(c(s)>0\) such that

    $$\begin{aligned} \Vert \nabla {\varvec{v}}_{h,j}^{}\Vert _{0,\infty ,\Omega }^{}\le c(s)\lambda _{h,j}^{}\qquad \,\forall \, h>0,\;\; j\in {{\mathbb {N}}}; \end{aligned}$$
    (2.20)
  4. iv.

    For any fixed \(j\in {{\mathbb {N}}}\),

    $$\begin{aligned} {\varvec{v}}_{h,j}^{}\rightarrow {\varvec{0}}\;\text {strongly in}\; L^\infty (\Omega )^d\quad \text {and}\quad \nabla {\varvec{v}}_{h,j}^{} \rightharpoonup {\varvec{0}}\;\text {weakly-* in}\; L^\infty (\Omega )^{d\times d}, \end{aligned}$$

    as \(h\rightarrow 0\).

Lemma 7

Let \(s\in (1,\infty )\) and assume that Assumption (A1) is satisfied. Let \(\{{\varvec{v}}_h^{}\}_{h>0}^{}\) be a sequence such that \({\varvec{v}}_h^{}\in {\varvec{V}}_{h,{\mathop {\mathrm {div}\,}}}^{}\) for all \(h>0\) and such that \({\varvec{v}}_h^{} \rightharpoonup {\varvec{0}}\) weakly in \(W^{1,s}_0(\Omega )^d\) as \(h\rightarrow 0\). Furthermore, let \(\{{\varvec{v}}_{h,j}^{}\}_{h>0,j\in {{\mathbb {N}}}}^{}\) be the sequence of Lipschitz truncations given by Lemma 6. Then, there exists a double sequence \(\{{\varvec{w}}_{h,j}^{}\}_{h>0,j\in {{\mathbb {N}}}}^{}\) such that

  1. i.

    \({\varvec{w}}_{h,j}^{}\in {\varvec{V}}_{h,{\mathop {\mathrm {div}\,}}}^{}\) for all \(h>0\) and all \(j\in {{\mathbb {N}}}\);

  2. ii.

    There exists a c(s) such that

    $$\begin{aligned} \Vert {\varvec{v}}_{h,j}^{}-{\varvec{w}}_{h,j}^{}\Vert _{1,s,\Omega }^{} \le c(s)2^{-\frac{j}{s}}\qquad \forall \, h>0, \; \forall \, j\in {{\mathbb {N}}}; \end{aligned}$$
    (2.21)
  3. iii.

    For any fixed \(j\in {{\mathbb {N}}}\) the following convergences hold (up to a subsequence, if necessary):

    $$\begin{aligned} {\varvec{w}}_{h,j}^{}\rightarrow {\varvec{0}}\;\text {strongly in}\; L^t(\Omega )^d\quad \text {and}\quad \nabla {\varvec{w}}_{h,j}^{} \rightharpoonup {\varvec{0}} \;\text {weakly in}\; L^t(\Omega )^{d\times d}, \end{aligned}$$

    as \(h\rightarrow 0\), for all \(t<+\infty \).

2.4 The Stabilising Bilinear Form and the Lifting Operator

The finite element method studied in this work is based on the pair \({\varvec{V}}_h^{}\times {{\mathscr {Q}}}_h^{}\). Since this pair is not inf-sup stable, some form of stabilisation is needed. We propose the following stabilising bilinear form:

$$\begin{aligned} s(q_h^{},t_h^{}) = \sum _{M\in {{\mathscr {T}}}_H^{}}\sum _{F\in {{\mathscr {F}}}_I^{}(M)}\tau _F^{}\,(\llbracket q_h^{}\rrbracket , \llbracket t_h^{}\rrbracket )_F^{}, \end{aligned}$$
(2.22)

where the stabilisation parameter \(\tau _F^{}\) is defined as follows:

$$\begin{aligned} \tau _F^{}=h_F^{\alpha (r)}\qquad \text {where}\qquad \alpha (r) := \left\{ \begin{array}{rll} 1 &{} &{}\text {if}\; r\ge 2, \\ 1-d+\dfrac{2d}{{\tilde{r}}} &{} &{}\text {if}\; r\in \left[ \frac{3d}{d+2}, 2\right) ,\\ ~\\ 1-d+\dfrac{2d}{{\tilde{r}}_{\mathrm{max}}^{}} &{} = \dfrac{d-1}{3}&{}\text {if}\; r\in \left( \frac{2d}{d+2}, \frac{3d}{d+2}\right) . \end{array}\right. \end{aligned}$$
(2.23)

The behaviour of \(\alpha (r)\) is depicted in Fig. 1. It can be observed there that the stabilisation gets stronger as \(r \searrow \frac{2d}{d+2}\). The reason for this behaviour will become clear when we perform the convergence analysis in Sect. 4.
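To make the behaviour of the exponent concrete, the following Python sketch tabulates \(\alpha (r)\) from (2.23). It assumes, consistently with the values appearing in (2.23) but not restated in this section, that \({\tilde{r}}=\min \{r', r^*/2\}\), where \(r'=r/(r-1)\) and \(r^*=dr/(d-r)\) for \(r<d\) (and \(r^*=\infty \) otherwise). The function names are illustrative only.

```python
import math


def r_tilde(r: float, d: int) -> float:
    """Assumed definition: min(r', r*/2), with r* the Sobolev conjugate of r."""
    r_prime = r / (r - 1.0)
    r_star = d * r / (d - r) if r < d else math.inf
    return min(r_prime, 0.5 * r_star)


def alpha(r: float, d: int) -> float:
    """Exponent alpha(r) of (2.23); requires r > 2d/(d+2)."""
    if r <= 2.0 * d / (d + 2.0):
        raise ValueError("r must exceed 2d/(d+2)")
    if r >= 2.0:
        return 1.0
    if r >= 3.0 * d / (d + 2.0):
        return 1.0 - d + 2.0 * d / r_tilde(r, d)
    return (d - 1.0) / 3.0


if __name__ == "__main__":
    # Tabulate alpha for a few admissible exponents in two and three dimensions.
    for d in (2, 3):
        for r in (1.3, 1.5, 1.8, 1.9, 2.0, 3.0):
            if r > 2.0 * d / (d + 2.0):
                print(f"d={d}, r={r}: alpha={alpha(r, d):.4f}")
```

Under this assumption the three branches of (2.23) match continuously at \(r=\frac{3d}{d+2}\) and at \(r=2\), and a smaller \(\alpha (r)\) means a larger weight \(h_F^{\alpha (r)}\) for \(h_F<1\).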

Remark 8

Thanks to \({\mathop {\mathrm {Assumption~(A1)}}}\) and the inf-sup condition (2.16), one can expect \(\Pi _H^{}(p_h^{})\) to be stable (where \(p_h^{}\) is the finite element approximation of the pressure). The stabilisation is then designed to control \(p_h^{}-\Pi _H^{}(p_h^{})\). More precisely, using (2.12), (2.15) (or the inverse inequality (2.14)), and the definition of the bilinear form \(s(\cdot ,\cdot )\), we see that there exists a constant \(C>0\) such that

$$\begin{aligned} \Vert q_h^{}-\Pi _H^{}(q_h^{})\Vert _{0,\ell ,\Omega }^{}\le C\,h^{\chi }\,s(q_h^{},q_h^{})^{\frac{1}{2}}, \end{aligned}$$
(2.24)

for all \(q_h^{}\in {{\mathscr {Q}}}_h^{}\), where

$$\begin{aligned} \chi =\left\{ \begin{array}{rl} \frac{1-\alpha (r)}{2} &{} \text {if}\; \ell \le 2,\\ \frac{1-d+\frac{2d}{\ell }-\alpha (r)}{2} &{} \text {if}\; \ell > 2. \end{array}\right. \end{aligned}$$

It will be useful in what follows to observe that for \(\ell ={\tilde{r}}\) we have \(\chi \ge 0\).
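The observation that \(\chi \ge 0\) for \(\ell ={\tilde{r}}\) can be checked numerically. The sketch below is only a sanity check, not part of the analysis; it assumes \({\tilde{r}}=\min \{r', r^*/2\}\) (a definition taken from outside this section) and samples r on a uniform grid whose upper end `r_max` is an arbitrary choice.

```python
import math


def r_tilde(r, d):
    # Assumption: r~ = min(r', r*/2), with r* = d r / (d - r) for r < d.
    r_star = d * r / (d - r) if r < d else math.inf
    return min(r / (r - 1.0), 0.5 * r_star)


def alpha(r, d):
    # Exponent of the stabilisation parameter, see (2.23).
    if r >= 2.0:
        return 1.0
    if r >= 3.0 * d / (d + 2.0):
        return 1.0 - d + 2.0 * d / r_tilde(r, d)
    return (d - 1.0) / 3.0


def chi(ell, r, d):
    """Exponent chi appearing in (2.24) for the L^ell norm."""
    a = alpha(r, d)
    if ell <= 2.0:
        return 0.5 * (1.0 - a)
    return 0.5 * (1.0 - d + 2.0 * d / ell - a)


def min_chi_at_r_tilde(d, r_max=4.0, samples=400):
    """Smallest value of chi(r~, r) over a grid of r in (2d/(d+2), r_max]."""
    r_min = 2.0 * d / (d + 2.0)
    values = []
    for k in range(1, samples + 1):
        r = r_min + (r_max - r_min) * k / samples
        values.append(chi(r_tilde(r, d), r, d))
    return min(values)
```

For \(r\ge \frac{3d}{d+2}\) the exponent is exactly zero, so (2.24) gives stability without a gain in h, while for smaller r it is strictly positive.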

Another important ingredient in the definition of the method is a lifting of the pressure jumps defined with the help of the lowest order Raviart–Thomas basis functions. To define this, for each \(F\in {{\mathscr {F}}}_h^{}\) we choose a unique normal vector \({\varvec{n}}_F^{}\). Its orientation is of no importance, but it needs to point outward from \(\Omega \) if \(F\subset \partial \Omega \). Moreover, for each \(K\in {{\mathscr {T}}}_h^{}\) such that \(F\in {{\mathscr {F}}}_K^{}\), we denote the node in K opposite F by \({\varvec{x}}_F^{}\). Using this unique normal vector, we introduce the lowest order Raviart–Thomas basis function \(\varvec{\varphi }_F^{}\) defined as

$$\begin{aligned} \varvec{\varphi }_F^{}({\varvec{x}})|_K^{}:=\pm \frac{|F|}{d|K|}({\varvec{x}}-{\varvec{x}}_F^{}), \end{aligned}$$
(2.25)

and extended by zero outside \(\omega _F^{}\). In this definition, the sign of the function \(\varvec{\varphi }_F^{}\) depends on whether the normal vector \({\varvec{n}}_F^{}\) points in or out of K. Thanks to its definition, \(\varvec{\varphi }_F^{}\) satisfies the following: for every \(F'\in {{\mathscr {F}}}_h^{}\) the normal component of \(\varvec{\varphi }_F^{}\) is given by (with the obvious abuse of notation considering that \({\varvec{n}}_{F'}^{}\) is not defined at the boundary of \(F'\)):

$$\begin{aligned} \varvec{\varphi }_F^{}\cdot {\varvec{n}}_{F'}^{}=\left\{ \begin{array}{ll} 1 &{} \text {if}\; F'=F,\\ 0 &{} \text {otherwise}.\end{array}\right. \end{aligned}$$
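The normal-component property of \(\varvec{\varphi }_F^{}\) can be verified directly on a single triangle. The following Python sketch is an illustration for \(d=2\) only: it evaluates (2.25) with the sign "+" and the outward normal of K, and the helper names are hypothetical.

```python
import math


def _face(vertices, opposite):
    """End points of the face opposite the vertex with index `opposite`."""
    return [v for i, v in enumerate(vertices) if i != opposite]


def area(vertices):
    (x0, y0), (x1, y1), (x2, y2) = vertices
    return 0.5 * abs((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0))


def rt0_basis(vertices, opposite):
    """phi_F(x) = |F| / (d |K|) * (x - x_F) with d = 2 and the '+' sign."""
    a, b = _face(vertices, opposite)
    x_f = vertices[opposite]
    scale = math.hypot(b[0] - a[0], b[1] - a[1]) / (2.0 * area(vertices))
    return lambda x: (scale * (x[0] - x_f[0]), scale * (x[1] - x_f[1]))


def outward_normal(vertices, opposite):
    """Unit normal of the face opposite `opposite`, pointing out of K."""
    a, b = _face(vertices, opposite)
    x_f = vertices[opposite]
    tx, ty = b[0] - a[0], b[1] - a[1]
    length = math.hypot(tx, ty)
    n = (ty / length, -tx / length)
    # Flip the normal if it points towards the opposite vertex.
    if n[0] * (a[0] - x_f[0]) + n[1] * (a[1] - x_f[1]) < 0.0:
        n = (-n[0], -n[1])
    return n


def normal_components(vertices, opposite):
    """phi_F . n_{F'} at the midpoint of every face F' of the triangle."""
    phi = rt0_basis(vertices, opposite)
    result = []
    for other in range(3):
        a, b = _face(vertices, other)
        mid = (0.5 * (a[0] + b[0]), 0.5 * (a[1] + b[1]))
        n = outward_normal(vertices, other)
        value = phi(mid)
        result.append(value[0] * n[0] + value[1] * n[1])
    return result
```

Since the normal component of a lowest order Raviart–Thomas function is constant along each face, evaluating it at the face midpoints is sufficient for this check.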

With the help of these Raviart–Thomas basis functions, we define the following operator, which will be fundamental in the definition of the finite element method:

$$\begin{aligned} \begin{aligned}&{{\mathscr {L}}}:W^{1,r}(\Omega )^d \times {{\mathscr {Q}}}_h^{}\rightarrow W^r({\mathop {\mathrm {div}\,}}\!; \Omega ),\\&({\varvec{v}},q_h^{})\mapsto {{\mathscr {L}}}({\varvec{v}},q_h^{}) := {\varvec{v}}+ \sum _{M\in {{\mathscr {T}}}_H^{}}\sum _{F\in {{\mathscr {F}}}_I^{}(M)}\tau _F^{}\llbracket q_h^{}\rrbracket \varvec{\varphi }_F^{}. \end{aligned} \end{aligned}$$
(2.26)

Since the velocity \({\varvec{u}}\) is bounded in \(W^{1,r}_0(\Omega )^d\), it is bounded in \(L^{2{\tilde{r}}}(\Omega )^d\) as well. In the finite element method proposed in Sect. 3, we will consider a modified velocity built with the help of the mapping \({{\mathscr {L}}}\) just defined. The following result states that this stability is preserved by the operator \({{\mathscr {L}}}\).

Lemma 9

There exists a constant \(C>0\), independent of h, such that

$$\begin{aligned} \Vert {{\mathscr {L}}}({\varvec{v}},q_h^{})\Vert _{0,2{\tilde{r}},\Omega }^{}\le C\left\{ |{\varvec{v}}|_{1,r,\Omega }^{}+ s(q_h^{},q_h^{})^{\frac{1}{2}} \right\} , \end{aligned}$$

for all \(({\varvec{v}},q_h^{})\in W^{1,r}_0(\Omega )^d\times {{\mathscr {Q}}}_h^{}\).

Proof

Thanks to the embedding (2.3) and denoting

$$\begin{aligned} {\varvec{u}}_{nc}^{}:=\sum _{M\in {{\mathscr {T}}}_H^{}}\sum _{F\in {{\mathscr {F}}}_I^{}(M)}\tau _F^{}\llbracket q_h^{}\rrbracket \varvec{\varphi }_F^{}, \end{aligned}$$

we obtain the bound

$$\begin{aligned} \Vert {{\mathscr {L}}}({\varvec{v}},q_h^{})\Vert _{0,2{\tilde{r}},\Omega }^{}\le C\,|{\varvec{v}}|_{1,r,\Omega }^{}+\Vert {\varvec{u}}_{nc}^{}\Vert _{0,2{\tilde{r}},\Omega }^{}. \end{aligned}$$

To bound the second term on the right-hand side of this inequality we start by noticing that the definition of \(\varvec{\varphi }_F^{}\) (cf. (2.25)) gives \(\Vert \varvec{\varphi }_F^{}\Vert _{0,\infty ,K}^{}\le C\) for each K such that \(F\in {{\mathscr {F}}}_K^{}\). So, let \(K\in {{\mathscr {T}}}_h^{}\) and let \(M\in {{\mathscr {T}}}_H^{}\) be the unique macro-element such that \(K\subset M\). Then, using the mesh regularity and the Cauchy–Schwarz inequality we get

$$\begin{aligned} \Vert {\varvec{u}}_{nc}^{}\Vert _{0,K}^{}&=\, \left\| \sum _{F\in {{\mathscr {F}}}_K^{}\cap {{\mathscr {F}}}_I^{}(M)}\tau _F^{}\llbracket p_h^{}\rrbracket \varvec{\varphi }_F^{}\right\| _{0,K}^{} \\&\le \, \sum _{F\in {{\mathscr {F}}}_K^{}\cap {{\mathscr {F}}}_I^{}(M)}\tau _F^{}|\llbracket p_h^{}\rrbracket |\, \Vert \varvec{\varphi }_F^{}\Vert _{0,K}^{} \\&\le \, C\,\sum _{F\in {{\mathscr {F}}}_K^{}\cap {{\mathscr {F}}}_I^{}(M)}\tau _F^{} h_F^{\frac{d}{2}}|\llbracket p_h^{}\rrbracket | \\&\le \, C\,\sum _{F\in {{\mathscr {F}}}_K^{}\cap {{\mathscr {F}}}_I^{}(M)}\tau _F^{} h_F^{1-\frac{d}{2}}(1,|\llbracket p_h^{}\rrbracket |)_F^{} \\&\le \, C\,h_F^{\frac{1}{2}}\sum _{F\in {{\mathscr {F}}}_K^{}\cap {{\mathscr {F}}}_I^{}(M)}\tau _F^{} \Vert \llbracket p_h^{}\rrbracket \Vert _{0,F}^{}. \end{aligned}$$

Hence, squaring, summing over all the elements, and using the mesh regularity gives

$$\begin{aligned} \Vert {\varvec{u}}_{nc}^{}\Vert _{0,\Omega }^{} = \left\{ \sum _{K\in {{\mathscr {T}}}_h^{}}\Vert {\varvec{u}}_{nc}^{}\Vert _{0,K}^2\right\} ^{\frac{1}{2}} \le C\,h^{\frac{1+\alpha (r)}{2}} \left\{ \sum _{M\in {{\mathscr {T}}}_H^{}}\sum _{F\in {{\mathscr {F}}}_I^{}(M)}\tau _F^{}\Vert \llbracket p_h^{}\rrbracket \Vert ^2_{0,F}\right\} ^{\frac{1}{2}}. \end{aligned}$$

Thus, using the inverse inequality (2.14) we arrive at

$$\begin{aligned} \Vert {\varvec{u}}_{nc}^{}\Vert _{0,2{\tilde{r}},\Omega }^{} \le Ch^{\frac{d(1-{\tilde{r}})}{2{\tilde{r}}}}\Vert {\varvec{u}}_{nc}^{}\Vert _{0,\Omega }^{} \le Ch^{\frac{d(1-{\tilde{r}})+{\tilde{r}}+\alpha (r){\tilde{r}}}{2{\tilde{r}}}} \, s(p_h^{},p_h^{})^{\frac{1}{2}}. \end{aligned}$$
(2.27)

To complete the proof we only need to make sure that the exponent of h in (2.27) is not negative. Let \(\xi := d(1-{\tilde{r}})+{\tilde{r}}+\alpha (r){\tilde{r}}\). If \(r\ge 2\) then \(\alpha (r)=1\) and \({\tilde{r}}\le 2\). So, \(\xi =d(1-{\tilde{r}})+2{\tilde{r}}= d+(2-d){\tilde{r}}\ge 4-d\ge 1\). If \(r<2\) then \(\alpha (r)\ge \frac{d-1}{3}\) and so

$$\begin{aligned} \xi \ge d(1-{\tilde{r}})+{\tilde{r}}+\frac{d-1}{3}{\tilde{r}} = d + \frac{2(1-d)}{3}{\tilde{r}} \ge d + \frac{2(1-d)}{3}{\tilde{r}}_\mathrm{max}^{} = 0. \end{aligned}$$

Since in the whole range of values for r we have \(\xi \ge 0\), the proof is complete. \(\square \)
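The sign of the exponent in (2.27) can also be checked numerically. The sketch below evaluates \(\xi =d(1-{\tilde{r}})+{\tilde{r}}+\alpha (r){\tilde{r}}\) on a grid of admissible r; as before, \({\tilde{r}}=\min \{r', r^*/2\}\) is an assumption imported from outside this section, and the grid parameters are arbitrary.

```python
import math


def r_tilde(r, d):
    # Assumption: r~ = min(r', r*/2), with r* = d r / (d - r) for r < d.
    r_star = d * r / (d - r) if r < d else math.inf
    return min(r / (r - 1.0), 0.5 * r_star)


def alpha(r, d):
    # Exponent of the stabilisation parameter, see (2.23).
    if r >= 2.0:
        return 1.0
    if r >= 3.0 * d / (d + 2.0):
        return 1.0 - d + 2.0 * d / r_tilde(r, d)
    return (d - 1.0) / 3.0


def xi(r, d):
    """Numerator of the exponent of h in (2.27)."""
    rt = r_tilde(r, d)
    return d * (1.0 - rt) + rt + alpha(r, d) * rt


def min_xi(d, r_max=6.0, samples=600):
    """Smallest value of xi over a uniform grid of r in (2d/(d+2), r_max]."""
    r_min = 2.0 * d / (d + 2.0)
    return min(
        xi(r_min + (r_max - r_min) * k / samples, d)
        for k in range(1, samples + 1)
    )
```

In this check the minimum is attained, up to rounding, at \(r=\frac{3d}{d+2}\), where \({\tilde{r}}={\tilde{r}}_{\mathrm{max}}\) and \(\xi =0\); this is the borderline case of the proof.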

3 The Finite Element Method

The finite element method studied in this work reads as follows: find \(({\varvec{u}}_h^{}, p_h^{})\in {\varvec{V}}_h^{}\times {{\mathscr {Q}}}_h^{}\) such that

$$\begin{aligned} (|\nabla {\varvec{u}}_h^{}|^{r-2}\nabla {\varvec{u}}_h^{},\nabla {\varvec{v}}_h^{})_\Omega ^{}-({{\mathscr {L}}}({\varvec{u}}_h^{},p_h^{})\otimes {\varvec{u}}_h^{},\nabla {\varvec{v}}_h^{})_\Omega ^{}- (p_h^{},{\mathop {\mathrm {div}\,}}{\varvec{v}}_h^{})_\Omega ^{}&= \langle {\varvec{f}},{\varvec{v}}_h^{}\rangle _\Omega ^{}, \end{aligned}$$
(3.1)
$$\begin{aligned} (q_h^{},{\mathop {\mathrm {div}\,}}{\varvec{u}}_h^{})_\Omega ^{} + s(p_h^{},q_h^{})&= 0, \end{aligned}$$
(3.2)

for all \(({\varvec{v}}_h^{},q_h^{})\in {\varvec{V}}_h^{}\times {{\mathscr {Q}}}_h^{}\), where \({{\mathscr {L}}}\) is defined by (2.26) and the stabilising bilinear form \(s(\cdot ,\cdot )\) is defined in (2.22).

Remark 10

  1. (i)

    The main differences between (3.1), (3.2) and a standard Galerkin method are twofold: first, the stabilising term involving the jumps of the discrete pressure is added to the formulation to compensate for the fact that the pair \({\varvec{V}}_h^{}\times {{\mathscr {Q}}}_h^{}\) does not satisfy the discrete inf-sup condition. Additionally, and perhaps more significantly, the convection velocity \({\varvec{u}}_h^{}\) has been replaced by the modified version \({{\mathscr {L}}}({\varvec{u}}_h^{},p_h^{})\). In Lemma 11 this modified velocity will be proved to be solenoidal, which allows us to analyse the finite element method without the need to rewrite the convection term in a skew-symmetric form. This will lead to a convergence result valid in the whole range \(r> \frac{2d}{d+2}\).

  2. (ii)

    As can be expected, the power of h in the stabilisation parameter depends strongly on the value of r. Two important remarks are in order:

    • \(\alpha (r)=1\) for all \(r\ge 2\);

    • For all \(r<2\) we have \(\frac{d-1}{3}\le \alpha (r)<1\).

    Thus, there is always a positive power of h multiplying the jump terms of the pressure involved in the definition of \(s(\cdot ,\cdot )\) and \({{\mathscr {L}}}({\varvec{u}}_h^{},p_h^{})\), but the stabilisation becomes stronger as \(r\searrow \frac{2d}{d+2}\).

3.1 Existence of a Solution and a priori Bounds

Before exploring the stability of the scheme, we present the following a priori result concerning qualitative properties of \({\varvec{u}}_h^{}\) and \({{\mathscr {L}}}({\varvec{u}}_h^{},p_h^{})\), whenever \(({\varvec{u}}_h^{},p_h^{})\) solves (3.1), (3.2).

Lemma 11

Let \(({\varvec{u}}_h^{},p_h^{})\in {\varvec{V}}_h^{}\times {{\mathscr {Q}}}_h^{}\) be any solution of (3.1), (3.2). Then,

  1. (i)

    \({\varvec{u}}_h^{}\) is discretely divergence-free with respect to the coarse space \({{\mathscr {Q}}}_H^{}\), that is,

    $$\begin{aligned} (q_H^{},{\mathop {\mathrm {div}\,}}{\varvec{u}}_h^{})_\Omega ^{}=0\qquad \forall \, q_H^{}\in {{\mathscr {Q}}}_H^{}. \end{aligned}$$
  2. (ii)

    \({{\mathscr {L}}}({\varvec{u}}_h^{},p_h^{})\cdot {\varvec{n}}=0\) on \(\partial \Omega \), and

    $$\begin{aligned} {\mathop {\mathrm {div}\,}}{{\mathscr {L}}}({\varvec{u}}_h^{},p_h^{})=0\quad \text {in}\;\Omega . \end{aligned}$$

Proof

The proof of (i) is a consequence of the fact that the stabilisation \(s(\cdot ,\cdot )\) vanishes on the coarse space \({{\mathscr {Q}}}_H^{}\), that is, \(s(q_h^{}, q_H^{})=0\) for all \(q_h^{}\in {{\mathscr {Q}}}_h^{}\) and all \(q_H^{}\in {{\mathscr {Q}}}_H^{}\). For (ii), we can follow similar arguments as those presented in [6, Lemma 3.8] and [7, Lemma 3] (see also [2, Theorem 3] for a different proof). \(\square \)
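The key fact used for (i), namely that \(s(\cdot ,\cdot )\) vanishes whenever one argument is constant on each macro-element, can be illustrated on a toy configuration. The sketch below is a minimal illustration under stated assumptions: pressures are taken piecewise constant (so each jump is a single number), and the mesh, face sizes and array layout are hypothetical.

```python
def stabilisation(q, t, faces, macro_of, alpha):
    """Evaluate s(q, t) as in (2.22) for piecewise-constant q and t.

    faces    : list of (left_element, right_element, h_F, measure_of_F)
    macro_of : macro_of[k] is the macro-element containing element k
    alpha    : exponent in tau_F = h_F ** alpha
    """
    total = 0.0
    for left, right, h_f, measure in faces:
        if macro_of[left] != macro_of[right]:
            # The face lies on the boundary of a macro-element,
            # so it does not belong to any F_I(M) and is skipped.
            continue
        tau = h_f ** alpha
        total += tau * measure * (q[left] - q[right]) * (t[left] - t[right])
    return total


# Four elements grouped into two macro-elements, three interior faces.
FACES = [(0, 1, 0.5, 1.0), (1, 2, 0.5, 1.0), (2, 3, 0.5, 1.0)]
MACRO_OF = [0, 0, 1, 1]

q_h = [1.0, -2.0, 0.5, 3.0]    # generic fine-scale pressure
q_H = [4.0, 4.0, -1.0, -1.0]   # constant on each macro-element
```

With these data, `stabilisation(q_h, q_H, FACES, MACRO_OF, 1.0)` vanishes because the only faces retained are interior to a macro-element, where `q_H` has no jump; testing (3.2) with such a `q_H` therefore leaves only the divergence term.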

The following result states the existence of a solution to the discrete problem (3.1), (3.2). In addition, it provides uniform a priori bounds for the sequence of solutions as \(h\rightarrow 0\).

Theorem 12

There exists a solution \(({\varvec{u}}_h^{},p_h^{})\in {\varvec{V}}_h^{}\times {{\mathscr {Q}}}_h^{}\) of (3.1), (3.2). Moreover, every solution satisfies the following a priori bound:

$$\begin{aligned} |{\varvec{u}}_h^{}|_{1,r,\Omega }^{r}+\Vert {{\mathscr {L}}}({\varvec{u}}_h^{},p_h)\Vert _{0,2{\tilde{r}},\Omega }^{}+s(p_h^{},p_h^{}) + \Vert p_h^{}\Vert _{0,{\tilde{r}},\Omega }^{}\le M, \end{aligned}$$
(3.3)

where M does not depend on h.

Proof

The existence of a solution is proved using the argument presented in [22] for the Navier–Stokes equation. First, if \({\varvec{f}}={\varvec{0}}\), then \({\varvec{u}}_h^{}={\varvec{0}}\) and \(p_h^{}=0\) trivially solve (3.1), (3.2). So, we suppose that \({\varvec{f}}\not ={\varvec{0}}\). We define on \({{\mathscr {Q}}}_{h}\) the relation \(\sim \) as follows: given \(p_h, q_h \in {{\mathscr {Q}}}_h\), we shall write \(p_h \sim q_h\) whenever \(p_h - q_h \in {{\mathscr {Q}}}_H\). As \(\sim \) is reflexive, symmetric and transitive it is an equivalence relation. Let \([p_h]\) denote the equivalence class consisting of all \(q_h \in {{\mathscr {Q}}}_h\) such that \(q_h \sim p_h\). Clearly, if \(q_h \in {{\mathscr {Q}}}_H\) then \(q_h \in [0]\). We denote by \([{{\mathscr {Q}}}_{h}^{}/{{\mathscr {Q}}}_H^{}]\) the linear space of equivalence classes induced by the relation \(\sim \) where, for two equivalence classes \([p_h]\), \([q_h]\) in \([{{\mathscr {Q}}}_{h}^{}/{{\mathscr {Q}}}_H^{}]\), and real numbers \(\alpha , \beta \in {{\mathbb {R}}}\), we define \(\alpha [p_h]+\beta [q_h]:= [\alpha p_h + \beta q_h]\). This definition is correct as the value of \([\alpha p_h + \beta q_h]\) is independent of the choice of \(p_h \in [p_h]\) and \(q_h \in [q_h]\). To prove the existence of a solution to the discrete problem (3.1), (3.2) we consider the following reduced problem: find \(({\varvec{u}}_h^{},[p_h^{}])\in {\varvec{V}}_{h,{\mathop {\mathrm {div}\,}}}^{}\times [{{\mathscr {Q}}}_{h}^{}/{{\mathscr {Q}}}_H^{}]\) such that

$$\begin{aligned} (|\nabla {\varvec{u}}_h^{}|^{r-2}\nabla {\varvec{u}}_h^{},\nabla {\varvec{v}}_h^{})_\Omega ^{}-({{\mathscr {L}}}({\varvec{u}}_h^{},[p_h^{}])\otimes {\varvec{u}}_h^{},\nabla {\varvec{v}}_h^{})_\Omega ^{}- ([p_h^{}],{\mathop {\mathrm {div}\,}}{\varvec{v}}_h^{})_\Omega ^{}&= \langle {\varvec{f}},{\varvec{v}}_h^{}\rangle _\Omega ^{}, \end{aligned}$$
(3.4)
$$\begin{aligned} ([q_h^{}],{\mathop {\mathrm {div}\,}}{\varvec{u}}_h^{})_\Omega ^{} + s([p_h^{}],[q_h^{}])&= 0, \end{aligned}$$
(3.5)

for all \(({\varvec{v}}_h^{},[q_h^{}])\in {\varvec{V}}_{h,{\mathop {\mathrm {div}\,}}}^{}\times [{{\mathscr {Q}}}_{h}^{}/{{\mathscr {Q}}}_H^{}]\), where, for \({\varvec{u}}_h, {\varvec{v}}_h \in {\varvec{V}}_{h,{\mathop {\mathrm {div}\,}}}^{}\), we define \({{\mathscr {L}}}({\varvec{u}}_h^{},[p_h^{}]) := {{\mathscr {L}}}({\varvec{u}}_h^{},p_h^{})\), \(([p_h^{}],{\mathop {\mathrm {div}\,}}{\varvec{v}}_h^{})_\Omega ^{} := (p_h^{},{\mathop {\mathrm {div}\,}}{\varvec{v}}_h^{})_\Omega ^{}\), \(([q_h^{}],{\mathop {\mathrm {div}\,}}{\varvec{u}}_h^{})_\Omega ^{} := (q_h^{},{\mathop {\mathrm {div}\,}}{\varvec{u}}_h^{})_\Omega ^{}\), and \(s([p_h^{}],[q_h^{}]):= s(p_h,q_h)\) for any \(p_h \in [p_h]\) and any \(q_h \in [q_h]\). These definitions are correct in the sense that they do not depend on the specific choice of \(p_h \in [p_h]\), \(q_h \in [q_h]\). Now, let us suppose that \(({\varvec{u}}_h^{},[p_h^{}])\in {\varvec{V}}_{h,{\mathop {\mathrm {div}\,}}}^{}\times [{{\mathscr {Q}}}_{h}^{}/{{\mathscr {Q}}}_H^{}]\) satisfies (3.4), (3.5), and let \({p}_h^{}\) be any representative of the equivalence class \([p_h^{}]\). Then, since \(s(p_H,q_h)= s(p_h,q_H)= s(p_H,q_H)=0\) for any \(p_H, q_H \in {{\mathscr {Q}}}_H^{}\) and any \(p_h, q_h \in {{\mathscr {Q}}}_h\), and since \({\varvec{u}}_h^{}\in {\varvec{V}}_{h,{\mathop {\mathrm {div}\,}}}^{}\), we deduce that, for any \(q_H^{}\in {{\mathscr {Q}}}_H^{}\), we have

$$\begin{aligned} (q_H^{},{\mathop {\mathrm {div}\,}}{\varvec{u}}_h^{})_\Omega ^{} + s(p_h^{},q_H^{}) = (q_H^{},{\mathop {\mathrm {div}\,}}{\varvec{u}}_h^{})_\Omega ^{} +0 = 0, \end{aligned}$$

whereby the pair \(({\varvec{u}}_h^{},p_h^{})\), with \(p_h \in [p_h]\), satisfies (3.2). In addition, thanks to the inf-sup condition (2.16) there exists a \(p_H^{}\in {{\mathscr {Q}}}_H^{}\) such that

$$\begin{aligned} (p_H^{},{\mathop {\mathrm {div}\,}}{\varvec{v}}_h^{})_\Omega ^{}&= \langle {\varvec{f}},{\varvec{v}}_h^{}\rangle _\Omega ^{}- (|\nabla {\varvec{u}}_h^{}|^{r-2}\nabla {\varvec{u}}_h^{},\nabla {\varvec{v}}_h^{})_\Omega ^{}+({{\mathscr {L}}}({\varvec{u}}_h^{},[p_h^{}])\otimes {\varvec{u}}_h^{},\nabla {\varvec{v}}_h^{})_\Omega ^{}\\&\quad + ([{p}_h^{}],{\mathop {\mathrm {div}\,}}{\varvec{v}}_h^{})_\Omega ^{}\\&= \langle {\varvec{f}},{\varvec{v}}_h^{}\rangle _\Omega ^{}- (|\nabla {\varvec{u}}_h^{}|^{r-2}\nabla {\varvec{u}}_h^{},\nabla {\varvec{v}}_h^{})_\Omega ^{}+({{\mathscr {L}}}({\varvec{u}}_h^{},p_h^{})\otimes {\varvec{u}}_h^{},\nabla {\varvec{v}}_h^{})_\Omega ^{}\\&\quad + ({p}_h^{},{\mathop {\mathrm {div}\,}}{\varvec{v}}_h^{})_\Omega ^{} \end{aligned}$$

for all \({\varvec{v}}_h^{}\in {\varvec{V}}_h^{}\); therefore \(({\varvec{u}}_h^{}, p_h^{}-p_H^{})\in {\varvec{V}}_h^{}\times {{\mathscr {Q}}}_h^{}\) satisfies (3.1). Thus we have shown that the existence of a solution to the problem (3.4), (3.5) implies the existence of a solution to the problem (3.1), (3.2). Hence, it suffices to prove the existence of a solution to problem (3.4), (3.5). We start by noticing that the mapping defined by

$$\begin{aligned} ({\varvec{v}}_h^{},[q_h^{}]) \in {\varvec{V}}_{h,{\mathop {\mathrm {div}\,}}}^{}\times [{{\mathscr {Q}}}_h^{}/{{\mathscr {Q}}}_H^{}] \mapsto |{\varvec{v}}_h^{}|_{1,r,\Omega }^{}+[s([q_h^{}],[q_h^{}])]^{\frac{1}{2}} \in {{\mathbb {R}}}_{\ge 0} \end{aligned}$$

is a norm. The subspace of \({\varvec{V}}_{h,{\mathop {\mathrm {div}\,}}}^{}\times [{{\mathscr {Q}}}_h^{}/{{\mathscr {Q}}}_H^{}]\) where solutions of (3.4), (3.5) are to be sought is

$$\begin{aligned} {\varvec{X}}_h^{}:=\{ ({\varvec{v}}_h^{},[q_h^{}])\in {\varvec{V}}_{h,{\mathop {\mathrm {div}\,}}}^{}\times [{{\mathscr {Q}}}_h^{}/{{\mathscr {Q}}}_H^{}]\,:\, {\mathop {\mathrm {div}\,}}{{\mathscr {L}}}({\varvec{v}}_h^{},[q_h^{}])=0\;\text {in}\;\Omega \}. \end{aligned}$$

Let \(T:{\varvec{X}}_h^{}\rightarrow [{\varvec{X}}_h^{}]'\) be the mapping defined by

$$\begin{aligned}{}[T({\varvec{v}}_h^{},[q_h^{}]),({\varvec{w}}_h^{},[t_h^{}])]&:= (|\nabla {\varvec{v}}_h^{}|^{r-2}\nabla {\varvec{v}}_h^{},\nabla {\varvec{w}}_h^{})_\Omega ^{} -({{\mathscr {L}}}({\varvec{v}}_h^{},[q_h^{}])\otimes {\varvec{v}}_h^{},\nabla {\varvec{w}}_h^{})_\Omega ^{}\\&\quad - ([q_h^{}],{\mathop {\mathrm {div}\,}}{\varvec{w}}_h^{})_\Omega ^{} \\&\quad +\,([t_h^{}],{\mathop {\mathrm {div}\,}}{\varvec{v}}_h^{})_\Omega ^{} + s([q_h^{}], [t_h^{}]) - \langle {\varvec{f}},{\varvec{w}}_h^{}\rangle _\Omega ^{}, \end{aligned}$$

that is, the mapping associated with the residual of the problem (3.4), (3.5). For any \(({\varvec{v}}_h^{},[q_h^{}])\in {\varvec{X}}_h^{}\), integration by parts gives \(({{\mathscr {L}}}({\varvec{v}}_h^{},[q_h^{}])\otimes {\varvec{v}}_h^{},\nabla {\varvec{v}}_h^{})_\Omega ^{} = 0\), and then, for any \(({\varvec{v}}_h^{},[q_h^{}])\in {\varvec{X}}_h^{}\), Young’s inequality yields

$$\begin{aligned}{}[T({\varvec{v}}_h^{},[q_h^{}]),({\varvec{v}}_h^{},[q_h^{}])]&\ge |{\varvec{v}}_h^{}|^r_{1,r,\Omega }+ s([q_h^{}],[q_h^{}]) -\Vert {\varvec{f}}\Vert _{-1,r',\Omega }^{}|{\varvec{v}}_h^{}|_{1,r,\Omega }^{} \\&\ge \frac{1}{r'}|{\varvec{v}}_h^{}|^r_{1,r,\Omega }+ s([q_h^{}],[q_h^{}]) -\frac{1}{r'}\Vert {\varvec{f}}\Vert _{-1,r',\Omega }^{r'}. \end{aligned}$$

This implies that, for any \(({\varvec{v}}_h^{},[q_h^{}])\in {\varvec{X}}_h^{}\) such that

$$\begin{aligned} \frac{1}{r'}|{\varvec{v}}_h^{}|^r_{1,r,\Omega }+ s([q_h^{}],[q_h^{}]) \ge \Vert {\varvec{f}}\Vert _{-1,r',\Omega }^{r'}, \end{aligned}$$
(3.6)

we have \([T({\varvec{v}}_h^{},[q_h^{}]),({\varvec{v}}_h^{},[q_h^{}])]>0\). By norm-equivalence in the finite-dimensional linear space \({\varvec{V}}_{h,{\mathop {\mathrm {div}\,}}}^{}\) there exists a positive constant \(C_*=C_*(h,r)\) such that \(|{\varvec{v}}_h^{}|_{1,r,\Omega } \ge C_*|{\varvec{v}}_h^{}|_{1,2,\Omega }\). Hence, for any \(({\varvec{v}}_h^{},[q_h^{}])\in {\varvec{X}}_h^{}\) such that

$$\begin{aligned} \frac{C_*^r}{r'}|{\varvec{v}}_h^{}|^r_{1,2,\Omega }+ s([q_h^{}],[q_h^{}]) \ge \Vert {\varvec{f}}\Vert _{-1,r',\Omega }^{r'}, \end{aligned}$$
(3.7)

we have \([T({\varvec{v}}_h^{},[q_h^{}]),({\varvec{v}}_h^{},[q_h^{}])]>0\). Thus, for any \(({\varvec{v}}_h^{},[q_h^{}])\in {\varvec{X}}_h^{}\) such that

$$\begin{aligned} (|{\varvec{v}}_h^{}|^2_{1,2,\Omega }+ s([q_h^{}],[q_h^{}]))^{\frac{1}{2}} = \mu , \quad \text{ where } \quad \mu :=\max \left( \left( \frac{c}{K}\right) ^{\frac{1}{r}}, c^{\frac{1}{2}}\right) , \end{aligned}$$
(3.8)

with \(c:=\Vert {\varvec{f}}\Vert _{-1,r',\Omega }^{r'}\) and \(K:=\min (1, C_*^r/r')\), we have \([T({\varvec{v}}_h^{},[q_h^{}]),({\varvec{v}}_h^{},[q_h^{}])]>0\); this follows by noting that (3.8) implies (3.7) (which then implies (3.6)). Now, \({\varvec{X}}_h^{}\) is a (finite-dimensional) Hilbert space equipped with the norm \(\Vert ({\varvec{v}}_h^{},[q_h^{}]) \Vert _{{\varvec{X}}_h}:= (|{\varvec{v}}_h^{}|^2_{1,2,\Omega }+ s([q_h^{}],[q_h^{}]))^{\frac{1}{2}}\) and associated inner product \((\cdot ,\cdot )_{{\varvec{X}}_h^{}}\); thus, by the Riesz representation theorem, there exists an element \(R({\varvec{v}}_h^{},[q_h^{}]) \in {\varvec{X}}_h^{}\) such that \([T({\varvec{v}}_h^{},[q_h^{}]),({\varvec{v}}_h^{},[q_h^{}])] = (R({\varvec{v}}_h^{},[q_h^{}]),({\varvec{v}}_h^{},[q_h^{}]))_{X_h}\). Hence, by a consequence of Brouwer’s fixed point theorem (see [22, Ch. IV, Corollary 1.1]) there exists a \(({\varvec{u}}_h^{},[p_h^{}])\in {\varvec{X}}_h^{}\) such that \(R({\varvec{u}}_h^{},[p_h^{}])={\varvec{0}}\), and therefore also \(T({\varvec{u}}_h^{},[p_h^{}])={\varvec{0}}\), i.e., \(({\varvec{u}}_h^{},[p_h^{}])\in {\varvec{X}}_h\) solves (3.4), (3.5), which implies that (3.1), (3.2) has a solution \(({\varvec{u}}_h^{},p_h^{})\in {\varvec{V}}_h^{} \times {{\mathscr {Q}}}_h\).

In order to prove the a priori bound (3.3), we first take \(({\varvec{v}}_h^{},q_h^{})=({\varvec{u}}_h^{},p_h^{})\) in (3.1), (3.2) and use the fact that \({{\mathscr {L}}}({\varvec{u}}_h^{},p_h^{})\) is solenoidal to arrive at

$$\begin{aligned} |{\varvec{u}}_h^{}|_{1,r,\Omega }^{r}+s(p_h^{},p_h^{})\le C\,\Vert {\varvec{f}}\Vert _{-1,r',\Omega }^{r'}, \end{aligned}$$
(3.9)

where \(C>0\) depends only on r. Moreover, the bound on \(\Vert {{\mathscr {L}}}({\varvec{u}}_h^{},p_h^{})\Vert _{0,2{\tilde{r}},\Omega }^{}\) follows from Lemma 9 and (3.9).

To bound \(\Vert p_h^{}\Vert _{0,{\tilde{r}},\Omega }^{}\) we consider the projection \(\Pi _H^{}\) defined in Sect. 2.2 and write

$$\begin{aligned} \Vert p_h^{}\Vert _{0,{\tilde{r}},\Omega }^{}\le \Vert p_h^{}-\Pi _H^{}(p_h^{})\Vert _{0,{\tilde{r}},\Omega }^{} + \Vert \Pi _H^{}(p_h^{})\Vert _{0,{\tilde{r}},\Omega }^{}. \end{aligned}$$
(3.10)

First, using the result stated in Remark 8 (that is, (2.24) with \(\ell ={\tilde{r}}\)) we deduce that

$$\begin{aligned} \Vert \Pi _{H}^{}(p_h^{})-p_h^{}\Vert _{0,{\tilde{r}},\Omega }^{} \le C\,h^{\chi }s(p_h^{},p_h^{})^{\frac{1}{2}}\le C, \end{aligned}$$
(3.11)

where \(\chi \ge 0\). Next, since \(\Pi _H^{}(p_h^{})\in {{\mathscr {Q}}}_H^{}\), thanks to (2.16) there exists a \({\tilde{{\varvec{w}}}}_h^{}\in {\varvec{V}}_h^{}\) such that \(|{\tilde{{\varvec{w}}}}_h^{}|_{1,{\tilde{r}}',\Omega }^{}=1\) and

$$\begin{aligned}&\beta _r^{}\Vert \Pi _{H}^{}(p_h^{})\Vert _{0,{\tilde{r}},\Omega }^{} \,\le \, (\Pi _{H}^{}(p_h^{}), {\mathop {\mathrm {div}\,}}{\tilde{{\varvec{w}}}}_h^{})_\Omega ^{} \\&\quad = (\Pi _{H}^{}(p_h^{})-p_h^{}, {\mathop {\mathrm {div}\,}}{\tilde{{\varvec{w}}}}_h^{})_\Omega ^{} + (p_h^{}, {\mathop {\mathrm {div}\,}}{\tilde{{\varvec{w}}}}_h^{})_\Omega ^{}\\&\quad = (\Pi _{H}^{}(p_h^{})-p_h^{}, {\mathop {\mathrm {div}\,}}{\tilde{{\varvec{w}}}}_h^{})_\Omega ^{} + (|\nabla {\varvec{u}}_h^{}|^{r-2}\nabla {\varvec{u}}_h^{},\nabla {\tilde{{\varvec{w}}}}_h^{})_\Omega ^{} - ({{\mathscr {L}}}({\varvec{u}}_h^{},p_h^{})\otimes {\varvec{u}}_h^{},\nabla {\tilde{{\varvec{w}}}}_h^{})_\Omega ^{}\\&\qquad -\langle {\varvec{f}},{\tilde{{\varvec{w}}}}_h^{}\rangle _\Omega ^{}\\&\quad = I+ II + III + IV, \end{aligned}$$

where we have also used that \(({\varvec{u}}_h^{},p_h^{})\) solves (3.1), (3.2). The above terms are bounded using Hölder’s inequality, \({\tilde{r}}\le r'\) (and hence \(r\le {\tilde{r}}'\)), \(|{\tilde{{\varvec{w}}}}_h^{}|_{1,{\tilde{r}}',\Omega }^{}=1\), (3.11), (2.3), the bound for \(\Vert {{\mathscr {L}}}({\varvec{u}}_h^{},p_h^{})\Vert _{0,2{\tilde{r}},\Omega }^{}\), and (3.9) as follows:

$$\begin{aligned} I&\le \Vert \Pi _{H}^{}(p_h^{})-p_h^{}\Vert _{0,{\tilde{r}},\Omega }^{}|{\tilde{{\varvec{w}}}}_h^{}|_{1,{\tilde{r}}',\Omega }^{} \le C, \\ II&\le \, \left( \int _\Omega |\nabla {\varvec{u}}_h^{}|^{(r-1){\tilde{r}}}\,\text {d}{\varvec{x}}\right) ^{\frac{1}{{\tilde{r}}}}|{\tilde{{\varvec{w}}}}_h^{}|_{1,{\tilde{r}}',\Omega }^{} \le C\, \left( \int _\Omega |\nabla {\varvec{u}}_h^{}|^{(r-1)r'}\,\text {d}{\varvec{x}}\right) ^{\frac{1}{r'}} = C\, |{\varvec{u}}_h^{}|_{1,r,\Omega }^{r-1}\le C, \\ III&\le \Vert {{\mathscr {L}}}({\varvec{u}}_h^{},p_h^{})\Vert _{0,2{\tilde{r}},\Omega }^{}\Vert {\varvec{u}}_h^{}\Vert _{0,2{\tilde{r}},\Omega }^{} |{\tilde{{\varvec{w}}}}_h^{}|_{1,{\tilde{r}}',\Omega }^{} \le C\, \Vert {{\mathscr {L}}}({\varvec{u}}_h^{},p_h^{})\Vert _{0,2{\tilde{r}},\Omega }^{}|{\varvec{u}}_h^{}|_{1,r,\Omega }^{} \le C, \\ IV&\le \Vert {\varvec{f}}\Vert _{-1,r',\Omega }^{}|{\tilde{{\varvec{w}}}}_h^{}|_{1,r,\Omega }^{}\le C\,\Vert {\varvec{f}}\Vert _{-1,r',\Omega }^{}\,|{\tilde{{\varvec{w}}}}_h^{}|_{1,{\tilde{r}}',\Omega }^{}=C\,\Vert {\varvec{f}}\Vert _{-1,r',\Omega }^{}. \end{aligned}$$

Thus, the proof follows by inserting the above bounds on \(I,\ldots ,IV\) and (3.11) in (3.10). \(\square \)
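The Young-inequality step used in the coercivity estimate of the proof above, that is, \(|{\varvec{v}}_h^{}|^r-\Vert {\varvec{f}}\Vert \,|{\varvec{v}}_h^{}|\ge \frac{1}{r'}|{\varvec{v}}_h^{}|^r-\frac{1}{r'}\Vert {\varvec{f}}\Vert ^{r'}\), can be checked numerically. The sketch below treats the norms as plain non-negative numbers; the function names and sample values are illustrative assumptions.

```python
def young_gap(a, b, r):
    """a^{r'}/r' + b^r/r - a*b, which Young's inequality asserts is >= 0."""
    r_prime = r / (r - 1.0)
    return a ** r_prime / r_prime + b ** r / r - a * b


def residual_pairing(v_norm, s_val, f_norm, r):
    """First lower bound: |v|^r + s(q,q) - ||f|| |v|."""
    return v_norm ** r + s_val - f_norm * v_norm


def coercivity_lower_bound(v_norm, s_val, f_norm, r):
    """Second lower bound: |v|^r / r' + s(q,q) - ||f||^{r'} / r'."""
    r_prime = r / (r - 1.0)
    return v_norm ** r / r_prime + s_val - f_norm ** r_prime / r_prime


def check(r, values=(0.0, 0.1, 0.5, 1.0, 2.0, 3.0), s_val=0.25):
    """True if the second bound never exceeds the first on the sample grid."""
    return all(
        residual_pairing(v, s_val, f, r)
        >= coercivity_lower_bound(v, s_val, f, r) - 1e-9
        for v in values
        for f in values
    )
```

The difference between the two bounds is exactly `young_gap(f_norm, v_norm, r)`, so the check succeeds for every \(r\in (1,\infty )\). Moreover, under condition (3.6) the second bound is at least \(\Vert {\varvec{f}}\Vert ^{r'}/r\), which is positive for \({\varvec{f}}\ne {\varvec{0}}\).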

4 Convergence to a Weak Solution

In this section we analyse the convergence of the finite element scheme (3.1), (3.2). The convergence proof is divided into two cases in order to distinguish between the situations when a solution \({\varvec{u}}\) of (2.4), (2.5) can, and cannot, be used as a test function in (2.4).

Theorem 13

Let \(r\in (\frac{2d}{d+2},\infty )\). Then, there exists a subsequence, still denoted by \(({\varvec{u}}_h^{},p_h^{})\), such that

$$\begin{aligned}&{\varvec{u}}_h^{} \rightharpoonup {\varvec{u}}\quad \text {weakly in}\; W^{1,r}_0(\Omega )^d; \end{aligned}$$
(4.1)
$$\begin{aligned}&{\varvec{u}}_h^{} \rightarrow {\varvec{u}}\quad \text {strongly in}\; L^s(\Omega )^d\quad \text {for}\; s\in \, [1,2{\tilde{r}}); \end{aligned}$$
(4.2)
$$\begin{aligned}&{{\mathscr {L}}}({\varvec{u}}_h^{},p_h^{}) \rightarrow {\varvec{u}}\quad \text {strongly in}\; L^s(\Omega )^d,\; \text {for}\;s\in \, [1,2{\tilde{r}}); \end{aligned}$$
(4.3)
$$\begin{aligned}&p_h^{} \rightharpoonup p \quad \text {weakly in}\; L^{{\tilde{r}}}(\Omega ); \end{aligned}$$
(4.4)
$$\begin{aligned}&\text {if}\; r\ge \frac{3d}{d+2},\quad \text {then}\; s(p_h^{},p_h^{})\rightarrow 0\qquad \text {as}\; h\rightarrow 0. \end{aligned}$$
(4.5)

In addition, \(p\in L_0^{{\tilde{r}}}(\Omega )\), and \(({\varvec{u}},p)\) solves (2.4), (2.5).

Proof

The proofs of (4.1) and (4.4) follow using (3.3) and the reflexivity of \(W^{1,r}_0(\Omega )\) and \(L^{{\tilde{r}}}(\Omega )\) for \(r\in (1,\infty )\). In addition, p has zero average since

$$\begin{aligned} (p,1)_\Omega ^{}=\lim _{h\rightarrow 0}(p_h^{},1)_\Omega ^{}=0. \end{aligned}$$

The proof of (4.2) is a consequence of the Rellich–Kondrachov Theorem (see, e.g. [10, Theorem 9.16]). Moreover, the bound (2.27) implies that for every \(s< 2{\tilde{r}}\) there exists a number \(\xi >0\) such that

$$\begin{aligned} \Vert {\varvec{u}}_{nc}^{}\Vert _{0,s,\Omega }^{}\le Ch^{\xi } s(p_h^{},p_h^{})^{\frac{1}{2}}, \end{aligned}$$

so (3.3) yields \(\Vert {\varvec{u}}_{nc}^{}\Vert _{0,s,\Omega }^{}\rightarrow 0\) as \( h\rightarrow 0\) for all \(s< 2{\tilde{r}}\). Together with (4.2) this proves (4.3).

We now start the process of identifying the partial differential equation satisfied by the limits \({\varvec{u}}\) and p. Let \({\varvec{v}}\in C_0^{\infty }(\Omega )^d\) be arbitrary, and let \({\varvec{v}}_h^{}\in {\varvec{V}}_h^{}\) be its Scott–Zhang interpolant. Using (3.3) we first get that

$$\begin{aligned} \Vert \,|\nabla {\varvec{u}}_h^{}|^{r-2}\nabla {\varvec{u}}_h^{}\Vert _{0,r',\Omega }^{r'} = \int _\Omega |\nabla {\varvec{u}}_h^{}|^{\frac{(r-1)r}{(r-1)}} \,\text {d}{\varvec{x}}= \Vert \nabla {\varvec{u}}_h^{}\Vert ^r_{0,r,\Omega }\le C, \end{aligned}$$
(4.6)

and thus there exists a \({\varvec{S}}\in L^{r'}(\Omega )^{d\times d}\) such that (up to a subsequence)

$$\begin{aligned} |\nabla {\varvec{u}}_h^{}|^{r-2}\nabla {\varvec{u}}_h^{}\rightharpoonup {\varvec{S}}\quad \text {weakly in}\; L^{r'}(\Omega )^{d\times d}. \end{aligned}$$
(4.7)

So, since \({\varvec{v}}_h^{}\) converges to \({\varvec{v}}\) strongly in \(W^{1,r}_0(\Omega )^d\) we have

$$\begin{aligned} (|\nabla {\varvec{u}}_h^{}|^{r-2}\nabla {\varvec{u}}_h^{},\nabla {\varvec{v}}_h^{})_\Omega ^{}\rightarrow ({\varvec{S}},\nabla {\varvec{v}})_\Omega ^{}\quad \text {as}\; h\rightarrow 0. \end{aligned}$$
(4.8)

Next, thanks to (4.4) and the strong convergence of \({\varvec{v}}_h^{}\) to \({\varvec{v}}\) in \(W^{1,{\tilde{r}}'}_0(\Omega )^d\) (see (2.11)) the following holds:

$$\begin{aligned} (p_h^{},{\mathop {\mathrm {div}\,}}{\varvec{v}}_h^{})_\Omega ^{} = (p_h^{},{\mathop {\mathrm {div}\,}}({\varvec{v}}_h^{}-{\varvec{v}}))_\Omega ^{} + (p_h^{},{\mathop {\mathrm {div}\,}}{\varvec{v}})_\Omega ^{}\rightarrow 0+(p,{\mathop {\mathrm {div}\,}}{\varvec{v}})_\Omega ^{}\quad \text {as}\;h\rightarrow 0.\nonumber \\ \end{aligned}$$
(4.9)

To treat the convection term, (4.2) and (4.3) imply that

$$\begin{aligned} {{\mathscr {L}}}({\varvec{u}}_h^{},p_h^{})\otimes {\varvec{u}}_h^{} \rightarrow {\varvec{u}}\otimes {\varvec{u}}\qquad \text {strongly in}\;L^s(\Omega )^{d\times d}\quad \text {for all}\; s< {\tilde{r}}, \end{aligned}$$
(4.10)

which, together with the fact that \({\varvec{v}}_h^{}\rightarrow {\varvec{v}}\) strongly in \(W^{1,s'}_0(\Omega )^d\), proves that

$$\begin{aligned} ({{\mathscr {L}}}({\varvec{u}}_h^{},p_h^{})\otimes {\varvec{u}}_h^{},\nabla {\varvec{v}}_h^{})_\Omega ^{} \rightarrow ({\varvec{u}}\otimes {\varvec{u}},\nabla {\varvec{v}})_\Omega ^{}\qquad \text {as}\; h\rightarrow 0. \end{aligned}$$
(4.11)

Thus, \(({\varvec{S}},{\varvec{u}},p)\) solves a problem related to (2.4). In fact, since \(({\varvec{u}}_h^{},p_h^{})\) satisfies (3.1), then applying (4.8), (4.9), and (4.11), we arrive at

$$\begin{aligned} \begin{array}{ccccccc} (|\nabla {\varvec{u}}_h^{}|^{r-2}\nabla {\varvec{u}}_h^{},\nabla {\varvec{v}}_h^{})_\Omega ^{} &{} - &{} ({{\mathscr {L}}}({\varvec{u}}_h^{},p_h^{})\otimes {\varvec{u}}_h^{},\nabla {\varvec{v}}_h^{})_\Omega ^{} &{}-&{} (p_h^{},{\mathop {\mathrm {div}\,}}{\varvec{v}}_h^{})_\Omega ^{} &{} =&{} \langle {\varvec{f}},{\varvec{v}}_h^{}\rangle _\Omega ^{} \\ \downarrow &{} &{} \downarrow &{} &{} \downarrow &{} &{} \downarrow \\ ({\varvec{S}},\nabla {\varvec{v}})_\Omega ^{} &{} - &{} ({\varvec{u}}\otimes {\varvec{u}},\nabla {\varvec{v}})_\Omega ^{} &{} -&{} (p,{\mathop {\mathrm {div}\,}}{\varvec{v}})_\Omega ^{} &{} =&{} \langle {\varvec{f}},{\varvec{v}}\rangle _\Omega ^{}, \end{array} \end{aligned}$$

as \(h\rightarrow 0\), and using the density of \(C_0^\infty (\Omega )^d\) in \(W^{1,{\tilde{r}}'}_0(\Omega )^d\), \(({\varvec{u}},p,{\varvec{S}})\) satisfies

$$\begin{aligned} ({\varvec{S}},\nabla {\varvec{v}})_\Omega ^{} - ({\varvec{u}}\otimes {\varvec{u}},\nabla {\varvec{v}})_\Omega ^{} - (p,{\mathop {\mathrm {div}\,}}{\varvec{v}})_\Omega ^{} = \langle {\varvec{f}},{\varvec{v}}\rangle _\Omega ^{} \qquad \forall \, {\varvec{v}}\in W^{1,{\tilde{r}}'}_0(\Omega )^d.\nonumber \\ \end{aligned}$$
(4.12)

To show that \({\varvec{u}}\) is solenoidal we consider \(q\in C^\infty _0(\Omega )\), integrate by parts, and use (4.3) and the fact that \({{\mathscr {L}}}({\varvec{u}}_h^{},p_h^{})\) is solenoidal to obtain

$$\begin{aligned} ({\mathop {\mathrm {div}\,}}{\varvec{u}},q)_\Omega ^{}=-({\varvec{u}},\nabla q)_\Omega ^{} = -\lim _{h\rightarrow 0}({{\mathscr {L}}}({\varvec{u}}_h^{},p_h^{}),\nabla q)_\Omega ^{} =\lim _{h\rightarrow 0}({\mathop {\mathrm {div}\,}}{{\mathscr {L}}}({\varvec{u}}_h^{},p_h^{}),q)_\Omega ^{}=0, \end{aligned}$$

and then \({\mathop {\mathrm {div}\,}}{\varvec{u}}=0\) in the distributional sense.

To prove that \(({\varvec{u}},p)\) solves (2.4), (2.5) it only remains to show that \({\varvec{S}}=|\nabla {\varvec{u}}|^{r-2}\nabla {\varvec{u}}\). The proof of this will be split into two cases, labelled (i) and (ii) below.

(i) \(\underline{r\ge \frac{3d}{d+2} :}\) In this case we use a classical result commonly referred to as the Minty trick (see, e.g., [31, Lemma 2.13]). Let \({\varvec{v}}\in W^{1,r}_0(\Omega )^d\), and let \({\varvec{v}}_h^{}\) be its Scott–Zhang interpolant. Since the r-Laplacian operator is monotone (see, e.g., [10]) we have

$$\begin{aligned} 0&\le (|\nabla {\varvec{v}}|^{r-2}\nabla {\varvec{v}}- |\nabla {\varvec{u}}_h^{}|^{r-2}\nabla {\varvec{u}}_h^{},\nabla ({\varvec{v}}-{\varvec{u}}_h^{}))_\Omega ^{} \nonumber \\&= (|\nabla {\varvec{v}}|^{r-2}\nabla {\varvec{v}}- |\nabla {\varvec{u}}_h^{}|^{r-2}\nabla {\varvec{u}}_h^{},\nabla ({\varvec{v}}-{\varvec{v}}_h^{}))_\Omega ^{} \nonumber \\&\qquad + (|\nabla {\varvec{v}}|^{r-2}\nabla {\varvec{v}},\nabla ({\varvec{v}}_h^{}-{\varvec{u}}_h^{}))_\Omega ^{} - ( |\nabla {\varvec{u}}_h^{}|^{r-2}\nabla {\varvec{u}}_h^{},\nabla ({\varvec{v}}_h^{}-{\varvec{u}}_h^{}))_\Omega ^{}\nonumber \\&= {{\mathcal {A}}}+{{\mathcal {B}}}+{{\mathcal {C}}}. \end{aligned}$$
(4.13)
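The monotonicity inequality in the first line of (4.13) already holds pointwise for the integrand, and this can be checked numerically. The following Python sketch is only an illustration and plays no role in the analysis: it assumes the Frobenius norm for matrices, samples random pairs \(A, B\) (standing in for \(\nabla {\varvec{v}}\) and \(\nabla {\varvec{u}}_h^{}\) at a point), and evaluates \((|A|^{r-2}A-|B|^{r-2}B):(A-B)\). The names `flux`, `monotonicity_gap` and `smallest_gap` are hypothetical helpers introduced here.

```python
import math
import random


def flux(grad, r):
    """Return |A|^(r-2) A for a matrix A stored as a flat list (Frobenius norm assumed)."""
    norm = math.sqrt(sum(a * a for a in grad))
    if norm == 0.0:
        return [0.0] * len(grad)  # the flux is extended by zero at A = 0
    scale = norm ** (r - 2.0)
    return [scale * a for a in grad]


def monotonicity_gap(a, b, r):
    """Return (|A|^(r-2) A - |B|^(r-2) B) : (A - B)."""
    fa, fb = flux(a, r), flux(b, r)
    return sum((x - y) * (s - t) for x, y, s, t in zip(fa, fb, a, b))


def smallest_gap(r, dim=3, samples=2000, seed=0):
    """Smallest gap found over random pairs of dim-by-dim matrices."""
    rng = random.Random(seed)
    n = dim * dim
    gaps = []
    for _ in range(samples):
        a = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        b = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        gaps.append(monotonicity_gap(a, b, r))
    return min(gaps)


if __name__ == "__main__":
    for r in (1.2, 1.5, 2.0, 3.0):
        print(f"r = {r}: smallest sampled gap = {smallest_gap(r):.3e}")
```

A non-negative smallest gap for each sampled exponent is consistent with the monotonicity used above; a random test of this kind is of course no substitute for the proof cited in the text.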

Using that \({\varvec{v}}_h^{}\) converges strongly to \({\varvec{v}}\) in \(W^{1,r}_0(\Omega )^d\) and \({\varvec{u}}_h^{}\) converges weakly to \({\varvec{u}}\) in \(W^{1,r}_0(\Omega )^d\), and (3.3) we easily get

$$\begin{aligned} {{\mathcal {A}}}&\le \Vert |\nabla {\varvec{v}}|^{r-2}\nabla {\varvec{v}}- |\nabla {\varvec{u}}_h^{}|^{r-2}\nabla {\varvec{u}}_h^{}\Vert _{0,r',\Omega }^{}\,|{\varvec{v}}-{\varvec{v}}_h^{}|_{1,r,\Omega }^{}\rightarrow 0 \qquad \text {as}\; h\rightarrow 0, \\ {{\mathcal {B}}}&= (|\nabla {\varvec{v}}|^{r-2}\nabla {\varvec{v}},\nabla ({\varvec{v}}_h^{}-{\varvec{u}}_h^{}))_\Omega ^{} \rightarrow (|\nabla {\varvec{v}}|^{r-2}\nabla {\varvec{v}},\nabla ({\varvec{v}}-{\varvec{u}}^{}))_\Omega ^{} \qquad \text {as}\; h\rightarrow 0. \end{aligned}$$

To treat \({{\mathcal {C}}}\) we use that \(({\varvec{u}}_h^{},p_h^{})\) solves the discrete problem (3.1), (3.2), as follows:

$$\begin{aligned} {{\mathcal {C}}}&= - ( |\nabla {\varvec{u}}_h^{}|^{r-2}\nabla {\varvec{u}}_h^{},\nabla ({\varvec{v}}_h^{}-{\varvec{u}}_h^{}))_\Omega ^{} \\&= -( |\nabla {\varvec{u}}_h^{}|^{r-2}\nabla {\varvec{u}}_h^{},\nabla {\varvec{v}}_h^{})_\Omega ^{} + ( |\nabla {\varvec{u}}_h^{}|^{r-2}\nabla {\varvec{u}}_h^{},\nabla {\varvec{u}}_h^{})_\Omega ^{} \\&= -( |\nabla {\varvec{u}}_h^{}|^{r-2}\nabla {\varvec{u}}_h^{},\nabla {\varvec{v}}_h^{})_\Omega ^{} + \underbrace{({{\mathscr {L}}}({\varvec{u}}_h^{},p_h^{})\otimes {\varvec{u}}_h^{},\nabla {\varvec{u}}_h^{})_\Omega ^{}}_{=0} + (p_h^{},{\mathop {\mathrm {div}\,}}{\varvec{u}}_h^{})_\Omega ^{} + \langle {\varvec{f}},{\varvec{u}}_h^{}\rangle _\Omega ^{} \\&= -( |\nabla {\varvec{u}}_h^{}|^{r-2}\nabla {\varvec{u}}_h^{},\nabla {\varvec{v}}_h^{})_\Omega ^{} - s(p_h^{},p_h^{}) + \langle {\varvec{f}},{\varvec{u}}_h^{}\rangle _\Omega ^{} \\&= {{\mathcal {D}}}-{{\mathcal {E}}}+{{\mathcal {F}}} . \end{aligned}$$

Thus, from (4.13) we get

$$\begin{aligned} 0\le {{\mathcal {E}}} \le {{\mathcal {A}}} +{{\mathcal {B}}} + {{\mathcal {D}}}+ {{\mathcal {F}}}, \end{aligned}$$

and, taking the limit when \(h\rightarrow 0\) on both sides of this inequality, using that \({\varvec{v}}_h^{}\rightarrow {\varvec{v}}\) strongly in \(W^{1,r}_0(\Omega )^d\), \({\varvec{u}}_h^{}\rightharpoonup {\varvec{u}}\) weakly in \(W^{1,r}_0(\Omega )^d\), and \(|\nabla {\varvec{u}}_h^{}|^{r-2}\nabla {\varvec{u}}_h^{}\rightharpoonup {\varvec{S}}\) weakly in \(L^{r'}(\Omega )^{d\times d}\), we obtain

$$\begin{aligned} 0\le \lim _{h\rightarrow 0} s(p_h^{},p_h^{}) \le (|\nabla {\varvec{v}}|^{r-2}\nabla {\varvec{v}}, \nabla ({\varvec{v}}-{\varvec{u}}))_\Omega ^{}-({\varvec{S}},\nabla {\varvec{v}})_\Omega ^{}+\langle {\varvec{f}},{\varvec{u}}\rangle _\Omega ^{}. \end{aligned}$$

It only remains to show that \(\langle {\varvec{f}},{\varvec{u}}\rangle _\Omega ^{}= ({\varvec{S}},\nabla {\varvec{u}})_\Omega ^{}\) to prove that \({\varvec{S}}\) and \({\varvec{u}}\) satisfy

$$\begin{aligned} 0\le \lim _{h\rightarrow 0} s(p_h^{},p_h^{}) \le (|\nabla {\varvec{v}}|^{r-2}\nabla {\varvec{v}}-{\varvec{S}},\nabla ({\varvec{v}}-{\varvec{u}}))_\Omega \end{aligned}$$
(4.14)

for all \({\varvec{v}}\in W^{1,r}_0(\Omega )^d\), and then the monotonicity of the r-Laplacian and an application of [31, Lemma 2.13] give \({\varvec{S}}=|\nabla {\varvec{u}}|^{r-2}\nabla {\varvec{u}}\). Since \(r\ge \frac{3d}{d+2}\), we have \(r={\tilde{r}}'\) and so \({\varvec{u}}\in W^{1,{\tilde{r}}'}_0(\Omega )^d\). Hence, by taking \({\varvec{v}}={\varvec{u}}\) as test function in (4.12) we obtain

$$\begin{aligned} \langle {\varvec{f}},{\varvec{u}}\rangle _\Omega ^{} = ({\varvec{S}},\nabla {\varvec{u}})_\Omega ^{} - \underbrace{({\varvec{u}}\otimes {\varvec{u}},\nabla {\varvec{u}})_\Omega ^{}}_{=0}- \underbrace{(p,{\mathop {\mathrm {div}\,}}{\varvec{u}})_\Omega ^{}}_{=0} = ({\varvec{S}},\nabla {\varvec{u}})_\Omega ^{}, \end{aligned}$$

thus proving that \(({\varvec{S}},\nabla {\varvec{u}})_\Omega ^{}=\langle {\varvec{f}},{\varvec{u}}\rangle _\Omega ^{}\). Hence \(({\varvec{u}},p)\) solves the continuous problem (2.4), (2.5).

Finally, (4.5) follows by taking \({\varvec{v}}={\varvec{u}}\) in (4.14).

(ii) \(\underline{ r \in \left( \frac{2d}{d+2},\frac{3d}{d+2}\right) :}\) For this case we are not able to use the fundamental step of taking \({\varvec{v}}={\varvec{u}}\) as test function in (4.12) to conclude \({\varvec{S}}=|\nabla {\varvec{u}}|^{r-2}\nabla {\varvec{u}}\). So, we need to appeal to the results concerning discrete Lipschitz truncation described in Sect. 2.3 and use the Minty trick once again. To conclude that \({\varvec{S}}=|\nabla {\varvec{u}}|^{r-2}\nabla {\varvec{u}}\) in \(\Omega \) we need to show that

$$\begin{aligned} \lim _{h\rightarrow 0}\big (|\nabla {\varvec{u}}_h^{}|^{r-2}\nabla {\varvec{u}}_h^{}-|\nabla {\varvec{u}}|^{r-2}\nabla {\varvec{u}},\nabla ({\varvec{u}}_h^{}-{\varvec{u}})\big )_\Omega ^{} = 0. \end{aligned}$$
(4.15)

Let us prove that (4.15) does indeed imply that \({\varvec{S}}=|\nabla {\varvec{u}}|^{r-2}\nabla {\varvec{u}}\) in \(\Omega \), as desired. Having done so, we shall show that (4.15) holds. To this end, let \({\varvec{S}}_h^{}:= |\nabla {\varvec{u}}_h^{}|^{r-2} \nabla {\varvec{u}}_h^{}\). As was done in (4.6) and (4.7), there exists a subsequence such that \({\varvec{S}}_h^{} \rightharpoonup {\varvec{S}}\) in \(L^{r'}(\Omega )^{d\times d}\). Using (4.15) it is simple to prove that

$$\begin{aligned} \lim _{h\rightarrow 0} \big (|\nabla {\varvec{u}}_h^{}|^{r-2}\nabla {\varvec{u}}_h^{},\nabla {\varvec{u}}_h^{}\big )_\Omega ^{} = ({\varvec{S}},\nabla {\varvec{u}})_\Omega ^{}. \end{aligned}$$
(4.16)

Indeed, by recalling the definition of \({\varvec{S}}_h\) and expanding (4.15) we have that

$$\begin{aligned} \lim _{h\rightarrow 0} \big (|\nabla {\varvec{u}}_h^{}|^{r-2}\nabla {\varvec{u}}_h^{},\nabla {\varvec{u}}_h^{}\big )_\Omega ^{}&= \lim _{h \rightarrow 0} ({\varvec{S}}_h, \nabla {\varvec{u}}_h)_\Omega \\&= \lim _{h \rightarrow 0} ({\varvec{S}}_h, \nabla {\varvec{u}})_\Omega + \lim _{h\rightarrow 0} (|\nabla {\varvec{u}}|^{r-2} \nabla {\varvec{u}}, \nabla {\varvec{u}}_h^{} - \nabla {\varvec{u}})_\Omega \\&= ({\varvec{S}},\nabla {\varvec{u}})_\Omega ^{} + 0 = ({\varvec{S}},\nabla {\varvec{u}})_\Omega ^{}, \end{aligned}$$

thanks to the weak convergence of \({\varvec{S}}_h\) to \({\varvec{S}}\) in \(L^{r'}(\Omega )^{d \times d}\) and the weak convergence of \({\varvec{u}}_h\) to \({\varvec{u}}\) in \(W^{1,r}_0(\Omega )^d\) (cf. (4.1)). Thus we have shown that (4.15) implies (4.16). Now from (4.16) we have that

$$\begin{aligned} 0&\le \lim _{h\rightarrow 0} \big (|\nabla {\varvec{u}}_h^{}|^{r-2}\nabla {\varvec{u}}_h^{}-|\nabla {\varvec{v}}|^{r-2}\nabla {\varvec{v}},\nabla ({\varvec{u}}_h^{}-{\varvec{v}})\big )_\Omega ^{} \\&= \big ({\varvec{S}}-|\nabla {\varvec{v}}|^{r-2}\nabla {\varvec{v}},\nabla ({\varvec{u}}-{\varvec{v}})\big )_\Omega ^{} \end{aligned}$$

for all \({\varvec{v}}\in W^{1,r}_0(\Omega )^d\), and the application of the Minty trick gives that \({\varvec{S}}=|\nabla {\varvec{u}}|^{r-2}\nabla {\varvec{u}}\).

To prove (4.15), let \(\text {H}_h^{}:= ( |\nabla {\varvec{u}}_h^{}|^{r-2}\nabla {\varvec{u}}_h^{}-|\nabla {\varvec{u}}|^{r-2}\nabla {\varvec{u}}):\nabla ({\varvec{u}}_h^{}-{\varvec{u}})\). Since the r-Laplacian is monotone, it follows that \(\text {H}_h^{}\ge 0\) almost everywhere in \(\Omega \), leading to

$$\begin{aligned} \liminf _{h\rightarrow 0} \int _\Omega \text {H}_h^{}({\varvec{x}})\,\text {d}{\varvec{x}}\ge 0. \end{aligned}$$
(4.17)

To prove the converse to (4.17), let \({\varvec{v}}_h^{}:= {\varvec{u}}_h^{}-{{\mathscr {I}}}({\varvec{u}})\), where \({{\mathscr {I}}}\) is the Fortin operator satisfying (2.17), (2.18). First, \({\varvec{u}}_h^{},{{\mathscr {I}}}({\varvec{u}})\in {\varvec{V}}_{h,{\mathop {\mathrm {div}\,}}}^{}\) and \({\varvec{v}}_h^{} \rightharpoonup {\varvec{0}}\) weakly in \(W_0^{1,r}(\Omega )^d\) as \(h\rightarrow 0\). Now let \(\{{\varvec{v}}_{h,j}^{}\}_{h>0,j\in {{\mathbb {N}}}}^{}\) and \(\{{\varvec{w}}_{h,j}^{}\}_{h>0,j\in {{\mathbb {N}}}}^{}\) be the sequences defined in Lemmas 6 and 7, respectively, and let \(\{{{\mathscr {B}}}_{h,j}^{}\}_{h>0,j\in {{\mathbb {N}}}}^{}\) be the sets defined in Lemma 6. Next, thanks to (3.3), \(\text {H}_h^{}\) is uniformly bounded in \(L^1(\Omega )\), and then using Hölder’s inequality and (2.19) we get

$$\begin{aligned} \int _\Omega \text {H}_h^{\frac{1}{2}}\,\text {d}{\varvec{x}}&= \int _{{{\mathscr {B}}}_{h,j}^{}}\text {H}_h^{\frac{1}{2}}\, \text {d}{\varvec{x}}+ \int _{\Omega \setminus {{\mathscr {B}}}_{h,j}^{}}\text {H}_h^{\frac{1}{2}}\, \text {d}{\varvec{x}}\\&\le |{{\mathscr {B}}}_{h,j}^{}|^{\frac{1}{2}}\left\{ \int _{{{\mathscr {B}}}_{h,j}^{}}\text {H}_h^{}\,\text {d}{\varvec{x}}\right\} ^{\frac{1}{2}} + |\Omega \setminus {{\mathscr {B}}}_{h,j}^{}|^{\frac{1}{2}}\left\{ \int _{\Omega \setminus {{\mathscr {B}}}_{h,j}^{}}\text {H}_h^{}\,\text {d}{\varvec{x}}\right\} ^{\frac{1}{2}}\\&\le C2^{-\frac{j}{2}} +|\Omega |^{\frac{1}{2}}\,{{\mathfrak {A}}}^{\frac{1}{2}}, \end{aligned}$$

where

$$\begin{aligned} {{\mathfrak {A}}}:= \int _{\Omega \setminus {{\mathscr {B}}}_{h,j}^{}}\text {H}_h^{}\,\text {d}{\varvec{x}}.\end{aligned}$$

The goal will be to show that \({{\mathfrak {A}}}\) is bounded by \(C2^{-\frac{j}{r}}\) plus a term that tends to zero with h, ultimately proving that

$$\begin{aligned} \limsup _{h\rightarrow 0}\int _\Omega \text {H}_h^{\frac{1}{2}} \,\text {d}{\varvec{x}}\le C2^{-\frac{j}{2r}}, \end{aligned}$$

for every \(j\in {{\mathbb {N}}}\), which combined with (4.17) will prove (4.15).
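The passage from the bound on \({{\mathfrak {A}}}\) to the bound on the integral of \(\text {H}_h^{\frac{1}{2}}\) can be spelled out as follows. This is a sketch that takes for granted the estimate \({{\mathfrak {A}}}\le C2^{-\frac{j}{r}}+o(1)\) as \(h\rightarrow 0\), and uses only the subadditivity of the square root and the fact that \(r>1\); here C denotes a generic constant independent of h and j:

$$\begin{aligned} \limsup _{h\rightarrow 0}\int _\Omega \text {H}_h^{\frac{1}{2}}\,\text {d}{\varvec{x}}\le C2^{-\frac{j}{2}} + |\Omega |^{\frac{1}{2}}\big (C2^{-\frac{j}{r}}\big )^{\frac{1}{2}} \le C2^{-\frac{j}{2}} + C2^{-\frac{j}{2r}} \le C2^{-\frac{j}{2r}}, \end{aligned}$$

where the last inequality holds because \(2^{-\frac{j}{2}}\le 2^{-\frac{j}{2r}}\) whenever \(r\ge 1\).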

To bound \({{\mathfrak {A}}}\) we start by decomposing the error \({\varvec{u}}_h^{}-{\varvec{u}}\) as \({\varvec{u}}_h^{}-{\varvec{u}}= {\varvec{v}}_h^{}+{{\mathscr {I}}}({\varvec{u}})-{\varvec{u}}\), define \({{\mathscr {G}}}_h^{}:=|\nabla {\varvec{u}}_h^{}|^{r-2}\nabla {\varvec{u}}_h^{}-|\nabla {\varvec{u}}|^{r-2}\nabla {\varvec{u}}\), and thus write

$$\begin{aligned} {{\mathfrak {A}}}= \int _{\Omega \setminus {{\mathscr {B}}}_{h,j}^{}}{{\mathscr {G}}}_h^{}:\nabla {\varvec{v}}_h^{}\,\text {d}{\varvec{x}}+\int _{\Omega \setminus {{\mathscr {B}}}_{h,j}^{}}{{\mathscr {G}}}_h^{}:\nabla ({{\mathscr {I}}}({\varvec{u}})-{\varvec{u}})\,\text {d}{\varvec{x}}= {{\mathfrak {B}}}+{{\mathfrak {C}}}. \end{aligned}$$

Since \({{\mathscr {I}}}({\varvec{u}})\rightarrow {\varvec{u}}\) strongly in \(W^{1,r}_0(\Omega )^d\) and \({{\mathscr {G}}}_h^{}\) is uniformly bounded in \(L^{r'}(\Omega )^{d\times d}\) (thanks to (3.3)), \({{\mathfrak {C}}}\rightarrow 0\) as \(h\rightarrow 0\). Moreover, since \({\varvec{v}}_{h,j}^{}={\varvec{v}}_h^{}\) in \(\Omega \setminus {{\mathscr {B}}}_{h,j}^{}\), then

$$\begin{aligned} {{\mathfrak {B}}}&= \int _{\Omega \setminus {{\mathscr {B}}}_{h,j}^{}}{{\mathscr {G}}}_h^{}:\nabla {\varvec{v}}_{h,j}^{} \,\text {d}{\varvec{x}}\\&= \int _{\Omega }{{\mathscr {G}}}_h^{}:\nabla {\varvec{v}}_{h,j}^{} \,\text {d}{\varvec{x}}- \int _{\Omega }{{\mathscr {G}}}_h^{}:\nabla {\varvec{v}}_{h,j}^{} \mathbb {1}_{{{\mathscr {B}}}_{h,j}^{}} \,\text {d}{\varvec{x}}\\&= \int _{\Omega }{{\mathscr {G}}}_h^{}:\nabla ({\varvec{v}}_{h,j}^{}-{{\varvec{w}}}_{h,j}^{})\,\text {d}{\varvec{x}}+ \int _{\Omega }{{\mathscr {G}}}_h^{}:\nabla {{\varvec{w}}}_{h,j}^{}\,\text {d}{\varvec{x}}- \int _{\Omega }{{\mathscr {G}}}_h^{}:\nabla {\varvec{v}}_{h,j}^{} \mathbb {1}_{{{\mathscr {B}}}_{h,j}^{}}\,\text {d}{\varvec{x}}\\&= {{\mathfrak {D}}}+{{\mathfrak {E}}}+{{\mathfrak {F}}}. \end{aligned}$$

Hölder’s inequality, (2.21), (2.20), and (2.19) yield the bounds

$$\begin{aligned} |{{\mathfrak {D}}}|&\le \Vert {{\mathscr {G}}}_h^{}\Vert _{0,r',\Omega }^{}\Vert \nabla ({\varvec{v}}_{h,j}^{}- {{\varvec{w}}}_{h,j}^{})\Vert _{0,r,\Omega }^{}\le C2^{-\frac{j}{r}}, \\ |{{\mathfrak {F}}}|&\le \Vert {{\mathscr {G}}}_h^{}\Vert _{0,r',\Omega }^{}\Vert \nabla {\varvec{v}}_{h,j}^{} \mathbb {1}_{{{\mathscr {B}}}_{h,j}^{}}\Vert _{0,r,\Omega }^{} \le C2^{-\frac{j}{r}}, \end{aligned}$$

for all \(h>0\). Moreover, \({{\mathfrak {E}}}\) is decomposed as follows

$$\begin{aligned} {{\mathfrak {E}}}= \int _\Omega |\nabla {\varvec{u}}_h^{}|^{r-2}\nabla {\varvec{u}}_h^{}:\nabla {\varvec{w}}_{h,j}^{}\,\text {d}{\varvec{x}}- \int _\Omega |\nabla {\varvec{u}}|^{r-2}\nabla {\varvec{u}}:\nabla {\varvec{w}}_{h,j}^{}\,\text {d}{\varvec{x}}= {{\mathfrak {G}}}+{{\mathfrak {H}}}. \end{aligned}$$

Since \({\varvec{w}}_{h,j}^{} \rightharpoonup {\varvec{0}}\) weakly in \(W^{1,r}_0(\Omega )^d\) then \({{\mathfrak {H}}}\rightarrow 0\) as \(h\rightarrow 0\). The only remaining term to deal with is \({{\mathfrak {G}}}\). We start by using that \(({\varvec{u}}_h^{},p_h^{})\) solves (3.1), (3.2) to rewrite \({{\mathfrak {G}}}\) as follows

$$\begin{aligned} {{\mathfrak {G}}}= ({{\mathscr {L}}}({\varvec{u}}_h^{},p_h^{})\otimes {\varvec{u}}_h^{}, \nabla {\varvec{w}}_{h,j}^{})_\Omega ^{}+ (p_h^{},{\mathop {\mathrm {div}\,}}{\varvec{w}}_{h,j}^{})_\Omega ^{}+\langle {\varvec{f}}, {\varvec{w}}_{h,j}^{}\rangle _\Omega ^{} . \end{aligned}$$

The convective term above is treated as follows: since, for any fixed \(s<+\infty \), \(\nabla {\varvec{w}}_{h,j}^{}\) is uniformly bounded in \(L^s(\Omega )^{d\times d}\), applying (4.10) with \(s={\hat{s}}:=\frac{1+{\tilde{r}}}{2}<{\tilde{r}}\) yields the bound

$$\begin{aligned} ({{\mathscr {L}}}({\varvec{u}}_h^{},p_h^{})\otimes {\varvec{u}}_h^{}-{\varvec{u}}\otimes {\varvec{u}}, \nabla {\varvec{w}}_{h,j}^{})_\Omega ^{} \le \Vert {{\mathscr {L}}}({\varvec{u}}_h^{},p_h^{})\otimes {\varvec{u}}_h^{}-{\varvec{u}}\otimes {\varvec{u}}\Vert _{0,{\hat{s}},\Omega }^{}\Vert \nabla {\varvec{w}}_{h,j}^{}\Vert _{0,{\hat{s}}',\Omega }^{}\rightarrow 0, \end{aligned}$$

as \(h\rightarrow 0\), and then

$$\begin{aligned} ({{\mathscr {L}}}({\varvec{u}}_h^{},p_h^{})\otimes {\varvec{u}}_h^{}, \nabla {\varvec{w}}_{h,j}^{})_\Omega ^{} = ({{\mathscr {L}}}({\varvec{u}}_h^{},p_h^{})\otimes {\varvec{u}}_h^{}-{\varvec{u}}\otimes {\varvec{u}}, \nabla {\varvec{w}}_{h,j}^{})_\Omega ^{}+ ({\varvec{u}}\otimes {\varvec{u}}, \nabla {\varvec{w}}_{h,j}^{})_\Omega ^{}\rightarrow 0, \end{aligned}$$

as \(h\rightarrow 0\). Moreover, \(\langle {\varvec{f}}, {\varvec{w}}_{h,j}^{}\rangle _\Omega ^{} \rightarrow 0\) as \(h\rightarrow 0\). Finally, for the remaining term in \({{\mathfrak {G}}}\) we get, by applying that \({\varvec{w}}_{h,j}^{}\in {\varvec{V}}_{h,{\mathop {\mathrm {div}\,}}}^{}\), the Cauchy–Schwarz inequality and (2.24):

$$\begin{aligned} (p_h^{},{\mathop {\mathrm {div}\,}}{\varvec{w}}_{h,j}^{})_\Omega ^{}&= (p_h^{}-\Pi _H^{}(p_h^{}),{\mathop {\mathrm {div}\,}}{\varvec{w}}_{h,j}^{})_\Omega ^{} \\&\le \Vert p_h^{}-\Pi _H^{}(p_h^{})\Vert _{0,\Omega }^{}\Vert {\mathop {\mathrm {div}\,}}{\varvec{w}}_{h,j}^{}\Vert _{0,\Omega }^{} \\&\le Ch^{\frac{1-\alpha (r)}{2}}s(p_h^{},p_h^{})^{\frac{1}{2}}\Vert {\mathop {\mathrm {div}\,}}{\varvec{w}}_{h,j}^{}\Vert _{0,\Omega }^{} \rightarrow 0, \end{aligned}$$

as \(h\rightarrow 0\), since \(\alpha (r)=\frac{d-1}{3}<1\) for all \(r< \frac{3d}{d+2}\), and \(s(p_h^{},p_h^{})\) and \(\Vert {\mathop {\mathrm {div}\,}}{\varvec{w}}_{h,j}^{}\Vert _{0,\Omega }^{}\) are uniformly bounded in h and j.

Collecting all the above bounds the following can be concluded

$$\begin{aligned} {{\mathfrak {A}}} = {{\mathfrak {B}}}+{{\mathfrak {C}}} = {{\mathfrak {D}}}+{{\mathfrak {E}}}+{{\mathfrak {F}}}+{{\mathfrak {C}}} \le C2^{-\frac{j}{r}}+ {{\mathfrak {G}}}+{{\mathfrak {H}}}+{{\mathfrak {C}}}, \end{aligned}$$

and since \({{\mathfrak {G}}}+{{\mathfrak {H}}}+{{\mathfrak {C}}}\rightarrow 0\) as \(h\rightarrow 0\) for every fixed \(j\in {{\mathbb {N}}}\), we get \(\limsup _{h\rightarrow 0} \int _\Omega \text {H}_h^{\frac{1}{2}}({\varvec{x}})\,\text {d}{\varvec{x}}\le C2^{-\frac{j}{2r}}\) for every \(j\in {{\mathbb {N}}}\), and thus, letting \(j\rightarrow \infty \),

$$\begin{aligned} \limsup _{h\rightarrow 0} \int _\Omega \text {H}_h^{\frac{1}{2}}({\varvec{x}})\,\text {d}{\varvec{x}}\le 0. \end{aligned}$$

So, \(\int _\Omega \text {H}_h^{\frac{1}{2}}\,\text {d}{\varvec{x}}\rightarrow 0\), which means that, up to a subsequence if necessary, \(\text {H}_h^{\frac{1}{2}} \rightarrow 0\) almost everywhere in \(\Omega \), and thus \(\text {H}_h^{} \rightarrow 0\) almost everywhere in \(\Omega \). This, together with (4.17), proves (4.15) and thus \({\varvec{S}}=|\nabla {\varvec{u}}|^{r-2}\nabla {\varvec{u}}\) almost everywhere in \(\Omega \). Hence, \(({\varvec{u}},p)\) solves the continuous problem (2.4), (2.5). \(\square \)

4.1 Strong Convergence

The convergence results proved in the last section can be strengthened. In fact, in this section we prove that the velocity and pressure converge strongly, at least for an appropriate range of values of r in the case of the pressure. We start with the proof of the strong convergence of the velocity.

Theorem 14

For every \(r>\frac{2d}{d+2}\) the discrete velocity \({\varvec{u}}_h^{}\) converges to \({\varvec{u}}\) strongly in \(W^{1,r}_0(\Omega )^d\).

Proof

We start by considering the case when \(r \ge \frac{3d}{d+2}\). Using the discrete problem (3.1), (3.2), (4.5), and (2.4) with \({\varvec{v}}={\varvec{u}}\) we get

$$\begin{aligned} \lim _{h\rightarrow 0}\big (|\nabla {\varvec{u}}_h^{}|^{r-2}\nabla {\varvec{u}}_h^{},\nabla {\varvec{u}}_h^{}\big )_\Omega ^{}&= \lim _{h\rightarrow 0}\big \{ \underbrace{({{\mathscr {L}}}({\varvec{u}}_h^{},p_h^{})\otimes {\varvec{u}}_h^{},\nabla {\varvec{u}}_h^{})_\Omega ^{}}_{=0} + (p_h^{}, {\mathop {\mathrm {div}\,}}{\varvec{u}}_h^{})_\Omega ^{}+\langle {\varvec{f}},{\varvec{u}}_h^{}\rangle _\Omega ^{}\big \}\\&= -\lim _{h\rightarrow 0} s(p_h^{},p_h^{}) + \lim _{h\rightarrow 0} \langle {\varvec{f}},{\varvec{u}}_h^{}\rangle _\Omega ^{} \\&= 0+ \langle {\varvec{f}},{\varvec{u}}\rangle _\Omega ^{}\\&=\big (|\nabla {\varvec{u}}|^{r-2}\nabla {\varvec{u}},\nabla {\varvec{u}}\big )_\Omega ^{}, \end{aligned}$$

and the result follows by using that \({\varvec{u}}_h^{} \rightharpoonup {\varvec{u}}\) in \(W^{1,r}_0(\Omega )^d\), the fact that \(W^{1,r}_0(\Omega )^d\) is uniformly convex, and [10, Proposition 3.32]. For \(r< \frac{3d}{d+2}\) we realise that (4.16) in fact states that \(\lim _{h\rightarrow 0}|{\varvec{u}}_h^{}|_{1,r,\Omega }^{}=|{\varvec{u}}|_{1,r,\Omega }^{}\), and the strong convergence of \({\varvec{u}}_h^{}\) to \({\varvec{u}}\) in \(W^{1,r}_0(\Omega )^d\) follows using once again [10, Proposition 3.32]. \(\square \)
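The argument above combines weak convergence with convergence of the norms; weak convergence alone would not be enough. The following Python sketch illustrates this on a toy example that is not taken from the paper: the sequence \(f_n(x)=\sin (n\pi x)\) on (0, 1), which is assumed here purely for illustration. Its pairings with a fixed function tend to zero, while its \(L^2\) norm stays at \(1/\sqrt{2}\), so the norms do not converge to the norm of the weak limit and the convergence is not strong. The helper names and the midpoint quadrature are hypothetical choices.

```python
import math


def integrate(func, n_cells=20000):
    """Midpoint rule on the interval (0, 1)."""
    h = 1.0 / n_cells
    return h * sum(func((k + 0.5) * h) for k in range(n_cells))


def oscillation(n):
    """Toy sequence f_n(x) = sin(n*pi*x), assumed here only for illustration."""
    return lambda x: math.sin(n * math.pi * x)


def pairing_with(test_func, n):
    """Approximate the L^2(0,1) inner product (f_n, g) for a fixed function g."""
    f_n = oscillation(n)
    return integrate(lambda x: f_n(x) * test_func(x))


def l2_norm(n):
    """Approximate the L^2(0,1) norm of f_n."""
    f_n = oscillation(n)
    return math.sqrt(integrate(lambda x: f_n(x) ** 2))


if __name__ == "__main__":
    g = lambda x: x * (1.0 - x)  # a fixed smooth test function
    for n in (1, 5, 17, 65):
        print(f"n = {n}: (f_n, g) = {pairing_with(g, n):.3e}, ||f_n|| = {l2_norm(n):.6f}")
```

In the proof the situation is the opposite one: the norms \(|{\varvec{u}}_h^{}|_{1,r,\Omega }^{}\) are shown to converge to \(|{\varvec{u}}|_{1,r,\Omega }^{}\), which rules out this kind of oscillation.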

The strong convergence of the pressure is proved next. We begin by noticing that, thanks to Theorem 14 and the continuous injection \(W^{1,r}_0(\Omega )^d \hookrightarrow L^{2{\tilde{r}}}(\Omega )^d\) we have that \({\varvec{u}}_h^{}\) converges strongly to \({\varvec{u}}\) in \(L^{2{\tilde{r}}}(\Omega )^d\). Moreover, if \(r\ge \frac{3d}{d+2}\) then thanks to (2.27) and (4.5), \({{\mathscr {L}}}({\varvec{u}}_h^{},p_h^{})\) also converges strongly to \({\varvec{u}}\) in \(L^{2{\tilde{r}}}(\Omega )^d\).

Theorem 15

For \(r\ge \frac{3d}{d+2}\), the discrete pressure \(p_h^{}\) converges to p strongly in \(L^{{\tilde{r}}}_0(\Omega )\).

Proof

Let \(\Pi _{H}^{}\) be the projection defined in (2.10). Using the triangle inequality we get

(4.18)

First, thanks to (2.11)

(4.19)

Moreover, the combined use of (2.24) and (4.5) gives

(4.20)

It only remains to bound . Thanks to the inf-sup condition (2.16) there exist \(\beta _{{\tilde{r}}}^{}>0\) and \({\tilde{{\varvec{v}}}}_h^{}\in {\varvec{V}}_h^{}\) with \(|{\tilde{{\varvec{v}}}}_h^{}|_{1,{\tilde{r}}',\Omega }^{}=1\) such that

Hölder’s inequality gives

thanks to (4.19) and (4.20). It only remains to bound . Using that \(({\varvec{u}}_h^{},p_h^{})\) solves (3.1), (3.2) we get

Using that \({\varvec{u}}_h^{}\) converges to \({\varvec{u}}\) strongly in \(W^{1,r}_0(\Omega )^d\) we get . Finally, \({{\mathscr {L}}}({\varvec{u}}_h^{},p_h^{})\otimes {\varvec{u}}_h^{}\rightarrow {\varvec{u}}\otimes {\varvec{u}}\) in \(L^{{\tilde{r}}}(\Omega )^{d\times d}\) and \(|{\tilde{v}}_h^{}|_{1,{\tilde{r}}',\Omega }^{}=1\) giving as \(h\rightarrow 0\). So, , and the result follows from (4.18). \(\square \)

5 Concluding Remarks

In this work we have extended the applicability of a low-order divergence-free stabilised finite element method to incompressible non-Newtonian fluid flow models with power-law rheology. The method is based on using a standard continuous piecewise linear finite element approximation for the velocity and piecewise constant approximation for the pressure. The main results of the paper are twofold: first, the method has been shown to converge to a weak solution of the boundary-value problem in the entire range \(r>\frac{2d}{d+2}\) of the power-law index r within which weak solutions to the model are known to exist. Up to now this was only possible by using finite element methods based on pointwise divergence-free continuous piecewise polynomials constructed by taking the curl of \(C^1\) piecewise polynomials (an approach that is usually avoided because of the complexity of its implementation and the excessive number of unknowns at each node, particularly in three dimensions); by using Scott–Vogelius finite elements, which are inf-sup stable on shape-regular meshes for piecewise quartic velocity fields and higher [25]; or by using Guzmán–Neilan type pointwise divergence-free rational basis functions (see [16] for the convergence proof in this case). With standard mixed finite element methods, with a discretely divergence-free velocity field, the range of r for which convergence was shown to hold is smaller, and is restricted to \(r>\frac{2d}{d+1}\); it is not known whether such standard mixed finite element methods converge for \(\frac{2d}{d+2}<r \le \frac{2d}{d+1}\) (see [16]).
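For orientation, the exponent thresholds quoted in this paper can be evaluated in the physically relevant dimensions. The short Python sketch below only evaluates the three fractions \(\frac{2d}{d+2}\), \(\frac{2d}{d+1}\) and \(\frac{3d}{d+2}\) exactly; the function name `thresholds` and the dictionary keys are hypothetical labels introduced for this illustration.

```python
from fractions import Fraction


def thresholds(d):
    """Exponent thresholds quoted in the text, as exact fractions, in dimension d."""
    return {
        "existence": Fraction(2 * d, d + 2),       # weak solutions exist for r above this
        "standard_mixed": Fraction(2 * d, d + 1),  # convergence range of standard mixed methods
        "case_split": Fraction(3 * d, d + 2),      # split between cases (i) and (ii) in Sect. 4
    }


if __name__ == "__main__":
    for d in (2, 3):
        for label, value in thresholds(d).items():
            print(f"d = {d}: {label} = {value} (about {float(value):.3f})")
```

In two dimensions the three values are 1, 4/3 and 3/2, and in three dimensions they are 6/5, 3/2 and 9/5.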

The second main result of this paper is the proof of strong convergence of both the velocity and the pressure. To the best of our knowledge, this is the first work where such a result has been shown for this type of stabilisation; in fact, this strong convergence result is new even for \(r=2\) corresponding to the case of a Newtonian fluid. To date, not many stabilised finite element methods have been proved to be convergent under minimal regularity hypotheses, and for those for which this was achieved the discussion was restricted to the simpler situation of a Newtonian fluid \((r=2)\). In addition, the stabilising jump terms involved the complete Cauchy stress tensor rather than the jump in the pressure alone (see, e.g., [4], where dG methods were analysed), or, in the case of continuous finite element pairs, residual-based stabilisation was used (see [5]).

As was noted earlier, the present work is seen as a proof-of-concept paper, whose aim is to showcase the applicability of this type of stabilisation to problems that are more complex than the Navier–Stokes model, and to highlight the fact that the use of the ‘covert’ divergence-free velocity field \({{\mathscr {L}}}({\varvec{u}}_h^{},p_h^{})\) in the convection term allows one to prove the convergence in the whole range of values of the power-law index r for which weak solutions to the model are known to exist. As such, several questions remain open, including the following:

  • Assumption (A1) was introduced so as to be able to define the discretely divergence-free Lipschitz truncation in the present setting. Whether this is a necessity or one may avoid the use of Lemma 7 altogether, and thereby dispense with Assumption (A1), is an interesting open question;

  • The discussion contained in Remark 5 hints at the possibility of applying lower-order divergence-free finite elements on appropriately refined meshes without the need for stabilisation. This would require the study of the inf-sup stability of such pairs in the setting of the present paper; we note in this direction the recent paper [19], where a Scott–Vogelius pair is used on barycentrically refined meshes;

  • Most of the results presented in this work can be extended, without major difficulties, to more sophisticated explicit constitutive laws (e.g. to Carreau–Yasuda type models). In particular, power-law models such as the ones discussed in [17, Section 3] can be analysed with the techniques developed in this work;

  • Finally, the extension of the results of our work to steady and unsteady implicitly constituted models, such as the ones considered in [16, 34], where the constitutive relation can be identified with a maximal monotone r-graph, is the subject of ongoing research and will be presented elsewhere.