1 Introduction

In this work we consider a viscous, plastic (Bingham) fluid which behaves like a solid at low stresses and like a viscous fluid at high stresses, see [1,2,3,4] and the more recent review article [5]. An everyday example is toothpaste which extrudes from the tube as a solid plug when stress is applied, remains solid in the middle of the plug and exhibits fluid-like behavior near the tube wall. The velocity stays constant within the solid part, i.e. \(\nabla u = \varvec{0}\), and this condition is enforced using a normalized vectorial Lagrange multiplier \(\varvec{\lambda }\). Note that the physical stress vector is \(g \varvec{\lambda }\), and the shear stress is given by its length \(g \vert \varvec{\lambda }\vert \). Here \(g>0\) is a given fixed threshold value for the shear stress at which the solid becomes liquid when exceeded. According to Section 8 in [6] this gives the strong formulation

$$\begin{aligned} -\mu \Delta u - g\,\mathrm {div}\,\varvec{\lambda }&= f \quad \hbox { in}\ \Omega , \end{aligned}$$
(1a)
$$\begin{aligned} \varvec{\lambda }\cdot \nabla u&= \vert \nabla u \vert \quad \text {in }\Omega , \end{aligned}$$
(1b)
$$\begin{aligned} \vert \varvec{\lambda }\vert&\le 1\quad \text {in }\Omega , \end{aligned}$$
(1c)
$$\begin{aligned} u&=0 \quad \hbox { on}\ \partial \Omega , \end{aligned}$$
(1d)

where \(\mu \) is the viscosity of the considered fluid, and f describes the pressure drop along the pipe. Note that in practice the pressure drop is often constant over the cross section. However, in this work we assume that \(f \in L^2(\Omega )\).

Several contributions in the field of Bingham-type fluid computations were made by Glowinski and collaborators; cf. [7,8,9]. In [9] a linear approximation was introduced and a suboptimal a priori error estimate for the velocity was given. Optimal linear convergence of some low-order (mixed) methods was discussed in [10, 11]. Glowinski also provided an exact solution for the model problem of a circular domain with a constant load. Since even this simple geometry and loading leads to a solution that is only in \(H^{5/2-\epsilon }(\Omega )\), \(\epsilon > 0\), higher regularity for general Bingham-type flows is unlikely. Due to this non-smooth nature of the problem, the use of adaptive error control seems highly desirable; see, for example, [12] and the references therein and, more recently, [13]. In particular, we would like to highlight the work [14], where the authors introduced the same a posteriori estimator that we derive in this work. More precisely, they bound the velocity error in terms of the load, the discrete velocity \(u_h\) and the discrete approximation of the Lagrange multiplier \(\varvec{\lambda }_h\),

$$\begin{aligned} \Vert u - u_h \Vert _1 \lesssim \eta (f,u_h,\varvec{\lambda }_h), \end{aligned}$$

where \(\eta \) is some estimator. Although their definition of \(\eta \) is reasonable, proper error control is not guaranteed since their stability analysis does not include a bound for \(\varvec{\lambda }_h\). The main problem can be traced back to the lack of a Babuška–Brezzi condition for the considered linear Lagrangian velocity space and the space of element-wise vector-valued constants for the Lagrange multiplier (which is the same discretization used in [9]). As a result discrete stability for both \(u_h\) and \(\varvec{\lambda }_h\) is not present, i.e. the estimator \(\eta \) could be arbitrarily large.
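The circular model problem with constant load mentioned above can be made concrete with a short sketch. Under radial symmetry, (1a) gives the shear stress \(fr/2\), so the fluid yields only for \(r > r_0 = 2g/f\), and integrating \(\mu u' = g - fr/2\) from the wall yields the profile below. The function name and the default parameter values are illustrative assumptions and are not taken from the cited references.

```python
import numpy as np

def bingham_pipe_velocity(r, R=1.0, f=4.0, g=1.0, mu=1.0):
    """Radially symmetric Bingham velocity in a circular pipe of radius R.

    Assumes a constant load f > 0 and the model (1a)-(1d); the closed form
    is derived under radial symmetry and is only meant as an illustration.
    """
    r = np.asarray(r, dtype=float)
    r0 = 2.0 * g / f               # yield radius: shear stress f*r/2 equals g
    if r0 >= R:
        return np.zeros_like(r)    # load too small, the fluid does not move
    s = np.clip(r, r0, R)          # the plug r <= r0 moves with the velocity at r0
    return f / (4.0 * mu) * (R**2 - s**2) - g / mu * (R - s)

# Sample the profile: constant in the plug, zero at the wall.
profile = bingham_pipe_velocity(np.linspace(0.0, 1.0, 5))
```

The first derivative of this profile is continuous at \(r_0\) while the second derivative jumps, which is in line with the limited regularity discussed above.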

The main contribution of this work is a novel stability and error analysis of a mixed finite element approximation of (1). For this we build upon the ideas from the work [15] by one of the authors on obstacle problems, and the corresponding references therein. Our analysis is based on proving a discrete Babuška–Brezzi condition using a mesh dependent norm; cf. [16]. This allows us to consider various finite element pairs suitable for approximating (1). Besides continuous and discrete stability (see Sects. 2 and 3) we derive an a priori error estimate and discuss linear convergence for sufficiently regular solutions in Sect. 4. Our approach further allows us to derive a residual based a posteriori error estimator (see Sect. 5) which is globally reliable and locally efficient up to a consistency term. We want to emphasize that our analysis gives full control of both the error of the velocity and the error of the (divergence of the) Lagrange multiplier. We conclude the work in Sect. 6, where we give insight into how to solve the discrete system and provide several numerical examples to validate our analysis.

2 Continuous stability

The weak formulation of (1) finds \(u \in V\) and \(\varvec{\lambda }\in \varvec{\Lambda }\) such that

$$\begin{aligned} (\mu \nabla u, \nabla v) + (g \nabla v, \varvec{\lambda })&= (f, v) \quad \forall v \in V, \end{aligned}$$
(2a)
$$\begin{aligned} (g\nabla u, \varvec{\mu }- \varvec{\lambda })&\le 0 \quad \forall \varvec{\mu }\in \varvec{\Lambda }, \end{aligned}$$
(2b)

where \(V = H^1_0(\Omega )\) and \(\varvec{\Lambda }= \{ \varvec{\mu }\in L^2(\Omega , {\mathbb {R}}^2) : \vert \varvec{\mu }\vert \le 1~\hbox { a.e.~in}\ \Omega \}\); see also [8]. Combining (2a) and (2b) gives

$$\begin{aligned} (\mu \nabla u, \nabla v) + (g\nabla v, \varvec{\lambda }) + (g\nabla u, \varvec{\mu }- \varvec{\lambda }) \le (f, v) \end{aligned}$$
(3)

for every \((v, \varvec{\mu }) \in V \times \varvec{\Lambda }\). Note that the solution of (3) is unique up to a divergence-free component, i.e. \(\varvec{\lambda }+ \varvec{\xi }\) is also a solution if \(\mathrm {div}\,\varvec{\xi }= 0\). For the stability analysis we choose the standard \(H^1\)-norm for the space V, and the dual norm \(\Vert \mathrm {div}\,(\cdot ) \Vert _{-1}\) for \(\varvec{\Lambda }\). Note that the latter is strictly speaking not a norm, but only a seminorm. Thus, all error estimates for \(\varvec{\lambda }\) from this work will not prove any convergence of the corresponding approximation in a strong sense, but only show convergence of its distributional divergence.

To simplify the notation we will from now on set \(g=\mu =1\). In the following we use the shorthand notation

$$\begin{aligned} {\mathcal {B}}(w, \varvec{\xi }; v, \varvec{\mu }) = (\nabla w, \nabla v) + (\nabla v, \varvec{\xi }) + (\nabla w, \varvec{\mu }). \end{aligned}$$
(4)

Using Cauchy–Schwarz and the continuity of the duality pairing

$$\begin{aligned} (\nabla v, \varvec{\xi }) = \langle \mathrm {div}\,\varvec{\xi }, v\rangle _{-1} \le \Vert v \Vert _1 \Vert \mathrm {div}\,\varvec{\xi }\Vert _{-1} \quad \forall v \in V, \varvec{\xi }\in \varvec{Q}, \end{aligned}$$

with \(\varvec{Q}= L^2(\Omega , {\mathbb {R}}^2)\), one immediately sees that \({\mathcal {B}}\) is continuous, i.e. we have

$$\begin{aligned} {\mathcal {B}}(w, \varvec{\xi }; v, \varvec{\mu }) \lesssim (\Vert w \Vert _1 + \Vert \mathrm {div}\,\varvec{\xi }\Vert _{-1})(\Vert v \Vert _1 + \Vert \mathrm {div}\,\varvec{\mu }\Vert _{-1}). \end{aligned}$$
(5)

Throughout the paper we write \(a \lesssim b\) (or \(a \gtrsim b\)) if there exists a constant \(C>0\), independent of the finite element mesh, such that \(a \le C b\) (or \(a \ge C b\)). If \(a \lesssim b\) and \(b \lesssim a\) then we write \(a \sim b\).

Theorem 1

For every \((w, \varvec{\xi }) \in V \times \varvec{Q}\) there exists a function \(r \in V\) such that

$$\begin{aligned} {\mathcal {B}}(w, \varvec{\xi }; r, -\varvec{\xi }) \gtrsim (\Vert w \Vert _1 + \Vert \mathrm {div}\,\varvec{\xi }\Vert _{-1})^2 \end{aligned}$$
(6)

and

$$\begin{aligned} \Vert r\Vert _1 \lesssim \Vert w \Vert _1 + \Vert \mathrm {div}\,\varvec{\xi }\Vert _{-1}. \end{aligned}$$
(7)

Proof

We have

$$\begin{aligned} {\mathcal {B}}(w, \varvec{\xi }; w, -\varvec{\xi }) = \Vert \nabla w \Vert _0^2. \end{aligned}$$
(8)

Moreover, let \(q \in V\). Then

$$\begin{aligned} {\mathcal {B}}(w, \varvec{\xi }; q, 0) = (\nabla w, \nabla q) + (\varvec{\xi }, \nabla q). \end{aligned}$$
(9)

If q is chosen as the solution to

$$\begin{aligned} (\nabla q, \nabla z) = (\varvec{\xi }, \nabla z) \quad \forall z \in V, \end{aligned}$$
(10)

then testing with \(z = q\) gives \((\varvec{\xi }, \nabla q) = \Vert \nabla q \Vert _0^2\). By the definition of \(\Vert \cdot \Vert _{-1}\) and Cauchy–Schwarz we have

$$\begin{aligned} \Vert \mathrm {div}\,\varvec{\xi }\Vert _{-1} = \sup _{v \in V} \frac{(\varvec{\xi }, \nabla v)}{\Vert v\Vert _1} \le \Vert \nabla q \Vert _0. \end{aligned}$$
(11)

Now choose \(r := w + q\). Combining (8), (9) and (11), and applying Cauchy–Schwarz and Young’s inequalities on (9) gives the first result (6).
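In more detail, the step above can be verified as follows. The terms \(\pm (\nabla w, \varvec{\xi })\) cancel and \((\varvec{\xi }, \nabla q) = \Vert \nabla q \Vert _0^2\), so that Young's inequality applied to the mixed term gives

$$\begin{aligned} {\mathcal {B}}(w, \varvec{\xi }; w + q, -\varvec{\xi }) = \Vert \nabla w \Vert _0^2 + (\nabla w, \nabla q) + \Vert \nabla q \Vert _0^2 \ge \frac{1}{2} \Vert \nabla w \Vert _0^2 + \frac{1}{2} \Vert \nabla q \Vert _0^2. \end{aligned}$$

The right-hand side is bounded from below by a multiple of \((\Vert w \Vert _1 + \Vert \mathrm {div}\,\varvec{\xi }\Vert _{-1})^2\) due to Friedrichs' inequality for \(w\) and the bound (11) for \(q\).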

For (7) the triangle inequality gives \(\Vert r \Vert _1 \le \Vert w\Vert _1 + \Vert q\Vert _1\). Using Friedrichs inequality we get

$$\begin{aligned} \Vert q \Vert ^2_1 \lesssim \Vert \nabla q \Vert ^2_0 = (\nabla q, \varvec{\xi }) = \langle q, \mathrm {div}\,\varvec{\xi }\rangle _{-1} \le \Vert q \Vert _1 \Vert \mathrm {div}\,\varvec{\xi }\Vert _{-1}, \end{aligned}$$

which concludes the proof. \(\square \)

3 Finite element method

Let \(V_h \subset V\) and \(\varvec{Q}_h \subset \varvec{Q}\). We define the discrete subspace \(\varvec{\Lambda }_h \subset \varvec{Q}_h\) as \(\varvec{\Lambda }_h = \{ \varvec{\mu }_h \in \varvec{Q}_h : \vert \varvec{\mu }_h \vert \le 1~\hbox { a.e.~in}\ \Omega \}\). Let \({\mathcal {T}}_h\) be a shape regular triangulation of \(\Omega \) and \(h_T\) denote the diameter of \(T \in {\mathcal {T}}_h\). Further let \({\mathcal {E}}_h\) denote the set of edges with length \(h_E\) for all \(E \in {\mathcal {E}}_h\), for which we have, due to shape regularity, \(h_E \sim h_T\). The discrete norm for \(\varvec{\mu }_h \in \varvec{\Lambda }_h\) is

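$$\begin{aligned} \Vert \mathrm {div}\,\varvec{\mu }_h \Vert _{-1,h}^2 := \sum _{T \in {\mathcal {T}}_h} h_T^2 \Vert \mathrm {div}\,\varvec{\mu }_h \Vert _{0,T}^2 + \sum _{E \in {\mathcal {E}}_h} h_E \Vert [\![ \varvec{\mu }_h \cdot \varvec{n} ]\!] \Vert _{0,E}^2, \end{aligned}$$
(12)

where \([\![ \cdot ]\!]\) is the usual jump operator and \(\varvec{n}\) denotes the edge normal. The discrete formulation reads: find \((u_h, \varvec{\lambda }_h) \in V_h \times \varvec{\Lambda }_h\) such that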

$$\begin{aligned} {\mathcal {B}}(u_h, \varvec{\lambda }_h; v_h, \varvec{\mu }_h - \varvec{\lambda }_h) \le (f,v_h) \quad \forall (v_h, \varvec{\mu }_h) \in V_h \times \varvec{\Lambda }_h. \end{aligned}$$
(13)
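To illustrate the mesh-dependent norm (12), the following Python sketch evaluates it for a piecewise constant field on a two-triangle mesh. It assumes that (12) consists of element-wise divergence terms weighted by \(h_T^2\) and normal-jump terms weighted by \(h_E\), and that only interior edges contribute; the function name and the mesh data layout are hypothetical. For piecewise constants the element-wise divergence vanishes, so only the normal jumps remain.

```python
import numpy as np

def p0_mesh_norm_sq(points, triangles, xi):
    """Squared mesh-dependent norm of a piecewise constant vector field.

    points: (n, 2) vertex coordinates, triangles: vertex index triples,
    xi: (n_triangles, 2) constant vector per element.
    Assumption: only interior edges contribute to the jump term.
    """
    edge_to_elements = {}
    for t, tri in enumerate(triangles):
        for i in range(3):
            a, b = tri[i], tri[(i + 1) % 3]
            edge_to_elements.setdefault((min(a, b), max(a, b)), []).append(t)

    total = 0.0
    for (a, b), elements in edge_to_elements.items():
        if len(elements) != 2:
            continue                                # skip boundary edges
        tangent = points[b] - points[a]
        h_e = np.linalg.norm(tangent)
        normal = np.array([tangent[1], -tangent[0]]) / h_e
        jump = np.dot(xi[elements[0]] - xi[elements[1]], normal)
        total += h_e * (h_e * jump**2)              # h_E * ||jump||_{0,E}^2
    return total

# Unit square split along the diagonal from (0,0) to (1,1).
points = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
triangles = [(0, 1, 2), (0, 2, 3)]
normal_jump = p0_mesh_norm_sq(points, triangles, np.array([[1.0, 0.0], [0.0, 0.0]]))
tangential_jump = p0_mesh_norm_sq(points, triangles, np.array([[1.0, 1.0], [0.0, 0.0]]))
```

In this sketch a field that jumps only in the tangential direction of the diagonal has zero norm, which reflects that the norm measures only the distributional divergence.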

As in the continuous setting, we can prove stability of the mixed method (13) if the following Babuška–Brezzi condition is valid

$$\begin{aligned} \sup _{v_h \in V_h} \frac{(\varvec{\xi }_h, \nabla v_h)}{\Vert v_h\Vert _1} \gtrsim \Vert \mathrm {div}\,\varvec{\xi }_h \Vert _{-1} \quad \forall \varvec{\xi }_h \in \varvec{Q}_h. \end{aligned}$$
(14)

Theorem 2

Suppose \(V_h\) and \(\varvec{\Lambda }_h\) satisfy (14). Then for every \((w_h, \varvec{\xi }_h) \in V_h \times \varvec{Q}_h\) there exists a function \(r_h \in V_h\) such that

$$\begin{aligned} {\mathcal {B}}(w_h, \varvec{\xi }_h; r_h, -\varvec{\xi }_h) \gtrsim (\Vert w_h \Vert _1 + \Vert \mathrm {div}\,\varvec{\xi }_h \Vert _{-1})^2, \end{aligned}$$
(15)

and

$$\begin{aligned} \Vert r_h\Vert _1 \lesssim \Vert w_h \Vert _1 + \Vert \mathrm {div}\,\varvec{\xi }_h \Vert _{-1}. \end{aligned}$$
(16)

Proof

This is similar to the proof of Theorem 1 but using (14) in the intermediate step (11). \(\square \)
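One possible way to spell out the modified step, given here only as a sketch with the auxiliary function \(q_h\) introduced for illustration: let \(q_h \in V_h\) solve \((\nabla q_h, \nabla z_h) = (\varvec{\xi }_h, \nabla z_h)\) for all \(z_h \in V_h\). Then (14) and the Cauchy–Schwarz inequality give

$$\begin{aligned} \Vert \mathrm {div}\,\varvec{\xi }_h \Vert _{-1} \lesssim \sup _{v_h \in V_h} \frac{(\varvec{\xi }_h, \nabla v_h)}{\Vert v_h\Vert _1} = \sup _{v_h \in V_h} \frac{(\nabla q_h, \nabla v_h)}{\Vert v_h\Vert _1} \le \Vert \nabla q_h \Vert _0, \end{aligned}$$

which replaces (11). With \(r_h := w_h + q_h\) the remaining arguments are the same as in the continuous case.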

An explicit proof of condition (14) might be difficult depending on the choice of \(V_h\) and \(\varvec{Q}_h\). To this end, we show that it is sufficient to prove a discrete condition using the mesh dependent norm (12), see Theorem 3. In order to prove Theorem 3, we first consider the following preliminary result. In the following, let \(\Pi _h: L^2(\Omega ) \rightarrow V_h\) be the Clément quasi interpolation operator [17] with the stability and interpolation properties

$$\begin{aligned}&\Vert \Pi _hv \Vert _1 \le C_s \Vert v \Vert _1 \quad \forall v \in V, \end{aligned}$$
(17a)
$$\begin{aligned}&\quad \Big (\sum _{T \in {\mathcal {T}}_h} h_T^{-2} \Vert v - \Pi _hv\Vert _{0,T}^2 + \sum _{E \in {\mathcal {E}}_h} h_E^{-1} \Vert v - \Pi _hv\Vert _{0,E}^2 \Big )^{1/2} \le C_i \Vert v \Vert _1 \quad \forall v \in V. \end{aligned}$$
(17b)

Lemma 1

There exist constants \(C_1, C_2 > 0\) such that

$$\begin{aligned} \sup \limits _{v_h \in V_h} \frac{(\varvec{\xi }_h,\nabla v_h)}{\Vert v_h \Vert _1} \ge C_1 \Vert \mathrm {div}\,\varvec{\xi }_h \Vert _{-1} - C_2 \Vert \mathrm {div}\,\varvec{\xi }_h \Vert _{-1, h} \quad \forall \varvec{\xi }_h \in \varvec{Q}_h. \end{aligned}$$
(18)

Proof

Choose an arbitrary \(\varvec{\xi }_h \in \varvec{Q}_h\). By Theorem 1 there exists a function \(r \in V\) such that

$$\begin{aligned} (\varvec{\xi }_h,\nabla r) \ge C' \Vert r\Vert _1 \Vert \mathrm {div}\,\varvec{\xi }_h \Vert _{-1}. \end{aligned}$$

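Using the Clément operator, element-wise integration by parts, the Cauchy–Schwarz inequality and (17b), we have for the difference

$$\begin{aligned} (\varvec{\xi }_h,\nabla (\Pi _hr - r))&= -\sum _{T \in {\mathcal {T}}_h} (\mathrm {div}\,\varvec{\xi }_h, \Pi _hr - r)_T + \sum _{E \in {\mathcal {E}}_h} ([\![ \varvec{\xi }_h \cdot \varvec{n} ]\!], \Pi _hr - r)_E \\&\ge - 2C_i \Vert r \Vert _1 \Vert \mathrm {div}\,\varvec{\xi }_h \Vert _{-1, h}. \end{aligned}$$

Thus, in total we have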

$$\begin{aligned} (\varvec{\xi }_h,\nabla \Pi _hr)&= (\varvec{\xi }_h,\nabla (\Pi _hr - r)) + (\varvec{\xi }_h,\nabla r) \\&\ge - 2C_i \Vert r \Vert _1 \Vert \mathrm {div}\,\varvec{\xi }_h \Vert _{-1, h} + C' \Vert r\Vert _1 \Vert \mathrm {div}\,\varvec{\xi }_h \Vert _{-1} \\&\ge - C_2 \Vert \mathrm {div}\,\varvec{\xi }_h \Vert _{-1, h} \Vert \Pi _hr \Vert _1 + C_1 \Vert \mathrm {div}\,\varvec{\xi }_h \Vert _{-1} \Vert \Pi _hr \Vert _1, \end{aligned}$$

where \(C_2 = 2C_i/C_s\) and \(C_1 = C'/C_s\), which proves (18). \(\square \)

Theorem 3

If the discrete Babuška–Brezzi condition

$$\begin{aligned} \sup _{v_h \in V_h} \frac{(\varvec{\xi }_h, \nabla v_h)}{\Vert v_h\Vert _1} \gtrsim \Vert \mathrm {div}\,\varvec{\xi }_h \Vert _{-1,h} \quad \forall \varvec{\xi }_h \in \varvec{Q}_h, \end{aligned}$$
(19)

holds true, then the discrete spaces also fulfill (14).

Proof

Suppose that (19) is valid with a constant \(C_3 > 0\). Then, for a convex combination with \(t \in (0,1)\), Lemma 1 gives

$$\begin{aligned} \sup _{v_h \in V_h} \frac{(\varvec{\xi }_h, \nabla v_h)}{\Vert v_h\Vert _1}&= t \sup _{v_h \in V_h} \frac{(\varvec{\xi }_h, \nabla v_h)}{\Vert v_h\Vert _1} + (1 - t) \sup _{v_h \in V_h} \frac{(\varvec{\xi }_h, \nabla v_h)}{\Vert v_h\Vert _1} \\&\gtrsim t (C_1 \Vert \mathrm {div}\,\varvec{\xi }_h \Vert _{-1} - C_2 \Vert \mathrm {div}\,\varvec{\xi }_h \Vert _{-1, h}) + (1-t) C_3 \Vert \mathrm {div}\,\varvec{\xi }_h \Vert _{-1, h} \\&\gtrsim {\tilde{C}} \Vert \mathrm {div}\,\varvec{\xi }_h \Vert _{-1}, \end{aligned}$$

where we have chosen \(t := \frac{1}{2} C_3(C_2 + C_3)^{-1}\) and thus \({\tilde{C}} = C_1C_3/(2 (C_2 + C_3))\). \(\square \)
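The role of this particular choice of \(t\) can be checked directly. Collecting the terms that multiply \(\Vert \mathrm {div}\,\varvec{\xi }_h \Vert _{-1, h}\) gives

$$\begin{aligned} (1-t) C_3 - t C_2 = C_3 - t (C_2 + C_3) = \frac{C_3}{2} > 0, \end{aligned}$$

so the mesh-dependent term is non-negative and can be dropped. What remains is \(t C_1 \Vert \mathrm {div}\,\varvec{\xi }_h \Vert _{-1}\) with \(t C_1 = C_1C_3/(2 (C_2 + C_3))\).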

3.1 Some stable discretizations

Theorem 3 shows that it is sufficient to prove condition (19) for some finite element spaces \(V_h\) and \(\varvec{Q}_h\). In the following we discuss some stable choices. Let \({\mathbb {P}}^l(K)\) denote the space of polynomials of order \(l \ge 0\) on \(K \in {\mathcal {T}}_h\), and let \({\mathbb {P}}^l(K, {\mathbb {R}}^2)\) denote its vector-valued version. The same notation is used for polynomials on \(E \in {\mathcal {E}}_h\).

3.1.1 The \(P^{k}P^{k-2}\) family

For \(k\ge 2\), we choose the spaces

$$\begin{aligned} V_h&:= \{v_h \in V: v_h\vert _T \in {\mathbb {P}}^k(T)~\forall T \in {\mathcal {T}}_h\}, \end{aligned}$$
(20a)
$$\begin{aligned} \varvec{Q}_h&:= \{\varvec{\mu }_h \in \varvec{Q}: \varvec{\mu }_h\vert _T \in {\mathbb {P}}^{k-2}(T, {\mathbb {R}}^2)~\forall T \in {\mathcal {T}}_h\}. \end{aligned}$$
(20b)

Lemma 2

The discrete spaces defined by (20) fulfill the discrete condition (19).

Proof

Let \(\varvec{\xi }_h \in \varvec{Q}_h\) be arbitrary. We choose \(v_h \in V_h\) such that

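$$\begin{aligned} v_h(x_V)&= 0&\quad&\text {for all vertices } x_V \text { of } {\mathcal {T}}_h, \end{aligned}$$
(21a)
$$\begin{aligned} \int _E v_h l \,\mathrm {d}s&= \int _E h_E [\![ \varvec{\xi }_h \cdot \varvec{n} ]\!] l \,\mathrm {d}s&\quad&\forall l \in {\mathbb {P}}^{k-2}(E), \forall E \in {\mathcal {E}}_h, \end{aligned}$$
(21b)
$$\begin{aligned} \int _T v_h l \,\mathrm {d}x&=-\int _T h_T^2 \mathrm {div}\,\varvec{\xi }_h l \,\mathrm {d}x&\quad&\forall l \in {\mathbb {P}}^{k-3}(T), \forall T \in {\mathcal {T}}_h. \end{aligned}$$
(21c)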

Element-wise integration by parts and using that \([\![ \varvec{\xi }_h \cdot \varvec{n} ]\!] \in {\mathbb {P}}^{k-2}(E)\) for all edges \(E \in {\mathcal {E}}_h\), and \(\mathrm {div}\,\varvec{\xi }_h \in {\mathbb {P}}^{k-3}(T)\) for all elements \(T \in {\mathcal {T}}_h\), we have

$$\begin{aligned} (\nabla v_h, \varvec{\xi }_h) = \Vert \mathrm {div}\,\varvec{\xi }_h \Vert _{-1,h}^2. \end{aligned}$$

By a standard scaling argument we have on each element \(T \in {\mathcal {T}}_h\)

$$\begin{aligned} \begin{aligned} \Vert \nabla v_h \Vert ^2_{0,T}&\sim \sum \limits _{E \subset \partial T} \frac{1}{h_E} \Vert v_h \Vert ^2_{0,E} + \frac{1}{h_T^2}\Vert v_h \Vert ^2_{0,T} \\&\sim \sum \limits _{E \subset \partial T} \frac{1}{h_E} \Vert \Pi ^{k-2}_E v_h \Vert ^2_{0,E} + \frac{1}{h_T^2}\Vert \Pi ^{k-3}_T v_h \Vert ^2_{0,T}, \end{aligned} \end{aligned}$$
(22)

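where \(\Pi ^{k-2}_E\) and \(\Pi ^{k-3}_T\) are the edge-wise and element-wise \(L^2\)-projection onto polynomials of order \(k-2\) and \(k-3\), respectively. Note that the second equivalence follows due to \(v_h\) vanishing at all vertices, see (21a). Using this equivalence we have by the moments (21) and the Cauchy–Schwarz inequality

$$\begin{aligned} \Vert \nabla v_h \Vert ^2_{0,T} \lesssim \sum \limits _{E \subset \partial T} h_E \Vert [\![ \varvec{\xi }_h \cdot \varvec{n} ]\!] \Vert ^2_{0,E} + h_T^2 \Vert \mathrm {div}\,\varvec{\xi }_h \Vert ^2_{0,T}. \end{aligned}$$

Summing over all elements and using the norm equivalence (22) we conclude the proof.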

\(\square \)

3.1.2 The MINI family

For \(k \ge 1\), we choose the spaces

$$\begin{aligned} V_h&:= \{v_h \in V: v_h\vert _T \in {\mathbb {P}}^{k+2}(T)~\forall T \in {\mathcal {T}}_h, v_h\vert _E \in {\mathbb {P}}^k(E)~\forall E \in {\mathcal {E}}_h\},\end{aligned}$$
(23a)
$$\begin{aligned} \varvec{Q}_h&:= \{\varvec{\mu }_h \in \varvec{Q}: \varvec{\mu }_h\vert _T \in {\mathbb {P}}^k(T, {\mathbb {R}}^2)~\forall T \in {\mathcal {T}}_h\} \cap H^1(\Omega , {\mathbb {R}}^2). \end{aligned}$$
(23b)

Note that since now \(\varvec{Q}_h\) is a subset of \(H^1(\Omega ,{\mathbb {R}}^2)\), the normal jumps in \(\Vert \mathrm {div}\,(\cdot ) \Vert _{-1, h}\) vanish.

Lemma 3

The discrete spaces defined by (23) fulfill the discrete condition (19).

Proof

Let \(\varvec{\xi }_h \in \varvec{Q}_h\) be arbitrary. We choose \(v_h \in V_h\) such that it vanishes at all vertices and all edges, i.e. \(v_h \in H^1_0(T)\) for all elements \(T \in {\mathcal {T}}_h\). In addition \(v_h\) fulfills

$$\begin{aligned} \int _T v_h l \,\mathrm {d}x&=-\int _T h_T^2 \mathrm {div}\,\varvec{\xi }_h l \,\mathrm {d}x&\quad&\forall l \in {\mathbb {P}}^{k-1}(T), \forall T \in {\mathcal {T}}_h. \end{aligned}$$

Using integration by parts and the fact that \(\mathrm {div}\,\varvec{\xi }_h \in {\mathbb {P}}^{k-1}(T)\) for all elements \(T \in {\mathcal {T}}_h\), we have again

$$\begin{aligned} (\nabla v_h, \varvec{\xi }_h) = \Vert \mathrm {div}\,\varvec{\xi }_h \Vert _{-1, h}^2. \end{aligned}$$

With similar scaling arguments as in the proof of Lemma 2 we also have \(\Vert \nabla v_h \Vert _0 \lesssim \Vert \mathrm {div}\,\varvec{\xi }_h \Vert _{-1, h}\), which concludes the proof. \(\square \)

Remark 1

The Crouzeix–Raviart method. The last method we want to mention is using a nonconforming approximation of the velocity. We define the spaces

$$\begin{aligned} V_h&:= \{v_h \in L^2(\Omega , {\mathbb {R}}): v_h\vert _T \in {\mathbb {P}}^1(T)~\forall T \in {\mathcal {T}}_h, \\&\qquad v_h~\text {is continuous and vanishes at midpoints} \\&\qquad \text {of interior and boundary edges, respectively}\},\\ \varvec{Q}_h&:= \{\varvec{\mu }_h \in L^2(\Omega , {\mathbb {R}}^2) : \varvec{\mu }_h\vert _T \in {\mathbb {P}}^0(T, {\mathbb {R}}^2)~\forall T \in {\mathcal {T}}_h\}. \end{aligned}$$

Since the degrees of freedom of the velocity are again associated with edges, the stability analysis is similar to that for the \(P^2P^0\) method. Further note that since \(\nabla v_h \in \varvec{Q}_h\) locally on each element, one can reformulate the mixed method (13) as a primal method (without the Lagrange multiplier \(\varvec{\lambda }_h\)) which is similar to the nonconforming approximations from [10] and (as explained in [10]) the method in [11]. Due to the extensive analysis therein, we do not consider this method in the present work, but want to mention that our techniques can be applied accordingly.

4 A priori error analysis

In this section we present an a priori error estimate and prove linear convergence for \(H^2\)-regular velocity solutions. This stands in contrast to the suboptimal result \({\mathcal {O}}(h^{1/2})\) (for a linear approximation) from [7, 9] and is in accordance with the linear convergence results from [10, 11]. Although the analysis could be extended to provide a better rate for smooth solutions, higher regularity cannot be expected for Bingham-type flows as discussed in the introduction.

Theorem 4

Let \((u_h, \varvec{\lambda }_h) \in V_h \times \varvec{\Lambda }_h\) be the solution of (13), then for any \((w_h, \varvec{\xi }_h) \in V_h \times \varvec{Q}_h\) it holds

$$\begin{aligned} \Vert u - u_h \Vert _1&+ \Vert \mathrm {div}\,(\varvec{\lambda }- \varvec{\lambda }_h) \Vert _{-1} \\&\lesssim \Vert u - w_h \Vert _1 + \Vert \mathrm {div}\,(\varvec{\lambda }- \varvec{\xi }_h) \Vert _{-1} + \sqrt{(\nabla u, \varvec{\lambda }- \varvec{\xi }_h)}. \end{aligned}$$

Proof

Let \((w_h, \varvec{\xi }_h) \in V_h \times \varvec{Q}_h\) be arbitrary. By the discrete stability, see Theorem 2, we find \(v_h\) such that

$$\begin{aligned} \Big ( \Vert u_h - w_h \Vert _1 + \Vert \mathrm {div}\,(\varvec{\lambda }_h - \varvec{\xi }_h) \Vert _{-1} \Big )^2&\lesssim {\mathcal {B}}(u_h - w_h, \varvec{\lambda }_h - \varvec{\xi }_h; v_h, \varvec{\xi }_h - \varvec{\lambda }_h) \\&\le (f, v_h) - {\mathcal {B}}(w_h, \varvec{\xi }_h; v_h, \varvec{\xi }_h - \varvec{\lambda }_h). \end{aligned}$$

Using the continuous problem (3) we get

$$\begin{aligned}&(f, v_h) - {\mathcal {B}}(w_h, \varvec{\xi }_h; v_h, \varvec{\xi }_h - \varvec{\lambda }_h) \\&=(\nabla u, \varvec{\lambda }_h - \varvec{\xi }_h) + {\mathcal {B}}(u - w_h, \varvec{\lambda }- \varvec{\xi }_h; v_h, \varvec{\xi }_h - \varvec{\lambda }_h) \\&\le (\nabla u, \varvec{\lambda }- \varvec{\xi }_h) + {\mathcal {B}}(u - w_h, \varvec{\lambda }- \varvec{\xi }_h; v_h, \varvec{\xi }_h - \varvec{\lambda }_h), \end{aligned}$$

which concludes the proof with the continuity of \({\mathcal {B}}\), see (5), and

$$\begin{aligned}&\Vert u - u_h \Vert _1 + \Vert \mathrm {div}\,(\varvec{\lambda }- \varvec{\lambda }_h) \Vert _{-1} \\&\le \Vert u - w_h \Vert _1 + \Vert \mathrm {div}\,(\varvec{\lambda }- \varvec{\xi }_h) \Vert _{-1} + \Vert u_h - w_h \Vert _1 + \Vert \mathrm {div}\,(\varvec{\lambda }_h - \varvec{\xi }_h) \Vert _{-1}. \end{aligned}$$

\(\square \)

The following lemma shows that we can expect linear convergence whenever the solution is at least \(H^2\)-regular. Note that it is essential to bound the error just in terms of \(\mathrm {div}\,\varvec{\lambda }\), since our analysis does not provide any control for the divergence-free part of \(\varvec{\lambda }\). According to [7] we further have for a convex domain and a smooth solution the stability estimate

$$\begin{aligned} \vert u \vert _2 + \Vert \mathrm {div}\,\varvec{\lambda }\Vert _0 \lesssim \Vert f\Vert _{0}. \end{aligned}$$

Lemma 4

Let \(\Omega \) be simply connected and convex. Choose \(V_h\) and \(\varvec{Q}_h\) as in Sect. 3.1, and let \((u_h, \varvec{\lambda }_h) \in V_h \times \varvec{\Lambda }_h\) be the corresponding discrete solution. Further let \(u \in V \cap H^2(\Omega )\) and \(\varvec{\lambda }\in \varvec{\Lambda }\cap H(\mathrm {div}\,, \Omega )\). Then there holds

$$\begin{aligned} \Vert u - u_h \Vert _1 + \Vert \mathrm {div}\,(\varvec{\lambda }- \varvec{\lambda }_h) \Vert _{-1} \lesssim h ( \vert u \vert _2 + \Vert \mathrm {div}\,\varvec{\lambda }\Vert _0 ) \lesssim h \Vert f \Vert _0, \end{aligned}$$

where \(h = \max \limits _{T \in {\mathcal {T}}_h} {\text {diam}}({T})\).

Proof

We solve the Dirichlet problem: find \(\theta \in H^1_0(\Omega )\) such that

$$\begin{aligned} (\nabla \theta , \nabla v) = (\mathrm {div}\,\varvec{\lambda }, v) \quad \forall v \in H_0^1(\Omega ), \end{aligned}$$

for which we have due to the assumptions on the domain that \(\vert \theta \vert _2 \lesssim \Vert \mathrm {div}\,\varvec{\lambda }\Vert _0\). Further, since \(\varvec{\lambda }- \nabla \theta \) is divergence free (by construction) Theorem 3.1 in [18] shows that there exists a \(\phi \in H^1(\Omega )\) such that \(\varvec{\lambda }= \nabla \theta + \mathbf {curl}\,\phi \).

Let \(w_h := I_{{\mathcal {L}}}u\), where \(I_{{\mathcal {L}}}\) is the Lagrange interpolation operator onto \(V_h\), then by the approximation properties of \(I_{{\mathcal {L}}}\) we have

$$\begin{aligned} \Vert u - w_h \Vert _1 \lesssim h \vert u \vert _2. \end{aligned}$$

Using integration by parts and that \(\mathbf {curl}\,\nabla v = 0\) for all \(v \in V\) we have

$$\begin{aligned} \Vert \mathrm {div}\,(\varvec{\lambda }- \varvec{\xi }_h) \Vert _{-1}&= \sup _{v \in V} \frac{(\varvec{\lambda }- \varvec{\xi }_h, \nabla v)}{\Vert v \Vert _1} = \sup _{v \in V} \frac{(\varvec{\lambda }- \mathbf {curl}\,\phi - \varvec{\xi }_h, \nabla v)}{\Vert v \Vert _1}. \end{aligned}$$

For the \(P^kP^{k-2}\) family we choose \(\varvec{\xi }_h = \Pi _0\nabla \theta \), then

$$\begin{aligned} (\varvec{\lambda }- \mathbf {curl}\,\phi - \varvec{\xi }_h, \nabla v)&= ( ({\text {id}}- \Pi _0) \nabla \theta , \nabla v) \lesssim h \vert \theta \vert _2 \Vert \nabla v \Vert _0 \lesssim h \Vert \mathrm {div}\,\varvec{\lambda }\Vert _0 \Vert \nabla v \Vert _0. \end{aligned}$$

It remains to bound the last term. For this note that since \({\text {id}}- \Pi _0\) is orthogonal to constants we have with similar steps as above

$$\begin{aligned} (\nabla u, \varvec{\lambda }- \varvec{\xi }_h) \!=\! ( \nabla u, ({\text {id}}- \Pi _0) \nabla \theta ) \!=\! (({\text {id}}- \Pi _0) \nabla u,\!({\text {id}}- \Pi _0) \nabla \theta )\! \lesssim h \vert u \vert _ 2 h \Vert \mathrm {div}\,\varvec{\lambda }\Vert _0, \end{aligned}$$

from which we conclude the proof.

For the MINI family, we use the same steps as above, but choose \(\varvec{\xi }_h = \Pi _h\nabla \theta \). Using Theorem 2.6 from [19] we get again the bound

$$\begin{aligned} (\nabla u, \varvec{\lambda }- \varvec{\xi }_h) \lesssim h \vert u \vert _ 2 h \Vert \mathrm {div}\,\varvec{\lambda }\Vert _0, \end{aligned}$$

and we conclude the proof with Theorem 4. \(\square \)
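The predicted linear rate can be checked in practice by computing experimental orders of convergence from errors measured on a sequence of meshes. The following Python sketch shows the standard formula; the helper name is hypothetical, and the sample values are synthetic data that decay exactly like \(Ch\), not results from this work.

```python
import math

def convergence_rates(hs, errors):
    """Experimental orders of convergence between consecutive mesh levels."""
    return [
        math.log(errors[i] / errors[i + 1]) / math.log(hs[i] / hs[i + 1])
        for i in range(len(hs) - 1)
    ]

# Synthetic errors proportional to h, used only to exercise the formula.
hs = [0.5, 0.25, 0.125]
errors = [0.3 * h for h in hs]
rates = convergence_rates(hs, errors)
```

For measured errors of the discrete solution, rates close to one would be consistent with the estimate of Lemma 4.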

5 A posteriori error analysis

Since a high regularity of the solution cannot be expected in general, this section is dedicated to a posteriori error control, enabling the use of adaptive mesh refinement. We define the local error estimators – including the dependency on g and \(\mu \) to allow for a direct implementation – as

and the global estimator

$$\begin{aligned} \eta := \sqrt{ \sum _{T \in {\mathcal {T}}_h} \eta _T^2 + \sum _{E \in {\mathcal {E}}_h} \eta _E^2 + \sum _{T \in {\mathcal {T}}_h} \eta _{{\text {con}}, T}^2}. \end{aligned}$$

The element and edge estimators \(\eta _T\) and \(\eta _E\), respectively, are standard residual estimators as known from the literature. The additional term \(\eta _{{\text {con}}, T}\) can be interpreted as a consistency estimator of Eq. (1b). Further we want to emphasize that all estimators only depend on the distributional divergence of \(\varvec{\lambda }_h\) for which we have discrete stability, see Theorem 2. While this is clear for \(\eta _T\) and \(\eta _E\), through integration by parts this is also evident for \(\eta _{{\text {con}}, T}\).
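The way the local contributions enter the global estimator can be sketched in a few lines of Python. The arrays of local values are assumed to be given, and the bulk (Dörfler) marking shown afterwards is one common strategy for adaptive refinement; it is an assumption of this sketch and not necessarily the strategy used in this work.

```python
import numpy as np

def global_estimator(eta_T, eta_E, eta_con):
    """Combine local element, edge and consistency estimators into eta."""
    return float(np.sqrt(np.sum(eta_T**2) + np.sum(eta_E**2) + np.sum(eta_con**2)))

def bulk_marking(indicators_sq, theta=0.5):
    """Indices of a smallest set whose squared indicators reach the fraction
    theta of the total (bulk marking, assumed here for illustration)."""
    order = np.argsort(indicators_sq)[::-1]          # largest indicators first
    cumulative = np.cumsum(indicators_sq[order])
    n_marked = int(np.searchsorted(cumulative, theta * cumulative[-1])) + 1
    return order[:n_marked]

eta = global_estimator(np.array([3.0]), np.array([4.0]), np.array([0.0]))
marked = bulk_marking(np.array([4.0, 1.0, 1.0, 2.0]), theta=0.5)
```

In an actual implementation the edge contributions would first have to be distributed to the neighboring elements before marking.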

Theorem 5

There holds the a posteriori error estimate

$$\begin{aligned} \Vert u - u_h \Vert _1 + \Vert \mathrm {div}\,(\varvec{\lambda }- \varvec{\lambda }_h) \Vert _{-1} \lesssim \eta . \end{aligned}$$

Proof

Using the continuous stability we find \(v \in V\) such that

$$\begin{aligned}&(\Vert u - u_h \Vert _1 + \Vert \mathrm {div}\,(\varvec{\lambda }- \varvec{\lambda }_h) \Vert _{-1})^2 \\&\lesssim {\mathcal {B}}(u - u_h,\varvec{\lambda }- \varvec{\lambda }_h; v,\varvec{\lambda }_h - \varvec{\lambda }) \\&= (\nabla (u - u_h), \nabla v) + (\nabla v, \varvec{\lambda }- \varvec{\lambda }_h) + (\nabla (u - u_h), \varvec{\lambda }_h - \varvec{\lambda }), \end{aligned}$$

and \(\Vert v \Vert _1 \lesssim \Vert u - u_h \Vert _1 + \Vert \mathrm {div}\,(\varvec{\lambda }- \varvec{\lambda }_h) \Vert _{-1}\). We continue with the first two terms. Using the Clément operator we have

$$\begin{aligned} (\nabla u_h, \nabla \Pi _hv) + (\nabla \Pi _hv, \varvec{\lambda }_h) = (f, \Pi _hv) = (\nabla u, \nabla \Pi _hv) + (\nabla \Pi _hv, \varvec{\lambda }), \end{aligned}$$

and thus

$$\begin{aligned}&(\nabla (u - u_h), \nabla v) + (\nabla v, \varvec{\lambda }- \varvec{\lambda }_h) \\&= (\nabla (u - u_h), \nabla (v - \Pi _hv) ) + (\nabla (v - \Pi _hv), \varvec{\lambda }- \varvec{\lambda }_h) \\&= \sum _{T \in {\mathcal {T}}_h} (\nabla (u - u_h), \nabla (v - \Pi _hv) )_T + (\nabla (v - \Pi _hv), \varvec{\lambda }- \varvec{\lambda }_h)_T. \end{aligned}$$

Since \(\mathrm {div}\,(-\nabla u - \varvec{\lambda }) = f\), see (1a), and \(f \in L^2(\Omega )\), we have that \(\nabla u + \varvec{\lambda }\in H(\mathrm {div}\,, \Omega )\), i.e. it is normal continuous. By that we have with integration by parts on each element

Using the properties of \(\Pi _h\), cf. (17), we finally arrive at

$$\begin{aligned} (\nabla (u - u_h), \nabla v)&+ (\nabla v, \varvec{\lambda }- \varvec{\lambda }_h) \lesssim \Big ( \sum _{T \in {\mathcal {T}}_h} \eta _T^2 + \sum _{E \in {\mathcal {E}}_h} \eta _E^2\Big )^{1/2} \Vert v \Vert _1. \end{aligned}$$

It remains to bound the other term. For this note that (2b) gives \((\nabla u, \varvec{\lambda }_h) \le (\nabla u, \varvec{\lambda })\), and thus as \(\vert \varvec{\lambda }\vert \le 1\),

$$\begin{aligned} (\nabla (u - u_h), \varvec{\lambda }_h - \varvec{\lambda }) \le (\nabla u_h, \varvec{\lambda }- \varvec{\lambda }_h)&\le \sum _{T \in {\mathcal {T}}_h} \int _T (\vert \nabla u_h \vert \vert \varvec{\lambda }\vert - \nabla u_h \cdot \varvec{\lambda }_h) \,\mathrm {d}x\\&\le \sum _{T \in {\mathcal {T}}_h} \int _T (\vert \nabla u_h \vert - \nabla u_h \cdot \varvec{\lambda }_h) \,\mathrm {d}x, \end{aligned}$$

which concludes the proof. \(\square \)
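The last bound of the proof is easy to evaluate when \(\nabla u_h\) and \(\varvec{\lambda }_h\) are constant on an element, as for the lowest-order pair. The following Python sketch computes the element contribution \(\int _T (\vert \nabla u_h \vert - \nabla u_h \cdot \varvec{\lambda }_h) \,\mathrm {d}x\) for \(g = \mu = 1\); the function name and the argument layout are hypothetical.

```python
import numpy as np

def consistency_contribution(grad_u, lam, area):
    """Element integral of |grad u_h| - grad u_h . lambda_h for constant data.

    Assumes g = mu = 1 and |lam| <= 1, so that the value is non-negative.
    """
    grad_u = np.asarray(grad_u, dtype=float)
    lam = np.asarray(lam, dtype=float)
    return area * (np.linalg.norm(grad_u) - np.dot(grad_u, lam))

# Multiplier aligned with the gradient: condition (1b) holds exactly.
aligned = consistency_contribution([3.0, 4.0], [0.6, 0.8], area=0.5)
# Vanishing multiplier on a yielded element: the term measures the mismatch.
mismatch = consistency_contribution([3.0, 4.0], [0.0, 0.0], area=0.5)
```

The contribution vanishes exactly when the discrete pair satisfies (1b) on the element, and it is non-negative whenever \(\vert \varvec{\lambda }_h \vert \le 1\).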

Theorem 6 and Lemma 5 below provide local and global efficiency estimates, respectively, for the residual based estimators \(\eta _T\) and \(\eta _E\). The proofs follow with similar steps as in [15], i.e. we will provide all details of the local efficiency but refer to [15] for the proof of Lemma 5. Further note that similarly as in [15] it is not possible to provide an upper bound for the consistency error \(\eta _{{\text {con}}, T}\).

For the efficiency estimates we need some additional notation. Let \(\omega \subset \Omega \) be arbitrary then we define for all \(\varvec{\mu }\in \varvec{\Lambda }\) the local dual norm by

$$\begin{aligned} \Vert \mathrm {div}\,\varvec{\mu }\Vert _{-1,\omega } := \sup \limits _{v \in H^1_0(\omega )} \frac{\langle v, \mathrm {div}\,\varvec{\mu }\rangle _{-1,\omega }}{\Vert v \Vert _{1,\omega }} = \sup \limits _{v \in H^1_0(\omega )} \frac{(\nabla v, \varvec{\mu })_{\omega }}{\Vert v \Vert _{1,\omega }}. \end{aligned}$$

The subset \(\omega \) will be either an element \(T \in {\mathcal {T}}_h\) or \(\omega _E\), where \(\omega _E\) denotes the edge-patch for a given edge \(E \in {\mathcal {E}}_h\). Finally, let \(f_h := \Pi ^q f\) be the element-wise \(L^2\) projection onto \({\mathbb {P}}^q(K)\) where q is the polynomial order of the space \(\varvec{Q}_h\), and let

$$\begin{aligned} {\text {osc}}_T(f)&:= h_T \Vert f - f_h \Vert _{0,T} \quad \text {and} \quad {\text {osc}}(f) := \Big ( \sum _{T \in {\mathcal {T}}_h} h^2_T \Vert f - f_h \Vert ^2_{0,T} \Big )^{1/2}. \end{aligned}$$

Theorem 6

Let \(v_h \in V_h\) and \(\varvec{\mu }_h \in \varvec{\Lambda }_h\) be arbitrary. There holds the local efficiency

Proof

The proof commences with the usual localizing technique by means of an element-wise cubic bubble function \(b_T\). We define the localized error on T by

$$\begin{aligned} \delta _T\vert _T := h_T^2 b_T (\Delta v_h + \mathrm {div}\,\varvec{\mu }_h + f_h), \end{aligned}$$

and \(\delta _T = 0\) on \(\Omega \setminus T\). Since \(b_T\) vanishes on the element boundary we have that \(\delta _T \in V\). Using the norm equivalence for polynomial spaces we then have

$$\begin{aligned}&h_T^2 \Vert \Delta v_h + \mathrm {div}\,\varvec{\mu }_h + f_h \Vert ^2_{0,T} \\&\lesssim h_T^2 \Vert b_T^{1/2} (\Delta v_h + \mathrm {div}\,\varvec{\mu }_h + f_h) \Vert ^2_{0,T}\\&= ( \Delta v_h + \mathrm {div}\,\varvec{\mu }_h + f_h, \delta _T)_T \\&= ( \Delta v_h + \mathrm {div}\,\varvec{\mu }_h, \delta _T)_T + (f, \delta _T)_T + (f_h - f, \delta _T)_T \\&= ( \Delta v_h + \mathrm {div}\,\varvec{\mu }_h, \delta _T)_T + (-\Delta u, \delta _T)_T - \langle \mathrm {div}\,\varvec{\lambda }, \delta _T \rangle _{-1,T} + (f_h - f, \delta _T)_T, \end{aligned}$$

and, after integration by parts, also

$$\begin{aligned}&h_T^2 \Vert \Delta v_h + \mathrm {div}\,\varvec{\mu }_h + f_h \Vert ^2_{0,T} \nonumber \\&\lesssim ( \nabla (u - v_h) , \nabla \delta _T)_T + \langle \mathrm {div}\,(\varvec{\mu }_h - \varvec{\lambda }), \delta _T\rangle _{-1, T} + (f_h - f, \delta _T)_T. \end{aligned}$$
(24)

By the inverse inequality for polynomials we have

$$\begin{aligned} \Vert \delta _T \Vert _{1, T} \lesssim h^{-1}_T \Vert \delta _T\Vert _{0, T} \sim h_T \Vert \Delta v_h + \mathrm {div}\,\varvec{\mu }_h + f_h \Vert _{0,T}, \end{aligned}$$
(25)

and thus, by the Cauchy–Schwarz inequality, we obtain the first estimate from

$$\begin{aligned}&h_T^2 \Vert \Delta v_h + \mathrm {div}\,\varvec{\mu }_h + f_h \Vert ^2_{0,T} \\&\lesssim \Vert u-v_h \Vert _{1,T} \Vert \delta _T\Vert _{1, T} \ + \Vert \mathrm {div}\,(\varvec{\lambda }- \varvec{\mu }_h) \Vert _{-1,T} \Vert \delta _T \Vert _{1, T} + {\text {osc}}_T(f) h_T^{-1} \Vert \delta _T\Vert _{0, T}. \end{aligned}$$

For the other term we proceed similarly. For this let where \({\mathcal {E}}\) is the well known extension operator onto \(H^1_0(\omega _E)\), see [20], and \(b_E\) is the quadratic edge bubble. Scaling arguments and the Poincaré inequality give

With the same steps as for the volume term we derive the estimate

(26)

from which we conclude the proof using the Cauchy–Schwarz inequality, the estimates of the volume term from before and (25). \(\square \)

Lemma 5

Let \(v_h \in V_h\) and \(\varvec{\mu }_h \in \varvec{\Lambda }_h\) be arbitrary. There holds the global efficiency

Proof

The proof follows with the same steps as in [15] using the intermediate estimates (26) and (24). \(\square \)

6 Numerical examples

We apply an iterative algorithm to approximate the solution of the discrete problem (13). It is based on a reformulation of the inequality constraint (2b) as

$$\begin{aligned} \varvec{\lambda }- \varvec{P}(\varvec{\lambda }+ \rho \nabla u) = \varvec{0}, \quad \rho > 0, \end{aligned}$$
(27)

where \(\varvec{P}(\varvec{\mu }) = \tfrac{\varvec{\mu }}{\max (1, \vert \varvec{\mu }\vert )}\) scales any vector of \({\mathbb {R}}^2\) to length at most one; cf. [7, 8] for a discussion of similar algorithms and proofs of their convergence. The reformulation is based on the fact that \(\varvec{\lambda }_h\) in

$$\begin{aligned} (\varvec{\xi }_h - \varvec{\lambda }_h, \varvec{\mu }_h - \varvec{\lambda }_h) \le 0 \quad \forall \varvec{\mu }_h \in \varvec{\Lambda }_h, \end{aligned}$$

is the orthogonal projection of \(\varvec{\xi }_h \in \varvec{Q}_h\) onto \(\varvec{\Lambda }_h\), and the orthogonal projection is alternatively characterized by \(\varvec{P}\) [8, Section 3].
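The pointwise projection \(\varvec{P}\) and its characterization by the variational inequality can be illustrated with a short numerical check. The following is only a sketch: the function name `project_unit_ball` and the sampled test vectors are hypothetical and not part of the original implementation, and the inequality is checked pointwise in \({\mathbb {R}}^2\) rather than in \(\varvec{Q}_h\).

```python
import numpy as np

def project_unit_ball(xi):
    """Pointwise P(xi) = xi / max(1, |xi|) for an array of 2-vectors, shape (n, 2)."""
    norms = np.linalg.norm(xi, axis=-1, keepdims=True)
    return xi / np.maximum(1.0, norms)

# Illustrative inputs (assumption): one vector outside and one inside the unit disc.
xi = np.array([[3.0, 4.0], [0.3, 0.4]])
lam = project_unit_ball(xi)  # the first row is rescaled to unit length, the second is kept

# Sample vectors mu with |mu| <= 1 and evaluate (xi - lam) . (mu - lam),
# which should be non-positive if lam is the orthogonal projection of xi.
rng = np.random.default_rng(0)
angles = rng.uniform(0.0, 2.0 * np.pi, size=1000)
radii = np.sqrt(rng.uniform(0.0, 1.0, size=1000))
mu = radii[:, None] * np.column_stack([np.cos(angles), np.sin(angles)])
worst = max(np.max((mu - l) @ (x - l)) for x, l in zip(xi, lam))
print("largest sampled value of (xi - lam).(mu - lam):", worst)
```

Up to rounding, the largest sampled value should not exceed zero, in line with the projection property stated above.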

Algorithm 1

(Uzawa iteration) Let \((u_h^0, \varvec{\lambda }_h^0) \in V_h \times \varvec{\Lambda }_h\) be an initial guess, TOL a given tolerance and set \(i=1\).

  1. Solve \(u_h^i\) from \((\mu \nabla u_h^i, \nabla v_h) = (f, v_h) - g(\varvec{\lambda }_h^{i-1}, \nabla v_h)\) for every \(v_h \in V_h\).

  2. Calculate \(\varvec{\lambda }_h^i = \varvec{P}(\varvec{\lambda }_h^{i-1} + \rho \pi _h \nabla u_h^i)\) where \(\pi _h : \varvec{Q}\rightarrow \varvec{Q}_h\) is the \(L^2\) projection onto \(\varvec{Q}_h\).

  3. Stop if \(\Vert \nabla (u_h^{i} - u_h^{i-1})\Vert _0 / \Vert \nabla u_h^{i-1} \Vert _0< TOL\). If not, increment i and go to step 1.
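The structure of Algorithm 1 can be sketched on a one-dimensional analogue, \(-\mu u'' - g \lambda ' = f\) on (0, 1) with homogeneous Dirichlet conditions. This is an illustrative assumption, not the two-dimensional setting or the code used for the experiments: the velocity is piecewise linear, the multiplier piecewise constant (so \(\pi _h\) acts as the identity on \(u_h'\)), and the function names and parameter values are hypothetical. The stopping test is skipped while the previous gradient is zero, which happens in the first step for a zero initial guess.

```python
import numpy as np

def project_unit_ball(xi):
    """P(xi) = xi / max(1, |xi|); in 1D this clips values to [-1, 1]."""
    return xi / np.maximum(1.0, np.abs(xi))

def uzawa_1d(n=64, visc=1.0, g=0.5, f=4.0, rho=1.5, tol=1e-10, max_iter=500):
    """Uzawa iteration for a 1D Bingham-type model problem (illustrative sketch)."""
    h = 1.0 / n
    # Stiffness matrix of (visc u', v') for P1 elements on the interior nodes.
    A = visc / h * (2.0 * np.eye(n - 1)
                    - np.eye(n - 1, k=1) - np.eye(n - 1, k=-1))
    load = f * h * np.ones(n - 1)      # (f, v_h) for constant f
    lam = np.zeros(n)                  # one multiplier value per element
    u_prev = np.zeros(n + 1)           # zero initial guess (assumption)
    for it in range(1, max_iter + 1):
        # Step 1: (visc u', v') = (f, v) - g (lam, v'); for the hat function
        # at node j, (lam, v') = lam[j-1] - lam[j].
        rhs = load - g * (lam[:-1] - lam[1:])
        u = np.zeros(n + 1)
        u[1:-1] = np.linalg.solve(A, rhs)
        # Step 2: the gradient of a P1 function is already piecewise constant.
        du = np.diff(u) / h
        lam = project_unit_ball(lam + rho * du)
        # Step 3: relative change of the gradient in the L2 norm.
        du_prev = np.diff(u_prev) / h
        denom = np.sqrt(h) * np.linalg.norm(du_prev)
        change = np.sqrt(h) * np.linalg.norm(du - du_prev)
        u_prev = u
        if denom > 0.0 and change / denom < tol:
            break
    return u, lam, it

u, lam, it = uzawa_1d()
print("iterations:", it, " plug velocity:", u.max())
```

For these parameters the plug occupies \(\vert x - 1/2 \vert \le g/f = 0.125\), and integrating \(\mu u' = f(1/2 - x) - g\) from the wall gives the plug velocity 0.28125, which the sketch should reproduce because the mesh is aligned with the plug boundary.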

Fig. 1. The sequence of uniformly refined meshes for the convergence study

We first approximate an analytical solution on the disc \(\Omega = \{ (x,y) \in {\mathbb {R}}^2 : x^2 + y^2 < R^2 \}\) using uniform mesh refinements; see Fig. 1 for the sequence of meshes. For constant loading f, the coincidence set is a smaller concentric disc of radius \(R_p = 2g/f\). The analytical solution reads \(u(r) = \frac{R - r}{2}( \frac{f}{2}(R + r) - 2g)\) when \(r > R_p\) and is equal to the constant \(u(R_p)\) when \(r \le R_p\). Substituting this expression into the strong formulation (1a), we also obtain an analytical expression for the divergence of \(\varvec{\lambda }\).
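For reference, the analytical solution can be evaluated as in the following sketch. The function names are hypothetical, and the expression for \(\mathrm {div}\,\varvec{\lambda }\) is our own reading of (1a) under the assumption \(\mu = 1\), which the stated formula for u appears to use: in the fluid region \(\varvec{\lambda } = \nabla u / \vert \nabla u \vert \) points radially inward, so \(\mathrm {div}\,\varvec{\lambda } = -1/r\), and in the plug \(\Delta u = 0\), so (1a) gives \(\mathrm {div}\,\varvec{\lambda } = -f/g\).

```python
import numpy as np

def plug_radius(f, g):
    """Radius R_p = 2 g / f of the coincidence set for constant loading."""
    return 2.0 * g / f

def u_exact(r, R=1.0, f=0.5, g=0.1):
    """Velocity profile: quadratic in the fluid region, constant in the plug."""
    r = np.asarray(r, dtype=float)
    Rp = plug_radius(f, g)
    fluid = lambda s: 0.5 * (R - s) * (0.5 * f * (R + s) - 2.0 * g)
    return np.where(r > Rp, fluid(r), fluid(Rp))

def div_lambda_exact(r, f=0.5, g=0.1):
    """Divergence of the multiplier, assuming mu = 1 (see the lead-in)."""
    r = np.asarray(r, dtype=float)
    Rp = plug_radius(f, g)
    safe_r = np.where(r > Rp, r, 1.0)   # avoid dividing by zero inside the plug
    return np.where(r > Rp, -1.0 / safe_r, -f / g)

print("R_p =", plug_radius(0.5, 0.1))
print("u at r = 0, R_p, R:", u_exact([0.0, 0.4, 1.0]))
```

With \(R=1\), \(g=0.1\) and \(f=0.5\) this gives \(R_p = 0.4\) and a plug velocity of 0.045.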

The errors in the different components of the discrete norm are given in Fig. 2 with \(TOL = 10^{-7}\), \(R=1\), \(g=0.1\), \(f=0.5\) and \(\rho = 10\) in accordance with the suggestion in [8, Remark 3]. We observe that for the MINI and \(P^2P^0\) methods all components converge at least linearly, whereas for the \(P^3P^1\) method the \(H^1\) seminorm of \(u-u_h\) is approximately \({\mathcal {O}}(h^{1.7})\) and the discrete norm of \(\varvec{\lambda }- \varvec{\lambda }_h\) is approximately \({\mathcal {O}}(h^{1.6})\), i.e. below the quadratic convergence order that interpolation estimates would imply for a completely smooth solution.
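Observed rates such as those quoted above are typically read off as the slope of the error against h in a log-log plot. A minimal sketch of such a fit follows; the function name is hypothetical and the data are synthetic, generated from a known power law purely to exercise the routine. They are not the errors reported in Fig. 2.

```python
import numpy as np

def observed_rate(h, err):
    """Least-squares slope of log(err) against log(h)."""
    slope, _ = np.polyfit(np.log(h), np.log(err), 1)
    return slope

# Synthetic data err = C h^p with C = 0.3 and p = 1.7 (NOT measured values).
h = 0.5 ** np.arange(1, 6)
err = 0.3 * h ** 1.7
rate = observed_rate(h, err)
print("fitted rate:", rate)
```

On exact power-law data the fit returns the exponent up to rounding. On measured errors the slope is only an estimate and depends on which mesh levels are included.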

Fig. 2. Error in the different components of the discrete norm for the circle problem as a function of the mesh parameter h using the uniform mesh sequence

Fig. 3. Some examples from the sequence of adaptively refined meshes for the circle problem

Fig. 4. Error in the different components of the discrete norm for the circle problem using \(P^3P^1\) method. The horizontal axis is the square root of the total number of degrees-of-freedom N. A comparison is made between the uniform mesh sequence (circles) and the adaptive mesh sequence (squares)

Fig. 5. Some examples from the sequence of adaptively refined meshes for the square problem, and the total error estimator \(\eta \)

Next, our aim is to improve the convergence rate with respect to the total number of degrees-of-freedom N using mesh adaptivity. We use an adaptive mesh sequence based on the a posteriori estimate of Sect. 5. An element-wise error estimator \(E_T\) is given by

$$\begin{aligned} E_T^2 = \eta _T^2 + \sum _{\begin{array}{c} E \in {\mathcal {E}}_h, \\ E \cap \partial T \not = \emptyset \end{array}} (\tfrac{1}{2} \eta _E)^2 + \eta _{{\text {con}}, T}^2, \end{aligned}$$

and we split \(T^\prime \in {\mathcal {T}}_h\) if

$$\begin{aligned} E_{T^\prime } > 0.5 \max _{T \in {\mathcal {T}}_h} E_{T}. \end{aligned}$$
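The assembly of \(E_T\) and the maximum-based marking rule can be sketched as follows. The array layout is an assumption: `element_edges[t]` is a hypothetical list of the indices of the edges that enter the sum for element t, and the numbers in the example are toy values chosen for easy arithmetic, not computed estimators.

```python
import numpy as np

def element_indicators(eta_T, eta_E, eta_con, element_edges):
    """E_T^2 = eta_T^2 + sum over the listed edges of (eta_E / 2)^2 + eta_con_T^2."""
    eta_E = np.asarray(eta_E, dtype=float)
    E2 = np.asarray(eta_T, dtype=float) ** 2 + np.asarray(eta_con, dtype=float) ** 2
    for t, edges in enumerate(element_edges):
        idx = np.asarray(edges, dtype=int)
        E2[t] += np.sum((0.5 * eta_E[idx]) ** 2)
    return np.sqrt(E2)

def mark_elements(E, theta=0.5):
    """Indices of the elements with E_T > theta * max_T E_T."""
    E = np.asarray(E, dtype=float)
    return np.flatnonzero(E > theta * E.max())

# Toy values (assumption): three elements sharing two edges.
E = element_indicators(eta_T=[3.0, 0.0, 0.0], eta_E=[8.0, 0.0],
                       eta_con=[0.0, 0.0, 1.0],
                       element_edges=[[0], [0, 1], [1]])
marked = mark_elements(E)
print("E_T:", E, " marked elements:", marked)
```

The marked elements are then passed to the mesh refinement routine. The threshold 0.5 is the value used in the criterion above.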

The mesh is refined using the red-green-blue refinement strategy and Laplacian smoothing is applied on the refined mesh to improve its shape regularity. Some examples from the sequence of adaptive meshes are given in Fig. 3. A comparison of the error between the uniform and adaptive mesh sequences is given in Fig. 4. In particular, we observe that while the convergence rate of the error is ultimately dictated by the largest component of the discrete norm (i.e. \(\Vert \varvec{\lambda }- \varvec{\lambda }_h\Vert _{-1,h}\)), there is a visible improvement in all of the components and, in conclusion, the quadratic rate is recovered with respect to the number of degrees-of-freedom, i.e. the error decays like \({\mathcal {O}}(N^{-1})\).

Fig. 6. The length of the discrete Lagrange multiplier \(\vert \varvec{\lambda }_h\vert \) and the discrete velocity \(u_h\) for the square problem

Finally, we consider an example in a square domain \(\Omega = (0,1)^2\) with \(f=3.6\), \(g=1.25\) and \(\rho =1.5\), for which no analytical solution is available; cf. [14, 21] for similar examples. Some meshes from the adaptive sequence and the total error estimators are given in Fig. 5. The final discrete solution is depicted in Fig. 6. As before, we observe the adaptive refinement focusing on the interfaces between the liquid and solid regions. Moreover, the estimators successfully locate and refine the so-called stagnating regions at the corners of the square.

Remark 2

We used a quadratic representation of the circle boundary so that the effect of the inexact geometry representation can be neglected.

Remark 3

We found the following equivalent form of the estimator \(\eta _{{\text {con}}, T}\) to be more robust against numerical tolerances in Algorithm 1:

$$\begin{aligned} g\int _T (\vert \nabla u_h\vert - \varvec{P}(\varvec{\lambda }_h + \rho \pi _h \nabla u_h) \cdot \pi _h \nabla u_h) \,\mathrm {d}x. \end{aligned}$$

Remark 4

We consider methods only up to a linear Lagrange multiplier because for a higher order method, in general, \(\varvec{\lambda }_h \not \in \varvec{\Lambda }_h\) when using Algorithm 1.