$H(\textrm{div})$-conforming HDG methods for the stress-velocity formulation of the Stokes equations and the Navier–Stokes equations

Qiu, Weifeng; Zhao, Lina

doi:10.1007/s00211-024-01419-6


Open access
Published: 17 June 2024

Volume 156, pages 1639–1678 (2024)

Numerische Mathematik


Weifeng Qiu¹ &
Lina Zhao¹


Abstract

In this paper we devise and analyze a pressure-robust and superconvergent HDG method in stress-velocity formulation for the Stokes equations and the Navier–Stokes equations with strongly symmetric stress. The stress and velocity are approximated using piecewise polynomial space of order k and $H(\textrm{div};\Omega )$-conforming space of order $k+1$, respectively, where k is the polynomial order. In contrast, the tangential trace of the velocity is approximated using piecewise polynomials of order k. Moreover, the characterization of the proposed schemes shows that the globally coupled unknowns are the normal trace and the tangential trace of velocity, and the piecewise constant approximation for the trace of the stress. The discrete $H^1$-stability is established for the discrete solution. The proposed formulation yields divergence-free velocity, but causes difficulties for the derivation of the pressure-independent error estimate given that the pressure variable is not employed explicitly in the discrete formulation. This difficulty can be overcome by observing that the $L^2$ projection to the stress space has a nice commuting property. Moreover, superconvergence for velocity in discrete $H^1$-norm is obtained, with regard to the degrees of freedom of the globally coupled unknowns. Then the convergence of the discrete solution to the weak solution for the Navier–Stokes equations via the compactness argument is rigorously analyzed under minimal regularity assumption. The strong convergence for velocity and stress is proved. Importantly, the strong convergence for velocity in discrete $H^1$-norm is achieved. Several numerical experiments are carried out to confirm the proposed theories.


1 Introduction

Let $\Omega \subset \mathbb {R}^d$ be a polygonal domain with boundary $\partial \Omega $, where $\partial \Omega =\partial \Omega _D\cup \partial \Omega _N$, $\partial \Omega _D\cap \partial \Omega _N= \emptyset $ and $\partial \Omega _D\ne \emptyset $. We consider the following stress-velocity-pressure formulation of the Stokes equations

$$\begin{aligned} \underline{\sigma }&=2\mu \varepsilon (\varvec{u})+p\underline{I}{} & {} \quad \text{ in }\;\Omega \end{aligned}$$

(1.1)

$$\begin{aligned} -\textrm{div}\underline{\sigma }&=\varvec{f}_S{} & {} \quad \text{ in }\;\Omega ,\end{aligned}$$

(1.2)

$$\begin{aligned} \nabla \cdot \varvec{u}&=0{} & {} \quad \text{ in }\;\Omega \end{aligned}$$

(1.3)

and the Navier–Stokes equations

$$\begin{aligned} \underline{\sigma }&=2\mu \varepsilon (\varvec{u})+p\underline{I}{} & {} \quad \text {in }\Omega , \end{aligned}$$

(1.4)

$$\begin{aligned} -\textrm{div}\underline{\sigma }+\textrm{div}(\varvec{u}\otimes \varvec{u})&=\varvec{f}_N{} & {} \quad \text {in }\Omega , \end{aligned}$$

(1.5)

$$\begin{aligned} \nabla \cdot \varvec{u}&=0{} & {} \quad \text{ in }\;\Omega , \end{aligned}$$

(1.6)

which are supplemented with the boundary condition $ \varvec{u}=\varvec{0}$ on $\partial \Omega _D$ and $\underline{\sigma }\varvec{n}=\varvec{0}$ on $\partial \Omega _N$. Here $\varepsilon (\varvec{u})=\frac{\nabla \varvec{u}+\nabla \varvec{u}^T}{2}$ and $\varvec{n}$ is the unit outward normal vector to $\partial \Omega $. Moreover, $\varvec{f}_S, \varvec{f}_N\in \varvec{L}^2(\Omega )$ are the external body forces, $\mu $ is the effective viscosity constant, and $\underline{I}$ is the identity matrix. If $\partial \Omega _N\ne \emptyset $, there exists a unique solution $(\varvec{u},p)$ to the above problem. If $\partial \Omega _N=\emptyset $, then we need to enforce the condition $\int _{\Omega }p\;dx=0$ to guarantee the uniqueness of the solution (cf. [17]). For the ease of presentation, we assume that $\partial \Omega _N\ne \emptyset $. The analysis presented below can also be adapted to the case $\partial \Omega _N=\emptyset $ without much difficulty.

Stress (pseudostress)-velocity and stress-velocity-pressure formulations for incompressible flows have drawn great attention over the past decades [3, 4, 7, 8, 9, 13, 18, 20, 25, 36]. These formulations enjoy several salient features: they come from the original physical laws and give a direct description of the stress, which in some applications is the variable of greatest interest; formally they resemble the stress-displacement formulation of the elasticity equations, which may lead to a better understanding of coupled solid-fluid problems; and they give a unified formulation for Newtonian and non-Newtonian flows. The development and analysis of methods for incompressible flows based on the stress-velocity formulation is therefore important and interesting. However, the design of symmetric stress approximations faces big challenges: as is well known, the construction of symmetric stress spaces is tricky and generally involves sophisticated procedures. In this paper, we aim to introduce a stress-velocity formulation for the Stokes equations and the Navier–Stokes equations. Specifically, the stress is approximated using piecewise polynomials of order k with strong symmetry, and the velocity is approximated using an $H(\textrm{div};\Omega )$-conforming space of order $k+1$. In contrast, the tangential trace is approximated using piecewise polynomials of order k only. We remark that the formulation hinges on a carefully designed numerical flux, which guarantees optimal convergence of the scheme even though the polynomial degree of the tangential trace is one order lower than that of the velocity.

The discrete formulations for the Stokes equations and the Navier–Stokes equations are designed in a similar manner, except that a non-negative convective term is incorporated for the Navier–Stokes equations. The advantages of the proposed formulations are manifold: they yield divergence-free velocity, which ensures pressure-robustness; they are robust with respect to the value of the viscosity; and they are hybridizable, so the size of the globally coupled system is greatly reduced, rendering the method computationally efficient. The design of pressure-robust schemes has been actively studied over the past few years; see, for instance, [14, 23, 26, 30]. We also emphasize that the current approach can be straightforwardly applied to time-dependent incompressible flows: at each time step, we need to solve a global system involving the normal trace and the tangential trace of the velocity, and the piecewise constant approximation of the trace of the stress. These salient features make our method highly efficient for simulations of large-scale incompressible flows.

To illustrate the convergence of the proposed scheme for smooth solutions, a rigorous error analysis reflecting pressure-independence is developed for the Stokes equations. The discrete formulation only involves the stress, the velocity and the numerical trace of the velocity, which makes it nontrivial to derive pressure-independent error estimates. Unlike for the velocity gradient-velocity-pressure formulation and the velocity-pressure formulation, the standard approach to the error analysis of the stress-velocity formulation leads to error estimates for the velocity that depend on the stress, which is pressure-dependent (cf. (1.1)). To overcome this issue, we need to refine the standard error analysis. In fact, our pressure-independent error estimates rely on two key observations: the $L^2$-projection operator onto the stress space enjoys a commuting property, which enables us to decouple the pressure variable from the definition of the stress and to employ only the deviatoric part of the stress; and the normal-tangential components of the stress and of its deviatoric part coincide on the element interfaces, namely, $(\mathcal {A} \underline{\sigma }\varvec{n})^t_{|F}= (\underline{\sigma }\varvec{n})^t_{|F}$ for $F\in \mathcal {F}_h$. Optimal error estimates for all the variables measured in suitable norms are obtained; specifically, the $L^2$-errors for the stress and the velocity and the energy error for the velocity are analyzed. In particular, relative to the degrees of freedom of the globally coupled unknowns, superconvergence for the velocity in the discrete $H^1$-norm is obtained. Moreover, we are able to show that the velocity error estimates depend only on the deviatoric part of the stress, which ensures the pressure-independence of the error estimates. To the best of our knowledge, pressure-independent error estimates for the stress (pseudostress)-velocity formulation have not been discussed in the existing literature, and the developed approaches will shed new light on other pressure-robust discretizations.

In many practical applications, the solutions under consideration are non-smooth. In this situation, it is of interest to explore convergence under minimal regularity assumptions. However, the existing works on HDG methods mainly focus on error estimates under strong regularity assumptions, which excludes the non-smooth solutions that are often present in realistic scenarios. To fill this gap, one of our main goals is to establish convergence to the weak solution under minimal regularity assumptions for the Navier–Stokes equations. To this end, we first show the discrete $H^1$-stability of the proposed formulation by using a local version of the Korn inequality. Then the convergence to the weak solution can be established by using the boundedness of the discrete solution and a compactness argument. Moreover, the discrete gradient operator defined using the lifting operator also plays an important role in the convergence proof. Specifically, two discrete gradient operators are employed. We first construct a discrete gradient operator to link the discrete stress and velocity, and the convergence of this discrete gradient operator to $\nabla \varvec{u}$ is proved via the discrete $H^1$-stability of the velocity. Then we construct a second discrete gradient operator to facilitate the analysis of the convective trilinear term. The strong convergence of the discrete velocity and stress to the weak solution is analyzed. In particular, the strong convergence of the velocity in the discrete $H^1$-norm is achieved, where a judicious choice of elliptic projection in conjunction with a local Korn inequality is employed. To the best of our knowledge, this is the first work on the proof of convergence to a weak solution under minimal regularity assumptions for a pressure-robust discretization. The methodologies used in this paper can also be extended to analyze other spatial discretizations.

The rest of the paper is organized as follows. In the next section, we introduce the pressure-robust discretizations for the Stokes equations and the Navier–Stokes equations. Moreover, the main results are presented. In Sect. 3 we present the characterization of the proposed HDG scheme. In Sect. 4, the discrete $H^1$-stability and the convergence error estimates for the Stokes equations are proved. Then the convergence to the weak solution for the Navier–Stokes equations under minimal regularity assumptions is rigorously analyzed in Sect. 5. Several numerical experiments are carried out in Sect. 6 to confirm the proposed theories and demonstrate the capabilities of the proposed scheme.

2 The pressure-robust discretization and main results

2.1 The pressure-robust discretization in stress-velocity formulation

In this subsection, we first describe the model problem, then the corresponding discrete formulation will be given. To begin, we introduce some notation that will be used throughout the paper. We will use the most common Sobolev spaces $H^r(\mathcal {O})$ for non-negative integer r, where $\mathcal {O}\subset \Omega ,\Omega \subset \mathbb {R}^d,d=2,3$. The spaces of vector- and matrix-valued functions with all the components in $H^r(\mathcal {O})$ will be respectively denoted as $\varvec{H}^r(\mathcal {O})$ and $\underline{H}^r(\mathcal {O})$. Also, we use S to denote the set of symmetric $d\times d$ matrices and define $\underline{H}^r(S,\Omega ):=\{\underline{w}\in \underline{H}^r(\Omega ); \underline{w}=\underline{w}^T\}$. We use $(\cdot ,\cdot )_D$ to represent the standard $L^2$-inner product over $D\subset \mathbb {R}^d$ and the corresponding norm is denoted as $\Vert \cdot \Vert _{L^2(D)}$, and we use $\langle \cdot ,\cdot \rangle _D$ to represent the $L^2$-inner product on $D\subset \mathbb {R}^{d-1}$. In the sequel, C represents a generic positive constant independent of the meshsize and $\mu $, which may have different values at different occurrences.

In our discretization given below, we aim to eliminate the pressure variable p. To this end, taking the trace of (1.1) and using $\textrm{tr}(\varepsilon (\varvec{u}))=\nabla \cdot \varvec{u}=0$ from (1.3), we obtain

$$\begin{aligned} \textrm{tr}(\underline{\sigma })= d p, \end{aligned}$$

thus $p=\frac{\textrm{tr}(\underline{\sigma })}{d}$. For any tensor field $\underline{H}$, we define $\mathcal {A}\underline{H}{:=}\underline{H}- \frac{\textrm{tr}(\underline{H})}{d}{\underline{I}}$, where $\mathcal {A}\underline{H}$ is a trace-free tensor and is called the deviatoric part. Then we can infer from (1.1) that $ \mathcal {A}\underline{\sigma } = 2\mu \varepsilon (\varvec{u})$. Thus, the model problem (1.1)–(1.3) can be recast into the following equivalent form:
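The elimination of the pressure can be checked numerically on a single tensor. The following is a minimal sketch, assuming NumPy and a hand-picked trace-free symmetric gradient in $d=2$; the helper names `deviatoric` and `sample_stress` and all numerical values are hypothetical and serve only as an illustration of the definition of $\mathcal {A}$.

```python
import numpy as np

def deviatoric(H):
    """Return A(H) = H - tr(H)/d * I for a square d x d array H."""
    d = H.shape[0]
    return H - np.trace(H) / d * np.eye(d)

def sample_stress(mu=0.7, p=1.3):
    """Build sigma = 2*mu*eps + p*I from a trace-free symmetric eps.

    The values of mu, p and the matrix G are illustrative assumptions.
    """
    G = np.array([[0.5, 2.0], [-1.0, -0.5]])  # tr(G) = 0 mimics div u = 0
    eps = 0.5 * (G + G.T)                      # symmetric gradient
    sigma = 2.0 * mu * eps + p * np.eye(2)
    return sigma, eps, mu, p

sigma, eps, mu, p = sample_stress()
dev_sigma = deviatoric(sigma)

# The pressure is recovered from the trace, and A(sigma) equals 2*mu*eps.
p_recovered = np.trace(sigma) / sigma.shape[0]
print(p_recovered, np.allclose(dev_sigma, 2.0 * mu * eps))
```

The check relies only on $\textrm{tr}(\varepsilon (\varvec{u}))=0$, which is why the incompressibility constraint is what allows $p$ to be dropped as an unknown.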

$$\begin{aligned} (2\mu )^{-1}\mathcal {A}\underline{\sigma }&= \varepsilon (\varvec{u}){} & {} \quad \text{ in }\;\Omega , \end{aligned}$$

(2.1)

$$\begin{aligned} -\textrm{div}\underline{\sigma }&= \varvec{f}_S{} & {} \quad \text{ in }\;\Omega . \end{aligned}$$

(2.2)

We let $\varvec{H}^1_0(\Gamma _D):=\{\varvec{v}\in \varvec{H}^1(\Omega );\varvec{v}=\varvec{0}\;\text{ on }\;\partial \Omega _D\}$ and $X:=\underline{L}^2(S,\Omega )\times \varvec{H}_0^1(\Gamma _D)$. We define the norm $\Vert (\underline{w},\varvec{v})\Vert _X^2{:=}\Vert \mu ^{-\frac{1}{2}}\underline{w}\Vert _{L^2(\Omega )}^2+\Vert \mu ^{\frac{1}{2}}\varvec{v}\Vert _{H^1(\Omega )}^2$ for any $ (\underline{w},\varvec{v})\in X$. Then, the weak formulation for (1.1)–(1.3) reads as follows: Find $(\underline{\sigma },\varvec{u})\in X$ such that

$$\begin{aligned} ((2\mu )^{-1}\mathcal {A}\underline{\sigma } ,\underline{w})&= ( \varepsilon (\varvec{u}) ,\underline{w}), \end{aligned}$$

(2.3)

$$\begin{aligned} ( \underline{\sigma } ,\varepsilon (\varvec{v}))&= (\varvec{f}_S,\varvec{v}) \end{aligned}$$

(2.4)

for all $(\underline{w},\varvec{v})\in X$.

The weak solution to (1.4)–(1.6) is defined by: Find $(\underline{\sigma },\varvec{u})\in \underline{L}^2(S,\Omega )\times \varvec{H}^1_0(\Gamma _D)$ such that

$$\begin{aligned} ((2\mu )^{-1}\mathcal {A}\underline{\sigma }-\varepsilon (\varvec{u}),\underline{H})&=0{} & {} \forall \underline{H}\in \underline{L}^2(S,\Omega ), \end{aligned}$$

(2.5)

$$\begin{aligned} (\underline{\sigma },\varepsilon (\varvec{v}))+(\varvec{u}\cdot \nabla \varvec{u},\varvec{v})&=(\varvec{f}_N,\varvec{v}){} & {} \quad \forall \varvec{v}\in \varvec{H}^1_0(\Gamma _D). \end{aligned}$$

(2.6)

To simplify the notation, we define $A((\underline{\sigma },\varvec{u}),(\underline{w},\varvec{v}))=((2\mu )^{-1}\mathcal {A}\underline{\sigma },\underline{w})-( \varepsilon (\varvec{u}) ,\underline{w})+ ( \underline{\sigma } ,\varepsilon (\varvec{v}))$. For later use, we introduce the Helmholtz projection operator $\mathbb {P}$. We let $H(\textrm{div};\Omega ):=\{\varvec{v}\in \varvec{L}^2(\Omega ), \textrm{div}\varvec{v}\in L^2(\Omega )\}$ and $H_0(\textrm{div};\Omega ):=\{\varvec{v}\in H(\textrm{div};\Omega );\varvec{v}\cdot \varvec{n}=0\;\text{ on }\;\partial \Omega \}$. For every vector field $\varvec{f}\in \varvec{L}^2(\Omega )$, we have (cf. [33])

$$\begin{aligned} \varvec{f}=\nabla \alpha +\varvec{\beta }, \end{aligned}$$

(2.7)

where $\alpha \in H^1(\Omega )$ and $\varvec{\beta }\in H_0(\textrm{div};\Omega )$ is the divergence-free remainder; the map $\varvec{f}\mapsto \varvec{\beta }$ is called the Helmholtz projector, i.e., $\mathbb {P}(\varvec{f}):=\varvec{\beta }$.

Theorem 2.1

There exists a unique solution to (2.3)–(2.4).

Proof

Let $(\underline{w},\varvec{v})\in X$ and set $\mathbb {M}=\sup _{(\underline{H},\varvec{\theta })\in X\backslash \{0\}}\frac{A((\underline{w},\varvec{v}),(\underline{H},\varvec{\theta }))}{\Vert (\underline{H},\varvec{\theta })\Vert _X}$. We have

$$\begin{aligned} \left\| (2\mu )^{-\frac{1}{2}}\mathcal {A} \underline{w}\right\| _{L^2(\Omega )}^2=A((\underline{w},\varvec{v}),(\underline{w},\varvec{v}))\le \mathbb {M} \Vert (\underline{w},\varvec{v})\Vert _X. \end{aligned}$$

The Poincaré inequality and the Korn inequality respectively yield for $\varvec{v}\in \varvec{H}^1_0(\Gamma _D)$

$$\begin{aligned} \left\| \mu ^{\frac{1}{2}}\varvec{v}\right\| _{H^1(\Omega )}\le C_p \left\| \mu ^{\frac{1}{2}}\nabla \varvec{v}\right\| _{L^2(\Omega )} \end{aligned}$$

and

$$\begin{aligned} \left\| \mu ^{\frac{1}{2}}\nabla \varvec{v}\right\| _{L^2(\Omega )}\le C_{k} \left\| \mu ^{\frac{1}{2}}\varepsilon (\varvec{v})\right\| _{L^2(\Omega )}, \end{aligned}$$

where $C_p$ and $C_k$ are positive constants. Thus, we can infer that

$$\begin{aligned} \begin{aligned} {\frac{1}{C_pC_k}} \left\| \mu ^{\frac{1}{2}}\varvec{v}\right\| _{H^1(\Omega )}&\le \sup _{\underline{H}\in \underline{L}^2(S,\Omega )\backslash \{\underline{0}\}}\frac{-\left( \mu ^{\frac{1}{2}}\varepsilon (\varvec{v}), \mu ^{-\frac{1}{2}}\underline{H}\right) }{\left\| \mu ^{-\frac{1}{2}}\underline{H}\right\| _{L^2(\Omega )}}\\&= \sup _{\underline{H}\in \underline{L}^2(S,\Omega ) \backslash \{\underline{0}\}}\frac{A((\underline{w},\varvec{v}),(\underline{H},\varvec{0}))-((2\mu )^{-1}\mathcal {A}\underline{w},\underline{H})}{\left\| \mu ^{-\frac{1}{2}}\underline{H}\right\| _{L^2(\Omega )}}\\&\le \sup _{\underline{H}\in \underline{L}^2(S,\Omega )\backslash \{\underline{0}\} }\frac{A((\underline{w},\varvec{v}),(\underline{H},\varvec{0}))}{\Vert (\underline{H},\varvec{0})\Vert _X}\\&\quad +\sup _{\underline{H}\in \underline{L}^2(S,\Omega )\backslash \{\underline{0}\}}\frac{((2\mu )^{-1}\mathcal {A}\underline{w},\underline{H})}{\left\| \mu ^{-\frac{1}{2}}\underline{H}\right\| _{L^2(\Omega )}}\\&\le \mathbb {M}+ \left\| (2\mu )^{-\frac{1}{2}}\mathcal {A}\underline{w}\right\| _{L^2(\Omega )}. \end{aligned} \end{aligned}$$

Thereby, it follows

$$\begin{aligned} \Vert (\underline{w},\varvec{v})\Vert _X^2=\left\| \mu ^{-\frac{1}{2}}\underline{w}\right\| _{L^2(\Omega )}^2+\left\| \mu ^{\frac{1}{2}}\varvec{v}\right\| _{H^1(\Omega )}^2\le C\Big (\mathbb {M}^2+\mathbb {M}\Vert (\underline{w},\varvec{v})\Vert _X\Big ). \end{aligned}$$

Thus, Young’s inequality yields $ \Vert (\underline{w},\varvec{v})\Vert _X\le C \mathbb {M}. $ Now let $(\underline{H},\varvec{\theta })\in X$ be such that $A((\underline{w},\varvec{v}),(\underline{H},\varvec{\theta }))=0$ for all $(\underline{w},\varvec{v})\in X$. Taking $(\underline{w},\varvec{v})=(\underline{H},\varvec{\theta })$, we have $ \Vert (2\mu )^{-\frac{1}{2}}\mathcal {A}\underline{H}\Vert _{L^2(\Omega )}=0, $ which yields $\mathcal {A}\underline{H}=\underline{0}$.

We can infer from the Korn inequality and the Poincaré inequality

$$\begin{aligned} \Vert \varvec{\theta }\Vert _{H^1(\Omega )}\le C \sup _{\underline{w}\in \underline{L}^2(S,\Omega )\backslash \{\underline{0}\} }\frac{(\varepsilon (\varvec{\theta }),\underline{w})}{\Vert \underline{w}\Vert _{L^2(\Omega )}}=C\sup _{\underline{w}\in \underline{L}^2(S,\Omega )\backslash \{\underline{0}\}}\frac{A((\underline{w},\varvec{0}),(\underline{H},\varvec{\theta }))}{\Vert \underline{w}\Vert _{L^2(\Omega )}}=0. \end{aligned}$$

Thus, $\varvec{\theta }=\varvec{0}$. Since $\textrm{tr}(\underline{H})\in L^2(\Omega )$, there exists $\varvec{z}\in \varvec{H}^1(\Omega )$ such that $\nabla \cdot \varvec{z}=\textrm{tr}(\underline{H})$ and $\Vert \varvec{z}\Vert _{H^1(\Omega )}\le C \Vert \textrm{tr}(\underline{H})\Vert _{L^2(\Omega )} $. Then a simple manipulation shows that

$$\begin{aligned} \Vert \textrm{tr}(\underline{H})\Vert _{L^2(\Omega )}^2=(\textrm{tr}(\underline{H}), \nabla \cdot \varvec{z})=d\Big ((\underline{H},\varepsilon (\varvec{z}))-(\mathcal {A}\underline{H},\varepsilon (\varvec{z}))\Big ). \end{aligned}$$

We let $(\underline{w},\varvec{v})=(\underline{0},\varvec{z})$. Then we have $ (\underline{H},\varepsilon (\varvec{z}))=-A((\underline{0},\varvec{z}),(\underline{H},\varvec{\theta }))=0. $ Together with $\mathcal {A}\underline{H}=\underline{0}$, the identity above gives $\textrm{tr}(\underline{H})=0$, and thus $\underline{H}=\underline{0}$. Then the well-posedness follows from the Banach–Nečas–Babuška theorem (cf. [2]). $\square $

We remark that the unique solvability of (2.5)–(2.6) can be proved similarly, with additional treatment for the nonlinear term under a smallness assumption on the Helmholtz projection of $\varvec{f}_N$ (cf. [15, Theorem 6.36] and [28, Equation (3.7)]). We omit the details for simplicity. In the following, we derive the pressure-robust discretization for the Stokes equations and the Navier–Stokes equations. Let $\mathcal {T}_h$ represent a shape-regular triangulation of the domain $\Omega $. For each element $K\in \mathcal {T}_h$, we let $h_K$ be the diameter of the element K. In addition, we use $\mathcal {F}_h$ to represent the union of all the faces and use $\mathcal {F}_h^0$ to represent the union of all the interior faces. For each face F, we use $h_F$ to denote the diameter of F. We use $\varvec{n}_F$ to represent the unit normal vector of F pointing from $K_1$ to $K_2$, where $K_1$ and $K_2$ are the elements sharing the common face F. When there is no confusion, we use $\varvec{n}$ to simplify the notation. For each interior face F, we define the jump and average of a scalar function q over F as

$$\begin{aligned} \llbracket q\rrbracket _{|F}:=q_{|K_1}-q_{|K_2}\quad \text{ and }\quad \{q\}_{|F} :=\frac{q_{|K_1}+q_{|K_2}}{2}, \end{aligned}$$

where $K_1$ and $K_2$ are the two elements belonging to $\mathcal {T}_h$ sharing the common face F. For the boundary faces, we simply define $\llbracket q\rrbracket _{|F}:=q_{|K_1}$ and $\{q\}_{|F}:=q_{|K_1}$. We also use the same notation to indicate the jump and average of vector and tensor functions. Let $k\ge 1$ represent the polynomial order. We use $P_k(K)$ and $P_k(F)$ to represent the polynomial functions defined on K and F whose order is less than or equal to k. Similarly, $\varvec{P}_k(K)$ represents the vector-valued polynomial functions and $\underline{P}_k(S,K)$ represents the symmetric tensor-valued polynomial functions on K. For any scalar functions $q,\theta $, we let $(q,\theta )_{\mathcal {T}_h}:=\sum _{K\in \mathcal {T}_h} (q,\theta )_K$. For the vector functions $\varvec{q},\varvec{\theta }$, we let $(\varvec{q},\varvec{\theta })_{\mathcal {T}_h}:=\sum _{i=1}^d(q_i,\theta _i)_{\mathcal {T}_h}$. Similarly, for tensor functions $\underline{q},\underline{\theta }$, we let $(\underline{q},\underline{\theta })_{\mathcal {T}_h}:=\sum _{i,j=1}^d(\underline{q}_{ij},\underline{\theta }_{ij})_{\mathcal {T}_h}$. Moreover, $\langle q,\theta \rangle _{\partial \mathcal {T}_h}:=\sum _{K\in \mathcal {T}_h} \langle q,\theta \rangle _{\partial K}$.

For a vector function, we use $\varvec{v}^n$ and $\varvec{v}^t$ to represent the normal component and the tangential component, respectively, that is,

$$\begin{aligned} \varvec{v}^n:=(\varvec{v}\cdot \varvec{n})\varvec{n},\quad \varvec{v}^t:=\varvec{v}-\varvec{v}^n. \end{aligned}$$
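The normal/tangential splitting, and the identity $(\mathcal {A} \underline{\sigma }\varvec{n})^t= (\underline{\sigma }\varvec{n})^t$ quoted in the introduction, can be illustrated with a few lines of NumPy. This is a sketch only: the function names and the sample normal, vector and tensor are hypothetical choices, not taken from the paper. The identity holds because $\mathcal {A}\underline{\sigma }\varvec{n}$ and $\underline{\sigma }\varvec{n}$ differ by $\frac{\textrm{tr}(\underline{\sigma })}{d}\varvec{n}$, which is purely normal.

```python
import numpy as np

def normal_part(v, n):
    """v^n = (v . n) n for a unit normal n."""
    return np.dot(v, n) * n

def tangential_part(v, n):
    """v^t = v - v^n."""
    return v - normal_part(v, n)

def deviatoric(H):
    """A(H) = H - tr(H)/d * I."""
    d = H.shape[0]
    return H - np.trace(H) / d * np.eye(d)

# Illustrative data (assumed): a unit normal, a vector, a symmetric tensor.
n = np.array([0.6, 0.8])
v = np.array([1.0, -2.0])
sigma = np.array([[2.0, 1.0], [1.0, -3.0]])

v_t = tangential_part(v, n)
traction_t = tangential_part(sigma @ n, n)
dev_traction_t = tangential_part(deviatoric(sigma) @ n, n)

# The tangential part is orthogonal to n, and removing the trace of sigma
# does not change the tangential traction.
print(np.dot(v_t, n), np.allclose(traction_t, dev_traction_t))
```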

We introduce the following finite-dimensional spaces

$$\begin{aligned} \varvec{U}_h&:=\{\varvec{v}\in \varvec{P}_{k+1}(K),\forall K\in \mathcal {T}_h, \llbracket \varvec{v}\cdot \varvec{n}\rrbracket _{|F}=0,\forall F\in \mathcal {F}_h^0; \varvec{v}\cdot \varvec{n}=0 \;\text{ on }\;\partial \Omega _D\},\\ \underline{\Sigma }_h&:=\{\underline{w}\in \underline{P}_{k}(S,K),\forall K\in \mathcal {T}_h\},\\ \varvec{\widehat{U}}_h&:=\{\varvec{\mu }\in \varvec{P}_k(F), \varvec{\mu }\cdot \varvec{n}_{|F}=0,\forall F\in \mathcal {F}_h; \varvec{\mu }^t=\varvec{0} \;\text{ on }\;\partial \Omega _D \}. \end{aligned}$$

We remark that if $\partial \Omega _N=\emptyset $, then we need to enforce the restriction $\int _\Omega \textrm{tr}(\underline{w})=0$ for $\underline{\Sigma }_h$ in order to ensure the unique solvability.

We define the discrete $H^1$-norm for $(\varvec{v},\widehat{\varvec{v}})\in \varvec{U}_h\times \widehat{\varvec{U}}_h$

$$\begin{aligned} \Vert (\varvec{v},\widehat{\varvec{v}})\Vert _{1,h}^2:=\Vert \nabla \varvec{v}\Vert _{L^2(\mathcal {T}_h)}^2+\left\| h^{-\frac{1}{2}}(\varvec{v}^t-\varvec{\widehat{v}})\right\| _{L^2(\partial \mathcal {T}_h)}^2, \end{aligned}$$

where $h_{|F}:=h_F$.

Note that for $(\varvec{v},\widehat{\varvec{v}})\in \varvec{U}_h\times \widehat{\varvec{U}}_h$, it holds

$$\begin{aligned} \begin{aligned} \Vert \varvec{v}\Vert _{h}^2&:= \Vert \nabla \varvec{v}\Vert _{L^2(\mathcal {T}_h)}^2+\left\| h^{-\frac{1}{2}}\llbracket \varvec{v}^t\rrbracket \right\| _{L^2(\partial \mathcal {T}_h\backslash \partial \Omega _N)}^2\\&=\Vert \nabla \varvec{v}\Vert _{L^2(\mathcal {T}_h)}^2+\left\| h^{-\frac{1}{2}}\llbracket \varvec{v}^t-\widehat{\varvec{v}}\rrbracket \right\| _{L^2(\partial \mathcal {T}_h\backslash \partial \Omega _N)}^2\le C \Vert (\varvec{v},\widehat{\varvec{v}})\Vert _{1,h}^2, \end{aligned} \end{aligned}$$

(2.8)

which coupled with the discrete Poincaré inequality (cf. [6]) yields

$$\begin{aligned} \Vert \varvec{v}\Vert _{L^2(\mathcal {T}_h)}\le C\Vert (\varvec{v},\widehat{\varvec{v}})\Vert _{1,h}\quad \forall (\varvec{v},\widehat{\varvec{v}})\in \varvec{U}_h\times \widehat{\varvec{U}}_h. \end{aligned}$$

(2.9)

The discrete formulation for the Stokes equations reads as follows: Find $(\underline{\sigma }_h,\varvec{u}_h,\varvec{\widehat{u}}_h)\in \underline{\Sigma }_h\times \varvec{U}_h\times \varvec{\widehat{U}}_h$ such that

$$\begin{aligned}&((2\mu )^{-1}\mathcal {A}\underline{\sigma }_h,\underline{w})_{\mathcal {T}_h}+(\varvec{u}_h,\textrm{div}\underline{w})_{\mathcal {T}_h}\nonumber \\&\quad -\sum _{F\in \mathcal {F}_h^0\cup \partial \Omega _N}\langle \varvec{u}_h^n,\llbracket (\underline{w}\varvec{n})^n\rrbracket \rangle _{F}-\sum _{F\in \mathcal {F}_h^0\cup \partial \Omega _N}\langle \varvec{\widehat{u}}_h,\llbracket (\underline{w}\varvec{n})^t\rrbracket \rangle _{F}=0, \end{aligned}$$

(2.10)

$$\begin{aligned}&- (\underline{\sigma }_h, \varepsilon (\varvec{v}))_{\mathcal {T}_h}+\langle (\underline{\sigma }_h\varvec{n})^t,\varvec{v}^t\rangle _{\partial \mathcal {T}_h}-\langle \tau (\varvec{P_M} \varvec{u}_h^t -\varvec{\widehat{u}}_h),\varvec{v}^t\rangle _{\partial \mathcal {T}_h}=-(\varvec{f}_S,\varvec{v})_{\mathcal {T}_h},\end{aligned}$$

(2.11)

$$\begin{aligned}&\sum _{F\in \mathcal {F}_h^0\cup \partial \Omega _N}\langle (\underline{\sigma }_h\varvec{n})^t,\varvec{\widehat{v}}\rangle _{F}-\langle \tau (\varvec{P_M} \varvec{u}_h^t-\varvec{\widehat{u}}_h), \varvec{\widehat{v}}\rangle _{\partial \mathcal {T}_h}=0 \end{aligned}$$

(2.12)

for $(\underline{w},\varvec{v},\varvec{\widehat{v}})\in \underline{\Sigma }_h\times \varvec{U}_h\times \varvec{\widehat{U}}_h$. Here we assume $\tau _{|K} = C h_K^{-1} \mu $ over each element. In addition, $\varvec{P_M}$ restricted to each face F represents the standard $L^2$-orthogonal projection from $\varvec{L}^2(F)$ onto $\varvec{P}_k(F), F\subset \partial K$, $K\in \mathcal {T}_h$.
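To make the role of $\varvec{P_M}$ concrete, the following sketch computes the $L^2$-orthogonal projection onto $P_k$ on a single edge, as it would act on one component of $\varvec{u}_h^t\in \varvec{P}_{k+1}(F)$ in two dimensions. It assumes the reference edge $[-1,1]$, a Legendre basis and Gauss–Legendre quadrature from `numpy.polynomial.legendre`; the function name `l2_project_on_edge` is hypothetical, and the paper does not prescribe this particular implementation.

```python
import numpy as np
from numpy.polynomial import legendre as L

def l2_project_on_edge(f, k, n_quad=None):
    """Legendre coefficients of the L2 projection of f onto P_k on [-1, 1].

    Uses orthogonality: int_{-1}^{1} P_i P_j dx = 2/(2j+1) * delta_ij,
    so the j-th coefficient is (2j+1)/2 * int f P_j dx.
    """
    n_quad = n_quad or k + 4          # exact for polynomial f of degree <= k+1
    x, w = L.leggauss(n_quad)         # Gauss-Legendre nodes and weights
    fx = f(x)
    coeffs = np.empty(k + 1)
    for j in range(k + 1):
        e = np.zeros(j + 1)
        e[j] = 1.0                    # coefficient vector selecting P_j
        Pj = L.legval(x, e)
        coeffs[j] = (2 * j + 1) / 2.0 * np.sum(w * fx * Pj)
    return coeffs

k = 1
# A degree-(k+1) Legendre mode is annihilated by the projection onto P_k ...
top_mode = l2_project_on_edge(lambda x: 0.5 * (3 * x**2 - 1), k)
# ... while a polynomial already in P_k is reproduced.
kept = l2_project_on_edge(lambda x: 1.0 + 2.0 * x, k)
print(top_mode, kept)
```

In this basis the projection simply discards the highest Legendre mode of the tangential velocity, which is the part that the lower-order trace space $\varvec{\widehat{U}}_h$ cannot represent.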

To ease the later presentation, we recast the discrete formulation into compact form. To this end, we let

$$\begin{aligned}&B_h(\varvec{v},\underline{w}):=(\varvec{v},\textrm{div}\underline{w})_{\mathcal {T}_h}-\sum _{F\in \mathcal {F}_h^0\cup \partial \Omega _N}\langle {\varvec{v}^n},\llbracket (\underline{w}\varvec{n})^n\rrbracket \rangle _{F},\quad \forall (\varvec{v},\underline{w})\in \varvec{U}_h\times \underline{\Sigma }_h,\\&S_h((\varvec{w},\widehat{\varvec{w}}),(\varvec{v},\widehat{\varvec{v}})):=\langle \tau (\varvec{P_M} \varvec{w}^t-\varvec{\widehat{w}}),\varvec{v}^t-\widehat{\varvec{v}}\rangle _{\partial \mathcal {T}_h},\quad \\ {}&\quad \forall (\varvec{w},\widehat{\varvec{w}})\in \varvec{U}_h\times \widehat{\varvec{U}}_h, (\varvec{v},\widehat{\varvec{v}})\in \varvec{U}_h\times \widehat{\varvec{U}}_h,\\&T_h(\widehat{\varvec{v}},\underline{w}):=\sum _{F\in \mathcal {F}_h^0\cup \partial \Omega _N}\langle (\underline{w}\varvec{n})^t,\varvec{\widehat{v}}\rangle _{F},\quad \forall (\underline{w},\varvec{\widehat{v}})\in \underline{\Sigma }_h\times \widehat{\varvec{U}}_h. \end{aligned}$$

Integration by parts implies that

$$\begin{aligned} B_h(\varvec{v},\underline{w})= - (\underline{w}, \varepsilon (\varvec{v}))_{\mathcal {T}_h}+\langle (\underline{w}\varvec{n})^t,\varvec{v}^t\rangle _{\partial \mathcal {T}_h} \quad \forall (\varvec{v},\underline{w})\in \varvec{U}_h\times \underline{\Sigma }_h. \end{aligned}$$
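For completeness, this identity can be verified element by element. The following is a sketch under the assumption (made here for the illustration) that the jump on each face is taken with the fixed normal $\varvec{n}_F$, so that $\varvec{n}_{K_1}=\varvec{n}_F=-\varvec{n}_{K_2}$. Since $\underline{w}$ is symmetric, $(\underline{w},\nabla \varvec{v})_K=(\underline{w},\varepsilon (\varvec{v}))_K$, and splitting the boundary term into its normal and tangential parts gives

$$\begin{aligned} (\varvec{v},\textrm{div}\underline{w})_K=-(\underline{w},\varepsilon (\varvec{v}))_K+\langle (\underline{w}\varvec{n})^n,\varvec{v}^n\rangle _{\partial K}+\langle (\underline{w}\varvec{n})^t,\varvec{v}^t\rangle _{\partial K}. \end{aligned}$$

Because $\varvec{v}\cdot \varvec{n}$ is single-valued on interior faces and vanishes on $\partial \Omega _D$ for $\varvec{v}\in \varvec{U}_h$, summing the normal contributions over all elements yields

$$\begin{aligned} \sum _{K\in \mathcal {T}_h}\langle (\underline{w}\varvec{n})^n,\varvec{v}^n\rangle _{\partial K}=\sum _{F\in \mathcal {F}_h^0\cup \partial \Omega _N}\langle \varvec{v}^n,\llbracket (\underline{w}\varvec{n})^n\rrbracket \rangle _{F}, \end{aligned}$$

which cancels the face sum in the definition of $B_h$ and leaves only the volume term and the tangential boundary term.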

We define $ \mathbb {A}_h((\cdot ,\cdot ,\cdot ),(\cdot ,\cdot ,\cdot ))$ by

$$\begin{aligned}&\mathbb {A}_h((\underline{S},\varvec{w},\widehat{\varvec{w}}),(\underline{H},\varvec{v},\widehat{\varvec{v}}))\\&\quad :=((2\mu )^{-1}\mathcal {A}\underline{S},\underline{H})+B_h(\varvec{w},\underline{H})-B_h(\varvec{v},\underline{S})+S_h((\varvec{w},\widehat{\varvec{w}}),(\varvec{v},\widehat{\varvec{v}}))\\&\qquad \quad -T_h(\widehat{\varvec{w}},\underline{H})+T_h(\widehat{\varvec{v}},\underline{S}). \end{aligned}$$

Then (2.10)–(2.12) can be rewritten in compact form as follows: Find $(\underline{\sigma }_h,\varvec{u}_h,\varvec{\widehat{u}}_h)\in \underline{\Sigma }_h\times \varvec{U}_h\times \varvec{\widehat{U}}_h$ such that

$$\begin{aligned} \mathbb {A}_h((\underline{\sigma }_h,\varvec{u}_h,\widehat{\varvec{u}}_h),(\underline{H},\varvec{v},\widehat{\varvec{v}}))=(\varvec{f}_S,\varvec{v})\quad \forall (\varvec{v},\underline{H},\widehat{\varvec{v}})\in \varvec{U}_h\times \underline{\Sigma }_h\times \widehat{\varvec{U}}_h. \end{aligned}$$

We define the convective trilinear form for $\varvec{w}\in \varvec{U}_h,(\varvec{\psi },\widehat{\varvec{\psi }})\in \varvec{U}_h\times \widehat{\varvec{U}}_h$ and $(\varvec{v},\widehat{\varvec{v}})\in \varvec{U}_h\times \widehat{\varvec{U}}_h$ as follows

$$\begin{aligned} N_h(\varvec{w};(\varvec{\psi },\widehat{\varvec{\psi }}),(\varvec{v},\widehat{\varvec{v}}))&:=-\sum _{K\in \mathcal {T}_h}(\varvec{\psi }\otimes \varvec{w}, \nabla \varvec{v})_K +\frac{1}{2}\left\langle \varvec{w}\cdot \textbf{n},(\varvec{\psi }^t+\widehat{\varvec{\psi }})\cdot (\varvec{v}^t-\widehat{\varvec{v}})\right\rangle _{\partial \mathcal {T}_h\backslash \partial \Omega _D}\\&\quad \;+\frac{1}{2}\left\langle |\varvec{w}\cdot \textbf{n}|,(\varvec{\psi }^t-\widehat{\varvec{\psi }})\cdot (\varvec{v}^t-\widehat{\varvec{v}})\right\rangle _{\partial \mathcal {T}_h\backslash \partial \Omega _D}. \end{aligned}$$

The discrete formulation for the Navier–Stokes equations reads as follows: Find $(\underline{\sigma }_h,\varvec{u}_h,\widehat{\varvec{u}}_h)\in \underline{\Sigma }_h\times \varvec{U}_h\times \widehat{\varvec{U}}_h$ such that

$$\begin{aligned}&\mathbb {A}_h((\underline{\sigma }_h,\varvec{u}_h,\widehat{\varvec{u}}_h),(\underline{H},\varvec{v},\widehat{\varvec{v}}))+N_h(\varvec{u}_h;(\varvec{u}_h,\widehat{\varvec{u}}_h),(\varvec{v},\widehat{\varvec{v}}))\\&\quad =(\varvec{f}_N,\varvec{v})\quad \forall (\varvec{v},\underline{H},\widehat{\varvec{v}})\in \varvec{U}_h\times \underline{\Sigma }_h\times \widehat{\varvec{U}}_h. \end{aligned}$$

(2.13)

Remark 2.1

(Divergence-free velocity). Although our discrete formulation does not involve the divergence-free restriction for velocity explicitly, we can show that the numerical velocity is actually divergence-free; indeed, this is encapsulated in (2.10). Let $P_h:=\{q_{|K}\in P_k(K),\forall K\in \mathcal {T}_h\}$. For an arbitrary function $q\in P_h$, setting $\underline{w}=q\underline{I}$ in (2.10) and integrating by parts yields

$$\begin{aligned} \sum _{K\in \mathcal {T}_h}(\nabla \cdot \varvec{u}_h, q)_K=0. \end{aligned}$$

(2.14)
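
The choice $\underline{w}=q\underline{I}$ works because of three pointwise facts: $\mathcal {A}(q\underline{I})=\underline{0}$, the tangential part of $(q\underline{I})\varvec{n}$ vanishes, and $q\underline{I}:\varepsilon (\varvec{v})=q\,\nabla \cdot \varvec{v}$. In the integrated-by-parts form of (2.10) used in the proof of Theorem 2.2, these remove the stress and facet terms and leave only $(\nabla \cdot \varvec{u}_h,q)$. The sketch below checks the three facts numerically. It assumes $\mathcal {A}\underline{w}=\underline{w}-\frac{1}{d}\textrm{tr}(\underline{w})\underline{I}$, which is consistent with (4.14); all helper names and sample values are hypothetical.

```python
def deviator(s):
    """Deviatoric part A s = s - (tr(s)/d) I of a d x d matrix (list of lists)."""
    d = len(s)
    mean = sum(s[i][i] for i in range(d)) / d
    return [[s[i][j] - (mean if i == j else 0.0) for j in range(d)] for i in range(d)]


def tangential(v, n):
    """Tangential part v - (v.n) n of a vector v for a unit normal n."""
    vn = sum(vi * ni for vi, ni in zip(v, n))
    return [vi - vn * ni for vi, ni in zip(v, n)]


def frobenius(a, b):
    """Frobenius inner product a : b."""
    return sum(a[i][j] * b[i][j] for i in range(len(a)) for j in range(len(a)))


d, q = 3, 1.75
q_identity = [[q if i == j else 0.0 for j in range(d)] for i in range(d)]
n = [2.0 / 3.0, -1.0 / 3.0, 2.0 / 3.0]  # a unit normal (illustrative)
grad_v = [[0.5, 1.0, -2.0], [3.0, -1.5, 0.25], [0.0, 4.0, 2.0]]  # sample velocity gradient
eps_v = [[0.5 * (grad_v[i][j] + grad_v[j][i]) for j in range(d)] for i in range(d)]
div_v = sum(grad_v[i][i] for i in range(d))

# Fact 1: the deviator of q*I is the zero matrix.
dev_norm = max(abs(x) for row in deviator(q_identity) for x in row)
# Fact 2: (q*I) n is purely normal, so its tangential part vanishes.
traction = [sum(q_identity[i][j] * n[j] for j in range(d)) for i in range(d)]
tang_norm = max(abs(x) for x in tangential(traction, n))
# Fact 3: q*I : eps(v) equals q * div(v).
pairing_gap = abs(frobenius(q_identity, eps_v) - q * div_v)
print(dev_norm, tang_norm, pairing_gap)
```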

Remark 2.2

We use discontinuous polynomial space of order k and $H(\textrm{div};\Omega )$-conforming space of order $k+1$ to approximate the stress and velocity, respectively. This choice in conjunction with the use of $\varvec{P_M}$ can guarantee the optimal convergence error estimates for all the variables as well as the robustness of the error estimates with respect to $\mu $. In contrast, the scheme proposed in [21, 22] uses the same polynomial order k for the approximation of stress and velocity, and the stress converges at the rate $\mathcal {O}(h^k)$ in the $L^2$-norm. Compared to the scheme presented in [11, 35], our scheme computes the physical variables of interest directly, which is important in some applications.

Remark 2.3

In contrast to the divergence-conforming schemes in velocity gradient-velocity-pressure formulation [12] and velocity-pressure formulation [27], we use the stress-velocity formulation with strongly symmetric stress and eliminate the pressure via the incompressibility condition, which enables us to compute the physical quantities of interest directly without resorting to postprocessing. The proposed scheme also provides a unified framework for solving the Stokes equations and the elasticity problem. As such, our scheme can be easily extended to solve multiphysical problems such as the fluid-structure interaction problem with a natural incorporation of the interface conditions. Moreover, owing to the use of the stress-velocity formulation, the proof of the pressure-independent error estimates is not trivial, and the developed methodologies can provide new perspectives for other pressure-robust discretizations.

2.2 Main results

In this subsection we state the main results; the proofs are given in Sects. 4 and 5.

Theorem 2.2

(Discrete $H^1$ stability of the Stokes equations) There exists a unique solution to (2.10)–(2.12). In addition, the following estimates hold

$$\begin{aligned} \Vert (\varvec{u}_h,\widehat{\varvec{u}}_h)\Vert _{1,h}\le C \Big (\Vert \mu ^{-1}\mathcal {A}\underline{\sigma }_h\Vert _{L^2(\mathcal {T}_h)}+\mu ^{-\frac{1}{2}}\Vert \tau ^{\frac{1}{2}}(\varvec{P_M}\varvec{u}_h^t-\varvec{\widehat{u}}_h)\Vert _{L^2(\partial \mathcal {T}_h)}\Big ) \end{aligned}$$

(2.15)

and

$$\begin{aligned} \mu \Vert (\varvec{u}_h,\widehat{\varvec{u}}_h)\Vert _{1,h}&\le C \Vert {\mathbb {P}(\varvec{f}_S)}\Vert _{L^2(\mathcal {T}_h)},\\ \Vert \underline{\sigma }_h\Vert _{L^2(\mathcal {T}_h)}&\le C \Vert \varvec{f}_S\Vert _{L^2(\mathcal {T}_h)}. \end{aligned}$$

Let $ \underline{e_\sigma } =\underline{\Pi _\Sigma } \underline{\sigma }-\underline{\sigma }_h,\; \varvec{e_u} = \varvec{\Pi _U}\varvec{u}-\varvec{u}_h,\; \varvec{e_{\widehat{u}}}=\varvec{P_M}\varvec{u}^t-\varvec{\widehat{u}}_h $, where $\underline{\Pi _\Sigma }$ and $ \varvec{\Pi _U}$ are projection operators defined in Sect. 4. Then the following convergence error estimates hold.

Theorem 2.3

(Error estimates of the Stokes equations) Let $(\underline{\sigma },\varvec{u})$ be the solution of (2.1)–(2.2) and let $(\underline{\sigma }_h,\varvec{u}_h,\varvec{\widehat{u}}_h)\in \underline{\Sigma }_h\times \varvec{U}_h\times \varvec{\widehat{U}}_h$ be the discrete solution of (2.10)–(2.12). Assume in addition that $(\underline{\sigma },\varvec{u})\in \underline{H}^{t}(\Omega )\times \varvec{H}^s(\Omega ), \frac{1}{2}< t\le k+1, \frac{3}{2}< s\le k+2$. Then the following error estimate holds

$$\begin{aligned} \begin{aligned}&\left\| (2\mu )^{-\frac{1}{2}}\mathcal {A}\underline{e_\sigma }\right\| _{L^2(\mathcal {T}_h)}+\left\| \tau ^{\frac{1}{2}}(\varvec{P_M}\varvec{e_u}^t-\varvec{e_{\widehat{u}}})\right\| _{L^2(\partial \mathcal {T}_h)}\\&\quad \le C \Big (h^t \mu ^{-\frac{1}{2}} |\mathcal {A}\underline{\sigma }|_{H^t(\Omega )}+\mu ^{\frac{1}{2}}h^{s-1}|\varvec{u}|_{H^s(\Omega )}\Big )\\&\quad \le C \mu ^{\frac{1}{2}}h^{s-1}|\varvec{u}|_{H^s(\Omega )}{.} \end{aligned} \end{aligned}$$

(2.16)

In addition, it holds that

$$\begin{aligned} \Vert \varepsilon (\varvec{e_u})\Vert _{L^2(\mathcal {T}_h)}&\le C h^{s-1}|\varvec{u}|_{H^s(\Omega )},\quad \Vert (\varvec{e_u},\varvec{e_{\widehat{u}}})\Vert _{1,h}\le C h^{s-1}|\varvec{u}|_{H^s(\Omega )}. \end{aligned}$$

(2.17)

We remark that the polynomial degree of the tangential trace of velocity is one order lower than that of the approximated velocity, and the globally coupled system only involves the normal trace and tangential trace of velocity, and the piecewise constant approximation for the trace of the stress, as shown in the next section. Therefore, with regard to the degrees of freedom of the globally coupled unknowns, (2.17) indicates that superconvergence is obtained for the discrete $H^1$-norm of velocity.

Theorem 2.4

($L^2$-error for stress) Under the assumptions of Theorem 2.3, the following error estimate holds for the Stokes equations

$$\begin{aligned} \Vert \underline{e_\sigma }\Vert _{L^2(\mathcal {T}_h)}\le C \Big (h^t|\mathcal {A}\underline{\sigma }|_{H^t(\Omega )}+\mu h^{s-1}|\varvec{u}|_{H^s(\Omega )}\Big ). \end{aligned}$$

Theorem 2.5

($L^2$-error for velocity) Let $(\underline{\sigma },\varvec{u})$ be the solution of (2.1)–(2.2) and let $(\underline{\sigma }_h,\varvec{u}_h,\varvec{\widehat{u}}_h)\in \underline{\Sigma }_h\times \varvec{U}_h\times \varvec{\widehat{U}}_h$ be the discrete solution of (2.10)–(2.12). In addition, we assume that $(\underline{\sigma },\varvec{u})\in \underline{H}^{s-1}(\Omega )\times \varvec{H}^s(\Omega ), \frac{3}{2}< s\le k+2$. Then the following error estimate holds

$$\begin{aligned} \Vert \varvec{u}-\varvec{u}_h\Vert _{L^2(\Omega )}\le C h^s|\varvec{u}|_{H^s(\Omega )}. \end{aligned}$$

Remark 2.4

We can observe from Theorems 2.3 and 2.5 that the convergence error estimates for velocity are independent of $\mu $, which illustrates the robustness of the scheme with respect to the viscosity. The robust error estimates for velocity-pressure formulation are generally easier to obtain. However, the standard error estimates for stress-velocity formulation will lead to the dependence of the error estimates on $\underline{\sigma }$, which is linked to the pressure variable (cf. (1.1)). To illustrate the independence of the error estimates on $\mu $ and p, the key tools are to use (4.1) and observe that $(\mathcal {A} \underline{\sigma }\varvec{n})^t_{|F}= (\underline{\sigma }\varvec{n})^t_{|F}$ for $F\in \mathcal {F}_h$ owing to the fact that $(\textrm{tr}(\underline{\sigma })\underline{I}\varvec{n})^t_{|F}=\varvec{0}$.
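
The identity $(\mathcal {A} \underline{\sigma }\varvec{n})^t=(\underline{\sigma }\varvec{n})^t$ can be checked numerically: removing $\frac{1}{d}\textrm{tr}(\underline{\sigma })\underline{I}$ changes only the normal component of the traction. The following is a minimal sketch, assuming $\mathcal {A}\underline{\sigma }=\underline{\sigma }-\frac{1}{d}\textrm{tr}(\underline{\sigma })\underline{I}$ as suggested by (4.14). The random symmetric matrix and the random unit normal are illustrative data, not quantities from the paper.

```python
import math
import random


def deviator(s):
    """Deviatoric part A s = s - (tr(s)/d) I."""
    d = len(s)
    mean = sum(s[i][i] for i in range(d)) / d
    return [[s[i][j] - (mean if i == j else 0.0) for j in range(d)] for i in range(d)]


def matvec(s, n):
    """Matrix-vector product s n."""
    return [sum(s[i][j] * n[j] for j in range(len(n))) for i in range(len(s))]


def tangential(v, n):
    """Tangential part v - (v.n) n for a unit normal n."""
    vn = sum(vi * ni for vi, ni in zip(v, n))
    return [vi - vn * ni for vi, ni in zip(v, n)]


random.seed(0)
d = 3
upper = [[random.uniform(-1.0, 1.0) for _ in range(d)] for _ in range(d)]
sigma = [[upper[min(i, j)][max(i, j)] for j in range(d)] for i in range(d)]  # symmetric
raw = [random.uniform(-1.0, 1.0) for _ in range(d)]
length = math.sqrt(sum(x * x for x in raw))
n = [x / length for x in raw]  # unit normal

# Tangential tractions of sigma and of its deviator coincide.
t_full = tangential(matvec(sigma, n), n)
t_dev = tangential(matvec(deviator(sigma), n), n)
tangential_gap = max(abs(a - b) for a, b in zip(t_full, t_dev))

# The normal tractions differ exactly by tr(sigma)/d.
normal_full = sum(a * b for a, b in zip(matvec(sigma, n), n))
normal_dev = sum(a * b for a, b in zip(matvec(deviator(sigma), n), n))
mean_trace = sum(sigma[i][i] for i in range(d)) / d
normal_gap = abs((normal_full - normal_dev) - mean_trace)
print(tangential_gap, normal_gap)
```

Only the normal traction sees the trace of the stress, which is why the tangential facet terms can be estimated through $\mathcal {A}\underline{\sigma }$ alone.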

Theorem 2.6

(Convergence to weak solution for the Navier–Stokes equations) Let $\{(\underline{\sigma }_h,\varvec{u}_h)\}_{h>0}$ be the sequence of the approximated solutions generated by solving the discrete formulation (2.13). Then, as $h\rightarrow 0$, it holds

$$\begin{aligned} \varvec{u}_h&\rightarrow \varvec{u} \quad \text {in}\;\varvec{L}^2(\Omega ),\\ \underline{\sigma }_h&\rightarrow \underline{\sigma } \quad \text {in}\;\underline{L}^2(S,\Omega ),\\ \Vert \varvec{u}-\varvec{u}_h\Vert _h&\rightarrow 0, \end{aligned}$$

where $\Vert \cdot \Vert _{h}$ is defined in (2.8) and $(\underline{\sigma },\varvec{u} )\in X$ is the unique solution to the weak formulation (2.5)–(2.6).

3 A characterization of the proposed HDG scheme

In this section, we first describe the local solvers and the corresponding global solvers for the proposed HDG method, where the global solvers involve the normal trace and the tangential trace of velocity, and the piecewise constant approximation for the trace of the stress.

We can follow [12, 19, 29] to relax the $H(\textrm{div})$-conformity of the velocity field via Lagrange multipliers. To begin, we derive the following formulation with relaxed $H(\textrm{div})$-conformity: Find $(\underline{\sigma }_h,\varvec{u}_h,{\delta _h},\varvec{\widehat{u}}_h,\lambda _h)\in \underline{\Sigma }_h\times \varvec{U}_h^*\times U_h^n\times \varvec{\widehat{U}}_h\times M_h^\partial $ such that

$$\begin{aligned}&((2\mu )^{-1}\mathcal {A}\underline{\sigma }_h,\underline{w})_{\mathcal {T}_h}+(\varvec{u}_h,\textrm{div}\underline{w})_{\mathcal {T}_h}-\sum _{F\in \mathcal {F}_h}\langle {\delta _h}\varvec{n},\llbracket (\underline{w}\varvec{n})^n\rrbracket \rangle _{F}\\&\quad -\sum _{F\in \mathcal {F}_h\backslash \partial \Omega _D}\langle \varvec{\widehat{u}}_h,\llbracket (\underline{w}\varvec{n})^t\rrbracket \rangle _{F}=0,\\&\quad - (\underline{\sigma }_h, \varepsilon (\varvec{v}))_{\mathcal {T}_h}+\langle \underline{\sigma }_h\varvec{n}+\lambda _h\varvec{n},\varvec{v}\rangle _{\partial \mathcal {T}_h}-\langle \tau (\varvec{P_M} \varvec{u}_h^t -\varvec{\widehat{u}}_h),\varvec{v}^t\rangle _{\partial \mathcal {T}_h}=-(\varvec{f}_S,\varvec{v})_{\mathcal {T}_h},\\&\langle (\underline{\sigma }_h\varvec{n})^t,\varvec{\widehat{v}}\rangle _{\partial \mathcal {T}_h\backslash \partial \Omega _D}-\langle \tau (\varvec{P_M} \varvec{u}_h^t-\varvec{\widehat{u}}_h), \varvec{\widehat{v}}\rangle _{\partial \mathcal {T}_h}=0,\\&\langle (\underline{\sigma }_h\varvec{n})^n+\lambda _h\varvec{n}, \chi \varvec{n}\rangle _{\partial \mathcal {T}_h\backslash \partial \Omega _D} =0,\\&\langle \varvec{u}_h^n-{\delta _h}\varvec{n},\mu \varvec{n}\rangle _{\partial \mathcal {T}_h\backslash \partial \Omega _N} =0 \end{aligned}$$

for $(\underline{w},\varvec{v},\chi ,\varvec{\widehat{v}},\mu )\in \underline{\Sigma }_h\times \varvec{U}_h^*\times U_h^n\times \varvec{\widehat{U}}_h\times M_h^\partial $. Here the spaces are defined as

$$\begin{aligned} \varvec{U}_h^*&:=\{\varvec{v}\in \varvec{P}_{k+1}(K),\forall K\in \mathcal {T}_h\},\\ M_h^{\partial }&:=\{\mu \in L^2(\partial \mathcal {T}_h): \mu _{|F}\in {P_{k+1}(F)}, F\subset \partial K,\forall K\in \mathcal {T}_h;\mu _{|F}=0,\forall F\in \partial \Omega _N\},\\ U_h^n&:= \{\chi \in P_{k+1}(F), \forall F\in \mathcal {F}_h; \chi =0 \;\text{ on }\;\partial \Omega _D \}. \end{aligned}$$

Then we can follow [19] to define the local solvers, and the resulting global system only involves the interface variables.

For $\underline{\sigma }_h\in \underline{\Sigma }_h$, let us define $(\bar{\sigma }_0)_{|K}:=\frac{1}{d{ |K|}}\int _K \textrm{tr}(\underline{\sigma }_h)$. Then we let

$$\begin{aligned} \bar{\underline{\sigma }}_h:=\underline{\sigma }_h-\bar{\sigma }_0\underline{I}. \end{aligned}$$

It follows that $\int _K \textrm{tr}( \bar{\underline{\sigma }}_h)=0$. We define the following local spaces for $K\in \mathcal {T}_h$:

$$\begin{aligned} \underline{\Sigma }(K)&:=\{\underline{w}\in \underline{P}_{k}(S,K); {\int _K \textrm{tr}(\underline{w})=0}\}, \\ \varvec{U}(K)&:=\{\varvec{v}\in \varvec{P}_{k+1}(K)\},\; M_h^{\partial }(K):=\{ \mu _{|\partial K}\in {P_{k+1}}(\partial K)\}. \end{aligned}$$
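
The zero-mean-trace constraint built into $\underline{\Sigma }(K)$ is exactly what the shift by $\bar{\sigma }_0\underline{I}$ produces, since $\textrm{tr}(\bar{\sigma }_0\underline{I})=d\,\bar{\sigma }_0$. The sketch below illustrates this on a single cell. It makes several assumptions for illustration only: the cell is the unit square (so $d=2$ and $|K|=1$), the stress `sigma_h` is a made-up symmetric polynomial field, and integrals use a tensor-product Gauss rule.

```python
import math

# Two-point Gauss rule on [0, 1]; exact for polynomials up to degree 3 per variable.
GAUSS = [(0.5 - 0.5 / math.sqrt(3.0), 0.5), (0.5 + 0.5 / math.sqrt(3.0), 0.5)]


def integrate(f):
    """Tensor-product Gauss quadrature of f(x, y) over the unit square."""
    return sum(wx * wy * f(x, y) for x, wx in GAUSS for y, wy in GAUSS)


def sigma_h(x, y):
    """Hypothetical symmetric polynomial stress on the cell (illustration only)."""
    return [[1.0 + x, y], [y, 2.0 - 3.0 * y]]


def trace(s):
    """Trace of a square matrix given as a list of lists."""
    return sum(s[i][i] for i in range(len(s)))


d, area = 2, 1.0
# Cell value of sigma_bar_0: the mean of tr(sigma_h) divided by d.
sigma_bar_0 = integrate(lambda x, y: trace(sigma_h(x, y))) / (d * area)


def sigma_bar_h(x, y):
    """Stress with the cell-mean trace removed: sigma_h - sigma_bar_0 * I."""
    s = sigma_h(x, y)
    return [[s[i][j] - (sigma_bar_0 if i == j else 0.0) for j in range(d)] for i in range(d)]


# Integral of tr(sigma_bar_h) over the cell; it should vanish up to rounding.
residual = integrate(lambda x, y: trace(sigma_bar_h(x, y)))
print(sigma_bar_0, residual)
```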

Given $\delta _h\in U_h^n, \widehat{\varvec{u}}_h\in \widehat{\varvec{U}}_h$ and $\varvec{f}_S\in L^2(\Omega )$, the local solvers are defined as: Find $(\bar{\underline{\sigma }}_h,\varvec{u}_h,\lambda _h)\in \underline{\Sigma }(K)\times \varvec{U}(K)\times M_h^{\partial }(K)$ such that

$$\begin{aligned}&((2\mu )^{-1}\mathcal {A}{\bar{\underline{\sigma }}_h},\underline{w})_{K}+(\varvec{u}_h,\textrm{div}\underline{w})_{K}=\langle {\delta _h}\varvec{n},(\underline{w}\varvec{n})^n\rangle _{\partial K}+\langle \varvec{\widehat{u}}_h,(\underline{w}\varvec{n})^t\rangle _{\partial K\backslash \partial \Omega _D}, \end{aligned}$$

(3.1)

$$\begin{aligned}&(\textrm{div}{\bar{\underline{\sigma }}_h}, \varvec{v})_{K}+\langle \lambda _h,\varvec{v}\cdot \varvec{n}\rangle _{\partial K\backslash \partial \Omega _N}-\langle \tau \varvec{P_M} \varvec{u}_h^t,\varvec{v}^t\rangle _{\partial K}=-(\varvec{f}_S,\varvec{v})_{K}-\langle \tau \varvec{\widehat{u}}_h,\varvec{v}^t\rangle _{\partial K},\end{aligned}$$

(3.2)

$$\begin{aligned}&\langle \varvec{u}_h^n-{\delta _h}\varvec{n},\mu \varvec{n}\rangle _{\partial K\backslash \partial \Omega _N} =0 \end{aligned}$$

(3.3)

for $(\underline{w},\varvec{v},\mu )\in \underline{\Sigma }(K)\times \varvec{U}(K)\times M_h^{\partial }(K)$.

When we set $\varvec{f}_S=\varvec{0}$ in (3.1)–(3.3), the solution to the local problem is denoted by $(\underline{\sigma }_h^{(\delta _h,\varvec{\widehat{u}}_h)}, \varvec{u}_h^{(\delta _h,\varvec{\widehat{u}}_h)},\lambda _h^{(\delta _h,\varvec{\widehat{u}}_h)})$, where the superscript indicates the dependence of the solution on $(\delta _h,\varvec{\widehat{u}}_h)$. Similarly, when we set $\delta _h=0,\varvec{\widehat{u}}_h=\varvec{0}$, the solution to the local problem is denoted by $(\underline{\sigma }_h^{\varvec{f}},\varvec{u}_h^{\varvec{f}},\lambda _h^{\varvec{f}})$, where the superscript indicates the dependence of the solution on $\varvec{f}_S$. By linearity, the solution to (3.1)–(3.3) can then be written as $(\underline{\sigma }_h^{(\delta _h,\varvec{\widehat{u}}_h)}, \varvec{u}_h^{(\delta _h,\varvec{\widehat{u}}_h)},\lambda _h^{(\delta _h,\varvec{\widehat{u}}_h)})+(\underline{\sigma }_h^{\varvec{f}},\varvec{u}_h^{\varvec{f}},\lambda _h^{\varvec{f}})$. We use $(\widetilde{\underline{\sigma }}_h,\widetilde{\varvec{u}}_h,\lambda _h)_K$ to represent the restriction of $(\widetilde{\underline{\sigma }}_h,\widetilde{\varvec{u}}_h,\lambda _h)$ to each element $K\in \mathcal {T}_h$, that is,

$$\begin{aligned} (\widetilde{\underline{\sigma }}_h,\widetilde{\varvec{u}}_h,\lambda _h)_K:=(\underline{\sigma }_h^{(\delta _h,\varvec{\widehat{u}}_h)}, \varvec{u}_h^{(\delta _h,\varvec{\widehat{u}}_h)},\lambda _h^{(\delta _h,\varvec{\widehat{u}}_h)})+(\underline{\sigma }_h^{\varvec{f}},\varvec{u}_h^{\varvec{f}},\lambda _h^{\varvec{f}}). \end{aligned}$$
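
The splitting above is the superposition principle for a linear local problem: the right-hand side of (3.1)–(3.3) depends linearly on the interface data $(\delta _h,\varvec{\widehat{u}}_h)$ and on $\varvec{f}_S$, so the two contributions can be solved for separately and added. The toy sketch below shows only this mechanism. The $2\times 2$ matrix `LOCAL_OPERATOR` and the way the data enter the right-hand side are hypothetical stand-ins, not the actual element matrices of the method.

```python
def solve_2x2(m, rhs):
    """Solve a 2x2 linear system by Cramer's rule."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [
        (rhs[0] * m[1][1] - m[0][1] * rhs[1]) / det,
        (m[0][0] * rhs[1] - rhs[0] * m[1][0]) / det,
    ]


# Hypothetical stand-in for the local operator on one element.
LOCAL_OPERATOR = [[4.0, 1.0], [1.0, 3.0]]


def local_solve(trace_data, source):
    """Toy local solver: the right-hand side is linear in (trace_data, source)."""
    rhs = [
        2.0 * trace_data[0] - trace_data[1] - source[0],
        trace_data[0] + 0.5 * trace_data[1] - source[1],
    ]
    return solve_2x2(LOCAL_OPERATOR, rhs)


trace_data, source = [1.0, -2.0], [0.5, 3.0]
full = local_solve(trace_data, source)
from_trace = local_solve(trace_data, [0.0, 0.0])  # role of the (delta_h, u_hat_h) part
from_source = local_solve([0.0, 0.0], source)     # role of the f part
split_gap = max(abs(a - (b + c)) for a, b, c in zip(full, from_trace, from_source))
print(split_gap)
```

In an implementation, the part driven by the interface data is what gets inserted into the global equations, while the part driven by the source moves to the global right-hand side.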

Let us define $\bar{P}_h:=\{q_0\in L^2(\Omega ): (q_0)_{|K}\in P_0(K), \forall K\in \mathcal {T}_h;\int _\Omega q_0=0\}$. The global problem is to find $({\delta _h},\varvec{\widehat{u}}_h,\bar{\sigma }_0)\in U_h^n\times \varvec{\widehat{U}}_h\times \bar{P}_h$ such that

$$\begin{aligned} \langle&((\underline{\sigma }_{h}^{(\delta _h,\varvec{\widehat{u}}_h)}+{\bar{\sigma }_0 \underline{I}})\varvec{n})^n+\lambda _h^{(\delta _h,\varvec{\widehat{u}}_h)}\varvec{n}, \chi \varvec{n}\rangle _{\partial \mathcal {T}_h}= -\langle (\underline{\sigma }_{h}^{\varvec{f}}\varvec{n})^n+\lambda _h^{\varvec{f}}\varvec{n}, \chi \varvec{n}\rangle _{\partial \mathcal {T}_h}, \end{aligned}$$

(3.4)

$$\begin{aligned}&\langle (\underline{\sigma }_{h}^{(\delta _h,\varvec{\widehat{u}}_h)}\varvec{n})^t,\varvec{\widehat{v}}\rangle _{\partial \mathcal {T}_h\backslash \partial \Omega _D}-\langle \tau (\varvec{P_M} (\varvec{u}_{h}^{(\delta _h,\varvec{\widehat{u}}_h)})^t-\varvec{\widehat{u}}_h), \varvec{\widehat{v}}\rangle _{\partial \mathcal {T}_h}=-\langle (\underline{\sigma }_{h}^{\varvec{f}}\varvec{n})^t,\varvec{\widehat{v}}\rangle _{\partial \mathcal {T}_h\backslash \partial \Omega _D}, \end{aligned}$$

(3.5)

$$\begin{aligned}&{\langle \delta _h\varvec{n}, (\bar{q}_0\underline{I}\varvec{n})^n\rangle _{\partial \mathcal {T}_h}}=0. \end{aligned}$$

(3.6)

for $(\chi ,\varvec{\widehat{v}},{\bar{q}_0})\in U_h^n\times \varvec{\widehat{U}}_h\times {\bar{P}_h}$.

Note that $(\widetilde{\underline{\sigma }}_h{+\bar{\sigma }_0 \underline{I}},\widetilde{\varvec{u}}_h)$ is the solution to the original problem (2.10)–(2.12).

Remark 3.1

We observe that the globally coupled unknowns involve the normal trace and tangential trace of velocity, and the piecewise constant approximation $\bar{\sigma }_0$. Thus, the matrix associated with the resulting global system has a saddle-point structure. One can use the techniques introduced in [32] to improve the computational efficiency of the resulting saddle-point system. Meanwhile, the numerical velocity is exactly divergence-free, and thus it will naturally yield error estimates for velocity independent of the pressure variable and of the viscosity, which makes it possible to solve incompressible flows with high Reynolds number (cf. [23]).

4 Error analysis for the Stokes equations

In this section, we prove Theorems 2.2–2.5 on the discrete $H^1$ stability and the convergence error estimates for all the involved variables.

Let $\underline{\Pi _\Sigma }$ represent the $L^2$-orthogonal projection onto $\underline{\Sigma }_h$. The $L^2$ projection property of $\underline{\Pi _\Sigma }$ implies that

$$\begin{aligned} \mathcal {A} (\underline{\Pi _\Sigma }\underline{w}) = \underline{\Pi _\Sigma }(\mathcal {A}\underline{w})\quad \forall \underline{w}\in \underline{L}^2(\Omega ). \end{aligned}$$

(4.1)
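
The commuting property (4.1) holds because the $L^2$ projection acts entrywise and linearly on a matrix field, while $\mathcal {A}$ only forms fixed linear combinations of the entries. The sketch below illustrates this under several assumptions made for illustration only: the cell is the unit square, the projection is onto $P_1$ in each entry via an orthonormal basis, and inner products use a Gauss rule. The quadrature is not exact for the non-polynomial test field `w`, but the discrete projection remains linear, which is all the identity needs.

```python
import math

# Three-point Gauss rule on [0, 1]; exact for polynomials up to degree 5.
_r = 0.5 * math.sqrt(0.6)
GAUSS = [(0.5 - _r, 5.0 / 18.0), (0.5, 8.0 / 18.0), (0.5 + _r, 5.0 / 18.0)]

# L2-orthonormal basis of P_1 on the unit square (shifted Legendre polynomials).
BASIS = [
    lambda x, y: 1.0,
    lambda x, y: math.sqrt(3.0) * (2.0 * x - 1.0),
    lambda x, y: math.sqrt(3.0) * (2.0 * y - 1.0),
]


def project(f):
    """Quadrature-based L2 projection of a scalar f(x, y) onto P_1."""
    coeffs = [
        sum(wx * wy * f(x, y) * phi(x, y) for x, wx in GAUSS for y, wy in GAUSS)
        for phi in BASIS
    ]
    return lambda x, y: sum(c * phi(x, y) for c, phi in zip(coeffs, BASIS))


def project_matrix(w, d=2):
    """Apply the scalar projection to every entry of a matrix field."""
    entries = [
        [project(lambda x, y, i=i, j=j: w(x, y)[i][j]) for j in range(d)]
        for i in range(d)
    ]
    return lambda x, y: [[entries[i][j](x, y) for j in range(d)] for i in range(d)]


def deviator(s):
    """Deviatoric part A s = s - (tr(s)/d) I."""
    d = len(s)
    mean = sum(s[i][i] for i in range(d)) / d
    return [[s[i][j] - (mean if i == j else 0.0) for j in range(d)] for i in range(d)]


def w(x, y):
    """Hypothetical symmetric, non-polynomial test field."""
    return [[math.sin(x + y), x * y * y], [x * y * y, math.exp(x) - y]]


proj_w = project_matrix(w)                                    # Pi w
proj_dev_w = project_matrix(lambda x, y: deviator(w(x, y)))   # Pi (A w)
points = [(0.1, 0.2), (0.7, 0.3), (0.5, 0.9)]
commute_gap = max(
    abs(a - b)
    for x, y in points
    for row_a, row_b in zip(deviator(proj_w(x, y)), proj_dev_w(x, y))
    for a, b in zip(row_a, row_b)
)
print(commute_gap)
```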

In addition, the following standard convergence error estimates hold for any $\underline{w}\in \underline{H}^{k+1}(K),K\in \mathcal {T}_h$

$$\begin{aligned} \Vert \underline{w}-\underline{\Pi _\Sigma }\underline{w}\Vert _{L^2(K)}&\le C h_K^{k+1}|\underline{w}|_{H^{k+1}(K)}, \end{aligned}$$

(4.2)

$$\begin{aligned} \Vert \nabla (\underline{w}-\underline{\Pi _\Sigma }\underline{w})\Vert _{L^2(K)}&\le C h_K^{k}|\underline{w}|_{H^{k+1}(K)}. \end{aligned}$$

(4.3)

Let $\varvec{\Pi _U}: \varvec{H}(\textrm{div};\Omega )\cap \varvec{L}^p(\Omega )\rightarrow \varvec{U}_h, p>2$ represent the Brezzi–Douglas–Marini (BDM) projection defined by (cf. [5])

$$\begin{aligned} \langle (\varvec{v}-\varvec{\Pi _U}\varvec{v})\cdot \varvec{n}, p_{k+1}\rangle _F&=0 \quad \forall p_{k+1}\in P_{k+1}(F), F\in \mathcal {F}_h,\\ (\varvec{v}-\varvec{\Pi _U} \varvec{v}, \nabla p_{k})_K&=0 \quad \forall p_{k}\in P_{k}(K), K\in \mathcal {T}_h,\\ (\varvec{v}-\varvec{\Pi _U} \varvec{v}, \varvec{b})_K&=0 \quad \forall \varvec{b}\in \varvec{B}_{k+1}(K), K\in \mathcal {T}_h, \end{aligned}$$

where $\varvec{B}_{k+1}(K)$ is the set of polynomials in $\varvec{P}_{k+1}(K)$ that are divergence-free and whose normal component is zero on $\partial K$.

In addition, the following error estimates hold for $\varvec{v}\in \varvec{H}^{k+2}(K), K\in \mathcal {T}_h$ (cf. [5])

$$\begin{aligned} \Vert \varvec{v}-\varvec{\Pi _{U}}\varvec{v}\Vert _{L^2(K)}&\le C h_K^{k+2}| \varvec{v}|_{H^{k+2}(K)}, \end{aligned}$$

(4.4)

$$\begin{aligned} \Vert \nabla (\varvec{v}-\varvec{\Pi _U}\varvec{v})\Vert _{L^2(K)}&\le C h_K^{k+1}| \varvec{v}|_{H^{k+2}(K)}. \end{aligned}$$

(4.5)

We also recall the following trace inequality (cf. [1])

$$\begin{aligned} \Vert q\Vert _{L^2(F)}\le C \Big (h_K^{-\frac{1}{2}}\Vert q\Vert _{L^2(K)}+h_K^{\frac{1}{2}}\Vert \nabla q\Vert _{L^2(K)}\Big )\quad \forall q\in H^1(K),K\in \mathcal {T}_h, F\subset \partial K \end{aligned}$$

(4.6)

and the following discrete trace inequality

$$\begin{aligned} \Vert \varvec{v}\Vert _{L^p(F)}\le C h_K^{-1/p}\Vert \varvec{v}\Vert _{L^p(K)} \quad 1\le p\le \infty ,\;\forall \varvec{v}\in \varvec{U}_h. \end{aligned}$$

(4.7)

For later analysis, we introduce the space of the rigid body motions

$$\begin{aligned} \varvec{\varTheta }_h:=\{\varvec{\Lambda } \in \varvec{L}^2(\Omega ), \varvec{\Lambda }_{|K}=\underline{B}_{K} \varvec{x}+\varvec{b}_K, \underline{B}_K\in \mathbb {B}, \varvec{b}_K\in \mathbb {R}^d, K\in \mathcal {T}_h\}, \end{aligned}$$

where $\mathbb {B}$ represents the set of all anti-symmetric matrices in $\mathbb {R}^{d\times d}$. More precisely, $\mathbb {B}$ in $\mathbb {R}^2$ and $\mathbb {R}^3$ can be respectively represented as

$$\begin{aligned} \begin{pmatrix} 0 &{} -s\\ s &{} 0 \end{pmatrix},\quad \begin{pmatrix} 0 &{} s_3&{} -s_2\\ -s_3 &{} 0 &{} s_1\\ s_2 &{} -s_1 &{} 0 \end{pmatrix} \end{aligned}$$

with constants $s,s_i\in \mathbb {R},i=1,2,3$.
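
Rigid body motions are exactly the affine fields with vanishing symmetric gradient: the gradient of $\underline{B}\varvec{x}+\varvec{b}$ is $\underline{B}$, and $\frac{1}{2}(\underline{B}+\underline{B}^T)=\underline{0}$ for anti-symmetric $\underline{B}$. Hence $\varepsilon (\varvec{v}+\varvec{\Lambda })=\varepsilon (\varvec{v})$ for every $\varvec{\Lambda }\in \varvec{\varTheta }_h$ on each element. The short sketch below checks this for the two matrix forms displayed above; the function names and the values of $s$ and $s_i$ are illustrative.

```python
def symmetric_gradient_of_affine(B):
    """eps(B x + b) = (B + B^T) / 2, since the gradient of B x + b is B."""
    d = len(B)
    return [[0.5 * (B[i][j] + B[j][i]) for j in range(d)] for i in range(d)]


def skew_2d(s):
    """Anti-symmetric 2 x 2 matrix in the form displayed above."""
    return [[0.0, -s], [s, 0.0]]


def skew_3d(s1, s2, s3):
    """Anti-symmetric 3 x 3 matrix in the form displayed above."""
    return [[0.0, s3, -s2], [-s3, 0.0, s1], [s2, -s1, 0.0]]


strain_2d = symmetric_gradient_of_affine(skew_2d(1.3))
strain_3d = symmetric_gradient_of_affine(skew_3d(0.4, -2.0, 5.5))
max_strain = max(abs(x) for m in (strain_2d, strain_3d) for row in m for x in row)
print(max_strain)
```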

Following [31], we have the following lemma.

Lemma 4.1

Let $K\in \mathcal {T}_h$ with meshsize $h_K$ and $\varvec{\varTheta }(K):=(\varvec{\varTheta }_h)_{|K}$. Then for any function $\varvec{v}\in \varvec{U}_K$, where $\varvec{U}_K:=(\varvec{U}_h)_{|K}$, we have

$$\begin{aligned} \inf _{\varvec{\Lambda }\in \varvec{\varTheta }(K)}\Vert \nabla (\varvec{v}+\varvec{\Lambda })\Vert _{L^2(K)}\le C \Vert \varepsilon (\varvec{v})\Vert _{L^2(K)}. \end{aligned}$$

Now we can prove Theorem 2.2, which demonstrates the stability and unique solvability of the discrete formulation (2.10)–(2.12).

Proof

We first show the stability, then the uniqueness follows by setting $\varvec{f}_S=\varvec{0}$. As (2.10)–(2.12) is a square linear system, existence follows from uniqueness.

Taking $\underline{w}=\underline{\sigma }_h$, $\varvec{v}=\varvec{u}_h$ and $\varvec{\widehat{v}}=\varvec{\widehat{u}}_h$ in (2.10)–(2.12) and summing up the resulting equations, it follows from (2.9), (2.7) and (2.14)

$$\begin{aligned}&(2\mu )^{-1}\Vert \mathcal {A}\underline{\sigma }_h\Vert _{L^2(\Omega )}^2+\left\| \tau ^{\frac{1}{2}}(\varvec{P_M}\varvec{u}_h^t-\varvec{\widehat{u}}_h)\right\| _{L^2(\partial \mathcal {T}_h)}^2\nonumber \\&\quad =(\varvec{f}_S,\varvec{u}_h)=(\mathbb {P}(\varvec{f}_S),\varvec{u}_h)\le C {\Vert \mathbb {P}(\varvec{f}_S)\Vert _{L^2(\mathcal {T}_h)}\Vert (\varvec{u}_h,\widehat{\varvec{u}}_h)\Vert _{1,h}}. \end{aligned}$$

(4.8)

In the following, we prove (2.15). We can infer from (2.10) and integration by parts that

$$\begin{aligned}&((2\mu )^{-1}\mathcal {A}\underline{\sigma }_h,\underline{w})_{\mathcal {T}_h}-(\varepsilon (\varvec{u}_h), \underline{w})_{\mathcal {T}_h}+\langle \varvec{u}_h^t,(\underline{w}\varvec{n})^t\rangle _{\partial \mathcal {T}_h}\\ {}&\quad -\langle \varvec{\widehat{u}}_h,(\underline{w}\varvec{n})^t\rangle _{\partial \mathcal {T}_h\backslash \partial \Omega _D}=0\quad \forall \underline{w}\in \underline{\Sigma }_h, \end{aligned}$$

thus,

$$\begin{aligned} (\varepsilon (\varvec{u}_h), \underline{w})_{\mathcal {T}_h}= ((2\mu )^{-1}\mathcal {A}\underline{\sigma }_h,\underline{w})_{\mathcal {T}_h}+\langle \varvec{P_M}\varvec{u}_h^t- \varvec{\widehat{u}}_h,(\underline{w}\varvec{n})^t\rangle _{\partial \mathcal {T}_h}. \end{aligned}$$

Then setting $\underline{w}=\varepsilon _h(\varvec{u}_h)$, it follows from (4.7) that

$$\begin{aligned} \Vert \varepsilon (\varvec{u}_h)\Vert _{L^2(\mathcal {T}_h)}\le C \Big (\Vert \mu ^{-1}\mathcal {A}\underline{\sigma }_h\Vert _{L^2(\mathcal {T}_h)}+\Vert h^{-1/2}(\varvec{P_M}\varvec{u}_h^t-\varvec{\widehat{u}}_h)\Vert _{L^2(\partial \mathcal {T}_h)}\Big ). \end{aligned}$$

(4.9)

An appeal to (4.7) and Lemma 4.1 leads to

$$\begin{aligned} \Vert \varvec{u}_h-\varvec{P_M}\varvec{u}_h\Vert _{L^2(F)}&= \Vert \varvec{u}_h+(\underline{B}_K\varvec{x}+\varvec{b}_K)-\varvec{P_M}(\varvec{u}_h+(\underline{B}_K\varvec{x}+\varvec{b}_K))\Vert _{L^2(F)}\\&\le C h_K\Vert \nabla (\varvec{u}_h+\underline{B}_K\varvec{x}+\varvec{b}_K)\Vert _{L^2(\partial K)}\\&\le C h_K^{\frac{1}{2}}\Vert \nabla (\varvec{u}_h+\underline{B}_K\varvec{x}+\varvec{b}_K)\Vert _{L^2(K)}\\&\le Ch_K^{\frac{1}{2}} \Vert \varepsilon (\varvec{u}_h)\Vert _{L^2(K)},\quad \forall F\subset \partial K, K\in \mathcal {T}_h, \end{aligned}$$

where we use the fact that for $k\ge 1$, the projection applied to the rigid body motion is the identity.

As a result, we have

$$\begin{aligned} \begin{aligned}&\Vert h^{-1/2}(\varvec{u}_h^t-\varvec{\widehat{u}}_h)\Vert _{L^2(\partial \mathcal {T}_h)}\\&\quad \le \Vert h^{-1/2}(\varvec{P_M}\varvec{u}_h-\varvec{u}_h)\Vert _{L^2(\partial \mathcal {T}_h)}+\Vert h^{-1/2}(\varvec{P_M}\varvec{u}_h^t-\varvec{\widehat{u}}_h)\Vert _{L^2(\partial \mathcal {T}_h)}\\&\quad \le C \Big (\Vert \varepsilon (\varvec{u}_h)\Vert _{L^2(\mathcal {T}_h)}+\Vert h^{-1/2}(\varvec{P_M}\varvec{u}_h^t-\varvec{\widehat{u}}_h)\Vert _{L^2(\partial \mathcal {T}_h)}\Big ). \end{aligned} \end{aligned}$$

(4.10)

Let $\varvec{u}_h^*$ represent the $H^1$-conforming counterpart of ${\varvec{u}_h}$ that can be defined using the averaging operator introduced in [24] and $\varvec{u}_h^*=\varvec{0}$ on $\partial \Omega _D$; indeed, $\varvec{u}_h^*$ at the interior Lagrangian nodes can be defined by taking the average of $\varvec{u}_h$ at the nodes [24]. Then it holds

$$\begin{aligned} \Vert \nabla (\varvec{u}_h-\varvec{u}_h^*)\Vert _{L^2(\mathcal {T}_h)}\le C \Vert h^{-1/2}\llbracket \varvec{u}_h\rrbracket \Vert _{L^2(\mathcal {F}_h\backslash \partial \Omega _N)}. \end{aligned}$$

The triangle inequality and the Korn inequality yield

$$\begin{aligned} \begin{aligned}&\Vert (\varvec{u}_h,\varvec{\widehat{u}}_h)\Vert _{1,h}\\&\quad \le C\Big ( \Vert \nabla (\varvec{u}_h-\varvec{u}_h^*)\Vert _{L^2(\mathcal {T}_h)}+\Vert \nabla \varvec{u}_h^*\Vert _{L^2(\mathcal {T}_h)}+\Vert h^{-1/2}(\varvec{u}_h^t-\widehat{\varvec{u}}_h)\Vert _{L^2(\partial \mathcal {T}_h)}\Big )\\&\quad \le C\Big (\Vert h^{-1/2}\llbracket \varvec{u}_h\rrbracket \Vert _{L^2(\mathcal {F}_h\backslash \partial \Omega _N)}+\Vert \varepsilon (\varvec{u}_h^*)\Vert _{L^2(\mathcal {T}_h)}+\Vert h^{-1/2}(\varvec{u}_h^t-\widehat{\varvec{u}}_h)\Vert _{L^2(\partial \mathcal {T}_h)}\Big )\\&\quad \le C \Big (\Vert \varepsilon (\varvec{u}_h)\Vert _{L^2(\mathcal {T}_h)}+\Vert h^{-1/2}(\varvec{P_M}\varvec{u}_h^t-\varvec{\widehat{u}}_h)\Vert _{L^2(\partial \mathcal {T}_h)}\Big ), \end{aligned} \end{aligned}$$

(4.11)

where we use the fact that the normal component of $\varvec{u}_h$ is continuous, and thus the jump of $\varvec{u}_h$ only involves the tangential component. Combining (4.11) with (4.9) yields (2.15).

Then it follows from (2.15) and (4.8) that

$$\begin{aligned} \mu ^{-\frac{1}{2}}\Vert \mathcal {A}\underline{\sigma }_h\Vert _{L^2(\Omega )}+\Vert \tau ^{\frac{1}{2}}(\varvec{P_M}\varvec{u}_h^t-\varvec{\widehat{u}}_h)\Vert _{L^2(\partial \mathcal {T}_h)}&\le C\mu ^{-\frac{1}{2}} {\Vert \mathbb {P}(\varvec{f}_S)\Vert _{L^2(\mathcal {T}_h)}},\\ \mu \Vert (\varvec{u}_h,\widehat{\varvec{u}}_h)\Vert _{1,h}&\le C {\Vert \mathbb {P}(\varvec{f}_S)\Vert _{L^2(\mathcal {T}_h)}}.\nonumber \end{aligned}$$

(4.12)

Since $\textrm{tr}(\underline{\sigma }_h)\in L^2(\Omega )$, there exists a function $\varvec{\theta }\in \varvec{H}^1(\Omega )$ with $\varvec{\theta }\cdot \varvec{n}=0$ on $\partial \Omega _D$ (cf. [34]) such that

$$\begin{aligned} \begin{aligned} \nabla \cdot \varvec{\theta } = \textrm{tr}(\underline{\sigma }_h),\quad \Vert \varvec{\theta }\Vert _{H^1(\Omega )} \le C \Vert \textrm{tr}(\underline{\sigma }_h)\Vert _{L^2(\Omega )}. \end{aligned} \end{aligned}$$

(4.13)

Then we have

$$\begin{aligned} \Vert \textrm{tr}(\underline{\sigma }_h) \Vert _{L^2(\mathcal {T}_h)}^2&=(\textrm{tr}(\underline{\sigma }_h), \nabla \cdot \varvec{\Pi _U}\varvec{\theta }) =(\textrm{tr}(\underline{\sigma }_h)\underline{I}, \varepsilon (\varvec{\Pi _U}\varvec{\theta }))_{\mathcal {T}_h}\nonumber \\&=d\Big ((\underline{\sigma }_h, \varepsilon (\varvec{\Pi _U}\varvec{\theta }))_{\mathcal {T}_h}-(\mathcal {A}\underline{\sigma }_h,\varepsilon (\varvec{\Pi _U}\varvec{\theta }))_{\mathcal {T}_h}\Big ). \end{aligned}$$

(4.14)
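
The last equality in (4.14) is the pointwise identity $\textrm{tr}(\underline{\sigma })\underline{I}=d(\underline{\sigma }-\mathcal {A}\underline{\sigma })$ tested against $\varepsilon (\varvec{\Pi _U}\varvec{\theta })$, together with $\underline{I}:\varepsilon (\varvec{v})=\nabla \cdot \varvec{v}$. The following sketch checks both numerically for made-up symmetric matrices, again assuming $\mathcal {A}\underline{\sigma }=\underline{\sigma }-\frac{1}{d}\textrm{tr}(\underline{\sigma })\underline{I}$; the matrix `eps` stands in for the symmetric gradient at one point.

```python
def deviator(s):
    """Deviatoric part A s = s - (tr(s)/d) I."""
    d = len(s)
    mean = sum(s[i][i] for i in range(d)) / d
    return [[s[i][j] - (mean if i == j else 0.0) for j in range(d)] for i in range(d)]


def frobenius(a, b):
    """Frobenius inner product a : b."""
    return sum(a[i][j] * b[i][j] for i in range(len(a)) for j in range(len(a)))


def trace(s):
    """Trace of a square matrix given as a list of lists."""
    return sum(s[i][i] for i in range(len(s)))


d = 3
sigma = [[2.0, -1.0, 0.5], [-1.0, 4.0, 3.0], [0.5, 3.0, -1.5]]  # illustrative stress
eps = [[1.0, 0.25, 0.0], [0.25, -2.0, 1.5], [0.0, 1.5, 0.5]]    # illustrative strain

# Left side: tr(sigma) I : eps = tr(sigma) * tr(eps), where tr(eps) plays the role of div.
lhs = trace(sigma) * trace(eps)
# Right side: d * (sigma : eps - A sigma : eps), as in (4.14).
rhs = d * (frobenius(sigma, eps) - frobenius(deviator(sigma), eps))
identity_gap = abs(lhs - rhs)
print(identity_gap)
```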

It follows from (2.11)

$$\begin{aligned} (\underline{\sigma }_h, \varepsilon (\varvec{\Pi _U}\varvec{\theta }))_{\mathcal {T}_h}&=\langle (\mathcal {A}\underline{\sigma }_h\varvec{n})^t, (\varvec{\Pi _U}\varvec{\theta })^t\rangle _{\partial \mathcal {T}_h}\\&\quad -\langle \tau (\varvec{P_M} \varvec{u}_h^t-\varvec{\widehat{u}}_h),(\varvec{\Pi _U}\varvec{\theta })^t\rangle _{\partial \mathcal {T}_h}+(\varvec{f}_S,\varvec{\Pi _U}\varvec{\theta })_{\mathcal {T}_h}, \end{aligned}$$

which combined with (2.12) implies

$$\begin{aligned} (\underline{\sigma }_h, \varepsilon (\varvec{\Pi _U}\varvec{\theta }))_{\mathcal {T}_h}&=\langle (\mathcal {A}\underline{\sigma }_h\varvec{n})^t, (\varvec{\Pi _U}\varvec{\theta })^t-\varvec{P_M}\varvec{\theta }^t\rangle _{\partial \mathcal {T}_h}-\langle \tau (\varvec{P_M} \varvec{u}_h^t\\&\quad -\varvec{\widehat{u}}_h),(\varvec{\Pi _U}\varvec{\theta })^t{-\varvec{P_M}\varvec{\theta }^t}\rangle _{\partial \mathcal {T}_h}+(\varvec{f}_S,\varvec{\Pi _U}\varvec{\theta })_{\mathcal {T}_h}\\ {}&\le C \Big (\Vert \mathcal {A}\underline{\sigma }_h\Vert _{L^2(\mathcal {T}_h)}+\mu ^{\frac{1}{2}}\Vert \tau ^{\frac{1}{2}} (\varvec{P_M} \varvec{u}_h^t-\varvec{\widehat{u}}_h)\Vert _{L^2(\partial \mathcal {T}_h)}\\&\quad +\Vert \varvec{f}_S\Vert _{L^2(\mathcal {T}_h)}\Big )\Vert \varvec{\theta }\Vert _{H^1(\Omega )}. \end{aligned}$$

As such, it holds

$$\begin{aligned} \Vert \textrm{tr}(\underline{\sigma }_h)\Vert _{L^2(\mathcal {T}_h)}\le C \Big (\Vert \mathcal {A}\underline{\sigma }_h\Vert _{L^2(\mathcal {T}_h)}+\mu ^{\frac{1}{2}}\left\| \tau ^{\frac{1}{2}} (\varvec{P_M} \varvec{u}_h^t-\varvec{\widehat{u}}_h)\right\| _{L^2(\partial \mathcal {T}_h)}+\Vert \varvec{f}_S\Vert _{L^2(\mathcal {T}_h)}\Big ). \end{aligned}$$

This and (4.12) yield

$$\begin{aligned} \Vert \underline{\sigma }_h\Vert _{L^2(\mathcal {T}_h)}\le C \Vert \varvec{f}_S\Vert _{L^2(\mathcal {T}_h)}. \end{aligned}$$

$\square $

In the following, we will carry out the convergence error estimates for the proposed scheme.

Lemma 4.2

Let $(\underline{\sigma },\varvec{u})$ be the solution of (2.1)–(2.2) and let $(\underline{\sigma }_h,\varvec{u}_h,\varvec{\widehat{u}}_h)\in \underline{\Sigma }_h\times \varvec{U}_h\times \varvec{\widehat{U}}_h$ be the discrete solution of (2.10)–(2.12). Assume that $(\varvec{u},\underline{\sigma })\in \varvec{H}^{s}(\Omega )\times \underline{H}^{s-1}(\Omega ),s>\frac{3}{2}$. Then the following error equations hold

$$\begin{aligned}&((2\mu )^{-1}\mathcal {A}(\underline{\sigma }-\underline{\sigma }_h),\underline{w})_{ \mathcal {T}_h}-\langle (\varvec{\Pi _U} \varvec{u}- \varvec{u}_h)^n,(\underline{w}\varvec{n})^n\rangle _{\partial \mathcal {T}_h\backslash \partial \Omega _D}\nonumber \\&\quad +(\varvec{u}-\varvec{u}_h,\textrm{div}\underline{w})_{ \mathcal {T}_h}-\langle \varvec{e_{\widehat{u}}},(\underline{w}\varvec{n})^t\rangle _{\partial \mathcal {T}_h\backslash \partial \Omega _D}=0, \end{aligned}$$

(4.15)

$$\begin{aligned}&(\underline{\Pi _\Sigma }\underline{\sigma }-\underline{\sigma }_h, \varepsilon (\varvec{v}))_{ \mathcal {T}_h}-\langle ((\underline{\sigma }-\underline{\sigma }_h)\varvec{n})^t,\varvec{v}^t\rangle _{\partial \mathcal {T}_h}\nonumber \\&\quad -\langle \tau (\varvec{P_M} \varvec{u}_h^t-\varvec{\widehat{u}}_h),\varvec{v}^t\rangle _{\partial \mathcal {T}_h}=0, \end{aligned}$$

(4.16)

$$\begin{aligned}&\langle ((\underline{\sigma }-\underline{\sigma }_h)\varvec{n})^t,\varvec{\widehat{v}}\rangle _{\partial \mathcal {T}_h\backslash \partial \Omega _D} +\langle \tau (\varvec{P_M} \varvec{u}_h^t-\varvec{\widehat{u}}_h), \varvec{\widehat{v}}\rangle _{\partial \mathcal {T}_h}=0 \end{aligned}$$

(4.17)

for $(\underline{w},\varvec{v},\varvec{\widehat{v}})\in \underline{\Sigma }_h\times \varvec{U}_h\times \varvec{\widehat{U}}_h$.

Proof

First, we can infer from integration by parts and $\varvec{u}=\varvec{0}$ on $\partial \Omega _D$ that

$$\begin{aligned} ((2\mu )^{-1}\mathcal {A}\underline{\sigma },\underline{w})_{\mathcal {T}_h}&=( \varepsilon (\varvec{u}),\underline{w})_{\mathcal {T}_h}=\langle \varvec{u}^n,(\underline{w}\varvec{n})^n\rangle _{\partial \mathcal {T}_h}+\langle \varvec{u}^t,(\underline{w}\varvec{n})^t\rangle _{\partial \mathcal {T}_h}-(\varvec{u},\textrm{div}\underline{w})_{\mathcal {T}_h}\\&=\langle \varvec{\Pi _U}\varvec{u}^n,(\underline{w}\varvec{n})^n\rangle _{\partial \mathcal {T}_h}+\langle \varvec{P_M}\varvec{u}^t,(\underline{w}\varvec{n})^t\rangle _{\partial \mathcal {T}_h\backslash \partial \Omega _D}-(\varvec{u},\textrm{div}\underline{w})_{\mathcal {T}_h}. \end{aligned}$$

Subtracting this equation from (2.10) implies (4.15).

Then using integration by parts and $\llbracket \varvec{v}^n\rrbracket _{|F}=\varvec{0}$ for $F\in \mathcal {F}_h^0$ and $\varvec{v}\cdot \varvec{n}=0$ on $\partial \Omega _D$, we can obtain

$$\begin{aligned} (\textrm{div}\underline{\sigma },\varvec{v})_{\mathcal {T}_h}&=\langle \underline{\sigma }\varvec{n},\varvec{v}\rangle _{\partial \mathcal {T}_h}-(\underline{\sigma },\varepsilon (\varvec{v}))_{\mathcal {T}_h}\\&=\langle (\underline{\sigma }\varvec{n})^t,\varvec{v}^t\rangle _{\partial \mathcal {T}_h}-(\underline{\sigma },\varepsilon (\varvec{v}))_{\mathcal {T}_h}=-(\varvec{f}_S,\varvec{v})_{\mathcal {T}_h}. \end{aligned}$$

Here we have also used $\underline{\sigma }\varvec{n}=\varvec{0}$ on $\partial \Omega _N$.

Then the $L^2$-projection property of $\underline{\Pi _\Sigma }$ leads to

$$\begin{aligned} \langle (\underline{\sigma }\varvec{n})^t,\varvec{v}^t\rangle _{\partial \mathcal {T}_h}-(\underline{\Pi _\Sigma }\underline{\sigma },\varepsilon (\varvec{v}))_{\mathcal {T}_h}=-(\varvec{f}_S,\varvec{v})_{\mathcal {T}_h}, \end{aligned}$$

which combined with (2.11) leads to (4.16).

Finally, the continuity of $\underline{\sigma }$ and the boundary condition $\underline{\sigma }\varvec{n}=\varvec{0}$ on $\partial \Omega _N$ imply $ \langle (\underline{\sigma }\varvec{n})^t,\varvec{\widehat{v}}\rangle _{\partial \mathcal {T}_h\backslash \partial \Omega _D}=0$. Subtracting this identity from (2.12) yields (4.17), which completes the proof. $\square $

Lemma 4.3

Let $(\underline{\sigma },\varvec{u})$ be the solution of (2.1)–(2.2) and let $(\underline{\sigma }_h,\varvec{u}_h,\varvec{\widehat{u}}_h)\in \underline{\Sigma }_h\times \varvec{U}_h\times \varvec{\widehat{U}}_h$ be the discrete solution of (2.10)–(2.12). Assume that $(\varvec{u},\underline{\sigma })\in \varvec{H}^{s}(\Omega )\times \underline{H}^{s-1}(\Omega )$ with $s>\frac{3}{2}$. Then the following estimate holds

$$\begin{aligned}&\left\| (2\mu )^{-\frac{1}{2}}\mathcal {A}\underline{e_\sigma }\right\| _{L^2(\mathcal {T}_h)}+\left\| \tau ^{\frac{1}{2}}( \varvec{P_M} \varvec{e_u}^t-\varvec{e_{\widehat{u}}})\right\| _{L^2(\partial \mathcal {T}_h)}\\&\quad \le C \Bigg (\left\| (2\mu )^{-\frac{1}{2}}(\mathcal {A}\underline{\sigma }-\underline{\Pi _\Sigma }(\mathcal {A} \underline{\sigma }))\right\| _{L^2(\mathcal {T}_h)}+\left\| \tau ^{-\frac{1}{2}}((\mathcal {A}\underline{\sigma }-\underline{\Pi _\Sigma }\mathcal {A}\underline{\sigma })\varvec{n})^t\right\| _{L^2(\partial \mathcal {T}_h)}\\&\qquad +\mu ^{\frac{1}{2}}\Vert \varepsilon (\varvec{e_u})\Vert _{L^2(\mathcal {T}_h)}+\left\| \tau ^{\frac{1}{2}}((\varvec{\Pi _U} \varvec{u})^t-\varvec{u}^t)\right\| _{L^2(\partial \mathcal {T}_h)}\Bigg ). \end{aligned}$$

Proof

We have from (4.1)

$$\begin{aligned} ((2\mu )^{-1}\mathcal {A} (\underline{\sigma }-\underline{\Pi _\Sigma }\underline{\sigma }), \underline{w})_{\mathcal {T}_h}=((2\mu )^{-1}(\mathcal {A}\underline{\sigma }-\underline{\Pi _\Sigma } (\mathcal {A}\underline{\sigma })), \underline{w})_{\mathcal {T}_h}\quad \forall \underline{w}\in \underline{\Sigma }_h. \end{aligned}$$

Taking $\underline{w}=\underline{e_\sigma }$, $\varvec{v}=\varvec{e_u}$ and $\varvec{\widehat{v}}=\varvec{e_{\widehat{u}}}$ in (4.15)–(4.17), then summing up the resulting equations yields

$$\begin{aligned} \begin{aligned}&((2\mu )^{-1}\mathcal {A} (\underline{\sigma }-\underline{\sigma }_h), \underline{e_\sigma })_{\mathcal {T}_h}+(\varvec{u}-\varvec{\Pi _U}\varvec{u}, \textrm{div}(\underline{e_\sigma }))_{ \mathcal {T}_h}\\&\quad -\langle ((\underline{\sigma }-\underline{\Pi _\Sigma }\underline{\sigma })\varvec{n})^t, \varvec{e_u}^t-\varvec{e_{\widehat{u}}}\rangle _{\partial \mathcal {T}_h}\\&\quad -\langle \tau (\varvec{P_M} \varvec{u}_h^t-\varvec{\widehat{u}}_h),\varvec{e_u}^t-\varvec{e_{\widehat{u}}}\rangle _{\partial \mathcal {T}_h}=0. \end{aligned} \end{aligned}$$

(4.18)

Proceeding analogously to (4.10), we have

$$\begin{aligned} \Vert \tau ^{\frac{1}{2}}(\varvec{e_u}^t-\varvec{e_{\widehat{u}}})\Vert _{L^2(\partial \mathcal {T}_h)}\le C \Big (\mu ^{\frac{1}{2}}\Vert \varepsilon (\varvec{e_u})\Vert _{L^2(\mathcal {T}_h)}+\Vert \tau ^{\frac{1}{2}}(\varvec{P_M}\varvec{e_u}^t-\varvec{e_{\widehat{u}}})\Vert _{L^2(\partial \mathcal {T}_h)}\Big ). \end{aligned}$$

(4.19)

Thus, it follows from (4.1) and the fact that $(\mathcal {A} \underline{\sigma }\varvec{n})^t_{|F}= (\underline{\sigma }\varvec{n})^t_{|F}$ for $F\in \mathcal {F}_h$

$$\begin{aligned}&\langle ((\underline{\sigma }-\underline{\Pi _\Sigma }\underline{\sigma })\varvec{n})^t, \varvec{e_u}^t-\varvec{e_{\widehat{u}}}\rangle _{\partial \mathcal {T}_h}\\&\quad = \langle ((\mathcal {A}\underline{\sigma }-\underline{\Pi _\Sigma }\mathcal {A}\underline{\sigma })\varvec{n})^t, \varvec{e_u}^t-\varvec{e_{\widehat{u}}}\rangle _{\partial \mathcal {T}_h}\\&\quad \le C\left\| \tau ^{-\frac{1}{2}}((\mathcal {A}\underline{\sigma }-\underline{\Pi _\Sigma }\mathcal {A}\underline{\sigma })\varvec{n})^t\right\| _{L^2(\partial \mathcal {T}_h)}\left\| \tau ^{\frac{1}{2}}(\varvec{e_u}^t-\varvec{e_{\widehat{u}}})\right\| _{L^2(\partial \mathcal {T}_h)}\\&\quad \le C \left\| \tau ^{-\frac{1}{2}}((\mathcal {A}\underline{\sigma }-\underline{\Pi _\Sigma }\mathcal {A}\underline{\sigma })\varvec{n})^t\right\| _{L^2(\partial \mathcal {T}_h)}\Big (\Vert \varepsilon (\varvec{e_u})\Vert _{L^2(\mathcal {T}_h)}\\&\qquad +\Vert h^{-1/2}(\varvec{P_M}\varvec{e_u}^t-\varvec{e_{\widehat{u}}})\Vert _{L^2(\partial \mathcal {T}_h)}\Big ). \end{aligned}$$

The last term on the left-hand side of (4.18) can be rewritten as

$$\begin{aligned} -\tau (\varvec{P_M}\varvec{u}_h^t-\varvec{\widehat{u}}_h)&=\tau (\varvec{P_M} ((\varvec{\Pi _U} \varvec{u})^t-\varvec{u}_h^t)-(\varvec{P_M}\varvec{u}^t-\varvec{\widehat{u}}_h))-\tau (\varvec{P_M}((\varvec{\Pi _U} \varvec{u})^t-\varvec{u}^t))\\&=\tau (\varvec{P_M}\varvec{e_u}^t-\varvec{e_{\widehat{u}}})-\tau (\varvec{P_M}((\varvec{\Pi _U} \varvec{u})^t-\varvec{u}^t)). \end{aligned}$$

Then, by the $L^2$-projection property of $\varvec{P_M}$, we obtain

$$\begin{aligned}&-\langle \tau (\varvec{P_M} \varvec{u}_h^t-\varvec{\widehat{u}}_h),\varvec{e_u}^t-\varvec{e_{\widehat{u}}}\rangle _{\partial \mathcal {T}_h}\\&\quad =\langle \tau (\varvec{P_M}\varvec{e_u}^t-\varvec{e_{\widehat{u}}})-\tau (\varvec{P_M}((\varvec{\Pi _U} \varvec{u})^t-\varvec{u}^t)),\varvec{e_u}^t-\varvec{e_{\widehat{u}}} \rangle _{\partial \mathcal {T}_h}\\&\quad =\langle \tau (\varvec{P_M} \varvec{e_u}^t-\varvec{e_{\widehat{u}}}),\varvec{P_M}\varvec{e_u}^t-\varvec{e_{\widehat{u}}}\rangle _{\partial \mathcal {T}_h}\\&\qquad \;- \langle \tau (\varvec{P_M}((\varvec{\Pi _U} \varvec{u})^t-\varvec{u}^t)),\varvec{P_M}\varvec{e_u}^t-\varvec{e_{\widehat{u}}} \rangle _{\partial \mathcal {T}_h}\\&\quad =\left\| \tau ^{\frac{1}{2}} (\varvec{P_M} \varvec{e_u}^t-\varvec{e_{\widehat{u}}})\right\| _{L^2(\partial \mathcal {T}_h)}^2- \langle \tau (\varvec{P_M}((\varvec{\Pi _U} \varvec{u})^t-\varvec{u}^t)),\varvec{P_M}\varvec{e_u}^t-\varvec{e_{\widehat{u}}} \rangle _{\partial \mathcal {T}_h}. \end{aligned}$$

The Cauchy–Schwarz inequality gives

$$\begin{aligned}&- \langle \tau (\varvec{P_M}((\varvec{\Pi _U} \varvec{u})^t-\varvec{u}^t)),\varvec{P_M}\varvec{e_u}^t-\varvec{e_{\widehat{u}}} \rangle _{\partial \mathcal {T}_h}\\&\quad \le \left\| \tau ^{\frac{1}{2}} (\varvec{P_M} \varvec{e_u}^t-\varvec{e_{\widehat{u}}})\right\| _{L^2(\partial \mathcal {T}_h)} \left\| \tau ^{\frac{1}{2}}((\varvec{\Pi _U} \varvec{u})^t-\varvec{u}^t)\right\| _{L^2(\partial \mathcal {T}_h)}. \end{aligned}$$

The proof is completed by combining the above estimates and Young’s inequality. $\square $
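For reference, the absorption step behind the last sentence can be sketched as follows. The shorthand $a=\Vert \tau ^{\frac{1}{2}}(\varvec{P_M}\varvec{e_u}^t-\varvec{e_{\widehat{u}}})\Vert _{L^2(\partial \mathcal {T}_h)}$ and $b=\Vert \tau ^{\frac{1}{2}}((\varvec{\Pi _U}\varvec{u})^t-\varvec{u}^t)\Vert _{L^2(\partial \mathcal {T}_h)}$ is introduced only for this remark, and the weight $\delta =\frac{1}{2}$ is an illustrative choice rather than one fixed by the argument. Young's inequality gives

$$\begin{aligned} ab\le \frac{\delta }{2}a^2+\frac{1}{2\delta }b^2=\frac{1}{4}a^2+b^2, \end{aligned}$$

so the contribution $\frac{1}{4}a^2$ can be absorbed into the term $a^2$ that arises from the last term of (4.18). The remaining products are presumably handled in the same way, and a different value of $\delta $ only changes the constant $C$.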

Lemma 4.4

Under the assumption of Lemma 4.3, the following error estimates hold

$$\begin{aligned} \Vert \varepsilon (\varvec{e_u})\Vert _{L^2(\mathcal {T}_h)}&\le C \Big (\Vert \mu ^{-1}\mathcal {A}\underline{e_\sigma }\Vert _{L^2(\mathcal {T}_h)}+\Vert (2\mu )^{-1}(\underline{\Pi _\Sigma } (\mathcal {A}\underline{\sigma })-\mathcal {A}\underline{\sigma })\Vert _{L^2(\mathcal {T}_h)}\nonumber \\&\quad +\Vert \varepsilon (\varvec{u}-\varvec{\Pi _U}\varvec{u})\Vert _{L^2(\mathcal {T}_h)}+\mu ^{-\frac{1}{2}}\left\| \tau ^{\frac{1}{2}}(\varvec{P_M}\varvec{e_u}^t-\varvec{e_{\widehat{u}}})\right\| _{L^2(\partial \mathcal {T}_h)} \Big )\nonumber \\ \end{aligned}$$

(4.20)

and

$$\begin{aligned} \Vert (\varvec{e_u},\varvec{e_{\widehat{u}}})\Vert _{1,h}&\le C \Big (\Vert \varepsilon (\varvec{e_u})\Vert _{L^2(\mathcal {T}_h)}+\mu ^{-\frac{1}{2}}\left\| \tau ^{{\frac{1}{2}}}(\varvec{P_M}\varvec{e_u}^t-\varvec{e_{\widehat{u}}})\right\| _{L^2(\partial \mathcal {T}_h)}\Big ). \end{aligned}$$

Proof

Taking $\underline{w}=\varepsilon _h(\varvec{e_u})$ in (4.15) and applying integration by parts yields

$$\begin{aligned}&((2\mu )^{-1}\mathcal {A} \underline{e_\sigma }, \varepsilon (\varvec{e_u}))_{ \mathcal {T}_h}-(\varepsilon (\varvec{u}-\varvec{u}_h), \varepsilon (\varvec{e_u}))_{ \mathcal {T}_h}\\&\quad \;+\langle (\varvec{u}-\varvec{u}_h)^t-\varvec{e_{\widehat{u}}}, (\varepsilon (\varvec{e_u})\varvec{n})^t\rangle _{\partial \mathcal {T}_h}=((2\mu )^{-1}\mathcal {A}(\underline{\Pi _\Sigma } \underline{\sigma }-\underline{\sigma }), \varepsilon (\varvec{e_u}))_{\mathcal {T}_h}. \end{aligned}$$

Therefore, it holds

$$\begin{aligned} \begin{aligned} \Vert \varepsilon (\varvec{e_u})\Vert _{L^2(\mathcal {T}_h)}^2&=((2\mu )^{-1}\mathcal {A} \underline{e_\sigma }, \varepsilon (\varvec{e_u}))_{ \mathcal {T}_h}-(\varepsilon (\varvec{u}-\varvec{\Pi _U}\varvec{u}), \varepsilon (\varvec{e_u}))_{ \mathcal {T}_h}\\&\quad +\langle (\varvec{u}-\varvec{u}_h)^t-\varvec{e_{\widehat{u}}}, (\varepsilon (\varvec{e_u})\varvec{n})^t\rangle _{\partial \mathcal {T}_h}\\&\quad \;-((2\mu )^{-1}(\underline{\Pi _\Sigma } (\mathcal {A}\underline{\sigma })-\mathcal {A}\underline{\sigma }), \varepsilon (\varvec{e_u}))_{\mathcal {T}_h}. \end{aligned} \end{aligned}$$

(4.21)

The Cauchy–Schwarz inequality and the error estimates (4.4)–(4.5) imply

$$\begin{aligned} ((2\mu )^{-1}\mathcal {A} \underline{e_\sigma }, \varepsilon (\varvec{e_u}))_{\mathcal {T}_h}&\le \frac{1}{2} \Vert \mu ^{-1}\mathcal {A} \underline{e_\sigma }\Vert _{L^2(\mathcal {T}_h)} \Vert \varepsilon (\varvec{e_u})\Vert _{L^2(\mathcal {T}_h)},\\ (\varepsilon (\varvec{u}-\varvec{\Pi _U}\varvec{u}), \varepsilon (\varvec{e_u}))_{ \mathcal {T}_h}&\le C \Vert \varepsilon (\varvec{u}-\varvec{\Pi _U}\varvec{u})\Vert _{L^2(\mathcal {T}_h)}\Vert \varepsilon (\varvec{e_u})\Vert _{L^2(\mathcal {T}_h)},\\ ((2\mu )^{-1}(\underline{\Pi _\Sigma } (\mathcal {A}\underline{\sigma })-\mathcal {A}\underline{\sigma }), \varepsilon (\varvec{e_u}))_{\mathcal {T}_h}&\le C \Vert (2\mu )^{-1}(\underline{\Pi _\Sigma } (\mathcal {A}\underline{\sigma })-\mathcal {A}\underline{\sigma })\Vert _{L^2(\mathcal {T}_h)}\Vert \varepsilon (\varvec{e_u})\Vert _{L^2(\mathcal {T}_h)}. \end{aligned}$$

For the third term on the right-hand side of (4.21), we have

$$\begin{aligned}&\langle ( \varvec{u}-\varvec{u}_h)^t-\varvec{e_{\widehat{u}}}, (\varepsilon (\varvec{e_u})\varvec{n})^t\rangle _{\partial \mathcal {T}_h}\\&\quad =\langle \varvec{u}^t-(\varvec{\Pi _U}\varvec{u})^t ,(\varepsilon (\varvec{e_u})\varvec{n})^t\rangle _{\partial \mathcal {T}_h}+\langle \varvec{e_u}^t-\varvec{e_{\widehat{u}}}, (\varepsilon (\varvec{e_u})\varvec{n})^t\rangle _{\partial \mathcal {T}_h}. \end{aligned}$$

The first term on the right-hand side can be bounded by the Cauchy–Schwarz inequality and (4.7)

$$\begin{aligned} \langle \varvec{u}^t-(\varvec{\Pi _U}\varvec{u})^t ,(\varepsilon (\varvec{e_u})\varvec{n})^t\rangle _{\partial \mathcal {T}_h}\le C\sum _{K\in \mathcal {T}_h}h_K^{-\frac{1}{2}}\Vert \varvec{u}-\varvec{\Pi _U}\varvec{u}\Vert _{L^2(\partial K)}\Vert \varepsilon (\varvec{e_u})\Vert _{L^2(K)}. \end{aligned}$$

The second term on the right-hand side can be estimated by using the Cauchy–Schwarz inequality and the trace inequality (4.7)

$$\begin{aligned} \langle \varvec{P_M}\varvec{e_u}^t-\varvec{e_{\widehat{u}}}, (\varepsilon (\varvec{e_u})\varvec{n})^t\rangle _{\partial \mathcal {T}_h}&\le C\left\| \tau ^{\frac{1}{2}}(\varvec{P_M}\varvec{e_u}^t-\varvec{e_{\widehat{u}}})\right\| _{L^2(\partial \mathcal {T}_h)} \left\| \tau ^{-\frac{1}{2}}\varepsilon (\varvec{e_u})\right\| _{L^2(\partial \mathcal {T}_h)}\\&\le C\mu ^{-\frac{1}{2}} \left\| \tau ^{\frac{1}{2}}(\varvec{P_M}\varvec{e_u}^t-\varvec{e_{\widehat{u}}})\right\| _{L^2(\partial \mathcal {T}_h)} \Vert \varepsilon (\varvec{e_u})\Vert _{L^2( \mathcal {T}_h)}. \end{aligned}$$

The proof of (4.20) is completed by combining the above estimates and Young’s inequality.

Proceeding similarly to (4.11), it holds

$$\begin{aligned} \Vert (\varvec{e_u},\varvec{e_{\widehat{u}}})\Vert _{1,h}\le C \Big (\Vert \varepsilon (\varvec{e_u})\Vert _{L^2(\mathcal {T}_h)}+\mu ^{-\frac{1}{2}}\Vert \tau ^{{\frac{1}{2}}}(\varvec{P_M}\varvec{e_u}^t-\varvec{e_{\widehat{u}}})\Vert _{L^2(\partial \mathcal {T}_h)}\Big ), \end{aligned}$$

which completes the proof. $\square $

Combining Lemmas 4.3–4.4, the trace inequality (4.6) and the error estimates (4.2)–(4.5) implies Theorem 2.3.

We can observe from (2.16) that the following superconvergence property holds

$$\begin{aligned} \Vert h^{\frac{1}{2}} (\varvec{P_M}\varvec{e_u}^t-\varvec{e_{\widehat{u}}})\Vert _{L^2(\partial \mathcal {T}_h)}\le C h^{k+2} |\varvec{u}|_{H^s(\Omega )}. \end{aligned}$$

This superconvergence property is crucial for achieving the optimal convergence rates for the $L^2$-errors of stress and velocity, as shown below.

Proof of Theorem 2.4 ($L^2$-error for stress)

The proof is similar to that of the stability estimate for $\Vert \underline{\sigma }_h\Vert _{L^2(\mathcal {T}_h)}$ given in Theorem 2.2. We provide the proof here for completeness. The definition of $\mathcal {A}$ implies $ \underline{e_\sigma }=\mathcal {A} \underline{e_\sigma }+\frac{1}{d} \textrm{tr}(\underline{e_\sigma }) \underline{I}. $ The upper bound for $\mathcal {A}\underline{e_\sigma }$ is given in Theorem 2.3. Thus, it suffices to show the error estimate for the second term. Since $\textrm{tr}(\underline{e_\sigma })\in L^2(\Omega )$, there exists a function $\varvec{\theta }\in \varvec{H}^1(\Omega )$ with $\varvec{\theta }\cdot \varvec{n}=0$ on $\partial \Omega _D$ (cf. [34]) such that

$$\begin{aligned} \begin{aligned} \nabla \cdot \varvec{\theta }&= \textrm{tr}(\underline{e_\sigma }),\quad \Vert \varvec{\theta }\Vert _{H^1(\Omega )} \le C \Vert \textrm{tr}(\underline{e_\sigma })\Vert _{L^2(\Omega )}. \end{aligned} \end{aligned}$$

(4.22)

Therefore, we have

$$\begin{aligned} \Vert \textrm{tr}(\underline{e_\sigma })\Vert _{L^2(\mathcal {T}_h)}^2&=(\textrm{tr}(\underline{e_\sigma }), \nabla \cdot \varvec{\theta })_{ \mathcal {T}_h}=(\textrm{tr}(\underline{e_\sigma }), \nabla \cdot (\varvec{\Pi _U}\varvec{\theta }))_{\mathcal {T}_h}\\&=d(\underline{e_\sigma }, \varepsilon (\varvec{\Pi _U}\varvec{\theta }))_{ \mathcal {T}_h}-d(\mathcal {A} \underline{e_\sigma }, \varepsilon (\varvec{\Pi _U}\varvec{\theta }))_{ \mathcal {T}_h}. \end{aligned}$$

The second term on the right-hand side can be estimated via the Cauchy–Schwarz inequality and Theorem 2.3. It remains to show the upper bound for the first term on the right-hand side.

We can deduce from (4.16) and (4.17)

$$\begin{aligned} (\underline{e}_\sigma , \varepsilon (\varvec{\Pi _U}\varvec{\theta }))_{ \mathcal {T}_h}&=\langle (\underline{\sigma }\varvec{n})^t-(\underline{\sigma }_h\varvec{n})^t, (\varvec{\Pi _U}\varvec{\theta })^t\rangle _{\partial \mathcal {T}_h} +\langle \tau (\varvec{P_M}\varvec{u}_h^t-\varvec{\widehat{u}}_h),(\varvec{\Pi _U}\varvec{\theta })^t \rangle _{\partial \mathcal {T}_h}\\&=\langle (\underline{\sigma }\varvec{n})^t-(\underline{\sigma }_h\varvec{n})^t, (\varvec{\Pi _U}\varvec{\theta })^t-\varvec{P_M}\varvec{\theta }^t\rangle _{\partial \mathcal {T}_h}\\ {}&\quad +\langle \tau (\varvec{P_M}\varvec{u}_h^t-\varvec{\widehat{u}}_h),(\varvec{\Pi _U}\varvec{\theta })^t-\varvec{P_M}\varvec{\theta }^t \rangle _{\partial \mathcal {T}_h}. \end{aligned}$$

An application of (4.1), the Cauchy–Schwarz inequality, the trace inequality (4.6) and (4.22) implies

$$\begin{aligned}&\langle ((\underline{\sigma }-\underline{\sigma }_h)\varvec{n})^t, (\varvec{\Pi _U}\varvec{\theta })^t-\varvec{P_M}\varvec{\theta }^t\rangle _{\partial \mathcal {T}_h}\\&\quad = \langle (\mathcal {A}(\underline{\sigma }-\underline{\sigma }_h)\varvec{n})^t, (\varvec{\Pi _U}\varvec{\theta })^t-\varvec{P_M}\varvec{\theta }^t\rangle _{\partial \mathcal {T}_h}\\&\quad =\langle ((\mathcal {A}\underline{\sigma }-\underline{\Pi _\Sigma }\mathcal {A}\underline{\sigma })\varvec{n})^t, (\varvec{\Pi _U}\varvec{\theta })^t-\varvec{P_M}\varvec{\theta }^t\rangle _{\partial \mathcal {T}_h}\\&\qquad +\langle (\mathcal {A}(\underline{\Pi _\Sigma }\underline{\sigma }-\underline{\sigma }_h)\varvec{n})^t, (\varvec{\Pi _U}\varvec{\theta })^t-\varvec{P_M}\varvec{\theta }^t\rangle _{\partial \mathcal {T}_h}\\&\quad \le C\Big ( h^t|\mathcal {A}\underline{\sigma }|_{H^t(\Omega )}+\Vert \mathcal {A}(\underline{\Pi _\Sigma }\underline{\sigma }-\underline{\sigma }_h)\Vert _{L^2(\Omega )}\Big )\Vert \varvec{\theta }\Vert _{H^1(\Omega )}\\&\quad \le C \Big ( h^t|\mathcal {A}\underline{\sigma }|_{H^t(\Omega )}+\Vert \mathcal {A}(\underline{\Pi _\Sigma }\underline{\sigma }-\underline{\sigma }_h)\Vert _{L^2(\Omega )}\Big )\Vert \textrm{tr}(\underline{e_\sigma })\Vert _{L^2(\Omega )}. \end{aligned}$$

Similarly, we have

$$\begin{aligned}&\langle \tau (\varvec{P_M}\varvec{u}_h^t-\varvec{\widehat{u}}_h),(\varvec{\Pi _U}\varvec{\theta })^t-\varvec{P_M}\varvec{\theta }^t\rangle _{\partial \mathcal {T}_h}\\&\quad \le C \mu ^{\frac{1}{2}}\Vert \tau ^{\frac{1}{2}}(\varvec{P_M}\varvec{u}_h^t-\varvec{\widehat{u}}_h)\Vert _{L^2(\partial \mathcal {T}_h)}\Vert \varvec{\theta }\Vert _{H^1(\Omega )}. \end{aligned}$$

The assertion follows by combining the above estimates and Theorem 2.3. $\square $
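The algebraic facts about $\mathcal {A}$ used in this proof can be checked numerically on a single matrix. The following is a minimal sketch, assuming (as the splitting $\underline{e_\sigma }=\mathcal {A}\underline{e_\sigma }+\frac{1}{d}\textrm{tr}(\underline{e_\sigma })\underline{I}$ suggests) that $\mathcal {A}$ is the deviatoric part. The names `deviator`, `tangential`, `sigma` and `grad_v` are hypothetical and the data are random, not taken from the paper's experiments.

```python
import numpy as np

def deviator(sigma):
    """Deviatoric part A(sigma) = sigma - (1/d) tr(sigma) I (assumed definition of A)."""
    d = sigma.shape[0]
    return sigma - np.trace(sigma) / d * np.eye(d)

def tangential(v, n):
    """Tangential component of v with respect to the unit normal n."""
    return v - np.dot(v, n) * n

rng = np.random.default_rng(0)
d = 3
sigma = rng.standard_normal((d, d))
sigma = 0.5 * (sigma + sigma.T)        # symmetric stand-in for a stress tensor
grad_v = rng.standard_normal((d, d))   # stand-in for a velocity gradient at one point
eps_v = 0.5 * (grad_v + grad_v.T)      # symmetric gradient
div_v = np.trace(grad_v)               # divergence
n = rng.standard_normal(d)
n /= np.linalg.norm(n)                 # unit normal

A_sigma = deviator(sigma)

# Splitting: sigma = A(sigma) + (1/d) tr(sigma) I
reconstructed = A_sigma + np.trace(sigma) / d * np.eye(d)

# Pointwise form of the identity tr(sigma) div v = d (sigma : eps(v)) - d (A sigma : eps(v))
lhs = np.trace(sigma) * div_v
rhs = d * np.sum(sigma * eps_v) - d * np.sum(A_sigma * eps_v)

# The tangential traces of sigma n and A(sigma) n coincide
tangential_gap = tangential(sigma @ n, n) - tangential(A_sigma @ n, n)
```

The three quantities `reconstructed`, `lhs - rhs` and `tangential_gap` correspond, respectively, to the splitting of $\underline{e_\sigma }$, the identity used to rewrite $\Vert \textrm{tr}(\underline{e_\sigma })\Vert ^2$, and the replacement of $\underline{\sigma }-\underline{\sigma }_h$ by its deviatoric part in the tangential face terms.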

Now we are ready to prove the $L^2$-error estimate for the velocity. For any $\varvec{g}\in \varvec{L}^2(\Omega )$, we assume that the dual problem

$$\begin{aligned} (2\mu )^{-1} \mathcal {A}\underline{\psi }&= \varepsilon (\varvec{\phi })\quad \text{ in }\;\Omega , \end{aligned}$$

(4.23)

$$\begin{aligned} \textrm{div}\underline{\psi }&=\varvec{g} \quad \text{ in }\;\Omega ,\nonumber \\ \varvec{\phi }&=\varvec{0}\quad \text{ on }\;\partial \Omega _D,\nonumber \\ \underline{\psi }\varvec{n}&=\varvec{0}\quad \text{ on }\;\partial \Omega _N \end{aligned}$$

(4.24)

satisfies the elliptic regularity estimate

$$\begin{aligned} \Vert \underline{\psi }\Vert _{H^1(\Omega )}+\Vert \mu \varvec{\phi }\Vert _{H^2(\Omega )}\le C \Vert \varvec{g}\Vert _{L^2(\Omega )}. \end{aligned}$$

This estimate holds, for instance, if the domain $\Omega $ is convex (cf. [33]).

Proof of Theorem 2.5 ($L^2$-error for velocity)

First we set $\varvec{g}=\varvec{u}-\varvec{u}_h$ in (4.23)–(4.24). Multiplying (4.23) by $\underline{e_\sigma }$ and (4.24) by $\varvec{u}-\varvec{u}_h$, we obtain

$$\begin{aligned} \Vert \varvec{u}-\varvec{u}_h\Vert _{L^2(\mathcal {T}_h)}^2&=(\textrm{div}\underline{\psi }, \varvec{u}-\varvec{u}_h)_{\mathcal {T}_h}+((2\mu )^{-1}\mathcal {A}\underline{\psi }, \underline{e_\sigma })_{\mathcal {T}_h}-(\underline{e_\sigma },\varepsilon (\varvec{\phi }))_{\mathcal {T}_h}\\&=(\textrm{div}(\underline{\psi }-\underline{\Pi _\Sigma } \underline{\psi }), \varvec{u}-\varvec{u}_h )_{\mathcal {T}_h}+(\textrm{div}\underline{\Pi _\Sigma }\underline{\psi }, \varvec{u}-\varvec{u}_h )_{\mathcal {T}_h}\\&\quad +((2\mu )^{-1}\mathcal {A} \underline{\Pi _\Sigma }\underline{\psi }, \underline{e_\sigma })_{\mathcal {T}_h}\\&\quad \;+((2\mu )^{-1}\mathcal {A}(\underline{\psi }-\underline{\Pi _\Sigma } \underline{\psi }), \underline{e_\sigma })_{\mathcal {T}_h}-(\underline{e_\sigma },\varepsilon (\varvec{\Pi _U}\varvec{\phi }))_{\mathcal {T}_h}\\&\quad \;-(\underline{e_\sigma },\varepsilon (\varvec{\phi }-\varvec{\Pi _U}\varvec{\phi }))_{\mathcal {T}_h}. \end{aligned}$$

We can infer from (4.15)–(4.16) by taking $\underline{w}= \underline{\Pi _\Sigma }\underline{\psi }$ and $\varvec{v}=\varvec{\Pi _{U}}\varvec{\phi }$

$$\begin{aligned}&(\varvec{u}-\varvec{u}_h, \textrm{div}\underline{\Pi _\Sigma } \underline{\psi })_{\mathcal {T}_h}+((2\mu )^{-1}\mathcal {A} \underline{\Pi _\Sigma }\underline{\psi }, \underline{e_\sigma })_{\mathcal {T}_h}\\&\quad =\langle \varvec{e_u}^n, (\underline{\Pi _\Sigma }\underline{\psi }\varvec{n})^n\rangle _{\partial \mathcal {T}_h}+\langle \varvec{e_{\widehat{u}}},(\underline{\Pi _\Sigma }\underline{\psi }\varvec{n})^t\rangle _{\partial \mathcal {T}_h},\\&(\underline{e_\sigma },\varepsilon (\varvec{\Pi _U}\varvec{\phi }))_{\mathcal {T}_h}=\langle (\underline{\sigma }\varvec{n})^t-(\underline{\sigma }_h\varvec{n})^t,(\varvec{\Pi _U}\varvec{\phi })^t\rangle _{\partial \mathcal {T}_h\backslash \partial \Omega _D}\\&\qquad +\langle \tau (\varvec{P_M} \varvec{u}_h^t-\varvec{\widehat{u}}_h),(\varvec{\Pi _U}\varvec{\phi })^t \rangle _{\partial \mathcal {T}_h}. \end{aligned}$$

By the elliptic regularity assumption, $(\underline{\psi },\varvec{\phi })\in \underline{H}^1(\Omega )\times \varvec{H}^2(\Omega )$; hence $\varvec{\phi }$ and the normal component of $\underline{\psi }$ are continuous. Then it holds

$$\begin{aligned} \langle (\underline{\sigma }\varvec{n})^t-(\underline{\sigma }_h\varvec{n})^t,\varvec{\phi }^t\rangle _{\partial \mathcal {T}_h\backslash \partial \Omega _D}+\langle \tau (\varvec{P_M}\varvec{u}_h^t-\varvec{\widehat{u}}_h),\varvec{\phi }^t\rangle _{\partial \mathcal {T}_h}&=0,\quad \langle \varvec{e_{\widehat{u}}},(\underline{\psi }\varvec{n})^t\rangle _{\partial \mathcal {T}_h\backslash \partial \Omega _D}=0. \end{aligned}$$

Using integration by parts and the $L^2$-orthogonal property of $\underline{\Pi _\Sigma }$, we can obtain

$$\begin{aligned}&(\textrm{div}(\underline{\psi }-\underline{\Pi _\Sigma } \underline{\psi }), \varvec{u}-\varvec{u}_h )_{\mathcal {T}_h}\\&\quad =\langle (\underline{\psi }-\underline{\Pi _\Sigma } \underline{\psi })\varvec{n}, \varvec{u}-\varvec{u}_h\rangle _{\partial \mathcal {T}_h}-(\underline{\psi }-\underline{\Pi _\Sigma } \underline{\psi }, \varepsilon (\varvec{u}-\varvec{u}_h))_{\mathcal {T}_h}\\&\quad =\langle (\underline{\psi }-\underline{\Pi _\Sigma } \underline{\psi })\varvec{n}, \varvec{u}-\varvec{\Pi _{U}}\varvec{u}\rangle _{\partial \mathcal {T}_h}+\langle (\underline{\psi }-\underline{\Pi _\Sigma } \underline{\psi })\varvec{n}, \varvec{e_u}\rangle _{\partial \mathcal {T}_h}\\&\qquad -(\underline{\psi }-\underline{\Pi _\Sigma } \underline{\psi }, \varepsilon (\varvec{u}-\varvec{\Pi _U}\varvec{u}))_{\mathcal {T}_h}\\&\quad ={\langle (\underline{\psi }-\underline{\Pi _\Sigma } \underline{\psi })\varvec{n}, \varvec{u}-\varvec{\Pi _{U}}\varvec{u}\rangle _{\partial \mathcal {T}_h}-\langle (\underline{\Pi _\Sigma } \underline{\psi }\varvec{n})^n, \varvec{e_u}^n\rangle _{\partial \mathcal {T}_h}}\\&\qquad {+\langle ((\underline{\psi }-\underline{\Pi _\Sigma } \underline{\psi })\varvec{n})^t, \varvec{e_u}^t\rangle _{\partial \mathcal {T}_h}-(\underline{\psi }-\underline{\Pi _\Sigma } \underline{\psi }, \varepsilon (\varvec{u}-\varvec{\Pi _U}\varvec{u}))_{\mathcal {T}_h}.} \end{aligned}$$

Then we have

$$\begin{aligned}&\Vert \varvec{u}-\varvec{u}_h\Vert _{L^2(\mathcal {T}_h)}^2\nonumber \\&\quad =\langle \varvec{e_u}^t- \varvec{e_{\widehat{u}}},((\underline{\psi }-\underline{\Pi _\Sigma }\underline{\psi })\varvec{n})^t\rangle _{\partial \mathcal {T}_h}-\langle (\underline{\sigma }\varvec{n})^t-(\underline{\sigma }_h\varvec{n})^t,(\varvec{\Pi _U}\varvec{\phi })^t-\varvec{\phi }^t\rangle _{\partial \mathcal {T}_h\backslash \partial \Omega _D}\nonumber \\&\qquad +((2\mu )^{-1}\mathcal {A}(\underline{\psi }-\underline{\Pi _\Sigma } \underline{\psi }), \underline{e_\sigma })_{\mathcal {T}_h}-(\mathcal {A}\underline{e}_\sigma ,\varepsilon (\varvec{\phi }-\varvec{\Pi _U}\varvec{\phi }))_{\mathcal {T}_h}\nonumber \\&\qquad - \langle \tau (\varvec{P_M} \varvec{u}_h^t-\varvec{\widehat{u}}_h),(\varvec{\Pi _U}\varvec{\phi }-\varvec{\phi })^t\rangle _{\partial \mathcal {T}_h} -(\underline{\psi }-\underline{\Pi _\Sigma } \underline{\psi }, \varepsilon (\varvec{u}-\varvec{\Pi _U}\varvec{u}))_{\mathcal {T}_h}\nonumber \\&\qquad +\langle (\underline{\psi }-\underline{\Pi _\Sigma } \underline{\psi })\varvec{n}, \varvec{u}-\varvec{\Pi _{U}}\varvec{u}\rangle _{\partial \mathcal {T}_h}, \end{aligned}$$

(4.25)

where we use $(\textrm{tr}(\underline{e_\sigma })\underline{I}, \varepsilon (\varvec{\phi }-\varvec{\Pi _U}\varvec{\phi }))_{\mathcal {T}_h}=(\textrm{tr}(\underline{e_\sigma }), \nabla \cdot (\varvec{\phi }-\varvec{\Pi _U}\varvec{\phi }))_{\mathcal {T}_h}=0$ for the fourth term on the right-hand side.

Now we estimate each term on the right-hand side of (4.25) separately. First, we have from the Cauchy–Schwarz inequality and (4.19)

$$\begin{aligned}&\langle \varvec{e_u}^t- \varvec{e_{\widehat{u}}},((\underline{\psi }-\underline{\Pi _\Sigma }\underline{\psi })\varvec{n})^t\rangle _{\partial \mathcal {T}_h}\\&\quad \le C\left\| \tau ^{\frac{1}{2}}(\varvec{e_u}^t- \varvec{e_{\widehat{u}}})\right\| _{L^2(\partial \mathcal {T}_h)} \left\| \tau ^{-\frac{1}{2}}((\underline{\psi }-\underline{\Pi _\Sigma }\underline{\psi })\varvec{n})^t\right\| _{L^2(\partial \mathcal {T}_h)}\\&\quad \le C\Big (\Vert \varepsilon (\varvec{e_u})\Vert _{L^2(\mathcal {T}_h)}+\mu ^{-\frac{1}{2}} \left\| \tau ^{\frac{1}{2}}(\varvec{P_M}\varvec{e_u}^t-\varvec{e_{\widehat{u}}})\right\| _{L^2(\partial \mathcal {T}_h)}\Big ) \Vert \underline{\psi }\Vert _{H^1(\Omega )}. \end{aligned}$$

We can infer from $(\textrm{tr}(\underline{\sigma }-\underline{\sigma }_h)\underline{I}\varvec{n})^t=\varvec{0}$, the trace inequality (4.6), (4.4)–(4.5) and Theorem 2.3 that

$$\begin{aligned}&\langle (\underline{\sigma }\varvec{n})^t-(\underline{\sigma }_h\varvec{n})^t,(\varvec{\Pi _U}\varvec{\phi })^t-\varvec{\phi }^t\rangle _{\partial \mathcal {T}_h\backslash \partial \Omega _D}\\&\quad = \langle (\mathcal {A}\underline{\sigma }\varvec{n})^t-(\mathcal {A}\underline{\sigma }_h\varvec{n})^t,(\varvec{\Pi _U}\varvec{\phi })^t-\varvec{\phi }^t\rangle _{\partial \mathcal {T}_h\backslash \partial \Omega _D}\\&\quad = \langle (\mathcal {A}\underline{\sigma }\varvec{n})^t-(\Pi _\Sigma \mathcal {A}\underline{\sigma }\varvec{n})^t,(\varvec{\Pi _U}\varvec{\phi })^t-\varvec{\phi }^t\rangle _{\partial \mathcal {T}_h\backslash \partial \Omega _D}\\&\qquad + \langle (\Pi _\Sigma \mathcal {A}\underline{\sigma }\varvec{n})^t-(\mathcal {A}\underline{\sigma }_h\varvec{n})^t,(\varvec{\Pi _U}\varvec{\phi })^t-\varvec{\phi }^t\rangle _{\partial \mathcal {T}_h\backslash \partial \Omega _D}\\&\quad \le C \Big (h^{t+1}\mu ^{-1}|\mathcal {A}\underline{\sigma }|_{H^t(\Omega )} + h^s |\varvec{u}|_{H^s(\Omega )}\Big ) \mu \Vert \varvec{\phi }\Vert _{H^2(\Omega )}. \end{aligned}$$

The Cauchy–Schwarz inequality yields

$$\begin{aligned} \langle \tau (\varvec{P_M} \varvec{u}_h^t-\varvec{\widehat{u}}_h),(\varvec{\Pi _U}\varvec{\phi }-\varvec{\phi })^t\rangle _{\partial \mathcal {T}_h}\le C h \mu ^{\frac{1}{2}}\Vert \tau ^{\frac{1}{2}} (\varvec{P_M} \varvec{u}_h^t-\varvec{\widehat{u}}_h)\Vert _{L^2(\partial \mathcal {T}_h)}\Vert \varvec{\phi }\Vert _{H^2(\Omega )}. \end{aligned}$$

Similarly, we have

$$\begin{aligned}&((2\mu )^{-1}\mathcal {A}(\underline{\psi }-\underline{\Pi _\Sigma } \underline{\psi }), \underline{e}_\sigma )_{\mathcal {T}_h}\\&\quad = ((2\mu )^{-1}(\mathcal {A}\underline{\psi }-\underline{\Pi _\Sigma } (\mathcal {A} \underline{\psi })), \underline{e_\sigma })_{\mathcal {T}_h}\\&\quad \le \Vert \mathcal {A}\underline{\psi }-\underline{\Pi _\Sigma } (\mathcal {A} \underline{\psi })\Vert _{L^2(\mathcal {T}_h)}\Vert \mu ^{-1}\mathcal {A}\underline{e_\sigma }\Vert _{L^2(\mathcal {T}_h)}\\&\quad \le C h \Vert \underline{\psi }\Vert _{H^1(\Omega )}\Vert \mu ^{-1}\mathcal {A}\underline{e_\sigma }\Vert _{L^2(\mathcal {T}_h)} \end{aligned}$$

and

$$\begin{aligned} (\mathcal {A}\underline{e_\sigma },\varepsilon (\varvec{\phi }-\varvec{\Pi _U}\varvec{\phi }))_{\mathcal {T}_h}&\le \Vert \mathcal {A}\underline{e_\sigma }\Vert _{L^2(\mathcal {T}_h)}\Vert \varepsilon (\varvec{\phi }-\varvec{\Pi _U}\varvec{\phi })\Vert _{L^2(\mathcal {T}_h)}\\&\le C\mu ^{-1} h\Vert \mathcal {A}\underline{e_\sigma }\Vert _{L^2(\mathcal {T}_h)}\mu \Vert \varvec{\phi }\Vert _{H^2(\Omega )} . \end{aligned}$$

We can infer from the Cauchy–Schwarz inequality and the interpolation error estimate (4.5) that

$$\begin{aligned} (\underline{\psi }-\underline{\Pi _\Sigma } \underline{\psi }, \varepsilon (\varvec{u}-\varvec{\Pi _U}\varvec{u}))_{\mathcal {T}_h}&\le C h \Vert \underline{\psi }\Vert _{H^1(\Omega )}\Vert \varepsilon (\varvec{u}-\varvec{\Pi _U}\varvec{u})\Vert _{L^2(\mathcal {T}_h)}\\ {}&\le C h^{s} \Vert \underline{\psi }\Vert _{H^1(\Omega )}|\varvec{u}|_{H^{s}(\Omega )}. \end{aligned}$$

The Cauchy–Schwarz inequality, the trace inequality (4.6) and the error estimates (4.4)–(4.5) yield

$$\begin{aligned} \langle (\underline{\psi }-\underline{\Pi _\Sigma } \underline{\psi })\varvec{n}, \varvec{u}-\varvec{\Pi _{U}}\varvec{u}\rangle _{\partial \mathcal {T}_h}\le Ch^{s} \Vert \underline{\psi }\Vert _{H^1(\Omega )} |\varvec{u}|_{H^{s}(\Omega )}. \end{aligned}$$

Combining the preceding estimates with Theorem 2.3 completes the proof. $\square $

5 Convergence of the Navier–Stokes equations

In this section, we prove Theorem 2.6 on the convergence to the weak solution under a minimal regularity assumption.

We first recall the following inequalities, which will play an important role in the later analysis.

Lemma 5.1

There exists C independent of h such that for all $F\subset \partial K, K\in \mathcal {T}_h$ and $1\le p,q\le \infty $

$$\begin{aligned} \Vert \varvec{v}\Vert _{L^p(K)}\le Ch_K^{d(1/p-1/q)}\Vert \varvec{v}\Vert _{L^q(K)}\quad \forall \varvec{v}\in \varvec{U}_h. \end{aligned}$$

(5.1)
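The power of $h_K$ in (5.1) comes from a scaling argument, which can be illustrated numerically. The sketch below is a one-dimensional analogue ($d=1$, $K=[0,h]$) and not the simplicial setting of the lemma. The helper names `lp_norm_on_interval` and `scaled_ratio` and the sample reference polynomial are assumptions made for illustration. It checks that $\Vert v\Vert _{L^p(K)}/(h^{d(1/p-1/q)}\Vert v\Vert _{L^q(K)})$ does not depend on $h$ when $v$ is the pull-back of a fixed reference polynomial.

```python
import numpy as np

def lp_norm_on_interval(ref_coeffs, h, p, n_quad=20):
    """L^p norm on K = [0, h] of v(x) = vhat(x / h); vhat has power-basis coefficients ref_coeffs."""
    nodes, weights = np.polynomial.legendre.leggauss(n_quad)  # Gauss rule on [-1, 1]
    x = 0.5 * h * (nodes + 1.0)                               # nodes mapped to [0, h]
    w = 0.5 * h * weights                                     # weights scaled to [0, h]
    values = np.polynomial.polynomial.polyval(x / h, ref_coeffs)
    return float(np.sum(w * np.abs(values) ** p) ** (1.0 / p))

def scaled_ratio(ref_coeffs, h, p, q, d=1):
    """Ratio ||v||_{L^p(K)} / (h^{d(1/p - 1/q)} ||v||_{L^q(K)}), expected to be independent of h."""
    num = lp_norm_on_interval(ref_coeffs, h, p)
    den = h ** (d * (1.0 / p - 1.0 / q)) * lp_norm_on_interval(ref_coeffs, h, q)
    return num / den

ref_coeffs = [1.0, -3.0, 2.0]            # sample reference polynomial 1 - 3 s + 2 s^2
mesh_sizes = (1.0, 0.5, 0.25, 0.125)
ratios = [scaled_ratio(ref_coeffs, h, p=4, q=2) for h in mesh_sizes]
```

Only the $h$-dependence is illustrated here. The constant $C$ in (5.1) additionally depends on the polynomial degree and on the shape regularity of the mesh, which this one-dimensional check does not probe.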

Lemma 5.2

There exists C independent of h such that

$$\begin{aligned} \Vert \varvec{v}\Vert _{L^q(\Omega )}\le C\Vert \varvec{v}\Vert _h\quad 1\le q\le 6,\;\forall \varvec{v}\in \varvec{U}_h. \end{aligned}$$

Moreover, the non-negativity and the boundedness of $N_h$ are stated in the next two lemmas; we refer to [10] for the proofs.

Lemma 5.3

(Non-negativity) Let $\varvec{w}\in \varvec{H}^1_0(\Gamma _D)+ (\varvec{U}_h\cap \varvec{H}(\textrm{div};\Omega ))$ with $\nabla \cdot \varvec{w}=0$. Then we have

$$\begin{aligned} N_h(\varvec{w};(\varvec{v},\widehat{\varvec{v}}),(\varvec{v},\widehat{\varvec{v}}))\ge 0\quad \forall \varvec{v}\in \varvec{U}_h+\varvec{H}^1_0(\Gamma _D), \widehat{\varvec{v}}\in \widehat{\varvec{U}}_h. \end{aligned}$$

Lemma 5.4

(Boundedness) For any $\varvec{z}_h,\varvec{v}_h,\varvec{w}_h\in \varvec{U}_h$ and $\widehat{\varvec{v}}_h,\widehat{\varvec{w}}_h\in \widehat{\varvec{U}}_h$, it holds

$$\begin{aligned} N_h(\varvec{z}_h;(\varvec{v}_h,\widehat{\varvec{v}}_h),(\varvec{w}_h,\widehat{\varvec{w}}_h))\le C\left\| \varvec{z}_h\right\| _h\left\| (\varvec{v}_h,\widehat{\varvec{v}}_h)\right\| _{1,h}\left\| (\varvec{w}_h,\widehat{\varvec{w}}_h)\right\| _{1,h}. \end{aligned}$$

A solution operator $T_h:\varvec{U}_h\rightarrow \varvec{U}_h$ is defined as follows: for given $\varvec{z}_h\in \varvec{U}_h$, find $\varvec{w}_h=T_h(\varvec{z}_h)\in \varvec{U}_h$ such that

$$\begin{aligned}&\mathbb {A}_h((\underline{S}_h,\varvec{w}_h,\widehat{\varvec{w}}_h),(\underline{H},\varvec{v},\widehat{\varvec{v}}))+N_h(\varvec{z}_h;(\varvec{w}_h,\widehat{\varvec{w}}_h),(\varvec{v},\widehat{\varvec{v}}))\nonumber \\&\quad =(\varvec{f}_N,\varvec{v})\quad \forall (\underline{H},\varvec{v},\widehat{\varvec{v}})\in \underline{\Sigma }_h\times \varvec{U}_h\times \widehat{\varvec{U}}_h \end{aligned}$$

(5.2)

for some $(\underline{S}_h,\widehat{\varvec{w}}_h)\in \underline{\Sigma }_h\times \widehat{\varvec{U}}_h$.

Observe that finding the solution to (2.13) is equivalent to finding a fixed point $\varvec{u}_h$ of $T_h$, that is,

$$\begin{aligned} T_h(\varvec{u}_h)=\varvec{u}_h \end{aligned}$$

with its corresponding $\underline{S}_h$ and $\widehat{\varvec{w}}_h$.

Then, we have the following stability estimate.

Lemma 5.5

For any $\varvec{z}_h\in \varvec{U}_h$, we have

$$\begin{aligned} \left\| T_h(\varvec{z}_h)\right\| _h=\left\| \varvec{w}_h\right\| _h\le C\mu ^{-1}\left\| \varvec{f}_0\right\| _{L^2(\Omega )}, \end{aligned}$$

where $\varvec{f}_0$ is from the Helmholtz-Hodge decomposition $\varvec{f}_N=\varvec{f}_0+\nabla \chi $ with $\varvec{f}_0\in \varvec{H}(\textrm{div};\Omega ),\nabla \cdot \varvec{f}_0=0$ and $\chi \in L^2_0(\Omega )\cap H^1(\Omega )$.

Proof

Let $\varvec{z}_h\in \varvec{U}_h$ and $\varvec{w}_h=T_h(\varvec{z}_h)$. Then there exists $\varsigma =(\underline{G}_h,\varvec{w}_h,\widehat{\varvec{w}}_h)$ such that

$$\begin{aligned} \mathbb {A}_h(\varsigma ,\varphi )+N_h(\varvec{z}_h;(\varvec{w}_h,\widehat{\varvec{w}}_h),(\varvec{v},\widehat{\varvec{v}}))=(\varvec{f}_N,\varvec{v})\quad \forall \varphi =(\underline{H},\varvec{v},\widehat{\varvec{v}})\in \underline{\Sigma }_h\times \varvec{U}_h\times \widehat{\varvec{U}}_h. \end{aligned}$$

Taking $\varphi =\varsigma $, we obtain using Lemma 5.3

$$\begin{aligned}{} & {} \mu ^{-1}\left\| \mathcal {A}\underline{G}_h\right\| _{L^2(\Omega )}^2+\Vert \tau ^{1/2}(\varvec{P_M}\varvec{w}_h^t-\widehat{\varvec{w}}_h)\Vert _{L^2(\partial \mathcal {T}_h)}^2\\{} & {} \quad \le \mathbb {A}_h(\varsigma ,\varsigma )+N_h(\varvec{z}_h;(\varvec{w}_h,\widehat{\varvec{w}}_h),(\varvec{w}_h,\widehat{\varvec{w}}_h))=(\varvec{f}_N,\varvec{w}_h). \end{aligned}$$

Then the Helmholtz-Hodge decomposition $\varvec{f}_N=\varvec{f}_0+\nabla \chi $ and the fact that $\nabla \cdot \varvec{w}_h=0$ yield

$$\begin{aligned}{} & {} \mu ^{-1}\left\| \mathcal {A}\underline{G}_h\right\| _{L^2(\Omega )}^2+\Vert \tau ^{1/2}(\varvec{P_M}\varvec{w}_h^t-\widehat{\varvec{w}}_h)\Vert _{L^2(\partial \mathcal {T}_h)}^2\\{} & {} \quad \le C\Vert \varvec{f}_0\Vert _{L^2(\Omega )}\Vert \varvec{w}_h\Vert _{L^2(\Omega )}. \end{aligned}$$

Proceeding analogously to (2.15), we can infer from the discrete Poincaré inequality and (2.8)

$$\begin{aligned} \Vert \varvec{w}_h\Vert _h\le & {} C \left\| (\varvec{w}_h,\widehat{\varvec{w}}_h)\right\| _{1,h}\le C\Big ( \mu ^{-1}\left\| \mathcal {A}\underline{G}_h\right\| _{L^2(\Omega )}\\{} & {} +\mu ^{-1/2}\Vert \tau ^{\frac{1}{2}}(\varvec{P_M}\varvec{w}_h^t-\widehat{\varvec{w}}_h)\Vert _{L^2(\partial \mathcal {T}_h)}\Big )\le C\mu ^{-1} \Vert \varvec{f}_0\Vert _{L^2(\Omega )}. \end{aligned}$$

Therefore, the proof is completed. $\square $
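
The closing chain of inequalities can be unpacked as follows. This is a sketch for the reader's convenience rather than part of the original argument: $X:=\Vert \varvec{w}_h\Vert _h$ and $Y^2:=\mu ^{-1}\Vert \mathcal {A}\underline{G}_h\Vert _{L^2(\Omega )}^2+\Vert \tau ^{1/2}(\varvec{P_M}\varvec{w}_h^t-\widehat{\varvec{w}}_h)\Vert _{L^2(\partial \mathcal {T}_h)}^2$ are shorthand introduced here, $C$ is a generic constant independent of $h$ and $\mu $, and the bound $\Vert \varvec{w}_h\Vert _{L^2(\Omega )}\le C\Vert \varvec{w}_h\Vert _h$ is taken from Lemma 5.2 with $q=2$.

$$\begin{aligned} X^2&\le C\Big (\mu ^{-2}\Vert \mathcal {A}\underline{G}_h\Vert _{L^2(\Omega )}^2+\mu ^{-1}\Vert \tau ^{1/2}(\varvec{P_M}\varvec{w}_h^t-\widehat{\varvec{w}}_h)\Vert _{L^2(\partial \mathcal {T}_h)}^2\Big )=C\mu ^{-1}Y^2\\&\le C\mu ^{-1}\Vert \varvec{f}_0\Vert _{L^2(\Omega )}\Vert \varvec{w}_h\Vert _{L^2(\Omega )}\le C\mu ^{-1}\Vert \varvec{f}_0\Vert _{L^2(\Omega )}X. \end{aligned}$$

Dividing by $X$ (the case $X=0$ being trivial) gives $X\le C\mu ^{-1}\Vert \varvec{f}_0\Vert _{L^2(\Omega )}$.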

Lemma 5.6

Let $\{(\underline{\sigma }_h,\varvec{u}_h,\widehat{\varvec{u}}_h){\}}_{h>0}$ be the sequence of solutions of (2.13). Then it holds

$$\begin{aligned} \Vert \underline{\sigma }_h\Vert _{L^2(\Omega )}&\le C\Big ( \Vert \varvec{f}_N\Vert _{L^2(\mathcal {T}_h)}+\mu ^{-2}\Vert \varvec{f}_0\Vert _{L^2(\mathcal {T}_h)}^2\Big ),\\ \Vert \mathcal {A}\underline{\sigma }_h\Vert _{L^2(\Omega )}&\le C \Vert \varvec{f}_0\Vert _{L^2(\mathcal {T}_h)},\\ \Vert (\varvec{u}_h,\widehat{\varvec{u}}_h)\Vert _{1,h}&\le C\mu ^{-1}\Vert \varvec{f}_0\Vert _{L^2(\mathcal {T}_h)}. \end{aligned}$$

Proof

Proceeding in an analogous way to Lemma 5.5, we can obtain the upper bound for $ \Vert (\varvec{u}_h,\widehat{\varvec{u}}_h)\Vert _{1,h}$ and $\Vert \mathcal {A}\underline{\sigma }_h\Vert _{L^2(\Omega )}$. It remains to estimate $\Vert \underline{\sigma }_h\Vert _{L^2(\Omega )}$, which can be proved in a similar way to that of the Stokes equations with additional treatment for the convective trilinear form. In fact, we need to estimate $\Vert \textrm{tr}(\underline{\sigma }_h)\Vert _{L^2(\Omega )}$, that is, we need to bound the right-hand side of (4.14).

There exists a function $\varvec{\theta }\in \varvec{H}^1(\Omega )$ with $\varvec{\theta }\cdot \varvec{n}=0$ on $\partial \Omega _D$ (cf. [34]) such that

$$\begin{aligned} \begin{aligned} \nabla \cdot \varvec{\theta } = \textrm{tr}(\underline{\sigma }_h),\quad \Vert \varvec{\theta }\Vert _{H^1(\Omega )} \le C \Vert \textrm{tr}(\underline{\sigma }_h)\Vert _{L^2(\Omega )}. \end{aligned} \end{aligned}$$

In view of (4.14), it suffices to show the upper bound for $(\underline{\sigma }_h, \varepsilon (\varvec{\Pi _U}\varvec{\theta }))_{\mathcal {T}_h}$. It follows from (2.13) and Lemma 5.4 that

$$\begin{aligned} (\underline{\sigma }_h, \varepsilon (\varvec{\Pi _U}\varvec{\theta }))_{\mathcal {T}_h}&=\langle \mathcal {A}(\underline{\sigma }_h\varvec{n})^t, (\varvec{\Pi _U}\varvec{\theta })^t -\varvec{P_M}\varvec{\theta }^t\rangle _{\partial \mathcal {T}_h}\\ {}&\quad -\langle \tau (\varvec{P_M} \varvec{u}_h^t-\varvec{\widehat{u}}_h),(\varvec{\Pi _U}\varvec{\theta })^t-\varvec{P_M}\varvec{\theta }^t\rangle _{\partial \mathcal {T}_h}\\&\quad -N_h(\varvec{u}_h;(\varvec{u}_h,\widehat{\varvec{u}}_h),(\varvec{\Pi _U}\varvec{\theta },\varvec{P_M}\varvec{\theta }^t))+(\varvec{f}_N,\varvec{\Pi _U}\varvec{\theta })_{\mathcal {T}_h}\\&\le C \Big (\Vert \mathcal {A}\underline{\sigma }_h\Vert _{L^2(\mathcal {T}_h)}+\mu ^{\frac{1}{2}}\Vert \tau ^{\frac{1}{2}} (\varvec{P_M} \varvec{u}_h^t\\&-\varvec{\widehat{u}}_h)\Vert _{L^2(\partial \mathcal {T}_h)}{+\Vert (\varvec{u}_h,\widehat{\varvec{u}}_h)\Vert _{1,h}^2+\Vert \varvec{f}_N\Vert _{L^2(\mathcal {T}_h)}}\Big )\Vert \varvec{\theta }\Vert _{H^1(\Omega )}. \end{aligned}$$

Therefore, it holds owing to (4.14)

$$\begin{aligned}&\Vert \textrm{tr}(\underline{\sigma }_h)\Vert _{L^2(\Omega )}\le C\Big ( \Vert \mathcal {A}\underline{\sigma }_h\Vert _{L^2(\mathcal {T}_h)}+\mu ^{\frac{1}{2}}\Vert \tau ^{\frac{1}{2}} (\varvec{P_M} \varvec{u}_h^t-\varvec{\widehat{u}}_h)\Vert _{L^2(\partial \mathcal {T}_h)}\\&\qquad +\Vert \varvec{f}_N\Vert _{L^2(\mathcal {T}_h)}+\Vert (\varvec{u}_h,\widehat{\varvec{u}}_h)\Vert _{1,h}^2\Big )\\&\quad \le C \Big ( \Vert \mathcal {A}\underline{\sigma }_h\Vert _{L^2(\mathcal {T}_h)}+\Vert \varvec{f}_N\Vert _{L^2(\mathcal {T}_h)}+\mu ^{-2}\Vert \varvec{f}_0\Vert _{L^2(\mathcal {T}_h)}^2\Big ). \end{aligned}$$

Thus, the triangle inequality yields

$$\begin{aligned} \Vert \underline{\sigma }_h\Vert _{L^2(\Omega )}\le \Big (\Vert \mathcal {A}\underline{\sigma }_h\Vert _{L^2(\Omega )}+\Vert \textrm{tr}(\underline{\sigma }_h)\Vert _{L^2(\Omega )}\Big )\le C \Big (\Vert \varvec{f}_N\Vert _{L^2(\mathcal {T}_h)}+\mu ^{-2}\Vert \varvec{f}_0\Vert _{L^2(\mathcal {T}_h)}^2\Big ). \end{aligned}$$

$\square $

By Lemma 5.5 and the Brouwer fixed point theorem, the existence of the fixed-point $\varvec{u}_h=T_h(\varvec{u}_h)$ is guaranteed. Now we are ready to show the convergence to the weak solution. To facilitate later analysis, we define the lifting operator $\underline{R}$: $\varvec{L}^2(\partial \mathcal {T}_h)\rightarrow \underline{\Sigma }_h$ by

$$\begin{aligned} \int _{\Omega } \underline{R}(\varvec{\psi }) \underline{w}\;dx=\langle \varvec{\psi }, (\underline{w}\varvec{n})^t\rangle _{\partial \mathcal {T}_h}\quad \forall \underline{w}\in \underline{\Sigma }_h. \end{aligned}$$

(5.3)

For $q\in P_h$, the following holds in view of (5.3) and (2.14):

$$\begin{aligned} (\textrm{tr}(\varepsilon (\varvec{u}_h)-\underline{R}(\varvec{P_M}\varvec{u}_h^t-\varvec{\widehat{u}}_h)), q)_{\mathcal {T}_h}&=\sum _{K\in \mathcal {T}_h}(\nabla \cdot \varvec{u}_h,q)_K-(\underline{R}(\varvec{P_M}(\varvec{u}_h^t)-\varvec{\widehat{u}}_h), q\underline{I})_{\mathcal {T}_h}\\&\quad =\sum _{K\in \mathcal {T}_h}(\nabla \cdot \varvec{u}_h,q)_K=0. \end{aligned}$$

Let $(\varepsilon _h(\varvec{u}_h))_{|K}:=\varepsilon ((\varvec{u}_h)_{|K})$. The identity above then gives $ \textrm{tr}(\varepsilon _h(\varvec{u}_h)-\underline{R}(\varvec{P_M}\varvec{u}_h^t-\varvec{\widehat{u}}_h))=0 $. Notice that $\varepsilon _h(\varvec{u}_h)-\underline{R}(\varvec{P_M}\varvec{u}_h^t-\varvec{\widehat{u}}_h)\in \underline{\Sigma }_h $. Setting $\underline{G}_h(\varvec{u}_h,\widehat{\varvec{u}}_h):=\varepsilon _h(\varvec{u}_h)-\underline{R}(\varvec{P_M}\varvec{u}_h^t-\varvec{\widehat{u}}_h)$, we can infer from (2.13) and integration by parts that

$$\begin{aligned} (2\mu )^{-1}\mathcal {A}\underline{\sigma }_h = \underline{G}_h(\varvec{u}_h,\widehat{\varvec{u}}_h) . \end{aligned}$$

(5.4)
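
The lifting term drops out of the trace identity preceding (5.4) for the following reason. The sketch assumes that $q\underline{I}\in \underline{\Sigma }_h$ for $q\in P_h$ and that the tangential component is $\varvec{v}^t=\varvec{v}-(\varvec{v}\cdot \varvec{n})\varvec{n}$; both are assumptions about notation fixed earlier in the paper, and $\varvec{\psi }$ stands for any argument of the lifting operator.

$$\begin{aligned} (\textrm{tr}(\underline{R}(\varvec{\psi })),q)_{\mathcal {T}_h}=(\underline{R}(\varvec{\psi }),q\underline{I})_{\mathcal {T}_h}=\langle \varvec{\psi },(q\underline{I}\varvec{n})^t\rangle _{\partial \mathcal {T}_h}=\langle \varvec{\psi },q\,\varvec{n}^t\rangle _{\partial \mathcal {T}_h}=0,\quad \text{since }\;\varvec{n}^t=\varvec{n}-(\varvec{n}\cdot \varvec{n})\varvec{n}=\varvec{0}. \end{aligned}$$

In other words, the lifting only sees the tangential part of $\underline{w}\varvec{n}$, and a multiple of the identity has none.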

The lifting operator satisfies the following estimate.

Lemma 5.7

The following estimate holds

$$\begin{aligned} \Vert \underline{R}(\varvec{P_M}\varvec{u}_h^t-\varvec{\widehat{u}}_h)\Vert _{L^2(\mathcal {T}_h)}\le C \mu ^{-\frac{1}{2}} \Bigg (\sum _{K\in \mathcal {T}_h} \left\| \tau ^{\frac{1}{2}}(\varvec{P_M}\varvec{u}_h^t-\varvec{\widehat{u}}_h)\right\| _{L^2(\partial K)}^2\Bigg )^{\frac{1}{2}}. \end{aligned}$$

Proof

We can deduce from (5.3), the Cauchy–Schwarz inequality and the trace inequality (4.7) that

$$\begin{aligned}&\Vert \underline{R}(\varvec{P_M}\varvec{u}_h^t-\varvec{\widehat{u}}_h)\Vert _{L^2(\mathcal {T}_h)}^2\\&\quad =(\underline{R}(\varvec{P_M}\varvec{u}_h^t-\varvec{\widehat{u}}_h),\underline{R}(\varvec{P_M}\varvec{u}_h^t-\varvec{\widehat{u}}_h))_{\mathcal {T}_h}\\&\quad =\langle \varvec{P_M}\varvec{u}_h^t-\varvec{\widehat{u}}_h,(\underline{R}(\varvec{P_M}\varvec{u}_h^t-\varvec{\widehat{u}}_h)\varvec{n})^t\rangle _{\partial \mathcal {T}_h}\\&\quad \le C \sum _{K\in \mathcal {T}_h}h_K^{-1/2} \Vert \varvec{P_M}\varvec{u}_h^t-\varvec{\widehat{u}}_h\Vert _{L^2(\partial K)}\Vert \underline{R}(\varvec{P_M}\varvec{u}_h^t-\varvec{\widehat{u}}_h)\Vert _{L^2( K)}\\&\quad \le C \mu ^{-\frac{1}{2}} \Bigg (\sum _{K\in \mathcal {T}_h} \Vert \tau ^{\frac{1}{2}}(\varvec{P_M}\varvec{u}_h^t-\varvec{\widehat{u}}_h)\Vert _{L^2(\partial K)}^2\Bigg )^{\frac{1}{2}}\Vert \underline{R}(\varvec{P_M}\varvec{u}_h^t-\varvec{\widehat{u}}_h)\Vert _{L^2(\mathcal {T}_h)}, \end{aligned}$$

which implies the desired estimate. $\square $
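
The last step of this proof combines a rescaling by $\tau $ with the discrete Cauchy–Schwarz inequality. The following sketch assumes that the stabilization parameter scales like $\tau \simeq \mu h_K^{-1}$ on $\partial K$; this is consistent with the factor $\mu ^{-1/2}$ in the lemma but is not restated in this section, so it should be read as an assumption. The symbols $\varvec{\psi }:=\varvec{P_M}\varvec{u}_h^t-\varvec{\widehat{u}}_h$, $a_K$ and $b_K$ are shorthand introduced here.

$$\begin{aligned} h_K^{-1/2}\Vert \varvec{\psi }\Vert _{L^2(\partial K)}&\le C\mu ^{-\frac{1}{2}}\Vert \tau ^{\frac{1}{2}}\varvec{\psi }\Vert _{L^2(\partial K)}=:C\mu ^{-\frac{1}{2}}a_K,\qquad b_K:=\Vert \underline{R}(\varvec{\psi })\Vert _{L^2(K)},\\ \sum _{K\in \mathcal {T}_h}a_Kb_K&\le \Bigg (\sum _{K\in \mathcal {T}_h}a_K^2\Bigg )^{\frac{1}{2}}\Bigg (\sum _{K\in \mathcal {T}_h}b_K^2\Bigg )^{\frac{1}{2}}=\Bigg (\sum _{K\in \mathcal {T}_h}a_K^2\Bigg )^{\frac{1}{2}}\Vert \underline{R}(\varvec{\psi })\Vert _{L^2(\mathcal {T}_h)}. \end{aligned}$$

Dividing both sides of the resulting inequality by $\Vert \underline{R}(\varvec{\psi })\Vert _{L^2(\mathcal {T}_h)}$ gives the bound of Lemma 5.7.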

In addition, we also define the lifting operator $\underline{R}^s:L^2(\partial \mathcal {T}_h)\rightarrow P_k(\mathcal {T}_h)^{d\times d}$ by

$$\begin{aligned} \int _{\Omega } \underline{R}^s(\varvec{\psi }) \underline{w}\;dx=\langle \varvec{\psi }, \underline{w}\varvec{n}\rangle _{\partial \mathcal {T}_h\backslash \partial \Omega _D}\quad \forall \underline{w}\in P_k(\mathcal {T}_h)^{d\times d}. \end{aligned}$$

Then a discrete gradient operator is defined as

$$\begin{aligned} \underline{\mathcal {G}}_h^k(\varvec{v},\widehat{\varvec{v}}):=\nabla _h\varvec{v}-\underline{R}^s(\varvec{v}^t-\widehat{\varvec{v}}), \end{aligned}$$

where $\nabla _h$ represents the element-wise gradient operator.

An application of integration by parts implies that

$$\begin{aligned} N_h(\varvec{u}_h;(\varvec{u}_h,\widehat{\varvec{u}}_h),(\varvec{v}_h,\widehat{\varvec{v}}_h))&=(\varvec{u}_h\cdot \underline{\mathcal {G}}_h^{2k}(\varvec{u}_h,\widehat{\varvec{u}}_h),\varvec{v}_h)_{\mathcal {T}_h}\nonumber \\&\quad +\frac{1}{2}\left\langle |\varvec{u}_h\cdot \textbf{n}|+\varvec{u}_h\cdot \varvec{n},(\varvec{u}_h^t-\widehat{\varvec{u}}_h)\cdot (\varvec{v}_h^t-\widehat{\varvec{v}}_h)\right\rangle _{\partial \mathcal {T}_h\backslash \partial \Omega _D}.\nonumber \\ \end{aligned}$$

(5.5)

Theorem 5.1

(Compactness) Let $1\le p<\infty $. Let $\{(\varvec{v}_h,\widehat{\varvec{v}}_h)\}_{h>0}$ be a sequence in $\varvec{U}_h\times \widehat{\varvec{U}}_h$ bounded in $\Vert (\cdot ,\cdot )\Vert _{1,h}$-norm. Then the sequence $\varvec{v}_h$ is relatively compact in $\varvec{L}^q(\Omega ),1\le q\le 6$.

Proof

We can prove the theorem following Lemma 6.2 and Theorem 6.2 in [16], and the boundedness (2.8). The details are omitted for simplicity. $\square $

Lemma 5.8

Let $\{(\varvec{v}_h,\widehat{\varvec{v}}_h)\}_{h>0}$ be a sequence in $\varvec{U}_h\times \widehat{\varvec{U}}_h$. Assume that the sequence $\{(\varvec{v}_h,\widehat{\varvec{v}}_h)\}_{h>0}$ is bounded in the $\Vert (\cdot ,\cdot )\Vert _{1,h}$-norm. Then, there exists a function $\varvec{v}\in \varvec{H}^1_0(\Gamma _D)$ such that as $h\rightarrow 0$, up to a subsequence, $\varvec{v}_h\rightarrow \varvec{v}$ strongly in $\varvec{L}^2(\Omega )$ and $\underline{G}_h(\varvec{v}_h,\widehat{\varvec{v}}_h)\rightharpoonup \varepsilon (\varvec{v})$ weakly in $\underline{L}^2(S,\Omega )$.

Proof

Owing to Theorem 5.1 applied with $q=2$, there exists a function $\varvec{v}\in \varvec{L}^2(\Omega )$ such that, up to a subsequence, $\varvec{v}_h\rightarrow \varvec{v}$ strongly in $\varvec{L}^2(\Omega )$. Moreover, $\underline{G}_h(\varvec{v}_h,\widehat{\varvec{v}}_h)$ is bounded in the $L^2$-norm owing to Lemma 5.7 and the assumed boundedness of the sequence in the $\Vert (\cdot ,\cdot )\Vert _{1,h}$-norm. Thus, up to a new subsequence, there is $\underline{w}\in \underline{L}^2(S,\Omega )$ such that $\underline{G}_h(\varvec{v}_h,\widehat{\varvec{v}}_h)\rightharpoonup \underline{w}$ weakly in $\underline{L}^2(S,\Omega )$.

For $\underline{\psi }\in \underline{C}^{\infty }(S,\Omega )\cap \underline{C}(S,\bar{\Omega })$, it holds in view of (5.3) and the definition of $\varvec{P_M}$ that

$$\begin{aligned} -\langle \varvec{v}_h^t, (\underline{\Pi _\Sigma } \underline{\psi }\varvec{n})^t\rangle _{\partial \mathcal {T}_h}&=-\langle \varvec{P_M}\varvec{v}_h^t,( \underline{\Pi _\Sigma } \underline{\psi }\varvec{n})^t\rangle _{\partial \mathcal {T}_h}=-\int _{\Omega } \underline{R}(\varvec{P_M}\varvec{v}_h^t)\underline{\Pi }_\Sigma \underline{\psi }\;dx,\\ \langle \widehat{\varvec{v}}_h, (\underline{\Pi }_\Sigma \underline{\psi }\varvec{n})^t\rangle _{\partial \mathcal {T}_h}&=\int _{\Omega } \underline{R}(\widehat{\varvec{v}}_h)\underline{\Pi }_\Sigma \underline{\psi }\;dx. \end{aligned}$$

Thanks to the fact that $\widehat{\varvec{v}}_h$ is single valued and $\widehat{\varvec{v}}_h=\varvec{0}$ on $\partial \Omega _D$, we can infer that

$$\begin{aligned} \langle \widehat{\varvec{v}}_h, (\underline{\psi }\varvec{n})^t\rangle _{\partial \mathcal {T}_h}= \langle \widehat{\varvec{v}}_h, (\underline{\psi }\varvec{n})^t\rangle _{\partial \Omega _N}. \end{aligned}$$

Then using the fact that $\varvec{v}_h^n=0$ on $\partial \Omega _D$, we can deduce that

$$\begin{aligned}&(\underline{G}_h(\varvec{v}_h,\widehat{\varvec{v}}_h), \underline{\psi })_{\mathcal {T}_h}\\&\quad =( \varepsilon _h(\varvec{v}_h),\underline{\psi })_{\mathcal {T}_h} -(\underline{R}(\varvec{P_M}\varvec{v}_h^t-\widehat{\varvec{v}}_h),\underline{\psi })_{\mathcal {T}_h}\\&\quad =\langle \varvec{v}_h,\underline{\psi }\varvec{n}\rangle _{\partial \mathcal {T}_h}-(\varvec{v}_h,\textrm{div}\underline{\psi })_{\mathcal {T}_h}-(\underline{R}(\varvec{P_M} \varvec{v}_h^t-\widehat{\varvec{v}}_h),\underline{\psi })_{\mathcal {T}_h}\\&\quad =\langle \varvec{v}_h^t,(\underline{\psi }\varvec{n})^t\rangle _{\partial \mathcal {T}_h}{+\langle \varvec{v}_h^n,(\underline{\psi }\varvec{n})^n\rangle _{\partial \Omega _N}}-(\varvec{v}_h,\textrm{div}\underline{\psi })_{\mathcal {T}_h}-(\underline{R}(\varvec{P_M}\varvec{v}_h^t-\widehat{\varvec{v}}_h),\underline{\psi })_{\mathcal {T}_h}\\&\quad =\langle \varvec{v}_h^t-\widehat{\varvec{v}}_h,((\underline{\psi }-\underline{\Pi _\Sigma }\underline{\psi })\varvec{n})^t\rangle _{\partial \mathcal {T}_h}-(\underline{R}(\varvec{P_M}\varvec{v}_h^t-\widehat{\varvec{v}}_h),\underline{\psi }-\underline{\Pi _\Sigma }\underline{\psi })_{\mathcal {T}_h}\\&\qquad -(\varvec{v}_h,\textrm{div}\underline{\psi })_{\mathcal {T}_h}+{\langle \varvec{v}_h^n,(\underline{\psi }\varvec{n})^n\rangle _{\partial \Omega _N}}+{\langle \widehat{\varvec{v}}_h,(\underline{\psi }\varvec{n})^t\rangle _{\partial \Omega _N}}\\&\quad {=T_1+T_2+T_3+T_4+T_5}. \end{aligned}$$

The Cauchy–Schwarz inequality and the boundedness of $\Vert h^{-1/2}(\varvec{v}_h^t-\widehat{\varvec{v}}_h)\Vert _{L^2(\partial \mathcal {T}_h)}$ yield

$$\begin{aligned} |T_1|&\le C \Vert h^{-1/2}(\varvec{v}_h^t-\widehat{\varvec{v}}_h)\Vert _{L^2(\partial \mathcal {T}_h)}h^{1/2}\Vert \underline{\psi }-\underline{\Pi _\Sigma }\underline{\psi }\Vert _{L^2(\partial \mathcal {T}_h)} \le C h^{1/2}\Vert \underline{\psi }\\&\quad -\underline{\Pi _\Sigma }\underline{\psi }\Vert _{L^2(\partial \mathcal {T}_h)}, \end{aligned}$$

which tends to zero as $h\rightarrow 0$.

Similarly, the Cauchy–Schwarz inequality and Lemma 5.7 imply

$$\begin{aligned} | T_2|\le C\Vert h^{-1/2}(\varvec{P_M}\varvec{v}_h^t-\widehat{\varvec{v}}_h)\Vert _{L^2(\partial \mathcal {T}_h)} \Vert \underline{\psi }-\underline{\Pi _\Sigma }\underline{\psi }\Vert _{L^2(\Omega )} , \end{aligned}$$

which tends to zero as $h\rightarrow 0$.

Moreover, $T_3\rightarrow -\int _{\Omega } \varvec{v}\textrm{div}\underline{\psi }$. The Cauchy–Schwarz inequality and the boundedness of $\Vert h^{-1/2}(\varvec{v}_h^t-\widehat{\varvec{v}}_h)\Vert _{L^2(\partial \mathcal {T}_h)}$ give

$$\begin{aligned} \langle \widehat{\varvec{v}}_h-\varvec{v}_h^t,(\underline{\psi }\varvec{n})^t\rangle _{\partial \Omega _N}&\le \Vert h^{-1/2}(\varvec{v}_h^t-\widehat{\varvec{v}}_h)\Vert _{L^2(\partial \mathcal {T}_h)}h^{1/2}\Vert \underline{\psi }\Vert _{L^2(\partial \mathcal {T}_h)}\\ {}&\le C h^{1/2}\Vert \underline{\psi }\Vert _{L^2(\partial \mathcal {T}_h)}, \end{aligned}$$

which tends to zero as $h\rightarrow 0$. Therefore, $ \langle \widehat{\varvec{v}}_h-\varvec{v}_h^t,(\underline{\psi }\varvec{n})^t\rangle _{\partial \Omega _N}\rightarrow 0$, so that $T_5$ can be replaced by $\langle \varvec{v}_h^t,(\underline{\psi }\varvec{n})^t\rangle _{\partial \Omega _N}$ in the limit. Taking $\underline{\psi }\in \underline{C}_c^\infty (\Omega )$ arbitrarily, we can infer from the preceding arguments that

$$\begin{aligned} ( \underline{w},\underline{\psi })=\lim _{h\rightarrow 0} (\underline{G}_h(\varvec{v}_h,\widehat{\varvec{v}}_h), \underline{\psi })_{\mathcal {T}_h}=-( \varvec{v},\textrm{div}\underline{\psi }). \end{aligned}$$

Since $\underline{\psi }$ is symmetric, it holds $\underline{w}=\varepsilon (\varvec{v})$. Thereby, $\varvec{v}\in \varvec{H}^1(\Omega )$.

Next, taking $\underline{\psi }\in \underline{C}^\infty (S,\Omega )\cap \underline{C}(S,\bar{\Omega })$ arbitrarily, we have $ ( \underline{w},\underline{\psi })=\lim _{h\rightarrow 0} (\underline{G}_h(\varvec{v}_h,\widehat{\varvec{v}}_h), \underline{\psi })_{\mathcal {T}_h}=-( \varvec{v},\textrm{div}\underline{\psi })+ \lim _{h\rightarrow 0}\langle \varvec{v}_h,\underline{\psi }\varvec{n}\rangle _{\partial \Omega _N}$. It then follows from integration by parts that $ \langle \varvec{v},\underline{\psi }\varvec{n}\rangle _{\partial \Omega _D}=0$ for arbitrary $\underline{\psi }\in \underline{C}^\infty (S,\Omega )\cap \underline{C}(S,\bar{\Omega })$ with the additional restriction that $\underline{\psi }\varvec{n}=\varvec{0}$ on $\partial \Omega _N$, which in turn implies that $\varvec{v}=\varvec{0}$ on $\partial \Omega _D$. Hence, it holds $\varvec{v}\in \varvec{H}^1_0(\Gamma _D)$. $\square $

Proceeding similarly to Lemma 5.8, we can prove the following lemma.

Lemma 5.9

Let $\{(\varvec{v}_h,\widehat{\varvec{v}}_h)\}_{h>0}$ be a sequence in $\varvec{U}_h\times \widehat{\varvec{U}}_h$. Assume that the sequence $\{(\varvec{v}_h,\widehat{\varvec{v}}_h)\}_{h>0}$ is bounded in the $\Vert (\cdot ,\cdot )\Vert _{1,h}$-norm. Then, there exists a function $\varvec{v}\in \varvec{H}^1_0(\Gamma _D)$ such that as $h\rightarrow 0$, up to a subsequence, $\varvec{v}_h\rightarrow \varvec{v}$ strongly in $\varvec{L}^2(\Omega )$ and $\underline{\mathcal {G}}_h^k(\varvec{v}_h,\widehat{\varvec{v}}_h)\rightharpoonup \nabla \varvec{v}$ weakly in $\underline{L}^2(\Omega )$.

Let $\varvec{V}_h:=\{\varvec{v}\in \varvec{H}^1_0(\Gamma _D): \varvec{v}_{|K}\in \varvec{P}_{k+1}(K)\; \forall K\in \mathcal {T}_h\}$. We seek $\varvec{\Pi _h^c}\varvec{u}\in \varvec{V}_h$ such that

$$\begin{aligned} (\nabla \varvec{\Pi _h^c}\varvec{u},\nabla \varvec{v}_h)=(\nabla \varvec{u},\nabla \varvec{v}_h)\quad \forall \varvec{v}_h\in \varvec{V}_h, \end{aligned}$$

which is well-posed by the Riesz representation theorem. It follows from the density argument that $\Vert \nabla (\varvec{u}-\varvec{\Pi _h^c}\varvec{u})\Vert _{L^2(\mathcal {T}_h)}\rightarrow 0$ as $h\rightarrow 0$.
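
The convergence of $\varvec{\Pi _h^c}\varvec{u}$ rests on a best-approximation property, which can be sketched as follows. We assume here that $\varvec{u}\in \varvec{H}^1_0(\Gamma _D)$ and that $\partial \Omega _D$ has positive measure, so that $(\nabla \cdot ,\nabla \cdot )$ is an inner product on $\varvec{H}^1_0(\Gamma _D)$; neither assumption is spelled out at this point of the text. For any $\varvec{v}_h\in \varvec{V}_h$, the defining relation gives the orthogonality $(\nabla (\varvec{u}-\varvec{\Pi _h^c}\varvec{u}),\nabla (\varvec{\Pi _h^c}\varvec{u}-\varvec{v}_h))=0$, hence

$$\begin{aligned} \Vert \nabla (\varvec{u}-\varvec{\Pi _h^c}\varvec{u})\Vert _{L^2(\Omega )}^2&=(\nabla (\varvec{u}-\varvec{\Pi _h^c}\varvec{u}),\nabla (\varvec{u}-\varvec{v}_h))\le \Vert \nabla (\varvec{u}-\varvec{\Pi _h^c}\varvec{u})\Vert _{L^2(\Omega )}\Vert \nabla (\varvec{u}-\varvec{v}_h)\Vert _{L^2(\Omega )},\\ \Vert \nabla (\varvec{u}-\varvec{\Pi _h^c}\varvec{u})\Vert _{L^2(\Omega )}&\le \inf _{\varvec{v}_h\in \varvec{V}_h}\Vert \nabla (\varvec{u}-\varvec{v}_h)\Vert _{L^2(\Omega )}. \end{aligned}$$

The density argument then amounts to approximating $\varvec{u}$ by smooth functions vanishing on $\partial \Omega _D$ and interpolating those in $\varvec{V}_h$, which drives the infimum to zero without any regularity beyond $\varvec{H}^1$.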

Proof of Theorem 2.6 (Convergence to weak solution)

In view of Lemmas 5.6 and 5.8, up to a subsequence, there is $(\underline{\sigma },\varvec{u})\in \underline{L}^2(S,\Omega )\times \varvec{H}^1_0(\Gamma _D)$ such that $\underline{\sigma }_h\rightharpoonup \underline{\sigma }$ weakly in $\underline{L}^2(S,\Omega )$, $\varvec{u}_h\rightarrow \varvec{u}$ strongly in $\varvec{L}^2(\Omega )$, $\varvec{u}_h\rightarrow \varvec{u}$ strongly in $\varvec{L}^4(\Omega )$, $\underline{G}_h(\varvec{u}_h,\widehat{\varvec{u}}_h)\rightharpoonup \varepsilon (\varvec{u})$ weakly in $\underline{L}^2(S,\Omega )$, $\mathcal {A}\underline{\sigma }_h\rightharpoonup \mathcal {A}\underline{\sigma }$ weakly in $\underline{L}^2(S,\Omega )$.

Let $\underline{\psi }\in \underline{C}_c^\infty (S,\Omega )$. Testing (2.13) with $\underline{\Pi _\Sigma }\underline{\psi }$ yields

$$\begin{aligned}&((2\mu )^{-1}\mathcal {A}\underline{\sigma }, \underline{\psi })-(\varepsilon (\varvec{u}), \underline{\psi })\\&\quad = ((2\mu )^{-1}\mathcal {A}(\underline{\sigma }-\underline{\sigma }_h), \underline{\psi })+ ((2\mu )^{-1}\mathcal {A}\underline{\sigma }_h, \underline{\psi }-\underline{\Pi _\Sigma } \underline{\psi })+((2\mu )^{-1}\mathcal {A}\underline{\sigma }_h,\underline{\Pi _\Sigma }\underline{\psi })\\&\qquad -(\underline{\psi },\varepsilon (\varvec{u})-\underline{G}_h(\varvec{u}_h,\widehat{\varvec{u}}_h))-(\underline{\psi }-\underline{\Pi _\Sigma } \underline{\psi },\underline{G}_h(\varvec{u}_h,\widehat{\varvec{u}}_h))\\&\qquad -(\underline{G}_h(\varvec{u}_h,\widehat{\varvec{u}}_h),\underline{\Pi _\Sigma } \underline{\psi })=\sum _{i=1}^6T_i. \end{aligned}$$

As $h\rightarrow 0$, the weak convergence of $\mathcal {A}\underline{\sigma }_h$, the boundedness of $\Vert \mathcal {A}\underline{\sigma }_h\Vert _{L^2(\Omega )}$ and the strong convergence of $\underline{\Pi _\Sigma }\underline{\psi }$ imply that $T_1\rightarrow 0$ and $T_2\rightarrow 0$. Similarly, we can infer that $T_4\rightarrow 0$ and $T_5\rightarrow 0$ owing to the weak convergence of $\underline{G}_h(\varvec{u}_h,\widehat{\varvec{u}}_h)$, the boundedness of $\Vert \underline{G}_h(\varvec{u}_h,\widehat{\varvec{u}}_h)\Vert _{L^2(\Omega )}$ and the strong convergence of $\underline{\Pi _\Sigma }\underline{\psi }$.

In view of (5.4), we have

$$\begin{aligned} T_3+T_6=((2\mu )^{-1}\mathcal {A}\underline{\sigma }_h,\underline{\Pi _\Sigma }\underline{\psi })-(\underline{G}_h(\varvec{u}_h,\widehat{\varvec{u}}_h),\underline{\Pi _\Sigma } \underline{\psi })=0. \end{aligned}$$

Therefore, it holds

$$\begin{aligned} ((2\mu )^{-1}\mathcal {A}\underline{\sigma }, \underline{\psi })-(\varepsilon (\varvec{u}), \underline{\psi })=0. \end{aligned}$$

On the other hand, for $\varvec{\chi }\in \varvec{C}^\infty (\Omega )\cap \varvec{C}(\bar{\Omega })$, it follows from (2.13) that

$$\begin{aligned} \begin{aligned}&(\underline{\sigma }_h, \varepsilon (\varvec{\Pi _U}\varvec{\chi }))_{\mathcal {T}_h}-\langle (\underline{\sigma }_h\varvec{n})^t,(\varvec{\Pi _U}\varvec{\chi })^t-\varvec{P_M}\varvec{\chi }^t\rangle _{\partial \mathcal {T}_h}\\&\quad +\langle \tau (\varvec{P_M} \varvec{u}_h^t-\varvec{\widehat{u}}_h),(\varvec{\Pi _U}\varvec{\chi })^t-\varvec{P_M}\varvec{\chi }^t\rangle _{\partial \mathcal {T}_h}\\&\quad +N_h(\varvec{u}_h;(\varvec{u}_h,\widehat{\varvec{u}}_h),(\varvec{\Pi _U}\varvec{\chi },\varvec{P_M}\varvec{\chi }^t))=(\varvec{f}_N,\varvec{\Pi _U}\varvec{\chi })_{\mathcal {T}_h}. \end{aligned} \end{aligned}$$

(5.6)

The second term and the third term on the left-hand side can be bounded by

$$\begin{aligned} \begin{aligned}&|-\langle (\underline{\sigma }_h\varvec{n})^t,(\varvec{\Pi _U}\varvec{\chi })^t-\varvec{P_M}\varvec{\chi }^t\rangle _{\partial \mathcal {T}_h}+\langle \tau (\varvec{P_M} \varvec{u}_h^t-\varvec{\widehat{u}}_h),(\varvec{\Pi _U}\varvec{\chi })^t-\varvec{P_M}\varvec{\chi }^t\rangle _{\partial \mathcal {T}_h}|\\&\quad \le Ch\Big ( \Vert \underline{\sigma }_h\Vert _{L^2(\Omega )} \Vert \varvec{\chi }\Vert _{H^2(\Omega )}+\left\| \tau ^{\frac{1}{2}}(\varvec{P_M} \varvec{u}_h^t-\varvec{\widehat{u}}_h)\right\| _{L^2(\partial \mathcal {T}_h)}\Vert \varvec{\chi }\Vert _{H^2(\Omega )}\Big ), \end{aligned} \end{aligned}$$

which tends to zero as $h\rightarrow 0$ owing to the boundedness of $\Vert \underline{\sigma }_h\Vert _{L^2(\Omega )}$ and $\Vert \tau ^{\frac{1}{2}}(\varvec{P_M} \varvec{u}_h^t-\varvec{\widehat{u}}_h)\Vert _{L^2(\partial \mathcal {T}_h)}$.

Moreover, we have

$$\begin{aligned} (\underline{\sigma }_h, \varepsilon (\varvec{\Pi _U}\varvec{\chi }))_{\mathcal {T}_h}\rightarrow (\underline{\sigma }, \varepsilon (\varvec{\chi }))_{\mathcal {T}_h} \;\text{ as }\; h\rightarrow 0 \end{aligned}$$

owing to the weak convergence of $\underline{\sigma }_h$ and the strong convergence of $\varepsilon _h(\varvec{\Pi _U}\varvec{\chi })$ (cf. [16]). It also holds $(\varvec{f}_N,\varvec{\Pi _U}\varvec{\chi })_{\mathcal {T}_h}\rightarrow (\varvec{f}_N,\varvec{\chi })_{\mathcal {T}_h}$ in view of the strong convergence of $\varvec{\Pi _U}\varvec{\chi }$. It remains to estimate the last term on the left-hand side of (5.6). It follows from (5.5) that

$$\begin{aligned} \begin{aligned}&N_h(\varvec{u}_h;(\varvec{u}_h,\widehat{\varvec{u}}_h),(\varvec{\Pi _U}\varvec{\chi },\varvec{P_M}\varvec{\chi }^t)) \\&\quad =(\varvec{u}_h\cdot \underline{\mathcal {G}}_h^{2k}(\varvec{u}_h,\widehat{\varvec{u}}_h),\varvec{\Pi _U}\varvec{\chi })_{\mathcal {T}_h}\\&\qquad +\frac{1}{2}\left\langle |\varvec{u}_h\cdot \textbf{n}|+\varvec{u}_h\cdot \varvec{n},(\varvec{u}_h^t-\widehat{\varvec{u}}_h)\cdot ((\varvec{\Pi _U\chi })^t -\varvec{P_M}\varvec{\chi }^t)\right\rangle _{\partial \mathcal {T}_h\backslash \partial \Omega _D}. \end{aligned} \end{aligned}$$

(5.7)

Since $\varvec{u}_h\rightarrow \varvec{u}$ in $\varvec{L}^4(\Omega )$ and $\varvec{\Pi _U}\varvec{\chi }\rightarrow \varvec{\chi }$ in $\varvec{L}^4(\Omega )$, we infer that $\varvec{u}_h\varvec{\Pi _U}\varvec{\chi }\rightarrow \varvec{u}\varvec{\chi }$ in $\varvec{L}^2(\Omega )$. Moreover, $\underline{\mathcal {G}}_h^{2k}(\varvec{u}_{h},\widehat{\varvec{u}}_h)$ converges to $\nabla \varvec{u}$ weakly in $\varvec{L}^2(\Omega )$. Thus, the first term on the right-hand side of (5.7) converges to $(\varvec{u}\cdot \nabla \varvec{u},\varvec{\chi })$. Now we estimate the second term on the right-hand side of (5.7). Let $K_f$ denote an element having $F$ as a face. Then Hölder’s inequality and (5.1) yield

$$\begin{aligned}&\frac{1}{2}\left\langle |\varvec{u}_h\cdot \textbf{n}|+\varvec{u}_h\cdot \varvec{n},(\varvec{u}_h^t-\widehat{\varvec{u}}_h)\cdot ((\varvec{\Pi _U\chi })^t-\varvec{P_M}\varvec{\chi }^t)\right\rangle _{\partial \mathcal {T}_h\backslash \partial \Omega _D}\nonumber \\&\quad \le C\sum _{F\in \mathcal {F}_h} \Vert \varvec{u}_h\Vert _{L^4(F)}\Vert \varvec{u}_h^t-\widehat{\varvec{u}}_h\Vert _{L^4(F)}\Vert (\varvec{\Pi _U\chi })^t-\varvec{P_M}\varvec{\chi }^t\Vert _{L^2(F)}\nonumber \\&\quad \le C \sum _{F\in \mathcal {F}_h}h_K^{-1/4}\Vert \varvec{u}_h\Vert _{L^4(K_f)}h^{(1-d)/4} \Vert \varvec{u}_h^t-\widehat{\varvec{u}}_h\Vert _{L^2(F)}\Vert (\varvec{\Pi _U\chi })^t-\varvec{P_M}\varvec{\chi }^t\Vert _{L^2(F)}\nonumber \\&\quad \le C h^{(4-d)/4} \Vert \varvec{u}_h\Vert _{L^4(\Omega )}\Vert (\varvec{u}_h,\widehat{\varvec{u}}_h)\Vert _{1,h}\Vert \varvec{\chi }\Vert _{H^1(\Omega )}, \end{aligned}$$

(5.8)

which tends to zero.
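
The power of $h$ in (5.8) can be checked by simple bookkeeping. The sketch below assumes a quasi-uniform mesh ($h_K\simeq h$), that $\Vert \varvec{u}_h^t-\widehat{\varvec{u}}_h\Vert _{L^2(F)}=h^{1/2}\Vert h^{-1/2}(\varvec{u}_h^t-\widehat{\varvec{u}}_h)\Vert _{L^2(F)}$ is controlled through $\Vert (\varvec{u}_h,\widehat{\varvec{u}}_h)\Vert _{1,h}$, and that $\Vert (\varvec{\Pi _U\chi })^t-\varvec{P_M}\varvec{\chi }^t\Vert _{L^2(F)}\le Ch^{1/2}\Vert \varvec{\chi }\Vert _{H^1(K_f)}$; these are our reading of the estimates used and are not restated in the text.

$$\begin{aligned} \underbrace{h^{-\frac{1}{4}}}_{L^4(F)\rightarrow L^4(K_f)}\cdot \underbrace{h^{\frac{1-d}{4}}}_{L^4(F)\rightarrow L^2(F)}\cdot \underbrace{h^{\frac{1}{2}}}_{\text{jump scaling}}\cdot \underbrace{h^{\frac{1}{2}}}_{\text{approximation}}=h^{1-\frac{d}{4}}=h^{\frac{4-d}{4}}. \end{aligned}$$

The exponent equals $1/2$ for $d=2$ and $1/4$ for $d=3$, so the bound vanishes as $h\rightarrow 0$ in both cases, given that $\Vert \varvec{u}_h\Vert _{L^4(\Omega )}$ and $\Vert (\varvec{u}_h,\widehat{\varvec{u}}_h)\Vert _{1,h}$ stay bounded.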

Therefore, it holds

$$\begin{aligned} (\underline{\sigma },\varepsilon (\varvec{\chi }))+(\varvec{u}\cdot \nabla \varvec{u}, \varvec{\chi })=(\varvec{f}_N,\varvec{\chi })_{\mathcal {T}_h}. \end{aligned}$$

By density of $\underline{C}_c^\infty (S,\Omega )\times (\varvec{C}^\infty (\Omega )\cap \varvec{C}(\bar{\Omega }))$ in $\underline{L}^2(S,\Omega )\times \varvec{H}^1(\Omega )$, this shows that $(\underline{\sigma },\varvec{u})$ solves the Navier–Stokes equations (2.5)–(2.6). Since the solution to this problem is unique, the whole sequence $\{(\underline{\sigma }_h,\varvec{u}_h)\}_{h>0}$ converges.

Now we show the strong convergence of $\mathcal {A}\underline{\sigma }_h$. In view of (2.13), we have

$$\begin{aligned}&(2\mu )^{-1}\Vert \mathcal {A}\underline{\sigma }_h\Vert _{L^2(\Omega )}^2+\left\| \tau ^{\frac{1}{2}}(\varvec{P_M}\varvec{u}_h^t-\varvec{\widehat{u}}_h)\right\| _{L^2(\partial \mathcal {T}_h)}^2+N_h(\varvec{u}_h;(\varvec{u}_h,\widehat{\varvec{u}}_h),(\varvec{u}_h, \widehat{\varvec{u}}_h))=(\varvec{f}_N,\varvec{u}_h).\nonumber \\ \end{aligned}$$

(5.9)

According to the definition of $N_h(\cdot ;(\cdot ,\cdot ),(\cdot ,\cdot ))$, we have

$$\begin{aligned} N_h(\varvec{u}_h;(\varvec{u}_h,\widehat{\varvec{u}}_h),(\varvec{u}_h, \widehat{\varvec{u}}_h))&=(\varvec{u}_h\cdot \underline{\mathcal {G}}_h^{2k}(\varvec{u}_h,\widehat{\varvec{u}}_h),\varvec{u}_h)_{\mathcal {T}_h}\\&\quad \;+\frac{1}{2}\left\langle |\varvec{u}_h\cdot \textbf{n}|+\varvec{u}_h\cdot \varvec{n},(\varvec{u}_h^t-\widehat{\varvec{u}}_h)\cdot (\varvec{u}_h^t-\widehat{\varvec{u}}_h)\right\rangle _{\partial \mathcal {T}_h\backslash \partial \Omega _D}. \end{aligned}$$

Proceeding similarly to (5.7), we can show that $(\varvec{u}_h\cdot \underline{\mathcal {G}}_h^{2k}(\varvec{u}_h,\widehat{\varvec{u}}_h),\varvec{u}_h)_{\mathcal {T}_h}\rightarrow (\varvec{u}\cdot \nabla \varvec{u},\varvec{u})$ as $h\rightarrow 0$. We also notice that the second term is non-negative, that is,

$$\begin{aligned} \frac{1}{2}\left\langle |\varvec{u}_h\cdot \textbf{n}|+\varvec{u}_h\cdot \varvec{n},(\varvec{u}_h^t-\widehat{\varvec{u}}_h)\cdot (\varvec{u}_h^t-\widehat{\varvec{u}}_h)\right\rangle _{\partial \mathcal {T}_h\backslash \partial \Omega _D}\ge 0. \end{aligned}$$

Thus, it holds

$$\begin{aligned} \lim \sup (2\mu )^{-1}\Vert \mathcal {A}\underline{\sigma }_h\Vert _{L^2(\Omega )}^2&\le \lim \sup \Big ((\varvec{f}_N,\varvec{u}_h)-(\varvec{u}_h\cdot \underline{\mathcal {G}}_h^{2k}(\varvec{u}_h,\widehat{\varvec{u}}_h),\varvec{u}_h)_{\mathcal {T}_h}\Big )\\&=(\varvec{f}_N,\varvec{u})-(\varvec{u}\cdot \nabla \varvec{u},\varvec{u})=(2\mu )^{-1}\Vert \mathcal {A}\underline{\sigma }\Vert _{L^2(\Omega )}^2. \end{aligned}$$
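
The final equality above uses the limit equations already derived. A sketch, assuming that $\mathcal {A}$ is the deviatoric operator $\mathcal {A}\underline{\eta }=\underline{\eta }-\frac{1}{d}\textrm{tr}(\underline{\eta })\underline{I}$ (an assumption about the definition given earlier in the paper) and that $\varvec{\chi }=\varvec{u}$ is an admissible test function by density: since $\mathcal {A}\underline{\sigma }$ is trace-free, $(\underline{\sigma },\mathcal {A}\underline{\sigma })=\Vert \mathcal {A}\underline{\sigma }\Vert _{L^2(\Omega )}^2$, and the constitutive relation $\varepsilon (\varvec{u})=(2\mu )^{-1}\mathcal {A}\underline{\sigma }$ gives

$$\begin{aligned} (\varvec{f}_N,\varvec{u})-(\varvec{u}\cdot \nabla \varvec{u},\varvec{u})=(\underline{\sigma },\varepsilon (\varvec{u}))=(2\mu )^{-1}(\underline{\sigma },\mathcal {A}\underline{\sigma })=(2\mu )^{-1}\Vert \mathcal {A}\underline{\sigma }\Vert _{L^2(\Omega )}^2. \end{aligned}$$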

On the other hand, owing to the weak convergence of $\mathcal {A}\underline{\sigma }_h$, we have

$$\begin{aligned} (2\mu )^{-1}\Vert \mathcal {A}\underline{\sigma }\Vert _{L^2(\Omega )}^2\le \lim \inf (2\mu )^{-1}\Vert \mathcal {A}\underline{\sigma }_h\Vert _{L^2(\Omega )}^2. \end{aligned}$$

Thereby, $ (2\mu )^{-1}\Vert \mathcal {A}\underline{\sigma }_h\Vert _{L^2(\Omega )}^2\rightarrow (2\mu )^{-1}\Vert \mathcal {A}\underline{\sigma }\Vert _{L^2(\Omega )}^2$, which yields the strong convergence of $\mathcal {A}\underline{\sigma }_h$ in $\underline{L}^2(S,\Omega )$.
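
The passage from norm convergence to strong convergence is the standard Hilbert-space argument. Spelled out for the present quantities, and using only the weak convergence $\mathcal {A}\underline{\sigma }_h\rightharpoonup \mathcal {A}\underline{\sigma }$ together with the convergence of the norms:

$$\begin{aligned} \Vert \mathcal {A}\underline{\sigma }_h-\mathcal {A}\underline{\sigma }\Vert _{L^2(\Omega )}^2&=\Vert \mathcal {A}\underline{\sigma }_h\Vert _{L^2(\Omega )}^2-2(\mathcal {A}\underline{\sigma }_h,\mathcal {A}\underline{\sigma })+\Vert \mathcal {A}\underline{\sigma }\Vert _{L^2(\Omega )}^2\\&\rightarrow \Vert \mathcal {A}\underline{\sigma }\Vert _{L^2(\Omega )}^2-2\Vert \mathcal {A}\underline{\sigma }\Vert _{L^2(\Omega )}^2+\Vert \mathcal {A}\underline{\sigma }\Vert _{L^2(\Omega )}^2=0\quad \text{ as }\;h\rightarrow 0. \end{aligned}$$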

In view of (5.9), we have

$$\begin{aligned} \begin{aligned}&\left\| \tau ^{\frac{1}{2}}(\varvec{P_M}\varvec{u}_h^t-\varvec{\widehat{u}}_h)\right\| _{L^2(\partial \mathcal {T}_h)}^2\\ {}&\qquad + \frac{1}{2}\left\langle |\varvec{u}_h\cdot \textbf{n}|+\varvec{u}_h\cdot \varvec{n},(\varvec{u}_h^t-\widehat{\varvec{u}}_h)\cdot (\varvec{u}_h^t-\widehat{\varvec{u}}_h)\right\rangle _{\partial \mathcal {T}_h\backslash \partial \Omega _D}\\&\quad =(\varvec{f}_N,\varvec{u}_h)-(2\mu )^{-1}\Vert \mathcal {A}\underline{\sigma }_h\Vert _{L^2(\Omega )}^2-(\varvec{u}_h\cdot \underline{\mathcal {G}}_h^{2k}(\varvec{u}_h,\widehat{\varvec{u}}_h),\varvec{u}_h)_{\mathcal {T}_h}, \end{aligned} \end{aligned}$$

(5.10)

where $(\varvec{f}_N,\varvec{u}_h)\rightarrow (\varvec{f}_N,\varvec{u})$, $(2\mu )^{-1}\Vert \mathcal {A}\underline{\sigma }_h\Vert _{L^2(\Omega )}^2\rightarrow (2\mu )^{-1}\Vert \mathcal {A}\underline{\sigma }\Vert _{L^2(\Omega )}^2$ and $(\varvec{u}_h\cdot \underline{\mathcal {G}}_h^{2k}(\varvec{u}_h,\widehat{\varvec{u}}_h),\varvec{u}_h)_{\mathcal {T}_h}\rightarrow (\varvec{u}\cdot \nabla \varvec{u},\varvec{u})$ as $h\rightarrow 0$. Thus the right-hand side of (5.10) tends to zero. Since both terms on the left-hand side of (5.10) are non-negative, $ \Vert \tau ^{\frac{1}{2}}(\varvec{P_M}\varvec{u}_h^t-\varvec{\widehat{u}}_h)\Vert _{L^2(\partial \mathcal {T}_h)}\rightarrow 0$ as $h\rightarrow 0$.

Lemma 5.7 and the strong convergence of $\mathcal {A}\underline{\sigma }_h$ imply

$$\begin{aligned}&\Vert \varepsilon _h(\varvec{u}-\varvec{u}_h)\Vert _{L^2(\mathcal {T}_h)}\le (2\mu )^{-1}\Vert \mathcal {A}(\underline{\sigma }-\underline{\sigma }_h)\Vert _{L^2(\mathcal {T}_h)}\\&\quad + \Vert h^{-1/2}(\varvec{P_M}\varvec{u}_h^t-\widehat{\varvec{u}}_h)\Vert _{L^2(\partial \mathcal {T}_h)}\rightarrow 0 \quad \text{ as }\;h\rightarrow 0. \end{aligned}$$

For $F\in \mathcal {F}_h\backslash \partial \Omega _N$ and $\partial K_1\cap \partial K_2=F$, we have from (4.7) and Lemma 4.1

$$\begin{aligned}&\int _F |\llbracket \varvec{u}_h-\varvec{P_M}\varvec{u}_h\rrbracket |^2\;ds\\&\quad = \int _F \big |\llbracket \varvec{u}_h-\varvec{\Pi _h^c}\varvec{u}-\varvec{P_M}(\varvec{u}_h-\varvec{\Pi _h^c}\varvec{u})\rrbracket \big |^2\;ds\\&\quad =\int _F \big |\llbracket \varvec{u}_h-\varvec{\Pi _h^c}\varvec{u}+\underline{B}_K\varvec{x}+\varvec{b}_K-\varvec{P_M}(\varvec{u}_h-\varvec{\Pi _h^c}\varvec{u}+\underline{B}_K\varvec{x}+\varvec{b}_K)\rrbracket \big |^2\;ds\\&\quad \le \frac{1}{2}\sum _{i=1}^2\int _{F\cap \partial K_i}|\varvec{u}_h-\varvec{\Pi _h^c}\varvec{u}+\underline{B}_K\varvec{x}+\varvec{b}_K-\varvec{P_M}(\varvec{u}_h-\varvec{\Pi _h^c}\varvec{u}+\underline{B}_K\varvec{x}+\varvec{b}_K)|^2\;ds\\&\quad \le C\sum _{i=1}^2h_{K_i}\Vert \nabla (\varvec{u}_h-\varvec{\Pi _h^c}\varvec{u}+\underline{B}_K\varvec{x}+\varvec{b}_K)\Vert _{L^2( K_i)}^2\\&\quad \le C\sum _{i=1}^2 h_{K_i}\Vert \varepsilon (\varvec{u}_h-\varvec{\Pi _h^c}\varvec{u})\Vert _{L^2( K_i)}^2. \end{aligned}$$

Thus, summing over all the faces gives

$$\begin{aligned} \Bigg (\sum _{F\in \mathcal {F}_h\backslash \partial \Omega _N} h_F^{-1}\Vert \llbracket \varvec{u}_h-\varvec{P_M}\varvec{u}_h\rrbracket \Vert _{L^2(F)}^2\Bigg )^{\frac{1}{2}}\le C \Vert \varepsilon (\varvec{u}_h-\varvec{\Pi _h^c}\varvec{u})\Vert _{L^2( \mathcal {T}_h)}. \end{aligned}$$

The triangle inequality yields

$$\begin{aligned} \Vert \varepsilon (\varvec{u}_h-\varvec{\Pi _h^c}\varvec{u})\Vert _{L^2( \mathcal {T}_h)}&\le \Vert \varepsilon _h (\varvec{u}_h-\varvec{u})\Vert _{L^2(\mathcal {T}_h)}\\ {}&\quad +\Vert \varepsilon (\varvec{u}-\varvec{\Pi _h^c}\varvec{u})\Vert _{L^2(\mathcal {T}_h)}\rightarrow 0 \quad \text{ as }\;h\rightarrow 0 \end{aligned}$$

and

$$\begin{aligned} \begin{aligned} \Bigg (\sum _{F\in \mathcal {F}_h\backslash \partial \Omega _N} h_F^{-1}\Vert \llbracket \varvec{u}_h\rrbracket \Vert _{L^2(F)}^2\Bigg )^{\frac{1}{2}}&\le \Bigg (\sum _{F\in \mathcal {F}_h\backslash \partial \Omega _N} h_F^{-1}\Vert \llbracket \varvec{u}_h-\varvec{P_M}\varvec{u}_h\rrbracket \Vert _{L^2(F)}^2\Bigg )^{\frac{1}{2}}\\&\quad + \Big (\Vert h^{-1/2}(\varvec{P_M}\varvec{u}_h^t-\widehat{\varvec{u}}_h)\Vert _{L^2(\partial \mathcal {T}_h)}^2\Big )^{\frac{1}{2}}\rightarrow 0 \;\text{ as }\;h\rightarrow 0. \end{aligned} \end{aligned}$$

(5.11)

Then, proceeding similarly to (4.11) yields

$$\begin{aligned} \Vert \nabla _h (\varvec{u}_h-\varvec{u})\Vert _{L^2(\mathcal {T}_h)}&\le C \Big (\Vert \varepsilon _h(\varvec{u}_h-\varvec{u})\Vert _{L^2(\mathcal {T}_h)}\nonumber \\&\quad + \Big (\sum _{F\in \mathcal {F}_h\backslash \partial \Omega _N} h_F^{-1}\Vert \llbracket \varvec{u}_h\rrbracket \Vert _{L^2(F)}^2\Big )^{\frac{1}{2}}\Big )\nonumber \\&\quad \rightarrow 0\quad \text{ as }\;h\rightarrow 0. \end{aligned}$$

(5.12)

Now we can infer from (5.11) and (5.12) that $\Vert \nabla _h(\varvec{u}-\varvec{u}_h)\Vert _{L^2(\mathcal {T}_h)}\rightarrow 0$ and $\Big (\sum _{F\in \mathcal {F}_h\backslash \partial \Omega _N} h_F^{-1}\Vert \llbracket \varvec{u}_h\rrbracket \Vert _{L^2(F)}^2\Big )^{\frac{1}{2}}\rightarrow 0$ as $h\rightarrow 0$. As such, $\Vert \varvec{u}-\varvec{u}_h\Vert _h\rightarrow 0$ as $h\rightarrow 0$.

Finally, we show the strong convergence of $\underline{\sigma }_h$. Let $\varvec{v}^*\in \varvec{H}^1(\Omega )$ be such that $\nabla \cdot \varvec{v}^*=\textrm{tr}(\underline{\sigma }_h) $ and $\Vert \varvec{v}^*\Vert _{H^1(\Omega )}\le C \Vert \textrm{tr}(\underline{\sigma }_h)\Vert _{L^2(\Omega )}$ and set $\varvec{v}_h=\varvec{\Pi _U}\varvec{v}^*$. Then

$$\begin{aligned} \Vert \textrm{tr}(\underline{\sigma }_h)\Vert _{L^2(\Omega )}^2=(\textrm{tr}(\underline{\sigma }_h), \nabla \cdot \varvec{v}^*)&=(\textrm{tr}(\underline{\sigma }_h), \nabla \cdot \varvec{v}_h)_{\mathcal {T}_h}\\&=d\Big ((\underline{\sigma }_h, \varepsilon (\varvec{v}_h))_{\mathcal {T}_h}-(\mathcal {A}\underline{\sigma }_h, \varepsilon (\varvec{v}_h))_{\mathcal {T}_h}\Big ). \end{aligned}$$
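The last equality above is a pointwise algebraic identity. The following minimal Python sketch checks it on random symmetric matrices, assuming that $\mathcal {A}$ is the deviatoric operator $\mathcal {A}\underline{\tau }=\underline{\tau }-\frac{1}{d}\textrm{tr}(\underline{\tau })\underline{I}$ and using that the trace of $\varepsilon (\varvec{v})$ equals $\nabla \cdot \varvec{v}$. All helper names are hypothetical and introduced only for this illustration.

```python
import random


def trace(m):
    """Trace of a square matrix stored as a list of rows."""
    return sum(m[i][i] for i in range(len(m)))


def deviator(m):
    """Deviatoric part m - (tr(m)/d) I (assumed definition of the operator A)."""
    d = len(m)
    t = trace(m) / d
    return [[m[i][j] - (t if i == j else 0.0) for j in range(d)] for i in range(d)]


def frobenius(a, b):
    """Frobenius inner product a : b."""
    d = len(a)
    return sum(a[i][j] * b[i][j] for i in range(d) for j in range(d))


def random_symmetric(d, rng):
    """Random symmetric d x d matrix with entries in [-1, 1]."""
    m = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i, d):
            m[i][j] = m[j][i] = rng.uniform(-1.0, 1.0)
    return m


def identity_gap(d, seed=0):
    """Return |tr(sigma) tr(eps) - d (sigma:eps - A sigma:eps)| for random data."""
    rng = random.Random(seed)
    sigma = random_symmetric(d, rng)  # stands in for the discrete stress
    eps = random_symmetric(d, rng)    # stands in for the symmetric gradient of v_h
    lhs = trace(sigma) * trace(eps)
    rhs = d * (frobenius(sigma, eps) - frobenius(deviator(sigma), eps))
    return abs(lhs - rhs)
```

Calling `identity_gap(2)` or `identity_gap(3)` should return a value at the level of rounding error, since $(\underline{\tau }-\mathcal {A}\underline{\tau }):\underline{E}=\frac{1}{d}\textrm{tr}(\underline{\tau })\,\textrm{tr}(\underline{E})$ for any matrices.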

We let $\widehat{\varvec{v}}_h=\varvec{P_M}(\varvec{v}^*)^t$; then $(\varvec{v}_h,\widehat{\varvec{v}}_h)$ is bounded in the $\Vert (\cdot ,\cdot )\Vert _{1,h}$-norm. There is $\varvec{v}\in \varvec{H}^1(\Omega )$ such that, up to a subsequence, $\varvec{v}_h\rightarrow \varvec{v}$ strongly in $\varvec{L}^2(\Omega )$ and $\underline{G}_h(\varvec{v}_h,\widehat{\varvec{v}}_h)\rightharpoonup \varepsilon (\varvec{v})$ weakly in $\underline{L}^2(S,\Omega )$. Moreover, $\nabla \cdot \varvec{v}=\textrm{tr}(\underline{\sigma })$ in the distributional sense. It follows from (2.13) that

$$\begin{aligned} (\underline{\sigma }_h, \varepsilon (\varvec{v}_h))_{L^2(\mathcal {T}_h)}&=\langle (\underline{\sigma }_h\varvec{n})^t, \varvec{v}_h-\widehat{\varvec{v}}_h\rangle _{\partial \mathcal {T}_h}-\langle \tau (\varvec{P_M}\varvec{u}_h^t-\widehat{\varvec{u}}_h), \varvec{v}_h^t-\widehat{\varvec{v}}_h \rangle _{\partial \mathcal {T}_h}\\&\quad \;-N_h(\varvec{u}_h;(\varvec{u}_h,\widehat{\varvec{u}}_h),(\varvec{v}_h,\widehat{\varvec{v}}_h))+(\varvec{f}_N,\varvec{v}_h). \end{aligned}$$

The definition of $\underline{G}_h(\varvec{v}_h,\widehat{\varvec{v}}_h)$ gives

$$\begin{aligned} (\mathcal {A}\underline{\sigma }_h, \varepsilon (\varvec{v}_h))_{\mathcal {T}_h}=(\mathcal {A}\underline{\sigma }_h, \underline{G}_h(\varvec{v}_h,\widehat{\varvec{v}}_h))_{\mathcal {T}_h}+(\mathcal {A}\underline{\sigma }_h, \underline{R}(\varvec{P_M}\varvec{v}_h^t-\widehat{\varvec{v}}_h))_{\mathcal {T}_h}. \end{aligned}$$

Therefore, it holds

$$\begin{aligned} \begin{aligned}&(\underline{\sigma }_h, \varepsilon (\varvec{v}_h))_{\mathcal {T}_h}-(\mathcal {A}\underline{\sigma }_h, \varepsilon (\varvec{v}_h))_{\mathcal {T}_h}\\&\quad =-(\mathcal {A}\underline{\sigma }_h, \underline{G}_h(\varvec{v}_h,\widehat{\varvec{v}}_h))-\langle \tau (\varvec{P_M}\varvec{u}_h^t-\widehat{\varvec{u}}_h), \varvec{v}_h^t-\widehat{\varvec{v}}_h \rangle _{\partial \mathcal {T}_h}\\&\qquad +(\varvec{f}_N,\varvec{v}_h)-N_h(\varvec{u}_h;(\varvec{u}_h,\widehat{\varvec{u}}_h),(\varvec{v}_h,\widehat{\varvec{v}}_h)), \end{aligned} \end{aligned}$$

(5.13)

where the first term on the right-hand side converges to $ -(\mathcal {A}\underline{\sigma }, \varepsilon (\varvec{v}))$ owing to the strong convergence of $\mathcal {A}\underline{\sigma }_h$ and the weak convergence of $\underline{G}_h(\varvec{v}_h,\widehat{\varvec{v}}_h)$. The third term on the right-hand side converges to $(\varvec{f}_N,\varvec{v})$. Moreover, the Cauchy–Schwarz inequality yields

$$\begin{aligned}&\langle \tau (\varvec{P_M}\varvec{u}_h^t-\widehat{\varvec{u}}_h), \varvec{v}_h^t-\widehat{\varvec{v}}_h \rangle _{\partial \mathcal {T}_h}\\&\quad \le \Vert \tau ^{\frac{1}{2}}(\varvec{P_M}\varvec{u}_h^t-\widehat{\varvec{u}}_h)\Vert _{L^2(\partial \mathcal {T}_h)}\Vert \tau ^{\frac{1}{2}}(\varvec{v}_h^t-\widehat{\varvec{v}}_h)\Vert _{L^2(\partial \mathcal {T}_h)}, \end{aligned}$$

which tends to zero as $h\rightarrow 0$.

Furthermore, (5.5) yields

$$\begin{aligned} N_h(\varvec{u}_h;(\varvec{u}_h,\widehat{\varvec{u}}_h),(\varvec{v}_h,\widehat{\varvec{v}}_h))&=(\varvec{u}_h\cdot \underline{\mathcal {G}}_h^{2k}(\varvec{u}_h,\widehat{\varvec{u}}_h), \varvec{v}_h)_{\mathcal {T}_h}\\&\quad +\frac{1}{2}\left\langle \varvec{u}_h\cdot \varvec{n}+|\varvec{u}_h\cdot \varvec{n}|,(\varvec{u}_h^t-\widehat{\varvec{u}}_h)\cdot (\varvec{v}_h^t-\widehat{\varvec{v}}_h)\right\rangle _{\partial \mathcal {T}_h\backslash \partial \Omega _D}, \end{aligned}$$

where the first term converges to $(\varvec{u}\cdot \nabla \varvec{u} ,\varvec{v})$ that can be proved analogously to (5.7).

Similarly to (5.8), it holds

$$\begin{aligned}&\frac{1}{2}\left\langle \varvec{u}_h\cdot \varvec{n}+|\varvec{u}_h\cdot \varvec{n}|,(\varvec{u}_h^t-\widehat{\varvec{u}}_h)\cdot (\varvec{v}_h^t-\widehat{\varvec{v}}_h)\right\rangle _{\partial \mathcal {T}_h\backslash \partial \Omega _D}\\&\quad \le C h^{(4-d)/4} \Vert \varvec{u}_h\Vert _{L^4(\Omega )}\Vert (\varvec{u}_h,\widehat{\varvec{u}}_h)\Vert _{1,h}\Vert (\varvec{v}_h,\widehat{\varvec{v}}_h)\Vert _{1,h}. \end{aligned}$$

An appeal to Lemma 5.2 yields that $\Vert \varvec{u}_h\Vert _{L^4(\Omega )}$ is bounded, which in conjunction with the boundedness of $\Vert (\varvec{u}_h,\widehat{\varvec{u}}_h)\Vert _{1,h}$ and $\Vert (\varvec{v}_h,\widehat{\varvec{v}}_h)\Vert _{1,h}$ implies that the right-hand side tends to zero. Thus, the last term on the right-hand side of (5.13) converges to $(\varvec{u}\cdot \nabla \varvec{u}, \varvec{v})$.

As a result, we have

$$\begin{aligned} \lim \Vert \textrm{tr}(\underline{\sigma }_h)\Vert _{L^2(\Omega )}^2=\Vert \textrm{tr}(\underline{\sigma })\Vert _{L^2(\Omega )}^2. \end{aligned}$$

Then, the triangle inequality gives

$$\begin{aligned} \Vert \underline{\sigma }-\underline{\sigma }_h\Vert _{L^2(\Omega )}\le \Vert \mathcal {A}(\underline{\sigma }-\underline{\sigma }_h)\Vert _{L^2(\Omega )}+\Vert \textrm{tr}(\underline{\sigma }-\underline{\sigma }_h)\Vert _{L^2(\Omega )}\rightarrow 0 \quad \text{ as }\;h\rightarrow 0, \end{aligned}$$

which yields the strong convergence of $\underline{\sigma }_h$. $\square $

Remark 5.1

Proceeding as for the Stokes equations and following the error analysis presented in [10] for the Navier–Stokes equations, we can also derive error estimates under the assumption that the weak solutions are smooth; indeed, optimal error estimates can be achieved for all the variables. The details are omitted here for brevity.

6 Numerical experiments

In this section, several two-dimensional numerical experiments will be carried out to test the capabilities of the proposed method as well as the proposed error estimator for the Stokes equations. Two examples with smooth solutions will be employed to test the convergence rates of the proposed method. In particular, the robustness of the method with respect to the values of the viscosity and the pressure robustness of the method will be confirmed.

6.1 Smooth solution example

In this example, we consider $\Omega =(0,1)^2$ and the exact solution is defined by

$$\begin{aligned}\varvec{u}&= {\left\{ \begin{array}{ll} x^2\pi \sin (2y\pi )(x - 1)^2+1,\\ -2x\sin (y\pi )^2(2x - 1)(x - 1) +1 \end{array}\right. }\quad \\ {}&\quad \text{ and }\quad p=(\cos (1)-1)\sin (1) + \cos (y)\sin (x). \end{aligned}$$
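Two properties of this manufactured solution can be checked independently of any discretization: the velocity is divergence-free and the pressure has zero mean on $(0,1)^2$. The Python sketch below does this numerically with central differences and a midpoint rule. It reads $\sin (y\pi )^2$ as $\sin ^2(\pi y)$, and the function names, step size and grid sizes are choices made for this illustration only.

```python
import math


def velocity(x, y):
    """Exact velocity (u1, u2) of the smooth test case."""
    u1 = math.pi * x**2 * (x - 1.0) ** 2 * math.sin(2.0 * math.pi * y) + 1.0
    u2 = -2.0 * x * (2.0 * x - 1.0) * (x - 1.0) * math.sin(math.pi * y) ** 2 + 1.0
    return u1, u2


def pressure(x, y):
    """Exact pressure of the smooth test case."""
    return (math.cos(1.0) - 1.0) * math.sin(1.0) + math.cos(y) * math.sin(x)


def divergence(x, y, h=1e-5):
    """Central-difference approximation of div u at (x, y)."""
    du1_dx = (velocity(x + h, y)[0] - velocity(x - h, y)[0]) / (2.0 * h)
    du2_dy = (velocity(x, y + h)[1] - velocity(x, y - h)[1]) / (2.0 * h)
    return du1_dx + du2_dy


def max_divergence(n=10):
    """Largest |div u| over an n x n grid of interior sample points."""
    return max(
        abs(divergence((i + 0.5) / n, (j + 0.5) / n))
        for i in range(n)
        for j in range(n)
    )


def mean_pressure(n=200):
    """Midpoint-rule approximation of the mean of p over the unit square."""
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += pressure((i + 0.5) / n, (j + 0.5) / n)
    return total / n**2
```

Both `max_divergence()` and `mean_pressure()` should be close to zero up to discretization error of the check itself. The constant in $p$ is exactly what cancels $\int _\Omega \cos (y)\sin (x)\,dx\,dy=\sin (1)(1-\cos (1))$.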

The convergence history against the number of degrees of freedom for $\mu =1$, $\mu =10^{-4}$ and $\mu =10^{-8}$ with polynomial orders $k=1,2,3,4$ is displayed in Figs. 1, 2 and 3, respectively. We observe optimal convergence rates that match the theoretical results. In addition, the convergence rates for all the variables remain optimal regardless of the value of $\mu $, which verifies the robustness of the scheme with respect to $\mu $. We also observe that the magnitude of the velocity error is only slightly influenced by the value of $\mu $.
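When errors are reported against the number of degrees of freedom rather than the mesh size, an observed rate in $h$ can be recovered under the assumption that the number of degrees of freedom $N$ scales like $h^{-d}$ on quasi-uniform meshes. The Python sketch below illustrates that conversion; the function name is hypothetical and the data in the usage lines are synthetic, not results from this paper.

```python
import math


def rates_from_dofs(dofs, errors, dim=2):
    """Observed convergence orders in h from (dofs, error) pairs.

    Assumes dofs ~ h^(-dim), so that error ~ h^r reads error ~ dofs^(-r/dim).
    Returns one rate per pair of consecutive refinement levels.
    """
    rates = []
    for level in range(len(dofs) - 1):
        error_ratio = errors[level] / errors[level + 1]
        dof_ratio = dofs[level + 1] / dofs[level]
        rates.append(dim * math.log(error_ratio) / math.log(dof_ratio))
    return rates


# Synthetic example: errors decaying like N^(-1), i.e. second order in h for dim = 2.
synthetic_dofs = [100 * 4**level for level in range(5)]
synthetic_errors = [0.5 / n for n in synthetic_dofs]
synthetic_rates = rates_from_dofs(synthetic_dofs, synthetic_errors)
```

On real data the scaling $N\sim h^{-d}$ holds only asymptotically, so rates computed on the coarsest levels should be read with care.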

6.2 No-flow example

In this example, we use the unit square domain, i.e., $\Omega =(0,1)^2$, and the exact solution is defined by

$$\begin{aligned} \varvec{u}= {\left\{ \begin{array}{ll} 0\\ 0 \end{array}\right. },\quad p=-\frac{\text {Ra}}{2}y^2+\text {Ra} y-\frac{\text {Ra}}{3}, \end{aligned}$$

where $\text {Ra}=10^3$. The convergence history against the number of degrees of freedom with $k=1$ is reported in the left panel of Fig. 4. We observe that the $L^2$-errors of stress and velocity approach zero; in addition, $\Vert \varepsilon (\varvec{u}-\varvec{u}_h)\Vert _{L^2(\mathcal {T}_h)}$ also approaches zero. The convergence rate for $\Vert \textrm{tr}(\underline{\sigma }-\underline{\sigma }_h)\Vert _{L^2(\Omega )}$ is optimal, as predicted by the theory. In addition, the solution profile of $\textrm{tr}(\underline{\sigma }_h)$ is also correct. This example once again highlights that the proposed scheme is pressure-robust.
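As a quick consistency check on this test case (a worked computation added for illustration, not part of the reported results), the prescribed pressure has zero mean over the unit square, and its gradient is explicit:

$$\begin{aligned} \int _0^1\Big (-\frac{\text {Ra}}{2}y^2+\text {Ra}\,y-\frac{\text {Ra}}{3}\Big )\,dy=\text {Ra}\Big (-\frac{1}{6}+\frac{1}{2}-\frac{1}{3}\Big )=0,\qquad \nabla p=\begin{pmatrix} 0\\ \text {Ra}(1-y) \end{pmatrix}. \end{aligned}$$

Assuming the momentum balance is written as $-\nabla \cdot \underline{\sigma }=\varvec{f}$ with $\underline{\sigma }=2\mu \varepsilon (\varvec{u})-p\underline{I}$ (the sign convention and the symbol $\underline{I}$ for the identity tensor are assumptions of this sketch), the choice $\varvec{u}=\varvec{0}$ gives $\varvec{f}=\nabla p$ and $\textrm{tr}(\underline{\sigma })=-2p$. The forcing is therefore a pure gradient whose size is proportional to $\text {Ra}$, which is the situation in which a pressure-robust scheme is expected to keep the velocity error independent of $\text {Ra}$.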

References

1. Arnold, D.N.: An interior penalty finite element method with discontinuous elements. SIAM J. Numer. Anal. 19, 742–760 (1982)
2. Babuška, I., Aziz, A.K.: Survey lectures on the mathematical foundation of the finite element method. In: The Mathematical Foundations of the Finite Element Method with Applications to Partial Differential Equations, pp. 1–359. Academic Press (1972)
3. Behr, M.A., Franca, L.P., Tezduyar, T.E.: Stabilized finite element methods for the velocity-pressure-stress formulation of incompressible flows. Comput. Methods Appl. Mech. Eng. 104, 31–48 (1993)
4. Bochev, P.B., Gunzburger, M.D.: Least-squares methods for the velocity-pressure-stress formulation of the Stokes equations. Comput. Methods Appl. Mech. Eng. 126, 267–287 (1995)
5. Boffi, D., Brezzi, F., Fortin, M.: Mixed Finite Element Methods and Applications. Springer, Berlin (2013)
6. Brenner, S.C.: Poincaré-Friedrichs inequalities for piecewise ${H}^1$ functions. SIAM J. Numer. Anal. 41, 306–324 (2003)
7. Cai, Z., Lee, B., Wang, P.: Least-squares methods for incompressible Newtonian fluid flow: linear stationary problems. SIAM J. Numer. Anal. 42, 843–859 (2004)
8. Cai, Z., Wang, C., Zhang, S.: Mixed finite element methods for incompressible flow: stationary Navier–Stokes equations. SIAM J. Numer. Anal. 48, 79–94 (2010)
9. Cai, Z., Zhang, S.: Mixed methods for stationary Navier–Stokes equations based on pseudostress-pressure-velocity formulation. Math. Comput. 81, 1903–1927 (2012)
10. Cesmelioglu, A., Cockburn, B., Qiu, W.: Analysis of a hybridizable discontinuous Galerkin method for the steady-state incompressible Navier–Stokes equations. Math. Comput. 86, 1643–1670 (2017)
11. Cockburn, B., Kanschat, G., Schötzau, D.: A note on discontinuous Galerkin divergence-free solutions of the Navier–Stokes equations. J. Sci. Comput. 31, 61–73 (2007)
12. Cockburn, B., Sayas, F.-J.: Divergence-conforming HDG methods for Stokes flows. Math. Comput. 83, 1571–1598 (2014)
13. Cáceres, E., Gatica, G.N.: A mixed virtual element method for the pseudostress-velocity formulation of the Stokes problem. IMA J. Numer. Anal. 37, 296–331 (2016)
14. da Veiga, L.B., Dassi, F., Di Pietro, D.A., Droniou, J.: Arbitrary-order pressure-robust DDR and VEM methods for the Stokes problem on polyhedral meshes. Comput. Methods Appl. Mech. Eng. 397, 115061 (2022)
15. Di Pietro, D.A., Ern, A.: Mathematical Aspects of Discontinuous Galerkin Methods. Springer, Berlin (2011)
16. Di Pietro, D.A., Ern, A.: Discrete functional analysis tools for discontinuous Galerkin methods with application to the incompressible Navier–Stokes equations. Math. Comput. 79, 1303–1330 (2010)
17. Fabricius, J.: Stokes flow with kinematic and dynamic boundary conditions. Quart. Appl. Math. 77, 525–544 (2019)
18. Figueroa, L.E., Gatica, G.N., Márquez, A.: Augmented mixed finite element methods for the stationary Stokes equations. SIAM J. Sci. Comput. 31, 1082–1119 (2009)
19. Fu, G., Jin, Y., Qiu, W.: Parameter-free superconvergent $H$(div)-conforming HDG methods for the Brinkman equations. IMA J. Numer. Anal. 39, 957–982 (2018)
20. Gatica, G.N., Márquez, A., Sánchez, M.A.: Analysis of a velocity-pressure-pseudostress formulation for the stationary Stokes equations. Comput. Methods Appl. Mech. Eng. 199, 1064–1079 (2010)
21. Gopalakrishnan, J., Lederer, P.L., Schöberl, J.: A mass conserving mixed stress formulation for Stokes flow with weakly imposed stress symmetry. SIAM J. Numer. Anal. 58, 706–732 (2020)
22. Gopalakrishnan, J., Lederer, P.L., Schöberl, J.: A mass conserving mixed stress formulation for the Stokes equations. IMA J. Numer. Anal. 40, 1838–1874 (2019)
23. John, V., Linke, A., Merdon, C., Neilan, M., Rebholz, L.G.: On the divergence constraint in mixed finite element methods for incompressible flows. SIAM Rev. 59, 492–544 (2017)
24. Karakashian, O.A., Pascal, F.: A posteriori error estimates for a discontinuous Galerkin approximation of second-order elliptic problems. SIAM J. Numer. Anal. 41(6), 2374–2399 (2003)
25. Kim, D., Zhao, L., Park, E.-J.: Staggered DG methods for the pseudostress-velocity formulation of the Stokes equations on general meshes. SIAM J. Sci. Comput. 42, A2537–A2560 (2020)
26. Lederer, P.L., Linke, A., Merdon, C., Schöberl, J.: Divergence-free reconstruction operators for pressure-robust Stokes discretizations with continuous pressure finite elements. SIAM J. Numer. Anal. 55, 1291–1314 (2017)
27. Lehrenfeld, C., Schöberl, J.: High order exactly divergence-free hybrid discontinuous Galerkin methods for unsteady incompressible flows. Comput. Methods Appl. Mech. Eng. 307, 339–361 (2016)
28. Linke, A., Merdon, C.: Pressure-robustness and discrete Helmholtz projectors in mixed finite element methods for the incompressible Navier–Stokes equations. Comput. Methods Appl. Mech. Eng. 311, 304–326 (2016)
29. Nguyen, N.C., Peraire, J., Cockburn, B.: A hybridizable discontinuous Galerkin method for Stokes flow. Comput. Methods Appl. Mech. Eng. 199, 582–597 (2010)
30. Neilan, M., Linke, A., Merdon, C., Neumann, F.: Quasi-optimality of a pressure-robust nonconforming finite element method for the Stokes-problem. Math. Comput. 87, 1543–1566 (2018)
31. Qiu, W., Shen, J., Shi, K.: An HDG method for linear elasticity with strong symmetric stresses. Math. Comput. 87, 69–93 (2018)
32. Rozložník, M.: Saddle-Point Problems and Their Iterative Solution. Birkhäuser, Cham (2018)
33. Raviart, P.-A., Girault, V.: Finite Element Methods for Navier–Stokes Equations. Springer, Berlin (1986)
34. Verfürth, R.: A posteriori error estimators for the Stokes equations. Numer. Math. 55, 309–325 (1989)
35. Wang, J., Ye, X.: New finite element methods in computational fluid dynamics by H(div) elements. SIAM J. Numer. Anal. 45, 1269–1286 (2007)
36. Zhao, L.: Analysis of a mixed DG method for stress-velocity formulation of the Stokes equations. J. Sci. Comput. 92, 44 (2022)

Funding

Open access publishing enabled by City University of Hong Kong Library’s agreement with Springer Nature

Author information

Authors and Affiliations

Department of Mathematics, City University of Hong Kong, Kowloon Tong, Hong Kong SAR, China
Weifeng Qiu & Lina Zhao

Corresponding author

Correspondence to Lina Zhao.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

W. Qiu’s research is partially supported by the Research Grants Council of the Hong Kong Special Administrative Region, China. [Project No. CityU 11302219, CityU 11300621].

L. Zhao’s research is partially supported by the Research Grants Council of the Hong Kong Special Administrative Region, China. [Project No. CityU 21309522].

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Qiu, W., Zhao, L. $H(\textrm{div})$-conforming HDG methods for the stress-velocity formulation of the Stokes equations and the Navier–Stokes equations. Numer. Math. 156, 1639–1678 (2024). https://doi.org/10.1007/s00211-024-01419-6

Received: 02 May 2023
Revised: 24 February 2024
Accepted: 24 May 2024
Published: 17 June 2024
Issue Date: August 2024
DOI: https://doi.org/10.1007/s00211-024-01419-6
