1 Introduction

The development of pressure-robust numerical schemes for incompressible flow problems has emerged as an active research topic in recent years. In the continuous setting, when the source term of the Stokes and Navier–Stokes equations is changed by a gradient field, only the pressure solution is affected, while the velocity solution remains unchanged. The key objective of pressure-robust numerical schemes is to maintain this invariance, so that the obtained numerical velocity solution is exactly divergence-free and its error estimation is independent of the regularity of the pressure solution of the incompressible flow problems. We refer to [27] for a comprehensive review of the pressure-robust finite element methods.

Among the developed pressure-robust numerical schemes, divergence-conforming hybridizable discontinuous Galerkin (H(div)-HDG) methods are an efficient variant of the divergence-conforming discontinuous Galerkin (H(div)-DG) methods. By leveraging the hybridization technique, H(div)-HDG schemes reduce coupling between degrees of freedom (DOFs) when compared to H(div)-DG. Additionally, the H(div)-HDG schemes can be statically condensed into a global system where only DOFs on the mesh skeleton remain, resulting in a much smaller matrix to be solved. At the same time, the H(div)-HDG schemes retain attractive features of H(div)-DG schemes, such as hp-adaptivity, stable upwind discretization of the convection term, and the ability to handle unstructured meshes with hanging nodes. The grad-velocity-pressure formulation for the H(div)-HDG was first introduced by Cockburn and Sayas in [16] for the Stokes flow, and later generalized to the Brinkman equation in [21]. Lehrenfeld and Schöberl [31, 32] proposed and analyzed a symmetric interior penalty formulation for the H(div)-HDG scheme for the Stokes and Navier–Stokes equations. Different from the aforementioned methods, Rhebergen and Wells [40] constructed an H(div)-HDG scheme with facet unknowns for the pressure as Lagrange multipliers to ensure the numerical velocity solution is exactly divergence-free, and this scheme was then extended to the Navier–Stokes equations in [41].

However, constructing efficient solvers and preconditioners for the statically condensed systems of high-order HDG schemes in large-scale simulations remains a challenge and has attracted interest in recent years. For the H(div)-HDG scheme for the incompressible flow problems, most solvers focus on block preconditioners for the saddle-point structure of the condensed H(div)-HDG scheme for the Stokes flow [23, 42, 43], seeking a robust approximation of the Schur complement. In this study, we propose a robust hp-geometric multigrid preconditioner for the H(div)-HDG scheme for both the generalized Stokes and Navier–Stokes equations. Constructing geometric multigrid algorithms for the condensed HDG schemes poses a challenge in designing a stable intergrid transfer operator between different mesh levels. This difficulty arises because the global DOFs of the condensed systems exist only on the mesh skeleton, and the finite element spaces of these global unknowns on different mesh levels are non-nested. In [49], some intuitive choices of the intergrid transfer operators were tested for Poisson’s equation, but numerical experiments showed them to be unstable and not optimal.

To tackle this problem, Cockburn et al. [14] first introduced a “two-level” V-cycle multigrid for the HDG scheme for Poisson’s equation in the spirit of the auxiliary space preconditioning (ASP) technique, where they employed a continuous piecewise linear finite element space on the same mesh as the auxiliary space. The residual of the condensed HDG system is projected onto the auxiliary space and there is no coarse-grid facet unknown space. However, adopting such an approach for the H(div)-HDG scheme for incompressible flow problems could be challenging. A stable projection operator onto an inf-sup stable pair of auxiliary spaces is needed, and moreover there is no natural counterpart of the upwind HDG discretization of the convection term in the continuous finite element spaces. Lu et al. [35] have recently proposed an approach that keeps the HDG discretization on the coarse grid and constructs a novel prolongation operator, where hierarchical continuous finite element spaces are used to link facet variable spaces on different mesh levels. Lu et al. proved that the standard V-cycle algorithm exhibits h-robust convergence for the local HDG scheme (LDG-H) for Poisson’s equation, and extended their findings to the single-face hybridizable (SFH), hybrid Raviart-Thomas (RT-H), and hybrid Brezzi-Douglas-Marini (BDM-H) schemes for the Stokes equation in a more recent study [36]. To solve the saddle-point structure of the HDG scheme, they employed an augmented Lagrangian with parameter \(\Delta t\) and iteratively solved it (outer iteration), with the standard V-cycle used as the solver of the condensed SPD system (inner iteration). However, while the fast convergence of the outer iteration requires a large \(\Delta t\), the V-cycle of the inner iteration is not robust with respect to \(\Delta t\) and may result in a large total iteration count.

In this study, inspired by the previous studies on hp-multigrid methods to efficiently solve high-order DG schemes [4, 25, 50], we propose an hp-multigrid preconditioner for the condensed global systems of the grad-velocity-pressure formulation for the H(div)-HDG scheme for the generalized Stokes and the Navier–Stokes equations on conforming simplicial meshes. The augmented Lagrangian Uzawa iteration method is used to solve the condensed H(div)-HDG schemes, and we aim to accelerate Krylov space solvers by the hp-multigrid preconditioner for the primal operator on global velocity spaces. Our hp-multigrid is essentially a multiplicative ASP, with lowest-order global velocity spaces as the auxiliary space and geometric h-multigrid method as the auxiliary space solver. For the generalized Stokes equation, the key to the geometric multigrid algorithm is the establishment of the equivalence between the condensed lowest order H(div)-HDG scheme and the nonconforming Crouzeix–Raviart (CR) discretization with a pressure-robust treatment, with both methods introduced in [27] as approaches to satisfy the exact divergence-free constraint in incompressible flow problems. We note that the equivalence between the CR discretization and the lowest-order Raviart-Thomas (RT) mixed method is well-known for Poisson’s equation [1, 37]. Motivated by this, we proved in our previous work [22] the equivalence between the condensed lowest-order HDG scheme (HDG-P0) and a (scaled) CR discretization for the generalized Stokes equation. In this work, we extend and prove the equivalence between the condensed lowest-order H(div)-HDG scheme and a pressure-robust treated CR discretization. Figure 1 demonstrates the DOFs of the condensed HDG-P0, condensed lowest-order H(div)-HDG and CR discretization. Such equivalence allows us to directly employ the rich geometric multigrid theory for CR discretization. 
Then we use the geometric multigrid as the building block of the hp-multigrid for the condensed higher-order H(div)-HDG schemes. Numerical experiments support the robustness of the preconditioner with respect to the mesh size and the augmented Lagrangian parameter, with iteration counts that are insensitive to increases in the polynomial order. Inspired by the works of Benzi and Olshanskii [6], and Farrell et al. [19], we further test the developed hp-preconditioner on the condensed H(div)-HDG scheme for the Navier–Stokes equations linearized by Picard’s and Newton’s methods. Our numerical experiments demonstrate that the proposed preconditioner still works well, with a mild increase of the iteration counts of the preconditioned GMRES solver for Reynolds numbers as large as \(10^3\).

Fig. 1 Comparison of the degrees of freedom of condensed HDG-P0, condensed lowest-order H(div)-HDG and Crouzeix–Raviart schemes for incompressible flow problems

The rest of the paper is organized as follows. Basic notations and the finite element spaces to be used in the H(div)-HDG schemes are introduced in Sect. 2. In Sect. 3, we present the grad-velocity-pressure formulation of the H(div)-HDG scheme for the generalized Stokes equation and propose the augmented Lagrangian Uzawa iteration for the condensed system. We first focus on the lowest order case, prove the equivalence to the CR discretization and present the geometric multigrid algorithm, then we use it as the building block for the hp-multigrid preconditioner for higher-order cases. In Sect. 4, we present the H(div)-HDG scheme for the Navier–Stokes equations. We use Picard linearization to search for a close enough initial guess and then use Newton’s method to accelerate convergence. Two- and three-dimensional numerical experiments are performed in Sect. 5 and we conclude in Sect. 6.

2 Preliminaries and Notations

We assume a bounded polygonal/polyhedral domain \(\Omega \subset {\mathbb {R}}^d\), \(d\in \{2, 3\}\), with boundary \(\partial \Omega \). We denote \({\mathcal {T}}_h\) as a conforming, shape-regular and quasi-uniform simplicial triangulation of the domain \(\Omega \), and \({\mathcal {E}}_h\) as the set of facets of \({\mathcal {T}}_h\). The set \({\mathcal {E}}_h\) is also referred to as the mesh skeleton. We split \({\mathcal {E}}_h\) into the boundary part \({\mathcal {E}}_h^\partial := \{F\in {\mathcal {E}}_h| \;F\subset \partial \Omega \}\) and the interior part \({\mathcal {E}}_h^o:={\mathcal {E}}_h\backslash {\mathcal {E}}_h^\partial \). For any facet \(F\in {\mathcal {E}}_h\), we define \({\underline{n}}\) as the unit normal vector which points outward for \(F\in {\mathcal {E}}_h^\partial \) and is uniquely oriented for \(F\in {\mathcal {E}}_h^o\). We denote \(\textsf{nrm}({\underline{w}}):= ({\underline{w}}\cdot {\underline{n}}){\underline{n}}\) as the normal component and \(\textsf{tng}({\underline{w}}):= {\underline{w}}-\textsf{nrm}({\underline{w}})\) as the tangential component of a vector field \({\underline{w}}\). For each element \(K\in {\mathcal {T}}_h\) with element boundary \(\partial K\), we denote |K| as the measure of K, \((\cdot , \cdot )_K\) as the \(L^2\)-inner product over K, and \(\langle \cdot , \cdot \rangle _{\partial K}\) as the \(L^2\)-inner product over \(\partial K\). We further define \((\cdot , \cdot )_{{\mathcal {T}}_h}:= \sum _{K\in {\mathcal {T}}_h}(\cdot , \cdot )_K\) as the discrete \(L^2\) inner product over the domain, and \(\langle \cdot , \cdot \rangle _{\partial {\mathcal {T}}_h}:= \sum _{K\in {\mathcal {T}}_h}\langle \cdot , \cdot \rangle _{\partial K}\) as the discrete \(L^2\) inner product over all element boundaries. For each element \(K\in {\mathcal {T}}_h\), we denote \({\underline{n}}_K\) as the unit normal vector on the element boundary \(\partial K\) pointing outward.
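The normal/tangential splitting defined above can be illustrated with a minimal Python sketch. The function names `nrm` and `tng` mirror the notation of the text, while the sample vectors are assumptions chosen only for illustration.

```python
def nrm(w, n):
    """Normal component (w . n) n of a vector w for a unit normal n."""
    w_dot_n = sum(wi * ni for wi, ni in zip(w, n))
    return [w_dot_n * ni for ni in n]


def tng(w, n):
    """Tangential component w - nrm(w)."""
    return [wi - ci for wi, ci in zip(w, nrm(w, n))]


# Illustrative data (assumed): a unit normal and an arbitrary vector in 2D.
n = [0.6, 0.8]
w = [1.0, 2.0]

w_n = nrm(w, n)  # part of w along n
w_t = tng(w, n)  # part of w orthogonal to n
```

By construction, `w_n` and `w_t` sum to `w`, and `w_t` is orthogonal to `n`.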

As usual, we denote \(\Vert \cdot \Vert _{p,S}\) and \(|\cdot |_{p,S}\) as the \(H^p\)-norm and -seminorm of the Hilbert spaces \(H^{p}(S)\) for the domain \(S\subset {\mathbb {R}}^d\), with S omitted when \(S=\Omega \) is the whole domain.

For the finite element spaces, we denote \({\mathcal {P}}^k(S)\) and \({\widetilde{{\mathcal {P}}}}^k(S)\) as the scalar polynomial and homogeneous polynomial of order k over a simplex S. In this work, we use underline to denote the vector-valued spaces, and use double underline to denote the matrix version. The following spaces are used to construct the H(div)-conforming HDG scheme and the multigrid method:

where \({\underline{V}}_h^k\) is the k-th order RT space [39] and \({\underline{V}}_h^{CR}\) is the vector-valued first-order nonconforming CR space [17].

Next, to facilitate our analysis of static condensation of the H(div)-conforming HDG scheme, we perform a hierarchical basis splitting following [55]. We first split \(W_h^k\) into element-wise constant space and its complement, i.e.

$$\begin{aligned} W_h^k = {W}_h^{\partial } \oplus W_h^{k,o}, \end{aligned}$$

where

$$\begin{aligned} {W}_h^{\partial }&:= W_h^0, \\ W_h^{k,o}&:= \left\{ q\in W_h^k:\;\;(q,\;1)_K=0,\;\forall K\in {\mathcal {T}}_h\right\} . \end{aligned}$$

We also divide the RT space \({\underline{V}}_h^k\) into global subspace \({\underline{V}}_h^{k,\partial }\) and local subspace \({\underline{V}}_h^{k,o}\), i.e.

$$\begin{aligned} {\underline{V}}_h^k= {\underline{V}}_h^{k,\partial }\oplus {\underline{V}}_h^{k,o}. \end{aligned}$$

For any function \({\underline{v}}_h^\partial \) in the global subspace \({\underline{V}}_h^{k,\partial }\), it holds

$$\begin{aligned} \nabla \cdot {\underline{v}}_h^{\partial }|_K \in {\mathcal {P}}^0(K), \quad \textsf{nrm}({\underline{v}}_h^{\partial })|_F \in {\underline{{\mathcal {P}}}}^k(F), \quad \forall F\in \partial K \text { and } K\in {\mathcal {T}}_h. \end{aligned}$$

For any function \({\underline{v}}_h^o\) in the local subspace \({\underline{V}}_h^{k,o}\), it holds

$$\begin{aligned} \nabla \cdot {\underline{v}}_h^{o}|_K \in {\mathcal {P}}^k(K), \quad (\nabla \cdot {\underline{v}}_h^{o},1)_K=0, \quad \textsf{nrm}({\underline{v}}_h^{o})|_F = {\underline{0}}, \quad \forall F\in \partial K \text { and } K\in {\mathcal {T}}_h. \end{aligned}$$

Thus, by the definition of the subspaces we have

$$\begin{aligned} \nabla \cdot {\underline{V}}_h^{k,\partial }&= {W}_h^{\partial },\\ \nabla \cdot {\underline{V}}_h^{k,o}&= W_h^{k,o}. \end{aligned}$$

When \(k=0\), the local subspace \({\underline{V}}_h^{k,o}\) is trivial, and when \(k\ge 1\), the local components in \({\underline{V}}_h^{k,o}\) are locally eliminated in the static condensation of H(div)-HDG schemes. We refer interested readers to [55, Section 5.2.7], [31, Section 2.2] and the references therein for the basis functions of high-order RT spaces on triangles and tetrahedra.

For two positive constants a and b, we denote \(a\lesssim b\) if there exists a positive constant C independent of mesh size and model parameters such that \(a\le Cb\). We denote \(a \simeq b\) when \(a\lesssim b\) and \(b\lesssim a\).

3 Generalized Stokes Equation

3.1 Model Problem

Let \({\underline{f}}\in {\underline{L}}^2(\Omega )\) be the source term and assume a homogeneous Dirichlet boundary condition for simplicity. The model problem is to find \(({\underline{u}}, p)\) satisfying

$$\begin{aligned} - \nabla \cdot (\nu \nabla {\underline{u}}) + \beta {\underline{u}} + \nabla p =&\;{\underline{f}}, \quad{} & {} \text {in }{\Omega ,} \end{aligned}$$
(1a)
$$\begin{aligned} \nabla \cdot {\underline{u}} =&\; 0, \quad{} & {} \text {in }{\Omega ,} \end{aligned}$$
(1b)
$$\begin{aligned} {\underline{u}}=&\;{\underline{0}}, \quad{} & {} \text {on }\partial \Omega , \end{aligned}$$
(1c)

where \({\underline{u}}\) is the velocity, p is the pressure, \(\nu > 0\) is a constant representing the fluid viscosity, and \(\beta \) is the lower-order term coefficient satisfying \(0\le \beta _0\le \beta \le \beta _1\) for constants \(\beta _0\) and \(\beta _1\).

To present the H(div)-HDG scheme for the generalized Stokes equation, we introduce the tensor \(\underline{\underline{L}}:= - \nu \nabla {\underline{u}}\) as a new variable and rewrite (1) into a first-order system:

$$\begin{aligned} \nu ^{-1}\underline{\underline{L}} + \nabla {\underline{u}} =&\; 0, \quad{} & {} \text {in }\Omega , \end{aligned}$$
(2a)
$$\begin{aligned} \nabla \cdot \underline{\underline{L}} + \beta {\underline{u}} + \nabla p =&\; {\underline{f}}, \quad{} & {} \text {in }{\Omega ,} \end{aligned}$$
(2b)
$$\begin{aligned} \nabla \cdot {\underline{u}} =&\; 0, \quad{} & {} \text {in }{\Omega ,} \end{aligned}$$
(2c)
$$\begin{aligned} {\underline{u}} =&\; {\underline{0}}, \quad{} & {} \text {on }{\partial \Omega .} \end{aligned}$$
(2d)

Both the superconvergence property of the H(div)-HDG scheme used in this study and the geometric multigrid analysis for the lowest-order case require the following full elliptic regularity result for the solution \((\underline{\underline{L}}, {\underline{u}}, p) \in \underline{\underline{H}}^1(\Omega )\times ({\underline{H}}^2(\Omega )\cap {\underline{H}}^1_0(\Omega ))\times (H^1(\Omega ){/} {\mathbb {R}})\) to the model problem (2):

$$\begin{aligned} \Vert \underline{\underline{L}}\Vert _1 + \Vert {\underline{u}}\Vert _2 + \Vert p\Vert _1 \lesssim \Vert {\underline{f}}\Vert _0, \end{aligned}$$
(3)

which holds when the domain \(\Omega \) is convex [24].

3.2 The H(div)-HDG Scheme

The H(div)-HDG scheme used in this study for the system (2) is to find \((\underline{\underline{L}}_h, {\underline{u}}_h, \widehat{{\underline{u}}}_h, p_h) \in \underline{\underline{W}}_h^k \times {\underline{V}}_{h,0}^k \times \widehat{{\underline{V}}}_{h,0}^{k}\times W_{h,0}^{k}\), \(k \ge 0\), such that

$$\begin{aligned} (\nu ^{-1}\underline{\underline{L}}_h,\; \underline{\underline{G}}_h)_{{\mathcal {T}}_h} + (\nabla {\underline{u}}_h,\; \underline{\underline{G}}_h)_{{\mathcal {T}}_h} - \langle \textsf{tng}({\underline{u}}_h - \widehat{{\underline{u}}}_h),\; \underline{\underline{G}}_h{\underline{n}}_K\rangle _{\partial {\mathcal {T}}_h}&= 0, \end{aligned}$$
(4a)
$$\begin{aligned} -( \underline{\underline{L}}_h,\; \nabla {\underline{v}}_h )_{{\mathcal {T}}_h} + \langle \underline{\underline{L}}_h{\underline{n}}_K,\; \textsf{tng}({\underline{v}}_h - \widehat{{\underline{v}}}_h) \rangle _{\partial {\mathcal {T}}_h} + (\beta {\underline{u}}_h,\; {\underline{v}}_h)_{{\mathcal {T}}_h} - (p_h,\; \nabla \cdot {\underline{v}}_h)_{{\mathcal {T}}_h}&= ({\underline{f}},\; {\underline{v}}_h)_{{\mathcal {T}}_h}, \end{aligned}$$
(4b)
$$\begin{aligned} (\nabla \cdot {\underline{u}}_h,\; q_h)_{{\mathcal {T}}_h}&= 0, \end{aligned}$$
(4c)

for all \((\underline{\underline{G}}_h, {\underline{v}}_h, \widehat{{\underline{v}}}_h, q_h)\in \underline{\underline{W}}_h^k \times {\underline{V}}_{h,0}^k \times \widehat{{\underline{V}}}_{h,0}^{k}\times W_{h,0}^{k}\). The above H(div)-HDG scheme has been studied in [21] for the Brinkman equations. Besides the superconvergence property of the post-processed velocity when \(k\ge 1\), the numerical solution \((\underline{\underline{L}}_h, {\underline{u}}_h, \widehat{{\underline{u}}}_h, p_h)\) to the H(div)-HDG scheme (4) satisfies optimal a priori error estimates for \(k\ge 0\) that are parameter-robust with respect to the ratio \(\nu / \beta \). Since \(\nabla \cdot {\underline{V}}_{h}^{k}=W_h^k\), the velocity error estimates are pressure-robust. We refer readers to [21, Section 2] for more details. Note that no extra facet-based HDG stabilization is introduced in the scheme (4); hence, it is technically a hybrid-mixed method. Here we follow the convention in [16, 21] and still call it an HDG method.

Based on the subspace splitting in Sect. 2, we split the numerical solution \((\underline{\underline{L}}_h, {\underline{u}}_h, \widehat{{\underline{u}}}_h, p_h)\) to (4) into local variables \((\underline{\underline{L}}_h, {\underline{u}}_h^o, p_h^o)\in \underline{\underline{W}}_h^k \times {\underline{V}}_{h}^{k,o}\times W_{h}^{k,o}\) and global variables \(({\underline{u}}_h^{\partial }, \widehat{{\underline{u}}}_h, p_h^{\partial })\in {\underline{V}}_{h,0}^{k,\partial }\times \widehat{{\underline{V}}}_{h,0}^{k}\times W_{h,0}^{\partial }\), where

$$\begin{aligned} {\underline{u}}_h = {\underline{u}}_h^o + {\underline{u}}_h^{\partial }, \quad p_h = p_h^o + p_h^{\partial }. \end{aligned}$$

When implementing the H(div)-HDG scheme (4), we first eliminate the local variables to arrive at the condensed global system composed of the global variables. After solving \(({\underline{u}}_h^{\partial }, \widehat{{\underline{u}}}_h, p_h^{\partial })\) from the condensed global system, which is the most computationally costly part, we recover the local variables in an element-by-element manner.
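The eliminate, solve, recover workflow described above can be sketched on a toy block system in Python. This is only an illustration under strong assumptions: a single scalar local unknown coupled to two global unknowns, with hypothetical names (`condense_and_solve`, `solve2`) and made-up coefficients. In the actual scheme the local blocks are element-wise matrices.

```python
def solve2(M, b):
    """Solve a 2x2 linear system by Cramer's rule."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [(b[0] * M[1][1] - M[0][1] * b[1]) / det,
            (M[0][0] * b[1] - M[1][0] * b[0]) / det]


def condense_and_solve(a_ll, a_lg, a_gl, A_gg, f_l, f_g):
    """Static condensation for [[a_ll, a_lg], [a_gl, A_gg]] [u_l, u_g] = [f_l, f_g]."""
    # Step 1: eliminate the local unknown, u_l = (f_l - a_lg . u_g) / a_ll,
    # which yields the condensed (Schur complement) system for u_g.
    S = [[A_gg[i][j] - a_gl[i] * a_lg[j] / a_ll for j in range(2)]
         for i in range(2)]
    g = [f_g[i] - a_gl[i] * f_l / a_ll for i in range(2)]
    # Step 2: solve the condensed global system (the costly part in practice).
    u_g = solve2(S, g)
    # Step 3: recover the local unknown from the global solution.
    u_l = (f_l - sum(a_lg[j] * u_g[j] for j in range(2))) / a_ll
    return u_l, u_g


# Illustrative coefficients (assumed, symmetric positive definite system).
a_ll, a_lg, a_gl = 4.0, [1.0, 2.0], [1.0, 2.0]
A_gg = [[3.0, 0.0], [0.0, 5.0]]
f_l, f_g = 1.0, [2.0, 3.0]

u_l, u_g = condense_and_solve(a_ll, a_lg, a_gl, A_gg, f_l, f_g)
```

The recovered pair `(u_l, u_g)` satisfies all three equations of the original uncondensed system, which is the property that makes the condensed system an exact reformulation rather than an approximation.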

To simplify notation, we denote the compound spaces \(\underline{{\mathbb {V}}}_h^k:={\underline{V}}_h^k\times \widehat{{\underline{V}}}_h^k\) and \(\underline{{\mathbb {V}}}_h^{k,\partial }:={\underline{V}}_h^{k,\partial }\times \widehat{{\underline{V}}}_h^k\), and their corresponding element functions \(\underline{\mathbb {v}}_h:=({\underline{v}}_h, \widehat{{\underline{v}}}_h)\), \(\underline{\mathbb {v}}_h^{\partial }:=({\underline{v}}_h^{\partial }, \widehat{{\underline{v}}}_h)\) from now on. We introduce an \(L^2\)-like inner-product on \(\underline{{\mathbb {V}}}_{h}^{k,\partial }\):

$$\begin{aligned} (\underline{\mathbb {u}}_h^\partial ,\; \underline{\mathbb {v}}_h^\partial )_{0, h} := \sum _{K\in {\mathcal {T}}_h}\frac{|K|}{d+1} \left( \langle \textsf{nrm}({{\underline{u}}}_h),\; \textsf{nrm}({{\underline{v}}}_h) \rangle _{\partial K} + \langle \textsf{tng}(\widehat{{\underline{u}}}_h),\; \textsf{tng}(\widehat{{\underline{v}}}_h) \rangle _{\partial K}\right) \end{aligned}$$
(5)

for all \(\underline{\mathbb {u}}_h^\partial ,\underline{\mathbb {v}}_h^\partial \in \underline{{\mathbb {V}}}_h^{k, \partial }\), with the induced norm \(\Vert \cdot \Vert _{0, h}\). The piecewise constant global pressure space \(W_{h}^{\partial }\) is equipped with the standard \(L^2\) inner product, denoted as

$$\begin{aligned}{}[p_h^\partial ,\; q_h^\partial ]_{0,h}:= (p_h^\partial ,\; q_h^\partial )_{{\mathcal {T}}_h}, \quad \forall p_h^\partial , q_h^\partial \in W_{h}^\partial . \end{aligned}$$

The characterization of the condensed system of (4) has been studied in [21] and the operator form is to find \((\underline{\mathbb {u}}_h^\partial , p_h^{\partial })\in \underline{{\mathbb {V}}}_{h,0}^{k,\partial }\times W_{h,0}^{\partial }\) satisfying

$$\begin{aligned} {\underline{A}}_{k,h}\underline{\mathbb {u}}_h^\partial + {\underline{B}}_{k,h}^*p_h^\partial =\;&{\underline{F}}_{k,h} \end{aligned}$$
(6a)
$$\begin{aligned} {\underline{B}}_{k,h} \underline{\mathbb {u}}_h^\partial =\;&0, \end{aligned}$$
(6b)

where, for all \(\underline{\mathbb {u}}_h^\partial , \underline{\mathbb {v}}_h^\partial \in \underline{{\mathbb {V}}}_{h,0}^{k,\partial }\) and \(q_h^\partial \in W_{h,0}^\partial \), the operators are defined by

$$\begin{aligned} ({\underline{A}}_{k,h}\underline{\mathbb {u}}_h^\partial ,\; \underline{\mathbb {v}}_h^\partial )_{0,h}:=\;&(\nu \underline{\underline{{\mathcal {L}}}}^W(\underline{\mathbb {u}}_h^\partial ),\; \underline{\underline{{\mathcal {L}}}}^W(\underline{\mathbb {v}}_h^\partial ))_{{\mathcal {T}}_h} + (\beta \underline{{\mathcal {L}}}^V(\underline{\mathbb {u}}_h^\partial ),\; \underline{{\mathcal {L}}}^V(\underline{\mathbb {v}}_h^\partial ))_{{\mathcal {T}}_h}, \\ [{\underline{B}}_{k,h}\underline{\mathbb {u}}_h^\partial ,\; q_h^\partial ]_{0,h}:=\;&-(\nabla \cdot \underline{{u}}_h^\partial ,\; q_h^\partial )_{{\mathcal {T}}_h}, \end{aligned}$$

and \(\underline{\underline{{\mathcal {L}}}}^W:\underline{{\mathbb {V}}}_{h,0}^{k,\partial }\rightarrow \underline{\underline{W}}_{h}^{k}\) and \(\underline{{\mathcal {L}}}^V:\underline{{\mathbb {V}}}_{h,0}^{k,\partial }\rightarrow {\underline{V}}_{h,0}^{k}\) are mappings defined by the well-posed local solvers of the H(div)-HDG scheme, and \({\underline{B}}_{k,h}^*:W_{h,0}^{\partial }\rightarrow \underline{{\mathbb {V}}}_{h,0}^{k,\partial }\) is the adjoint of \({\underline{B}}_{k,h}:\underline{{\mathbb {V}}}_{h,0}^{k,\partial }\rightarrow W_{h,0}^{\partial }\) with respect to the inner products \((\cdot ,\cdot )_{0,h}\) and \([\cdot ,\cdot ]_{0,h}\):

$$\begin{aligned} ({\underline{B}}_{k,h}^*p_h^\partial ,\; \underline{\mathbb {v}}_h^\partial )_{0,h} = [p_h^\partial ,\; {\underline{B}}_{k,h}\underline{\mathbb {v}}_h^\partial ]_{0,h}, \quad \forall p_h^\partial \in W_{h,0}^\partial ,\; \underline{\mathbb {v}}_h^\partial \in \underline{{\mathbb {V}}}_{h,0}^{k,\partial }. \end{aligned}$$

It is clear that \({\underline{A}}_{k,h}:\underline{{\mathbb {V}}}_{h,0}^{k,\partial }\rightarrow \underline{{\mathbb {V}}}_{h,0}^{k,\partial }\) is a symmetric positive definite (SPD) operator, and we denote \(\Vert \cdot \Vert _{{A}_{k,h}}\) as the induced norm on \(\underline{{\mathbb {V}}}_{h,0}^{k,\partial }\), i.e. \(\Vert \cdot \Vert _{{A}_{k,h}}:= \sqrt{({\underline{A}}_{k,h}\cdot ,\;\cdot )_{0,h}}\).

3.3 Augmented Lagrangian Uzawa Iteration for the Condensed H(div)-HDG

To avoid solving a saddle-point system, we apply the augmented Lagrangian Uzawa iteration method [20, 26, 30, 52]. The saddle-point system is first transformed into an equivalent augmented Lagrangian formulation: find \((\underline{\mathbb {u}}_h^\partial , p_h^{\partial })\in \underline{{\mathbb {V}}}_{h,0}^{k,\partial }\times W_{h,0}^{\partial }\) satisfying

$$\begin{aligned} \underbrace{({\underline{A}}_{k,h}+ \epsilon ^{-1}{\underline{B}}_{k,h}^*{\underline{B}}_{k,h})}_{{\underline{A}}_{k,h}^\epsilon }\underline{\mathbb {u}}_h^\partial + {\underline{B}}_{k,h}^*p_h^\partial =&{\underline{F}}_{k, h}, \end{aligned}$$
(7a)
$$\begin{aligned} {\underline{B}}_{k,h} \underline{\mathbb {u}}_h^\partial =&0, \end{aligned}$$
(7b)

where \(\epsilon \) is a small augmented Lagrangian parameter, which is also referred to as the penalty parameter. The operator form of \(\epsilon ^{-1}{\underline{B}}_{k,h}^*{\underline{B}}_{k,h}\) is expressed as

$$\begin{aligned} (\epsilon ^{-1}{\underline{B}}_{k,h}^*{\underline{B}}_{k,h}\underline{\mathbb {u}}_h^\partial ,\; \underline{\mathbb {v}}_h^\partial )_{0,h} = \epsilon ^{-1}(\nabla \cdot \underline{{u}}_h^\partial ,\;\nabla \cdot \underline{{v}}_h^\partial )_{{\mathcal {T}}_h}, \quad \forall \underline{\mathbb {u}}_h^\partial , \underline{\mathbb {v}}_h^\partial \in \underline{{\mathbb {V}}}_{h,0}^{k,\partial }. \end{aligned}$$

The Uzawa iteration method with \(\epsilon ^{-1} \gg 1\) is to start with \(p_h^{\partial \,(0)}=0\), and iteratively find \((\underline{\mathbb {u}}_h^{\partial \,(n)},p_h^{\partial \,(n)})\in \underline{{\mathbb {V}}}_{h,0}^{k,\partial }\times W_{h,0}^{\partial }\) such that

$$\begin{aligned} {\underline{A}}_{k,h}^\epsilon \underline{\mathbb {u}}_h^{\partial \,(n)} =&\;{\underline{F}}_{k,h} - {\underline{B}}_{k,h}^*p_h^{\partial \,(n-1)}, \end{aligned}$$
(8a)
$$\begin{aligned} p_h^{\partial \, (n)} =&\; p_h^{\partial \, (n-1)}+ \epsilon ^{-1} {\underline{B}}_{k,h} \underline{\mathbb {u}}_h^{\partial \, (n)}. \end{aligned}$$
(8b)

The convergence property of the augmented Lagrangian Uzawa iteration method (8) has been studied in [30], and we quote it here for completeness.

Lemma 3.1

[Lemma 2.1 of [30]] Let \((\underline{\mathbb {u}}_h^{\partial }, p_h^\partial )\in \underline{{\mathbb {V}}}_{h,0}^{k,\partial }\times W_{h,0}^\partial \) be the solution of (6) and \((\underline{\mathbb {u}}_h^{\partial \, (n)}, p_h^{\partial \, (n)})\) be the n-th Uzawa iteration solution to (8). Then the following estimate holds:

$$\begin{aligned} \Vert \underline{\mathbb {u}}_h^{\partial \,(n)} - \underline{\mathbb {u}}_h^{\partial }\Vert _{{A}_{k,h}} \lesssim&\sqrt{\epsilon }\Vert p_h^{\partial \,(n)} - p_h^\partial \Vert _0\; \lesssim \; \sqrt{\epsilon }\bigl (\frac{\epsilon }{\epsilon +\mu _0}\bigr )^{n} \Vert p_h^\partial \Vert _{0}, \end{aligned}$$

where \(\mu _0\) is the minimal eigenvalue of the Schur complement \({\underline{S}}_{k,h}={\underline{B}}_{k,h}{\underline{A}}_{k,h}^{-1}{\underline{B}}^*_{k,h}\).
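The contraction factor \(\epsilon /(\epsilon +\mu _0)\) in the lemma can be observed on a toy saddle-point problem. The following Python sketch is only an illustration under stated assumptions: a hypothetical system with two velocity unknowns and one pressure unknown, written in the convention \(Au + B^Tp = F\), \(Bu = 0\), for which the multiplier update that contracts the pressure error reads \(p \leftarrow p + \epsilon ^{-1}Bu\). The names `al_uzawa` and `solve2` are not taken from any library.

```python
def solve2(M, b):
    """Solve a 2x2 linear system by Cramer's rule."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [(b[0] * M[1][1] - M[0][1] * b[1]) / det,
            (M[0][0] * b[1] - M[1][0] * b[0]) / det]


def al_uzawa(A, B, F, eps, n_iter):
    """Augmented Lagrangian Uzawa iteration for A u + B^T p = F, B u = 0.

    A is a 2x2 SPD matrix and B is a single constraint row (assumed sizes).
    Returns the last velocity iterate and the history of pressure iterates.
    """
    # Augmented primal operator A_eps = A + eps^{-1} B^T B.
    A_eps = [[A[i][j] + B[i] * B[j] / eps for j in range(2)] for i in range(2)]
    p, u, history = 0.0, [0.0, 0.0], []
    for _ in range(n_iter):
        # Velocity solve with the previous pressure iterate.
        u = solve2(A_eps, [F[i] - B[i] * p for i in range(2)])
        # Pressure (multiplier) update; trivial for a single constraint.
        p = p + (B[0] * u[0] + B[1] * u[1]) / eps
        history.append(p)
    return u, history


# Illustrative data (assumed). The exact solution is u = (1/3, -1/3), p = 1/3,
# and the Schur complement B A^{-1} B^T equals 1.5, so mu_0 = 1.5.
A = [[2.0, 0.0], [0.0, 1.0]]
B = [1.0, 1.0]
F = [1.0, 0.0]
eps = 0.1

u, p_hist = al_uzawa(A, B, F, eps, n_iter=8)
p_exact = 1.0 / 3.0
ratio = (p_hist[1] - p_exact) / (p_hist[0] - p_exact)  # observed contraction
```

For this example the observed ratio of successive pressure errors agrees with \(\epsilon /(\epsilon +\mu _0) = 0.1/1.6\), so a smaller \(\epsilon \) directly reduces the number of outer iterations.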

The discontinuous piecewise constant nature of the global pressure space \(W_{h,0}^\partial \) results in a trivial computation in equation (8b). As a consequence, the major computational cost of a Uzawa iteration is associated with solving the global velocity equation (8a). Here we use the preconditioned conjugate gradient (PCG) method to iteratively solve it. Specifically, we refer to the augmented Lagrangian Uzawa iteration method as the outer iteration, and the PCG method for solving equation (8a) as the inner iteration. The subsequent subsections are devoted to designing an hp-multigrid algorithm to precondition the operator \({\underline{A}}_{k,h}^\epsilon \) for the PCG method.
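For readers less familiar with the inner iteration, a generic PCG loop is sketched below in Python. It is a textbook formulation, not the implementation used in this work. The operator and the preconditioner are passed as callables, and the Jacobi preconditioner and the one-dimensional Laplacian matrix are placeholder assumptions standing in for the hp-multigrid preconditioner and the operator \({\underline{A}}_{k,h}^\epsilon \).

```python
def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def pcg(apply_A, apply_prec, b, tol=1e-12, max_iter=100):
    """Preconditioned conjugate gradient for an SPD operator.

    apply_A(v) returns A v; apply_prec(r) returns an approximation of A^{-1} r.
    Returns the approximate solution and the number of iterations used.
    """
    x = [0.0] * len(b)
    r = list(b)  # residual for the zero initial guess
    b_norm = dot(b, b) ** 0.5
    if b_norm == 0.0:
        return x, 0
    z = apply_prec(r)
    p = list(z)
    rz = dot(r, z)
    for k in range(1, max_iter + 1):
        Ap = apply_A(p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * ai for ri, ai in zip(r, Ap)]
        if dot(r, r) ** 0.5 <= tol * b_norm:
            return x, k
        z = apply_prec(r)
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


n = 5  # size of the illustrative system (assumed)


def apply_A(v):
    """Tridiagonal (-1, 2, -1) matrix, a stand-in SPD operator."""
    return [2.0 * v[i]
            - (v[i - 1] if i > 0 else 0.0)
            - (v[i + 1] if i < n - 1 else 0.0) for i in range(n)]


def apply_jacobi(r):
    """Jacobi (diagonal) preconditioner, a placeholder for multigrid."""
    return [ri / 2.0 for ri in r]


x, iters = pcg(apply_A, apply_jacobi, [1.0] * n)
```

Only `apply_prec` would change when a multigrid preconditioner is used; the quality of that preconditioner determines how many passes through the loop are needed.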

Remark 3.1

[On the augmented Lagrangian Uzawa iteration for (6)] We are aware of the extensive body of literature on solving saddle-point problems, including the problem presented in equation (6). For a comprehensive review, we refer the reader to [5]. In this work, we adopt the augmented Lagrangian Uzawa iteration method for two primary reasons. Firstly, by locally eliminating \(p_h^\partial \), the final matrix size is further reduced. Secondly, the resulting global velocity operator \({\underline{A}}_{k,h}^\epsilon \) is SPD on the space \(\underline{{\mathbb {V}}}_{h,0}^{k,\partial }\), allowing us to use the CG method, for which the eigenvalues of the preconditioned linear operator are sufficient to characterize the convergence rate of the method. A good preconditioner for \({\underline{A}}_{k,h}^\epsilon \) can also serve in block preconditioners as an approximate inverse of the primal (1, 1)-block of the augmented Lagrangian transformed saddle-point structure as in (7).

However, there are two potential drawbacks to this approach. As observed from Lemma 3.1, \(\epsilon \ll 1\) leads to fast convergence of the Uzawa iteration. Firstly, when \(\epsilon \ll 1\) the term \(\epsilon ^{-1}{\underline{B}}_{k,h}^*{\underline{B}}_{k,h}\) in \({\underline{A}}_{k,h}^\epsilon \) poses challenges for preconditioning. In the following sections, we construct hp-multigrid methods that are robust with respect to the parameter \(\epsilon \) and the mesh size, such that the PCG/inner iteration counts remain bounded while we can take \(\epsilon \) arbitrarily small and only a few Uzawa/outer iterations are needed, avoiding a large total computational cost. Secondly, setting \(\epsilon \) to an extremely small value leads to round-off issues. This is because the non-zero entries in the matrix \({\underline{A}}_{k,h}\) are of order \({\mathcal {O}}(1)\), whereas the entries in \(\epsilon ^{-1}{\underline{B}}_{k,h}^*{\underline{B}}_{k,h}\) are of order \({\mathcal {O}}(\epsilon ^{-1})\). Hence, a loss of about \(\log _{10}(\epsilon ^{-1})\) significant digits is expected in a practical implementation.

In our numerical experiments, we use double-precision arithmetic with machine precision of about \(10^{-16}\). We set \(\epsilon =10^{-6}\), and only two Uzawa iterations are needed to achieve the required accuracy. Here the round-off error is of order \({\mathcal {O}}(10^{-16}/\epsilon )={\mathcal {O}}(10^{-10})\).
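The digit loss can be reproduced in plain double-precision arithmetic. The Python sketch below is an illustration under simple assumptions: a single \({\mathcal {O}}(1)\) number stands in for an entry of \({\underline{A}}_{k,h}\) and is added to a number of size \(\epsilon ^{-1}\), after which the large part is subtracted again. The helper name `digits_lost` is hypothetical.

```python
import math


def digits_lost(eps):
    """Rough number of decimal digits lost when O(1) and O(1/eps) entries mix."""
    return math.log10(1.0 / eps)


eps = 1e-6
x = 1.0 / 3.0  # an O(1) entry
big = 1.0 / eps  # an O(1/eps) entry

# Adding the large entry and removing it again exposes the absolute error
# that was committed on the O(1) part.
y = (x + big) - big
abs_error = abs(y - x)
```

With \(\epsilon =10^{-6}\), `abs_error` is nonzero and stays at or below roughly \(10^{-10}\), in line with the estimate \({\mathcal {O}}(10^{-16}/\epsilon )\) quoted above.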

3.4 Geometric h-multigrid for the Lowest-Order Scheme

We first focus on the lowest-order case when \(k=0\) and prove the equivalence between the condensed H(div)-HDG scheme (4) and a CR discretization after a pressure-robust treatment. Then we propose for the lowest-order scheme a geometric h-multigrid method that is robust with respect to the mesh size and the penalty parameter \(\epsilon \).

3.4.1 Equivalence to a CR Discretization

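To explain the pressure-robust treatment of the CR discretization, we introduce an interpolation operator \({\underline{\Pi }}^{RT}\) from the vector CR space \({\underline{V}}_h^{CR}\) to the RT0 space \({\underline{V}}_h^0\) satisfying

$$\begin{aligned} \int _F \textsf{nrm}({\underline{\Pi }}^{RT}{\underline{v}}_h^{CR})\, \textrm{ds} = \int _F \textsf{nrm}({\underline{v}}_h^{CR})\, \textrm{ds}, \quad \forall F\in {\mathcal {E}}_h,\; {\underline{v}}_h^{CR}\in {\underline{V}}_h^{CR}. \end{aligned}$$
(9)

To establish the link between the lowest-order H(div)-HDG and the CR discretization, we define an interpolation operator \({\underline{\Pi }}^{CR}\) from the product \(\underline{{\mathbb {V}}}_h^0\) of the lowest-order RT0 space and the tangential facet finite element space to the vector CR space \({\underline{V}}_h^{CR}\):

$$\begin{aligned} \textsf{nrm}({\underline{\Pi }}^{CR}\underline{\mathbb {v}}_h)(m_F)&= \textsf{nrm}({\underline{v}}_h)(m_F), \end{aligned}$$
(10a)
$$\begin{aligned} \textsf{tng}({\underline{\Pi }}^{CR}\underline{\mathbb {v}}_h)(m_F)&= \textsf{tng}(\widehat{{\underline{v}}}_h)(m_F), \end{aligned}$$
(10b)

for all \(F \in \partial K\), \(K\in {\mathcal {T}}_h\), where \(m_F\) is the barycenter of facet F. By counting the DOFs of the spaces, it is clear that \({\underline{\Pi }}^{CR}\) is an isomorphic mapping between \(\underline{{\mathbb {V}}}_h^0\) and \({\underline{V}}_h^{CR}\). Then we have the following property for the above interpolation operators: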

Lemma 3.2

For all \(\underline{\mathbb {v}}_h\in \underline{{\mathbb {V}}}_h^0\), we have:

$$\begin{aligned} {\underline{\Pi }}^{RT}{\underline{\Pi }}^{CR}\underline{\mathbb {v}}_h&= {\underline{v}}_h, \end{aligned}$$
(11)
$$\begin{aligned} \nabla \cdot {\underline{\Pi }}^{CR}\underline{\mathbb {v}}_h&= \nabla \cdot {\underline{v}}_h. \end{aligned}$$
(12)

Proof

With the definition of the interpolation operators in (9) and (10), we have

$$\begin{aligned} \int _F {\textsf{nrm}({\underline{\Pi }}^{RT}{\underline{\Pi }}^{CR}\underline{\mathbb {v}}_h)} \textrm{ds} = \int _F {\textsf{nrm}({\underline{\Pi }}^{CR}\underline{\mathbb {v}}_h)} \textrm{ds} =&\int _F {\textsf{nrm}({\underline{v}}_h)}\textrm{ds}, \end{aligned}$$

for all \(\underline{\mathbb {v}}_h\in \underline{{\mathbb {V}}}_h^0\), \(F\in \partial K\) and \(K\in {\mathcal {T}}_h\). The result (11) then immediately follows.

Similarly, by the divergence theorem, we have for all \(\underline{\mathbb {v}}_h\in \underline{{\mathbb {V}}}_h^0\),

$$\begin{aligned} \int _{K}\nabla \cdot {\underline{\Pi }}^{CR}\underline{\mathbb {v}}_h \textrm{dx} = \int _{\partial K}({\underline{\Pi }}^{CR}\underline{\mathbb {v}}_h)\cdot {\underline{n}}_K\textrm{ds} = \int _{\partial K}\underline{{v}}_h\cdot {\underline{n}}_K \textrm{ds} = \int _{K} \nabla \cdot {\underline{v}}_h\textrm{dx}, \end{aligned}$$

and the divergence invariance (12) follows from the fact that

$$\begin{aligned} (\nabla \cdot {\underline{\Pi }}^{CR}\underline{\mathbb {v}}_h)|_K, \quad (\nabla \cdot {\underline{v}}_h)|_K \in {\mathcal {P}}^0(K), \quad \forall K\in {\mathcal {T}}_h. \end{aligned}$$

\(\square \)
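
The two identities of Lemma 3.2 can be checked numerically on a single element. The following is a minimal sketch, assuming a two-dimensional reference triangle, an RT0 field written as \(v(x) = a + b\,x\), and a CR field stored by its three facet-midpoint values; the function names (`pi_cr`, `pi_rt`, `cr_divergence`) are hypothetical helpers introduced only for this illustration.

```python
import numpy as np

# Reference triangle; facet i is the edge opposite vertex i.
verts = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
facets = [(1, 2), (2, 0), (0, 1)]
mids = np.array([(verts[i] + verts[j]) / 2 for i, j in facets])

def unit_outward_normals():
    centroid = verts.mean(axis=0)
    normals = []
    for (i, j), m in zip(facets, mids):
        t = verts[j] - verts[i]
        n = np.array([t[1], -t[0]]) / np.linalg.norm(t)
        normals.append(n if n @ (m - centroid) > 0 else -n)
    return np.array(normals)

normals = unit_outward_normals()

def rt0(coef, x):
    """RT0 field v(x) = (a0, a1) + b * x with coef = (a0, a1, b)."""
    return coef[:2] + coef[2] * x

def pi_cr(coef, v_hat):
    """Midpoint values of the CR field: nrm(v) + tng(v_hat) at each m_F."""
    vals = []
    for m, n, vh in zip(mids, normals, v_hat):
        nrm = (rt0(coef, m) @ n) * n
        tng = vh - (vh @ n) * n
        vals.append(nrm + tng)
    return np.array(vals)

def cr_divergence(vals):
    """Divergence of the linear vector field with the given midpoint values."""
    M = np.column_stack([np.ones(3), mids])  # rows: [1, x_F, y_F]
    c = np.linalg.solve(M, vals)             # c[1:, k] is the gradient of component k
    return c[1, 0] + c[2, 1]

def pi_rt(vals):
    """RT0 coefficients whose facet normal components match the CR field."""
    M = np.column_stack([normals, np.sum(mids * normals, axis=1)])
    rhs = np.sum(vals * normals, axis=1)
    return np.linalg.solve(M, rhs)

rng = np.random.default_rng(0)
coef = rng.standard_normal(3)        # element part v_h (RT0)
v_hat = rng.standard_normal((3, 2))  # facet part; only its tangential part is used
w_mid = pi_cr(coef, v_hat)
```

Since the normal component of an RT0 field is constant on each facet and a CR field is linear, matching midpoint values of the normal component is the same as matching the facet integrals used in the proof. In two dimensions the divergence of \(a + b\,x\) is \(2b\), which is the value `cr_divergence(w_mid)` is compared against.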

Now we are ready to present our main result below.

Theorem 3.1

[Equivalence to CR discretization at lowest order] Let \(({\underline{u}}_{h}^{CR}, p_h^{CR}) \in {\underline{V}}_{h,0}^{CR}\times W_{h,0}^0\) be the solution to the following nonconforming scheme:

$$\begin{aligned} \nu (\nabla {\underline{u}}_h^{CR},\; \nabla {\underline{v}}_h^{CR})_{{\mathcal {T}}_h} +\beta ({\underline{\Pi }}^{RT}{\underline{u}}_h^{CR},\; {\underline{\Pi }}^{RT}{\underline{v}}_h^{CR})_{{\mathcal {T}}_h} - (p_h^{CR},\; \nabla \cdot {\underline{v}}_h^{CR})_{{\mathcal {T}}_h}&= ({\underline{f}},\; {\underline{\Pi }}^{RT}{\underline{v}}_h^{CR})_{{\mathcal {T}}_h}, \end{aligned}$$
(13a)
$$\begin{aligned} (\nabla \cdot {\underline{u}}_h^{CR},\; q_h^{CR})_{{\mathcal {T}}_h}&= 0, \end{aligned}$$
(13b)

for all \(({\underline{v}}_{h}^{CR}, q_h^{CR}) \in {\underline{V}}_{h,0}^{CR}\times W_{h,0}^0\). Then the solution \((\underline{\underline{L}}_h, \mathbb {{\underline{u}}}_h, p_h) \in \underline{\underline{W}}_h^0\times \underline{{\mathbb {V}}}_{h,0}^0 \times W_{h,0}^0\) to the lowest-order H(div)-HDG scheme for the generalized Stokes equation (4) satisfies

$$\begin{aligned} \underline{\underline{L}}_h&= -\nu \nabla {\underline{u}}_h^{CR}, \end{aligned}$$
(14a)
$$\begin{aligned} {\underline{\Pi }}^{CR}\underline{\mathbb {u}}_h&= {\underline{u}}_h^{CR}, \end{aligned}$$
(14b)
$$\begin{aligned} p_h&= p_h^{CR}. \end{aligned}$$
(14c)

Proof

By integration by parts and the definition of \({\underline{\Pi }}^{CR}\), we have

$$\begin{aligned} (\nabla {\underline{u}}_h,\; \underline{\underline{G}}_h)_{{\mathcal {T}}_h} - \langle \textsf{tng}({\underline{u}}_h - \widehat{{\underline{u}}}_h),\; \underline{\underline{G}}_h{\underline{n}}_K\rangle _{\partial {\mathcal {T}}_h} =&\langle \textsf{nrm}({\underline{u}}_h) + \textsf{tng}(\widehat{{\underline{u}}}_h),\; \underline{\underline{G}}_h{\underline{n}}_K\rangle _{\partial {\mathcal {T}}_h}\nonumber \\&\; - \underbrace{({\underline{u}}_h,\; \nabla \cdot \underline{\underline{G}}_h)_{{\mathcal {T}}_h}}_{\equiv 0} \nonumber \\ =&\langle {\underline{\Pi }}^{CR}\underline{\mathbb {u}}_h,\; \underline{\underline{G}}_h{\underline{n}}_K\rangle _{\partial {\mathcal {T}}_h} \nonumber \\ =&(\nabla {\underline{\Pi }}^{CR}\underline{\mathbb {u}}_h,\; \underline{\underline{G}}_h)_{{\mathcal {T}}_h} + \underbrace{( {\underline{\Pi }}^{CR}\underline{\mathbb {u}}_h,\; \nabla \cdot \underline{\underline{G}}_h)_{{\mathcal {T}}_h}}_{\equiv 0}, \end{aligned}$$
(15)

for all \(\underline{\mathbb {u}}_h\in \underline{{\mathbb {V}}}_{h,0}^0\) and \(\underline{\underline{G}}_h\in \underline{\underline{W}}_{h}^0\), where we used the fact that \(\underline{\underline{G}}_h|_K\in \underline{\underline{{\mathcal {P}}}}^0(K)\). By plugging (15) into (4a), we immediately get

$$\begin{aligned} \underline{\underline{L}}_h = -\nu \nabla {\underline{\Pi }}^{CR}\underline{\mathbb {u}}_h. \end{aligned}$$
(16)

Next, with the same arguments as in proving (15), for all \(\underline{\mathbb {v}}_h\in \underline{{\mathbb {V}}}_{h,0}^{0}\) and \(\underline{\underline{L}}_h\in \underline{\underline{W}}_{h}^0\) we have

$$\begin{aligned} -( \underline{\underline{L}}_h,\; \nabla {\underline{v}}_h )_{{\mathcal {T}}_h} + \langle \underline{\underline{L}}_h{\underline{n}}_K, \textsf{tng}({\underline{v}}_h - \widehat{{\underline{v}}}_h) \rangle _{\partial {\mathcal {T}}_h} =&\underbrace{(\nabla \cdot \underline{\underline{L}}_h,\; {\underline{v}}_h)_{{\mathcal {T}}_h}}_{\equiv 0} - \langle \underline{\underline{L}}_h{\underline{n}}_K,\; \textsf{nrm}({\underline{v}}_h) + \textsf{tng}(\widehat{{\underline{v}}}_h) \rangle _{\partial {\mathcal {T}}_h} \nonumber \\ =&-\langle \underline{\underline{L}}_h{\underline{n}}_K,\; {\underline{\Pi }}^{CR}\underline{\mathbb {v}}_h\rangle _{\partial {\mathcal {T}}_h} \nonumber \\ =&\underbrace{-(\nabla \cdot \underline{\underline{L}}_h,\; {\underline{\Pi }}^{CR}\underline{\mathbb {v}}_h)_{{\mathcal {T}}_h}}_{\equiv 0} - (\underline{\underline{L}}_h,\; \nabla {\underline{\Pi }}^{CR}\underline{\mathbb {v}}_h)_{{\mathcal {T}}_h} \nonumber \\ =&\nu (\nabla {\underline{\Pi }}^{CR}\underline{\mathbb {u}}_h,\; \nabla {\underline{\Pi }}^{CR}\underline{\mathbb {v}}_h)_{{\mathcal {T}}_h}, \end{aligned}$$
(17)

where we use the identity (16) in the final step. Finally, by substituting (17) into (4b) and using the results of Lemma 3.2, the lowest-order H(div)-HDG scheme (4) becomes: find \((\underline{\mathbb {u}}_h,\; p_h)\in \underline{{\mathbb {V}}}_{h,0}^0\times W_{h,0}^0\) satisfying

$$\begin{aligned} \nu (\nabla {\underline{\Pi }}^{CR}\underline{\mathbb {u}}_h,\; \nabla {\underline{\Pi }}^{CR}\underline{\mathbb {v}}_h)_{{\mathcal {T}}_h} +\beta ({\underline{\Pi }}^{RT}{\underline{\Pi }}^{CR}\underline{\mathbb {u}}_h,\; {\underline{\Pi }}^{RT}{\underline{\Pi }}^{CR}\underline{\mathbb {v}}_h)_{{\mathcal {T}}_h}&\nonumber \\ -(p_h,\; \nabla \cdot {\underline{\Pi }}^{CR}\underline{\mathbb {v}}_h)_{{\mathcal {T}}_h}&= ({\underline{f}},\; {\underline{\Pi }}^{RT}{\underline{\Pi }}^{CR}\underline{\mathbb {v}}_h)_{{\mathcal {T}}_h} \end{aligned}$$
(18a)
$$\begin{aligned} (\nabla \cdot {\underline{\Pi }}^{CR}\underline{\mathbb {u}}_h,\; q_h)_{{\mathcal {T}}_h}&= 0, \end{aligned}$$
(18b)

for all \((\underline{\mathbb {v}}_h,\; q_h)\in \underline{{\mathbb {V}}}_{h,0}^0\times W_{h,0}^0\), and the equivalence results (14) follow from the well-posedness of the CR discretization (13). \(\square \)

Remark 3.2

(On the modified CR discretization) The modified CR discretization (13) was first introduced in [33] for the incompressible Stokes equation, i.e., \(\beta =0\) in (1). The original mixed CR discretization for the incompressible Stokes equation, though inf-sup stable, conserves mass poorly and its velocity error estimates are not pressure-robust. To address this issue, the test function on the right-hand side was reconstructed using \({\underline{\Pi }}^{RT}\), which maps discretely divergence-free test functions onto exactly divergence-free test functions and thereby restores, in the mixed method, the \(L^2\)-orthogonality between discretely divergence-free and irrotational vector fields, while the left-hand side remains unchanged. Although this introduces an additional consistency error into the scheme, optimal and pressure-robust error estimates for the velocity in the discrete \(H^1\) norm were obtained, and optimal and pressure-robust convergence rates in the \(L^2\) norm were supported by numerical experiments. Later, this pressure-robust treatment of the CR discretization was extended to the incompressible generalized Stokes equation [34], i.e., \(\beta \ne 0\) in (1), where both the trial function and the test function in the mass term were additionally reconstructed in the RT0 space. It is worth mentioning that a skew-symmetric pressure-robust treatment of the convection term was also introduced in [34] for the CR discretization of the incompressible Navier–Stokes equations.

In the practical implementation of the lowest-order H(div)-HDG scheme (4), we first locally eliminate \(\underline{\underline{L}}_h\) and arrive at the condensed global system (18) for \((\underline{\mathbb {u}}_h,\;p_h)\), which contains no higher-order local velocity or pressure components. Thus Theorem 3.1 implies the equivalence between the condensed lowest-order H(div)-HDG scheme (4) and the modified CR discretization (13). In other words, when \(k=0\), the condensed H(div)-HDG scheme is equivalent to finding \((\underline{\mathbb {u}}_h,p_h)\in \underline{{\mathbb {V}}}_{h,0}^0\times W_{h,0}^0\) satisfying

$$\begin{aligned} {\underline{A}}_{0,h}\underline{\mathbb {u}}_h + {\underline{B}}_{0,h}^*p_h =&{\underline{F}}_{0,h}, \end{aligned}$$
(19a)
$$\begin{aligned} {\underline{B}}_{0,h} \underline{\mathbb {u}}_h =&0, \end{aligned}$$
(19b)

where for all \(\underline{\mathbb {u}}_h, \underline{\mathbb {v}}_h \in \underline{{\mathbb {V}}}_{h,0}^{0}\) and \(q_h\in W_{h,0}^0\),

$$\begin{aligned} ({\underline{A}}_{0,h}\underline{\mathbb {u}}_h,\;\underline{\mathbb {v}}_h)_{0,h} \equiv&\;\; \nu (\nabla {\underline{\Pi }}^{CR}\underline{\mathbb {u}}_h,\; \nabla {\underline{\Pi }}^{CR}\underline{\mathbb {v}}_h)_{{\mathcal {T}}_h} +\beta ({\underline{\Pi }}^{RT}{\underline{\Pi }}^{CR}\underline{\mathbb {u}}_h,\; {\underline{\Pi }}^{RT}{\underline{\Pi }}^{CR}\underline{\mathbb {v}}_h)_{{\mathcal {T}}_h}, \\ [{\underline{B}}_{0,h} \underline{\mathbb {u}}_h, q_h]_{0,h} \equiv&\;\; -(\nabla \cdot {\underline{\Pi }}^{CR}\underline{\mathbb {u}}_h,\; q_h)_{{\mathcal {T}}_h}, \end{aligned}$$

and the augmented Lagrangian Uzawa iteration (8) is equivalent to finding \((\underline{\mathbb {u}}_h^{(n)},p_h^{(n)})\in \underline{{\mathbb {V}}}_{h,0}^0\times W_{h,0}^0\) satisfying

$$\begin{aligned} {\underline{A}}_{0,h}^\epsilon \underline{\mathbb {u}}_h^{(n)} =&\;{\underline{F}}_{0,h} - {\underline{B}}_{0,h}^*p_h^{(n-1)}, \end{aligned}$$
(20a)
$$\begin{aligned} p_h^{(n)} =&\; p_h^{(n-1)}- \epsilon ^{-1} {\underline{B}}_{0,h} \underline{\mathbb {u}}_h^{(n)}. \end{aligned}$$
(20b)

where for all \(\underline{\mathbb {u}}_h, \underline{\mathbb {v}}_h \in \underline{{\mathbb {V}}}_{h,0}^{0}\),

$$\begin{aligned} ({\underline{A}}_{0,h}^\epsilon \underline{\mathbb {u}}_h,\; \underline{\mathbb {v}}_h)_{0,h} \equiv&\;\; \nu (\nabla {\underline{\Pi }}^{CR}\underline{\mathbb {u}}_h,\; \nabla {\underline{\Pi }}^{CR}\underline{\mathbb {v}}_h)_{{\mathcal {T}}_h} +\beta ({\underline{\Pi }}^{RT}{\underline{\Pi }}^{CR}\underline{\mathbb {u}}_h,\; {\underline{\Pi }}^{RT}{\underline{\Pi }}^{CR}\underline{\mathbb {v}}_h)_{{\mathcal {T}}_h} \\&+ \epsilon ^{-1}(\nabla \cdot {\underline{\Pi }}^{CR}\underline{\mathbb {u}}_h,\; \nabla \cdot {\underline{\Pi }}^{CR}\underline{\mathbb {v}}_h)_{{\mathcal {T}}_h}. \end{aligned}$$

This result is inspired by the equivalence between the lowest-order Raviart-Thomas mixed method and the nonconforming CR method for Poisson’s equation [1, 37]. This equivalence allows the rich multigrid theory for the CR discretization to be applied directly to the condensed lowest-order H(div)-HDG scheme. In this study, we apply Schöberl’s geometric multigrid theory [45, 46] to the SPD primal operator \({\underline{A}}_{0,h}^{\epsilon }\) to obtain geometric multigrid methods that are robust with respect to both the mesh size and the penalty parameter \(\epsilon \).
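
The structure of an augmented Lagrangian Uzawa iteration of the type (20) can be sketched in matrix form. The following is a minimal illustration on a small dense saddle point system, not the condensed H(div)-HDG operators: the matrices `A` and `B` are random placeholders, the primal operator is assembled as \(A + \epsilon^{-1}B^{T}B\), and the sign of the multiplier update is tied to the convention \(A u + B^{T}p = f\), \(B u = 0\) assumed in the code. When transcribing (20b), the sign should be matched to the definition of \({\underline{B}}_{0,h}\).

```python
import numpy as np

def uzawa_al(A, B, f, eps, n_iter):
    """Augmented Lagrangian Uzawa for  A u + B^T p = f,  B u = 0."""
    A_eps = A + (B.T @ B) / eps        # SPD primal operator A + eps^{-1} B^T B
    p = np.zeros(B.shape[0])
    for _ in range(n_iter):
        u = np.linalg.solve(A_eps, f - B.T @ p)
        p = p + (B @ u) / eps          # multiplier update for the convention above
    return u, p

rng = np.random.default_rng(1)
n, m = 6, 2
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)            # placeholder SPD block
B = rng.standard_normal((m, n))        # placeholder constraint block
f = rng.standard_normal(n)

u, p = uzawa_al(A, B, f, eps=1e-3, n_iter=30)

# Reference: direct solve of the full saddle point system.
K = np.block([[A, B.T], [B, np.zeros((m, m))]])
ref = np.linalg.solve(K, np.concatenate([f, np.zeros(m)]))
```

In this sketch each step only requires a solve with the SPD operator `A_eps`, which is the role played by \({\underline{A}}_{0,h}^{\epsilon }\) and the reason a robust preconditioner for it is needed as \(\epsilon \) becomes small.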

Remark 3.3

(On geometric h-multigrid for the CR discretization) There are two other approaches in the literature to constructing geometric multigrid algorithms for the CR scheme (13), which is equivalent to the condensed system (18). The first approach [10, 48, 51] exploits the cell-wise divergence-free subspace of the CR element and applies multigrid algorithms to the positive definite system on the divergence-free kernel space. However, this method is limited to two dimensions; its extension to three dimensions is exceedingly challenging because it requires complex intergrid transfer operators between the divergence-free subspaces. The second approach, proposed by Brenner [11, 12], deals directly with the saddle point system (with a penalty term) to avoid the divergence kernel, and convergence was proved for the nested W-cycle multigrid method with sufficiently many smoothing steps. Schöberl’s approach instead works with the positive definite primal operator obtained from the saddle point system with a penalty term, which is the same as \({\underline{A}}_{0,h}^{\epsilon }\) in our study. This approach is more feasible in three dimensions, as its intergrid transfer operator is considerably easier to implement in practice than that of the first approach. Originally introduced for the \(P^2\)-\(P^0\) discretization on triangles, Schöberl’s approach has been successfully applied by other researchers to other finite element schemes in two and three dimensions, as documented in [18, 19, 26, 28, 29].

3.4.2 Geometric Multigrid Algorithm

In this subsection, we present a detailed geometric multigrid algorithm for solving (20a), based on the parameter-robust multigrid theory of Schöberl [45, 46].

We consider a hierarchical mesh sequence for the geometric h-multigrid algorithm, beginning with the coarsest simplicial triangulation \({\mathcal {T}}_1\). The finest mesh is denoted as \({\mathcal {T}}_J={\mathcal {T}}_h\) and is obtained through a sequence of mesh refinements for \(l=2,\dots , J\). On each mesh level, the mesh skeleton is denoted as \({\mathcal {E}}_l\) and the maximum mesh size of \({\mathcal {T}}_l\) is denoted by \(h_l\). We assume that on each level the triangulation \({\mathcal {T}}_l\) is conforming, shape-regular, and quasi-uniform over the domain \(\Omega \), and that the mesh sizes of two adjacent levels are comparable, i.e., \(h_{l} \lesssim h_{l+1}\). The corresponding finite element spaces on \({\mathcal {T}}_l\) are denoted as \(W_l^0\), \({\underline{V}}_l^0\), \(\widehat{{\underline{V}}}_l^0\), and \({\underline{V}}_l^{CR}\). The corresponding \(L^2\) inner products on the global spaces \(\underline{{\mathbb {V}}}_{l}^{0}\) and \({W}_l^0\) are denoted as \((\cdot ,\;\cdot )_{0,l}\) and \([\cdot ,\;\cdot ]_{0,l}\), and the corresponding linear operators in (20) are denoted as \({\underline{A}}_{0,l}^\epsilon \), \({\underline{B}}_{0,l}^*\) and \({\underline{B}}_{0,l}\).

The main components in [45, 46] are (i) a robust intergrid transfer operator that transfers coarse-grid divergence-free functions to fine-grid (nearly) divergence-free functions, and (ii) a robust block smoother capable of capturing the divergence-free basis functions.

For the intergrid transfer operator, we first define the following averaging operator \({\underline{I}}_{l-1}^l:\underline{{\mathbb {V}}}_{l-1,0}^{0}\rightarrow \underline{{\mathbb {V}}}_{l,0}^{0}\) in light of the one used in multigrid methods for the CR discretization for Poisson’s equation [7, 9]: Let \(({\underline{v}}_{l}', \widehat{{\underline{v}}}_{l}') = {\underline{I}}_{l-1}^l\underline{\mathbb {v}}_{l-1}\), then we have

(21a)
(21b)

where and are the values of on two adjacent elements \(K^{+},\; K^{-}\in {{\mathcal {T}}_{l-1}}\) that share the facet F. On the finer l-th mesh level, we denote by \({\underline{V}}_{l,0}^{0,T}\) and \(\widehat{{\underline{V}}}_{l,0}^{0,T}\) the local subspaces of \({\underline{V}}_{l,0}^{0}\) and \(\widehat{{\underline{V}}}_{l,0}^{0}\), respectively, whose DOFs vanish on the mesh skeleton \({\mathcal {E}}_{l-1}\) of the coarser \((l-1)\)-th mesh level. Because \(\underline{\mathbb {v}}_l'\) has components in \(\underline{{\mathbb {V}}}_{l,0}^{0,T}\), the energy norm \(\Vert \underline{\mathbb {v}}_{l}'\Vert _{{A}_{0,l}^\epsilon }\) cannot be bounded by \(\Vert \underline{\mathbb {v}}_{l-1}\Vert _{{A}_{0,l-1}^\epsilon }\) independently of \(\epsilon ^{-1}\) when only the averaging operator is applied. We therefore stabilize the averaging operator with a local correction using discrete harmonic extensions [45]. The intergrid transfer operator \(\underline{{\mathcal {I}}}_{l-1}^l: \underline{{\mathbb {V}}}_{l-1,0}^0\rightarrow \underline{{\mathbb {V}}}_{l,0}^0\) is defined as follows:

$$\begin{aligned} \underline{{\mathcal {I}}}_{l-1}^{l}:=(id - {\underline{P}}^{T}_{{A}_{0,l}^\epsilon }){\underline{I}}_{l-1}^l, \end{aligned}$$
(22)

where id is the identity operator and the local projection \({\underline{P}}_{{A}_{0,l}^\epsilon }^{T}: \underline{{\mathbb {V}}}_{l,0}^0\rightarrow {\underline{{\mathbb {V}}}}_{l,0}^{0,T}\) satisfies

$$\begin{aligned} ({\underline{A}}_{0,l}^\epsilon {\underline{P}}_{{A}_{0,l}^{\epsilon }}^{T}{\underline{\mathbb {u}}}_l,\; {\underline{\mathbb {v}}}_l^T)_{0,l} = ({\underline{A}}_{0,l}^\epsilon {\underline{\mathbb {u}}}_l,\; {\underline{\mathbb {v}}}_l^T)_{0,l}, \quad \forall {\underline{\mathbb {u}}}_l\in \underline{{\mathbb {V}}}_{l,0}^0,\; {\underline{\mathbb {v}}}_l^T\in \underline{{\mathbb {V}}}_{l,0}^{0,T}, \end{aligned}$$
(23)

which is solved locally on the coarse mesh elements of \({\mathcal {T}}_{l-1}\). We further define the restriction operator \(\underline{{\mathcal {I}}}_l^{l-1}: \underline{{\mathbb {V}}}_{l,0}^0\rightarrow \underline{{\mathbb {V}}}_{l-1,0}^0\) as the transpose of \(\underline{{\mathcal {I}}}_{l-1}^l\) with respect to the inner products \((\cdot ,\;\cdot )_{0,l-1}\) and \((\cdot ,\;\cdot )_{0,l}\), satisfying

$$\begin{aligned} (\underline{{\mathcal {I}}}_{l}^{l-1}\underline{\mathbb {u}}_{l},\; \underline{\mathbb {v}}_{l-1})_{0,l-1} = (\underline{\mathbb {u}}_{l},\; \underline{{\mathcal {I}}}_{l-1}^l\underline{\mathbb {v}}_{l-1})_{0,l}, \quad \forall \underline{\mathbb {u}}_l\in \underline{{\mathbb {V}}}_{l,0}^0,\; \underline{\mathbb {v}}_{l-1}\in \underline{{\mathbb {V}}}_{l-1,0}^{0}. \end{aligned}$$
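
In terms of coefficient vectors, defining the restriction as the transpose of the prolongation with respect to two inner products fixes its matrix once the Gram matrices are known. The sketch below illustrates this under stated assumptions: `P` is a random placeholder for the matrix of the prolongation, and `M_fine`, `M_coarse` are hypothetical SPD Gram matrices standing in for \((\cdot ,\;\cdot )_{0,l}\) and \((\cdot ,\;\cdot )_{0,l-1}\).

```python
import numpy as np

def restriction_from_prolongation(P, M_fine, M_coarse):
    """Matrix of the adjoint of P for the inner products u^T M_fine v and u^T M_coarse v."""
    return np.linalg.solve(M_coarse, P.T @ M_fine)

rng = np.random.default_rng(2)
n_c, n_f = 3, 7
P = rng.standard_normal((n_f, n_c))            # placeholder prolongation matrix
M_fine = np.diag(rng.uniform(0.5, 1.5, n_f))   # placeholder Gram matrices (SPD)
M_coarse = np.diag(rng.uniform(0.5, 1.5, n_c))
R = restriction_from_prolongation(P, M_fine, M_coarse)

u_f = rng.standard_normal(n_f)
v_c = rng.standard_normal(n_c)
lhs = (R @ u_f) @ M_coarse @ v_c   # (restricted u_l, v_{l-1}) on the coarse level
rhs = u_f @ M_fine @ (P @ v_c)     # (u_l, prolonged v_{l-1}) on the fine level
```

When residuals are stored as dual vectors (load vectors) rather than as functions, the Gram matrices cancel and the restriction reduces to multiplication by `P.T`, which is the form typically used in an implementation.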

For the smoother for (20a), we employ the classical block smoother for H(div)-elliptic problems proposed by Arnold, Falk, and Winther [2] to address the discretely divergence-free kernel space of the operator \({\underline{A}}_{0,l}^\epsilon \). Specifically, we utilize the vertex-patched damped block Jacobi or block Gauss-Seidel smoother. It is worth noting that in three dimensions, edge-patch block smoothers can also be utilized to reduce memory usage [2]. For completeness, we present the formulation of the vertex-patch damped block Jacobi smoother here. We define \(\mathcal {S}_l\) as the set of vertices in the triangulation \({\mathcal {T}}_l\), and we further denote \({\mathcal {T}}_{l}^s\) as the subset of mesh elements and \({\mathcal {E}}_{l}^s\) as the subset of mesh skeletons meeting at the vertex \(s\in \mathcal {S}_l\) respectively, i.e.

$$\begin{aligned} {\mathcal {T}}_{l}^s:=&\bigcup \limits _{\begin{array}{c} K\in {\mathcal {T}}_l, \; s\in K \end{array}} K, \\ {\mathcal {E}}_{l}^s:=&\bigcup \limits _{\begin{array}{c} F\in {\mathcal {E}}_l, \; s\in F \end{array}} F. \end{aligned}$$

The lowest-order compound finite element space \(\underline{{\mathbb {V}}}_{l,0}^0\) is then decomposed into overlapping subspaces with support on \({\mathcal {T}}_{l}^s\) and \({\mathcal {E}}_{l}^s\) as follows:

$$\begin{aligned} \underline{{\mathbb {V}}}_{l,0}^0 = \sum _{s\in \mathcal {S}_l} \underline{{\mathbb {V}}}_{l, 0}^{0, s} := \sum _{s\in \mathcal {S}_l} \left\{ ({\underline{v}}_l^s, \widehat{{\underline{v}}}_l^s)\in \underline{{\mathbb {V}}}_{l,0}^0:\; \textrm{supp}\;{{\underline{v}}}_l^s\subset \textrm{interior}({\mathcal {T}}_{l}^s),\; \textrm{supp}\;\widehat{{\underline{v}}}_l^s\subset {\mathcal {E}}_{l}^s \right\} . \end{aligned}$$

Furthermore, we define \({\underline{P}}_{{A}_{0,l}^\epsilon }^s:\; \underline{{\mathbb {V}}}_{l,0}^0\rightarrow \underline{{\mathbb {V}}}_{l,0}^{0,s}\) as the local projection onto the subspace \(\underline{{\mathbb {V}}}_{l,0}^{0,s}\) with respect to the operator \({\underline{A}}_{0,l}^\epsilon \) that satisfies:

$$\begin{aligned} ({\underline{A}}_{0,l}^\epsilon {\underline{P}}_{{A}_{0,l}^\epsilon }^s\underline{\mathbb {u}}_l,\;\underline{\mathbb {v}}_{l}^s)_{0,l} = ({\underline{A}}_{0,l}^\epsilon \underline{\mathbb {u}}_l,\;\underline{\mathbb {v}}_{l}^s)_{0,l}, \quad \forall \underline{\mathbb {u}}_l \in \underline{{\mathbb {V}}}_{l,0}^0,\; \underline{\mathbb {v}}_{l}^s\in \underline{{\mathbb {V}}}_{l,0}^{0,s},\; \forall s\in \mathcal {S}_l. \end{aligned}$$

The damped block Jacobi smoother is then expressed as:

$$\begin{aligned} {\underline{R}}_l:= \varsigma \sum _{s\in \mathcal {S}_l}{\underline{P}}_{{A}_{0,l}^\epsilon }^s ({\underline{A}}_{0,l}^{\epsilon })^{-1}, \end{aligned}$$
(24)

where the damping parameter \(\varsigma > 0\) is small enough to ensure that the operator \((id - {\underline{R}}_l{\underline{A}}_{0,l}^\epsilon )\) is a positive definite contraction, and only depends on the bounded number of overlapping blocks [2]. We further define \({\underline{R}}_l^T\) as the transpose of \({\underline{R}}_l\) with respect to the inner product \((\cdot ,\;\cdot )_{0,l}\).
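
The algebraic form of the damped block Jacobi smoother (24) can be illustrated on a toy problem. The sketch below assumes a one-dimensional Laplacian matrix and hypothetical overlapping index patches \(\{s-1, s, s+1\}\) in place of the vertex patches of the mesh. It uses the identity \({\underline{P}}^s A^{-1} = E_s A_s^{-1} E_s^{T}\), where \(E_s\) is the inclusion of the patch DOFs and \(A_s\) the corresponding diagonal block.

```python
import numpy as np

def patch_block_jacobi(A, patches, damping):
    """Matrix of R = damping * sum_s E_s A_s^{-1} E_s^T over index patches."""
    R = np.zeros_like(A)
    for idx in patches:
        idx = np.asarray(idx)
        block = np.ix_(idx, idx)
        R[block] += np.linalg.inv(A[block])   # local solve on the patch
    return damping * R

n = 12
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # toy SPD operator
patches = [list(range(max(0, s - 1), min(n, s + 2))) for s in range(n)]

# The patches can be split into 4 groups of mutually A-orthogonal subspaces,
# so the sum of the patch projections has spectrum in (0, 4]; damping 1/4
# keeps the spectrum of R A inside (0, 1].
R = patch_block_jacobi(A, patches, damping=0.25)
eigs = np.linalg.eigvals(R @ A).real
```

With this damping, the eigenvalues of \(id - RA\) lie in \([0, 1)\), which mirrors the contraction requirement stated for \(\varsigma \); the admissible damping depends only on how many patches overlap, not on the matrix size.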

Now we are ready to present the W-cycle and variable V-cycle multigrid algorithms for the linear system \({\underline{A}}_{0,l}^\epsilon \underline{\mathbb {u}}_l = \underline{\mathbb {g}}_l\in \underline{{\mathbb {V}}}_{l,0}^0\) in Algorithm 1. Schöberl’s multigrid theory [45, 46], originally introduced for the \(\textrm{P}^2\)-\(\textrm{P}^0\) discretization on triangles, can be applied to the CR discretization in two and three dimensions and requires the full elliptic regularity result (3). The proofs of optimality of the W-cycle and variable V-cycle multigrid methods for the CR discretization with pressure-robust treatment (13) are essentially the same as in our previous work [22], where we used Schöberl’s theory for the CR discretization without pressure-robust treatment, and we refer to [22, Remark 4.4] for more details. We note that the only difference in the left-hand-side operator before and after the pressure-robust treatment lies in the mass term, and that the \(L^2\) norm of \({\underline{\Pi }}^{RT}{\underline{v}}_h^{CR}\) is bounded by \(\Vert {\underline{v}}_h^{CR}\Vert _0\) for any \({\underline{v}}_h^{CR}\in {\underline{V}}_{h}^{CR}\). Here we quote the optimality result from [46, Theorem 3.7]: for the lowest-order case, the W-cycle multigrid is a convergent iterative method when the number of smoothing steps is large enough, and the variable V-cycle multigrid is a preconditioner, with both methods robust with respect to the mesh size and the penalty parameter \(\epsilon \).

Algorithm 1

The h-multigrid algorithm at lowest order.

Theorem 3.2

[Theorem 3.7 of [46]] The geometric h-multigrid procedure defined in Algorithm 1 has the following properties:

  • The W-cycle multigrid algorithm is a robust convergent method. Specifically, there exist positive constants \(m_*\) and C, independent of the mesh size \(h_l\) and the penalty parameter \(\epsilon \), such that with \(q = 2\) and \(m(1)= \cdots =m(l) = m\) in Algorithm 1, we have

    $$\begin{aligned} \Vert \underline{{\mathbb {E}}}_{l,m}\underline{\mathbb {v}}_l\Vert _{{\underline{A}}_{0,l}^\epsilon } \le C m^{-1/4} \Vert \underline{\mathbb {v}}_l\Vert _{{\underline{A}}_{0,l}^\epsilon }, \quad \forall \underline{\mathbb {v}}_l\in \underline{{\mathbb {V}}}_{l,0}^0,\; l\ge 1,\; m\ge m_*, \end{aligned}$$

    where \({\underline{{\mathbb {E}}}}_{l,m}: \underline{{\mathbb {V}}}_{l,0}^0\rightarrow \underline{{\mathbb {V}}}_{l,0}^0\) is the operator relating the initial error and the final error of the W-cycle multigrid algorithm, i.e.,

    $$\begin{aligned} {\underline{{\mathbb {E}}}}_{l,m}(\underline{\mathbb {u}}_l-\underline{\mathbb {u}}_l^{(0)}):= \underline{\mathbb {u}}_l - MG_h(l, \underline{\mathbb {g}}_l, \underline{\mathbb {u}}_l^{(0)}, m, 2). \end{aligned}$$
  • The variable V-cycle algorithm is a robust preconditioner. Specifically, with \(q=1\) and \(\gamma _0 m(l) \le m(l-1) \le \gamma _1 m(l)\) in Algorithm 1 (\(1<\gamma _0<\gamma _1\)), there exists a positive constant C independent of the mesh size \(h_l\) and the penalty parameter \(\epsilon \) such that

    $$\begin{aligned} \kappa (\underline{{\mathbb {B}}}_{l,m(l)}{\underline{A}}_{0,l}^\epsilon ) \le 1 + C m(l)^{-1/4}, \end{aligned}$$

    where \(\kappa \) is the condition number, and \({\underline{{\mathbb {B}}}}_{l,m(l)}: \underline{{\mathbb {V}}}_{l,0}^0\rightarrow \underline{{\mathbb {V}}}_{l,0}^0\) is the preconditioning operator relating the residual to the correction of the variable V-cycle multigrid algorithm with a zero initial guess, i.e.,

    $$\begin{aligned} \underline{{\mathbb {B}}}_{l,m(l)}\underline{\mathbb {v}}_l:= MG_h(l, \underline{\mathbb {v}}_l, 0, m(l),1), \quad \forall \underline{\mathbb {v}}_l\in \underline{{\mathbb {V}}}_{l,0}^0. \end{aligned}$$
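
The role of the parameters \(m\) and \(q\) in Theorem 3.2 can be illustrated with a generic textbook multigrid cycle whose call pattern mimics \(MG_h(l, \underline{\mathbb {g}}_l, \underline{\mathbb {u}}_l^{(0)}, m, q)\). The sketch below is not a transcription of Algorithm 1: it assumes a one-dimensional Poisson matrix, linear interpolation with Galerkin coarse operators in place of the transfer operator (22), and damped point Jacobi in place of the vertex-patch smoother. The argument `growth` is a hypothetical knob that multiplies the number of smoothing steps on each coarser level, giving a variable V-cycle with \(m(l-1) = \texttt{growth}\cdot m(l)\).

```python
import numpy as np

def prolongation(n_coarse):
    """Linear interpolation from n_coarse to 2 * n_coarse + 1 interior nodes."""
    P = np.zeros((2 * n_coarse + 1, n_coarse))
    for j in range(n_coarse):
        P[2 * j : 2 * j + 3, j] = [0.5, 1.0, 0.5]
    return P

def build_hierarchy(J):
    """Levels 1..J of a 1D Poisson problem with Galerkin coarse operators."""
    n = 2 ** (J + 1) - 1
    h = 1.0 / (n + 1)
    A = {J: (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2}
    P = {}
    for l in range(J, 1, -1):
        P[l] = prolongation((A[l].shape[0] - 1) // 2)
        A[l - 1] = P[l].T @ A[l] @ P[l]
    return A, P

def smooth(A, g, u, steps, omega=2.0 / 3.0):
    """Damped Jacobi sweeps (stand-in for the patch smoother)."""
    d = np.diag(A)
    for _ in range(steps):
        u = u + omega * (g - A @ u) / d
    return u

def mg(A, P, l, g, u0, m, q, growth=1):
    """One cycle on level l: q = 1 gives a V-cycle, q = 2 a W-cycle."""
    if l == 1:
        return np.linalg.solve(A[1], g)          # exact solve on the coarsest level
    u = smooth(A[l], g, u0, m)                   # pre-smoothing
    r = P[l].T @ (g - A[l] @ u)                  # restricted residual
    e = np.zeros(A[l - 1].shape[0])
    for _ in range(q):                           # q recursive coarse-level cycles
        e = mg(A, P, l - 1, r, e, growth * m, q, growth)
    u = u + P[l] @ e                             # coarse-grid correction
    return smooth(A[l], g, u, m)                 # post-smoothing

J = 5
A, P = build_hierarchy(J)
rng = np.random.default_rng(3)
u_exact = rng.standard_normal(A[J].shape[0])
g = A[J] @ u_exact
u0 = np.zeros_like(u_exact)

def a_norm(v):
    return np.sqrt(v @ A[J] @ v)

u_w = mg(A, P, J, g, u0, m=2, q=2)             # W-cycle, fixed smoothing steps
u_v = mg(A, P, J, g, u0, m=1, q=1, growth=2)   # variable V-cycle
```

Used as a solver, the W-cycle is applied repeatedly to the current iterate; used as a preconditioner, the variable V-cycle is applied once to a residual with zero initial guess, as in the definition of \(\underline{{\mathbb {B}}}_{l,m(l)}\).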

3.5 hp-multigrid Algorithm for the Higher-Order Scheme

We now consider the augmented Lagrangian Uzawa iteration (8) for the condensed higher-order H(div)-HDG scheme with \(k\ge 1\). We present an hp-multigrid algorithm to precondition the primal operator \({\underline{A}}_{k,h}^\epsilon \) on the finest mesh \({\mathcal {T}}_h={\mathcal {T}}_J\). Specifically, we define a "two-level" nested ASP for the system \({\underline{A}}_{k,h}^\epsilon \underline{\mathbb {u}}_h^\partial =\underline{\mathbb {g}}_h^\partial \in \underline{{\mathbb {V}}}_{h,0}^{k,\partial }\), where we use the lowest-order velocity space \(\underline{{\mathbb {V}}}_{h,0}^{0}\) as the auxiliary space and \(MG_h\) in Algorithm 1 as the inexact auxiliary space solver.

Two components are needed to complete the construction of the multiplicative ASP: (i) a prolongation operator to transfer functions from the lowest-order velocity space \(\underline{{\mathbb {V}}}_{h,0}^0\) to the higher-order global velocity space \(\underline{{\mathbb {V}}}_{h,0}^{k,\partial }\), and (ii) a relaxation method to reduce the errors in the high-order spaces. For the prolongation operator, since we have the natural inclusion \(\underline{{\mathbb {V}}}_{h,0}^0\subset \underline{{\mathbb {V}}}_{h,0}^{k,\partial }\), we directly use the identity operator denoted by \({\underline{\Pi }}_{0}^k\). The transpose of this operator with respect to \((\cdot ,\;\cdot )_{0,h}\) is the \(L^2\)-projection operator \({\underline{\Pi }}_{k}^0\). For the relaxation method, we again need to take care of the divergence-free kernel in \(\underline{{\mathbb {V}}}_{h,0}^{k,\partial }\), and we use a vertex-patched block Jacobi/Gauss-Seidel method. On the finest mesh level \({\mathcal {T}}_h\) with the set of vertices \(\mathcal {S}_h\), for any vertex \(s\in \mathcal {S}_h\), we denote by \({\mathcal {T}}_{h}^s\) the subset of mesh elements and by \({\mathcal {E}}_{h}^s\) the subset of facets meeting at the vertex s, i.e.,

$$\begin{aligned} {\mathcal {T}}_{h}^s:=&\bigcup \limits _{\begin{array}{c} K\in {\mathcal {T}}_h, \; s\in K \end{array}} K, \\ {\mathcal {E}}_{h}^s:=&\bigcup \limits _{\begin{array}{c} F\in {\mathcal {E}}_h, \; s\in F \end{array}} F. \end{aligned}$$

The high-order compound global finite element space \(\underline{{\mathbb {V}}}_{h,0}^{k,\partial }\) is then decomposed into overlapping subspaces with support on \({\mathcal {T}}_{h}^s\) and \({\mathcal {E}}_{h}^s\) as follows:

$$\begin{aligned} \underline{{\mathbb {V}}}_{h,0}^{k,\partial } = \sum _{s\in \mathcal {S}_h} \underline{{\mathbb {V}}}_{h,0}^{k,\partial ,s} := \sum _{s\in \mathcal {S}_h} \left\{ ({\underline{v}}_h^s, \widehat{{\underline{v}}}_h^s)\in \underline{{\mathbb {V}}}_{h,0}^{k,\partial }:\; \textrm{supp}\;{{\underline{v}}}_h^s\subset \textrm{interior}({\mathcal {T}}_{h}^s),\; \textrm{supp}\;\widehat{{\underline{v}}}_h^s\subset {\mathcal {E}}_{h}^s \right\} . \end{aligned}$$

Furthermore, we denote by \({\underline{P}}^s_{A_{k,h}^\epsilon }:\underline{{\mathbb {V}}}_{h,0}^{k,\partial }\rightarrow \underline{{\mathbb {V}}}_{h,0}^{k,\partial ,s}\) the \({\underline{A}}_{k,h}^\epsilon \)-orthogonal projection onto the vertex-patched subspace \(\underline{{\mathbb {V}}}_{h,0}^{k,\partial ,s}\), which satisfies:

$$\begin{aligned} ({\underline{A}}_{k,h}^\epsilon {\underline{P}}_{{A}_{k,h}^\epsilon }^s\underline{\mathbb {u}}_h,\;\underline{\mathbb {v}}_{h}^s)_{0,h} = ({\underline{A}}_{k,h}^\epsilon \underline{\mathbb {u}}_h,\;\underline{\mathbb {v}}_{h}^s)_{0,h}, \quad \forall \underline{\mathbb {u}}_h \in \underline{{\mathbb {V}}}_{h,0}^{k,\partial },\; \underline{\mathbb {v}}_{h}^s\in \underline{{\mathbb {V}}}_{h,0}^{k,\partial ,s},\; \forall s\in \mathcal {S}_h. \end{aligned}$$

Then we get \({\underline{R}}_k^\partial \) as the corresponding damped block Jacobi smoother

$$\begin{aligned} {\underline{R}}_k^\partial := \varsigma '\sum _{s\in {\mathcal {S}}_h}{\underline{P}}_{A_{k,h}^\epsilon }^s ({\underline{A}}_{k,h}^\epsilon )^{-1}, \end{aligned}$$

where \(\varsigma '>0\) is the damping parameter, and \({\underline{R}}_k^{\partial , T}\) as the transpose of \({\underline{R}}_k^\partial \) with respect to the inner product \((\cdot ,\;\cdot )_{0,h}\). Now we present the hp-multigrid method for \({\underline{A}}_{k,h}^\epsilon \underline{\mathbb {u}}_h^\partial =\underline{\mathbb {g}}_h^\partial \) in Algorithm 2.

Algorithm 2

The hp-multigrid algorithm for \({\underline{A}}_{k,h}^\epsilon \underline{\mathbb {u}}_h^\partial =\underline{\mathbb {g}}_h^\partial \).

Remark 3.4

[On the hp-multigrid method] Our goal is for the iteration counts of the Krylov space solver preconditioned by the hp-multigrid method to remain bounded, or to grow only mildly, as the polynomial order increases. Our hp-multigrid algorithm can be seen as a Schwarz-type method [53], utilizing a space decomposition given by

$$\begin{aligned} \underline{{\mathbb {V}}}_{h,0}^{k,\partial } = \underline{{\mathbb {V}}}_{h,0}^{0} + \sum _{s\in {\mathcal {S}}_h}\underline{{\mathbb {V}}}_{h,0}^{k,\partial ,s}. \end{aligned}$$
(25)

Based on the same subspace decomposition (lowest-order space \(+\) vertex-patched subspaces), Pavarino [38] introduced a p-robust additive Schwarz preconditioner for the continuous finite element space for Poisson’s equation and proved its robustness, and Brubeck and Farrell [13] further scaled this p-robust preconditioner to very high polynomial degrees. For a nearly singular system such as (8), where \(\epsilon \) approaches 0, Lee et al. [30] presented a general framework and proved that the method of successive subspace corrections (MSSC) converges provided that the exact/inexact solver on each subspace is a robust contraction and that the kernel of the divergence operator can be decomposed into a sum of elements of the subspaces; the resulting convergence rate is robust with respect to the mesh size and the parameter \(\epsilon \). Our numerical experiments show that the hp-multigrid in Algorithm 2 is robust with respect to the mesh size and the parameter \(\epsilon \), and that the preconditioned Krylov space solver exhibits only a very mild increase in iteration counts as the polynomial order increases.

We acknowledge that our current hp-multigrid method is multiplicative and successive between the lowest-order space \(\underline{{\mathbb {V}}}_{h,0}^{0}\) and the vertex-patched subspaces \(\sum _{s\in {\mathcal {S}}_h}\underline{{\mathbb {V}}}_{h,0}^{k,\partial ,s}\), whereas the ASP introduced in [54] is additive and parallel. However, in our numerical experiments we observed drastic increases in iteration counts as \(\epsilon \rightarrow 0\) and as the polynomial order increases when an additive version of the hp-multigrid is used as the preconditioner. Therefore, a parallel and robust hp-multigrid for the operator \({\underline{A}}_{k,h}^\epsilon \), along with its theoretical analysis, is worth pursuing and will be the focus of our future research.
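
The multiplicative structure discussed in this remark (relaxation on the high-order space combined with a correction from the lowest-order auxiliary space) can be illustrated by a generic symmetric two-level preconditioner. The sketch below is an algebraic toy under stated assumptions, not Algorithm 2: `A` is a random SPD placeholder, `P` is a hypothetical inclusion matrix of a three-dimensional auxiliary space, the auxiliary problem is solved exactly rather than by \(MG_h\), and the relaxation uses two non-overlapping blocks rather than vertex patches.

```python
import numpy as np

def two_level_apply(A, R, P, g):
    """Symmetric multiplicative two-level step: relax, auxiliary correction, relax."""
    A_c = P.T @ A @ P                                     # auxiliary-space operator
    u = R @ g                                             # pre-relaxation, zero initial guess
    u = u + P @ np.linalg.solve(A_c, P.T @ (g - A @ u))   # auxiliary-space correction
    u = u + R.T @ (g - A @ u)                             # post-relaxation with the transpose
    return u

rng = np.random.default_rng(4)
n, n_c = 8, 3
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)                # placeholder SPD operator
P = np.eye(n)[:, :n_c]                     # inclusion of the auxiliary space

blocks = [np.arange(0, 4), np.arange(4, 8)]
R = np.zeros((n, n))
for idx in blocks:
    R[np.ix_(idx, idx)] = np.linalg.inv(A[np.ix_(idx, idx)])
R *= 0.5                                   # damping 1 / (number of blocks)

# Assemble the preconditioner column by column and inspect the spectrum of B A.
B = np.column_stack([two_level_apply(A, R, P, e) for e in np.eye(n)])
eigs = np.linalg.eigvals(B @ A).real
```

The error propagation of this step is the product \((id - R^{T}A)(id - P A_c^{-1}P^{T}A)(id - RA)\), so the preconditioner is symmetric and the spectrum of \(BA\) lies in \((0, 1]\). An additive variant would instead sum the relaxation and the auxiliary correction applied to the same residual.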

4 Navier–Stokes Equation

4.1 Model Problem and the H(div)-HDG Scheme

The model problem is to find \(({\underline{u}}, p)\) satisfying

$$\begin{aligned} \beta {\underline{u}} - \nabla \cdot (\nu \nabla {\underline{u}}) + {\underline{u}}\cdot \nabla {\underline{u}} + \nabla p =&\;{\underline{f}}, \quad{} & {} \text {in }{\Omega ,} \end{aligned}$$
(26a)
$$\begin{aligned} \nabla \cdot {\underline{u}} =&\; 0, \quad{} & {} \text {in }{\Omega ,} \end{aligned}$$
(26b)
$$\begin{aligned} {\underline{u}}=&\;{\underline{0}}, \quad{} & {} \text {on }\partial \Omega , \end{aligned}$$
(26c)

with notations the same as in (1). We introduce the tensor \(\underline{\underline{L}}:= - \nu \nabla {\underline{u}}\) as a new variable and rewrite (26) into a first-order system:

$$\begin{aligned} \nu ^{-1}\underline{\underline{L}} + \nabla {\underline{u}} =&\; 0, \quad{} & {} \text {in }\Omega , \end{aligned}$$
(27a)
$$\begin{aligned} \beta {\underline{u}} + \nabla \cdot \underline{\underline{L}} + {\underline{u}}\cdot \nabla {\underline{u}} + \nabla p =&\; {\underline{f}}, \quad{} & {} \text {in }{\Omega ,} \end{aligned}$$
(27b)
$$\begin{aligned} \nabla \cdot {\underline{u}} =&\; 0, \quad{} & {} \text {in }{\Omega ,} \end{aligned}$$
(27c)
$$\begin{aligned} {\underline{u}} =&\; {\underline{0}}, \quad{} & {} \text {on }{\partial \Omega .} \end{aligned}$$
(27d)

For the nonlinear convection term \({\underline{u}}\cdot \nabla {\underline{u}}\) in the Navier–Stokes equations, we use the natural upwind discretization which needs no additional stabilization and leads to minimal numerical dissipation [31, Chapter 2]. The trilinear form for the H(div)-HDG discretization for the convection term is defined as

$$\begin{aligned} \underline{{\mathcal {C}}}_h({\underline{w}}_h;\;\underline{\mathbb {u}}_h,\; \underline{\mathbb {v}}_h):=&\sum _{K\in {\mathcal {T}}_h}- ({\underline{w}}_h \otimes {\underline{u}}_h,\; \nabla {\underline{v}}_h)_{K} +\langle ({\underline{w}}_h\cdot {\underline{n}}_K){\underline{u}}_h^{up},\; \textsf{tng}({\underline{v}}_h - \widehat{{\underline{v}}}_h)\rangle _{\partial K}, \end{aligned}$$

for all \({\underline{w}}_h\in {\underline{V}}_{h,0}^{k}\) and \(\underline{\mathbb {u}}_h,\underline{\mathbb {v}}_h\in \underline{{\mathbb {V}}}_{h,0}^{k}\), where

$$\begin{aligned} {\underline{u}}_h^{up}:=&\textsf{nrm}({\underline{u}}_h) + \left\{ \begin{array}{ll} \textsf{tng}({\underline{u}}_h),&{} \text { if }\ {\underline{u}}_h\cdot {\underline{n}}_K > 0, \\ \textsf{tng}(\widehat{{\underline{u}}}_h),&{} \text { if }\ {\underline{u}}_h\cdot {\underline{n}}_K < 0. \end{array} \right. \end{aligned}$$
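The upwind choice can be illustrated at a single facet point. The following is a minimal sketch, not the implementation used for the experiments: it assumes \(\textsf{nrm}({\underline{v}}) = ({\underline{v}}\cdot {\underline{n}}_K){\underline{n}}_K\) and \(\textsf{tng}({\underline{v}}) = {\underline{v}} - \textsf{nrm}({\underline{v}})\), and it takes the switch on the sign of the normal component of the convecting velocity \({\underline{w}}_h\), which coincides with \({\underline{u}}_h\cdot {\underline{n}}_K\) in the nonlinear form where \({\underline{w}}_h = {\underline{u}}_h\). The function name `upwind_trace` is hypothetical.

```python
import numpy as np


def upwind_trace(w, u, u_hat, n):
    """Upwind trace at one facet point (illustrative helper, not from the paper).

    w     : convecting velocity at the point
    u     : element-interior velocity trace
    u_hat : facet (hybrid) velocity unknown
    n     : outward normal of the element K
    """
    n = n / np.linalg.norm(n)          # make sure the normal has unit length

    def nrm(v):                        # assumed definition of nrm(.)
        return (v @ n) * n

    def tng(v):                        # assumed definition of tng(.)
        return v - (v @ n) * n

    # Outflow (w.n > 0): tangential part comes from the element interior.
    # Otherwise: tangential part comes from the facet unknown.
    # If w.n == 0 the convective flux (w.n) u^up vanishes, so the choice is immaterial.
    return nrm(u) + (tng(u) if w @ n > 0 else tng(u_hat))
```

The normal component is always taken from the element trace, so only the tangential part of the flux is upwinded.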

4.2 Linearization and Iterative Solving Procedure

We linearize the convection term by Picard or Newton’s method and then iteratively solve the resulting linearized H(div)-HDG scheme. Given \(\underline{\mathbb {u}}_{h}^{(n-1)}\in \underline{{\mathbb {V}}}_{h,0}^k\) at the previous step, when Picard iteration is used, the linearized convection term at the n-th step is

$$\begin{aligned} \underline{{\mathcal {C}}}_h^l({\underline{u}}_h^{(n-1)};\;\underline{\mathbb {u}}_h^{(n)},\; \underline{\mathbb {v}}_h):= \underline{{\mathcal {C}}}_h({\underline{u}}_h^{(n-1)};\;\underline{\mathbb {u}}_h^{(n)},\; \underline{\mathbb {v}}_h), \end{aligned}$$

where \(\underline{\mathbb {u}}_h^{(n)}\in \underline{{\mathbb {V}}}_{h,0}^k\) is the velocity solution to be found at the n-th step. When Newton iteration is used, the linearized convection term at the n-th step becomes

$$\begin{aligned} \underline{{\mathcal {C}}}_h^l({\underline{u}}_h^{(n-1)};\;\delta \underline{\mathbb {u}}_h^{(n)},\; \underline{\mathbb {v}}_h):= \underline{{\mathcal {C}}}_h({\underline{u}}_h^{(n-1)};\;\delta \underline{\mathbb {u}}_h^{(n)},\; \underline{\mathbb {v}}_h) + \underline{{\mathcal {C}}}_h(\delta {\underline{u}}_h^{(n)};\;\underline{\mathbb {u}}_h^{(n-1)},\; \underline{\mathbb {v}}_h), \end{aligned}$$

where \(\delta \underline{\mathbb {u}}_h^{(n)}\in \underline{{\mathbb {V}}}_{h,0}^k\) is the change to the velocity at the n-th step.
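The origin of the two terms in the Newton linearization can be seen from a formal expansion. The following is only a sketch: it writes \(\underline{\mathbb {u}}_h^{(n)} = \underline{\mathbb {u}}_h^{(n-1)} + \delta \underline{\mathbb {u}}_h^{(n)}\) and assumes that the upwind direction in \({\underline{u}}_h^{up}\) is frozen, so that \(\underline{{\mathcal {C}}}_h\) can be treated as linear in each of its arguments.

$$\begin{aligned} \underline{{\mathcal {C}}}_h({\underline{u}}_h^{(n)};\;\underline{\mathbb {u}}_h^{(n)},\; \underline{\mathbb {v}}_h) =&\; \underline{{\mathcal {C}}}_h({\underline{u}}_h^{(n-1)};\;\underline{\mathbb {u}}_h^{(n-1)},\; \underline{\mathbb {v}}_h) + \underline{{\mathcal {C}}}_h({\underline{u}}_h^{(n-1)};\;\delta \underline{\mathbb {u}}_h^{(n)},\; \underline{\mathbb {v}}_h) \\&+ \underline{{\mathcal {C}}}_h(\delta {\underline{u}}_h^{(n)};\;\underline{\mathbb {u}}_h^{(n-1)},\; \underline{\mathbb {v}}_h) + \underline{{\mathcal {C}}}_h(\delta {\underline{u}}_h^{(n)};\;\delta \underline{\mathbb {u}}_h^{(n)},\; \underline{\mathbb {v}}_h). \end{aligned}$$

Dropping the last term, which is quadratic in the update, leaves the second and third terms as the linearized operator acting on \(\delta \underline{\mathbb {u}}_h^{(n)}\), while the first term is known from the previous step and enters the residual on the right-hand side. The Picard linearization instead freezes only the first argument at \({\underline{u}}_h^{(n-1)}\).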

Then, given the solution \((\underline{\underline{L}}_h^{(n-1)}, \underline{\mathbb {u}}_h^{(n-1)}, p_h^{(n-1)})\) at the previous step, the linearized H(div)-HDG scheme is to find \((\underline{\underline{L}}_h, \underline{\mathbb {u}}_h, p_h) \in \underline{\underline{W}}_h^k \times \underline{{\mathbb {V}}}_{h,0}^k \times W_{h,0}^{k}\), \(k \ge 0\), such that

$$\begin{aligned} (\nu ^{-1}\underline{\underline{L}}_h,\; \underline{\underline{G}}_h)_{{\mathcal {T}}_h} + (\nabla {\underline{u}}_h,\; \underline{\underline{G}}_h)_{{\mathcal {T}}_h} - \langle \textsf{tng}({\underline{u}}_h - \widehat{{\underline{u}}}_h),\; \underline{\underline{G}}_h{\underline{n}}_K\rangle _{\partial {\mathcal {T}}_h}&= 0, \end{aligned}$$
(28a)
$$\begin{aligned} (\beta {\underline{u}}_h,\; {\underline{v}}_h)_{{\mathcal {T}}_h} -( \underline{\underline{L}}_h,\; \nabla {\underline{v}}_h )_{{\mathcal {T}}_h} + \langle \underline{\underline{L}}_h{\underline{n}}_K, \textsf{tng}({\underline{v}}_h - \widehat{{\underline{v}}}_h) \rangle _{\partial {\mathcal {T}}_h}&\nonumber \\ + \underline{{\mathcal {C}}}_h^l({\underline{u}}_h^{(n-1)};\;\underline{\mathbb {u}}_h,\; \underline{\mathbb {v}}_h) - (p_h, \nabla \cdot {\underline{v}}_h)_{{\mathcal {T}}_h}&= ({\underline{f}}',\; {\underline{v}}_h)_{{\mathcal {T}}_h}, \end{aligned}$$
(28b)
$$\begin{aligned} (\nabla \cdot {\underline{u}}_h,\; q_h)_{{\mathcal {T}}_h}&= 0, \end{aligned}$$
(28c)

for all \((\underline{\underline{G}}_h, \underline{\mathbb {v}}_h, q_h) \in \underline{\underline{W}}_h^k \times \underline{{\mathbb {V}}}_{h,0}^k \times W_{h,0}^{k}\), where for the Picard method,

$$\begin{aligned} ({\underline{f}}',\; {\underline{v}}_h)_{{\mathcal {T}}_h}:= ({\underline{f}},\; {\underline{v}}_h)_{{\mathcal {T}}_h}, \end{aligned}$$

and for the Newton’s method,

$$\begin{aligned} ({\underline{f}}',\; {\underline{v}}_h)_{{\mathcal {T}}_h} :=&({\underline{f}},\; {\underline{v}}_h)_{{\mathcal {T}}_h} -(\nu ^{-1}\underline{\underline{L}}_h^{(n-1)},\; \underline{\underline{G}}_h)_{{\mathcal {T}}_h} -(\nabla {\underline{u}}_h^{(n-1)},\; \underline{\underline{G}}_h)_{{\mathcal {T}}_h}\\&+ \langle \textsf{tng}({\underline{u}}_h^{(n-1)} - \widehat{{\underline{u}}}_h^{(n-1)}),\; \underline{\underline{G}}_h{\underline{n}}_K\rangle _{\partial {\mathcal {T}}_h}\\&-(\beta {\underline{u}}_h^{(n-1)},\; {\underline{v}}_h)_{{\mathcal {T}}_h} +( \underline{\underline{L}}_h^{(n-1)},\; \nabla {\underline{v}}_h )_{{\mathcal {T}}_h} - \langle \underline{\underline{L}}_h^{(n-1)}{\underline{n}}_K, \textsf{tng}({\underline{v}}_h - \widehat{{\underline{v}}}_h) \rangle _{\partial {\mathcal {T}}_h} \\ {}&- \underline{{\mathcal {C}}}_h^l({\underline{u}}_h^{(n-1)};\;\underline{\mathbb {u}}_h^{(n-1)},\; \underline{\mathbb {v}}_h) + (p_h^{(n-1)}, \nabla \cdot {\underline{v}}_h)_{{\mathcal {T}}_h}. \end{aligned}$$

We write the condensed global system of the linearized H(div)-HDG scheme (28) in operator form: find \((\underline{\mathbb {u}}_h^\partial , p_h^{\partial })\in \underline{{\mathbb {V}}}_{h,0}^{k,\partial }\times W_{h,0}^{\partial }\) satisfying

$$\begin{aligned} {\underline{A}}_{k,h}^c\underline{\mathbb {u}}_h^\partial + {\underline{B}}_{k,h}^*p_h^\partial =\;&{\underline{F}}_{k,h}^c, \end{aligned}$$
(29a)
$$\begin{aligned} {\underline{B}}_{k,h} \underline{\mathbb {u}}_h^\partial =\;&0, \end{aligned}$$
(29b)

where the operators \({\underline{B}}_{k,h}\) and \({\underline{B}}_{k,h}^*\) are identical to those in (6). The operator \({\underline{A}}_{k,h}^c\) is obtained by augmenting \({\underline{A}}_{k,h}\) with a condensed operator from the linearized convection term \(\underline{{\mathcal {C}}}_h^l\). To solve the resulting saddle-point system, we employ an augmented Lagrangian Uzawa iteration, similar to the approach used for the generalized Stokes equation: starting from \(p_h^{\partial (0)} = 0\), iteratively find the solution \((\underline{\mathbb {u}}_h^{\partial (n)},p_h^{\partial (n)})\in \underline{{\mathbb {V}}}_{h,0}^{k,\partial }\times W_{h,0}^{\partial }\) that satisfies

$$\begin{aligned} {\underline{A}}_{k,h}^{c, \epsilon } \underline{\mathbb {u}}_h^{\partial \,(n)} =&\;{\underline{F}}_{k,h}^c - {\underline{B}}_{k,h}^*p_h^{\partial \,(n-1)}, \end{aligned}$$
(30a)
$$\begin{aligned} p_h^{\partial \, (n)} =&\; p_h^{\partial \, (n-1)}- \epsilon ^{-1} {\underline{B}}_{k,h} \underline{\mathbb {u}}_h^{\partial \, (n)}, \end{aligned}$$
(30b)

where

$$\begin{aligned} {\underline{A}}_{k,h}^{c, \epsilon }:= {\underline{A}}_{k,h}^c + \epsilon ^{-1}{\underline{B}}_{k,h}^*{\underline{B}}_{k,h}, \end{aligned}$$

with the penalty parameter \(\epsilon ^{-1}\gg 1\).
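The structure of the augmented Lagrangian Uzawa iteration can be sketched on a small dense saddle-point system with NumPy. This is an illustration under stated assumptions, not the condensed HDG operators: the system is written as \(A u - B^{T} p = f\), \(B u = 0\) with a generic matrix \(B\) playing the role of a discrete divergence, so that the multiplier update carries the same minus sign as in (30b). How this sign convention maps onto \({\underline{B}}_{k,h}^*\) depends on the definition in (6). The matrices, the function names, and the direct solve standing in for the preconditioned Krylov solver are all hypothetical.

```python
import numpy as np


def al_uzawa(A, B, f, eps_inv=1e6, steps=2):
    """Augmented Lagrangian Uzawa iteration for A u - B^T p = f, B u = 0."""
    A_eps = A + eps_inv * (B.T @ B)      # primal operator with the penalty term
    p = np.zeros(B.shape[0])             # p^(0) = 0
    for _ in range(steps):
        # Primal update; a direct solve stands in for the preconditioned solver.
        u = np.linalg.solve(A_eps, f + B.T @ p)
        # Multiplier update with the penalty parameter eps^{-1}.
        p = p - eps_inv * (B @ u)
    return u, p


def demo_al_uzawa():
    """Compare two Uzawa steps with a direct solve of the full saddle-point system."""
    rng = np.random.default_rng(0)
    n, m = 8, 3
    M = rng.standard_normal((n, n))
    A = M @ M.T + n * np.eye(n)          # synthetic symmetric positive definite block
    B = rng.standard_normal((m, n))      # synthetic full-rank constraint matrix
    f = rng.standard_normal(n)
    u, p = al_uzawa(A, B, f)
    K = np.block([[A, -B.T], [B, np.zeros((m, m))]])
    ref = np.linalg.solve(K, np.concatenate([f, np.zeros(m)]))
    return u, p, ref[:n], ref[n:]
```

In this setting the multiplier error is reduced by a factor of order \(\epsilon \) in each step, which is why a small number of steps suffices when \(\epsilon ^{-1}\) is large. The price is that the primal operator becomes nearly singular, which is what the hp-multigrid preconditioner is designed to handle.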

When the viscosity \(\nu \) approaches zero, the convection term in the Navier–Stokes equations dominates over the diffusion term, and this dominance causes the linear system to become more ill-conditioned, making it challenging to find a robust preconditioner for Krylov subspace solvers such as GMRes [44]. Moreover, the non-symmetry of the H(div)-HDG scheme further complicates the analysis of the preconditioner’s robustness. However, in a study by Benzi and Olshanskii [6], Schöberl’s geometric multigrid method [45, 46] was applied to precondition the primal operator of the augmented Lagrangian formulation of the two-dimensional Oseen problem discretized by \(\textrm{iso}\textrm{P}^2\)-\(\textrm{P}^0\) and \(\textrm{iso}\textrm{P}^2\)-\(\textrm{P}^1\) elements with streamline-upwind Petrov-Galerkin (SUPG) stabilization. Numerical experiments demonstrate that this approach is essentially robust with respect to the Reynolds number. Subsequently, Farrell et al. [19] extended this approach to the three-dimensional Newton linearized Navier–Stokes equations, which is discretized by a \(\textrm{P}^1\)-\(\textrm{P}^0\) pair with velocity space enriched by facet bubbles and with SUPG stabilization. Numerical experiments support the preconditioner’s robustness with respect to the Reynolds number. Motivated by these results, we adapt the hp-multigrid algorithm developed for the generalized Stokes equation in Algorithm 2 to precondition the operator \({\underline{A}}_{k,h}^{c, \epsilon }\) of the linearized Navier–Stokes equations in (30a). We employ the same block relaxation method, block smoothing method, and intergrid transfer operators with discrete harmonic extensions.

The Newton iteration converges quadratically, but it requires a sufficiently accurate initial guess. To address this, and to test the performance of the proposed hp-multigrid for both the Picard and the Newton linearization, we first use the Picard iteration to solve the Navier–Stokes equations until the \(L^2\) norm of the difference between the velocities of two consecutive steps becomes smaller than a predefined tolerance \(\epsilon _{picard}\). We then switch to the Newton iteration, using the solution of the last Picard iteration as its initial guess. The procedure is described in Algorithm 3. In both the Picard and Newton iterations, the result of the previous iteration is used as the initial guess of the current iteration. Figure 2 provides a graphical illustration of the overall solving procedure.

Algorithm 3 The iterative solving procedure for the Navier–Stokes equations

Fig. 2 Graphic illustration of the Algorithm 3 process
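The Picard-then-Newton strategy can be sketched on a toy problem. The example below is not the H(div)-HDG discretization: it uses a one-dimensional steady viscous Burgers-type equation \(-\nu u'' + u u' = 1\) with homogeneous Dirichlet conditions, discretized by central finite differences, purely as an assumed stand-in that has the same "diffusion plus quadratic convection" structure. The function name, the grid size, the Euclidean norm used for the stopping tests, and the dense direct solves are all hypothetical choices.

```python
import numpy as np


def picard_then_newton(nu=1.0, n=31, eps_picard=1e-4, tol_newton=1e-10, max_it=50):
    """Solve nu*K u + u*(D u) = f by Picard steps followed by Newton steps."""
    h = 1.0 / (n + 1)
    K = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2   # -d^2/dx^2
    D = (np.eye(n, k=1) - np.eye(n, k=-1)) / (2.0 * h)                # d/dx
    f = np.ones(n)

    def residual(u):
        return f - nu * (K @ u) - u * (D @ u)

    # Picard phase: freeze the convecting velocity at the previous iterate.
    u = np.zeros(n)
    n_picard = 0
    for _ in range(max_it):
        u_new = np.linalg.solve(nu * K + np.diag(u) @ D, f)
        n_picard += 1
        diff = np.linalg.norm(u_new - u)
        u = u_new
        if diff < eps_picard:
            break

    # Newton phase: start from the last Picard iterate and solve for the update.
    n_newton = 0
    for _ in range(max_it):
        # Jacobian of u*(D u): diag(u) D (frozen velocity) + diag(D u) (extra Newton term)
        J = nu * K + np.diag(u) @ D + np.diag(D @ u)
        du = np.linalg.solve(J, residual(u))
        u = u + du
        n_newton += 1
        if np.linalg.norm(du) < tol_newton:
            break

    return u, np.linalg.norm(residual(u)), n_picard, n_newton
```

The two terms in the Jacobian mirror the two terms of the Newton-linearized convection form, and the Picard phase only has to reach the loose tolerance before the quadratically convergent phase takes over.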

5 Numerical Experiments

This section presents the numerical experiments carried out to validate the optimal convergence rates of the \(H(\text {div})\)-HDG scheme and the robustness of the proposed hp-multigrid method. We conduct experiments for both the generalized Stokes equation and the Navier–Stokes equations. For the latter, we solve the steady cases with \(\beta = 0\). To solve the condensed \(H(\text {div})\)-HDG scheme, we use a two-step augmented Lagrangian Uzawa iteration with \((\nu \epsilon )^{-1}=10^6\) for both sets of equations. To update the global velocity in each step of the Uzawa iteration, the preconditioned conjugate gradient solver (PCG) is adopted for the generalized Stokes equation, while the preconditioned GMRes solver (PGM) is adopted for the linearized Navier–Stokes equations. The proposed hp-multigrid method is employed as the preconditioner for both solvers, and the stopping criterion is a relative tolerance of \(10^{-8}\) and an absolute tolerance of \(10^{-10}\). In Algorithm 3, we set \(\varepsilon _{\text {picard}}\) to \(10^{-4}\), and the Newton iteration stops with a relative tolerance of \(10^{-8}\) and an absolute tolerance of \(10^{-10}\). In all cases, we use the block Gauss–Seidel method for both the relaxation and the smoothing to avoid the damping parameter required by the Jacobi method. We further set the number of relaxation steps equal to the number of smoothing steps, i.e. \(m_h = m_p = m\) in the hp-multigrid in Algorithm 2. All results are obtained using NGSolve [47] and ParaView [3]. Source code for the numerical experiments is available at https://github.com/WZKuang/MG4HdivHDG.

5.1 Convergence Rate Check

We first verify the optimal convergence rates of the H(div)-HDG scheme solved with the proposed hp-multigrid method for the generalized Stokes equation and the Navier–Stokes equations with manufactured solutions. We set the exact solution

$$\begin{aligned}&\left. \begin{array}{l l} u_x &{} = x^2 (x - 1)^2 2y(1 - y)(2y - 1) \\ u_y &{} = y^2 (y - 1)^2 2 x(x - 1)(2x - 1)\\ p &{} = x(1 - x)(1 - y) - 1/12 \end{array} \right\} \;\text {when }{d = 2,} \end{aligned}$$

and

$$\begin{aligned}&\left. \begin{array}{l l} u_x &{} = x^2(x - 1)^2 (2 y - 6y^2 + 4 y^3)(2 z - 6z^2 + 4 z^3) \\ u_y &{} = y^2 (y - 1)^2 (2x - 6x^2 + 4x^3) (2z - 6z^2 + 4z^3)\\ u_z &{} = -2 z^2 (z - 1)^2 (2x - 6x^2 + 4x^3) (2y - 6y^2 + 4y^3)\\ p &{} = x(1 - x)(1 - y)(1 - z) - 1/24 \end{array} \right\} \;\text {when }{d = 3.} \end{aligned}$$

The source term \({\underline{f}}\) is obtained by plugging the exact solution into the model problem.
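A short consistency check of the manufactured solution may be useful. It is a sketch in the notation of this section, where the auxiliary polynomial \(a\) and the stream function \(\psi \) are introduced here for illustration only. With \(a(t):=t^2(t-1)^2\), so that \(a'(t)=2t-6t^2+4t^3=2t(t-1)(2t-1)\), the two-dimensional velocity is the rotated gradient of \(\psi :=a(x)a(y)\):

$$\begin{aligned} u_x = -\partial _y\psi = -a(x)a'(y), \qquad u_y = \partial _x\psi = a'(x)a(y), \qquad \nabla \cdot {\underline{u}} = -\partial _x\partial _y\psi + \partial _y\partial _x\psi = 0. \end{aligned}$$

In three dimensions the velocity reads \({\underline{u}} = [a(x)a'(y)a'(z),\; a'(x)a(y)a'(z),\; -2a'(x)a'(y)a(z)]^\textrm{T}\), hence

$$\begin{aligned} \nabla \cdot {\underline{u}} = a'(x)a'(y)a'(z) + a'(x)a'(y)a'(z) - 2a'(x)a'(y)a'(z) = 0. \end{aligned}$$

Since both \(a\) and \(a'\) vanish at \(t=0\) and \(t=1\), the velocity vanishes on the boundary of the unit square/cube. The constants in the pressure make its mean zero, because \(\int _0^1 x(1-x)\,dx = 1/6\) and \(\int _0^1 (1-y)\,dy = 1/2\) give the mean values \(1/12\) when \(d=2\) and \(1/24\) when \(d=3\).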

5.1.1 Generalized Stokes Equation

We set the domain as the unit square/cube \(\Omega =[0, 1]^d\) with homogeneous Dirichlet boundary conditions on all sides. The coarsest mesh is a triangulation of \(\Omega \) with the maximum element diameter \(h\) satisfying \(1/h=2\) in 2D or \(1/h=3\) in 3D, followed by uniform refinement by connecting the midpoints of the element boundaries. After solving the H(div)-HDG scheme (4) for \((\underline{\underline{L}}_h,{\underline{u}}_h, p_h)\), a post-processed velocity approximation \({\underline{u}}_h^*\in {\underline{W}}_h^{k+1}\) is obtained element-wise by solving

$$\begin{aligned} (\nabla {\underline{u}}_h^*,\; \nabla {\underline{v}}_h)_K =\;&(\underline{\underline{L}}_h,\; \nabla {\underline{v}}_h)_K, \quad{} & {} \forall {\underline{v}}_h \in {\underline{{\mathcal {P}}}}^{k+1}(K), \end{aligned}$$
(31a)
$$\begin{aligned} ({\underline{u}}_h^*,\; {\underline{w}}_h)_K =\;&({\underline{u}}_h,\; {\underline{w}}_h)_K, \quad{} & {} \forall {\underline{w}}_h\in {\underline{{\mathcal {P}}}}^0(K), \end{aligned}$$
(31b)

and the superconvergence property of such post-processed \({\underline{u}}_h^*\) when \(k\ge 1\) has been proved in [15].

We set the viscosity \(\nu = 1\), and Table 1 reports the estimated order of convergence (EOC) of the \(L_2\) norms of \({\underline{e}}_u:={\underline{u}}-{\underline{u}}_h\), \(\underline{\underline{e}}_L:=\underline{\underline{L}}-\underline{\underline{L}}_h\), and \({\underline{e}}_u^*:={\underline{u}}-{\underline{u}}_h^*\) in both two-dimensional and three-dimensional cases, together with the \(L_2\) norm of the divergence error \(\nabla \cdot {\underline{u}}_h\), for different \(\beta \) and polynomial orders k of the finite element spaces. As observed, the optimal \((k+1)\)-th convergence rate is obtained for both the velocity \({\underline{u}}_h\) and the flux \(\underline{\underline{L}}_h\) when \(k\ge 0\), with the globally divergence-free constraint satisfied. When \(k\ge 1\), the \((k+2)\)-th convergence rate is obtained for \({\underline{u}}_h^*\). We note that the deterioration of the convergence rates when \(\Vert {\underline{e}}_u\Vert _0\) and \(\Vert {\underline{e}}_u^*\Vert _0\) approach \({\mathcal {O}}(10^{-10})\) is due to the round-off error caused by \(\epsilon ^{-1}\), as mentioned in Remark 3.1.
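For reference, the estimated order of convergence between two consecutive mesh levels is commonly computed as \(\log (e_{l-1}/e_l)/\log (h_{l-1}/h_l)\). We assume this usual definition in the short sketch below, which is not taken from the published source code. The function name and the input data are hypothetical, and the sample errors are synthetic values of the form \(0.5\,h^3\), not entries of Table 1.

```python
import math


def estimated_orders(errors, mesh_sizes):
    """EOC between consecutive levels: log(e_{l-1}/e_l) / log(h_{l-1}/h_l)."""
    orders = []
    for level in range(1, len(errors)):
        error_ratio = errors[level - 1] / errors[level]
        mesh_ratio = mesh_sizes[level - 1] / mesh_sizes[level]
        orders.append(math.log(error_ratio) / math.log(mesh_ratio))
    return orders


# Synthetic example: errors behaving like 0.5 * h^3 under uniform refinement.
sample_h = [1 / 2, 1 / 4, 1 / 8, 1 / 16]
sample_errors = [0.5 * h**3 for h in sample_h]
sample_orders = estimated_orders(sample_errors, sample_h)
```

With uniform refinement the mesh ratio is 2, so the formula reduces to \(\log _2(e_{l-1}/e_l)\). Once the errors approach the round-off level discussed above, the computed ratios are no longer meaningful.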

Table 1 Estimated convergence rates of the H(div)-HDG scheme for the generalized Stokes equation in Example 5.1.1

5.1.2 Stationary Navier–Stokes Equations

We consider the same exact solution and settings as in Example 5.1.1. After solving the H(div)-HDG scheme (28) for \((\underline{\underline{L}}_h, {\underline{u}}_h, p_h)\), we locally post-process \({\underline{u}}_h\) as in (31) to achieve superconvergence. Table 2 presents the estimated order of convergence (EOC) of the \(L_2\) norms of \({\underline{e}}_u\), \(\underline{\underline{e}}_L\), and \({\underline{e}}_u^*\) in two-dimensional and three-dimensional cases, along with the \(L_2\) norm of the divergence error \(\nabla \cdot {\underline{u}}_h\) for different values of viscosity \(\nu \) and polynomial order k of the finite element spaces. The observed optimal convergence rates of \(\Vert {\underline{e}}_u\Vert _0\), \(\Vert {\underline{e}}_u^*\Vert _0\), \(\Vert \underline{\underline{e}}_L\Vert _0\), and the exact divergence-free results are similar to those obtained for the generalized Stokes equation. It is noteworthy that the convergence rates deteriorate as \(\Vert {\underline{e}}_u\Vert _0\) and \(\Vert {\underline{e}}_u^*\Vert _0\) approach \({\mathcal {O}}(10^{-8})\), which is due to the fact that \(\epsilon _{newton}\) in Algorithm 3 is of order \({\mathcal {O}}(10^{-8})\) in this example.

Table 2 Estimated convergence rates of the H(div)-HDG scheme for the Navier-Stokes equations in Example 5.1.2

5.2 Lid-Driven Cavity Problem

In this example, we investigate the robustness of the proposed hp-multigrid preconditioners for the lid-driven cavity problem. The computational domain is the unit square/cube \(\Omega =[0,1]^d\). We impose an inhomogeneous Dirichlet boundary condition \({\underline{u}}=[4x(1-x),\; 0]^\textrm{T}\) when \(d=2\) or \({\underline{u}}=[16x(1-x)y(1-y),\; 0,\; 0]^\textrm{T}\) when \(d=3\) on the top side, and no-slip boundary conditions on the remaining domain boundaries. The source term is \({\underline{f}}={\underline{0}}\). The coarsest mesh is a triangulation of \(\Omega \) with the maximum element diameter \(h\) satisfying \(1/h=2\) in 2D or \(1/h=3\) in 3D. We refine the mesh uniformly by connecting the midpoints of the element boundaries in two dimensions, whereas in three dimensions, we apply one-step bisection refinement and split each coarse-grid tetrahedron into two due to computational capacity limitations. Vertex-patched block smoothing and relaxation methods are used when \(d=2\), while edge-patched block smoothing and relaxation methods are used when \(d=3\) to reduce memory usage.

5.2.1 Generalized Stokes Equation

We tested the PCG method preconditioned by both the variable V-cycle multigrid and the W-cycle multigrid to solve the primal variable operator of the augmented Lagrangian Uzawa iteration for the condensed H(div)-HDG scheme for the generalized Stokes equation. Table 3 reports the obtained PCG iteration counts with varying mesh levels, lower-order term coefficient \(\beta \), finite element space polynomial order k, and smoothing steps \(m_p = m_h = m\). The iteration counts remain almost unchanged as \(\beta \) increases from 0 to \(10^3\). We noticed a slight increase in iteration counts as the polynomial order k increases, particularly in three-dimensional cases. However, increasing the smoothing steps effectively decreases the PCG iteration counts. The remaining results verify the robustness of the hp-multigrid preconditioner with respect to mesh size.

Table 3 PCG iteration counts for the generalized Stokes equation in the lid-driven cavity problem in Example 5.2.1

5.2.2 Stationary Navier–Stokes Equations

We use the PGM method preconditioned by the variable V-cycle method to solve the primal variable operator of the augmented Lagrangian Uzawa iteration method for the condensed H(div)-HDG method for the linearized Navier–Stokes equations as in Algorithm 3. Tables 4 and 5 report the obtained average PGM iteration counts during the Picard iteration and Newton iteration procedures in Algorithm 3, with different mesh levels, viscosity \(\nu \), finite element space polynomial order k, and smoothing steps \(m_p = m_h = m\). Figure 3 shows the magnitude and the streamlines of the obtained numerical velocity solution with \(\nu =10^{-3}\) (left panel) and \(\nu =10^{-4}\) (right panel).

Fig. 3 The obtained numerical velocity solution of the lid-driven cavity problem for the Navier–Stokes equations in Example 5.2.2

In Table 4, for the two-dimensional cases where \(\nu \ge 10^{-3}\), the proposed hp-multigrid method demonstrates satisfactory performance under both Picard and Newton linearization methods, despite some mild increases in the iteration count of the PGM solver with increasing polynomial order k and Reynolds number Re. However, when \({Re}=10^4\) and \(k \ge 1\), the performance of the hp-multigrid method deteriorates significantly, although it remains satisfactory when \(k=0\). This highlights the need for an efficient p-robust relaxation method to handle the convection-dominated case in the H(div)-HDG scheme, which requires further investigation. In Table 5, the hp-multigrid method exhibits similarly good performance when \(\nu \ge 10^{-2}\) for the three-dimensional cases. However, when \(\nu \le 10^{-3}\), the solution of the Picard linearization oscillates and cannot provide a sufficiently close initial guess for the Newton iteration. To address this issue, we employed the pseudo-time integration method to obtain the steady state, using a simple implicit backward-Euler discretization with the pseudo-time step size \(\delta t^*\). The results in Table 5 show that \(1/\delta t^*=0.1\) is sufficient when \(\nu =10^{-3}\), and the extra mass term in the primal operator \({\underline{A}}_{k,h}^{c,\epsilon }\) in (30a) makes the system easier to solve with PGM. In the extreme case where \({Re}=10^4\), the solution to the Picard iteration oscillates even more severely, and we omit this case to keep our discussion simple. All other results verify the robustness of our hp-multigrid with respect to the mesh size.
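The effect of the pseudo-time mass term can be sketched on a toy problem. The example below is not the H(div)-HDG scheme: it applies backward Euler in pseudo-time to a one-dimensional Burgers-type equation \(-\nu u'' + u u' = 1\) with homogeneous Dirichlet conditions and central finite differences. As an assumption of the sketch, the convecting velocity is frozen at the previous pseudo-time level, so each step is a single linear solve. The function name, the grid, and the Euclidean-norm stopping test are hypothetical.

```python
import numpy as np


def pseudo_time_steady(nu=1.0, inv_dt=0.1, n=31, tol=1e-10, max_steps=200):
    """March (u_new - u)/dt + nu*K u_new + u*(D u_new) = f to a steady state."""
    h = 1.0 / (n + 1)
    K = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2   # -d^2/dx^2
    D = (np.eye(n, k=1) - np.eye(n, k=-1)) / (2.0 * h)                # d/dx
    f = np.ones(n)
    u = np.zeros(n)
    steps = 0
    for _ in range(max_steps):
        # The term inv_dt * I is the extra mass term added to the primal operator.
        lhs = inv_dt * np.eye(n) + nu * K + np.diag(u) @ D
        u_new = np.linalg.solve(lhs, f + inv_dt * u)
        steps += 1
        change = np.linalg.norm(u_new - u)
        u = u_new
        if change < tol:
            break
    steady_residual = np.linalg.norm(f - nu * (K @ u) - u * (D @ u))
    return u, steady_residual, steps
```

At convergence the pseudo-time difference vanishes, so the iterate satisfies the steady equations. A larger value of `inv_dt` damps the iteration more strongly and shifts the linear operator further from singularity, at the cost of more pseudo-time steps.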

In addition to Algorithm 3, we are aware of other techniques to solve the stationary Navier–Stokes equations, such as continuation on Reynolds number and pseudo-time integration with implicit-explicit (IMEX) methods. Our numerical experiments demonstrate that the proposed hp-multigrid method is a promising preconditioner for these techniques.

Table 4 Two-dimensional average PGM iteration counts for the linearized Navier-Stokes equations during the Picard/Newton iteration procedure in Algorithm 3 in the lid-driven cavity problem in Example 5.2.2
Table 5 Three-dimensional average PGM iteration counts for the linearized Navier-Stokes equations during the Picard/Newton iteration procedure in Algorithm 3 in the lid-driven cavity problem in Example 5.2.2

5.3 Backward-Facing Step Flow Problem

Finally, we test the proposed hp-multigrid preconditioners for the backward-facing step flow problem, with the domain \(\Omega =([0.5,4]\times [0,0.5])\cup ([0,4]\times [0.5,1])\) when \(d=2\), or \(\Omega =(([0.5,4]\times [0,0.5])\cup ([0,4]\times [0.5,1]))\times [0,1]\) when \(d=3\). We impose an inhomogeneous Dirichlet boundary condition \({\underline{u}}=[16(1-y)(y-0.5),\; 0]^\textrm{T}\) when \(d=2\), or \({\underline{u}}=[64(1-y)(y-0.5)z(1-z),\; 0,\; 0]^\textrm{T}\) when \(d=3\) for the inlet flow on \(\{x=0\}\), with a do-nothing boundary condition on \(\{x=4\}\) and no-slip boundary conditions on the remaining sides. We note that the domain is L-shaped and no longer convex, thus the full elliptic regularity assumption in (3) no longer holds. Nevertheless, the results in the following subsections indicate that our proposed hp-multigrid method still works well. We also note that when the edge-patched block relaxation method is used in our hp-multigrid, the iteration count of the preconditioned Krylov subspace solvers increases evidently with the polynomial degree when the length of the domain is increased. Thus in the numerical experiments for the backward-facing step flow problems, we use vertex-patched relaxation and edge-patched smoothing in the hp-multigrid. Other settings are the same as in the lid-driven cavity problem.

5.3.1 Generalized Stokes Equation

Table 6 reports the obtained PCG iteration counts. Similar optimal results are observed as in the lid-driven cavity problem. We note that when \(d=2\), \(\beta =10^3\) and \(m=1\), or when \(d=3\) and \(m=2\), the PCG solver preconditioned by the W-cycle multigrid fails, which is due to the fact that the smoothing steps on each mesh level of the W-cycle multigrid are not large enough to make the preconditioned operator definite and robust with respect to mesh size [8, Section 4]. Meanwhile, the variable V-cycle with \(m=1\) remains a robust and positive definite preconditioner.

Table 6 PCG iteration counts for the generalized Stokes equation in the backward-facing step flow problem in Example 5.3.1, where "NA" means the conjugate gradient solver fails

5.3.2 Stationary Navier–Stokes Equations

We apply the PGM method preconditioned by the variable V-cycle method to solve the augmented Lagrangian Uzawa iteration (30a) for the condensed H(div)-HDG method for the linearized Navier–Stokes equations. Here we limit ourselves to cases with a two-dimensional domain and with Reynolds number up to \(10^3\). All other settings are identical to those in Example 5.3.1 for the generalized Stokes equation. The resulting average PGM iteration counts during the Picard and Newton iteration procedures in Algorithm 3 are reported in Table 7 for different mesh levels, viscosity \(\nu \), finite element space polynomial order k, and smoothing steps \(m_p = m_h = m\). The results are similar to those for the lid-driven cavity problem in Example 5.2.2.

Table 7 Two-dimensional average PGM iteration counts for the linearized Navier-Stokes equations during the Picard/Newton iteration procedure in Algorithm 3 in the backward-facing step flow problem in Example 5.3.2

6 Conclusion

In this study, we developed an hp-multigrid preconditioner for the H(div)-HDG scheme for both the generalized Stokes and the Navier–Stokes equations. The condensed H(div)-HDG system is solved by the augmented Lagrangian Uzawa iteration, and the hp-multigrid is used to precondition the nearly-singular primal operator on the global velocity spaces. The proposed hp-multigrid is essentially a multiplicative ASP, with the lowest-order global velocity space as the auxiliary space and a robust geometric multigrid algorithm as the auxiliary space solver. For the generalized Stokes equation, we prove that the condensed lowest-order H(div)-HDG discretization is equivalent to the CR discretization with a pressure-robust treatment, which allows for the application of the rich geometric multigrid theory for the CR discretization. Numerical experiments demonstrate that the proposed hp-multigrid is robust with respect to the mesh size and the augmented Lagrangian parameter, while the iteration counts of the preconditioned Krylov space solvers increase only very mildly with the polynomial order. We further test the proposed hp-multigrid preconditioner on the H(div)-HDG scheme for the Navier–Stokes equations linearized by Picard or Newton’s method, and the iteration counts grow mildly as the Reynolds number increases up to \(10^3\). An efficient parallel implementation of an additive hp-multigrid algorithm and the proof of its robustness for the generalized Stokes equation will be the subject of our forthcoming research.