1 Introduction

Convection, diffusion, and reaction are basic physical mechanisms which play an important role in many mathematical models used in science and technology. A frequently used model problem for studying numerical techniques for the mentioned class of models is the scalar steady-state convection–diffusion–reaction problem

$$ -\varepsilon {\Delta} u+{\boldsymbol{b}}\cdot\nabla u+c u=g\quad\text{in }{\Omega} , \qquad\qquad u=u_{b}\quad\text{on }\partial{\Omega} , $$
(1)

where \({\Omega }\subset \mathbb {R}^{d}\), d ≥ 1, is a bounded domain, ε > 0 is a constant diffusion coefficient, b is the convection field, c is the reaction field, and the right-hand side g is a source of the unknown quantity u. Note that the model problem (1) itself also has a clear physical meaning since it may describe, e.g., the distribution of temperature or concentration. For our mathematical considerations, we will assume that the boundary \(\partial{\Omega}\) of Ω is polyhedral and Lipschitz-continuous (if d ≥ 2) and that \({\boldsymbol {b}}\in W^{1,\infty }({\Omega })^{d}\), \(c\in L^{\infty }({\Omega })\), \(g\in L^{2}({\Omega })\), and \(u_{b}\in H^{\frac {1}{2}}(\partial {\Omega })\cap C(\partial {\Omega })\). Moreover, it will be assumed that the data satisfy the conditions

$$ \nabla\cdot\boldsymbol{b}=0 ,\qquad c\ge\sigma_{0}\ge0\qquad\quad\text{in }{\Omega} , $$
(2)

where σ0 is a constant.

In most applications, the convective transport strongly dominates the diffusion, so that the solution u exhibits so-called layers, i.e., narrow regions where u changes abruptly. The presence of layers makes the numerical solution of (1) very challenging since standard approaches provide solutions polluted by spurious oscillations unless the layers are resolved by the mesh. A well-known remedy is a stabilization of the standard discretization, e.g., by adding stabilization terms (see, e.g., [36]). To obtain accurate approximations, the stabilization has to be adapted to the character of the approximated solution, which inevitably leads to nonlinear methods. However, many such stabilization techniques still do not remove the spurious oscillations completely since the stabilization effect is influenced by many factors, such as the mesh used or the data considered (cf. [18,19,20]). Although the remaining spurious oscillations are often quite small, they may not be acceptable in some applications, e.g., if the oscillating solution should serve as input data for other equations. A possible remedy is to apply methods satisfying the discrete maximum principle (DMP) (see, e.g., the recent review paper [8]). The DMP excludes many types of oscillating solutions that otherwise frequently appear when solving convection-dominated problems. A further reason for requiring the validity of the DMP is that a maximum principle holds for the continuous problem (1) if c ≥ 0 (cf. [12, 14]) and it is important that this physical property is preserved by the discrete problem.

An interesting class of methods satisfying the DMP (often under some assumptions on the mesh) are algebraically stabilized finite element schemes, e.g., algebraic flux correction (AFC) schemes. These methods have been developed intensively in recent years (see, e.g., [2, 7, 15, 28,29,30,31,32,33,34,35]). The origins of this approach can be traced back to [10, 37]. In these schemes, the stabilization is performed on the basis of the algebraic system of equations corresponding to the Galerkin finite element method. It involves so-called limiters, which restrict the stabilized discretization mainly to a vicinity of layers to ensure the satisfaction of the DMP without compromising the accuracy. There are several limiters proposed in the literature, such as the so-called Kuzmin [29], BJK [7], or BBK [4] limiters. Both the Kuzmin and the BBK limiters were utilized in [3] for defining a scheme that blends a standard linear stabilized scheme in smooth regions and a nonlinear stabilized method in a vicinity of layers.

An important feature of algebraically stabilized schemes is that they not only satisfy the DMP but also usually provide sharp approximations of layers (cf. the numerical results in, e.g., [1, 16, 23, 31]). In this paper, we concentrate on schemes based on the idea of algebraic flux correction. Many properties of the AFC schemes are already well understood since these schemes were investigated in a number of papers (see, e.g., [5,6,7, 9, 24,25,26]) where one can find results on the existence of solutions, local and global DMPs, error estimates, and further properties. However, it was observed already in [6] that convergence rates of these schemes may be suboptimal on some meshes, even if problems without layers are considered. The aim of the present paper is to explain this behavior in some model cases and, using the results of this analysis, to propose modifications of the considered methods leading to optimal convergence rates. This will lead to a new algebraic stabilization called the Symmetrized Monotone Upwind-type Algebraically Stabilized (SMUAS) method, for which the solvability, the linearity preservation, and the DMP will be proved on arbitrary simplicial meshes. Moreover, various numerical results will be reported that show that, in many cases, the SMUAS method leads to more accurate results than other algebraic stabilizations. In addition, the numerical results indicate that the SMUAS method converges with optimal rates on general simplicial meshes. Let us mention that the analysis of AFC schemes also demonstrates the interesting fact that certain types of spurious oscillations may still be present in the approximate solutions despite the validity of the DMP. This contradicts the frequently made claim that the DMP guarantees that no spurious oscillations appear.

The plan of the paper is as follows. In the next section, we define a Galerkin finite element discretization of (1) and the corresponding linear algebraic problem. Then, in Section 3, we introduce a general algebraic stabilization and summarize its main properties. Section 4 provides three examples of algebraic stabilizations. The first one is the AFC scheme with the Kuzmin limiter, the deficiencies of which are then analyzed in Section 5. The other two examples in Section 4 are the AFC scheme with the BJK limiter, for which also some results are reported in Section 5, and the MUAS method. The MUAS method is used as the basis for defining the new algebraic stabilization in Section 6. After analyzing the new method, various numerical results will be presented.

2 Galerkin finite element discretization

A finite element discretization of the convection–diffusion–reaction problem (1) is based on its weak formulation, which reads:

Find \(u\in H^{1}({\Omega})\) such that \(u=u_{b}\) on \(\partial{\Omega}\) and

$$ a(u,v)=(g,v)\qquad\forall v\in {H^{1}_{0}}({\Omega}) , $$

where

$$ a(u,v)=\varepsilon (\nabla u,\nabla v)+({\boldsymbol{b}}\cdot\nabla u,v)+(c u,v) . $$

As usual, (⋅,⋅) denotes the inner product in \(L^{2}({\Omega})\) or \(L^{2}({\Omega})^{d}\). It is well known that this weak formulation has a unique solution (cf. [12]).

To define a finite element discretization of problem (1), we consider a simplicial triangulation \({\mathscr{T}}_{h}\) of \(\overline {\Omega }\) which is assumed to belong to a regular family of triangulations in the sense of [11]. Furthermore, we introduce finite element spaces

$$ \begin{array}{@{}rcl@{}} W_h=\{v_h\in C(\overline{\Omega}) ; v_h\vert _T^{}\in P_1(T) \forall T\in \mathscr{T}_h\} ,\qquad V_h=W_h\cap H^1_0({\Omega}) , \end{array} $$

consisting of continuous piecewise linear functions. The vertices of the triangulation \({\mathscr{T}}_{h}\) will be denoted by \(x_{1},\dots ,x_{N}\) and we assume that \(x_{1},\dots ,x_{M}\in {\Omega }\) and \(x_{M+1},\dots ,x_{N}\in \partial {\Omega }\). Then, the usual basis functions \(\varphi _{1},\dots ,\varphi _{N}\) of \(W_{h}\) are defined by the conditions \(\varphi_{i}(x_{j})=\delta_{ij}\), \(i,j=1,\dots ,N\), where \(\delta_{ij}\) is the Kronecker symbol. Obviously, the functions \(\varphi _{1},\dots ,\varphi _{M}\) form a basis of \(V_{h}\). Any function \(u_{h}\in W_{h}\) can be written in a unique way in the form

$$ u_{h}=\sum\limits_{i=1}^{N}u_{i} \varphi_{i} $$
(3)

and hence it can be identified with the coefficient vector \(\text {U}=(u_{1},\dots ,u_{N})\).

Now, an approximate solution of problem (1) can be introduced as the solution of the following finite-dimensional problem:

Find \(u_{h}\in W_{h}\) such that \(u_{h}(x_{i})=u_{b}(x_{i})\), \(i=M+1,\dots ,N\), and

$$ a(u_{h},v_{h})=(g,v_{h})\qquad\forall v_{h}\in V_{h} . $$
(4)

It is easy to show that the discrete problem (4) has a unique solution.

We denote

$$ \begin{array}{@{}rcl@{}} a_{ij}&=&a(\varphi_{j},\varphi_{i}) ,\qquad i,j=1,\dots,N , \end{array} $$
(5)
$$ \begin{array}{@{}rcl@{}} g_{i}&=&(g,\varphi_{i}) ,\quad\qquad i=1,\dots,M , \end{array} $$
(6)
$$ \begin{array}{@{}rcl@{}} {u^{b}_{i}}&=&u_{b}(x_{i}) ,\quad\qquad i=M+1,\dots,N . \end{array} $$
(7)

Then, \(u_{h}\) is a solution of the finite-dimensional problem (4) if and only if the coefficient vector \((u_{1},\dots ,u_{N})\) corresponding to \(u_{h}\) satisfies the algebraic problem

$$ \begin{array}{@{}rcl@{}} &&\sum\limits_{j=1}^{N} a_{ij} u_{j}=g_{i} ,\qquad i=1,\dots,M , \end{array} $$
(8)
$$ \begin{array}{@{}rcl@{}} &&u_{i}={u^{b}_{i}} ,\qquad i=M+1,\dots,N . \end{array} $$
(9)

As discussed in the introduction, the above discretization is not appropriate in the convection-dominated regime and a stabilization has to be applied. The most common way is to introduce additional stabilization terms in the discrete problem (4) (see, e.g., [36]). However, another attractive possibility is to modify the algebraic problem (8), (9), which will be pursued in this paper.
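The failure of the Galerkin discretization in the convection-dominated regime can already be seen in one dimension. The following minimal sketch is an illustration under assumptions that are not part of the paper: it takes d = 1, Ω = (0,1), b = 1, c = 0, g = 0, boundary values u(0) = 0 and u(1) = 1, and a uniform mesh, for which the rows of (8) reduce to a three-point scheme. The function names `solve_tridiagonal` and `galerkin_1d` are hypothetical.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal system (no pivoting).

    lower[k] multiplies x[k-1] and upper[k] multiplies x[k+1] in row k;
    lower[0] and upper[-1] are ignored.
    """
    n = len(diag)
    d, r = list(diag), list(rhs)
    for k in range(1, n):
        m = lower[k] / d[k - 1]
        d[k] -= m * upper[k - 1]
        r[k] -= m * r[k - 1]
    x = [0.0] * n
    x[-1] = r[-1] / d[-1]
    for k in range(n - 2, -1, -1):
        x[k] = (r[k] - upper[k] * x[k + 1]) / d[k]
    return x


def galerkin_1d(n_cells, eps):
    """Nodal values of the P1 Galerkin solution of -eps u'' + u' = 0 on (0,1)
    with u(0) = 0, u(1) = 1 on a uniform mesh (assumed 1D model setting)."""
    h = 1.0 / n_cells
    m = n_cells - 1                      # number of interior nodes
    lower = [-eps / h - 0.5] * m         # a_{i,i-1}
    diag = [2.0 * eps / h] * m           # a_{ii}
    upper = [-eps / h + 0.5] * m         # a_{i,i+1}
    rhs = [0.0] * m
    rhs[-1] -= upper[-1] * 1.0           # move the Dirichlet value u(1) = 1 to the right-hand side
    return [0.0] + solve_tridiagonal(lower, diag, upper, rhs) + [1.0]


u_conv = galerkin_1d(10, 0.01)   # h / (2 eps) = 5: convection-dominated
u_diff = galerkin_1d(10, 1.0)    # h / (2 eps) = 0.05: diffusion-dominated
print("convection-dominated: min = %.3f, max = %.3f" % (min(u_conv), max(u_conv)))
print("diffusion-dominated:  min = %.3f, max = %.3f" % (min(u_diff), max(u_diff)))
```

In this model setting the exact solution takes values in [0,1], whereas the Galerkin solution computed for ε = 0.01 has negative nodal values, i.e., it violates the maximum principle. For ε = 1 the computed nodal values are monotone.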

3 A general algebraic stabilization

The stabilizing effect of various approaches used to suppress the spurious oscillations present in the solutions of the Galerkin discretization is due to the fact that these methods add a certain amount of artificial diffusion to the Galerkin FEM. However, if this amount is too large, the approximate solution becomes inaccurate due to an excessive smearing of the layers. It turns out that accurate solutions can be obtained only if the amount of the artificial diffusion respects the local behavior of the solution (see, e.g., [8]). This motivates us to stabilize the algebraic problem (8), (9) by adding an artificial diffusion matrix \(\mathbb {B}(\text {U})=(b_{ij}(\text {U}))_{i,j=1}^{N}\) which depends on the unknown approximate solution \(\text {U}=(u_{1},\dots ,u_{N})\). Here, we shall describe this approach only briefly and refer to the recent paper [21] for a more detailed presentation.

Based on the above discussion, we will consider the nonlinear algebraic problem

$$ \begin{array}{@{}rcl@{}} &&\sum\limits_{j=1}^{N} (a_{ij}+b_{ij}(\text{U})) u_{j}=g_{i} ,\qquad i=1,\dots,M , \end{array} $$
(10)
$$ \begin{array}{@{}rcl@{}} &&u_{i}={u^{b}_{i}} ,\qquad i=M+1,\dots,N . \end{array} $$
(11)

We assume that, for any \(\text {U}\in \mathbb {R}^{N}\), the matrix \(\mathbb {B}(\text {U})\) satisfies

$$ \begin{array}{@{}rcl@{}} &&b_{ij}(\text{U})=b_{ji}(\text{U}) ,\qquad\qquad i,j=1,\dots,N , \end{array} $$
(12)
$$ \begin{array}{@{}rcl@{}} &&b_{ij}(\text{U})\le0,\qquad\qquad\qquad i,j=1,\dots,N , i\neq j , \end{array} $$
(13)
$$ \begin{array}{@{}rcl@{}} &&\sum\limits_{j=1}^{N} b_{ij}(\text{U})=0 ,\qquad\qquad i=1,\dots,N . \end{array} $$
(14)

Moreover, we assume that \(\mathbb {B}(\text {U})\) has the typical sparsity pattern of finite element matrices, i.e.,

$$ b_{ij}(\text{U})=0\qquad \forall j\not\in S_{i}\cup\{i\}, i=1,\dots,M , $$
(15)

where

$$ S_{i}=\{j\in\{1,\dots,N\}\setminus\{i\} ; x_{i}\text{ and }x_{j}\text{ are end points of the same edge}\} . $$

These assumptions are motivated by the fact that the properties (12)–(15) are satisfied for the diffusion matrix \((\varepsilon (\nabla \varphi _{j},\nabla \varphi _{i}))_{i,j=1}^{N}\) if the triangulation \({\mathscr{T}}_{h}\) is weakly acute, i.e., if the angles between facets of \({\mathscr{T}}_{h}\) do not exceed π/2. It is also important that the properties (12)–(14) assure that the matrix \(\mathbb {B}(\text {U})\) is positive semidefinite for any \(\text {U}\in \mathbb {R}^{N}\) (see [21]).
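One way to see the positive semidefiniteness (the argument in [21] may be organized differently) is the identity \(\text{U}^{T}\mathbb{B}\,\text{U}=-\frac{1}{2}\sum_{i\neq j} b_{ij}\,(u_{i}-u_{j})^{2}\), which follows from the symmetry (12) and the zero row sums (14); the right-hand side is nonnegative due to (13). The following sketch checks this identity numerically for a randomly generated matrix with the properties (12)–(14). The matrix and the helper names are purely illustrative and do not correspond to any stabilization from the paper.

```python
import random


def random_stabilization_matrix(n, rng):
    """Random symmetric matrix with nonpositive off-diagonal entries and zero row sums."""
    B = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            B[i][j] = B[j][i] = -rng.random()
    for i in range(n):
        B[i][i] = -sum(B[i][j] for j in range(n) if j != i)
    return B


def quadratic_form(B, U):
    """U^T B U."""
    n = len(U)
    return sum(B[i][j] * U[i] * U[j] for i in range(n) for j in range(n))


def edge_sum(B, U):
    """-1/2 * sum over i != j of b_ij (u_i - u_j)^2."""
    n = len(U)
    return -0.5 * sum(B[i][j] * (U[i] - U[j]) ** 2
                      for i in range(n) for j in range(n) if i != j)


rng = random.Random(0)
B = random_stabilization_matrix(6, rng)
U = [rng.uniform(-1.0, 1.0) for _ in range(6)]
print(quadratic_form(B, U), edge_sum(B, U))   # the two values agree up to rounding
```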

To prove the solvability of the system (10), (11), we make the following assumption, which is motivated by the definitions of the matrix \(\mathbb {B}(\text {U})\) considered in this paper.

Assumption (A1)

For any \(i\in \{1,\dots ,M\}\) and any \(j\in \{1,\dots ,N\}\), the function \(b_{ij}(\text{U})(u_{j}-u_{i})\) is a continuous function of \(\text {U}=(u_{1},\dots ,u_{N})\in \mathbb {R}^{N}\) and, for any \(i\in \{1,\dots ,M\}\) and any \(j\in \{M+1,\dots ,N\}\), the function \(b_{ij}(\text{U})\) is a bounded function of \(\text {U}\in \mathbb {R}^{N}\).

Theorem 1

Let (12)–(14) hold and let Assumption (A1) be satisfied. Then, there exists a solution of the nonlinear problem (10), (11).

Proof

See [21]. □

The construction of the matrix \(\mathbb {B}(\text {U})\) is usually based on the requirement that the problem (10), (11) satisfies the DMP. One can formulate various conditions that guarantee that a nonlinear discrete problem satisfies the DMP or at least preserves the positivity (cf. [8]). For our purposes, the following assumption is useful.

Assumption (A2)

Consider any \(\text {U}=(u_{1},\dots ,u_{N})\in \mathbb {R}^{N}\) and any \(i\in \{1,\dots ,M\}\). If \(u_{i}\) is a strict local extremum of U with respect to \(S_{i}\), i.e.,

$$ \begin{array}{@{}rcl@{}} u_i>u_j\quad\forall j\in S_i\qquad\text{or}\qquad u_i<u_j\quad\forall j\in S_i , \end{array} $$

then

$$ \begin{array}{@{}rcl@{}} a_{ij}+b_{ij}(\text{U})\le0\qquad\forall j\in S_i . \end{array} $$

Under the above assumptions, it is possible to prove that the approximate solution obtained using the nonlinear problem (10), (11) satisfies a direct analogue of the maximum principles which hold for the problem (1) (see, e.g., [12] for the classical solutions and [14] for the weak solutions).

Theorem 2

Let the assumptions stated in Section 1 be satisfied and let the matrix \(\mathbb {B}(\text {U})\) satisfy (12)–(15) and Assumptions (A1) and (A2). Consider any nonempty set \({\mathscr{G}}_{h}\subset {\mathscr{T}}_{h}\) and define

$$ \begin{array}{@{}rcl@{}} G_h=\bigcup_{T\in \mathscr{G}_h} T . \end{array} $$

Let \(\mathrm {U}\in \mathbb {R}^{N}\) be a solution of (10) and let \(u_{h}\in W_{h}\) be the corresponding finite element function given by (3). Then, one has the DMP

$$ \begin{array}{@{}rcl@{}} g\le0\quad \text{in }G_{h}\quad &\Rightarrow&\quad \underset{G_{h}}{\max} u_{h}\le\underset{\partial G_{h}}{\max} u_{h}^{+} ,\\ g\ge0\quad \text{in }G_{h}\quad&\Rightarrow&\quad \underset{G_{h}}{\min} u_{h}\ge\underset{\partial G_{h}}{\min} u_{h}^{-} , \end{array} $$

where \(u_{h}^{+}=\max \limits \{u_{h},0\}\) and \(u_{h}^{-}=\min \limits \{u_{h},0\}\). If, in addition, c = 0 in Gh, then

$$ \begin{array}{@{}rcl@{}} g\le0\quad\text{in }G_{h}\qquad&\Rightarrow&\qquad \underset{G_{h}}{\max} u_{h}=\underset{\partial G_{h}}{\max} u_{h} ,\\ g\ge0\quad \text{in }G_{h}\qquad&\Rightarrow&\qquad \underset{G_{h}}{\min} u_{h}=\underset{\partial G_{h}}{\min} u_{h} . \end{array} $$

Proof

See [21].□

We will close this section with a brief discussion of a priori error estimates available for the nonlinear problem (10), (11). To derive an error estimate, it is convenient to write (10), (11) as a variational problem where the algebraic stabilization term is represented using the form

$$ \begin{array}{@{}rcl@{}} b_h(w;z,v) =\sum\limits_{i,j=1}^N b_{ij}(w) z(x_j) v(x_i) , \qquad w,z,v\in C(\overline{\Omega}) , \end{array} $$

with \(b_{ij}(w):=b_{ij}(\{w(x_{i})\}_{i=1}^{N})\) (see [21] for details). This variational problem is stable with respect to the solution-dependent norm on Wh defined by

$$ \begin{array}{@{}rcl@{}} \| v\| _h^{}:=\Big(\varepsilon \vert v\vert_{1,{\Omega}}^2 +\sigma_0 \| v\|_{0,{\Omega}}^2 +b_h(u_h;v,v)\Big)^{1/2} ,\qquad v\in H^1({\Omega})\cap C(\overline{\Omega}) , \end{array} $$

assuming that σ0 > 0 in (2). This shows that the problem (10), (11) really provides a stronger stability than the original problem (8), (9).

The algebraic stabilization term leads to a consistency error whose behavior with respect to h depends on how the artificial diffusion matrix is constructed. Often, one has

$$ \vert b_{ij}(u_{h})\vert \le\max\{\vert a_{ij}\vert ,\vert a_{ji}\vert \}\qquad\forall i\neq j , $$

which will also be the case in this paper. Under this assumption and assuming further that the weak solution of (1) satisfies \(u\in H^{2}({\Omega})\) and that σ0 > 0, one can prove (cf. [21]) that the finite element function \(u_{h}\in W_{h}\), corresponding via (3) to the solution \(\text {U}\in \mathbb {R}^{N}\) of the nonlinear algebraic problem (10), (11), satisfies the estimate

$$ \begin{array}{@{}rcl@{}} \| u-u_{h}\|_{h}^{}&\le& C (\varepsilon+\sigma_{0}^{-1} \{\| \boldsymbol{b}\|_{0,\infty,{\Omega}}^{2} +\| c\|_{0,\infty,{\Omega}}^{2}\}+\sigma_{0}h^{2})^{1/2} h \| u\|_{2,{\Omega}}\\ &&+ C (\varepsilon+\| \boldsymbol{b}\|_{0,\infty,{\Omega}} h +\| c\|_{0,\infty,{\Omega}}^{} h^{2})^{1/2} \vert i_{h}u\vert_{1,{\Omega}} , \end{array} $$
(16)

where the constant C is independent of h and the data of problem (1). If σ0 = 0, then the estimate deteriorates by a negative power of ε (see [6] for details). We also refer to [6] and [9] for slightly improved error estimates under various additional assumptions.

The estimate (16) does not imply any convergence in the diffusion-dominated case (when \(\varepsilon >\| \boldsymbol {b}\|_{0,\infty ,{\Omega }} h\)) and it guarantees only the convergence order 1/2 in the convection-dominated case. Numerical results presented in [6] show that this result is sharp under the general assumptions made up to now. It is of course desirable to design the artificial diffusion matrix \(\mathbb {B}(\text {U})\) in such a way that optimal convergence rates with respect to various norms are obtained. For some algebraic stabilizations, optimal convergence rates were indeed observed; however, more detailed convergence studies revealed that the convergence rates often depend on the considered meshes and data (cf. [6, 7]). The aim of this paper is to analyze some of these observations and to propose an algebraic stabilization for which optimal convergence rates can be observed in a wide range of situations, in particular, for various types of simplicial meshes.

4 Examples of algebraic stabilizations

In this section, we present three examples of algebraic stabilizations based on the papers [29], [7], and [21], respectively. All these stabilizations fit into the framework of the previous section.

4.1 Algebraic flux correction with the Kuzmin limiter

To derive an algebraic flux correction (AFC) scheme for the problem (8), (9), one first introduces the artificial diffusion matrix \(\mathbb {D}=(d_{ij})_{i,j=1}^{N}\) by

$$ d_{ij}=d_{ji}=-\max\{a_{ij},0,a_{ji}\}\qquad\forall i\neq j ,\qquad\qquad d_{ii}=-\sum\limits_{j\neq i} d_{ij} . $$

Note that this matrix possesses the properties (12)–(15). If \((\mathbb {D} \text {U})_{i}\) is added to the left-hand side of (8), one obtains a problem satisfying the DMP. However, this stabilized problem is too diffusive. Therefore, one first adds the term \((\mathbb {D} \text {U})_{i}\) to both sides of (8), uses the identity

$$ \begin{array}{@{}rcl@{}} (\mathbb{D} \text{U})_i=\sum\limits_{j=1}^N f_{ij}\qquad\text{with}\qquad f_{ij}=d_{ij} (u_j-u_i) \end{array} $$

and then, on the right-hand side, one limits those anti-diffusive fluxes \(f_{ij}\) that would otherwise cause spurious oscillations. The limiting is achieved by multiplying the fluxes by solution-dependent limiters \(\alpha_{ij}\in[0,1]\) satisfying

$$ \alpha_{ij}=\alpha_{ji} ,\qquad i,j=1,\dots,N . $$
(17)

This leads to the algebraic problem (10), (11) with

$$ b_{ij}(\text{U})=(1-\alpha_{ij}(\text{U})) d_{ij}\qquad\forall i\neq j , \qquad\quad b_{ii}(\text{U})=-\sum\limits_{j\neq i} b_{ij}(\text{U}) . $$
(18)

This matrix \((b_{ij}(\text {U}))_{i,j=1}^{N}\) satisfies the assumptions (12)–(15). A theoretical analysis of this AFC scheme concerning the solvability, local DMP, and error estimation can be found in [6] where also a detailed derivation of the scheme is presented.
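The construction of \(\mathbb{D}\), the flux identity, and the definition (18) can be illustrated by the following minimal sketch. It assumes a one-dimensional model setting that is not taken from the paper (P1 elements on a uniform mesh of (0,1), constant b, c = 0, nodes numbered from left to right instead of interior nodes first); all function names are hypothetical.

```python
def galerkin_matrix_1d(n_nodes, eps, b):
    """Full N x N matrix (a_ij) for -eps u'' + b u' with P1 elements on a uniform mesh of (0,1)."""
    h = 1.0 / (n_nodes - 1)
    A = [[0.0] * n_nodes for _ in range(n_nodes)]
    for k in range(n_nodes - 1):            # element [x_k, x_{k+1}]
        i, j = k, k + 1
        A[i][i] += eps / h - b / 2.0        # diffusion eps/h * [[1,-1],[-1,1]]
        A[i][j] += -eps / h + b / 2.0       # convection b/2 * [[-1,1],[-1,1]]
        A[j][i] += -eps / h - b / 2.0
        A[j][j] += eps / h + b / 2.0
    return A


def artificial_diffusion(A):
    """d_ij = -max{a_ij, 0, a_ji} for i != j, d_ii = -sum_{j != i} d_ij."""
    n = len(A)
    D = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                D[i][j] = -max(A[i][j], 0.0, A[j][i])
        D[i][i] = -sum(D[i][j] for j in range(n) if j != i)
    return D


def afc_matrix(D, alpha):
    """b_ij = (1 - alpha_ij) d_ij for i != j, b_ii = -sum_{j != i} b_ij, cf. (18)."""
    n = len(D)
    B = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                B[i][j] = (1.0 - alpha[i][j]) * D[i][j]
        B[i][i] = -sum(B[i][j] for j in range(n) if j != i)
    return B


A = galerkin_matrix_1d(5, 0.025, 1.0)       # eps / h = 0.1, convection-dominated
D = artificial_diffusion(A)
U = [0.0, 0.2, 0.1, 0.7, 1.0]               # arbitrary nodal values
n = len(U)
DU = [sum(D[i][j] * U[j] for j in range(n)) for i in range(n)]
flux_sums = [sum(D[i][j] * (U[j] - U[i]) for j in range(n) if j != i) for i in range(n)]
print(DU)
print(flux_sums)                             # agrees with DU up to rounding
```

With all limiters equal to 1, `afc_matrix` returns the zero matrix (the Galerkin discretization is recovered), whereas limiters equal to 0 give \(\mathbb{B}=\mathbb{D}\). In this model setting, all off-diagonal entries of \(\mathbb{A}+\mathbb{D}\) are nonpositive.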

The properties of the above-described AFC scheme significantly depend on the choice of the limiters αij. Here, we present the Kuzmin limiter proposed in [29] which was thoroughly investigated in [6] and can be considered as a standard limiter for algebraic stabilizations of steady-state convection–diffusion–reaction equations.

To define the limiter of [29], one first computes, for \(i=1,\dots ,M\),

$$ P_{i}^{+}={\underset{a_{ji}\le a_{ij}}{\underset{j\in S_{i}}{\sum}}} f_{ij}^{+} ,\quad P_{i}^{-}=\underset{a_{ji}\le a_{ij}}{\underset{j\in S_{i}}{\sum}} f_{ij}^{-} ,\!\quad Q_{i}^{+}=-\sum\limits_{j\in S_{i}} f_{ij}^{-} ,\quad Q_{i}^{-}=-\sum\limits_{j\in S_{i}} f_{ij}^{+} , $$
(19)

where \(f_{ij}=d_{ij}(u_{j}-u_{i})\), \(f_{ij}^{+}=\max \limits \{0,f_{ij}\}\), and \(f_{ij}^{-}=\min \limits \{0,f_{ij}\}\). Then, one defines

$$ R_{i}^{+}=\min\left\{1,\frac{Q_{i}^{+}}{P_{i}^{+}}\right\},\quad R_{i}^{-}=\min\left\{1,\frac{Q_{i}^{-}}{P_{i}^{-}}\right\},\qquad i=1,\dots,M . $$
(20)

If \(P_{i}^{+}\) or \(P_{i}^{-}\) vanishes, one sets \(R_{i}^{+}=1\) or \(R_{i}^{-}=1\), respectively. At Dirichlet nodes, these quantities are also set to be 1, i.e.,

$$ R_{i}^{+}=1 ,\quad R_{i}^{-}=1 ,\qquad i=M+1,\dots,N . $$
(21)

Furthermore, one sets

$$ \widetilde{\alpha}_{ij}=\left\{ \begin{array}{cl} R_{i}^{+}\quad&\text{if }f_{ij}>0 ,\\ 1\quad&\text{if }f_{ij}=0 ,\\ R_{i}^{-}\quad&\text{if }f_{ij}<0 , \end{array}\right.\qquad\qquad i,j=1,\dots,N . $$
(22)

Finally, one defines

$$ \alpha_{ij}=\alpha_{ji}=\widetilde{\alpha}_{ij}\qquad\text{if}\quad a_{ji}\le a_{ij} ,\qquad i,j=1,\dots,N . $$
(23)
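The steps (19)–(23) can be sketched as follows. This is an illustrative transcription and not the implementation used for the computations in the paper; the function name, the argument `interior` (indices of the non-Dirichlet nodes, used here instead of the numbering 1,…,M), and the small 1D model matrix are assumptions. If \(a_{ij}=a_{ji}\), both orientations of an edge qualify in (23) and the sketch simply keeps the value assigned last; under the condition \(\min\{a_{ij},a_{ji}\}\le0\) such a tie implies \(d_{ij}=0\), so that the choice does not matter.

```python
def kuzmin_limiter(A, U, interior, S):
    """Limiters alpha_ij following (19)-(23).

    A: full N x N matrix (a_ij); U: nodal values; interior: indices of the
    non-Dirichlet nodes; S[i]: list of neighbours of node i (the index set S_i).
    """
    N = len(U)

    def flux(i, j):
        d_ij = -max(A[i][j], 0.0, A[j][i])
        return d_ij * (U[j] - U[i])

    R_plus, R_minus = [1.0] * N, [1.0] * N          # (21): value 1 at Dirichlet nodes
    for i in interior:
        selected = [j for j in S[i] if A[j][i] <= A[i][j]]
        P_plus = sum(max(0.0, flux(i, j)) for j in selected)
        P_minus = sum(min(0.0, flux(i, j)) for j in selected)
        Q_plus = -sum(min(0.0, flux(i, j)) for j in S[i])
        Q_minus = -sum(max(0.0, flux(i, j)) for j in S[i])
        if P_plus != 0.0:                           # otherwise R_i^+ = 1
            R_plus[i] = min(1.0, Q_plus / P_plus)
        if P_minus != 0.0:                          # otherwise R_i^- = 1
            R_minus[i] = min(1.0, Q_minus / P_minus)

    alpha = [[1.0] * N for _ in range(N)]
    for i in range(N):
        for j in S[i]:
            if A[j][i] <= A[i][j]:
                f = flux(i, j)
                # (22) evaluated at node i, then copied to both entries as in (23)
                a = R_plus[i] if f > 0.0 else (R_minus[i] if f < 0.0 else 1.0)
                alpha[i][j] = alpha[j][i] = a
    return alpha


# Assumed 1D model matrix: off-diagonal entries of the P1 Galerkin matrix for
# -eps u'' + u' on a uniform mesh with eps / h = 0.01 (the diagonal is not used).
n, e = 5, 0.01
A = [[0.0] * n for _ in range(n)]
for i in range(n):
    A[i][i] = 2.0 * e
    if i > 0:
        A[i][i - 1] = -e - 0.5
    if i < n - 1:
        A[i][i + 1] = -e + 0.5
S = [[j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)]
interior = [1, 2, 3]

alpha_linear = kuzmin_limiter(A, [0.0, 1.0, 2.0, 3.0, 4.0], interior, S)
alpha_step = kuzmin_limiter(A, [0.0, 0.0, 0.0, 1.0, 1.0], interior, S)
print(alpha_linear[1][2], alpha_step[2][3])
```

In this model setting, all limiters are equal to 1 for the linear nodal values, so that no artificial diffusion is added, whereas the limiter on the edge carrying the jump of the step profile vanishes, so that the full artificial diffusion is applied there.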

It was proved in [6] that \(\alpha_{ij}(\text{U})(u_{j}-u_{i})\) are continuous functions of \(\text {U}\in \mathbb {R}^{N}\) so that the assumption (A1) is satisfied for \(b_{ij}(\text{U})\) defined by (18) with the Kuzmin limiter. The validity of (A2) was proved in [25] under the assumption

$$ \min\{a_{ij},a_{ji}\}\le0\qquad \forall i=1,\dots,M , j=1,\dots,N , i\neq j . $$
(24)

On the other hand, it was shown in [25] that the DMP generally does not hold if the condition (24) is not satisfied. Since the convection matrix is skew-symmetric, the condition (24) can be violated if the diffusion matrix has large positive entries (which may occur if the angles between facets of \({\mathscr{T}}_{h}\) exceed π/2) or if the reaction coefficient c is large. As a remedy for the latter case, a lumping of the reaction term was considered in [6]. This, however, may increase the smearing of layers as demonstrated in [21]. Let us mention that, in the two-dimensional case and for c = 0 or a lumped reaction term, the validity of (24) is guaranteed for Delaunay meshes (i.e., meshes where the sum of any pair of angles opposite a common edge is smaller than, or equal to, π).

Since it is desirable that the DMP holds on arbitrary simplicial meshes and without a lumping of the reaction term, it is necessary to apply other limiters or different algebraic stabilizations. This will be the subject of the following two sections.

4.2 Algebraic flux correction with the BJK limiter

In this section, we again consider an AFC scheme, i.e., the matrix \(\mathbb {B}(\text {U})\) in (10), (11) is defined by (18). A small difference to the previous section is that the matrix \((a_{ij})_{i,j=1}^{N}\) is modified by

$$ a_{ji}:=0\quad\text{if}\quad a_{ij}<0 ,\qquad i=1,\dots,M , j=M+1,\dots,N . $$

This modification affects only the definition of the matrix \(\mathbb {D}\) and reduces the amount of artificial diffusion introduced by the algebraic stabilization. We shall describe the so-called BJK limiter proposed in [7] using some ideas of [31]. The definition of this limiter is inspired by the Zalesak algorithm [37] for the time-dependent case.

The definition of the limiter again relies on local quantities \(P_{i}^{+}\), \(P_{i}^{-}\), \(Q_{i}^{+}\), \(Q_{i}^{-}\) which are now computed for \(i=1,\dots ,M\) by

$$ \begin{array}{@{}rcl@{}} P_{i}^{+}&=&\sum\limits_{j\in S_{i}} f_{ij}^{+} ,\qquad P_{i}^{-}=\sum\limits_{j\in S_{i}} f_{ij}^{-} , \end{array} $$
(25)
$$ \begin{array}{@{}rcl@{}} Q_{i}^{+}&=&q_{i} (u_{i}-u_{i}^{\max}) ,\qquad Q_{i}^{-}=q_{i} (u_{i}-u_{i}^{\min}) , \end{array} $$
(26)

where again \(f_{ij}=d_{ij}(u_{j}-u_{i})\) and

$$ u_{i}^{\max}= \underset{j\in S_{i}\cup\{i\}}{\max} u_{j} ,\qquad u_{i}^{\min}= \underset{j\in S_{i}\cup\{i\}}{\min} u_{j} ,\qquad q_{i}=\sum\limits_{j\in S_{i}} d_{ij} . $$

Then, one defines

$$ R_{i}^{+}=\min\left\{1,\frac{\mu_{i} Q_{i}^{+}}{P_{i}^{+}}\right\},\quad R_{i}^{-}=\min\left\{1,\frac{\mu_{i} Q_{i}^{-}}{P_{i}^{-}}\right\},\qquad i=1,\dots,M , $$
(27)

with fixed constants μi > 0. If \(P_{i}^{+}\) or \(P_{i}^{-}\) vanishes, one again sets \(R_{i}^{+}=1\) or \(R_{i}^{-}=1\), respectively. The definition (21) of \(R_{i}^{\pm }\) at Dirichlet nodes is applied, too, and one again defines the factors \(\widetilde {\alpha }_{ij}\) by (22). Finally, the limiter functions are defined by

$$ \alpha_{ij}=\min\{\widetilde{\alpha}_{ij},\widetilde{\alpha}_{ji}\} , \qquad i,j=1,\dots,N . $$
(28)
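A minimal sketch of the steps (25)–(28) might look as follows. It is not the reference implementation of [7]: the function name and the argument `interior` are assumptions, the matrix \(\mathbb{D}\) is passed in directly (so the modification of \((a_{ij})\) described at the beginning of this section is assumed to have been applied already), and the test matrix is a hypothetical tridiagonal matrix mimicking a uniform 1D mesh.

```python
def bjk_limiter(D, U, interior, S, mu):
    """Limiters alpha_ij following (25)-(28) with (21) and (22).

    D: artificial diffusion matrix; U: nodal values; interior: indices of the
    non-Dirichlet nodes; S[i]: neighbours of node i; mu[i]: constants mu_i > 0.
    """
    N = len(U)

    def flux(i, j):
        return D[i][j] * (U[j] - U[i])

    R_plus, R_minus = [1.0] * N, [1.0] * N          # (21): value 1 at Dirichlet nodes
    for i in interior:
        P_plus = sum(max(0.0, flux(i, j)) for j in S[i])
        P_minus = sum(min(0.0, flux(i, j)) for j in S[i])
        q = sum(D[i][j] for j in S[i])
        u_max = max(U[j] for j in S[i] + [i])
        u_min = min(U[j] for j in S[i] + [i])
        Q_plus = q * (U[i] - u_max)
        Q_minus = q * (U[i] - u_min)
        if P_plus != 0.0:                           # otherwise R_i^+ = 1
            R_plus[i] = min(1.0, mu[i] * Q_plus / P_plus)
        if P_minus != 0.0:                          # otherwise R_i^- = 1
            R_minus[i] = min(1.0, mu[i] * Q_minus / P_minus)

    def alpha_tilde(i, j):                          # (22)
        f = flux(i, j)
        return R_plus[i] if f > 0.0 else (R_minus[i] if f < 0.0 else 1.0)

    alpha = [[1.0] * N for _ in range(N)]
    for i in range(N):
        for j in S[i]:
            alpha[i][j] = min(alpha_tilde(i, j), alpha_tilde(j, i))   # (28)
    return alpha


# Hypothetical artificial diffusion matrix on a uniform 1D mesh with 5 nodes
n, d = 5, 0.49
D = [[0.0] * n for _ in range(n)]
for i in range(n - 1):
    D[i][i + 1] = D[i + 1][i] = -d
for i in range(n):
    D[i][i] = -sum(D[i][j] for j in range(n) if j != i)
S = [[j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)]
interior = [1, 2, 3]
mu = [1.0] * n

alpha_linear = bjk_limiter(D, [0.0, 1.0, 2.0, 3.0, 4.0], interior, S, mu)
alpha_step = bjk_limiter(D, [0.0, 0.0, 0.0, 1.0, 1.0], interior, S, mu)
print(alpha_linear[1][2], alpha_step[2][3])
```

Since (28) takes the minimum of the two nodal factors, the resulting limiter is symmetric by construction, in contrast to (23), where the symmetry is enforced by selecting one end point of the edge.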

The validity of the Assumptions (A1) and (A2) was proved in [7] without any additional assumptions on the matrix \((a_{ij})_{i,j=1}^{N}\). Thus, in particular, the DMP holds for arbitrary simplicial meshes and any nonnegative reaction coefficient c. Moreover, it was shown in [7] that the constants μi can be defined in such a way that the AFC scheme with the BJK limiter is linearity preserving, i.e., \(\mathbb {B}(u)=0\) for \(u\in P_{1}(\mathbb {R}^{d})\). This property may lead to improved convergence results (see, e.g., [4, 9]).

To formulate a sufficient condition for the linearity preservation, we introduce the patches

$$ {\Delta}_{i} = \cup\{ T\in\mathscr{T}_{h} : x_{i}\in T\} ,\qquad i=1,\dots,M , $$
(29)

consisting of simplices from \({\mathscr{T}}_{h}\) sharing the vertex xi. Then, the AFC scheme with the BJK limiter is linearity preserving if

$$ \mu_{i}\ge\frac{\underset{x_{j}\in\partial{\Delta}_{i}}{\max} \vert x_{i}-x_{j}\vert} {\text{dist}(x_{i},\partial{\Delta}_{i}^{\text{conv}})} , \qquad\quad i=1,\dots,M , $$
(30)

where \({\Delta }_{i}^{\text {conv}}\) is the convex hull of \({\Delta}_{i}\). It was also proved in [7] that it suffices to set \(\mu_{i}=1\) if the patch \({\Delta}_{i}\) is symmetric with respect to the vertex \(x_{i}\). Note that large values of the constants \(\mu_{i}\) cause more limiters \(\alpha_{ij}\) to be equal to 1, so that less artificial diffusion is added, which makes it possible to obtain sharp approximations of layers. On the other hand, large values of \(\mu_{i}\) also make the numerical solution of the nonlinear algebraic problem more involved.

4.3 Monotone upwind-type algebraically stabilized method

Although the BJK limiter presented in the previous section has nice theoretical properties, numerical experiments revealed that it also has some drawbacks in comparison with the Kuzmin limiter. In particular, the nonlinear algebraic problems are much more difficult to solve and the approximate solutions are sometimes less accurate away from layers (see [17, 22]). Therefore, another approach based on the Kuzmin limiter was developed in [21, 27], which will be presented in this section.

As we mentioned in Section 4.1, the DMP generally does not hold for the AFC scheme with the Kuzmin limiter if the condition (24) is not satisfied. The need of (24) for proving the Assumption (A2) is a consequence of the condition \(a_{ji}\le a_{ij}\) used in (23) to symmetrize the factors \(\widetilde {\alpha }_{ij}\). A possible remedy is to replace (23) by (28) and to define \(P_{i}^{\pm }\) by (25). Then, the DMP is satisfied without any additional condition on the matrix \((a_{ij})_{i,j=1}^{N}\) but the method is more diffusive than the scheme from Section 4.1 (see [25]).

The inequality \(a_{ji}<a_{ij}\) often means that the vertex \(x_{i}\) lies in the upwind direction with respect to the vertex \(x_{j}\) (see [25] for a discussion on this topic). Consequently, the use of the inequality \(a_{ji}\le a_{ij}\) in (23) means that \(\alpha_{ij}=\alpha_{ji}\) is defined using quantities computed at the upwind vertex of the edge with end points \(x_{i}\), \(x_{j}\). It turns out that this feature has a positive influence on the quality of the approximate solutions and on the convergence of the iterative process for solving the nonlinear problem (10), (11).

In order to obtain a method possessing the mentioned upwind feature and satisfying the DMP on arbitrary simplicial meshes, the definition of the matrix \(\mathbb {B}(\text {U})\) was changed in [21] to

$$ \begin{array}{@{}rcl@{}} b_{ij}(\text{U}) &=&-\max\{\beta_{ij}(\text{U}) a_{ij},0,\beta_{ji}(\text{U}) a_{ji}\} ,\qquad i,j=1,\dots,N , i\neq j , \end{array} $$
(31)
$$ \begin{array}{@{}rcl@{}} b_{ii}(\text{U}) &=&-\sum\limits_{j\neq i} b_{ij}(\text{U}) ,\qquad i=1,\dots,N , \end{array} $$
(32)

with some solution-dependent factors \(\beta_{ij}(\text{U})\in[0,1]\). This matrix again satisfies the assumptions (12)–(15) but, in contrast to (18), the formula (31) leads to a symmetric matrix \(\mathbb {B}(\text {U})\) even if the factors \(\beta_{ij}\) are not symmetric. This makes it possible to get rid of the symmetry condition (17).
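The symmetry of the matrix defined by (31) and (32) can be checked directly. In the following sketch, the function name `muas_matrix` and the small matrices `A` and `beta` are hypothetical; the values are chosen only so that `beta` is visibly nonsymmetric and are not related to any finite element discretization.

```python
def muas_matrix(A, beta):
    """b_ij = -max{beta_ij a_ij, 0, beta_ji a_ji} for i != j, b_ii = -sum_{j != i} b_ij."""
    n = len(A)
    B = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                B[i][j] = -max(beta[i][j] * A[i][j], 0.0, beta[j][i] * A[j][i])
        B[i][i] = -sum(B[i][j] for j in range(n) if j != i)
    return B


# Hypothetical data: a nonsymmetric matrix (a_ij) and nonsymmetric factors beta_ij in [0,1]
A = [[1.0, 0.4, -0.2],
     [-0.6, 1.0, 0.3],
     [0.1, -0.5, 1.0]]
beta = [[0.0, 0.9, 0.2],
        [0.1, 0.0, 0.5],
        [0.7, 0.3, 0.0]]
B = muas_matrix(A, beta)
for row in B:
    print(row)
```

The symmetry is a consequence of the fact that the maximum in (31) is taken over the same three numbers for the index pairs (i,j) and (j,i).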

If the condition (24) is satisfied, then

$$ b_{ij}(\text{U})=\left\{ \begin{array}{cl} \beta_{ij}(\text{U}) d_{ij}\quad&\text{if }a_{ji}\le a_{ij} ,\\ \beta_{ji}(\text{U}) d_{ij}\quad&\text{otherwise} , \end{array}\right. $$

for \(i=1,\dots ,M\) and \(j=1,\dots ,N\) with \(i\neq j\). Thus, in this case, the definition (31) implicitly comprises the favorable upwind feature discussed above and the method (10), (11) can be again written in the form of an AFC scheme. Moreover, if the functions \(\beta_{ij}\) form a symmetric matrix and \(\alpha_{ij}=1-\beta_{ij}\), then the definitions (18) and (31), (32) are equivalent.

Thus, let us consider the algebraic problem (10), (11) with the artificial diffusion matrix given by (31) and (32) and with any functions \(\beta _{ij}:\mathbb {R}^{N}\to [0,1]\) satisfying, for any \(i,j\in \{1,\dots ,N\}\),

$$ \text{if } a_{ij}>0, \text{ then } \beta_{ij}(\text{U})(u_{j}-u_{i}) \text{ is a continuous function of } \text{U}\in\mathbb{R}^{N} . $$
(33)

Then, one has the following existence result.

Theorem 3

Let the matrix \((b_{ij}(\mathrm {U}))_{i,j=1}^{N}\) be defined by (31) and (32) with functions \(\beta _{ij}:\mathbb {R}^{N}\to [0,1]\) satisfying (33) for any \(i,j\in \{1,\dots ,N\}\). Then, Assumption (A1) is satisfied and the nonlinear algebraic problem (10), (11) has a solution.

Proof

See [21].□

Rewriting the definition of the Kuzmin limiter under the condition (24), the following definition of βij was introduced in [21]. First, for any \(i\in \{1,\dots ,M\}\), one computes

$$ \begin{array}{@{}rcl@{}} P_{i}^{+}&=&\underset{a_{ij}>0}{\underset{j\in S_{i}}{\sum}} a_{ij} (u_{i}-u_{j})^{+} ,\qquad\quad P_{i}^{-}=\underset{a_{ij}>0}{\underset{j\in S_{i}}{\sum}} a_{ij} (u_{i}-u_{j})^{-} , \end{array} $$
(34)
$$ \begin{array}{@{}rcl@{}} Q_{i}^{+}&=&\sum\limits_{j\in S_{i}} s_{ij} (u_{j}-u_{i})^{+} ,\qquad\quad Q_{i}^{-}=\sum\limits_{j\in S_{i}} s_{ij} (u_{j}-u_{i})^{-} , \end{array} $$
(35)

with

$$ s_{ij}=\max\{\vert a_{ij}\vert,a_{ji}\} . $$

Then, one defines \(R_{i}^{\pm }\) by (20) and (21), and sets

$$ \beta_{ij}=\left\{ \begin{array}{ll} 1-R_{i}^{+}\quad&\text{if }u_{i}>u_{j} ,\\ 0\quad&\text{if }u_{i}=u_{j} ,\\ 1-R_{i}^{-}\quad&\text{if }u_{i}<u_{j} , \end{array}\right.\qquad\qquad i,j=1,\dots,N . $$
(36)
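The definition (34)–(36) of the factors \(\beta_{ij}\) can be sketched as follows. The sketch assumes that, for real numbers, \((x)^{+}=\max\{x,0\}\) and \((x)^{-}=\min\{x,0\}\), in analogy to \(f_{ij}^{\pm}\); the function name, the argument `interior`, and the 1D model matrix are assumptions made for illustration only.

```python
def muas_beta(A, U, interior, S):
    """Factors beta_ij following (34)-(36) with R_i^+, R_i^- given by (20) and (21)."""
    N = len(U)

    def pos(x):
        return max(x, 0.0)

    def neg(x):
        return min(x, 0.0)

    R_plus, R_minus = [1.0] * N, [1.0] * N          # (21): value 1 at Dirichlet nodes
    for i in interior:
        P_plus = sum(A[i][j] * pos(U[i] - U[j]) for j in S[i] if A[i][j] > 0.0)
        P_minus = sum(A[i][j] * neg(U[i] - U[j]) for j in S[i] if A[i][j] > 0.0)
        s = {j: max(abs(A[i][j]), A[j][i]) for j in S[i]}
        Q_plus = sum(s[j] * pos(U[j] - U[i]) for j in S[i])
        Q_minus = sum(s[j] * neg(U[j] - U[i]) for j in S[i])
        if P_plus != 0.0:                           # otherwise R_i^+ = 1
            R_plus[i] = min(1.0, Q_plus / P_plus)
        if P_minus != 0.0:                          # otherwise R_i^- = 1
            R_minus[i] = min(1.0, Q_minus / P_minus)

    beta = [[0.0] * N for _ in range(N)]            # beta_ij = 0 if u_i = u_j
    for i in range(N):
        for j in range(N):
            if U[i] > U[j]:
                beta[i][j] = 1.0 - R_plus[i]
            elif U[i] < U[j]:
                beta[i][j] = 1.0 - R_minus[i]
    return beta


# Assumed 1D model matrix: off-diagonal entries of the P1 Galerkin matrix for
# -eps u'' + u' on a uniform mesh with eps / h = 0.01.
n, e = 5, 0.01
A = [[0.0] * n for _ in range(n)]
for i in range(n):
    A[i][i] = 2.0 * e
    if i > 0:
        A[i][i - 1] = -e - 0.5
    if i < n - 1:
        A[i][i + 1] = -e + 0.5
S = [[j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)]
interior = [1, 2, 3]

beta_linear = muas_beta(A, [0.0, 1.0, 2.0, 3.0, 4.0], interior, S)
beta_step = muas_beta(A, [0.0, 0.0, 0.0, 1.0, 1.0], interior, S)
print(beta_step[2][3], beta_step[3][2])             # the factors are not symmetric
```

For the step profile, the two factors belonging to the edge with the jump differ, which illustrates why the symmetric formula (31) is needed.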

It was proved in [21] that the resulting method satisfies the Assumptions (A1) and (A2) without any additional assumptions on the matrix \((a_{ij})_{i,j=1}^{N}\). Thus, the DMP holds on arbitrary simplicial meshes and for any nonnegative reaction coefficient c. Due to the above-discussed upwind feature, the name Monotone Upwind-type Algebraically Stabilized (MUAS) method was introduced in [21].

If the condition (24) holds, then the only difference between the MUAS method and the AFC scheme with the Kuzmin limiter is the definition of \(Q_{i}^{\pm }\) since the relations (19) give (35) with sij = |dij|. In the convection-dominated regime, the difference is negligible and both methods lead to almost the same results. Therefore, the MUAS method preserves the advantages of the AFC scheme from Section 4.1 which are available under the condition (24). Note that, without the assumption (24), the application of the AFC scheme with the Kuzmin limiter does not make much sense since the main goal of the AFC, i.e., the validity of the DMP, is not achieved in general. In the diffusion-dominated case, the use of sij instead of |dij| may improve the accuracy and convergence behavior when non-Delaunay meshes are used (see [21]).

5 Numerical and analytical studies of AFC schemes

The convergence properties of the AFC scheme with the Kuzmin limiter from Section 4.1 were thoroughly tested in [6] for various grids and the following example.

Example 1

Problem (1) is considered with \({\Omega}=(0,1)^{2}\), with different values of ε, and with \({\boldsymbol{b}}=(3,2)^{T}\), c = 1, ub = 0, and the right-hand side g chosen so that

$$ u(x,y) = 100 x^{2} (1-x)^{2} y (1-y) (1-2y) $$

is the solution of (1).
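To reproduce Example 1, the right-hand side g has to be obtained by inserting the prescribed solution into (1). A minimal sketch of this step is given below; it writes u = 100 X(x) Y(y) with X = x²(1 − x)² and Y = y(1 − y)(1 − 2y) and differentiates the two factors by hand. The function names `u_exact` and `rhs_g` are hypothetical and do not come from [6].

```python
def u_exact(x, y):
    """Prescribed solution of Example 1."""
    return 100.0 * x**2 * (1 - x)**2 * y * (1 - y) * (1 - 2*y)


def rhs_g(x, y, eps, b=(3.0, 2.0), c=1.0):
    """g = -eps*Laplace(u) + b.grad(u) + c*u for u = 100*X(x)*Y(y)."""
    X = x**2 - 2*x**3 + x**4          # x^2 (1-x)^2 expanded
    dX = 2*x - 6*x**2 + 4*x**3
    ddX = 2 - 12*x + 12*x**2
    Y = y - 3*y**2 + 2*y**3           # y (1-y) (1-2y) expanded
    dY = 1 - 6*y + 6*y**2
    ddY = -6 + 12*y
    laplace = ddX * Y + X * ddY
    convection = b[0] * dX * Y + b[1] * X * dY
    return 100.0 * (-eps * laplace + convection + c * X * Y)


# Sample evaluation with eps = 1 so that the diffusive part is visible.
g_sample = rhs_g(0.5, 0.25, eps=1.0)
```

Since both X and Y vanish at 0 and 1, the function `u_exact` is zero on the whole boundary, in agreement with ub = 0.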

The coarsest levels of the grids considered in [6] are shown in Fig. 1. Grids 1, 2, and 3 were refined uniformly whereas Grid 4 was always obtained from Grid 1 by changing the directions of the diagonals in even rows of squares (counted from below). Grid 5 was obtained from Grid 4 by shifting interior nodes to the right by one tenth of the horizontal mesh width on each even horizontal mesh line. Note that Grids 3 and 5 are not of Delaunay type.

Fig. 1 Grids 1–5 (left to right)

Errors of the approximate solutions of Example 1 with \(\varepsilon=10^{-8}\) computed using the AFC scheme with the Kuzmin limiter for Grid 1 can be seen in Table 1. The results slightly differ from those in [6] since, in contrast to the present paper, a lumping of the reaction term was applied in [6]. The value of ne represents the number of edges along one horizontal mesh line (thus, ne = 4 for Grid 1 in Fig. 1). One observes the usual optimal orders of convergence with respect to the \(L^{2}\) norm and the \(H^{1}\) seminorm. Moreover, the convergence order with respect to the norm \(\|\cdot\|_{h}\) is much higher than predicted by (16). However, if the computation is repeated on Grid 4, one observes in Table 2 that the convergence orders with respect to all three norms deteriorate by 1 and, in particular, there is no error reduction with respect to the \(H^{1}\) seminorm. A similar behavior can be observed for Grids 3 and 5. For Grid 2, the deterioration of the convergence is less pronounced but the convergence orders are also far from being optimal (see [6] for the case with a lumped reaction term leading to similar results). Let us mention that, in all these computations, the matrix \((a_{ij})_{i,j=1}^{N}\) satisfies the condition (24), which guarantees the validity of the DMP.

Table 1 Example 1: \(\varepsilon=10^{-8}\), numerical results for Grid 1 computed using the AFC scheme with the Kuzmin limiter
Table 2 Example 1: \(\varepsilon=10^{-8}\), numerical results for Grid 4 computed using the AFC scheme with the Kuzmin limiter

On the other hand, there are various grids for which optimal convergence orders can be observed. Examples of such grids are given in Fig. 2. The finer variants of Grid 6 are obtained by uniform refinement like for Grid 1 whereas Grid 7 is obtained from Grid 6 by changing the directions of some of the diagonals. Grid 8 is obtained from Grid 6 by adding the second diagonal in each small square. Finer variants of Grid 9 are also not constructed by refining the coarse level but each level is constructed separately (cf. the rightmost grid in Fig. 2). Obviously, the basic difference between Grids 2–5 and Grids 1 and 6–9 is that, in the latter case, (most of) the patches Δi defined by (29) are symmetric, i.e., for any vertex \(x_{j}\in {\Delta}_{i}\) there exists a vertex \(x_{k}\in {\Delta}_{i}\) such that (xj + xk)/2 = xi. Thus, it seems that the local symmetry of the grids is important for optimal convergence rates.

Fig. 2 Grids 6–9 (left to right) and a finer variant of Grid 9 (rightmost)

To understand why the approximate solutions on Grid 4 do not converge in the \(H^{1}\) seminorm, let us have a look at the graphs of some of these solutions. Figure 3 (left) shows the solution computed for ne = 32 and it can be seen that the solution is polluted by an oscillating component (for the sake of clarity, the solution is drawn only along grid lines of Grid 4 which are parallel to the coordinate axes). This is also clearly seen from Fig. 3 (right) which shows the wildly oscillating error \(u_{h}-i_{h}u\), where \(i_{h}\) is the usual Lagrange interpolation operator. The observed structure of the solution is preserved on finer meshes as well. Figure 4 shows the errors \(u_{h}-i_{h}u\) along the line x = 0.25 on three successive meshes and indicates that the \(H^{1}\) seminorm of the error will not change significantly when switching to finer meshes (notice the different scales on the vertical axes). It should be mentioned that this type of oscillations does not represent a violation of the DMP. The oscillatory behavior of the approximate solutions suggests that the accuracy might be improved by a local averaging. This is indeed possible but the convergence rates generally still remain suboptimal.

Fig. 3 Example 1: \(\varepsilon=10^{-8}\), approximate solution computed using the AFC scheme with the Kuzmin limiter on Grid 4 with ne = 32 (left) and the corresponding error \(u_{h}-i_{h}u\) (right)

Fig. 4 Example 1: \(\varepsilon=10^{-8}\), errors \(u_{h}-i_{h}u\) along the line x = 0.25 for approximate solutions computed using the AFC scheme with the Kuzmin limiter on Grid 4 with ne = 32, ne = 64 and ne = 128 (left to right)

Figures 3 and 4 explain why the \(H^{1}\) seminorm of the error of the approximate solution does not tend to zero on Grid 4 and now the main question is why the observed oscillations are not suppressed by the algebraic stabilization. To answer this question, we shall consider simpler examples than Example 1. We start with the following almost trivial case.

Example 2

Problem (1) is considered with \({\Omega}=(0,1)^{2}\), \(\varepsilon=10^{-8}\), \({\boldsymbol{b}}=(1,0)^{T}\), c = 0, g = 1, and ub(x,y) = x.

Of course, the exact solution of this example is u(x,y) = x and hence the Galerkin FEM gives the exact solution on any mesh. However, if one applies the AFC scheme with the Kuzmin limiter on Grid 4, one obtains the oscillating solution shown in Fig. 5 (left). Again, the structure of the solution is preserved on finer meshes. Moreover, numerical tests show that the size of the oscillations is proportional to h so that one can again expect that the \(H^{1}\) seminorm of the error will not tend to zero. This is confirmed by the results shown in Table 3.

Fig. 5 Approximate solutions computed using the AFC scheme with the Kuzmin limiter on Grid 4 with ne = 20: Example 2 (left) and Example 3 computed with the modification (47), (48) (right)

Table 3 Example 2: numerical results for Grid 4 computed using the AFC scheme with the Kuzmin limiter

Before we start our analytical investigations of this surprising observation, let us have a closer look at Grid 4 and the matrix entries corresponding to Example 2. First, note that all the patches Δi defined in (29) have the same geometry for Grid 4 but they possess two types of orientation with respect to the constant convection vector b. This can be seen in Fig. 6 where a part of Grid 4 is shown. One type of orientation of the patches is represented by the patch around the node A and the other one by the patch around the node B. Note that A lies on an even horizontal grid line and B on an odd horizontal grid line.

Fig. 6 A part of Grid 4

In the previous sections, we referred to the nodes xi of the triangulation through their indices i. In the following, it will be more convenient to use directly the notation for nodes. Thus, for example, the matrix entry aij will be denoted by aAB if xi = A and xj = B. Then, the linear system (8), (9), can be written in the form

$$ \sum\limits_{Q\in\mathscr{I}(P)} a_{PQ} u_{Q}=g_{P}\quad\forall P{\in\mathscr{N}_{h}^{i}} ,\qquad u_{P}={u^{b}_{P}}\quad\forall P{\in\mathscr{N}_{h}^{b}} , $$
(37)

where \({{\mathscr{N}}_{h}^{i}}\) is the set of interior nodes of \({\mathscr{T}}_{h}^{}\), \({{\mathscr{N}}_{h}^{b}}\) is the set of boundary nodes of \({\mathscr{T}}_{h}^{}\), and \({\mathscr{I}}(P){\subset {\mathscr{N}}_{h}^{i}}{\cup {\mathscr{N}}_{h}^{b}}\) consists of the node P and all nodes connected to P by edges of \({\mathscr{T}}_{h}^{}\). Note that \({\mathscr{I}}(A)=\{A,B,C,D,E,F,G\}\) and \({\mathscr{I}}(B)=\{B,A,G,H,I,J,C\}\) in the case depicted in Fig. 6.

Considering the notation introduced in Fig. 6 and the data of Example 2, the entries of the Galerkin matrix \((a_{ij})_{i,j=1}^{N}\) defined in (5) are given by

$$ \begin{array}{@{}rcl@{}} a_{AB}&=&a_{AD}=a_{HB}=-\varepsilon+\frac{h}6 ,\qquad a_{AC}=a_{FA}=a_{BJ}=a_{GB}=-\varepsilon+\frac{h}{3} ,\\ a_{BA}&=&a_{DA}=a_{BH}=-\varepsilon-\frac{h}6 ,\qquad a_{CA}=a_{AF}=a_{JB}=a_{BG}=-\varepsilon-\frac{h}{3} ,\\ a_{EA}&=&a_{GA}=a_{BC}=a_{BI}=\frac{h}6 ,\qquad a_{AA}=4 \varepsilon ,\\ a_{AE}&=&a_{AG}=a_{CB}=a_{IB}=-\frac{h}6 ,\quad a_{BB}=4 \varepsilon , \end{array} $$

where h is the mesh width in the directions of the coordinate axes. In our analytical considerations, it will always be assumed that

$$ \varepsilon<\frac{h}9 . $$
(38)

Since the data of Example 2 are constant and the triangulation is uniform, the matrix entries do not depend on the actual position of the nodes A and B. The above values of the entries of the Galerkin matrix imply that the relations for \(P_{i}^{\pm }\) from (19) can be written in the form

$$ P_{A}^{\pm}=f_{AB}^{\pm}+f_{AC}^{\pm}+f_{AD}^{\pm} ,\qquad P_{B}^{\pm}=f_{BI}^{\pm}+f_{BJ}^{\pm}+f_{BC}^{\pm} . $$
(39)

Finally, note that, under the assumption (38), one has

$$ \begin{array}{@{}rcl@{}} d_{AB}&=&d_{AD}=\varepsilon-\frac{h}6 ,\qquad d_{AC}=d_{AF}=\varepsilon-\frac{h}{3} ,\qquad d_{AE}=d_{AG}=-\frac{h}6 ,\\ d_{BA}&=&d_{BH}=\varepsilon-\frac{h}6 ,\qquad d_{BG}=d_{BJ}=\varepsilon-\frac{h}{3} ,\qquad d_{BI}=d_{BC}=-\frac{h}6 . \end{array} $$

A closer look at Fig. 5 (left) reveals that, in a large part of the computational domain, the discrete solution uh is approximately given by

$$ \begin{array}{@{}rcl@{}} u_{h}(x,y)&=&x+\alpha\qquad \text{along odd horizontal grid lines} , \end{array} $$
(40)
$$ \begin{array}{@{}rcl@{}} u_{h}(x,y)&=&x-\beta\qquad \text{along even horizontal grid lines} , \end{array} $$
(41)

where α and β are positive constants. A direct computation shows that the nodal values of this function satisfy

$$ \sum\limits_{Q\in\mathscr{I}(A)} a_{AQ} u_{Q}=h^{2}-2 \delta \varepsilon ,\qquad \sum\limits_{Q\in\mathscr{I}(B)} a_{BQ} u_{Q}=h^{2}+2 \delta \varepsilon , $$
(42)

where δ = α + β. For the data of Example 2, one has \(g_{P}=h^{2}\) in (37) and hence one observes that the function uh given by (40) and (41) satisfies the Galerkin discretization up to the perturbation 2δε. This leads us to the surprising conclusion that, for the oscillating solution shown in Fig. 5 (left), the AFC stabilization term should be nearly zero.
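The direct computation behind (42) can be repeated in exact rational arithmetic. The sketch below uses the matrix entries listed above; the node positions are an assumed reading of Fig. 6 inferred from these entries (B and D, respectively A and H, are vertical neighbors, C and F, respectively J and G, are horizontal neighbors, and the diagonal neighbors E, G of A lie to the left while the diagonal neighbors C, I of B lie to the right). The concrete values of h, ε, α, and β are arbitrary test data.

```python
from fractions import Fraction as Fr

h, eps = Fr(1, 20), Fr(1, 1000)        # any values with eps < h/9, cf. (38)
alpha, beta = Fr(1, 50), Fr(1, 100)    # arbitrary positive shifts in (40), (41)
delta = alpha + beta

# Rows of the Galerkin matrix at A and B, copied from the list of entries.
row_A = {"A": 4*eps, "B": -eps + h/6, "D": -eps + h/6, "C": -eps + h/3,
         "F": -eps - h/3, "E": -h/6, "G": -h/6}
row_B = {"B": 4*eps, "A": -eps - h/6, "H": -eps - h/6, "J": -eps + h/3,
         "G": -eps - h/3, "C": h/6, "I": h/6}

# Assumed positions: (x-coordinate, index of the horizontal grid line);
# A lies on an even line and B on an odd line.
nodes = {"A": (0, 0), "C": (h, 0), "F": (-h, 0), "D": (0, -1), "E": (-h, -1),
         "B": (0, 1), "G": (-h, 1), "J": (h, 1), "H": (0, 2), "I": (h, 2)}


def u_h(name):
    """Nodal value of the function (40), (41)."""
    x, line = nodes[name]
    return x + alpha if line % 2 else x - beta


lhs_A = sum(a * u_h(q) for q, a in row_A.items())
lhs_B = sum(a * u_h(q) for q, a in row_B.items())
```

With these data, `lhs_A` and `lhs_B` coincide with h² − 2δε and h² + 2δε, respectively, so the defect with respect to gP = h² is of order δε only.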

Thus, let us investigate the AFC stabilization term when it is applied to a function satisfying (40) and (41). If we consider \(\delta\le h\), then

$$ \begin{array}{@{}rcl@{}} f_{AB}&\le&0 ,\quad f_{AC} \le0 ,\quad f_{AD} \le0 ,\quad f_{AE}\ge0 ,\quad f_{AF} \ge0 ,\quad f_{AG} \ge0 , \end{array} $$
(43)
$$ \begin{array}{@{}rcl@{}} f_{BI}&\le&0 ,\quad f_{BJ}\le0 ,\quad f_{BC} \le0 ,\quad f_{BA}\ge0 ,\quad f_{BG}\ge0 ,\quad f_{BH} \ge0 , \end{array} $$
(44)

which together with (39) implies that \(P_{A}^{+}=P_{B}^{+}=0\) and hence \(R_{A}^{+}=R_{B}^{+}=1\). Furthermore, it follows from (19), (43), and (44) that

$$ Q_{A}^{-}=-f_{AE}^{+}-f_{AF}^{+}-f_{AG}^{+} ,\qquad Q_{B}^{-}=-f_{BA}^{+}-f_{BG}^{+}-f_{BH}^{+} . $$
(45)

Then, a direct computation gives

$$ \begin{array}{@{}rcl@{}} P_{A}^{-}&=&Q_{B}^{-}=\varepsilon (h+2 \delta)-\frac{h}{3} (h+\delta) ,\\ Q_{A}^{-}&=&P_{B}^{-}=\varepsilon h-\frac{h}{3} (2 h-\delta) . \end{array} $$

Moreover, setting

$$ \delta=\frac{h}2 \frac{h}{h-3 \varepsilon} , $$
(46)

one obtains \(P_{A}^{-}=Q_{A}^{-}\) and \(P_{B}^{-}=Q_{B}^{-}\), which implies that \(R_{A}^{-}=R_{B}^{-}=1\). Since the nodes A and B were chosen arbitrarily, one observes that, for any function uh given by (40) and (41) with α + β equal to δ from (46), the AFC stabilization term vanishes. This shows that our conjecture was correct.
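The role of the particular value (46) can be checked with a few lines of exact arithmetic. The sketch below evaluates the fluxes \(f_{PQ}=d_{PQ}(u_{Q}-u_{P})\) for a function of the form (40), (41) and then forms \(P^{-}\) and \(Q^{-}\) according to (39) and (45). The node arrangement is the same assumed reading of Fig. 6 as suggested by the matrix entries, and the values of h and ε are arbitrary test data satisfying (38).

```python
from fractions import Fraction as Fr

h, eps = Fr(1, 20), Fr(1, 1000)                 # eps < h/9 as required by (38)
delta = h / 2 * h / (h - 3 * eps)               # the choice (46)

# Artificial diffusion entries for vertical, horizontal and diagonal edges.
d_vert, d_horiz, d_diag = eps - h/6, eps - h/3, -h/6


def neg(z):
    """z^- = min(z, 0)"""
    return min(z, 0)


def pos(z):
    """z^+ = max(z, 0)"""
    return max(z, 0)


# Fluxes f_PQ = d_PQ (u_Q - u_P); the jump between neighbouring lines is +-delta.
f_A = {"B": d_vert*delta, "D": d_vert*delta, "C": d_horiz*h,
       "F": d_horiz*(-h), "E": d_diag*(delta - h), "G": d_diag*(delta - h)}
f_B = {"A": d_vert*(-delta), "H": d_vert*(-delta), "J": d_horiz*h,
       "G": d_horiz*(-h), "I": d_diag*(h - delta), "C": d_diag*(h - delta)}

P_A_minus = sum(neg(f_A[q]) for q in "BCD")      # (39)
Q_A_minus = -sum(pos(f_A[q]) for q in "EFG")     # (45)
P_B_minus = sum(neg(f_B[q]) for q in "IJC")      # (39)
Q_B_minus = -sum(pos(f_B[q]) for q in "AGH")     # (45)
```

For this δ, the four quantities agree pairwise, which is exactly the situation in which the limiter does not introduce any artificial diffusion.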

The fact that the AFC scheme with the Kuzmin limiter does not reproduce the exact solution u(x,y) = x implies that the method is not linearity preserving or not uniquely solvable. The following lemma shows that the former possibility holds true.

Lemma 1

Let \(u\in P_{1}(\mathbb {R}^{2})\) be an arbitrary first degree polynomial and let us consider the arrangement from Fig. 6 and the above matrix entries corresponding to Example 2. Then, the quantities from (19) computed using the nodal values of u satisfy

$$ P_{A}^{+}\le Q_{A}^{+} ,\qquad P_{A}^{-}\ge Q_{A}^{-} $$

and

$$ P_{B}^{+}\le\frac{2 h-3 \varepsilon}{h-3 \varepsilon} Q_{B}^{+} ,\qquad P_{B}^{-}\ge\frac{2 h-3 \varepsilon}{h-3 \varepsilon} Q_{B}^{-} , $$

where the latter inequalities are sharp. Consequently, the AFC scheme with the Kuzmin limiter is not linearity preserving on Grid 4 when applied to Example 2.

Proof

First consider the inequalities at the node A. Since the values of \(P_{A}^{\pm }\) and \(Q_{A}^{\pm }\) do not change if a constant function is added to u, one can consider uA = 0. Moreover, the ratios \(Q_{A}^{+}/P_{A}^{+}\) and \(Q_{A}^{-}/P_{A}^{-}\) do not change if u is multiplied by a positive constant. Thus, it suffices to consider three types of functions u: with uC = −uF = 1, uC = −uF = − 1, and uC = uF = 0. These functions are then determined by the value uB and it is sufficient to consider uB ≥ 0 in view of the axisymmetry of the patch ΔA. Then, it is straightforward to verify that the inequalities at the node A hold.

The inequalities at the node B can be verified analogously. Equalities hold for u given by uA = uB = 0, uG = 1 and uA = uB = 0, uG = − 1, respectively. For these functions, one gets \((\mathbb {B}(\text {U})\text {U})_{B}=h^{2}/(6h-9\varepsilon )\) and \((\mathbb {B}(\text {U})\text {U})_{B}=-h^{2}/(6h-9\varepsilon )\), respectively, which means that the considered method is not linearity preserving. □
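The sharpness claim at the node B can be illustrated by evaluating the fluxes for the first function mentioned in the proof. The following sketch is not part of the proof; it assumes that \(Q_{B}^{+}\) is the negative sum of the negative parts of all fluxes at B (the analogue of (45)), and it uses an assumed reading of Fig. 6 in which G and J are the horizontal neighbors of B, A and H the vertical ones, and C and I the diagonal ones on the right.

```python
from fractions import Fraction as Fr

h, eps = Fr(1, 20), Fr(1, 1000)      # arbitrary test data with eps < h/9

# Offsets (dx, dy) of the neighbours of B in units of h (assumed geometry)
# together with the artificial diffusion entries d_BQ.
patch_B = {"A": ((0, -1), eps - h/6), "H": ((0, 1), eps - h/6),
           "G": ((-1, 0), eps - h/3), "J": ((1, 0), eps - h/3),
           "I": ((1, 1), -h/6), "C": ((1, -1), -h/6)}


def u(dx, dy):
    """First degree polynomial with u_A = u_B = 0 and u_G = 1."""
    return -dx


flux = {q: d * (u(*off) - u(0, 0)) for q, (off, d) in patch_B.items()}

P_plus = sum(max(flux[q], 0) for q in "IJC")        # cf. (39)
Q_plus = -sum(min(f, 0) for f in flux.values())     # assumed analogue of (45)
ratio = P_plus / Q_plus
```

The computed `ratio` equals (2h − 3ε)/(h − 3ε), i.e., the constant appearing in the inequalities of Lemma 1 at the node B.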

In view of the previous lemma, it is not surprising that the exact solution of Example 2 is not recovered by the AFC scheme with the Kuzmin limiter on Grid 4. Nevertheless, it is rather disappointing that, for this very simple example, the \(H^{1}\) seminorm of the error cannot be reduced by considering finer meshes.

On the other hand, we also see from Lemma 1 that it is easy to modify the AFC scheme with the Kuzmin limiter in such a way that the method becomes linearity preserving for the considered case. In fact, similarly as in (27), it suffices to replace \(R_{B}^{\pm }\) by

$$ R_{B}^{+}=\min\left\{1,\frac{\mu Q_{B}^{+}}{P_{B}^{+}}\right\},\quad R_{B}^{-}=\min\left\{1,\frac{\mu Q_{B}^{-}}{P_{B}^{-}}\right\}, $$
(47)

with an appropriate positive constant μ. It can be easily verified that this does not change other properties of the method formulated so far. To simplify our analytical considerations, we shall use

$$ \mu=\frac{2 h-3 \varepsilon}{h-4 \varepsilon} , $$
(48)

which is a slightly larger value than suggested by Lemma 1. Nevertheless, for values of h and ε considered in our numerical computations, this modification is negligible.

If one now repeats the computation leading to the result in Fig. 5 (left) with \(R_{i}^{\pm }\) given by (47), (48) at nodes lying on odd horizontal grid lines, one obtains the exact solution uh(x,y) = x. This is not surprising since this exact solution solves the Galerkin discretization and the AFC stabilization term now vanishes for first degree polynomials. Thus, let us consider the following slightly more difficult example.

Example 3

Problem (1) is considered with \({\Omega}=(0,1)^{2}\), \(\varepsilon=10^{-8}\), \({\boldsymbol{b}}=(1,0)^{T}\), c = 0, g = 1, and

$$ u_{b}(x,y)=x-\frac{{\mathrm e}^{\frac{x}\varepsilon}-1} {{\mathrm e}^{\frac{1}\varepsilon}-1} . $$
(49)
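A practical remark on evaluating (49): for \(\varepsilon=10^{-8}\), the term \({\mathrm e}^{1/\varepsilon}\) overflows in double precision, so the formula cannot be coded literally. One possible remedy, sketched below under the assumption that standard IEEE double precision is used, is to multiply numerator and denominator by \({\mathrm e}^{-1/\varepsilon}\), which leaves only exponentials of nonpositive arguments. The function name `u_b` is hypothetical.

```python
import math


def u_b(x, eps):
    """Evaluate (49) for 0 <= x <= 1 in the equivalent form
    x - exp((x-1)/eps) * (1 - exp(-x/eps)) / (1 - exp(-1/eps))."""
    # expm1(t) = exp(t) - 1; the two minus signs cancel in the quotient.
    layer = math.exp((x - 1.0) / eps) * math.expm1(-x / eps) / math.expm1(-1.0 / eps)
    return x - layer


values = [u_b(x, 1e-8) for x in (0.0, 0.5, 1.0)]
```

Away from x = 1, the layer term underflows to zero and the function reduces to x, whereas at x = 1 the two exponential factors cancel and the value 0 is obtained.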

The formula in (49) not only defines the boundary condition but it also represents the solution u = u(x,y) of Example 3. In most of Ω, u(x,y) is very close to x, only in the vicinity of the outflow boundary x = 1 it abruptly falls to 0 and exhibits an exponential boundary layer. The approximate solution obtained on Grid 4 using the AFC scheme with the Kuzmin limiter modified by (47), (48) is depicted in Fig. 5 (right) and one can observe that it is again rather poor. The character of the solution remains the same also on finer meshes where one can observe that, in a large part of the computational domain, the discrete solution uh is approximately given by three parameters. For example, for ne = 80, one can deduce the following form of the discrete solution:

$$ \begin{array}{@{}rcl@{}} u_{h}(x,y)&=&\left\{\begin{array}{ll} x+\alpha\quad&\text{if} x=(3 k-1) h ,\\ x+\beta&\text{otherwise} , \end{array}\right.\quad \text{along odd horizontal grid lines} , \end{array} $$
(50)
$$ \begin{array}{@{}rcl@{}} u_{h}(x,y)&=&\left\{ \begin{array}{ll} x+\beta\quad&\text{if} x=(3 k+1) h ,\\ x-\gamma&\text{otherwise} , \end{array}\right.\quad\text{along even horizontal grid lines} , \end{array} $$
(51)

where k is an arbitrary integer and α, β, γ are positive constants.

Let us now again investigate when a function uh given by (50) and (51) satisfies the Galerkin discretization or the AFC scheme. Due to the definition of uh, one has to distinguish six cases: whether the node under consideration lies on an odd or an even horizontal grid line and whether the respective vertical grid line is expressed by x = (3k − 1)h, x = 3kh, or x = (3k + 1)h. Like in (42), one derives in all six cases that, under the condition

$$ \alpha=2 \beta+\gamma , $$
(52)

uh satisfies the Galerkin discretization up to a perturbation κ(β + γ)ε with \(\kappa\in\{-5,-3,-1,1,2,6\}\). Since the computed discrete solutions approximately satisfy (52), we again expect that the AFC stabilization term is nearly zero for them.

To compute the AFC stabilization term for uh given by (50) and (51), we shall assume that, apart from (38) and (52), one also has

$$ \beta+\gamma\le\frac{h}4 . $$
(53)

Then, it is easy to verify that, in all cases, the fluxes again have the signs given in (43) and (44). Thus, one again immediately obtains that \(R_{A}^{+}=R_{B}^{+}=1\) and the relations (45) hold. Using (39), (45), (47), and (48), a lengthy but straightforward computation reveals that, in all three cases, \(R_{A}^{-}=R_{B}^{-}=1\), which means that the AFC stabilization term again vanishes. Note that the condition (53) allows more flexibility in defining the function uh than (46) which determines the respective uh uniquely up to an additive constant.

There are two important conclusions of the above discussion. The first, more general one is that approximate solutions may be polluted by spurious oscillations despite the validity of the DMP. This may also happen if the right-hand side g vanishes (in contrast to the above examples) (see Example 5 below). The second conclusion is that there are oscillating functions (which may solve, e.g., a Galerkin discretization) for which the algebraic stabilization term vanishes. This is a surprising observation that does not correspond to the usual experience that, in case of an oscillating solution, a stabilization introduces an artificial diffusion in the discrete problem to suppress the oscillations. However, it is worth noting that if a discretization of Example 2 on some of the Grids 1, 4–7, and 9 leads to an oscillating approximate solution of the type (40) and (41), then also residual-based stabilizations (see, e.g., [36]) are not able to suppress the oscillations since the residual vanishes on any element of the triangulation.

Let us mention that if Example 3 is solved on Grid 1 using the AFC scheme with the Kuzmin limiter, one obtains the nodally exact solution, except for the rightmost vertical interior grid line (see Fig. 7 (left)). However, if \(R_{i}^{\pm }\) are defined by (27) with μi = 2, then oscillations again appear (see Fig. 7 (second from left)). Since the AFC scheme with the Kuzmin limiter is linearity preserving on Grid 1 for constant data, this shows that the symmetry of the patches and the linearity preservation are not sufficient for obtaining an accurate approximate solution. The error at the rightmost vertical interior grid line appears independently of the choice of the limiter as it was proved in [24] so that a refinement of the mesh along the outflow boundary is needed for enhancing the accuracy.

Fig. 7 Approximate solutions computed using the AFC scheme with the Kuzmin limiter on Grid 1 with ne = 10: Example 3 (left), Example 3, \(R_{i}^{\pm }\) defined by (27) with μi = 2 (second from left), Example 4 (second from right), Example 4, \(P_{i}^{\pm }\) defined by (25) (right)

Let us now change the boundary condition of Example 3 to the homogeneous one, i.e., consider

Example 4

Problem (1) is considered with \({\Omega}=(0,1)^{2}\), \(\varepsilon=10^{-8}\), \({\boldsymbol{b}}=(1,0)^{T}\), c = 0, g = 1, and ub = 0.

The solution of this example possesses not only an exponential boundary layer at the outflow boundary but also two parabolic boundary layers. The AFC scheme with the Kuzmin limiter on Grid 1 provides the approximate solution shown in Fig. 7 (second from right). One can observe that, in the region of the numerical parabolic boundary layers, the approximate solution is not monotone in the crosswind direction. This can be improved by defining \(P_{i}^{\pm }\) by (25) instead of (19) (see Fig. 7 (right)). In general, this modification decreases \(R_{i}^{\pm }\) so that more artificial diffusion is introduced, which may lead to a more pronounced smearing of layers. Then, the accuracy can be again enhanced by using a finer mesh in the boundary layer region.

Let us mention that, for the finite element functions given by (40), (41) or (50), (51) and for the matrix entries corresponding to Grid 4 and the data of Example 2, the values of the Kuzmin limiter are determined only by the quantities \(R_{i}^{-}\). Since it follows from (43), (44) that the quantities \(P_{i}^{-}\) attain the same values for both definitions (19) and (25), the above analytical results remain the same also if \(P_{i}^{\pm }\) are defined by (25). Also the result in Fig. 7 (left) is not affected by computing \(P_{i}^{\pm }\) using (25).

Figure 8 shows results for Example 4 computed using the AFC scheme with the BJK limiter on Grid 1. As we know from Section 4.2, one can consider μi = 1 in (27) for Grid 1 to guarantee the linearity preservation, which leads to the oscillatory solution in Fig. 8 (left). If one uses μi = 2 as suggested by the formula (30), the oscillations become even larger (see Fig. 8 (right)). This again demonstrates that the symmetry of the patches and the linearity preservation are not sufficient for obtaining an accurate approximate solution. Moreover, the results presented in Figs. 5, 7, and 8 show that using the modification (27) (with μi > 1) of (20) (e.g., to enforce the linearity preservation or to reduce the amount of artificial diffusion) is not a good idea since it allows more oscillatory solutions. In fact, this is not surprising since, for any finite element function for which the quantities \(R_{i}^{\pm }\) do not vanish, one can find μi such that the AFC stabilization term vanishes.

Fig. 8 Example 4: approximate solutions computed using the AFC scheme with the BJK limiter on Grid 1 with ne = 10 for μi = 1 (left) and μi = 2 (right)

In particular, one should avoid such constructions of limiters for which \(R_{i}^{+}=R_{i}^{-}=1\) may occur for oscillating functions. If \(P_{i}^{\pm }\) are defined by (25) and \(Q_{i}^{\pm }\) by (19), as suggested in the discussion to Fig. 7, then \(P_{i}^{+}+Q_{i}^{-}=P_{i}^{-}+Q_{i}^{+}=0\) and it is easy to verify that \(R_{i}^{+}=R_{i}^{-}=1\) is equivalent to \(P_{i}^{+}=Q_{i}^{+}\) and \(P_{i}^{-}=Q_{i}^{-}\), i.e., to

$$ 0=\sum\limits_{j\in S_{i}} (f_{ij}^{+}+f_{ij}^{-})=\sum\limits_{j\in S_{i}} f_{ij} =\sum\limits_{j\in S_{i}} d_{ij} (u_{j}-u_{i}) . $$

Thus, \(R_{i}^{+}=R_{i}^{-}=1\) holds if and only if \(u_{i}=\bar {u}_{i}\) where \(\bar {u}_{i}\) is a local average defined by

$$ \bar{u}_{i}=\frac{{\sum}_{j\in S_{i}}\vert d_{ij}\vert u_{j}} {{\sum}_{j\in S_{i}}\vert d_{ij}\vert} . $$
(54)

To avoid oscillating approximate solutions, the local averages \(\bar {u}_{i}\) should be good approximations of the values ui for smoothly varying functions, which is the case for locally symmetric meshes like Grid 1 but not for meshes with unsymmetric patches like Grid 4. This probably contributes to the better performance of the AFC scheme with the Kuzmin limiter on locally symmetric meshes.
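The effect of the patch geometry on the local average (54) can be seen already for the function u(x,y) = x. The sketch below evaluates (54) at the node A of Grid 4 using the weights |dAQ| from Section 5, assuming (as the matrix entries suggest) that both diagonal neighbors E and G lie on the left of A. For comparison, a hypothetical point-symmetric patch is considered in which opposite neighbors carry equal weights; these weights are illustrative and not taken from the paper.

```python
from fractions import Fraction as Fr


def local_average(neigh):
    """Weighted mean (54); neigh is a list of (|d_ij|, u_j) pairs."""
    return sum(w * u for w, u in neigh) / sum(w for w, _ in neigh)


h, eps = Fr(1, 20), Fr(1, 1000)     # arbitrary test data with eps < h/9

# Patch around A on Grid 4 for u(x, y) = x, shifted so that u_A = 0.
grid4 = [(h/6 - eps, 0), (h/6 - eps, 0),      # B, D (vertical neighbours)
         (h/3 - eps, h), (h/3 - eps, -h),     # C, F (horizontal neighbours)
         (h/6, -h), (h/6, -h)]                # E, G (assumed to lie on the left)

# Hypothetical point-symmetric patch with equal weights at opposite nodes.
symmetric = [(h/3 - eps, h), (h/3 - eps, -h),
             (h/6, h), (h/6, -h),
             (h/6 - eps, 0), (h/6 - eps, 0)]

avg_grid4 = local_average(grid4)
avg_symmetric = local_average(symmetric)
```

On the symmetric patch the average reproduces uA = 0, whereas on the Grid 4 patch it equals −h²/(4(h − 3ε)), i.e., it deviates from uA by roughly h/4.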

The above examples have all non-vanishing right-hand sides g so that the DMP provides only one-sided local bounds on approximate solutions. To demonstrate that the above-discussed phenomena are not restricted to this case, let us consider the following example with a vanishing right-hand side.

Example 5

Problem (1) is considered with \({\Omega}=(0,1)^{2}\), \(\varepsilon=10^{-8}\), \({\boldsymbol{b}}=(-2,-3)^{T}\), g = 0, and

$$ c(x,y)=\frac{3 x+2 y+7}{(x+1)(y+2)} ,\qquad u_{b}(x,y)=(x+1)(y+2) . $$

Note that the solution of Example 5 is u(x,y) = (x + 1)(y + 2). Whereas, on Grid 1, the AFC scheme with the Kuzmin limiter leads to an accurate approximation, the results on Grid 4 are again polluted by spurious oscillations (see Fig. 9). Moreover, on Grid 1, one can observe the same optimal convergence rates as in Table 1 whereas an analogous reduction of the convergence rates as in Table 2 is observed on Grid 4.

Fig. 9 Example 5: approximate solutions computed using the AFC scheme with the Kuzmin limiter on Grid 1 (left) and on Grid 4 (right), in both cases with ne = 20

6 Symmetrized monotone upwind-type algebraically stabilized method

The aim of this section is to design an algebraic stabilization which does not suffer from the deficiencies discussed and analyzed in the previous section. The starting point will be the MUAS method of Section 4.3 since this method has the favorable property of being of upwind type and satisfies the DMP on arbitrary simplicial meshes.

It was argued below Example 4 in the previous section that \(P_{i}^{\pm }\) should be defined by (25) instead of (19). Consequently, the relations (34) in the MUAS method should be changed to

$$ P_{i}^{+}=\sum\limits_{j\in S_{i}} \vert d_{ij}\vert (u_{i}-u_{j})^{+} ,\qquad\quad P_{i}^{-}=\sum\limits_{j\in S_{i}} \vert d_{ij}\vert (u_{i}-u_{j})^{-} . $$
(55)

Moreover, it was observed in the previous section that two properties seem to be important for obtaining accurate results using algebraic stabilizations: local symmetries of triangulations and the linearity preservation. As it was demonstrated that the linearity preservation should not be enforced using (27), our goal will be to get this property by symmetrizing the definitions of \(P_{i}^{\pm }\) and \(Q_{i}^{\pm }\) in a suitable way.

To introduce the mentioned symmetry, we will extend the definitions of \(P_{i}^{\pm }\) and \(Q_{i}^{\pm }\) by considering values at symmetrically placed points. The construction is illustrated by Fig. 10 where a patch ΔA around a node A is shown. Each node P connected to A by an edge is mapped to a point \(\tilde {P}\) in a symmetric way with respect to A and then the idea is to compute the value uAP at \(\tilde {P}\) of the finite element function uh corresponding to U via (3). This is easy in case of the node B from Fig. 10 since \(\tilde {B}\in {\Delta }_{A}\). However, in case of the node C, the symmetrically placed point \(\tilde {C}\) lies outside ΔA. In this case, we extend the linear function uh from the triangle AEF to the convex set surrounded by the half lines AE and AF and define uAC as the value of this extended function at \(\tilde {C}\). This makes more sense than considering the actual value \(u_{h}(\tilde {C})\). The value uAC can be easily computed using the gradient of uh on AEF since

$$ u_{AC}=u_{A}+\nabla u_{h}\vert_{AEF}^{}\cdot(\tilde{C}-A) =u_{A}+\nabla u_{h}\vert_{AEF}^{}\cdot(A-C) . $$

Of course, an analogous relation holds for uAB, too.
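The evaluation of such a value requires only the gradient of a linear function on one triangle. A minimal sketch in plain Python is given below; the helper names `p1_gradient` and `reflected_value` as well as the sample triangle and nodal values are hypothetical and serve only to illustrate the formula.

```python
def p1_gradient(tri, vals):
    """Gradient of the linear function taking the values vals at the vertices tri."""
    (x0, y0), (x1, y1), (x2, y2) = tri
    e1, e2 = (x1 - x0, y1 - y0), (x2 - x0, y2 - y0)
    du1, du2 = vals[1] - vals[0], vals[2] - vals[0]
    det = e1[0] * e2[1] - e1[1] * e2[0]          # twice the signed area
    # Cramer's rule for the system e1.g = du1, e2.g = du2.
    return ((du1 * e2[1] - du2 * e1[1]) / det,
            (e1[0] * du2 - e2[0] * du1) / det)


def reflected_value(x_a, u_a, x_p, tri, vals):
    """u_AP = u_A + grad(u_h)|_tri . (A - P), where tri is the triangle of the
    patch entered by the half line starting at A and pointing away from P."""
    gx, gy = p1_gradient(tri, vals)
    return u_a + gx * (x_a[0] - x_p[0]) + gy * (x_a[1] - x_p[1])


# Hypothetical data: A at the origin, P = (1, 0.5), and nodal values of
# u(x, y) = 2x + 3y + 1 on a triangle lying opposite to P.
A, P = (0.0, 0.0), (1.0, 0.5)
tri = [A, (-1.0, 0.0), (0.0, -1.0)]
vals = [1.0, -1.0, -2.0]
u_AP = reflected_value(A, 1.0, P, tri, vals)
```

For these linear data, the differences uA − uAP and uP − uA coincide, which is the symmetry that the construction is meant to provide.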

Fig. 10 Construction of symmetrically placed points

Using the above-defined values uAP, we can now symmetrize the definitions of the quantities \(P_{i}^{\pm }\) and \(Q_{i}^{\pm }\) in (55) and (35), respectively, for any \(i\in \{1,\dots ,M\}\) by setting

$$ \begin{array}{@{}rcl@{}} P_{i}^{+}&=&\sum\limits_{j\in S_{i}} \vert d_{ij}\vert \{(u_{i}-u_{j})^{+}+(u_{i}-u_{ij})^{+}\} , \end{array} $$
(56)
$$ \begin{array}{@{}rcl@{}} P_{i}^{-}&=&\sum\limits_{j\in S_{i}} \vert d_{ij}\vert \{(u_{i}-u_{j})^{-}+(u_{i}-u_{ij})^{-}\} , \end{array} $$
(57)
$$ \begin{array}{@{}rcl@{}} Q_{i}^{+}&=&\sum\limits_{j\in S_{i}} s_{ij} \{(u_{j}-u_{i})^{+}+(u_{ij}-u_{i})^{+}\} , \end{array} $$
(58)
$$ \begin{array}{@{}rcl@{}} Q_{i}^{-}&=&\sum\limits_{j\in S_{i}} s_{ij} \{(u_{j}-u_{i})^{-}+(u_{ij}-u_{i})^{-}\} , \end{array} $$
(59)

where

$$ u_{ij}=u_{i}+\nabla u_{h}\vert_{T_{ij}}^{}\cdot(x_{i}-x_{j})\qquad\forall j\in S_{i} , $$
(60)

and Tij ⊂ Δi is a simplex intersected by the half line \(\{x_{i}+\alpha(x_{i}-x_{j});\ \alpha>0\}\) (like the triangle AEF in Fig. 10 for xi = A and xj = C). As we will see below, this modification of the MUAS method leads to optimal convergence rates in cases where the algebraic stabilizations of Section 4 provide suboptimal convergence results.

The above definitions of \(P_{i}^{\pm }\) and \(Q_{i}^{\pm }\) can be generalized to

$$ \begin{array}{@{}rcl@{}} P_{i}^{+}&=&\sum\limits_{j\in S_{i}, a_{ij}>0 \vee a_{ji}>0} p_{ij} \{(u_{i}-u_{j})^{+}+(u_{i}-u_{ij})^{+}\} , \end{array} $$
(61)
$$ \begin{array}{@{}rcl@{}} P_{i}^{-}&=&\sum\limits_{j\in S_{i}, a_{ij}>0 \vee a_{ji}>0} p_{ij} \{(u_{i}-u_{j})^{-}+(u_{i}-u_{ij})^{-}\} , \end{array} $$
(62)
$$ \begin{array}{@{}rcl@{}} Q_{i}^{+}&=&\sum\limits_{j\in S_{i}} q_{ij} \{(u_{j}-u_{i})^{+}+(u_{ij}-u_{i})^{+}\} , \end{array} $$
(63)
$$ \begin{array}{@{}rcl@{}} Q_{i}^{-}&=&\sum\limits_{j\in S_{i}} q_{ij} \{(u_{j}-u_{i})^{-}+(u_{ij}-u_{i})^{-}\} , \end{array} $$
(64)

with some weighting factors satisfying, for any \(j\in S_{i}\), \(i=1,\dots ,M\),

$$ \begin{array}{@{}rcl@{}} &0\le p_{ij}\le q_{ij} , \end{array} $$
(65)
$$ \begin{array}{@{}rcl@{}} &p_{ij}>0\quad\text{if} a_{ij}>0 . \end{array} $$
(66)

We name the resulting scheme the Symmetrized Monotone Upwind-type Algebraically Stabilized (SMUAS) method. Let us recall that the stabilization matrix of the SMUAS method is given by (31) and (32) with βij determined by (36), (20), (21), and (61)–(64) where uij are defined by (60) and pij, qij satisfy (65) and (66).

Remark 1

If \(P_{i}^{+}=0\), then \(R_{i}^{+}\) can be defined arbitrarily (and the same holds for \(P_{i}^{-}\) and \(R_{i}^{-}\)). Indeed, \(P_{i}^{+}\) is used only for defining βij with j such that ui > uj. Then, if \(P_{i}^{+}=0\), one has aij ≤ 0 due to (66) and hence the matrix \(\mathbb {B}(\text {U})\) defined by (31), (32) does not depend on these βij.

Remark 2

The condition (65) ensures that the SMUAS method is linearity preserving. Indeed, if \(u_{h}\in P_{1}(\mathbb {R}^{d})\), then \(u_{i}-u_{ij}=u_{j}-u_{i}\) for any \(i\in \{1,\dots ,M\}\) and \(j\in S_{i}\) and hence one gets

$$ P_{i}^{+}=-P_{i}^{-}\le\sum\limits_{j\in S_{i}} p_{ij} \vert u_{i}-u_{j}\vert \le \sum\limits_{j\in S_{i}} q_{ij} \vert u_{i}-u_{j}\vert=Q_{i}^{+}=-Q_{i}^{-} , $$

so that \(R_{i}^{+}=R_{i}^{-}=1\) for \(i=1,\dots ,N\) and the stabilization term vanishes.
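The argument of this remark can be illustrated numerically. The sketch below evaluates the sums (61)–(64) at a single node for data mimicking a first degree polynomial, i.e., with uij = 2ui − uj. The function `smuas_sums`, the flag list `active` (encoding the condition aij > 0 ∨ aji > 0), and all sample values are hypothetical; the weights are chosen such that (65) holds.

```python
def pos(z):
    """z^+ = max(z, 0)"""
    return max(z, 0.0)


def neg(z):
    """z^- = min(z, 0)"""
    return min(z, 0.0)


def smuas_sums(u_i, u_nb, u_sym, p, q, active):
    """Evaluate (61)-(64) at one node.

    u_nb[k], u_sym[k] -- values u_j and u_ij for the k-th neighbour
    p[k], q[k]        -- weights p_ij <= q_ij, cf. (65)
    active[k]         -- True if a_ij > 0 or a_ji > 0
    """
    idx = range(len(u_nb))
    P_plus = sum(p[k] * (pos(u_i - u_nb[k]) + pos(u_i - u_sym[k])) for k in idx if active[k])
    P_minus = sum(p[k] * (neg(u_i - u_nb[k]) + neg(u_i - u_sym[k])) for k in idx if active[k])
    Q_plus = sum(q[k] * (pos(u_nb[k] - u_i) + pos(u_sym[k] - u_i)) for k in idx)
    Q_minus = sum(q[k] * (neg(u_nb[k] - u_i) + neg(u_sym[k] - u_i)) for k in idx)
    return P_plus, P_minus, Q_plus, Q_minus


# Data of a linear function: the reflected values satisfy u_ij = 2 u_i - u_j.
u_i = 1.0
u_nb = [2.0, 0.5, 1.0, 3.0]
u_sym = [2 * u_i - v for v in u_nb]
p = [1.0, 0.5, 2.0, 0.0]
q = [1.0, 2.0, 2.0, 1.0]
active = [True, True, True, False]
P_plus, P_minus, Q_plus, Q_minus = smuas_sums(u_i, u_nb, u_sym, p, q, active)
```

For these data one obtains \(P^{+}=-P^{-}\le Q^{+}=-Q^{-}\), so that both ratios \(Q^{\pm}/P^{\pm}\) are at least 1 and no limiting takes place.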

Remark 3

The condition (aij > 0 ∨ aji > 0) in (61) and (62) restricts the summation to those indices \(j\in S_{i}\) for which dij ≠ 0 (cf. (56) and (57)). This is important to obtain optimal convergence rates in the diffusion-dominated regime.

Of course, the properties of the SMUAS method depend on the choice of the weighting factors pij, qij. The relations (56)–(59) correspond to

$$ p_{ij}=\max\{a_{ij},0,a_{ji}\} ,\quad q_{ij}=\max\{\vert a_{ij}\vert,a_{ji}\} ,\quad i=1,\dots,M , j\in S_{i} . $$
(67)

Another possibility is to simply set

$$ p_{ij}=q_{ij}=1 ,\quad i=1,\dots,M , j\in S_{i} . $$
(68)

More generally, let us consider weighting factors satisfying pij = qij for \(i=1,\dots ,M\) and \(j\in S_{i}\) but not necessarily equal to 1. In the convection-dominated regime, the condition (aij > 0 ∨ aji > 0) usually holds for any \(j\in S_{i}\) due to the skew-symmetry of the convection matrix and hence one obtains that \(P_{i}^{+}+Q_{i}^{-}=P_{i}^{-}+Q_{i}^{+}=0\). Thus, if \(R_{i}^{+}=R_{i}^{-}=1\) for some \(i\in \{1,\dots ,M\}\) (so that βij = 0 for all \(j\in S_{i}\)), one finds that \(u_{i}=\bar {u}_{i}\) where \(\bar {u}_{i}\) is a local average defined by

$$ \bar{u}_{i}=\frac{{\sum}_{j\in S_{i}}p_{ij} (u_{j}+u_{ij})} {2 {\sum}_{j\in S_{i}}p_{ij}} , $$

see the derivation leading to (54). Therefore, the choice of the weights pij may also be guided by the requirement that the local averages \(\bar {u}_{i}\) are good approximations of the values ui for smoothly varying functions. Then, the weights pij should depend on the distribution of the nodes xj, \(j\in S_{i}\), and on their distances to xi.
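The step from \(R_{i}^{+}=R_{i}^{-}=1\) to \(u_{i}=\bar {u}_{i}\) can be spelled out as follows. This is a sketch under two assumptions that are not restated in this section: the ratios have the form \(R_{i}^{\pm}=\min\{1,Q_{i}^{\pm}/P_{i}^{\pm}\}\), and \(Q_{i}^{\pm}\) collect the positive and negative parts of \(u_{j}-u_{i}\) and \(u_{ij}-u_{i}\) over all \(j\in S_{i}\). Then \(R_{i}^{+}=R_{i}^{-}=1\) together with \(P_{i}^{+}=-Q_{i}^{-}\) and \(P_{i}^{-}=-Q_{i}^{+}\) gives

$$ Q_{i}^{+}\ge P_{i}^{+}=-Q_{i}^{-} ,\qquad Q_{i}^{-}\le P_{i}^{-}=-Q_{i}^{+} \qquad\Longrightarrow\qquad Q_{i}^{+}+Q_{i}^{-}=0 , $$

and, since the positive and negative parts of a number add up to the number itself and pij = qij,

$$ 0=Q_{i}^{+}+Q_{i}^{-}=\sum\limits_{j\in S_{i}} p_{ij} \big[(u_{j}-u_{i})+(u_{ij}-u_{i})\big] =\sum\limits_{j\in S_{i}} p_{ij} (u_{j}+u_{ij})-2 u_{i}\sum\limits_{j\in S_{i}} p_{ij} . $$

Dividing by \(2\sum_{j\in S_{i}}p_{ij}\), which is positive as soon as one of the weights is positive, yields \(u_{i}=\bar {u}_{i}\).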

Remark 4

It is not always necessary to use all the additional terms in (61)–(64). For example, let us consider the patch around the node A in Fig. 6. Then, the nodes B, D and C, F are symmetric with respect to A. Therefore, using (68) and assuming that the condition (aij > 0 ∨ aji > 0) holds for xi = A and any \(j\in S_{i}\), it is sufficient to introduce only the points placed symmetrically to the nodes E and G. However, for simplicity of the presentation (and also of the implementation), we do not consider such variants of the above formulas in this paper.

Now, let us prove that the SMUAS method satisfies Assumptions (A1) and (A2) from Section 3.

Theorem 4

The stabilization matrix of the SMUAS method satisfies Assumption (A1).

Proof

In view of Theorem 3, it suffices to prove (33). Since βij ≡ 0 for any \(j\in \{1,\dots ,N\}\) if \(i\in \{M+1,\dots ,N\}\), consider any \(i\in \{1,\dots ,M\}\). Let \(j\in S_{i}\) be such that aij > 0. We want to show that \({\Phi }(\text {U}):=\beta _{ij}(\text {U}) (u_{j}-u_{i})\) is continuous at a fixed but arbitrary point \(\bar {\text {U}}=(\bar {u}_{1},\dots ,\bar {u}_{N})\in \mathbb {R}^{N}\). If \(\bar {u}_{i}=\bar {u}_{j}\), then \({\Phi }(\bar {\text {U}})=0\) and the continuity at \(\bar {\text {U}}\) follows from the estimates

$$ \vert {\Phi}(\text{U})-{\Phi}(\bar{\text{U}})\vert =\vert {\Phi}(\text{U})\vert \le\vert u_{i}-u_{j}\vert\le\sqrt2 \| \text{U}-\bar{\text{U}}\| , $$

where ∥⋅∥ is the Euclidean norm on \(\mathbb {R}^{N}\). Thus, let \(\bar {u}_{i}>\bar {u}_{j}\) and denote

$$ B=\left\{\text{U}\in \mathbb{R}^{N}; \| \text{U}-\bar{\text{U}}\| \le\frac12\vert \bar{u}_{i}-\bar{u}_{j}\vert\right\} . $$

Then, ui > uj for U ∈ B and hence

$$ {\Phi}(\text{U})=(1-R^{+}_{i}(\text{U})) (u_{j}-u_{i})\qquad\forall \text{U}\in B . $$

Since both \(P_{i}^{+}\) and \(Q_{i}^{+}\) are continuous and \(P_{i}^{+}\) is positive in B due to (66), the function Φ is continuous in B and hence also at \(\bar {\text {U}}\). If \(\bar {u}_{i}<\bar {u}_{j}\), one proceeds analogously. □

Theorem 5

The stabilization matrix of the SMUAS method satisfies Assumption (A2).

Proof

Consider any \(\text {U}=(u_{1},\dots ,u_{N})\in \mathbb {R}^{N}\), \(i\in \{1,\dots ,M\}\), and \(j\in S_{i}\). Let ui be a strict local extremum of U with respect to Si. We want to prove that

$$ a_{ij}+b_{ij}(\text{U})\le0 . $$
(69)

If aij ≤ 0, then (69) holds since bij(U) ≤ 0. Thus, let aij > 0. First, assume that ui > uk for any \(k\in S_{i}\). Then, for any simplex \(T\subset {\Delta }_{i}\) and any vector \(\boldsymbol {a}\in \mathbb {R}^{d}\) pointing from xi into T, one has \(\boldsymbol {a}\cdot \nabla u_{h}\vert _{T}^{}<0\) with uh defined by (3). Thus, ui > uik for any \(k\in S_{i}\) according to (60), which implies that \(Q_{i}^{+}=0\). Moreover, \(P_{i}^{+}\ge p_{ij} (u_{i}-u_{j})^{+}>0\) in view of (66) and hence \(\beta _{ij}=1-R_{i}^{+}=1\). Similarly, if ui < uk for any \(k\in S_{i}\), then also ui < uik for any \(k\in S_{i}\) and hence \(Q_{i}^{-}=0\). Since \(P_{i}^{-}\le p_{ij} (u_{i}-u_{j})^{-}<0\), one obtains \(\beta _{ij}=1-R_{i}^{-}=1\). Therefore, in both cases, bij(U) ≤ −aij, which proves (69). □
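The last step of the proof uses the definition of the off-diagonal entries in (32), which is not restated in this section. Assuming that it has the form \(b_{ij}(\text {U})=-\max \{\beta _{ij}(\text {U}) a_{ij},0,\beta _{ji}(\text {U}) a_{ji}\}\) for i ≠ j (an assumption made here for illustration only; it is consistent with Remark 1 and with the bound bij(U) ≤ 0 used above), the case aij > 0 with βij = 1 reads

$$ a_{ij}+b_{ij}(\text{U})=a_{ij}-\max\{a_{ij},0,\beta_{ji}(\text{U}) a_{ji}\}\le a_{ij}-a_{ij}=0 . $$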

The above theorems imply that the SMUAS method is solvable (cf. Theorem 1) and satisfies the DMP formulated in Theorem 2. Moreover, as shown in Remark 2, the SMUAS method is linearity preserving. It is important that all these properties hold for arbitrary simplicial meshes. For regular families of triangulations, one also has the error estimate (16).

However, as we have seen in Section 5, such theoretical properties do not guarantee that an algebraically stabilized method will provide an accurate approximate solution and that the approximate solutions will converge to the exact solution in usual norms. Thus, let us investigate the properties of the SMUAS method numerically. We start with Example 1 for ε = \(10^{-8}\) and Grid 4, for which suboptimal convergence results are presented in Table 2 for the AFC scheme with the Kuzmin limiter (which is equivalent to the MUAS method from Section 4.3 in this case). The results for the SMUAS method with pij, qij defined by (67) are shown in Table 4. One observes a higher accuracy of the results than in Table 2 and the experimental convergence rates tend to the optimal values. Using the SMUAS method with pij, qij defined by (68) leads to similar results (see Table 5). Also for other test examples, the results obtained using (67) and (68) were similar and hence we will not present any other comparisons of results for these two choices of the weighting factors here. Similar convergence rates as in Tables 4 and 5 can be observed for all other grids from Figs. 1 and 2 (see Tables 6 and 7 for illustration). In Tables 6 and 7, ne represents the number of edges along the part of \(\partial {\Omega }\) lying on the line y = 1, i.e., in Fig. 1, ne = 2 for Grid 2 and ne = 4 for Grid 3 so that the numbers of triangles in grids with the same value of ne are similar. The higher accuracy of the SMUAS method can also be seen from Fig. 11 if one compares it with Fig. 3.

Table 4 Example 1: ε = \(10^{-8}\), numerical results for Grid 4 computed using the SMUAS method with pij, qij defined by (67)
Table 5 Example 1: ε = \(10^{-8}\), numerical results for Grid 4 computed using the SMUAS method with pij, qij defined by (68)
Table 6 Example 1: ε = \(10^{-8}\), numerical results for Grid 2 computed using the SMUAS method with pij, qij defined by (67)
Table 7 Example 1: ε = \(10^{-8}\), numerical results for Grid 3 computed using the SMUAS method with pij, qij defined by (68)
Fig. 11 Example 1: ε = \(10^{-8}\), approximate solutions computed using the SMUAS method with pij, qij defined by (68) on Grid 4 with ne = 32 (left) and ne = 64 (right)

It was reported in [6] for Example 1 that, in the diffusion-dominated case ε = 10, the solutions of the AFC scheme with the Kuzmin limiter do not converge for the non-Delaunay Grid 5 in any of the three norms considered in the above tables. Recall that Grid 5 was obtained from Grid 4 by shifting some of the nodes by h/10 to the right, where h is the horizontal mesh width in Grid 4. If the shift is h/2, then the experimental convergence rates tend to zero already on relatively coarse meshes (cf. [21]). The MUAS method from Section 4.3 shows an improved behavior. In particular, for the shift h/2, it converges in all three norms and the convergence rates in the L2 norm and the H1 seminorm are close to the optimal values. However, if the shift is 0.8h, then the accuracy deteriorates and the convergence rates tend to zero also for the MUAS method (cf. [21]). It is conjectured in [21] that this behavior is connected with the fact that the MUAS method is linearity preserving for the shift h/2 but not for 0.8h. This conjecture is supported by the results obtained for the SMUAS method, which is always linearity preserving and, indeed, leads to optimal convergence rates even on the distorted mesh corresponding to the shift 0.8h (see Table 8 and Fig. 12 (left)). Moreover, a further deformation of the mesh leads to similar results (cf. Table 9 and Fig. 12 (middle) corresponding to the shift 1.5h (rightmost interior nodes on even horizontal mesh lines are shifted by 0.75h)).

Table 8 Example 1: ε = 10, numerical results computed using the SMUAS method with pij, qij defined by (67) on triangulations of the type of Grid 5 obtained by shifting the respective interior nodes by eight tenths of the horizontal mesh width
Table 9 Example 1: ε = 10, numerical results computed using the SMUAS method with pij, qij defined by (67) on triangulations of the type of Grid 5 obtained by shifting the respective interior nodes by one and a half times the horizontal mesh width
Fig. 12 Grids used for computing the results in Table 8 (left, ne = 16), Table 9 (middle, ne = 16), and Table 10 (right, ne = 17)

The SMUAS method seems to work well also on pathological meshes with very elongated triangles. Table 10 shows results obtained on a triangular version of the mesh4_1 family from the FVCA5 benchmark [13]. This family consists of six meshes and the coarsest one is depicted in Fig. 12 (right). In contrast to the meshes of the type of Grid 5, the orientation of the edges in the 17 vertical strips of the coarsest mesh of the mesh4_1 family is preserved also in finer meshes. The value ne again represents the number of edges along the part of \(\partial {\Omega }\) lying on the line y = 1 but now ne increases linearly (ne = 17i in the i-th row of Table 10) whereas in all the tables considered before the increase was exponential (\(n_{e}=8\cdot 2^{i}\) in the i-th row). Consequently, the convergence orders in Table 10 tend to the optimal values more slowly.

Table 10 Example 1: ε = 10, numerical results computed using the SMUAS method with pij, qij defined by (67) on a triangular version of the mesh4_1 family from the FVCA5 benchmark
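The effect of the linear growth of ne on the observed orders can be illustrated with a small computation. The formula for the experimental order of convergence used below is the standard one and is not stated in this section; it is also assumed that the mesh width is proportional to 1/ne. The error values come from a synthetic model e(h) = h² + 5h³ chosen only for illustration (they are not data from the tables), and the number of doubling refinements is arbitrary.

```python
import math


def experimental_orders(ne, errors):
    """Experimental orders between consecutive meshes, assuming h ~ 1/ne."""
    orders = []
    for k in range(1, len(ne)):
        mesh_ratio = ne[k] / ne[k - 1]   # equals h_{k-1} / h_k
        orders.append(math.log(errors[k - 1] / errors[k]) / math.log(mesh_ratio))
    return orders


def synthetic_error(n):
    """Synthetic error model with leading order 2 (not data from the paper)."""
    h = 1.0 / n
    return h**2 + 5.0 * h**3


ne_doubling = [8 * 2**i for i in range(1, 7)]   # exponential growth of ne
ne_linear = [17 * i for i in range(1, 7)]       # linear growth, as for mesh4_1
for ne in (ne_doubling, ne_linear):
    errors = [synthetic_error(n) for n in ne]
    print(ne, [round(r, 3) for r in experimental_orders(ne, errors)])
```

With the same number of meshes, the finest mesh of the linear sequence is much coarser than that of the doubling sequence, so the higher-order part of the error is still visible in the last computed order.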

It is not surprising that the SMUAS method provides the exact solution on any mesh if it is applied to Example 2. For Example 3 and Grid 4, the solution of the SMUAS method is nodally exact except for the rightmost vertical interior grid line, similarly to the AFC scheme with the Kuzmin limiter and Grid 1 in Fig. 7 (left). Also for Example 4, the SMUAS method on Grid 4 provides an approximate solution which is nodally exact in most of the computational domain (see Fig. 13 (left)). The approximation of the boundary layers should be improved by local mesh refinement. Finally, also in the case of Example 5, an application of the SMUAS method on Grid 4 leads to a much more accurate approximate solution than the AFC scheme with the Kuzmin limiter (see Figs. 13 (right) and 9 (right)). Moreover, the SMUAS method again shows optimal convergence rates.

Fig. 13 Approximate solutions computed using the SMUAS method with pij, qij defined by (67) on Grid 4 with ne = 20 for Example 4 (left) and Example 5 (right)

Summarizing our numerical results, one can state that the SMUAS method led to optimal convergence rates in all our numerical tests involving various types of simplicial meshes whereas, in many cases, the algebraic stabilizations from Section 4 lead to suboptimal convergence rates or do not converge at all. A theoretical explanation of the observed optimal convergence behavior of the SMUAS method is left to future work.

The properties of the SMUAS method proved in this paper together with the observed optimal experimental convergence rates make the SMUAS method superior to the remaining three algebraically stabilized methods discussed in this paper. This is also illustrated by Table 11 where the basic properties of all four methods are compared.

Table 11 Comparison of the properties of the AFC scheme with the Kuzmin limiter (Section 4.1), the AFC scheme with the BJK limiter (Section 4.2), the MUAS method (Section 4.3), and the SMUAS method (Section 6)

Remark 5

To compute solutions of the algebraic stabilizations considered in this paper, systems of nonlinear algebraic equations have to be solved. For the AFC schemes with the Kuzmin and BJK limiters, various nonlinear solvers have been studied in [17] and it turned out that a simple fixed point iteration called fixed point rhs was the most efficient method. This approach was also used to compute the results presented in this paper. We observed that the convergence of this nonlinear solver for the SMUAS method was very similar to that for the AFC scheme with the Kuzmin limiter, and we refer to [17, 22], where detailed convergence studies for fixed point rhs applied to the AFC scheme with the Kuzmin limiter can be found for various test problems. These convergence studies investigate the influence of the mesh width on the numbers of iterations and computing times and include test problems with interior layers and a non-constant convection field.
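The iteration is not written out in this section. A minimal sketch of one common way to organize a fixed point iteration with a constant matrix and a lagged right-hand side is given below. It assumes that the nonlinear system has the form A U + B(U) U = G and that the constant matrix is A + D, where D denotes the stabilization matrix with all limiters switched off; this splitting, the damping parameter, the stopping criterion, and the name `fixed_point_rhs` are assumptions made for illustration and may differ from the solver studied in [17].

```python
import numpy as np


def fixed_point_rhs(A, D, B_of, G, U0, damping=1.0, tol=1e-10, max_iter=200):
    """Iterate (A + D) U_new = G + (D - B(U_old)) U_old with optional damping."""
    M = A + D                        # constant matrix; could be factorized once
    U = np.array(U0, dtype=float)
    for iteration in range(max_iter):
        B = B_of(U)
        residual = G - (A + B) @ U   # residual of the nonlinear system
        if np.linalg.norm(residual) <= tol:
            return U, iteration
        U_tilde = np.linalg.solve(M, G + (D - B) @ U)
        U = U + damping * (U_tilde - U)
    return U, max_iter


# Deliberately trivial toy problem (hypothetical data): B does not depend on U,
# so the limit can be compared with a direct solve.
A = np.array([[2.0, -1.0, 0.0, 0.0],
              [-1.0, 2.0, -1.0, 0.0],
              [0.0, -1.0, 2.0, -1.0],
              [0.0, 0.0, -1.0, 2.0]])
D = 0.5 * np.array([[1.0, -1.0, 0.0, 0.0],
                    [-1.0, 2.0, -1.0, 0.0],
                    [0.0, -1.0, 2.0, -1.0],
                    [0.0, 0.0, -1.0, 1.0]])
G = np.ones(4)
U, iterations = fixed_point_rhs(A, D, lambda U: 0.5 * D, G, np.zeros(4))
print(iterations, np.allclose(U, np.linalg.solve(A + 0.5 * D, G)))
```

The appeal of such a splitting is that only the right-hand side changes between iterations. In an actual computation, `B_of` would evaluate the limiter-dependent stabilization matrix for the current iterate.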