1 Introduction

The two dimensional shallow-water equations (SWE) are used for a wide range of applications in environmental and hydraulic engineering, oceanography, and many other areas. They are discretized on computational domains that can be very large and often feature complex geometries; therefore, the numerical schemes must be computationally efficient and robust. The nonlinearity and hyperbolic character of the SWE system constitute additional challenges for designing discretizations and solution algorithms, while other application-specific aspects such as local conservation of unknown quantities and well-balancedness represent further desirable properties [see Hinkelmann et al. (2015)] for a brief overview of key requirements for SWE models).

The aforementioned issues led to a large number of studies dedicated to the development, analysis, and practical evaluation of various numerical techniques for solving the SWE. The earliest models used finite differences on structured grids, but, with the emergence of unstructured-mesh models [e.g. TELEMAC (Galland et al. 1991) or ADCIRC (Luettich et al. 1992)], finite elements and finite volumes became the de-facto standard. A big advantage of the finite element approach is its potential to naturally accommodate higher-order discretizations on unstructured meshes; in this vein, various methods based on the continuous Galerkin (CG) and discontinuous Galerkin (DG) approximation spaces (or mixtures of both) have been described and compared in the literature (Hanert et al. 2003; Comblen et al. 2010). The results of these comparisons can be summarized (in a somewhat oversimplified fashion) as follows:

  • Using CG for elevation and velocity is computationally very efficient (at least the low-order schemes) but tends to have stability issues usually represented by spurious elevation or velocity modes that arise from the LBB condition or from too large elevation spaces;

  • Using DG spaces for both unknowns is robust and needs no additional stabilization but may turn out to be computationally expensive [up to a factor of four to five longer serial execution times compared to CG for piecewise linears on the same mesh (Dawson et al. 2006)]. However, DG discretizations deliver higher accuracy than their CG counterparts (Kubatko et al. 2009), are locally conservative, have better support for adaptive and non-conforming meshes, and can partially offset their computational costs by more efficient parallel scaling (Kubatko et al. 2009);

  • Some combinations of continuous and discontinuous spaces such as the lowest-order Raviart–Thomas spaces are robust and computationally efficient (Hanert et al. 2003) but difficult to generalize to higher orders on triangular meshes.

The idea of the enriched Galerkin (EG) method is to enhance the CG approximation space using element-local discontinuous functions and, by relying on a solution procedure nearly identical to that of the DG method (i.e. edge fluxes, Riemann solvers, etc.), produce a robust, locally conservative discretization with fewer unknowns than a DG discretization of the same order.

In its original form, the EG method adds a piecewise constant DG component to a continuous piecewise linear or multilinear CG space and uses the DG bilinear form. This method was introduced for the linear advection–diffusion–reaction equation in Becker et al. (2003) and was proved there to be stable and to converge at the same rate as the piecewise linear DG discretization. The piecewise constant enrichment of the linear CG approximation makes the scheme intrinsically stable and imparts the local conservation property (Sun and Liu 2009). Since then, the EG method has been generalized to CG spaces of the polynomial order k augmented by discontinuous, element-wise constant functions (Sun and Liu 2009; Lee et al. 2016) and even to arbitrary enrichments with polynomials of degree m with \(-1 \le m \le k\) (\(m=-1\) indicates no enrichment at all) as discussed in Rupp and Lee (2020). Arbitrary enrichments allow to consider EG as a generalization of both CG and DG since EG uses the same bilinear and linear forms as DG.

While inheriting many advantages of DG, the EG method in multiple dimensions needs fewer degrees of freedom (DOF) than the DG method if \(m < k\). The standard EG method (with \(m = 0\)) has been developed to solve general elliptic and parabolic problems with dynamic mesh adaptivity (Choi and Lee 2019; Lee et al. 2018; Lee and Wheeler 2017, 2018) and was extended to address multiphase fluid flow problems (Lee and Wheeler 2018). Recently, the EG method has been applied to solve the nonlinear poroelastic problem (Choo and Lee 2018; Kadeethum et al. 2019), and its performance has been compared to other two- and three-field methods (Kadeethum et al. 2019). Another generalization of the EG approach considers enrichments by discontinuous polynomials defined on a subcell mesh (Rupp et al. 2020).

The EG methodology should not be confused with numerous other approximation space enrichment approaches such as bubble functions or eXtended Finite Element Method (XFEM) schemes. Main differences are the purpose (robustness and local conservation for EG vs. higher accuracy for bubbles and XFEM) and the underlying framework (DG vs. CG). Similarly to DG, the EG method also appears to be well-suited for computational enhancements such as hybridization or static condensation—this topic deserves a separate study.

In the context of convection-dominated problems, additional stabilization techniques are often required. Classical stabilizations include the streamline upwind Petrov–Galerkin (SUPG) method, weighted essentially non oscillatory (WENO) schemes, as well as provably bound-preserving alternatives [e.g. Brooks and Hughes (1982), Shu (2009), Cockburn and Shu (1998), Zhang and Shu (2011), Dumbser et al. (2014)]. Popular methods designed particularly for DG discretizations include the edge-based Barth–Jesperson limiter (Barth and Jespersen 1989), and its vertex-based counterparts (Kuzmin 2010; Aizinger 2011). Limiters for an EG discretization of the linear advection equation have recently been proposed in Kuzmin et al. (2020). The approach therein deviates from classical DG slope limiters but rather fits in the framework of algebraic flux correction (Kuzmin 2012), which only recently has been extended to the DG setting (Anderson et al. 2017; Hajduk et al. 2020). Numerical solutions based on the methods in Kuzmin et al. (2020) can be proven to satisfy discrete maximum principles under CFL-like time step restrictions, which makes the approach superior to geometrical slope limiting.

The main focus of the present work is to formulate and evaluate an EG scheme for the SWE and to compare the quality of the EG and DG discretizations. The method is implemented in the FESTUNG framework (Frank et al. 2015; Reuter et al. 2016; Jaust et al. 2018; Reuter et al. 2020) by modifying our DG implementation for the SWE presented in Reuter et al. (2019), Hajduk et al. (2018). The same scheme was initially introduced in our UTBEST solver (Dawson and Aizinger 2002; Aizinger and Dawson 2002) and later extended to three dimensions in UTBEST3D (Dawson and Aizinger 2005; Aizinger et al. 2013; Reuter et al. 2015).

The paper is structured as follows. The mathematical model and its discretization using the EG method are presented in Sect. 2, a brief description of the implementation using our MATLAB / GNU Octave framework FESTUNG is the subject of Sect. 3. In Sect. 4, we demonstrate the accuracy and robustness of our EG scheme using an analytical convergence test, a supercritical flow example with discontinuous solution, and a realistic tidal flow scenario for Bahamas islands. A short conclusions and outlook section wraps up this work.

2 EG formulation for the SWE

2.1 Governing equations

The SWE in conservative form are given by

$$\begin{aligned}&\partial _t \xi +\nabla \cdot \varvec{q}=0, \end{aligned}$$
$$\begin{aligned}&\partial _t \varvec{q} +\nabla \cdot \left( \varvec{qq}^\mathsf {T}/H \right) +\tau _{\text {bf}} \varvec{q} +\left( {\begin{matrix} 0 &{}\quad -f_\mathrm {c}\\ f_\mathrm {c} &{}\quad 0 \end{matrix}} \right) \varvec{q} +gH\nabla \xi =\varvec{F}. \end{aligned}$$

They are considered on a two-dimensional, polygonally-bounded domain \(\Omega \) and finite time interval \(\left( t_0,\,t_\mathrm {end}\right) \). By \(\xi \), we denote the free surface elevation of the water body with respect to a certain zero level (e.g., the mean sea level). The quantity \(H = \xi - z_\mathrm {b}\) represents the total fluid depth with \(z_\mathrm {b}\) denoting the bathymetry. \(\varvec{q} :=(U,V)^\mathsf {T}\) is the depth integrated horizontal velocity field, \(f_\mathrm {c}\) the Coriolis coefficient, g the gravitational acceleration, and \(\tau _{\text {bf}}\) the bottom friction coefficient. Wind stress, the atmospheric pressure gradient, and tidal potential are combined in the body force term \(\varvec{F} :=(F_x,F_y)^\mathsf {T}\).

Defining \(\varvec{c} :=(\xi , U,V)^\mathsf {T}\), system (1) can be rewritten in the following compact form:

$$\begin{aligned} \partial _t \varvec{c}+ \nabla \cdot \varvec{A}(\varvec{c})= \varvec{r}(\varvec{c}) \end{aligned}$$


$$\begin{aligned} \varvec{A}(\varvec{c}) =\begin{pmatrix} U &{} V\\ \frac{U^2}{\xi - z_\mathrm {b}}+g\xi \left( \frac{\xi }{2} - z_\mathrm {b}\right) &{}\frac{U V}{\xi -z_\mathrm {b}}\\ \frac{U V}{\xi - z_\mathrm {b}}&{}\frac{V^2}{\xi - z_\mathrm {b}}+g\xi \left( \frac{\xi }{2} - z_\mathrm {b}\right) \end{pmatrix} \end{aligned}$$


$$\begin{aligned} \varvec{r}(\varvec{c}) =\begin{pmatrix} 0\\ -\tau _{\text {bf}}U+f_\mathrm {c}V-g\xi \partial _x z_\mathrm {b}+F_x\\ -\tau _{\text {bf}}V-f_\mathrm {c}U- g\xi \partial _y z_\mathrm {b}+F_y\\ \end{pmatrix}. \end{aligned}$$

In this work, we use several types of boundary conditions for the SWE (1). Following the standard approach, interior values are used as boundary values at parts of the boundary, on which the corresponding unknowns are not prescribed. By  \(\hat{\cdot }\) , we denote prescribed boundary values of the respective unknowns. The following types of boundary conditions are used in this work:

Dirichlet boundary: Here, all unknowns are specified:

$$\begin{aligned} \xi = \hat{\xi }, \quad \varvec{q} = \hat{\varvec{q}}. \end{aligned}$$

Land boundary: Denoting by \(\varvec{n}\) the exterior unit normal to \(\partial \Omega \), we set the normal flux to zero:

$$\begin{aligned} \varvec{q} \cdot \varvec{n} = 0. \end{aligned}$$

Open sea boundary: We prescribe the free surface elevation:

$$\begin{aligned} \xi = \hat{\xi }. \end{aligned}$$

Radiation boundary: No unknowns are specified (free outflow). Finally, initial conditions are set for elevation and depth integrated velocity

$$\begin{aligned} \xi (\varvec{x},0) = \xi _0(\varvec{x}),\quad {\varvec{q}}(\varvec{x}, 0) = {\varvec{q}}_0(\varvec{x}). \end{aligned}$$

2.2 Enriched Galerkin discretization

Let \(\{\mathcal {T}_\Delta \}_{\Delta >0}\) be a simplicial, shape-regular, quasi uniform, geometrically conformal family of triangulations of \(\Omega \subset \mathbb {R}^2\) with \(\# T\) denoting the total number of elements of \(\mathcal {T}_\Delta \). We obtain the local variational formulation of system (2) by multiplying with smooth test functions \(\varvec{\phi } \in C^\infty (\Omega )^3\) and integrating by parts on each element \(T \in \mathcal {T}_\Delta \) yielding

$$\begin{aligned} \left( \partial _t \varvec{c},\varvec{\phi }\right) _{T}-\left( \varvec{A}(\varvec{c}), \nabla \varvec{\phi }\right) _{T}+\langle \varvec{A}(\varvec{c})\cdot \varvec{n}, \varvec{\phi }\rangle _{\partial T}= \left( \varvec{r}(\varvec{c}),\varvec{\phi }\right) _{T}, \end{aligned}$$

where we write \((\cdot ,\cdot )_{T}\) and \(\langle \cdot ,\cdot \rangle _{\partial T}\) for the \(L^2\)-scalar products on elements and their boundaries, and denote by \(\varvec{n}=(n_x, n_y)^\mathsf {T}\) an exterior unit normal to \(\partial T\).

Defining the broken polynomial spaces of order \(m \in \mathbb {N}_0\) as

$$\begin{aligned} \mathbb P_m(\mathcal T_\Delta ) :=\Big \{ v \in L^2(\Omega ) \, : \, v_{\mid T} \hbox { is a polynomial of degree at most}\ m, \forall T\in \mathcal T_\Delta \Big \} \end{aligned}$$

and setting \(\mathbb P_{-1}(\mathcal T_\Delta ) :=\{0\}\), we specify the EG test and trial spaces as

$$\begin{aligned} \mathbb P_{k,m}(\mathcal T_\Delta ) :=\Big ( \mathbb P_k(\mathcal T_\Delta ) \cap C(\Omega ) \Big ) + \mathbb P_m(\mathcal T_\Delta ) \end{aligned}$$

for integers \(-1 \le m \le k\), \(k > 0\). Obviously, \(\mathbb P_m(\mathcal T_\Delta ) \subset \mathbb P_{k,m}(\mathcal T_\Delta ) \subset \mathbb P_k(\mathcal T_\Delta )\). Here, ‘+’ denotes the sum of subspaces which is not a direct sum if \(m \ne -1\). Examples of spaces are given in Fig. 1.

Fig. 1
figure 1

Example of EG spaces: CG spaces are marked by a solid line, DG spaces by a dashed line. EG spaces of interest are those lying in-between

From (10) it follows immediately that

$$\begin{aligned} \dim \mathbb P_{k,m}(\mathcal {T}_\Delta )&= \dim \Big (\mathbb P_k(\mathcal {T}_\Delta )\cap C(\Omega )\Big ) + \dim \mathbb P_m(\mathcal {T}_\Delta )\nonumber \\&\quad - \dim \left( \Big (\mathbb P_k(\mathcal {T}_\Delta )\cap C(\Omega )\Big )\cap \mathbb P_m(\mathcal {T}_\Delta )\right) \nonumber \\&= \dim \mathbb P_{k,-1}(\mathcal {T}_\Delta ) + \dim \mathbb P_{m,m}(\mathcal {T}_\Delta ) - \dim \mathbb P_{m,-1}(\mathcal {T}_\Delta )\,. \end{aligned}$$

This dimension formula will come handy in Sect. 3 for defining EG shape functions. More details about these spaces and their properties can be found in Rupp and Lee (2020).

To obtain the semi-discrete EG formulation of (9), \(\varvec{c}\) and \(\varvec{\phi }\) are replaced by their discrete counterparts \(\varvec{c}_\Delta ,\varvec{\phi }_\Delta \in \mathbb P_{k,m}(\mathcal T_\Delta )^3\). Since the values of a discontinuous function are not unique on element edges, we replace the boundary term \(\varvec{A}(\varvec{c})\cdot \varvec{n}\) by a numerical flux \(\varvec{\hat{A}}(\varvec{c}_\Delta ,\varvec{c}^+_\Delta , \varvec{n})\) that depends on the discontinuous values of the solution on element T (without superscript) and its edge neighbor (superscript \(^+\)). On domain boundaries, the specified boundary values of free surface elevation and velocity are utilized in place of \(\varvec{c}^+_\Delta \) for the flux computation. Finally, summing up over all elements \(T \in \mathcal {T}_\Delta \) yields

$$\begin{aligned}&\sum _{T\in \mathcal {T}_{\Delta }} \left( \partial _t \varvec{c}_\Delta ,\varvec{\phi }_\Delta \right) _{T} -\sum _{T\in \mathcal {T}_{\Delta }}\left( \left( \varvec{A}(\varvec{c}_\Delta ), \nabla \varvec{\phi }_\Delta \right) _{T} -\langle \varvec{\hat{A}}(\varvec{c}_\Delta ,\varvec{c}^+_\Delta , \varvec{n}),\varvec{\phi }_\Delta \rangle _{\partial T}\right) \nonumber \\&\quad = \sum _{T\in \mathcal {T}_{\Delta }}\left( \varvec{r}(\varvec{c}_\Delta ),\varvec{\phi }_\Delta \right) _{T}. \end{aligned}$$

In this work, we use the Lax–Friedrichs flux combined with the Roe–Pike averaging (Aizinger and Dawson 2002; Roe 1981; Roe and Pike 1984) defined as

$$\begin{aligned} \varvec{\hat{A}}(\varvec{c}_\Delta ,\varvec{c}^+_\Delta , \varvec{n}) =\frac{1}{2}\left( \varvec{A}(\varvec{c}_\Delta )+\varvec{A}(\varvec{c}_\Delta ^+)\right) \cdot \varvec{n} +\frac{\lambda }{2}\left( \varvec{c}_\Delta -\varvec{c}_\Delta ^+\right) , \end{aligned}$$

where \(\lambda = \lambda (\varvec{c}_\Delta ,\varvec{c}_\Delta ^+,\varvec{n})\) is given by

$$\begin{aligned} \lambda (\varvec{c}_\Delta ,\varvec{c}_\Delta ^+,\varvec{n})= & {} \left| \frac{\left( U_\Delta \sqrt{H_\Delta ^+} + U_\Delta ^+\sqrt{H_\Delta ^{}}\right) n_x + \left( V_\Delta \sqrt{H_\Delta ^+} + V_\Delta ^+\sqrt{H_\Delta ^{}}\right) n_y}{H_\Delta \sqrt{H_\Delta ^+} + H_\Delta ^+\sqrt{H_\Delta }}\right| \\&+ \sqrt{\frac{g}{2}\left( H_\Delta +H_\Delta ^+\right) } \end{aligned}$$

with \(H_\Delta ^+ :=\xi _\Delta ^+ - z_\mathrm {b}\) (note that the bathymetry \(z_\mathrm {b}\) is assumed to be continuous).

Discrete initial conditions are obtained using suitable projections of \(\xi _0\) and \(\varvec{q}_0\) into the discrete function space (10) (cf. Rupp and Lee (2020), (Sect. 5) for a comparison of reasonable choices for such projections). For the temporal discretization, we use strong stability preserving (SSP) explicit Runge–Kutta methods (Gottlieb et al. 2001).

This study uses the inviscid system of SWE (1) (similarly to our DG scheme for the SWE (Hajduk et al. 2018), the EG formulation needs no viscous stabilization). Such terms can be easily added in the same way as in DG methods [see, e.g. Lee et al. (2016)].

3 Implementation aspects

Our implementation of the EG method is based on the DG code for the SWE realized within the FESTUNG framework (Reuter et al. 2019; Frank and Reuter 2020). To obtain the required EG operators, we use a simple strategy of modifying the DG code that exploits the fact that the EG (or, for that matter, also the CG) approximation spaces are embedded in the DG spaces of the same order. Therefore, all EG or CG basis functions can be represented as linear combinations of DG basis functions. This representation can be written in the form of a (generally rectangular) system of linear equations, which can then be used to convert an available DG operator into the corresponding one for EG or CG spaces. This trick is by no means restricted to the SWE system; it can be just as easily applied to any linear or nonlinear PDE. The downside of this approach is that a full DG discretization must be assembled first—thus limiting the efficiency gain due to a smaller approximation space of the EG method. While simplifying switching between different DG, EG, and CG schemes, this approach is not recommended if a dedicated EG scheme were to be implemented from scratch. However, our implementation uses for this purpose multiplication with a time-independent matrix (highly optimized operation in MATLAB / GNU Octave) that only incurs a negligible runtime overhead.

In this work, we utilize finite element spaces of polynomial degree \(k \le 2\). First consider the DG space \(\mathbb P_k(\mathcal T_\Delta )\) and denote by \(N:=\dim \mathbb P_k(\mathcal T_\Delta )\) the number DG unknowns for a scalar quantity. We have \(\dim \mathbb P_1(\mathcal T_\Delta ) = 3\,\#T\) and \(\dim \mathbb P_2(\mathcal T_\Delta ) = 6\,\#T\). For fixed k, let \(\left\{ \varphi _i: i\in \{1,\ldots ,N\}\right\} \) be the element-wise continuous nodal basis of the DG method, where the nodal property is local to every element \(T \in \mathcal T_\Delta \). Moreover, we denote by \(\left\{ \phi _i: i\in \{1,\ldots ,M\}\right\} \) with \(M:=\dim \mathbb P_{k,m}(\mathcal T_\Delta )\) a basis of the EG space.

In the following, we construct bases for various EG spaces from CG and DG basis functions. Here, one has to be careful since, in general, simply combining a CG basis with element-wise DG basis functions produces a linearly-dependent set. For the admissible combinations of k and m with \(-1\le m< k \le 2\) considered in our work, the EG bases can be constructed as follows:

  • \(k \in \{1,2\},~m=-1\): The space \(\mathbb P_{k,-1}(\mathcal {T}_\Delta )\) coincides with the CG ansatz space of order at most k. Therefore, we can simply choose the CG basis functions.

  • \(k=1,~m=0\): The union of characteristic functions of all elements \(T \in \mathcal T_\Delta \) and the continuous, piecewise linear interpolation functions for all but one mesh vertex form a basis.

  • \(k=2,~m=0\): We obtain a basis from the union of characteristic functions of all elements and all but one of the shape functions for the quadratic CG space.

  • \(k = 2,~m=1\): We use the standard linear DG basis and extend it by the nodal quadratic shape functions equal to 1 at one edge midpoint, but omit the ones corresponding to cell vertices. The functions are linearly independent and are contained in the space \(\mathbb P_{2,1}(\mathcal {T}_\Delta )\). As the number of basis functions equals \(\dim \mathbb P_{2,1}(\mathcal {T}_\Delta ) = 3\,\#T+\,\#E\), it must be a basis of \(\mathbb P_{2,1}(\mathcal {T}_\Delta )\). Here, \(\#T\) and \(\#E\) denote the number of triangles and edges of \(\mathcal {T}_\Delta \), respectively. The dimension formula (11) ensures that this, in fact, constitutes a basis of \(\mathbb P_{2,1}(\mathcal T_\Delta )\).

3.1 Assembling nonlinear EG operators from DG operators

Next, we show how to obtain an EG from the corresponding DG operator. To simplify the presentation, we formulate our approach for the scalar case noting that the generalization to vector fields is straightforward. DG discretizations of nonlinear PDEs such as the SWE feature nonlinear operators of the form

$$\begin{aligned} \varvec{B}^{\mathrm {DG}}: \mathbb {R}^{N} \rightarrow \mathbb {R}^{N},\quad \varvec{x} \mapsto \Bigg (b\,\Bigg (\sum _{j=1}^{N}x_j\varphi _j,\varphi _i\Bigg )\Bigg )_{i=1,\ldots ,N}, \end{aligned}$$

where \(b: \mathbb P_k(\mathcal T_\Delta ) \times \mathbb P_k(\mathcal T_\Delta ) \rightarrow \mathbb {R}\) is linear in the second argument. Our goal is to form the EG operator

$$\begin{aligned} \varvec{B}^{\mathrm {EG}}: \mathbb {R}^{M} \rightarrow \mathbb {R}^{M},\quad \varvec{y} \mapsto \Bigg (b\,\Bigg (\sum _{j=1}^{M}y_j\phi _j,\phi _i\Bigg )\Bigg )_{i=1,\ldots ,M}, \end{aligned}$$

by making use of (14). Since \(\mathbb P_{k,m}(\mathcal T_\Delta )\subset \mathbb P_{k}(\mathcal T_\Delta )\), it is possible to express the EG basis functions as linear combinations of the DG basis

$$\begin{aligned} \phi _i = \sum _{k=1}^{N} C_{ik} \varphi _k,\quad C_{ik} \in \mathbb {R},~i \in \{1,\ldots ,M\}. \end{aligned}$$

Next, we insert (16) into (15), obtaining

$$\begin{aligned} b\,\Bigg (\sum _{j=1}^{M}y_j\phi _j,\phi _i\Bigg )&\;=\; b\,\Bigg (\sum _{j=1}^{M}y_j\sum _{l=1}^{N} C_{jl} \varphi _l,\sum _{k=1}^{N} C_{ik} \varphi _k\Bigg )\\&\;=\; \sum _{k=1}^{N} C_{ik}\,b\,\Bigg (\sum _{l=1}^{N}\Bigg (\sum _{j=1}^{M}C_{jl}y_j\Bigg ) \,\varphi _l, \varphi _k\Bigg ) \end{aligned}$$

for \(i \in \{1,\ldots ,M\}\). By defining \(\varvec{C} :=\left( C_{kl}\right) _{kl}\), \(k \in \{ 1,\ldots ,M\}\), \(l \in \{ 1,\ldots ,N\}\), we can thus write

$$\begin{aligned} \varvec{B}^{\mathrm {EG}}(\varvec{y})&= \Bigg (b\,\Bigg (\sum _{j=1}^{M}y_j\phi _j,\phi _i\Bigg )\Bigg )_{i=1,\ldots ,M} = \varvec{C}\,\Bigg ( b\,\Bigg (\sum _{l=1}^{N}\Big (\varvec{C}^\mathsf {T}\varvec{y}\Big )_l \varphi _l, \varphi _k\Bigg )_{}\Bigg )_{k=1,\ldots ,N} \nonumber \\&\;=\; \varvec{C}\varvec{B}^{\mathrm {DG}}(\varvec{C}^\mathsf {T}\varvec{y}). \end{aligned}$$

Employing (17), we assemble the terms corresponding to (15) by modifying an existing DG discretization of the operator (14) and computing the matrix \(\varvec{C}\). Due to the above choice of EG and DG basis functions, the matrix \(\varvec{C}\) can be determined from simple geometric considerations during preprocessing.

3.2 Assembly of the EG mass matrix

We consider the operator \(b_\mathrm M\) defined by \(b_\mathrm M(u_{\Delta },v_{\Delta }) :=\left( u_{\Delta }\,,\,v_{\Delta }\right) _{\Omega }\). Since \(b_\mathrm M\) is a bilinear form, we can write its induced operator (mass matrix) as

$$\begin{aligned} \varvec{B}_\mathrm M^{\mathrm {DG}}: \mathbb {R}^N \rightarrow \mathbb {R}^N,\quad \varvec{x} \mapsto \varvec{B}_\mathrm M^{\mathrm {DG}} \varvec{x} \end{aligned}$$

with \(\left( B_\mathrm {M}^{\mathrm {DG}}\right) _{ij} = \int _{\Omega } \varphi _j \varphi _i\, dx\) for \(i,j \in \{1,..,N\}\). The corresponding EG mass matrix

$$\begin{aligned} \varvec{B}_\mathrm M^{\mathrm {EG}} \in \mathbb {R}^{M\times M}, \quad \left( B_\mathrm {M}^{\mathrm {EG}}\right) _{ij} = \int _{\Omega } \phi _j \phi _i\, dx \end{aligned}$$

can be obtained from (17), and, in operator form, is given by

$$\begin{aligned} \varvec{B}_\mathrm M^{\mathrm {EG}}: \mathbb {R}^M \rightarrow \mathbb {R}^M,\quad \varvec{y} \mapsto \varvec{C} \varvec{B}_\mathrm M^{\mathrm {DG}} \varvec{C}^\mathsf {T}\varvec{y}. \end{aligned}$$

That is, for operators induced by bilinear forms, the assembly can be preprocessed.

4 Numerical results

In this section, we investigate the performance of the EG method using artificial and realistic test problems for the SWE. The main goals of these numerical studies can be summarized as follows:

  • Verify the expected rate of convergence against a manufactured analytical solution;

  • evaluate the solution quality for realistic benchmarks;

  • compare the EG and DG methods in terms of accuracy, stability, and robustness.

We denote the results obtained for different combinations of k and m by the corresponding finite element spaces as illustrated in Fig. 1. To simplify notation, we omit their dependency on \(\mathcal {T}_\Delta \). Hence, we write \(\mathbb P_{k,m}\) instead of \(\mathbb P_{k,m}(\mathcal T_\Delta )\) and use the convention that \(\mathbb P_{k,k}\) is the DG space of polynomial degree k. As temporal discretization, we use SSP Runge–Kutta time stepping schemes described in Aizinger et al. (2000) with \(s = k+1\) stages. The time step size depends on the specific test problem.

4.1 Analytical convergence test

In our first numerical experiment, we approximate a smooth solution of the SWE on the domain \(\Omega :=(0,1000) \times (0,1000)\). The coarsest mesh (corresponding to level one) consists of 16 triangular elements and is shown in Fig. 2 (left). To investigate the convergence behavior, we consider a total of five meshes obtained from the coarsest mesh by uniform refinement via edge bisection.

Setting \(\tau _{\mathrm {bf}}\) and \(f_\mathrm {c}\) to zero and prescribing the bathymetry by

$$\begin{aligned} z_\mathrm {b}(x,y) :=-4+\frac{1}{1000} x+\frac{2}{1000}y \end{aligned}$$

we utilize the following analytical solution

$$\begin{aligned} \xi (x,y,t)&:=2+C_1-2 \,C_2\, \sin \left( \frac{\pi (x+y+C_3\,t)}{600}\right) , \end{aligned}$$
$$\begin{aligned} U(x,y,t)&:=2\,C_1+C_2\,C_3\sin \left( \frac{\pi (x+y+C_3\,t)}{600}\right) , \end{aligned}$$
$$\begin{aligned} V(x,y,t)&:=C_1+C_2\,C_3\sin \left( \frac{\pi (x+y+C_3\,t)}{600}\right) . \end{aligned}$$

Using the method of manufactured solution, we substitute (18) into (2)–(4) to obtain a forcing function for the right-hand side; in addition, Dirichlet boundary conditions for each unknown arising from (18) are imposed on all boundaries. We solve the SWE for the time interval (0, 1000) using the time step size \(\Delta t=1/4\) in all considered scenarios. This value of \(\Delta t\) is sufficiently small to make temporal discretization errors negligible compared to spatial approximation errors. Solution parameters are chosen as \(C_1=0.3,\, C_2=0.2,\, C_3=0.2\).

The projected initial surface elevation for \(\mathbb P_{1,0}\) on the coarsest mesh is shown in Fig. 2 (middle). Figure 2 (right) displays the numerical solution for the surface elevation using the same approximation (\(k=1,~m=0\)) on the finest mesh at the final time.

Fig. 2
figure 2

Analytical convergence test: Coarsest mesh (left), initial surface elevation for the EG method with \(k=1,~m=0\) on coarsest mesh (middle), EG solution at the final time \(t = 1000\) for the surface elevation on the finest mesh (level 5) for \(k=1,~m=0\) (right)

For all considered CG, EG, and DG methods (\(-1\le m\le k\le 2\)), we list the \(L^2(\Omega )\) discretization errors at the final time along with the corresponding convergence rates in Table 1 and plot them in Fig. 3. The results indicate that we obtain at least second order convergence for all methods. Third order convergence can be observed for \(k=2\) and \(m\in \{1,2\}\), but not for the CG approximation \(k=2, m=-1\). The results for \(k=2\) and \(m=0\) are slightly better than without the piecewise constant enrichment, but do not exhibit third order accuracy. In conclusion, the EG scheme converges almost exactly as well as the DG method if \(m=k-1\), \(k \in \{1,2\}\), although the absolute errors tend to be somewhat larger than for their DG counterparts. This is to be expected because EG has fewer unknowns, and even the projected exact solution in general becomes less accurate if the number of unknowns is reduced.

Table 1 Analytical convergence test: \(L^2(\Omega )\) errors Err(\(\cdot \)) at the final time and experimental orders of convergence EOC(\(\cdot \)) for the surface elevation and depth integrated velocity components (see Fig. 1)

In Table 2, we list the total numbers of degrees of freedom for all configurations. Note that, the DOF for the EG method with element-wise constant enrichment \((k \in \{1,2\},~m=0)\) has approximately half as many DOF as the DG method of order k. EG with \(k=2,~m=1\) has ca. three quarters of the DOF for the quadratic DG method. In conclusion, the EG method performs similarly to DG for smooth solutions while having significantly fewer DOF.

Fig. 3
figure 3

Analytical convergence test: Logarithmic plots of the \(L^2(\Omega )\) errors at the final time for the surface elevation \(\xi \) (left) and the depth integrated velocity magnitude \(|\varvec{q}|\) (right) for CG, EG, and DG discretizations (see Fig. 1)

Table 2 Mesh details: Number of triangles (\(\# T\)), vertices (\(\#V\)), edges (\(\#E\)), and degrees of freedom for CG, EG, and DG for all considered orders (see Fig. 1)

One has to note here that the relation between the computational cost (even in serial execution) and the number of degrees of freedom is not a simple one. In a time-explicit DG or EG scheme, main cost factors are element and edge integrals that are not significantly cheaper for the EG method than for the DG one. The situation is more complicated in time-implicit and semi-implicit cases (not evaluated in our study): The EG system is smaller than the DG one and whether it is faster to solve depends on a number of different aspects such as the condition number, sparsity structure, preconditioner, solver, etc.

The focus of the present study is an evaluation of the accuracy, stability, and robustness of the EG method for the SWE; any conclusions about the computational performance of this scheme are clearly outside of the scope—in particular, since our implementation is based on MATLAB / GNU Octave and rides on top of the DG assembly for the SWE solver. However, to give an indication of the computational costs associated with an EG scheme, we list in Table 3 the cumulative (summed over all time steps) solution times for systems with the DG mass matrix (\(\mathbb {P}_{1,1}\)), the EG mass matrix (\(\mathbb {P}_{1,0}\)), and the EG mass matrix (\(\mathbb {P}_{1,0}\)) lumped according to the scheme proposed in Becker et al. (2003). We see there that naively using the direct MATLAB solver (backslash operator that calls the Cholesky method) on the EG mass matrix produces a rather inefficient and poorly scaling implementation, whereas system solves with block-diagonal DG and lumped EG matrices scale much better. Another interesting point to note is the fact that lumping the mass matrix leads to some degradation of the method’s accuracy and its convergence.

Table 3 Cumulative solution time (seconds) for the linear equation systems with the mass matrix for the following configurations: DG for \(\mathbb {P}_{1,1}\) and block-diagonal mass matrix, EG for \(\mathbb {P}_{1,0}\) and the consistent mass matrix, EG for \(\mathbb {P}_{1,0}\) and the lumped mass matrix

4.2 Supercritical flow in a constricted channel

In order to demonstrate the stability and robustness of the EG method, we solve the supercritical flow problem proposed in Zienkiewicz and Ortiz (1995). The computational domain is a channel whose lateral boundary walls are constricted on both sides with an angle of five degrees (cf. Fig. 4). This benchmark uses constant bathymetry \(z_b \equiv -1\) while parameters \(\tau _{\mathrm {bf}},\) and \(f_\mathrm {c}\) are once more set to zero. The following initial and boundary conditions are prescribed for this problem: Initially, we set the surface elevation and momentum to \(\xi _0\equiv 0,\) \(\varvec{q}_0 = (1,0)^\mathsf {T}\), respectively. Land boundary conditions are imposed at the lateral (wall) boundaries, while at the inlet (left), free surface elevation as well as velocity are for all times specified to be identical to their corresponding initial values. Finally, at the outlet (right), radiation boundary conditions are used. Denoting by u and H the axial velocity and water depth at the inlet, respectively, the flow regime is made supercritical by choosing the inlet Froude number \(\mathrm {Fr}\)

$$\begin{aligned} \mathrm {Fr} :=\frac{u}{\sqrt{g H}} = 2.5, \end{aligned}$$

achieved by setting the gravitational constant \(g=0.16~\frac{m}{s^2}\).

The solution to this problem converges to a steady-state, for which an analytical solution is available (Ippen 1951). Figure 4 illustrates the computational domain along with the unstructured mesh used in all computations.

Fig. 4
figure 4

Supercritical flow in a constricted channel: Unstructured mesh with 3155 elements

We run all simulations to a steady-state using pseudo time stepping with a time step of \(\Delta t = 1/10\) and present the results for all schemes in Fig. 5. The steady-state surface elevation shown in Fig. 5 (top left) is discontinuous and displays interactions of waves reflected from the channel constrictions. Figure 5 (top right to bottom right) depicts the steady-state solutions for all considered EG and DG methods (\(0\le m\le k\le 2\)). Since no limiter was used, all approximations exhibit spurious oscillations close to the discontinuities. We expect the results to improve and the numerical solutions to become bound-preserving if a limiter, such as the one developed in Kuzmin et al. (2020), is utilized (at least) for the free surface elevation.

Fig. 5
figure 5

Supercritical flow in a constricted channel, free surface elevation, analytical solution (top left) and approximations (see Fig. 1) at steady-state for: \(\mathbb P_{1,1}\) (top right), \(\mathbb P_{1,0}\) (middle left), \(\mathbb P_{2,2}\) (middle right), \(\mathbb P_{2,0}\) (bottom left), \(\mathbb P_{2,1}\) (bottom right)

The specific type of enrichment appears to play a major role for the stability of EG schemes. Thus the results for enrichments with \(m=k-1,~k \in \{1,2\}\) are almost indistinguishable from the corresponding DG results. In particular, the characteristic wave features such as positions and magnitudes of discontinuities are in good agreement with the analytical solution. On the other hand, the EG solution for \(k=2,~m=0\) exhibits severe oscillations not only in the vicinity of the discontinuities but also in the remainder of the computational domain. This phenomenon indicates that a piecewise constant enrichment of the quadratic CG space may not be sufficient to obtain an intrinsically stable scheme. However, an enrichment by piecewise linear discontinuous functions seems to remedy this issue, which confirms the previous findings in Sect. 4.1 and (Rupp and Lee 2020). The results for this benchmark suggest that optimal EG schemes (\(m=k-1\)) possess similar stability properties to their DG counterparts while offering potential advantages in computational efficiency.

4.3 Tidal flow at Bahamas Islands

The next benchmark considered in this work involves a tidal flow scenario around the Bahamas Islands based on the configuration, parameter values, and mesh presented in Westerink et al. (1989) for the Bight of Abaco. The domain geometry, bathymetry as well as boundary types are depicted in Fig. 6 (left), and Fig. 6 (right) shows the unstructured mesh used in all simulations. Furthermore, four recording stations also shown in Fig. 6 (left) are placed at the following locations \((38\,667,49\,333)\), \((56\,098,9\,613)\), \((41\,263,29\,776)\), and \((59\,594,41\,149)\) (coordinates in meters) to monitor the temporal evolution of surface elevation and depth integrated velocity.

Fig. 6
figure 6

Tidal flow around Bahamas (Bight of Abaco): Bathymetry, boundary type and positions of recording stations (left), unstructured triangular mesh with 1696 elements (right)

For bottom friction, we use the standard quadratic friction law \(\tau _{\text {bf}} = C_f | \varvec{q} | /H^2\) [see e.g. Vreugdenhil (1994)] with coefficient \(C_f = 0.009\). The constant Coriolis parameter is set to \(3.19 \times 10^{-5}~\mathrm {s}^{-1}\). The following tidal forcing is prescribed at the open sea boundary (Kolar et al. 1994):

$$\begin{aligned} \hat{\xi }(t) \;=\;&0.075\cos \left( \frac{t}{25.82}+3.40\right) + 0.095\cos \left( \frac{t}{23.94}+3.60\right) \nonumber \\&+ 0.100\cos \left( \frac{t}{12.66}+5.93\right) + 0.395\cos \left( \frac{t}{12.42}+0.00\right) \nonumber \\&+ 0.060\cos \left( \frac{t}{12.00}+0.75\right) \end{aligned}$$

with time t in hours. In real-life ocean simulations, the initial conditions are often unknown or very difficult to obtain, therefore a cold start initialization is performed: The flow domain is assumed to be at rest initially (\(\xi _0 \equiv 0\), \({\varvec{q}}_0 \equiv (0,0)^T\)). Then, starting immediately at initial time \(t=0\), the tidal forcing (19) is imposed at the open sea boundary.

The simulations are run for a total of 12 days using constant time step \(\Delta t = 15\) seconds for EG and DG discretizations corresponding to \(0\le m \le k \le 2\). In Figs. 78 and 9, we compare the numerical solutions for all considered EG and DG methods at the four recording stations.

Fig. 7
figure 7

Tidal flow at Bahamas Islands: Free surface elevation for days eleven and twelve at stations one to four (left to right, top to bottom) (see Fig. 1 for legend explanation)

Fig. 8
figure 8

Tidal flow at Bahamas Islands: x-velocity component for days eleven and twelve at stations one to four (left to right, top to bottom) (see Fig. 1 for legend explanation)

Fig. 9
figure 9

Tidal flow at Bahamas Islands: y-velocity component for days eleven and twelve at stations one to four (left to right, top to bottom) (see Fig. 1 for legend explanation)

Surface elevation results in Fig. 7 demonstrate excellent agreement for all EG and DG methods. The curves lie nearly on top of each other and no differences can be visually detected at this resolution. The differences in the surface elevation between the EG and DG approximations at the recording stations are on the order of \(10^{-4}\) meters.

In Figs. 8 and 9, we plot both velocity components at the recording stations. For the station two (Figs. 89 top right), one can observe slight differences between the approximations of orders one and two. Such behavior is fully consistent with the station comparisons performed in Aizinger and Dawson (2002). The effect of approximation order has greater magnitude than the differences between the EG and DG discretizations. Small differences between EG and DG discretizations of the same order can be observed in the plots at some locations. The deviations at the recording stations are on the order of \(10^{-3}\) to \(10^{-4}~\mathrm {m\,s^{-1}}\).

In addition to testing the accuracy and robustness of the EG method, we also use the Bahamas example to verify the well-balancedness of the method. For this purpose, our simulation were run for the lake-at-rest configuration with open sea boundary condition set to zero. Just as expected, no spurious circulation emerged for any of the EG or DG configurations.

5 Conclusions and future work

In this work, we present the first enriched Galerkin discretization for the system of 2D shallow-water equations, evaluated its performance in analytical and realistic test problems, and compared the numerical results to those obtained using our discontinuous Galerkin solver. The results of our studies demonstrate that EG schemes with enrichments using discontinuous spaces of order one less than the order of the continuous space display accuracy and robustness on a par with the corresponding DG discretizations. Similarly to DG, the EG method guarantees local conservation of all primary unknowns while the total number of degrees of freedom is substantially lower than that for the DG space of the same order. This makes the enriched Galerkin method a very attractive candidate for solving the shallow-water equations and other nonlinear hyperbolic PDE systems.

Several interesting avenues of future research concerning the development of an EG solver for the SWE present themselves at this time. Thus one could try the quadrature-free methodology to improve the computational efficiency of the method—similarly to our recent work for the DG SWE solver in Faghih-Naini et al. (2020). Formulating the EG method for the SWE as a time-implicit or a semi-implicit scheme—especially in a combination with an efficient linear solver such as the hierarchical scale separation method either directly (Aizinger et al. 2015) or in a hybridized setting (Schütz and Aizinger 2017) appears to be particularly attractive due to similarities with the DG method, on one hand, and a substantially smaller system size, on the other. Also EG schemes hold promise (perhaps even more so than the DG ones) for bathymetry reconstruction using modified shallow-water equations (Hajduk et al. 2020).