1 Introduction

Discontinuous Galerkin (DG) methods combine some of the most attractive features of the classical (continuous) finite element and finite volume methods; they are widely used nowadays in many areas of computational science, particularly for fluid dynamics simulation. Thanks to local approximation stencils, DG schemes have a very favorable computation/communication ratio and are therefore well suited for high performance computing applications that utilize distributed memory parallelization or heterogeneous hardware architectures.

One of the most interesting aspects of the DG methodology is the support for the computational mesh (h-) or the local approximation space (p-) adaptivity. However, adaptive discretizations need indicators to guide the refinement/derefinenement. A wide variety of different error and smoothness indicators exists in the literature (see, for example, an overview in Naddei et al. 2019). Generally, these approaches can be subdivided into several classes according to the information used to evaluate the local discretization error (or the local solution regularity). A simple and computationally efficient scheme (Eskilsson 2011; Tumolo et al. 2013) measures the so-called local spectral decay of the DG solution, i.e., compares the absolute or relative magnitudes of the degrees of freedom corresponding to different polynomial orders. Another indicator, which has been already successfully used for the shallow water equations (SWE), estimates local solution gradients (Kubatko et al. 2009; Michoski et al. 2011). In Michoski et al. (2011), the first attempt was made to combine a slope limiter with a p-adaptive DG method; however, contrary to our approach, no slope limiting information was utilized by the adaptivity procedure and vice versa.

Our indicator uses techniques introduced in Krivodonova et al. (2004) for so-called non-conformity estimators which measure the absolute or relative size of discontinuity between the element-local solutions – mostly by integrating solution jumps over the inter-element boundaries. However, a new indicator became necessary due to additional requirements resulting from the specifics of our application:

  • Highly dynamic character of typical simulation scenarios for the SWE (e.g., tidal waves or tsunamis) requires a computationally efficient procedure,

  • Support for piecewise constant polynomial spaces – the indicator must be able to detect local errors using the lowest order approximation space—cannot be provided using, e.g., the spectral decay methods.

The adaptivity indicator proposed in this work aims to fulfill the above requirements, does not need any problem-specific (e.g. sensitivity) parameters, and is integrated into the vertex-based slope limiter (Aizinger 2011; Kuzmin 2010; Aizinger et al. 2017; Hajduk et al. 2018, 2019) further reducing the total computational cost of the scheme. In addition, our DG discretization based on modal hierarchical bases is highly suited for p-adaptive schemes – even more so since it relies on a quadrature-free formulation introduced in Faghih-Naini et al. (2020). Analytic evaluation of all element and edge integrals completely avoids the main overhead of varying-order approximation spaces: The necessity to use the most accurate (and thus the most expensive) quadrature rule or to maintain several quadrature rules of different orders.

We begin, in the next section, by introducing the mathematical model and its discretization by a quadrature-free DG method. Section 3 details our adaptivity indicator and two other indicators used for comparison. Numerical results evaluating the performance of our scheme are presented in Sect. 4 followed by a short Conclusions and outlook section.

2 Quadrature-free DG formulation for the shallow water equations

2.1 Governing equations

The presentation in this section closely follows (Faghih-Naini et al. 2020) and is included here for completeness. We start with the classical 2D shallow water equations defined on some 2D domain \(\varOmega \) and given by

$$\begin{aligned}&\partial _t \xi +\nabla \cdot \varvec{q}=0, \end{aligned}$$
(1)
$$\begin{aligned}&\partial _t \varvec{q} +\nabla \cdot \left( \varvec{qq}^T /H \right) +\tau _{\text {bf}} \varvec{q} +\left( {\begin{matrix} 0 &{} -f_c\\ f_c &{} 0 \end{matrix}} \right) \varvec{q} +gH\nabla \xi =\varvec{F}, \end{aligned}$$
(2)

where \(\xi \) represents the elevation of the free water surface with respect to some datum (e.g., the mean sea level). Using \(h_b\) to denote the bathymetric depth, \(H = h_b + \xi \) is then the total water depth. \(\varvec{q} := (U,V)^T\) denotes the depth integrated horizontal velocity field, \(f_c\) the Coriolis coefficient, g the gravitational acceleration, \(\tau _{\text {bf}}\) the bottom friction coefficient, and \(\varvec{F}\) the body force accounting for the variable atmospheric pressure and tidal potential.

Introducing the notation \(\varvec{c} := (\xi , U,V)^T\), system (1), (2) can be re-formulated in the following compact form:

$$\begin{aligned} \partial _t \varvec{c}+ \nabla \cdot {\widetilde{\varvec{A}}}(\varvec{c}) = \varvec{r}(\varvec{c}), \end{aligned}$$
(3)

where

$$\begin{aligned} {\widetilde{\varvec{A}}}=\begin{pmatrix} U &{} V\\ \frac{U^2}{H}+\frac{g \xi (H+h_b)}{2}&{}\frac{U V}{H}\\ \frac{U V}{H}&{}\frac{V^2}{H}+\frac{g\xi (H+h_b)}{2} \end{pmatrix}, \; \varvec{r}=\begin{pmatrix} 0\\ -\tau _{\text {bf}}U+f_cV+g\xi \partial _x h_b+F_x\\ -\tau _{\text {bf}}V-f_cU+g\xi \partial _y h_b+F_y\\ \end{pmatrix}. \end{aligned}$$

Taking into account the relation \(\varvec{q} = \varvec{u}H\), where \(\varvec{u}=(u,v)^T\) is the depth averaged velocity, system (3) can be re-formulated as follows:

$$\begin{aligned}&\partial _t \varvec{c}+ \nabla \cdot \varvec{A}(\varvec{c}, \varvec{u})= \varvec{r}(\varvec{c}), \end{aligned}$$
(4)
$$\begin{aligned}&\varvec{u} H = \varvec{q}, \end{aligned}$$
(5)

where

$$\begin{aligned} \varvec{A}(\varvec{c}, \varvec{u})&=\begin{pmatrix} U &{} V\\ Uu+\frac{g\xi (H+h_b)}{2}&{}U v\\ V u&{} V v+\frac{g\xi (H+h_b)}{2} \end{pmatrix}. \end{aligned}$$
(6)

With the help of the auxiliary vector \(\varvec{u}\) calculated diagnostically from (5), this formulation, first presented in Faghih-Naini et al. (2020), avoids fraction-type nonlinearities in the momentum equation (2) and is better suited for quadrature-free integration.

This work uses the following types of boundary conditions for the SWE:

Land boundary: At a land boundary, we assume no normal flow

$$\begin{aligned} \varvec{q} \cdot \varvec{n} = 0. \end{aligned}$$

Open-sea boundary: Denoting by \(\hat{\xi }\) the prescribed free surface elevation we set at open sea boundaries

$$\begin{aligned} \xi (t, \varvec{x}) = \hat{\xi }(t, \varvec{x}). \end{aligned}$$

River boundary: For supercritical flow examples, we set the following river (inflow) boundary conditions:

$$\begin{aligned} q_{\varvec{n}}(t, \varvec{x}) = \hat{q}_{\varvec{n}}(t, \varvec{x}), \quad q_{\varvec{\tau }}(t, \varvec{x}) = \hat{q}_{\varvec{\tau }}(t, \varvec{x}), \quad \xi (t, \varvec{x}) = \hat{\xi }(t, \varvec{x}) \end{aligned}$$

with the normal and tangential integrated velocities \( \hat{q}_{\varvec{n}}(t, \varvec{x})\) and \( \hat{q}_{\varvec{\tau }}(t, \varvec{x})\).

Radiation boundary: At the outflow boundary of supercritical flow examples, we specify radiation boundary conditions where no unknowns are prescribed.

Lastly, we provide initial conditions for the elevation and integrated velocity

$$\begin{aligned} \xi (0, \varvec{x}) = \xi _0(\varvec{x}),\quad {\varvec{q}}(0, \varvec{x}) = {\varvec{q}}_0(\varvec{x}). \end{aligned}$$
(7)

2.2 Spatial discretization by a quadrature-free DG method

For \(\{\mathcal {T}_\varDelta \}_{\varDelta >0}\), a family of triangulations of \(\varOmega \subset \mathbb {R}^2\), let \(\varOmega _e\) be an element of \(\mathcal {T}_\varDelta \). The variational formulation of system (4), (5) is obtained by multiplication with sufficiently smooth test functions \(\varvec{\phi }\) and \(\varvec{\psi }\) followed by the integration by parts on each element \(\varOmega _e \in \mathcal {T}_\varDelta \) (we use the standard notation \((\cdot ,\cdot )_{\varOmega _e}\) and \(\langle \cdot ,\cdot \rangle _{\partial \varOmega _e}\) for the \(L^2\)-scalar products on elements and edges, respectively)

$$\begin{aligned}&\left( \partial _t \varvec{c},\varvec{\phi }\right) _{\varOmega _e}-\left( \varvec{{A}}, \nabla \varvec{\phi }\right) _{\varOmega _e}+\langle \varvec{{A}}\cdot \varvec{n}_e, \varvec{\phi }\rangle _{\partial \varOmega _e}= \left( \varvec{r},\varvec{\phi }\right) _{\varOmega _e}, \end{aligned}$$
(8)
$$\begin{aligned}&\left( \varvec{u} H, \varvec{\psi } \right) _{\varOmega _e} = \left( \varvec{q}, \varvec{\psi } \right) _{\varOmega _e}, \end{aligned}$$
(9)

where \(\varvec{n}_e\) denotes an exterior unit normal to \(\partial \varOmega _e\).

Denoting by \(\mathbb {P}^p(\varOmega _e)\) the polynomial space of order (i.e. the highest polynomial degree) p on \(\varOmega _e\), the initial conditions for the semi-discrete problem are generated using an \(L^2\)-projection of (7) into the corresponding discrete space. The (local) semi-discrete formulation is then obtained from (8), (9) by replacing \(\varvec{c}\) and \(\varvec{u}\) by their finitely-dimensional counterparts \(\varvec{c}_\varDelta , \varvec{u}_\varDelta \) and utilizing test functions \(\varvec{\phi }_\varDelta \in \mathbb {P}^p(\varOmega _e)^3, \, \varvec{\psi }_\varDelta \in \mathbb {P}^p(\varOmega _e)^2\):

$$\begin{aligned}&\left( \partial _t \varvec{c}_\varDelta ,\varvec{\phi }_\varDelta \right) _{\varOmega _e} -\left( \varvec{{A}}, \nabla \varvec{\phi }_\varDelta \right) _{\varOmega _e} +\langle \varvec{\widehat{A}},\varvec{\phi }_\varDelta \rangle _{\partial \varOmega _e} = \left( \varvec{r},\varvec{\phi }_\varDelta \right) _{\varOmega _e}, \end{aligned}$$
(10)
$$\begin{aligned}&\left( \varvec{u}_\varDelta H_\varDelta , \varvec{\psi }_\varDelta \right) _{\varOmega _e} = \left( \varvec{q}_\varDelta , \varvec{\psi }_\varDelta \right) _{\varOmega _e}. \end{aligned}$$
(11)

The edge flux \(\varvec{{A}}(\varvec{c}_\varDelta , \varvec{u}_\varDelta ) \cdot \varvec{n}_e\) is approximated on \(\partial \varOmega _e\) by a numerical flux \(\varvec{\widehat{A}}(\varvec{c}_\varDelta , \varvec{u}_\varDelta ,\varvec{\tilde{c}}_\varDelta , \varvec{\tilde{u}}_\varDelta , \varvec{n}_e)\) that depends on discontinuous values of the solution on element \(\varOmega _e\) (without tilde) and on its edge neighbors (with tilde). On exterior domain boundaries, the specified boundary values of the elevation and velocity are utilized in the flux computation. In this work, we rely – unless specified otherwise – on the Lax–Friedrichs flux (Hajduk et al. 2018) defined as

$$\begin{aligned} \varvec{\widehat{A}}^{LF}(\varvec{c}_\varDelta , \varvec{u}_\varDelta ,\varvec{\tilde{c}}_\varDelta , \varvec{\tilde{u}}_\varDelta , \varvec{n}_e) =\frac{1}{2}\big (\varvec{{A}}(\varvec{c}_\varDelta , \varvec{u}_\varDelta )+\varvec{{A}}(\varvec{\tilde{c}}_\varDelta , \varvec{\tilde{u}}_\varDelta )\big ) \cdot \varvec{n}_e +\frac{\lambda }{2} (\varvec{c}_\varDelta -\varvec{\tilde{c}}_\varDelta ), \end{aligned}$$

where \(\lambda \) denotes the largest (in absolute value) eigenvalue of \(\frac{\partial }{\partial \varvec{c}} \varvec{\widetilde{A}}(\varvec{c}) \cdot \varvec{n}_e\). Denoting by \(\varvec{x}_f\) the midpoint of edge \(f \subset \partial \varOmega _e\), our quadrature-free scheme uses the following approximation of \(\lambda |_f\):

$$\begin{aligned} \lambda |_f {:}{=}\max \left\{ \left| {\varvec{u}_\varDelta }(\varvec{x}_f) \cdot \varvec{n}_e\right| , \left| {\varvec{\tilde{u}}_\varDelta }(\varvec{x}_f) \cdot \varvec{n}_e\right| \right\} + \max \left\{ \sqrt{g {H_\varDelta }(\varvec{x}_f)},\sqrt{g {\widetilde{H}_\varDelta }(\varvec{x}_f)}\right\} . \end{aligned}$$
(12)

Since the Lax–Friedrichs flux sometimes leads to rather diffusive results, we use in some cases the FORCE flux (Toro et al. 2009) instead with \(\lambda \) as described above. The FORCE flux can be constructed as the arithmetic mean between the Lax–Friedrichs flux and the two-step version of the Lax–Wendroff flux

$$\begin{aligned}&\varvec{\widehat{A}}^{F}= \frac{1}{2}\left( \varvec{\widehat{A}}^{LF}+\varvec{\widehat{A}}^{LW}\right) \end{aligned}$$
(13)

where the latter is defined as \(\varvec{\widehat{A}}^{LW} = \varvec{{A}}\left( \varvec{Q}^{LW}\right) \) with

$$\begin{aligned}&\varvec{Q}^{LW}(\varvec{c}_\varDelta , \varvec{u}_\varDelta ,\varvec{\tilde{c}}_\varDelta , \varvec{\tilde{u}}_\varDelta , \varvec{n}_e) =\frac{\lambda }{2}\big (\varvec{{A}}(\varvec{c}_\varDelta , \varvec{u}_\varDelta )-\varvec{{A}}(\varvec{\tilde{c}}_\varDelta , \varvec{\tilde{u}}_\varDelta )\big ) \cdot \varvec{n}_e +\frac{1}{2}(\varvec{c}_\varDelta +\varvec{\tilde{c}}_\varDelta ). \end{aligned}$$

Since the FORCE flux is more computationally expensive, it is only used if the Lax–Friedrichs flux produces too much numerical diffusion. Note that with either Lax–Friedrichs or FORCE flux and the parameter \(\lambda \) approximated as in (12), our semi-discrete formulation (10), (11) only contains nonlinearities in product form, thus all edge integrals are well suited for an analytical integration using a quadrature-free approach.

For simplicity, we formulate our methodology in the remainder of this section and in Sect. 3 for a generic scalar function \(w(\varvec{x})\) defined on \(\varOmega \) and its discretized counterpart \(w_{\varDelta }(\varvec{x})\). Given \(\varphi _{ei}(\varvec{x}), \,i=1, \ldots , P(p)\), a basis of \(\mathbb {P}^p(\varOmega _e)\), \(w_\varDelta \) can be represented as

$$\begin{aligned} w_\varDelta (t,\varvec{x})|_{\varOmega _e} =\sum _{i=1}^{P(p)} w_{ei}(t) \, \varphi _{ei}(\varvec{x}) =: w_e^p, \end{aligned}$$

that is, the superscript indicates the approximation order; no superscript means the default (full) approximation order. We also use a short-hand notation \(p0, p1, \ldots \) for piecewise constant, linear, \(\ldots \) approximation spaces. The number of basis functions P(p) depends on the respective polynomial order and the space dimension; it has the following values in \(\mathbb {R}^2\): \(P(0)\!=\!1, \, P(1)\!=\!3,\) and \(P(2)\!=\!6\). The basis functions used in this work are chosen to be orthonormal with respect to the \(L^2\)-scalar product on \(\varOmega _e\) (see, for example, Frank et al. 2015).

2.3 Temporal discretization and slope limiting

To avoid spurious oscillations in linear and superlinear DG solutions, we apply a vertex-based slope limiter following (Aizinger 2011; Kuzmin 2010) to piecewise linear and higher order approximations where appropriate. For orthonormal basis functions used in our work, the local representation of the limiting operator \(\varPi : \mathbb {P}^p(\varOmega _e) \rightarrow \mathbb {P}^p(\varOmega _e)\) can be given by

$$\begin{aligned} \varPi \left( w_\varDelta )|_{\varOmega _e}\right) = w_{e1}\varphi _{e1} + \alpha _e \sum _{i=2}^{P(p)} w_{ei} \, \varphi _{ei} \end{aligned}$$
(14)

where the limiting factor \(\alpha _e \in [0,1]\) is computed as follows:

$$\begin{aligned} \alpha _e = \min _{i\in \{1,2,3\}} {\left\{ \begin{array}{ll} \min \left( 1,\frac{w_i^{\text {max}}-w_e^0}{w_e(\varvec{a}_{ei}) -w_e^0} \right) ,&{} \text {if } w_e(\varvec{a}_{ei}) - w_e^0 > \varepsilon , \\ \min \left( 1,\frac{w_i^{\text {min}}-w_e^0}{w_e(\varvec{a}_{ei}) - w_e^0} \right) ,&{} \text {if } w_e(\varvec{a}_{ei}) - w_e^0 < -\varepsilon , \\ 1, &{} \text {otherwise}, \end{array}\right. } \end{aligned}$$
(15)

choosing \(\varepsilon = 10^{-5}\) in the following. \(w_i^{\text {max}}\) and \(w_i^{\text {min}}\) denote the solution bounds given by the maximum and minimum of piecewise constant solutions \(w_e^0\) on all elements sharing vertex \(\varvec{a}_{ei}\). Note that for an orthonormal basis, one has

$$\begin{aligned} w_e^0 = \frac{1}{\left| \varOmega _e \right| }\int _{\varOmega _e} w_\varDelta (t,\varvec{x})|_{\varOmega _e} d\varvec{x}. \end{aligned}$$

If (15) results in \(\alpha _e<1\), all degrees of freedom corresponding to superlinear (quadratic and higher) basis functions in (14) are set to zero. In such cases, this limiting does not guarantee the exact preservation of bounds \(w_i^{\text {max}}\) and \(w_i^{\text {min}}\), but the solution is mostly oscillation-free in most practical scenarios.

The temporal discretization of system (10), (11) is performed using a SSP (strong stability preserving) Runge–Kutta method (Gottlieb and Shu 1998). Following the presentation in Reuter et al. (2016), let \(0 = t_1< t_2< \cdots < t_{\text {end}}\) be a not necessarily equidistant decomposition of the time interval and \(\varDelta _n t := t_{n+1}-t_n\) the size of the nth time step. The update scheme of the s-stage SSP Rung–Kutta method (including limiting) is then given by

$$\begin{aligned}&\varvec{c}_{\varDelta }^{(0)} := \varvec{c}_{\varDelta }(t_n, \cdot ),\\&\varvec{c}_{\varDelta }^{(i)} := \varPi \left( \omega _i \varvec{c}_{\varDelta }^{(0)} + (1-\omega _i)\left\{ \varvec{c}_{\varDelta }^{(i-1)}+ \varDelta _n t\, \varvec{L} (\varvec{c}_{\varDelta }^{(i-1)}, t_n+ \delta _i \varDelta _n t) \right\} \right) , \; i = 1, \ldots , s,\\&\varvec{c}_{\varDelta }(t_{n+1}, \cdot ) := \varvec{c}_{\varDelta }^{(s)}, \end{aligned}$$

where \(\varvec{L}\) denotes the spatial discretization operator specified by (10), (11). In the test cases presented in Sect. 4, we use a two-stage SSP Runge–Kutta method with the coefficients \(\omega _1=0,\, \omega _2 = 1/2,\, \delta _1 = 0,\, \delta _2 = 1\) (cf. Eq. (2.4) (Gottlieb and Shu 1998).

3 Adaptivity indicators

3.1 Design of a new parameter-free adaptivity indicator

The goal of the new indicator, which will be referred to as JRL (Jump-Reconstruction-Limiting), is to identify resolved and under-resolved solution parts while distinguishing between smooth and non-smooth regions (e.g., shocks). Non-constant smooth regions should be approximated by piecewise linear or quadratic polynomials, constant regions should automatically revert to constant approximations. In regions with shocks or discontinuities, over- and undershoots should be avoided either by limiting, where possible, or by reducing the approximation order where necessary. Among the existing techniques in this area particularly relevant for our approach are the following ones:

  • for continuous finite element methods, (Kuzmin and Schieweck 2013) utilized a gradient reconstruction (using a local \(L^2\)-projection) as a parameter-free smoothness indicator for the unsteady linear advection equation;

  • (Aizinger et al. 2017) introduced a flux-based gradient reconstruction which was then employed in the framework of anisotropic slope limiters for the DG method.

The indicator’s scheme is shown in Fig. 1; it combines the above two ideas with a vertex-based slope limiter to re-use pre-computed data in an effort to increase computational efficiency and to smoothly blend p-adaptivity with slope limiting. The following subsections detail indicator’s working principles.

Fig. 1
figure 1

Flow chart of adaptivity indicator JRL for orders 0, 1, and 2. Values in gray boxes stand for different adaption cases, zero means no adaption, negative values indicate a decrease, and positive ones point to an increase of the approximation order

3.1.1 Error detection

First, we need to introduce some notation. For the base approximation order \(b:=\max {\{p-1,0\}}\), the jump of element \(\varOmega _e\) is defined as

$$\begin{aligned} \llbracket w\rrbracket _{e}:=\int _{\partial \varOmega _{e}\cap \partial \varOmega _{\tilde{e}}}\left| w_e^p -w_{\tilde{e}}^b\right| ds = \int _{\partial \varOmega _{e}\cap \partial \varOmega _{\tilde{e}}}\left| \sum _{i=1}^{P(p)}w_{ei}\varphi _{ei}-\sum _{i=1}^{P(b)} w_{\tilde{e}i}\varphi _{\tilde{e}i}\right| ds, \end{aligned}$$
(16)

where \(\tilde{e}\) denotes the index of the edge-neighbor of element \(\varOmega _e\). The base jump of element \(\varOmega _e\) is the jump when using the base order approximation on \(\varOmega _e\)

$$\begin{aligned} \llbracket w^b\rrbracket _{e}:=\int _{\partial \varOmega _{e}\cap \partial \varOmega _{\tilde{e}}}\left| w_e^b -w_{\tilde{e}}^b \right| ds \end{aligned}$$

It is mainly used in combination with \(\llbracket w\rrbracket _{e}\) to estimate the solution regularity. In the numerical results section (Sect. 4), the adaptivity indicator as shown in Fig. 1 is applied to the free surface elevation, that is, \(w=\xi \). The decisions are made depending on the following thresholds: \(\varepsilon _0=0.01\), \(\varepsilon _1=0.005\), \(\varepsilon _2=0.001\) and \(\tilde{\varepsilon }_{1}=0.01\), which are problem independent and remain constant for all test cases and resolutions.

Furthermore, to avoid frequent in- and decrements in the local approximation order, we add an additional constraint: once the order is increased, it cannot be decreased for a fixed number of time steps chosen equal to 10 in all our test cases. Additionally, the approximation order of an element can only be increased by one at a time. This explains why the jump definition in (16) is somewhat different from the standard jump definition (cf. (20) in Sect. 3.2.1) where a full order approximation is considered. Our approach is independent of the decisions on approximation order in- and decrements involving neighboring elements—since we only permit changes by one polynomial order.

3.1.2 Gradient reconstruction

For a p0 solution \(w^0_\varDelta (t,\varvec{x})\) with \(\frac{\llbracket w^0\rrbracket _{e}}{|\partial \varOmega _e|}\le 0.01\cdot |w^0_e|\), we construct a linear solution as follows using a rotationally invariant gradient reconstruction

$$\begin{aligned} R(w_\varDelta )|_{\varOmega _e} = w_e^0 + \frac{\partial w_e^1}{\partial x}\psi _{e2}(\varvec{x}) + \frac{\partial w_e^1}{\partial y}\psi _{e3}(\varvec{x}). \end{aligned}$$
(17)

This reconstruction uses the linear Taylor basis functions \(\psi _{ei}\) as in Eq. (6) of (Kuzmin 2010) defined on the element \(\varOmega _e\) as

$$\begin{aligned} \psi _{e1}(\varvec{x}) = 1, \quad \psi _{e2}(\varvec{x}) = x-x_e^c, \quad \psi _{e3}(\varvec{x}) = y-y_e^c, \end{aligned}$$

where \(\varvec{x}_e^c=(x_e^c, y_e^c)^T=\frac{1}{3} \left( \varvec{a}_{e1} + \varvec{a}_{e2} +\varvec{a}_{e3}\right) \) denotes the centroid of \(\varOmega _e\). The first coefficient is just the mean of the solution, the other two are obtained by taking the directional derivatives on all edges. Consider \(\varOmega _{\tilde{e}}\) with centroid \(\varvec{\tilde{x}}^c=(\tilde{x}^c, \tilde{y}^c)\) sharing an edge with \(\varOmega _e\). The directional derivative of \(w_e\) on \(\partial \varOmega _{e}\cap \partial \varOmega _{\tilde{e}}\) in direction \(\varvec{d}_{\tilde{e}} = \varvec{x}_{\tilde{e}}^c - \varvec{x}_e^c\) can then be approximated by

$$\begin{aligned} \frac{\partial w_e}{\partial \varvec{d}_{\tilde{e}}} \approx \frac{w_{\tilde{e}}^{0}-w_e^{0}}{|\varvec{d}_{\tilde{e}}|}. \end{aligned}$$
(18)

After computing the directional derivatives, we solve a least squares problem to obtain the gradient. Let N be the matrix containing the directions \(\varvec{d}_{\tilde{e}}\) in its rows, and let \(\varvec{\nu }_{e}\) be the vector of directional derivatives defined in (18) for all neighboring elements \(\varOmega _{\tilde{e}}\). The missing coefficients from (17) are then obtained as the solution of the least squares problem:

$$\begin{aligned} \left( \frac{\partial w_e^1}{\partial x}, \frac{\partial w_e^1}{\partial y} \right) ^T = (N^T N)^{-1} N^T\, \varvec{\nu }_{e}. \end{aligned}$$
(19)

Finally, the solution is transformed back into the orthonormal basis. For elements at the boundary of \(\varOmega \), one can define the directional derivatives using a ghost layer. The above approach trivially generalizes to arbitrary polygons.

3.2 Two further indicators for comparison

To evaluate the performance of our new indicator, we perform in Sect. 4 comparisons to two other types of adaptivity indicators. The first one, described in Sect. 3.2.1, is a well established scheme which uses inter-element jumps to assess the local solution regularity. The other one, described in Sect. 3.2.2, is our own version of the gradient indicator enhanced in order to be applicable for p0 discretizations. It needs to be noted here that these two indicators need either one (cf. Sect. 3.2.1) or three (cf. Sect. 3.2.2) user-defined thresholds as input parameters. These parameters have to be calibrated manually and may differ for different scenarios, for limited and unlimited simulations as well as for different refinement levels of the same scenario.

3.2.1 Jump indicator

The indicator introduced in Remacle et al. (2003) computes the sum of the jumps across the edges of an element \(\varOmega _e\), in our notation as

$$\begin{aligned} \llbracket w_\varDelta (t,\varvec{x})\rrbracket _{e}:=\int _{\partial \varOmega _{e}\cap \partial \varOmega _{\tilde{e}}}\left| w_e^p - w_{\tilde{e}}^p\right| ds. \end{aligned}$$
(20)

The local approximation order is then increased if the total jump over the element is greater than the user-provided threshold, and a decrease is performed if it is smaller (also here the minimum of ten time steps between an increase and a decrease is enforced). This indicator can be considered as a simplified version of the JRL indicator since it also uses jumps for error detection. We apply it to the free surface elevation, that is set \(w=\xi \), and refer to it in the following as JE (Jump-Estimation).

3.2.2 Gradient indicator

There exist gradient indicators in the literature such as the one introduced in Burbeau and Sagaut (2005) for the compressible Navier-Stokes equations and employed for the SWE in Kubatko et al. (2009). It computes gradients using the element-local solutions and therefore cannot be applied to p0-discretizations. To address this deficiency we designed a new scheme by taking directional derivatives in the directions \(\varvec{d}_f = \varvec{x}_f - \varvec{x}_e^c\), where \(\varvec{x}_f\) denotes the midpoint of edge \(f \subset \partial \varOmega _e\). The directional derivative is then approximated using the element centroid value and the edge midpoint values taken from the current element and from its edge neighbor as follows

$$\begin{aligned} \frac{\partial w_{e}}{\partial \varvec{d}_f} \approx \frac{w_e(\varvec{x}_f)-w_e(\varvec{x}^c_e)}{|\varvec{d}_f|} \quad \text { and } \quad \frac{\partial \widetilde{w}_{e}}{\partial \varvec{d}_f} \approx \frac{w_{\tilde{e}}(\varvec{x}_f)-w_e(\varvec{x}^c_e)}{|\varvec{d}_f|}. \end{aligned}$$

To decide, whether an increment in the approximation order is necessary, comparison is done by evaluating

$$\begin{aligned} \left| \frac{\partial w_{e}}{\partial \varvec{d}_f} -\frac{\partial \widetilde{w}_{e}}{\partial \varvec{d}_f}\right| < \varepsilon _w (\varDelta _e)^p. \end{aligned}$$
(21)

where \(\varDelta _e =\sqrt{2 |\varOmega _e|}\). In our implementation, this indicator is applied to three primary unknowns, that is, \(w \in \{\xi , U, V\}\), in the following way. If the inequality does not hold for all directions and all three unknowns, the local approximation order of the respective element is increased. If all of them hold, the same comparison (21) is performed for the base order solution \(w^{b}_{e}\) (see Sect. 3.1.1) on the current element and using \((\varDelta _e)^b\) in the upper bound in (21). If the criterion still holds for \(w^{b}_{e}\), then the solution is approximated well enough by the lower order approximation, thus the order is decreased. The threshold \(\varepsilon _w\) has to be chosen individually for every unknown and scenario but can be chosen the same for different mesh resolutions. We refer to this indicator as the GRE (Gradient-Reconstruction-Extended).

4 Numerical results

To evaluate the accuracy and robustness of the proposed adaptivity indicator, we use three different benchmarks representing a wide range of spatial (constant, smoothly varying, shocks) and temporal (stationary, time-varying) flow regimes. In addition to our indicator, simulations utilizing two other indicators described in Sect. 3.2 were carried out and used for comparison purposes.

The simulations were performed with the ExaStencils code generation framework (Lengauer et al. 2020), which is based on the domain-specific language ExaSlang (Schmitt et al. 2014b) and outputs an optimized C++ or CUDA code. Our work extends the framework using a Python frontend – called GHODDESS and available at https://i10git.cs.fau.de/ocean/ghoddess-release – responsible for mapping the DG scheme to ExaSlang (Faghih-Naini et al. 2020). Recently, much effort was put into improving the performance of our p-adaptive scheme via algorithmic optimizations and various hybridization strategies controlling the distribution of the compute kernels between CPUs and GPUs. Runs on different architectures showed a speedup compared to a pure CPU or GPU implementation (see Faghih-Naini et al. 2023).

Note that no bottom friction \(\tau _{\text {bf}}\), Coriolis force \(f_c\), or forcing \(\varvec{F}\) is used in any of the following test examples (cf. (2)).

4.1 Radial dam break

The radial dam break example is based on (LeVeque 2002; Hajduk 2021). We set \(\varOmega = [0,5]\times [0,5]\), \(g=1\), and a constant bathymetry \(h_b =0\). To make the test problem better suited for a p-adaptive approach, we chose the initial condition as

$$\begin{aligned}&\xi (x, y ,t)={\left\{ \begin{array}{ll}2+0.5\, e^{-15((x-2.5)^2+(y-2.5)^2)}, &{}\text {if } (x-2.5)^2+(y-2.5)^2<0.25,\\ 1,&{} \text {otherwise}, \end{array}\right. }\\&U(x,y,t)= 0, \; V(x,y,t) = 0. \end{aligned}$$

The simulation results shown in Fig. 2 (top and middle) were obtained on a uniform mesh with 32 768 triangles, whereas those displayed in Fig. 2 (bottom) used the same mesh refined twice in a uniform fashion. The latter solution serves as a reference for our comparisons and, in order to reduce numerical diffusion, employs the FORCE (see (13)) instead of the Lax–Friedrichs flux. Since all external boundaries use land boundary conditions, the wave is reflected as illustrated in Fig. 2 (right) which correspond to \(t=3\,s\). For better comparability, we limit the free surface elevation for the non-adaptive cases and show limited and unlimited results for the adaptive ones.

Fig. 2
figure 2

Radial dam break solution for \(t=0.1\,\mathrm {s}\) (left), \(t=1\,\mathrm {s}\) (middle), and \(t=3\,\mathrm {s}\) (right): constant (p0) (top), limited linear (p1) (middle), and limited linear (p1) on a twice uniformly refined mesh using the FORCE Riemann solver (bottom)

A comparison between the p-adaptive solutions with different indicators is shown in Fig. 3(a) for \(t=0.1\,\)s, in Fig. 3(b) for \(t=1\,\)s, and in Fig. 3(c) for \(t=3\,\)s, respectively. For indicator JE, a threshold value of 0.0003 led to the best results. For indicator GRE, in the unlimited case, the threshold for the free surface elevation of 0.8 and for the velocities of 1.7, and, in the limited case, the threshold for the free surface elevation of 0.2 and for the velocities of 0.5 produced the best solution. Our new indicator avoids over- and undershoots while not suffering from excessive numerical diffusion such as shown by the JE indicator with and without limiting. The GRE indicator produces a reasonable solution if limiting is active; otherwise, it is very diffusive and leads to over- and undershoots.

Fig. 3
figure 3

Radial dam break: Free surface elevation at different time levels. Adaption range: Constant-linear (p0-1) or constant-quadratic adaptive (p0-2). Indicator: JRL, JE, JE limited, GRE, GRE limited. Local approximation order \(p \in \{0, 1, 2\}\). Adaption case: see Fig. 1

These findings are further corroborated by the difference plots for the free surface elevation \(\xi -\xi _{ref}\) shown in Fig. 4 and the \(L^1\)-errors with respect to the reference solution listed in Table 1 alongside the fraction of elements with a specific order and the number of degrees of freedom. For \(t=0.1\,\mathrm {s}\) in the adaptive cases, more than 93 % of the elements are approximated by order 0, and only between 35 % and 40 % of the degrees of freedom of the linear approximation are needed to reach comparable \(L^1\)-errors. For \(t=1\,\mathrm {s}\), in order to reach a comparable \(L^1\)-error, one needs up to approx. 61 % of degrees of freedom of the uniformly linear approximation for p0-1 and up to approx. 82 % of degrees of freedom for p0-2. Here, a significant amount of p1 elements is necessary to represent the curvature correctly. For \(t=3\,\mathrm {s}\), the distribution of the approximation order is somewhat more dependent on the resolution, i.e., for higher resolutions, the fraction of lower order elements is higher and vice versa. Yet even in the worst case only approx. 81 % of the degrees of freedom compared to the uniform higher order approximation are needed. The highest resolution solution suffers from excessive numerical diffusion for orders 0 and 1 which can be remedied for order 1 by using the FORCE flux.

Fig. 4
figure 4

Radial dam break: \(\xi -\xi _{ref}\) difference plots at different time levels

Table 1 Radial dam break: \(L^1\)-error for the free surface elevation, fraction of elements of each discretization order, number of degrees of freedom (dof), and time step used throughout the simulation. Color coding indicates the quality/efficiency (green = good, red = poor)

4.2 Supercritical flow in a constricted channel

In our next benchmark, we consider a supercritical flow in a constricted channel (constriction angle of 5 degrees) with constant bathymetry based on the configuration proposed in Zienkiewicz and Ortiz (1995). Flow is induced through the inflow (bottom) boundary and there is no flow through the left and right boundaries, whereas the radiation boundary conditions are specified at the outflow (upper) boundary. Denoting by \(u_i\) and \(H_i\) the axial velocity and water depth at the inlet, respectively, the inlet Froude number \(\text {Fr}\) defined by \(\text {Fr} ={u_i}/{\sqrt{g\, H_i}} \) is chosen equal to 2.5 corresponding to a supercritical regime.

For better comparability and to avoid instabilities in the p2 approximation, we limit the free surface elevation and velocity components using the same limiting factor \(\alpha _e\) (15) for the non-adaptive cases and plot limited and unlimited results for the adaptive ones. Fig. 5, shows our (topologically) block-structured grid (Zint et al. 2019) with 3 584 elements along with the steady-state solution for different approximations. Whereas a piecewise constant DG approximation is very diffusive, the limited linear and quadratic solutions look much better. Thanks to the adaption scheme, the constant-linear and the constant-quadratic solutions using indicator JRL do not suffer from over- or undershoots while representing the jumps very accurately without introducing too much diffusion.

Fig. 5
figure 5

Supercritical flow in a constricted channel: Free surface elevation. Left to right: Block-structured mesh with 3 584 elements, exact solution projected onto a mesh with approx. 230 000 elements, constant (p0), limited linear (p1), limited quadratic (p2), constant-linear adaptive (p0-1), constant-quadratic adaptive (p0-2) solution with indicator JRL

The robustness and accuracy of our indicator are illustrated in Fig. 6, which details the local approximation order and the used adaption case (cf. Fig. 1). As expected, the constant plateaus are approximated by order 0 while higher orders are only activated in the vicinity of the discontinuities. When comparing the JE and GRE indicators, the unlimited simulations produce large under- and overshoots in jump regions whereas the limited JE and GRE indicators demonstrate acceptable performance, even though the results become somewhat diffusive. For JE, a threshold of 0.005 led to the best results. For GRE, the threshold for the free surface elevation of 0.02 and for the velocity of 0.05 appears to be optimal.

Fig. 6
figure 6

Supercritical flow in a constricted channel. Free surface elevation. Adaption range: Constant-linear (p0-1) or constant-quadratic adaptive (p0-2). Indicator: JRL, JE, JE limited, GRE, GRE limited. Local approximation order \(p \in \{0, 1, 2\}\). Adaption case: see Fig. 1. The z-axis is scaled by the factor 50

Fig. 7
figure 7

Supercritical flow in a constricted channel: \(\xi -\xi _{exact}\) difference plots using the exact solution projected to a mesh with approx. 230 000 elements as the reference

To quantify the performance of different adaptivity indicators, Fig. 7 displays the difference plots for the free surface elevation \(\xi -\xi _{exact}\) using the exact solution (this problem can be solved analytically, see, for example, (Ippen 1951) projected on a high resolution mesh with approx. 230 000 elements as the reference. The constant solution is clearly diffusive, the limited linear and quadratic solutions look reasonable. The darker colors in the unlimited constant-linear solutions indicate over- and undershoots which are absent in the fully limited adaptive solutions and in our adaption scheme with built-in liming. In Table 2, the \(L^1\)-errors with respect to the exact solution are listed next to the fraction of elements with a given approximation order and the total number of degrees of freedom. It can be seen that between 64 % and 82 % of the elements use a constant approximation, and the p-adaptive solutions only need between 45 % and 58 % of the degrees of freedom of the uniformly linear approximation for p0-1 and approx. 63 % for p0-2.

Table 2 Supercritical flow in a constricted channel: \(L^1\)-error for the free surface elevation, fraction of elements of each discretization order, number of degrees of freedom, and time step used throughout the simulation. Color coding indicates the quality/efficiency (green = good, red = poor)

4.3 Circular hydraulic jump

The last test example used to study our adaptivity indicator considers an instationary circular hydraulic jump based on the setup presented in Ketcheson and Quezada de Luna (2022). The domain geometry is a quarter annulus with \(r_0\in (0.1, 1)\) (cf. Fig. 8 (left)) and a constant bathymetry \(h_b=0.1\). The solution to this problem contains constant plateaus, smooth non-constant regions, and discontinuities. There is a constant inflow at \(r_0 = 0.1\), that is, \(\hat{\xi }(r_0=0.1, t) = 0.2\), \(\hat{q}_{\varvec{n}}(r_0=0.1, t) = -0.75\) and \(\hat{q}_{\varvec{\tau }}(r_0=0.1, t) = 0\). Differing from the original setup, we prescribe a land boundary condition at \(r_0 = 1\) causing a reflection. The initial conditions are \(\xi (r_0, 0) = 0\) and \(\varvec{q}(r_0, 0) = \varvec{0}\). The gravity acceleration g is set to 1.

The employed block-structured grid (Zint et al. 2019) consists of 48 000 triangles, and r in Fig. 8 (left) indicates the line along which the free surface elevation is plotted in Fig. 11. In the middle and right panels of Fig. 8, the p0 high-resolution solution at \(t=0.5\,\mathrm {s}\) and \(t=1.3\,\mathrm {s}\) on a mesh with 768 000 elements is displayed, which serves as a reference in the following evaluation.

Fig. 8
figure 8

Hydraulic jump: Computational mesh with the line ‘r’ indicating the position of the line plot in Fig. 11 (left), piecewise constant (p0) reference solution on the twice refined mesh for \(t=0.5\,\mathrm {s}\) (middle), and for \(t=1.3\,\mathrm {s}\) (right)

To avoid instabilities at the inflow, we limit the free surface elevation and velocity components with the same limiting factor \(\alpha _e\) (15) for all cases except for the JRL indicator. The JRL indicator automatically reduces the local approximation order on critical elements at the inflow boundary to zero making limiting unnecessary there. The JE and the GRE indicators fail to detect the need for the lowest approximation order on critical elements therefore only the limited solution is stable.

The p-adaptive solution with different indicators is shown for \(t=0.5\,\mathrm {s}\) (Fig. 9a) and \(t=1.3\,\mathrm {s}\) (Fig. 9b), respectively. For indicator JE, a threshold of 0.00001 led to the best results. For indicator GRE, the threshold for the free surface elevation of 0.6 and for the velocities of 1.0 resulted in the best simulation outcome.

Fig. 9
figure 9

Hydraulic jump: Free surface elevation at different time levels. Adaption range: Constant-linear (p0-1) or constant-quadratic adaptive (p0-2). Indicator: JRL, JE limited, GRE limited. Local approximation order \(p \in \{0, 1, 2\}\). Adaption case: see Fig. 1

The difference plots for \(\xi - \xi _{ref}\) in Fig. 10 confirm that our indicator is better at capturing jumps than the two other indicators as pointed out by lighter colors in the corresponding regions. Furthermore, a linear approximation for the drop-off after the inflow is correctly identified. The \(L^1\)-errors with respect to the reference solution are shown in Table 3 along with the fraction of elements with a given approximation order. All adaptive cases approximately match the \(L^1\)-errors of the uniformly linear solution for \(t=0.5\,\mathrm {s}\) while using fewer than 44 % of the degrees of freedom for p0-1 and approx. 50 % for p0-2. The \(L^1\)-errors of the adaptive solutions (except for the JE indicator) are even lower for \(t=1.3\,\mathrm {s}\) when compared to the uniformly linear approximation and utilize up to approx. 60 % of degrees of freedom for the p0-1 approximation and approx. 78 % for the p0-2 approximation.

Fig. 10
figure 10

Hydraulic jump: \(\xi -\xi _{ref}\) difference plots at different time levels

Table 3 Hydraulic jump: \(L^1\)-error for the free surface elevation, fraction of elements of each discretization order, number of degrees of freedom, and time step used throughout the simulation. Color coding indicates the quality/efficiency (green = good, red = poor)

Figure 11 displays the free surface elevation along the line ’r’ highlighted in Fig. 8 (left). Noting that \(h_b\)=0.1, it shows a very good agreement with Fig. 7 from (Ketcheson and Quezada de Luna 2022). The zoom-ins also demonstrate a better accuracy of the approximations using the JRL indicator in the jump regions compared to those relying on the JE or GRE indicator with limiting.

Fig. 11
figure 11

Hydraulic jump: Free surface elevation along the line (0.1,0.1) to (1/\(\sqrt{2}\),1/\(\sqrt{2}\)) (cf. the gray line in Fig. 8 marked as ’r’) for \(t=0.5\,\mathrm {s}\) (top) and \(t=1.3\,\mathrm {s}\) (bottom)

5 Conclusions and outlook

We presented a parameter-free adaptivity indicator and validated its ability to identify resolved and underresolved areas of the computational domain while distinguishing between smooth and non-smooth solution regions and adapting the local approximation order accordingly. On a range of numerical examples, our new indicator demonstrated comparable or better solution quality than that of a limited uniformly linear approximation. These results were achieved without any calibration parameters while using significantly fewer degrees of freedom. A comparison to two further indicators suggests that our new scheme achieves a good balance between solution quality and number of degrees of freedom used.

This work is an integral part of our ongoing effort to produce a p-adaptive DG solver for the SWE capable of efficient utilization of massively parallel and heterogeneous computing architectures (Faghih-Naini et al. 2023). It bundles a number of other numerical and computational techniques such as a novel block-structured grid generator for ocean domains (Zint et al. 2019, 2022; Faghih-Naini et al. 2022) and the code generation framework (Lengauer et al. 2014; Schmitt et al. 2014a; Kuckuk and Köstler 2016). Our future research goals include adding the FPGA support—already available for the unstructured-mesh version of the model (Kenter et al. 2021; Faj et al. 2023)—and implementing semi-implicit time stepping schemes based on the hierarchical scale separation (HSS) approach (Aizinger et al. 2015; Schütz and Aizinger 2017) related to the p-multigrid method.