1 Introduction

Discontinuous Galerkin methods [8] are a popular tool to design numerical schemes for hyperbolic systems of conservation laws [12]

$$\begin{aligned} {\frac{\partial {u} }{\partial {t}}} + {\frac{\partial {f(u)} }{\partial {x}}} = 0 \quad \text {for} \quad u(x, t): \mathbb {R}\times \mathbb {R}\rightarrow \mathbb {R}^m, \quad f:\mathbb {R}^m \rightarrow \mathbb {R}^m. \end{aligned}$$
(1)
Table 1 Notation used

An intriguing feature of DG methods is their ability to transfer the definition of a weak solution to a hyperbolic conservation law [22]

$$\begin{aligned} \begin{aligned} \forall \varphi \in \mathscr {D}&: \int _0^\infty \int _\mathbb {R} u(x, t) \cdot {\frac{\partial {\varphi (x, t)} }{\partial {t}}}&+ f(u(x, t)) \cdot {\frac{\partial {\varphi (x, t)} }{\partial {x}}} \textrm{d}x \textrm{d}t \\&\quad + \int _\mathbb {R} u(x, 0) \cdot \varphi (x, 0) \textrm{d}x = 0 \end{aligned} \end{aligned}$$
(2)

to the semidiscrete level [2, 3, 7]. Using a method of lines approach this leads to the set of equations

$$\begin{aligned} \forall Z \in \mathscr {Z}, \varphi \in \mathscr {D}: \left\langle {\varphi }, {{\frac{\partial {u^Z} }{\partial {t}}}} \right\rangle _Z + \left[ {\varphi }, {f} \right] _Z - \left\langle {{\frac{\partial {\varphi } }{\partial {x}}}}, {f} \right\rangle _Z = 0 \end{aligned}$$

for every cell \(Z \in \mathscr {Z}\) of a subdivision \(\mathscr {Z}\) of the domain. Here, the inner products

$$\begin{aligned} \left\langle {u}, {v} \right\rangle _Z = \int _Z u(x) \cdot v(x) \textrm{d}x, \quad \left[ {u}, {v} \right] _Z = \int _{\partial Z} u(x) \cdot v(x)\, n\textrm{d}O \end{aligned}$$

stand in for the volume and surface integrals of the semidiscrete weak form. Consult also Table 1 for a complete overview of the notation used in this work. The solution \(u(x, t)\) is approximated in every cell by

$$\begin{aligned} u^Z(x, t) = \sum _{k=1}^p c_k^Z(t) \varphi _k^Z(x) \in V^Z \end{aligned}$$

out of a finite dimensional space of ansatz functions \(V^Z\). No continuity between cells is assumed, hence the name Discontinuous Galerkin method. The inner products can be represented on these finite-dimensional spaces via the Gramian matrices

$$\begin{aligned} M^Z_{k, l} = \left\langle {\varphi _k^Z}, {\varphi _l^Z} \right\rangle _{Z}, \quad S_{k, l}^Z = \left\langle {{\frac{\partial {\varphi _k^Z} }{\partial {x}}}}, { \varphi _l^Z} \right\rangle _Z. \end{aligned}$$

If we select \(p + 1\) nodes in every cell Z and the functions \(\varphi _k^Z\) are a nodal basis of the space \(V^Z\) then the resulting matrices are also approximations to these inner products when the functions do not stem from the space \(V^Z\). Evaluating these approximations is equivalent to interpolating the point values of the function u in the space \(V^Z\) as \({{\,\mathrm{\mathbb {I}}\,}}_{V^Z} u\) and evaluating \(\left\langle {{{\,\mathrm{\mathbb {I}}\,}}_{V^Z} u}, {{{\,\mathrm{\mathbb {I}}\,}}_{V^Z} v} \right\rangle _Z\). Using this approximation of the inner products by point evaluations results in the matrix vector form

$$\begin{aligned} M^Z {\frac{ \textrm{d}u^Z }{ \textrm{d}t}} = S^Z f(u^Z(t)) - \begin{pmatrix} \varphi ^Z_1(x_r)f^*_r-\varphi ^Z_1(x_l) f^*_l \\ \vdots \\ \varphi ^Z_N(x_r) f^*_r - \varphi ^Z_N(x_l) f^*_l\end{pmatrix}. \end{aligned}$$
(3)

This scheme can be evaluated by multiplying with the inverse of the mass matrix \(M^Z\). In equation (3) several flux values are marked with an asterisk because they lie on the boundary of the cell Z. Discontinuities in the ansatz functions imply the possibility of discontinuities in the fluxes between different cells. Conservation of the conserved quantities is only ensured if the outflow flux of one cell equals the inflow of the adjacent cell. A sufficient condition for this is that the fluxes at the adjacent edges match, and we therefore replace the evaluation of the flux function at the ansatz polynomial by numerical fluxes

$$\begin{aligned} f^* = f^{\textrm{num}}\left( \lim _{x \uparrow x_{k + \frac{1}{2}}} u^Z(x, t), \lim _{x \downarrow x_{k + \frac{1}{2}}} u^{{\tilde{Z}}}(x, t) \right) \end{aligned}$$
(4)

when flux values at the cell edges are needed. In our case the family of HLL fluxes [14, 15, 26, 58, 59] will be used for this purpose. Unfortunately, the constructed schemes lack robustness and stability in the high-order case, and some stabilization measures and robustness enhancements are needed. Popular choices are overintegration, flux differencing, modal filtering, sub-cells and (W)ENO recoveries [3, 19, 21, 40, 44, 51, 62]. In this publication the procedure first presented in [30] will be refined, connections to some other stabilization techniques will be shown, and the technique will be tested on a catalog of problems including the Buckley–Leverett equation and the Euler system of conservation laws. Additional admissibility conditions are needed as weak solutions in the sense of (2) are non-unique. An entropy [35] is a convex functional \(U:\mathbb {R}^m \rightarrow \mathbb {R}\) satisfying

$$\begin{aligned} {\frac{ \textrm{d}U }{ \textrm{d}u}} {\frac{ \textrm{d}f }{ \textrm{d}u}} = {\frac{ \textrm{d}F }{ \textrm{d}u}} \end{aligned}$$

in conjunction with an entropy flux function \(F:\mathbb {R}^m \rightarrow \mathbb {R}\). Assume the solution u is the strong \(\textrm{L}^p\) limit of solutions \(u_\varepsilon \) satisfying a viscosity regularization

$$\begin{aligned} {\frac{\partial {u(x, t)} }{\partial {t}}} + {\frac{\partial {f(u(x, t))} }{\partial {x}}} = \varepsilon \nabla ^2 u \end{aligned}$$

of the system of conservation laws (1). Such a solution u is called a vanishing viscosity solution [35]. One can show that for a vanishing viscosity solution u and any entropy pair (U, F) it holds that

$$\begin{aligned} {\frac{\partial {U(u(x, t))} }{\partial {t}}} + {\frac{\partial {F(u(x, t))} }{\partial {x}}} \le 0 \end{aligned}$$
(5)

in the sense of distributions [35]. If the solution is smooth one can even show

$$\begin{aligned} {\frac{\partial {U(u(x, t))} }{\partial {t}}} + {\frac{\partial {F(u(x, t))} }{\partial {x}}} = 0. \end{aligned}$$

Apart from the hope that the numerical counterparts of entropy admissibility criteria will enforce convergence to a physically relevant solution, a numerical entropy inequality will also increase robustness [45]. Counterexamples to the hoped-for uniqueness of such an entropy solution exist for the multidimensional isentropic [4,5,6] and full Euler equations [18]. The method in [30] is based on the entropy rate admissibility criterion [9, 10, 17]. This criterion states that the total entropy

$$\begin{aligned} E_u(t) = \int U(u(x, t)) \textrm{d}x \end{aligned}$$

of the selected weak solution u should reduce faster than the entropy of any other existing weak solution \({\tilde{u}}\) with the same initial data

$$\begin{aligned} \forall t > 0: \quad {\frac{ \textrm{d}E_u }{ \textrm{d}t}} \le {\frac{ \textrm{d}E_{{\tilde{u}}} }{ \textrm{d}t}}. \end{aligned}$$

Usage of this criterion is motivated by direct examples and by the fact that it implies the classical entropy inequality for self-similar solutions to Riemann initial data of small amplitude [9] and for piecewise smooth solutions of scalar conservation laws [10]. One can show for Lagrangian gas dynamics, under assumptions on the value of the adiabatic exponent \(\gamma \), that it is not always equivalent to the entropy inequality [27]. Our hope is that at least the physical entropy combined with the entropy rate criterion singles out physically relevant weak solutions for the (multidimensional) Euler equations. A numerical approximation of the total entropy can be defined as

$$\begin{aligned} \begin{aligned} E_{u, Z}(t)&= \int _Z U\left( u^Z(x, t)\right) \textrm{d}x \approx \sum _{k} \omega ^Z_k U\left( u^Z(x_k, t)\right) =: E^Z(t) \\ \quad E_{u}&= \sum _{Z \in \mathscr {Z}} E_{u, Z} \approx \sum _{Z \in \mathscr {Z}} E^Z \end{aligned} \end{aligned}$$
(6)

via a (positive) quadrature rule \(\omega ^Z_k\) on each cell \(Z \in \mathscr {Z}\). The numerical enforcement of the criterion with respect to such a definition of the discrete entropy was carried out in [30] in three steps:

  • Calculate the time derivative of the ansatz function \({\frac{ \textrm{d}{\tilde{u}}^Z }{ \textrm{d}t}}\) using a DG scheme.

  • Calculate an error prediction \(\delta ^Z\) for \({\frac{ \textrm{d}{\tilde{u}}^Z }{ \textrm{d}t}}\) on Z, i.e. \(\left| \left| {\frac{ \textrm{d}{\tilde{u}}^Z }{ \textrm{d}t}} - {\frac{\partial {u} }{\partial {t}}}\right| \right| _Z \le \delta ^Z\).

  • Correct the time derivative into the direction of the steepest entropy descent

    $$\begin{aligned} {\frac{ \textrm{d}u^Z }{ \textrm{d}t}} = {\frac{ \textrm{d}{\tilde{u}}^Z }{ \textrm{d}t}} + \frac{\delta ^Z}{\left| \left| h^Z\right| \right| _Z} h^Z, \end{aligned}$$

    where h shall be the steepest descent direction that does not change the average value in cell Z.
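The three steps above can be sketched in a few lines; this is a toy illustration, not the implementation of [30]: the helper name, the identity mass matrix, and the concrete vectors are illustrative assumptions.

```python
import numpy as np

def corrected_time_derivative(dudt, delta, h, M):
    # Rescale the descent direction h to length delta in the norm induced by
    # the mass matrix M and add it to the uncorrected DG time derivative.
    norm_h = np.sqrt(h @ M @ h)
    if norm_h == 0.0:
        return dudt
    return dudt + (delta / norm_h) * h

M = np.eye(3)                          # toy mass matrix for three nodal values
dudt = np.array([1.0, 0.0, -1.0])      # uncorrected derivative from a DG scheme
h = np.array([1.0, -2.0, 1.0])         # zero-sum direction: keeps the cell average here
u_t = corrected_time_derivative(dudt, 0.5, h, M)
```

The correction has exactly the prescribed length \(\delta ^Z\) in the \(M\)-induced norm, and a zero-sum direction leaves the cell average untouched in this equal-weight toy setting.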

While the approach above is successful for scalar conservation laws [30], significant improvements can be made by introducing two refinements. The first one concerns the usage of an error indicator to estimate the entropy correction needed. We will instead show that it is possible to directly give bounds on how dissipative a weak solution can be. This eliminates the need for the error indicator while allowing faster convergence, because the derived bounds converge to zero significantly faster in the smooth case. A second refinement concerns the direction used for the entropy correction. DG methods can make use of modal filtering to remove unwanted high-frequency modes from the solution [44]. These filters can sometimes be expressed as viscosity. We will devise correction directions that dissipate entropy and at the same time filter unwanted oscillations from the solution, thereby combining dissipation and filtering.

Our schemes will therefore follow the slightly different general layout of:

  • Calculate a time derivative for the ansatz function \({\frac{ \textrm{d}{\tilde{u}}^Z }{ \textrm{d}t}}\) using a DG scheme.

  • Estimate the highest possible entropy dissipation speed \(\sigma ^Z\) in cell Z.

  • Calculate the correction direction \(\upsilon ^Z\).

  • Calculate the size \(\lambda ^Z\) of the correction needed to achieve that

    $$\begin{aligned} {\frac{\partial {u^Z} }{\partial {t}}} = {\frac{\partial {{\tilde{u}}^Z} }{\partial {t}}} + \lambda ^Z \upsilon ^{Z} \end{aligned}$$

    satisfies

    $$\begin{aligned} {\frac{ \textrm{d}E_{u, Z} }{ \textrm{d}t}} = \left\langle {{\frac{ \textrm{d}U }{ \textrm{d}u}}}, {{\frac{\partial {{\tilde{u}}^Z} }{\partial {t}}} + \lambda ^Z \upsilon ^Z} \right\rangle _Z \le \sigma ^Z + F^*_l - F^*_r \end{aligned}$$
    (7)

    with the dissipation \(\sigma ^Z\) mandated by the estimate. \(F^*\) shall be a numerical entropy flux [52,53,54].
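The last two steps of this layout can be sketched as follows, assuming a quadrature-based discrete inner product and a dissipative correction direction with \(\left\langle U', \upsilon \right\rangle _Z < 0\); all names and concrete values are illustrative.

```python
import numpy as np

def correction_size(dU_du, dudt_tilde, upsilon, sigma, Fl_star, Fr_star, w):
    # Discrete inner products <a, b>_Z are evaluated with quadrature weights w.
    produced = w @ (dU_du * dudt_tilde)     # current entropy rate <U', du~/dt>_Z
    budget = sigma + Fl_star - Fr_star      # admissible rate from estimate and fluxes
    if produced <= budget:                  # inequality already satisfied
        return 0.0
    slope = w @ (dU_du * upsilon)           # <U', upsilon>_Z, negative by assumption
    return (budget - produced) / slope      # smallest lambda^Z >= 0 closing the gap

w = np.array([1/3, 4/3, 1/3])               # Simpson weights on [-1, 1]
dU = np.array([1.0, 2.0, 1.0])              # entropy variables at the nodes
lam = correction_size(dU, np.ones(3), -np.ones(3),
                      sigma=-1.0, Fl_star=0.0, Fr_star=0.0, w=w)
```

The returned \(\lambda ^Z\) is the smallest nonnegative correction size for which the corrected cell entropy rate meets the prescribed budget with equality.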

The procedure makes use only of the fact that local ansatz functions exist in our cells. It is therefore also applicable to similar schemes like the spectral volume (SV) method [60]. The only difference would lie in the evaluation of a different scheme for the uncorrected derivative \({\frac{ \textrm{d}{\tilde{u}} }{ \textrm{d}t}}\). One complication is brought in by the fact that entropy dissipation implies the non-smoothness of the solution, as otherwise the entropy equality applies. Therefore, dissipation cannot happen inside cells in the continuous setting, as polynomials are smooth. Instead, dissipation is a process taking place at the cell edges where our different ansatz functions transition. As we are not correcting the numerical fluxes used between cells, our dissipation will be centered in cells and not at cell edges, and we will show in Sect. 3.1 how to work around this problem.

2 Entropy Inequality Predictors

2.1 Bounds for Entropy and Entropy Dissipation

Our main tool to approximate the most dissipative weak solution using a DG method will be a bound on the derivative of the total entropy. We will derive a lower bound for the entropy dissipation

$$\begin{aligned} s^\theta _u(t_1, t_2) = \int _{t_1}^{t_2}\int _{\theta } {\frac{\partial {U} }{\partial {t}}} + {\frac{\partial {F} }{\partial {x}}} \textrm{d}x \textrm{d}t \le 0. \end{aligned}$$

Here \(\theta \subset \varOmega \) shall be an arbitrary open subdomain of the complete domain. This value has to be less than or equal to zero for a solution that is admissible with respect to the classical entropy inequality (5). Further, we are interested in the entropy dissipation speed

$$\begin{aligned} \sigma ^\theta _u(t) = {\frac{\partial {s^\theta _u(t_0, t)} }{\partial {t}}} = \int _{\theta } {\frac{\partial {U} }{\partial {t}}} + {\frac{\partial {F} }{\partial {x}}} \textrm{d}x \le 0. \end{aligned}$$

If \(s^\theta _u\) and \(\sigma ^\theta _u\) are known one can estimate the total entropy \(E_u(t)\) and the time derivative of the total entropy \({\frac{ \textrm{d}E_u }{ \textrm{d}t}}\) as

$$\begin{aligned} E_u(t) {- E_u(0)} \ge \sum _{\theta \in \varTheta } s^\theta _u(0, t), \quad {\frac{ \textrm{d}E_u }{ \textrm{d}t}} \ge \sum _{\theta \in \varTheta } \sigma ^\theta _u, \end{aligned}$$
(8)

when \(\varTheta = \{ \theta _1, \theta _2, \theta _3, \dots , \theta _L\}\) is overlapping \(\varOmega \) in the sense of

$$\begin{aligned} \varOmega \subset \bigcup _{\theta \in \varTheta } \theta . \end{aligned}$$

Note that the inflow and outflow of entropy can influence the total amount of entropy, but for the theoretical part of this publication we are only interested in infinite domains without boundaries or finite domains with periodic boundary conditions. Otherwise the entropy inflow and outflow would enter equation (8). Inflow and outflow do not influence \(s^\theta _u\) as it only accounts for entropy dissipation.

To achieve our goal of estimating \(s^\theta \) we will view the problem in the setting of classical Finite-Volume schemes [50] and go over to the limit \(\varDelta x \rightarrow 0\). In [11] it was shown that for scalar conservation laws the flux f of the solution to the Riemann problem \(u_\text {R}(u_l, u_r; x/t)\) at \(x/t = 0\) is given by

$$\begin{aligned} f\left( u_\text {R}(u_l, u_r; 0)\right) = f\left( {{\,\textrm{argmin}\,}}_{u \in {{\,\textrm{ch}\,}}(u_l, u_r)} \left\langle {{\frac{ \textrm{d}U }{ \textrm{d}u}} (u_r) - {\frac{ \textrm{d}U }{ \textrm{d}u}} (u_l)}, {f(u)} \right\rangle \right) , \end{aligned}$$

i.e. by entering the value of u from the convex hull

$$\begin{aligned} {{\,\textrm{ch}\,}}(u_l, u_r) = \left\{ \lambda u_l + (1- \lambda ) u_r ~\bigg |~ \lambda \in [0,1] \right\} \end{aligned}$$

into f that when entered into the flux yields the fastest entropy dissipation. This identity is in general false for systems, as a counterexample shows [42, Theorem 9.8]. In [31] it was shown that some approximate Riemann solvers, for example the local Lax-Friedrichs flux, can also be interpreted as approximate solutions to such variational descriptions of two-point fluxes. While the aforementioned results hold for semidiscrete schemes, the new results below are fully discrete and aim at three-point first-order Finite-Difference/Finite-Volume schemes for systems of conservation laws. As one assumes piecewise constant functions in those first-order methods, any quadrature exact for constants will yield the same result in equation (6). As we only look at discrete time values in this part of the publication we will write \(E^n_u = E_u(t_n)\) for the discrete total entropy at time level n.
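For a scalar example, the variational characterization above can be evaluated by brute force. The following sketch assumes Burgers' flux \(f(u) = u^2/2\) with the quadratic entropy \(U(u) = u^2/2\), so that \(U'(u_r) - U'(u_l) = u_r - u_l\); it is not a production Riemann solver.

```python
import numpy as np

def entropy_rate_flux(f, u_l, u_r, samples=2001):
    # Sample the convex hull ch(u_l, u_r) and pick the state minimizing
    # (U'(u_r) - U'(u_l)) * f(u); for U(u) = u^2/2 this is (u_r - u_l) * f(u).
    hull = np.linspace(u_l, u_r, samples)
    return f(hull[np.argmin((u_r - u_l) * f(hull))])

burgers = lambda u: 0.5 * u**2
shock = entropy_rate_flux(burgers, 1.0, -1.0)        # stationary shock
rarefaction = entropy_rate_flux(burgers, -1.0, 1.0)  # transonic rarefaction
```

In both sample cases the result agrees with the Godunov flux for scalar laws: the maximum of f over the hull for the shock, its minimum for the transonic rarefaction.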

Lemma 1

Let a system of hyperbolic conservation laws in conservation form and a strictly convex entropy pair (U, F) be given that is approximated by an explicit Finite-Volume scheme with grid constant \(\lambda = \frac{\varDelta t}{\varDelta x}\). Assume there exists a conservative consistent three-point scheme with minimal (most negative) entropy rate for the discrete entropy

$$\begin{aligned} E^n_u = \sum _{k} U(u^n_k) \varDelta x. \end{aligned}$$

Then this scheme has to be the original Lax-Friedrichs scheme.

Proof

Assume \(f(u_k, u_{k+1})\) is a consistent numerical two-point flux dissipating the total entropy at the maximal rate and let \(u_l, u_r\) be arbitrary in the domain of admissible values for the conserved variables. We apply a scheme using this flux to a Riemann problem, i.e. the initial data

$$\begin{aligned} u^0_{k} = {\left\{ \begin{array}{ll} u_l &{} k \le 0 \\ u_r &{} k > 0 \\ \end{array}\right. } \end{aligned}$$

to query the flux value \(f(u_l, u_r) = f(u_0, u_1) = f_{\frac{1}{2}}\) by analyzing the solution. As the flux is consistent it holds

$$\begin{aligned} \forall k < 0: f\left( u_{k}, u_{k+ 1}\right) = f_{k + \frac{1}{2}} = f(u_l), \quad \forall k > 0: f\left( u_{k}, u_{k+ 1}\right) = f_{k + \frac{1}{2}} = f(u_r). \end{aligned}$$

The scheme

$$\begin{aligned} u^1_k = u^0_k + \lambda \left( f_{k - \frac{1}{2}} - f_{k + \frac{1}{2}}\right) \end{aligned}$$

therefore implies that \(u^1_k = u^0_k\) for all \(k \not \in \{0, 1\}\). The total entropy

$$\begin{aligned} E^1_u = E^0_u - \varDelta x \left( U\left( u^0_0\right) + U\left( u^0_1\right) \right) + \varDelta x \left( U\left( u^1_0\right) + U\left( u^1_1\right) \right) \end{aligned}$$

is minimized by \(u^1_0 = u^1_1\), as U is strictly convex. Assume \(u^1_0 \ne u^1_1\) holds. In this case the strict form of Jensen’s inequality

$$\begin{aligned} 2 U\left( \frac{u_0^1 + u_1^1}{2}\right) < \frac{2}{2} \left( U\left( u_0^1\right) + U\left( u_1^1\right) \right) \end{aligned}$$

would imply that the total entropy could be reduced by setting \(u_0^1 = u_1^1\) using averaging. Entering this into the scheme’s definition with \(u^0_0 = u_l\) and \(u^0_1 = u_r\) implies

$$\begin{aligned} u_l + \lambda (f(u_l) - f(u_l, u_r)) = u_r + \lambda (f(u_l, u_r) - f(u_r)). \end{aligned}$$

Rearranging for \(f(u_l, u_r)\) shows

$$\begin{aligned} f(u_l, u_r) = \frac{f(u_l) + f(u_r) }{2} + \frac{u_l - u_r}{2 \lambda }, \end{aligned}$$

and this is the classical Lax-Friedrichs flux. \(\square \)
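A short numerical check of the proof's conclusion: applying this flux to the two cells adjacent to the jump equalizes both states, which is the entropy-minimizing configuration for strictly convex U. Burgers' flux and the chosen states are illustrative.

```python
import numpy as np

def lax_friedrichs_flux(f, u_l, u_r, lam):
    # Two-point flux from the proof; lam = dt/dx is the grid constant.
    return 0.5 * (f(u_l) + f(u_r)) + (u_l - u_r) / (2.0 * lam)

burgers = lambda u: 0.5 * u**2
lam, u_l, u_r = 0.4, 2.0, -1.0
f_star = lax_friedrichs_flux(burgers, u_l, u_r, lam)
# one explicit update of the two cells adjacent to the jump
u0_new = u_l + lam * (burgers(u_l) - f_star)
u1_new = u_r + lam * (f_star - burgers(u_r))
# both updated states coincide, as derived in the proof
```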

This result shows that the classical LF scheme is the most direct realization of a scheme satisfying Dafermos' entropy rate criterion and therefore justifies the use of the LF scheme in [29] as the most dissipative explicit three-point scheme possible for systems of conservation laws for fixed grid constant \(\lambda = \frac{\varDelta t}{\varDelta x}\). Similar results are also known for scalar conservation laws. Tadmor showed in [52, 53] that every monotonicity preserving scheme that can be written in averaging form and satisfies classical numerical entropy inequalities has a viscosity coefficient less than or equal to that of his modified Lax-Friedrichs (MLF) scheme, and greater than or equal to the viscosity coefficient of Godunov's scheme. The LF scheme was modified in this case

$$\begin{aligned} u^{n+1}_k = \frac{u^n_{k-1} + 2 u^n_k + u^n_{k+1} }{4} + \frac{\varDelta t}{2 \varDelta x} \left( f(u^n_{k-1}) - f(u^n_{k+1}) \right) \end{aligned}$$
(9)

to be in an averaging form, i.e. can be written as

$$\begin{aligned} \begin{aligned} u^{n+1}_k&= \frac{u^{n+1}_{k-\frac{1}{4}} + u^{n+1}_{k + \frac{1}{4}} }{2}, \quad f^{\textrm{MLF}}(u_l, u_r) = \frac{f(u_l) + f(u_r)}{2} + \varDelta x \frac{u_l - u_r}{4 \varDelta t} \\ u^{n+1}_{k-\frac{1}{4}}&= u^n_k + \frac{\varDelta t}{\varDelta x / 2} \left( f^{\textrm{MLF}}(u^n_{k-1}, u^n_k) - f(u^n_{k}) \right) \\ u^{n+1}_{k+\frac{1}{4}}&= u^n_k + \frac{\varDelta t}{\varDelta x / 2}\left( f(u^n_{k}) - f^{\textrm{MLF}}(u^n_{k}, u^n_{k + 1}) \right) . \end{aligned} \end{aligned}$$
(10)

This scheme can be derived from the Lax-Friedrichs scheme by applying it to a grid with half the cell size and averaging two adjacent cells after each timestep. In a more general form we will speak of schemes in averaging form if the cell update can be written as a two-step process: first the calculation of subcell values at the next time layer

$$\begin{aligned} \begin{aligned} u^{n+1}_{k - \frac{1}{4}}&= u^n_k + \frac{\varDelta t}{\varDelta x / 2} \left( f^{\textrm{num}}(u^n_{k-1}, u^n_k) - f(u^n_{k}) \right) \\ u^{n+1}_{k+\frac{1}{4}}&= u^n_k + \frac{\varDelta t}{\varDelta x / 2}\left( f(u^n_{k}) - f^{\textrm{num}}(u^n_{k}, u^n_{k + 1}) \right) . \end{aligned} \end{aligned}$$
(11)

The second step averages these two states into the next time layer

$$\begin{aligned} u^{n + 1}_{k} = \frac{u^{n+1}_{k-\frac{1}{4}} + u^{n+1}_{k + \frac{1}{4}}}{2}. \end{aligned}$$
(12)
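The two-step averaging form can be checked against the single-step update obtained by inserting the MLF flux; the following sketch on a periodic grid uses Burgers' flux and illustrative data.

```python
import numpy as np

burgers = lambda u: 0.5 * u**2

def mlf_step_averaging(f, u, dt, dx):
    # Subcell form: half-cell updates with the MLF flux, then averaging.
    fl, fr = np.roll(u, 1), np.roll(u, -1)                  # periodic neighbours
    f_mlf = lambda a, b: 0.5 * (f(a) + f(b)) + dx * (a - b) / (4.0 * dt)
    u_left  = u + (dt / (dx / 2)) * (f_mlf(fl, u) - f(u))   # u_{k-1/4}
    u_right = u + (dt / (dx / 2)) * (f(u) - f_mlf(u, fr))   # u_{k+1/4}
    return 0.5 * (u_left + u_right)

def mlf_step_direct(f, u, dt, dx):
    # Equivalent single-step update obtained by inserting the MLF flux.
    fl, fr = np.roll(u, 1), np.roll(u, -1)
    return 0.25 * (fl + 2.0 * u + fr) + dt / (2.0 * dx) * (f(fl) - f(fr))

u = np.array([0.0, 1.0, 2.0, 0.5])
a = mlf_step_averaging(burgers, u, dt=0.1, dx=1.0)
b = mlf_step_direct(burgers, u, dt=0.1, dx=1.0)
```

Both forms produce identical cell averages, and the periodic flux differences telescope, so the total mass is conserved.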

A suitable total entropy for the intermediate subgrid states would be

$$\begin{aligned} E^{n+1}_u = \sum _k \frac{\varDelta x}{2} \left( U\left( u^{n+1}_{k - \frac{1}{4}}\right) + U\left( u^{n+1}_{k + \frac{1}{4}}\right) \right) . \end{aligned}$$
(13)

This total entropy is obviously dissipated by the primary grid averaging of equation (12). We can use this reformulation to state the following result.

Lemma 2

The modified Lax-Friedrichs scheme is the most dissipative scheme with respect to the total entropy (13) that is also conservative, consistent, three point and can be written in averaging form.

Proof

Let \(u^n\) be an arbitrary initial condition. Then the total entropy \(E^{n+1}\) before averaging takes place can be written as

$$\begin{aligned} E^{n+1}_u = \sum _k \frac{\varDelta x}{2}\left( U\left( u^{n+1}_{k + \frac{1}{4}}\right) + U\left( u^{n+1}_{k + \frac{3}{4}}\right) \right) = \sum E_{k+ \frac{1}{2}} \end{aligned}$$
(14)

where the terms \(E_{k + \frac{1}{2}}\) are determined by the fluxes \(f_{k + \frac{1}{2}}\). The minimum is achieved for \(u^{n+1}_{k + \frac{1}{4}} = u^{n+1}_{k + \frac{3}{4}}\) and rearranging yields, as in the proof of lemma 1 above,

$$\begin{aligned} f_{k + \frac{1}{2}} = \frac{f(u^n_k) + f(u^n_{k+1})}{2} + \frac{\varDelta x}{4 \varDelta t} \left( u^n_{k} - u^n_{k+1}\right) . \end{aligned}$$
(15)

\(\square \)

Our result can be seen as a generalization of the MLF part of Tadmor's result to systems of conservation laws, as it states that the MLF flux is the most dissipative flux for a selected time-step size with respect to the sub-cell entropy.

Using the scheme above one can derive estimates for the highest possible entropy dissipation in a time-step. The difference quotient of this result and the time step size approximates the lowest possible derivative of the total entropy with respect to time.

Corollary 1

The biggest possible entropy dissipation during a discrete time-step of a Finite-Volume scheme with grid constant \(\lambda = \frac{\varDelta t}{\varDelta x}\) is given by the difference

$$\begin{aligned} E^{n+1} - E^{n} = \sum _k U\left( \frac{u_{k -1}^n + u_{k +1}^n}{2} + \frac{\lambda }{2}\left( f\left( u_{k-1}^n\right) - f\left( u_{k+1}^n\right) \right) \right) \varDelta x - U\left( u_{k}^n\right) \varDelta x \end{aligned}$$

and an approximation to the total entropy’s minimal derivative by

$$\begin{aligned} {\frac{ \textrm{d}E }{ \textrm{d}t}} \approx \sum _k \frac{U\left( \frac{u_{k -1} + u_{k +1}}{2} + \frac{\lambda }{2}(f(u_{k-1}) - f(u_{k+1}))\right) - U(u_{k})}{\lambda }. \end{aligned}$$
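This estimate can be sketched as one classical Lax-Friedrichs step on a periodic grid; the square-wave data and quadratic entropy are illustrative assumptions.

```python
import numpy as np

def lf_entropy_dissipation(U, f, u, lam, dx):
    # One classical Lax-Friedrichs step on a periodic grid and the
    # resulting change of the discrete total entropy.
    fl, fr = np.roll(u, 1), np.roll(u, -1)
    u_new = 0.5 * (fl + fr) + 0.5 * lam * (f(fl) - f(fr))
    return np.sum(U(u_new) - U(u)) * dx

U = lambda u: 0.5 * u**2
burgers = lambda u: 0.5 * u**2
u = np.array([1.0, 1.0, -1.0, -1.0])   # periodic square wave
dE = lf_entropy_dissipation(U, burgers, u, lam=0.5, dx=1.0)
```

The result is nonpositive, matching the role of the LF step as the most dissipative three-point update; dividing it by \(\varDelta t\) yields the rate estimate.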

The second estimate above degenerates for \(\lambda \rightarrow 0\) as the difference in entropy is in general finite between cells \(u_{k-1}, u_k, u_{k +1}\). Similar problems exist for first generation central schemes and the LF scheme can be interpreted as a central scheme using piecewise constant recovery [34]. As the estimate above is only a lower bound this divergence for \(\lambda \rightarrow 0\) is not a paradox if one can find a sharper bound from below. A second, more refined, estimate is given by the following lemma based on the ideas from [26] and does not have these deficiencies.

Lemma 3

Given bounds \(a_l\) on the fastest signal speed to the left and \(a_r\) on the fastest signal speed to the right, let \(R \ge \max (\left| a_l\right| , \left| a_r\right| )\). The maximum entropy dissipation of a Riemann problem solution on the interval \(\theta = (-R, R)\) is bounded from below by

$$\begin{aligned} E^{\theta }_u(t) - E^{\theta }_u(0) \ge t \left( (a_r - a_l)U\left( u_{lr} \right) +a_l U(u_l) - a_rU(u_r) \right) \end{aligned}$$

with

$$\begin{aligned} u_{lr} = \frac{a_r u_r - a_l u_l+ f(u_l) - f(u_r)}{a_r - a_l}. \end{aligned}$$

The rate is bounded from below by

$$\begin{aligned} \begin{aligned} {\frac{ \textrm{d}E^{\theta } }{ \textrm{d}t}}|_{t = 0} \ge (a_r - a_l) U(u_{lr}) + a_l U(u_l) - a_r U(u_r) \end{aligned}. \end{aligned}$$

The entropy dissipation is bounded by

$$\begin{aligned} s^\theta \ge t \left( (a_r - a_l) U(u_{lr}) + a_l U(u_l) - a_r U(u_r) + F(u_r) - F(u_l) \right) , \end{aligned}$$

and its rate by

$$\begin{aligned} \sigma ^\theta \ge (a_r - a_l) U(u_{lr}) + a_l U(u_l) - a_r U(u_r) + F(u_r) - F(u_l). \end{aligned}$$

Proof

The entropy of the initial condition in the interval \([-R, R]\) is given by

$$\begin{aligned} \int _{-R}^R U(u(x, 0)) \textrm{d}x = R U(u_l) + RU(u_r) \end{aligned}$$

for any \(R > 0\). Integrating over the triangle \(T = {{\,\textrm{ch}\,}}\{(0, 0), (a_l, 1), (a_r, 1)\}\) in spacetime and using the conservation law yields

$$\begin{aligned} \begin{aligned} 0&= \int _T {\frac{\partial {u} }{\partial {t}}} + {\frac{\partial {f} }{\partial {x}}} \textrm{d}V(x, t) = \int _{\partial T} \begin{pmatrix} f \\ u \end{pmatrix} \cdot n \textrm{d}O(x, t) \\&= \int _{a_l}^{a_r} \begin{pmatrix} f(u) \\ u(x, 1) \end{pmatrix} \cdot \begin{pmatrix} 0 \\ 1 \end{pmatrix} \textrm{d}x + \int _0^1 \begin{pmatrix} f(u) \\ u(t a_l, t) \end{pmatrix} \cdot \begin{pmatrix} -1 \\ a_l \end{pmatrix} \textrm{d}t + \int _0^1 \begin{pmatrix} f(u) \\ u(t a_r, t) \end{pmatrix} \cdot \begin{pmatrix} 1 \\ -a_r \end{pmatrix} \textrm{d}t \\&= (a_r - a_l) u_{lr} + a_l u_l - a_r u_r + f(u_r) - f(u_l), \end{aligned} \end{aligned}$$

in conjunction with the Gauß divergence theorem, cf. Fig. 1. Here \(u_{lr}\) shall denote the mean value of u(x, 1) on \([a_l, a_r]\) and is

$$\begin{aligned} u_{lr} = \frac{a_r u_r - a_l u_l + f(u_l) - f(u_r)}{a_r - a_l}, \end{aligned}$$

as apparent from the calculation above. Jensen’s inequality implies

$$\begin{aligned} \begin{aligned} t(a_r - a_l)U(u_{lr})&= t(a_r - a_l) U \left( \frac{1}{t(a_r - a_l)} \int _{ta_l}^{ta_r} u(x, t) \textrm{d}x \right) \\&\le \frac{t(a_r - a_l)}{t(a_r - a_l)} \int _{ta_l}^{ta_r} U(u(x, t)) \textrm{d}x = E^{( ta_l, ta_r)}_u(t). \end{aligned} \end{aligned}$$
(16)

Therefore it follows

$$\begin{aligned} \begin{aligned} E^\theta (1) - E^\theta (0) \ge&(a_r - a_l) U(u_{lr}) + (R - a_r) U(u_r) + (R + a_l) U(u_l) \\&- R (U(u_l) + U(u_r)) \\ =&(a_r - a_l) U(u_{lr}) + a_l U(u_l) - a_r U(u_r) \end{aligned} \end{aligned}$$
(17)

for the entropy dissipation between \(t=0\) and \(t=1\) and using the invariance under transformations \((x, t) \mapsto (\mu x, \mu t)\) for \(\mu > 0\) yields

$$\begin{aligned} \begin{aligned} {\frac{ \textrm{d}E }{ \textrm{d}t}}|_{t = 0} \ge (a_r - a_l) U(u_{lr}) + a_l U(u_l) - a_r U(u_r) \end{aligned} \end{aligned}$$
(18)

for the rate. To calculate the entropy dissipation \(s^\theta \) and its speed \(\sigma ^\theta \) we just have to account for the entropy flowing into and out of the interval \(\theta \) using the entropy flux F. This is possible as u is constant to the left of \((ta_l, t)\) and to the right of \((t a_r, t)\). \(\square \)
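The bound of lemma 3 is cheap to evaluate; a minimal sketch for scalar Burgers' flux with \(U(u) = u^2/2\) and matching entropy flux \(F(u) = u^3/3\) follows. The function name and states are illustrative, and the boundary entropy flux is accounted with the convention that outflow at the right boundary counts positively.

```python
import numpy as np

def hll_entropy_rate_bound(U, F, f, u_l, u_r, a_l, a_r):
    # HLL intermediate state: mean value of the solution over the wave fan.
    u_lr = (a_r * u_r - a_l * u_l + f(u_l) - f(u_r)) / (a_r - a_l)
    # Jensen-based bound on the dissipation rate sigma^theta, including the
    # entropy flux over the boundary of theta (outflow minus inflow).
    return ((a_r - a_l) * U(u_lr) + a_l * U(u_l) - a_r * U(u_r)
            + F(u_r) - F(u_l))

burgers = lambda u: 0.5 * u**2
U = lambda u: 0.5 * u**2     # quadratic entropy
F = lambda u: u**3 / 3.0     # matching entropy flux, F' = U' f'
sigma = hll_entropy_rate_bound(U, F, burgers, 1.0, -1.0, a_l=-1.0, a_r=1.0)
```

For these states the exact solution is a stationary shock dissipating at rate \(-2/3\), which indeed stays above the computed bound of \(-5/3\).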

Fig. 1 Layout of the integration areas in the proof (blue), as originally used in [26]. To the left and right of the lines \({\frac{ \textrm{d}x }{ \textrm{d}t}} = a_l\) and \({\frac{ \textrm{d}x }{ \textrm{d}t}} = a_r\) the initial condition is unaltered

The estimate above does not depend on any grid constant and reduces to the previous one for \(-a_l = c_\text {max} = a_r\) and \(\lambda c_\text {max} = 1\), which is the CFL condition for the classical Lax-Friedrichs scheme, i.e. both estimates are compatible.

A Godunov-type scheme using the HLL approximate Riemann solver is also compatible with the estimate above: the discrete total entropy after one time-step is still less than or equal to the bound given above.

Fig. 2 A set of noninteracting HLL approximate Riemann solutions

Let \(\lambda c_{\text {max}} \le \frac{1}{2}\) hold, implying that the Riemann problems do not interact, and let \(u^\textrm{HLL}(x, t^{n+1})\) be the piecewise constant solution of the HLL solver as in Fig. 2, not yet averaged over the cells, while \(u^{n+1}_k\) shall be the corresponding cell averages. In this case the total discrete entropy at the next time-step is given by

$$\begin{aligned} \begin{aligned} E^{n+1}_{\textrm{FV}} =&\sum _k \varDelta x U(u_k^{n+1}) \le \sum _k \int _{x_{k-\frac{1}{2}}}^{x_{k + \frac{1}{2}}} U(u^{\textrm{HLL}}(x, t^{n+1})) \textrm{d}x\\ =&\int _\varOmega U(u^{\textrm{HLL}}(x, t^{n+1})) \textrm{d}x \le E^{n+1}_{u^{\textrm{HLL}}}. \end{aligned} \end{aligned}$$

Therefore, the discrete entropy of the approximate solution is lower than the entropy of any exact weak solution. The next subsection will move beyond first-order schemes by generalizing this lower bound to one that also allows smooth solutions instead of piecewise constant ones. To complete our predictor we need formulas for the speeds \(a_l\) and \(a_r\). A large body of literature exists concerning these speeds; the first formulas known to us are those given by Einfeldt [15] and Davis [14]. The second formula given by Davis for the Euler equations is

$$\begin{aligned} a_l = \min (v_l - c_l, v_r - c_r), \quad a_r = \max (v_l + c_l, v_r + c_r). \end{aligned}$$
(19)

Here v denotes the particle speed and c the speed of sound. In our implementation a "numerical sound speed" \(c = \sqrt{\frac{3 p }{\rho }}\) is used in the speed calculations to circumvent the sonic point glitch present in HLL-type solvers [56]. While its simplicity is intriguing, it was shown that these formulas are in general not upper bounds [58]. Still, this formula is easily generalized to other systems as it only needs the eigenvalues of the Jacobians of the flux on the left and right side. The generalizations in the next chapter are not as formal as before, and we therefore do not see the Davis estimates as a fundamental weakness.
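A minimal sketch of these speed estimates for the Euler equations, using the numerical sound speed mentioned above; the function and variable names are our assumptions.

```python
import numpy as np

def davis_speeds(v_l, rho_l, p_l, v_r, rho_r, p_r):
    # Davis-type estimates with the numerical sound speed c = sqrt(3 p / rho).
    c_l = np.sqrt(3.0 * p_l / rho_l)
    c_r = np.sqrt(3.0 * p_r / rho_r)
    return min(v_l - c_l, v_r - c_r), max(v_l + c_l, v_r + c_r)

# Sod-like left/right states (velocity, density, pressure)
a_l, a_r = davis_speeds(0.0, 1.0, 1.0, 0.0, 0.125, 0.1)
```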

2.2 Asymptotic Analysis Based Entropy Inequality Predictor

Fig. 3 Construction and application of the generalized HLL dissipation estimate

The entropy inequality predictor in this section will be based on an asymptotic analysis of the problem described in Fig. 3a, i.e. two smooth solutions joined at an interface. An obstacle lies in the missing self-similarity, in contrast to the previous part, where the self-similarity of the initial condition and the assumed self-similarity of the solution induced the existence of a self-similar, i.e. constant, entropy dissipation speed. We will therefore try to approximate

$$\begin{aligned} s^\theta (t_1, t_2) = \int _{t_1}^{t_2} \int _\theta {\frac{\partial {F} }{\partial {x}}} + {\frac{\partial {U} }{\partial {t}}} \textrm{d}x \textrm{d}t \end{aligned}$$

for reasonably small \(\left| t_2 - t_1\right| \) and a discontinuity at the interface in the interior of \(\theta \). The schemes in which we will use these entropy inequality predictors should converge with high orders for smooth solutions, necessitating that the predictor converges to zero with high order for smooth solutions. This convergence is also dictated by the entropy equality for smooth solutions. If \(u_l(x)\) and \(u_r(x)\) are piecewise constant, this problem is already solved by the methods described in the last subsection. We will therefore now retrace the proof of lemma 3 assuming that \(u_l(x)\) and \(u_r(x)\) are smooth functions. The missing self-similarity of Generalized Riemann problems [1], cf. Fig. 3b, precludes the existence of the speed estimates \(a_l\) and \(a_r\), and we therefore just assume that these speed estimates exist for small times. Further, we assume that for small times the solutions left of \((ta_l, t)\) and right of \((ta_r, t)\) remain smooth, as no waves from the interaction arrive there and \(u_l(x), u_r(x)\) have bounded derivatives.

The average value \(u_{lr}\) shall be determined by applying the conservation law to the triangle \(T = {{\,\textrm{ch}\,}}\{(0, 0), (t a_l, t), (t a_r, t)\}\)

$$\begin{aligned} \begin{aligned} 0 =&\int _T {\frac{\partial {u} }{\partial {t}}} + {\frac{\partial {f} }{\partial {x}}} \textrm{d}V(x, t) = \int _{\partial T} \begin{pmatrix} f \\ u \end{pmatrix} \cdot n \textrm{d}O(x, t) \\ =&\int _{t a_l}^{t a_r} \begin{pmatrix} f(u) \\ u(x, t) \end{pmatrix} \cdot \begin{pmatrix} 0 \\ 1 \end{pmatrix} \textrm{d}x + \int _0^t \begin{pmatrix} f(u) \\ u(\tau a_l, \tau ) \end{pmatrix} \cdot \begin{pmatrix} -1 \\ a_l \end{pmatrix} \textrm{d}\tau \\&+ \int _0^t \begin{pmatrix} f(u) \\ u(\tau a_r, \tau ) \end{pmatrix} \cdot \begin{pmatrix} 1 \\ -a_r \end{pmatrix} \textrm{d}\tau \\ =&\underbrace{\int _{ta_l}^{ta_r} u(x, t) \textrm{d}x}_{t(a_r - a_l)u_{lr}} + \int _0^t f(u_r(\tau a_r, \tau )) - f(u_l(\tau a_l, \tau )) \textrm{d}\tau \\&- \int _{t a_l}^0 u_l(x, 0) \textrm{d}x - \int _{0}^{t a_r} u_r(x, 0) \textrm{d}x. \end{aligned} \end{aligned}$$

Dividing this equation by t and going over to the limit \(t \rightarrow 0\) results in

$$\begin{aligned} \begin{aligned} \frac{\int _0^t f(u_r(\tau a_r, \tau )) - f(u_l(\tau a_l, \tau )) \textrm{d}\tau }{t}&\xrightarrow {t \rightarrow 0} f(u_r(0, 0)) - f(u_l(0, 0)), \\ \frac{- \int _{t a_l}^0 u_l(x, 0) \textrm{d}x - \int _{0}^{t a_r} u_r(x, 0) \textrm{d}x}{t}&\xrightarrow {t \rightarrow 0} a_l u_l(0, 0) - a_r u_r(0, 0), \end{aligned} \end{aligned}$$

using the continuity of the integrands and the mean value theorem of integration [37]. Therefore it follows

$$\begin{aligned} u_{lr}(t) \xrightarrow {t \rightarrow 0} \frac{a_r u_r(0) - a_l u_l(0) + f(u_l( 0)) - f(u_r( 0))}{a_r - a_l} \end{aligned}$$

for vanishing t. Equation (16) also remains valid for piecewise polynomial initial conditions and small \(t > 0\). We can therefore conclude that a generalization of equation (17) holds in the form

$$\begin{aligned} \begin{aligned} E^\theta (t) - E^\theta (0) \ge t(a_r - a_l) U(u_{lr})&+ \int _{-R}^{t a_l} U(u(x, t)) \textrm{d}x + \int _{t a_r}^R U(u(x, t)) \textrm{d}x\\&- \int _{-R}^R U(u(x, 0)) \textrm{d}x. \end{aligned} \end{aligned}$$

Accounting for the entropy flowing in and out of \([-R, R]\) yields

$$\begin{aligned} \begin{aligned} s^\theta (0, t) \ge t(a_r - a_l) U(u_{lr})&+ \int _{-R}^{t a_l} U(u(x, t)) \textrm{d}x + \int _{t a_r}^R U(u(x, t)) \textrm{d}x\\&- \int _{-R}^R U(u(x, 0)) \textrm{d}x + \int _0^t F(u(R, \tau )) - F(u(-R, \tau )) \textrm{d}\tau . \end{aligned} \end{aligned}$$

Applying the entropy equality to the subdomains \([-R, ta_l] \times [0, t]\) and \([ta_r, R] \times [0, t]\)

$$\begin{aligned} \int _{-R}^{t a_l} U(u(x, t)) \textrm{d}x - \int _{-R}^{t a_l} U(u(x, 0)) \textrm{d}x = \int _0^{t} F(u(-R, \tau )) - F(u(t a_l, \tau )) \textrm{d}\tau , \end{aligned}$$

that holds for small \(t > 0\) because the solution stays smooth in the subdomains, allows us to restate this as

$$\begin{aligned} \begin{aligned} s^\theta (0, t) \ge t(a_r - a_l) U(u_{lr}) - \int _{ta_l}^{ta_r} U(u(x, 0)) \textrm{d}x - \int _0^t F(u(t a_l, \tau )) - F(u(t a_r, \tau )) \textrm{d}\tau . \end{aligned} \end{aligned}$$

Dividing by t and passing to the limit, using the limit of \(u_{lr}\) and once more the mean value theorem, shows that in this case also

$$\begin{aligned} \sigma ^\theta \ge (a_r - a_l) U(u_{lr}) - a_r U(u_r(0)) + a_l U(u_l(0)) - F(u_l(0)) + F(u_r(0)). \end{aligned}$$
(20)
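As a concrete instance, the estimate can be evaluated for Burgers' equation with the square entropy; the following sketch (illustrative function names, Davis-type speeds supplied by the caller) computes the middle state \(u_{lr}\) and the resulting dissipation bound.

```python
# Hedged sketch of the dissipation bound for Burgers' equation,
# f(u) = u^2/2, with entropy U(u) = u^2/2 and entropy flux F(u) = u^3/3.
# a_l < a_r are assumed wave speed estimates, e.g. of Davis type.
def hll_entropy_dissipation(u_l, u_r, a_l, a_r):
    f = lambda u: 0.5 * u * u          # Burgers flux
    U = lambda u: 0.5 * u * u          # entropy
    F = lambda u: u ** 3 / 3.0         # entropy flux
    # HLL middle state from the conservation argument above
    u_lr = (a_r * u_r - a_l * u_l + f(u_l) - f(u_r)) / (a_r - a_l)
    # dissipation speed bound; non-positive by Jensen's inequality
    return (a_r - a_l) * U(u_lr) - a_r * U(u_r) + a_l * U(u_l) \
        - F(u_l) + F(u_r)
```

For equal left and right states the bound vanishes, while a dissipative jump yields a strictly negative value.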

A significant problem of the derivation above lies in the fact that one can only estimate the entropy dissipation speed in the interval \(\theta = (-R, R)\), but not in \((-R, 0)\) as the true dissipation can be located anywhere in the cone \([ta_l, ta_r]\). As the cells

$$\begin{aligned} \mathscr {Z}= \left\{ T_k = \left[ x_{k-\frac{1}{2}}, x_{k + \frac{1}{2}}\right] ~\bigg |~ k \in \mathbb {Z} \right\} , \quad x_{k - \frac{1}{2}} < x_{k+ \frac{1}{2}} \end{aligned}$$

in our numerical tests will be laid out as in Fig. 3d, a suitable set of overlapping open intervals is

$$\begin{aligned} \varTheta = \left\{ \theta _{k + \frac{1}{2}} = (x_{k} - \varepsilon , x_{k+1} + \varepsilon ) ~\bigg |~ k \in \mathbb {Z} \right\} , \quad x_{k} = \frac{x_{k-\frac{1}{2}} + x_{k + \frac{1}{2}}}{2}. \end{aligned}$$

We are therefore left with the problem of how to split this dissipation onto the two neighboring cells that have overlap with \(\theta _{k + \frac{1}{2}}\). This problem will be handled below in Sect. 3.1.

2.3 Accounting for Aliasing Errors

In [30] one of the findings of the numerical tests was that the entropy dissipation of the numerical solutions started shortly before a true entropy dissipating discontinuity formed. This was attributed to the fact that, while the entropy of the exact solution is still constant as long as the solution is smooth, this exact solution will in general not be representable in our ansatz space. It is therefore wise to dissipate entropy to arrive at a function that still lies in our space, and certainly better than selecting an ansatz function that has more entropy than the true solution.

Fig. 4 The effect of \(\textrm{L}^2\) projection on jump recovery. A function u, jumping from 0 to 1 at \(\frac{1}{100}\), was \(\textrm{L}^2\) projected onto an ansatz space consisting of 4 cells with polynomials of degrees between 4 and 7. The dotted line of the function \(\textrm{P}_7u\), the projection of the jump onto polynomials of degree less than or equal to 7, underpredicts the jump. Lower order projections deliver significantly better approximations of the jump

A similar issue is that in the \(\textrm{L}^p\) norms, for \(p < \infty \), near each piecewise continuous solution \(u^T\) lies a \(\mathscr {C}^\infty \) function that can be constructed via mollification. Therefore an arbitrarily small perturbation of \(u^T\) in the usual norms leads to a vanishing entropy dissipation. Or, put differently, the dissipation bound as a functional is discontinuous on the \(\textrm{L}^p\) spaces. While unsatisfactory, let us remark that the functional is better behaved with respect to the BV seminorms. The discontinuity of the entropy dissipation bound is problematic for under-resolved solutions, where a lucky, or in this case rather unlucky, overly smooth approximation of the solution in our piecewise polynomial spaces induces wrong, i.e. too conservative, entropy dissipation predictions. One such situation is depicted in Fig. 4. There, a discontinuous u with a jump at \(x = \frac{1}{100}\) was projected onto the ansatz polynomials of 4 cells. The polynomial degree was varied between 4 and 7. The jump predicted by the lower order expansions is significantly bigger than the jump between the ansatz polynomials of higher degree. Therefore an entropy inequality predictor using the high-order expansion will deliver a slower dissipation speed than one that only uses the first 5 orthogonal basis functions, i.e. the Legendre polynomials up to degree 4 in the cell.

We therefore want to allow our entropy inequality predictor to also be greedy, or one could say pessimistic, with respect to an under-resolved solution. The key to this strengthening is the following lemma.

Lemma 4

(Order of the entropy dissipation bound) The maximal entropy dissipation prediction (20) of a Riemann problem for a smooth flux function with smooth entropy-entropy flux pair vanishes quadratically with the jump of u at the interface

$$\begin{aligned} \left| \sigma ^\theta \right| \in {{\,\mathrm{\mathscr {O}}\,}}(\left| \left| u_l - u_r\right| \right| ^2). \end{aligned}$$

Proof

As the entropy inequality holds, it is clear that the entropy dissipation is non-positive in the sense of distributions. As we only allow entropy dissipative solutions, the entropy dissipation on \(\theta \) is a non-positive constant for a fixed jump. On the other hand, (20) has to be zero for \(u_l = u_r\) and is smooth, implying that the line \(u_l = u_r\) consists of local maxima. Therefore, a Taylor expansion of (20) in \(u_r\) around \(u_l\) has to be of the form

$$\begin{aligned} \sigma ^\theta (u_l, u_r) = \left( u_l - u_r\big )\cdot \big (H (u_l - u_r)\right) + {{\,\mathrm{\mathscr {O}}\,}}\left( \left| \left| u_l - u_r\right| \right| ^3\right) \end{aligned}$$

with a negative semi-definite Hessian \(H \in \mathbb {R}^{m \times m}\). This shows the claim. \(\square \)

We are therefore in the comfortable position that even if our approximations of \(u_l, u_r\) only satisfy \(\left| \left| u_l - u_r\right| \right| \in {{\,\mathrm{\mathscr {O}}\,}}((\varDelta x)^p)\), the corresponding estimate converges significantly faster, with \(\sigma ^\theta (u_l, u_r) \in {{\,\mathrm{\mathscr {O}}\,}}((\varDelta x)^{2p})\). Our basic DG method approximates the solution \(u^Z\) in a Hilbert space that is spanned by polynomials on every cell. In this case a suitable orthonormal basis is given by the Legendre polynomials, and the limits of these basis representations are \(\textrm{L}^2\) functions. But, as explained before, our functional is not continuous on \(\textrm{L}^2\), and our ansatz \(u^Z\) is only an approximation of a projection of the true solution onto our ansatz space. We can therefore exploit different projections of our ansatz, especially projections that assume less regularity of \(u^Z\), and estimate our entropy dissipation with the strongest estimate encountered among these different approximations of \(u_l\) and \(u_r\). A natural choice for projections onto spaces assuming less regularity are projections onto lower order polynomials. As the Legendre polynomials on each cell, truncated at degree p, form an orthogonal basis of the polynomials of degree less than or equal to p, the orthogonal projection onto these spaces is given by discarding the higher order coefficients in the Legendre expansion of \(u^Z\). We can truncate down to degree \(p-1\) by discarding the highest coefficient and still achieve a convergence order of at least \(q = 2p-2 > p\) of our entropy inequality predictor for \(p>2\). This can be summed up in the following procedure, used for orders \(p > 2\).

  • Assign \(\sigma ^{\theta _{k + \frac{1}{2}}}_p = \sigma ^{\theta _{k + \frac{1}{2}}}_{u(x, t)}\).

  • Project the ansatz \(u^Z\) in every cell onto \(V^{Z, p-1}\) using an orthogonal projection

    $$\begin{aligned} u^{Z, p-1} = {{\,\mathrm{\mathbb {P}}\,}}_{V^{Z, p-1}} u(\cdot , t). \end{aligned}$$
  • Assign \(\sigma ^{\theta _{k + \frac{1}{2}}}_{p-1} = \sigma ^{\theta _{k + \frac{1}{2}}} _{ u^{p-1}(\cdot , t) }\).

  • Use

    $$\begin{aligned} \sigma ^{\theta _{k + \frac{1}{2}}} = \min \left( \sigma ^{\theta _{k + \frac{1}{2}}}_p,\sigma ^{\theta _{k + \frac{1}{2}}}_{p-1}\right) \end{aligned}$$
    (21)

    as entropy inequality prediction.
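The procedure above can be condensed into a few lines; `sigma_from` is a hypothetical callable that evaluates the entropy inequality predictor of this section from per-cell Legendre coefficient vectors.

```python
import numpy as np

# Sketch of the safeguard (21): evaluate the predictor once with the full
# Legendre expansions and once with the highest mode discarded (the orthogonal
# projection onto degree p-1), then keep the stronger prediction.
def predict_sigma(coeffs_left, coeffs_right, sigma_from):
    full = sigma_from(coeffs_left, coeffs_right)
    trunc = sigma_from(coeffs_left[:-1], coeffs_right[:-1])
    return min(full, trunc)   # the more negative dissipation estimate wins
```

Dropping the highest coefficient is exactly the orthogonal projection because the Legendre basis is orthogonal on each cell.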

Aliasing errors of the kind described above were not observed for low order DG schemes (\(p = 1, 2\)). We therefore do not see the limitation \(p > 2\) for the method above as an obstacle (Fig. 5).

3 Suitable Dissipation Directions and Filtering

Fig. 5 Use of alternative dissipation directions. The direction \(-{\frac{\partial {f} }{\partial {x}}}\) shall denote the \(\textrm{L}^2\) projection of the exact solution’s derivative \({\frac{\partial {u} }{\partial {t}}}\) onto V. Our approximation \({\frac{ \textrm{d}u^Z }{ \textrm{d}t}}\) is not entropy dissipative in this example and should be corrected into the entropy dissipative half-space characterized by the normal \(-{\frac{ \textrm{d}U }{ \textrm{d}u}}\). The direction \(\nabla ^2 u\), which stems from a discretization of the heat kernel, is suitable for this correction and has additional benefits compared to \(-{\frac{ \textrm{d}U }{ \textrm{d}u}}\). While the diffusion also has a smoothing effect, the addition of \(-{\frac{ \textrm{d}U }{ \textrm{d}u}}\) can even result in a sharpening effect. Higher even derivatives like \(\nabla ^8 u\) will smooth the solution but will not result in a dissipation for all entropies

After deriving approximations for the needed entropy dissipation, we will now determine how to correct the time derivative of the DG scheme to dissipate the required amount of entropy, while the resulting scheme should at the same time remain high order accurate for entropy conservative solutions. In the scalar case the direction of the steepest descent of the entropy, corrected for conservation, was used for this purpose. This approach incurs several problems:

  • The direction of steepest entropy descent has in general no smoothing/filtering effect.

  • It was proved in the previous publication that a correction in the steepest descent direction, with the length taken from an error indicator, results in an entropy inequality. The proof required highly technical arguments, and these arguments are coupled to the error indicator and the steepest descent direction [30].

  • Dissipation stems from the viscous and parabolic history of hyperbolic conservation laws. Let us define a viscous flux

    $$\begin{aligned} f_{\varepsilon }\left( u, {\frac{\partial {u} }{\partial {x}}}\right) = f(u) - \varepsilon {\frac{\partial {u} }{\partial {x}}} \end{aligned}$$

    associated with a viscous regularization of a hyperbolic conservation law. The viscous part of this flux is proportional to the gradient of the solution for fixed viscosity. If we fix a particular solution gradient \({\frac{\partial {u} }{\partial {x}}}\), the regularizing change to the flux will be proportional to the viscosity \(\varepsilon \). If a component of u is smooth with small first and second derivatives, the viscous flux of this component will differ from the hyperbolic flux only by a small margin. If our scheme is corrected with the steepest entropy descent direction, one can ask whether this correction can be expressed using some viscosity distribution \(\varepsilon (x)\) in the domain. In general it cannot. Even worse, the steepest descent direction of the entropy cannot be bounded using the first derivative of the respective component of the vector valued function u(x, t), which would correspond to an infinitely large viscosity.

All of the above reasons motivate us to devise alternative directions for the entropy correction. These alternative directions should have the following properties:

  • The dissipation direction should have a filtering effect. When the direction is used as \({\frac{\partial {u} }{\partial {t}}} = \upsilon (u)\) the high order modes of u should be dissipated.

  • The direction should be one of entropy decay, even if not that of steepest decay.

  • The dissipation should stem from a viscosity added to the hyperbolic flux.

Our new correction directions will be based on the construction of filters, i.e. operators that can regularize a solution u. A filter will in our case be a special Hilbert–Schmidt operator K [36].

Definition 1

(Filter) An operator \(K: \textrm{L}^2(\varOmega ) \rightarrow \textrm{L}^2(\varOmega )\) is said to be a filter if it is an integral operator whose pointwise evaluation results in a weighted average, i.e.

$$\begin{aligned}{}[K u] (x) = \int _\varOmega k(x, y) u(y) \textrm{d}y, \quad \text {with } \forall x \in \varOmega : \int _\varOmega k(x, y) \textrm{d}y = 1 \end{aligned}$$

is satisfied and the kernel k is of bounded Hilbert–Schmidt norm.

We are especially interested in conservative filters as they do not destroy the conservation of our basic schemes when they are applied on a per cell basis.

Lemma 5

(Conservative filter) A filter \(K: \textrm{L}^2(\varOmega ) \mapsto \textrm{L}^2(\varOmega )\) is conservative

$$\begin{aligned} \int _\varOmega [K u](x) \textrm{d}x = \int _\varOmega u(x) \textrm{d}x \end{aligned}$$

if it can be written as an integral operator with a kernel with mass one, i.e.

$$\begin{aligned}{}[K u] (x) = \int _\varOmega k(x, y) u(y) \textrm{d}y, \quad \text {with } \forall y\in \varOmega : \int _\varOmega k(x, y) \textrm{d}x = 1. \end{aligned}$$

Proof

Using Fubini’s theorem shows

$$\begin{aligned} \int _\varOmega [K u](x) \textrm{d}x = \int _\varOmega \int _\varOmega k(x, y) u(y) \textrm{d}y \textrm{d}x = \int _\varOmega \underbrace{\int _\varOmega k(x, y) \textrm{d}x}_{=1} \, u(y) \textrm{d}y = \int _\varOmega u(y) \textrm{d}y \end{aligned}$$

in this case. \(\square \)

Please note that the weighted average property is stated using the integration w.r.t. the second variable while the conservation results from the unit measure in the first variable. Obviously a convolution with a convolution kernel satisfying

$$\begin{aligned} \int _\mathbb {R}k(y) \textrm{d}y = 1 \end{aligned}$$

satisfies both as \(k(x, y) = k(x -y)\) holds in this case, but not every operator satisfying these properties is a convolution. Especially when one is interested in bounded domains, convolutions are not an option, but there still exist suitable smoothing operators.

Theorem 1

(Universally dissipative filters) A conservative filter K is dissipative for all convex entropies U,

$$\begin{aligned} E_{K u} = \int _\varOmega U([K u](x)) \textrm{d}x \le \int _\varOmega U(u(x)) \textrm{d}x = E_u, \end{aligned}$$

if it can be written as a conservative filter with a positive kernel, i.e.

$$\begin{aligned}{}[K u] (x) = \int _\varOmega k(x, y) u(y) \textrm{d}y \end{aligned}$$

with \(\forall x, y: k (x, y) \ge 0\).

Proof

Using Jensen’s inequality [46], the positivity and the conservation of the filter allow us to show

$$\begin{aligned} \begin{aligned} \int _\varOmega U([K u](x)) \textrm{d}x =&\int _\varOmega U\left( \int _\varOmega k(x, y) u(y) \textrm{d}y \right) \textrm{d}x \le \int _\varOmega \int _\varOmega k(x, y)U(u(y)) \textrm{d}y \textrm{d}x \\ =&\int _\varOmega \underbrace{\int _\varOmega k(x, y) \textrm{d}x}_{ = 1} U(u(y)) \textrm{d}y = \int _\varOmega U(u(y)) \textrm{d}y \end{aligned}. \end{aligned}$$

\(\square \)

This theorem shows that the first and second bullet above can be satisfied by an integral operator with a suitable kernel. An example of a dissipation that can be identified with a positive conservative filter is the filtering by the time evolution of

$$\begin{aligned} {\frac{\partial {u} }{\partial {t}}} = \varepsilon \nabla ^2_x u \end{aligned}$$

on the entire domain, as the associated filter has the heat kernel as kernel function [16],

$$\begin{aligned} k^t(x, y) = h(x-y, t), \quad h(x, t) = \frac{\textrm{e}^{-\frac{\left| x\right| ^2}{4t}}}{(4 \pi t)^{n/2}}, \quad t > 0. \end{aligned}$$

Further, this filtering obviously stems from viscosity and therefore has a direct physical interpretation. It is known that while a positive conservative filter always dissipates entropy, a high order finite-difference implementation of a second derivative will not dissipate all entropies [43], and similar theorems hold for higher even derivatives even in the analytic case. We will therefore outline how to construct a filter that is dissipative in the semidiscrete and fully discrete setting and can therefore be used as a descent direction. We begin by stating discrete equivalents of the theorems above and will analyze whether the usual dissipations/filters satisfy this property. We will assume that \(\omega _k \ge 0\) is a positive quadrature rule on the cell Z for the rest of the chapter, and all notions of conservation for our filters will be centered around being conservative with respect to this quadrature rule. For a general DG method with a dense mass matrix a quadrature can be calculated via \(\sum _l M_{lk} = \omega _k\), i.e. by entering the constant one into the discretised inner product, but positivity is not guaranteed in general. A general view of our plan is to discretise not the second derivative itself, but its action as the generator of a Hilbert–Schmidt operator. We will therefore, when given a discrete filter, also consider its (discrete) generator.
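For instance, for the dense \(P_1\) mass matrix on \([0, 1]\) the recipe \(\omega _k = \sum _l M_{lk}\) recovers the trapezoidal weights; a minimal numpy check:

```python
import numpy as np

# Quadrature construction w_k = sum_l M_lk: the column sums of the dense
# P1 mass matrix on [0, 1] recover the (positive) trapezoidal weights.
M = np.array([[2.0, 1.0], [1.0, 2.0]]) / 6.0   # M_lk = <phi_l, phi_k>
w = M.sum(axis=0)                              # equals [0.5, 0.5]
```

As noted in the text, positivity of the resulting weights is not guaranteed for arbitrary bases; here it happens to hold.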

Definition 2

(Conservative and positive filter generator) Let \(G \in \mathbb {R}^{(p+1) \times (p+1)}\) be a square matrix. We call this matrix a filter generator if

$$\begin{aligned} \forall k \in \{ 1, \dots , p+1\}: \quad \sum _{l=1}^{p+1} G_{kl} = 0 \end{aligned}$$

holds. It will be conservative if

$$\begin{aligned} \forall l \in \{ 1, \dots , p+1\}: \quad \sum _{k=1}^{p+1} \omega _k G_{kl} = 0 \end{aligned}$$

is satisfied. Further, we call it positive, if

$$\begin{aligned} \forall l \in \{ 1, \dots , p+1\},\quad \forall k \in \{ 1, \dots , l-1, l+1, \dots , p+1\}: \quad G_{kl} \ge 0 \end{aligned}$$

holds.

Definition 3

(Discrete conservative and positive filter) We call a matrix \(\varUpsilon \in \mathbb {R}^{(p+1) \times (p+1)}\) a filter, if

$$\begin{aligned} \forall k \in \{ 1, \dots , p+1\}: \quad \sum _{l=1}^{p+1} \varUpsilon _{kl} = 1 \end{aligned}$$

holds. It is termed conservative, if

$$\begin{aligned} \forall l \in \{ 1, \dots , p+1\}: \quad \sum _{k=1}^{p+1} \omega _k \varUpsilon _{kl} = \omega _l \end{aligned}$$

is satisfied. Further, we call it positive, if

$$\begin{aligned} \forall k, l \in \{ 1, \dots , p+1\}:\quad \varUpsilon _{kl} \ge 0. \end{aligned}$$

Obviously, the definition of the conservative positive discrete filter mirrors the definition of such a filter in the continuous case using the quadrature rule. The definition of the averaging property on the other hand is not based on the quadrature rule, as this rule is not used when applying the filter pointwise

$$\begin{aligned} \upsilon _k = \sum _{l = 1}^{p+1} \varUpsilon _{kl} u_l. \end{aligned}$$
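The three properties of Definition 3 translate directly into row sum and weighted column sum checks; a small sketch with an illustrative matrix and weights:

```python
import numpy as np

# Sketch: checking Definition 3 for a candidate discrete filter Y with
# quadrature weights w (both purely illustrative).
def is_filter(Y):
    return bool(np.allclose(Y.sum(axis=1), 1.0))   # unit row sums: averaging

def is_conservative(Y, w):
    return bool(np.allclose(w @ Y, w))             # sum_k w_k Y_kl = w_l

def is_positive(Y):
    return bool(np.all(Y >= 0.0))
```

Note the asymmetry stressed in the text: the averaging property uses plain row sums, while conservation is weighted by the quadrature rule.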

Forward Euler steps connect the generators defined above with the filters, as we will see in the lemma below.

Lemma 6

(Connecting generators and filters) It holds

$$\begin{aligned} G \text { conservative as generator } \implies \varUpsilon = {{\,\mathrm{\textrm{I}}\,}}+ \varDelta t G \text { conservative as filter.} \end{aligned}$$

Let further \(\varDelta t \max _l \left| G_{ll}\right| \le 1\). Then it follows

$$\begin{aligned} G \text { positive as generator} \implies \varUpsilon \text { positive as filter}. \end{aligned}$$

Proof

We begin by showing the conservativity and filter property. It holds

$$\begin{aligned} \sum _{l=1}^{p+1} {{\,\mathrm{\textrm{I}}\,}}_{kl} = 1 \implies \sum _{l=1}^{p+1} \varUpsilon _{kl} = \sum _{l=1}^{p+1} ({{\,\mathrm{\textrm{I}}\,}}+ \varDelta t G)_{kl} = 1. \end{aligned}$$

As the identity is conservative, it follows

$$\begin{aligned} \sum _{k=1}^{p+1} \omega _k {{\,\mathrm{\textrm{I}}\,}}_{kl} = \omega _l \implies \sum _{k=1}^{p+1} \omega _k \varUpsilon _{kl} = \sum _{k=1}^{p+1} \omega _k ({{\,\mathrm{\textrm{I}}\,}}+ \varDelta t G)_{kl} = \omega _l. \end{aligned}$$

The positivity follows, as for non-diagonal elements,

$$\begin{aligned} k \ne l \implies \varUpsilon _{kl} = (I + \varDelta t G)_{kl} = \varDelta t G_{kl} \ge 0 \end{aligned}$$

is satisfied for any positive time step size, while the given restriction is needed to enforce

$$\begin{aligned} \varUpsilon _{ll} = (I + \varDelta t G)_{ll} \ge 1 - \varDelta t \left| G_{ll}\right| \ge 0. \end{aligned}$$

\(\square \)
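Lemma 6 can be illustrated numerically; the generator G and weights w below are toy choices satisfying Definition 2, and the step size is chosen at the limit of the restriction.

```python
import numpy as np

# Sketch of lemma 6: a conservative positive generator G turns into a
# positive conservative filter under a forward Euler step obeying
# dt * max|G_ll| <= 1. G and w are illustrative.
def euler_filter(G, dt):
    return np.eye(G.shape[0]) + dt * G

G = np.array([[-2.0, 2.0], [2.0, -2.0]])   # zero row sums, positive off-diagonal
w = np.array([0.5, 0.5])                   # w @ G = 0: conservative generator
dt = 1.0 / np.abs(np.diag(G)).max()        # largest admissible step, here 0.5
Y = euler_filter(G, dt)
```

For larger steps the diagonal of Y turns negative and the filter property of positivity is lost.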

It is clear that a discrete filter that is positive and conservative is also dissipative, by reiterating the arguments given above for the continuous case. Sadly, it is also true that while in the continuous case the filter generated by the second derivative, i.e. the heat kernel, is positive, the second derivative discretised in our DG method is not a positive generator and also does not directly induce a positive filter. We will therefore show how to design a generator that is an approximation of the heat kernel for forward Euler steps, thereby even allowing us to prove the dissipativity of the correction operator for finite time steps. The basis will be the heat equation with varying heat conductivity \(\alpha (x)\) [28]

$$\begin{aligned} {\frac{\partial {u} }{\partial {t}}} = \sum _{k=1}^n {\frac{\partial {~} }{\partial {x_k}}} \left( \alpha (x) {\frac{\partial {u} }{\partial {x_k}}}\right) , \quad \alpha {\frac{\partial {u} }{\partial {n}}}\Bigg |_{\partial T} = 0, \end{aligned}$$

on the (reference) element in conjunction with Neumann boundary conditions. The Neumann boundary conditions enforce the conservation of the resulting solution operator as any change of the cell mean values must happen through the numerical flux of the basic DG method. We discretize this problem with the nodal basis of our DG method [28]

$$\begin{aligned} {\frac{ \textrm{d}u^T }{ \textrm{d}t}} = -M^{-1} Q u^T, \quad Q_{kl} = \int _T \left( {\frac{\partial {\varphi _k} }{\partial {x}}}\big )\cdot \big (\alpha (x) {\frac{\partial {\varphi _l} }{\partial {x}}}\right) \textrm{d}x. \end{aligned}$$
(22)

This is a continuous Galerkin discretization, as we consider only a single element. In general there exists no \(\varDelta t > 0\) for which \({{\,\mathrm{\textrm{I}}\,}}+ \varDelta t (-M^{-1} Q)\) is a positive operator, because the negative off-diagonal elements of \(-M^{-1} Q\) prohibit it from being a positive generator. Yet the following theorem shows that the exact ODE solution of this problem for a sufficiently large \(t > 0\) is in fact eligible as a filter.

Theorem 2

If the quadrature \(\omega \) is exact on \(V^Z\), the solution of (22) for a positive initial condition \(u_0 \in \mathbb {R}^{p+1}\) satisfies for all \(t > 0\)

  • \(\sum _{k = 1}^{p+1} \omega _k u_k(t) = \sum _{k=1}^{p+1} \omega _k u_k(0)\) (conservation),

  • \(u_k(t) = C_{kl} (t) u_l(0)\) with \(\forall k\in \{ 1, \dots , p+1\}: \quad \sum _{l = 1}^{p+1} C_{kl} = 1\) (averaging property).

Further, for a \(t > 0\) big enough it follows \(\forall k \in \{ 1, \dots , p+1\}: \quad u_k(t) \ge 0\).

Proof

Entering \(v = 1\) into the weak form results in

$$\begin{aligned} \int _T {\frac{\partial {u} }{\partial {t}}} \textrm{d}x = -\int _T {\frac{\partial {1} }{\partial {x}}} \alpha (x) {\frac{\partial {u} }{\partial {x}}} \textrm{d}x = 0. \end{aligned}$$

As the quadrature is exact for the basis functions the same follows for the discretisation, and this shows the conservation. The matrix C used to describe the solution has the explicit form [36, sec. 34]

$$\begin{aligned} u(t) = \underbrace{\textrm{e}^{-t M^{-1}Q}}_{C(t)}u(0). \end{aligned}$$

Multiplying this matrix from the right with the vector \(v \in V\) representing the constant function 1 reveals

$$\begin{aligned} M^{-1}Q v = 0 \implies \textrm{e}^{-t M^{-1}Q} v = {{\,\mathrm{\textrm{I}}\,}}v = v. \end{aligned}$$

This already shows the second result, as the nodal representation of C(t) must have unit row sums. The matrix \(-Q\) is negative semi-definite, and the vector v lies in its null space. If another, linearly independent \(u \in V\) were in its null space, it would follow

$$\begin{aligned} \left( u\big )\cdot \big (Q u\right) = \left\langle {{\frac{\partial {u} }{\partial {x}}}}, {\alpha {\frac{\partial {u} }{\partial {x}}}} \right\rangle _T = \int _T \alpha (x) \left| {\frac{\partial {u} }{\partial {x}}}\right| ^2 \textrm{d}x = 0 \end{aligned}$$

and this contradicts \({\frac{\partial {u} }{\partial {x}}} \ne 0\), which holds as u was assumed to be non-constant. Therefore, there exists an orthonormal eigendecomposition of the discretisation whose eigenvalues, apart from the eigenvalue \(\lambda _1 = 0\) of the constant eigenfunction \(\psi _1 = v\), are bounded away from zero,

$$\begin{aligned} \forall k \in \{ 1, \dots , p+1\}: \quad M^{-1} Q \psi _k = \lambda _k \psi _k. \end{aligned}$$

We assume that the eigenvectors are sorted by increasing absolute value of the corresponding eigenvalues,

$$\begin{aligned} 0 = \lambda _1< \left| \lambda _2\right| \le \left| \lambda _3\right| \le \dots \le \left| \lambda _{p+1}\right| . \end{aligned}$$

The solution

$$\begin{aligned} u(t) = \sum _{k=1}^{p+1} \textrm{e}^{- \lambda _k t} \psi _k \left\langle {\psi _k}, {u(0)} \right\rangle \end{aligned}$$

therefore converges to the average of u(0), as

$$\begin{aligned} \left| \left| u(t) - \psi _1\left\langle {\psi _1}, {u(0)} \right\rangle \right| \right| ^2 = \left| \left| \sum _{k = 2}^{p+1} \textrm{e}^{- \lambda _k t} \psi _k \left\langle {\psi _k}, {u(0)} \right\rangle \right| \right| ^2 \le \textrm{e}^{-2 \lambda _2 t} \left| \left| u_0\right| \right| ^2 \end{aligned}$$

holds. Because a positive initial condition has a positive average, the solution converges to this positive average and is therefore positive for t big enough. \(\square \)

Using the theorem above we can construct filters \(\varUpsilon \) simply by calculating the matrix C(t) used in the proof above. This matrix, which maps an initial state onto the solution at time t, is always a conservative filter, and for t large enough also a positive one. In the implementation a suitable t was found using a bisection algorithm. Using \(G = (C(t) - {{\,\mathrm{\textrm{I}}\,}})/t\) the corresponding generator can be found. We note in passing that numerous other possibilities exist to define a positive conservative filter as defined above, but the method given above defines a filter that can be associated with viscosity.
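A possible realization of this construction is sketched below in numpy, under the assumption that the nodal mass matrix M and stiffness matrix Q of one cell are given: the propagator \(C(t) = \exp (-t M^{-1} Q)\) is formed via a symmetric eigendecomposition, and a small t rendering it entrywise non-negative is located by bisection. The helper name and tolerances are illustrative.

```python
import numpy as np

# Sketch: build C(t) = exp(-t M^{-1} Q) and bisect for a small t that makes
# it an entrywise positive filter. M, Q are the nodal mass and stiffness
# matrices of a single cell; names and tolerances are illustrative.
def positive_filter(M, Q, t_hi=100.0, iters=60, tol=1e-13):
    # M^{-1} Q is similar to the symmetric S = M^{-1/2} Q M^{-1/2}
    lam, V = np.linalg.eigh(M)
    M_half = (V * np.sqrt(lam)) @ V.T
    M_ihalf = (V / np.sqrt(lam)) @ V.T
    mu, W = np.linalg.eigh(M_ihalf @ Q @ M_ihalf)

    def C(t):  # exp(-t M^{-1} Q) = M^{-1/2} W exp(-t mu) W^T M^{1/2}
        return M_ihalf @ (W * np.exp(-t * mu)) @ W.T @ M_half

    assert np.all(C(t_hi) >= -tol), "t_hi too small for positivity"
    t_lo = 0.0
    for _ in range(iters):          # bisect toward the smallest admissible time
        t_mid = 0.5 * (t_lo + t_hi)
        if np.all(C(t_mid) >= -tol):
            t_hi = t_mid
        else:
            t_lo = t_mid
    return C(t_hi), t_hi
```

Conservation and the averaging property hold for every t by the theorem, so the bisection only has to secure positivity.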

Lemma 7

Assume that the null space of G consists only of constants. Then for a non-constant u and a strictly convex entropy U it holds

$$\begin{aligned} \left\langle {{\frac{ \textrm{d}U }{ \textrm{d}u}}}, {G u} \right\rangle _{Z, \omega } < 0. \end{aligned}$$

If U is just convex, only \(\ll \)\(\le \)\(\gg \) applies in the equation above.

Proof

The discrete dissipativity

$$\begin{aligned} \begin{aligned} E^Z(u + \varDelta t \lambda G u)&= \sum _{k = 1}^{p+1} \omega _k U\left( \sum _{l=1}^{p+1} ({{\,\mathrm{\textrm{I}}\,}}+ \varDelta t \lambda G)_{kl} u_l \right) \\&< \sum _{k = 1}^{p+1} \omega _k \sum _{l = 1}^{p+1} ({{\,\mathrm{\textrm{I}}\,}}+ \varDelta t \lambda G)_{kl} U(u_l) = \sum _{l=1}^{p+1} \omega _l U(u_l) = E^Z(u) \end{aligned} \end{aligned}$$

follows from the positive conservative filter property of \({{\,\mathrm{\textrm{I}}\,}}+ \varDelta t \lambda G\) for \(\lambda \varDelta t > 0\) small enough, as in lemma 6, in conjunction with the strict convexity and Jensen’s inequality in the strict sense. Let now \(\varDelta t\) be fixed and small enough for all \(\lambda \in [0, 1]\), and denote by \(\varepsilon = E^Z(u + \varDelta t G u) - E^Z(u) < 0\) the entropy dissipation for \(\lambda = 1\). The convexity of U implies with \( u + \lambda \varDelta t Gu = (1-\lambda ) u + \lambda (u + \varDelta t Gu)\)

$$\begin{aligned} \begin{aligned} E^Z((1-\lambda ) u + \lambda (u + \varDelta t Gu))&\le (1- \lambda ) E^Z(u) + \lambda E^Z(u + \varDelta t G u),\\ E^Z(u + \lambda \varDelta t Gu)&\le E^Z(u) + \lambda \left( E^Z(u + \varDelta t Gu) - E^{Z}(u)\right) = E^Z(u) + \lambda \varepsilon . \end{aligned} \end{aligned}$$

Entering this into the definition of the derivative of \(E^Z\) with respect to \(\lambda \) shows

$$\begin{aligned} {\frac{ \textrm{d}E^Z(u + \lambda \varDelta t Gu) }{ \textrm{d}\lambda }} = \lim _{\lambda \rightarrow 0} \frac{E^Z(u+\lambda \varDelta t Gu) - E^Z(u)}{\lambda } \le \varepsilon , \end{aligned}$$
(23)

and therefore

$$\begin{aligned} \left\langle {{\frac{ \textrm{d}U }{ \textrm{d}u}}}, {Gu} \right\rangle _{Z, \omega } = \sum _{k = 1}^{p+1} \omega _k \left( {\frac{ \textrm{d}U }{ \textrm{d}u}} (x_k)\big )\cdot \big ((G u)_k\right) ={\frac{ \textrm{d}E^Z(u + \lambda \varDelta t Gu) }{ \textrm{d}\lambda }} \frac{1}{\varDelta t} \le \frac{\varepsilon }{\varDelta t}. \end{aligned}$$
(24)

If U is not strictly convex the case \(\varepsilon = 0\) is possible, reducing the result to \(\ll \)\(\le \)\(\gg \). \(\square \)

The result above is not only of theoretical value. The equations (23), (24) show

$$\begin{aligned} \left\langle {{\frac{ \textrm{d}U }{ \textrm{d}u}}}, {\varDelta t Gu} \right\rangle _{Z, \omega } = {\frac{ \textrm{d}E^Z(u + \lambda \varDelta t Gu) }{ \textrm{d}\lambda }} \le \varepsilon = E^Z(u + \varDelta t G u) - E^Z(u). \end{aligned}$$
(25)

We can therefore refrain from evaluating the effectivity of the correction direction \(\upsilon = Gu\) via the direct computation of \(\partial _{\lambda }{E^Z(u + \lambda \varDelta t Gu)}\). Instead it suffices to enter u and \(u + \varDelta t Gu\) into the definition of the total entropy and use a finite-difference approximation of the derivative. This approximation will predict a smaller effectivity, i.e. a less negative value. The last step consists of selecting a suitable viscosity distribution \(\alpha \). We choose \(\alpha (x) = 1\) for simplicity. Please note that this viscosity is not zero at the boundary \(\partial Z\). The derivation of the weak form above assumes zero flux over the surface of the cell \(\partial Z\). Therefore the discrete weak form is not equivalent to the discrete strong form via summation by parts, as the boundary terms are missing. We are therefore sure that this is not equivalent to adding a parabolic term discretised using continuous finite elements. The matrix G is not a suitable discretisation of a second derivative: any high order discretisation of a second derivative would employ negative weights on non-diagonal positions of its difference matrix [43], in direct contradiction to the positivity of G. Experiments of the author with dissipation matrices constructed from subcells showed that these are also suitable. Still, it was felt that the presented process is the most natural one for DG, while subcells are the natural construction procedure for spectral volume methods.
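The finite-difference safeguard based on (25) can be sketched in a few lines; names are illustrative and U is an entropy applied elementwise.

```python
import numpy as np

# Sketch of the safeguarded effectivity estimate from (25): the divided
# difference of the total cell entropy bounds <dU/du, G u>_w from above,
# so using it instead of the exact inner product is safe w.r.t. roundoff.
def effectivity_bound(u, G, w, dt, U):
    E = lambda v: np.dot(w, U(v))              # quadrature total entropy
    return (E(u + dt * (G @ u)) - E(u)) / dt   # >= <dU/du, G u>_w
```

By Lemma 7 the exact inner product is non-positive, and the bound above is never more negative than it.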

3.1 Stable Computation of the Correction Size Required and Timestep Restrictions

After we have calculated the entropy dissipation needed and a suitable direction \(\upsilon = G u^Z\), one would guess that we only have to calculate \(\lambda \) as in equation (7) via

$$\begin{aligned} \lambda \ge \frac{\sigma ^Z - \left( \left\langle {{\frac{ \textrm{d}U }{ \textrm{d}u}}}, {{\frac{ \textrm{d}u }{ \textrm{d}t}}} \right\rangle _{Z, \omega } - \left( F^*_l - F^*_r \right) \right) }{ \left\langle {{\frac{ \textrm{d}U }{ \textrm{d}u}}}, {\upsilon } \right\rangle _{Z, \omega }} \end{aligned}$$

with \(\sigma ^Z\) from equation (21). It turns out that this process is significantly more intricate than one would expect as this computation has to be stable with respect to roundoff errors. Further, our entropy inequality predictors can only estimate the entropy dissipation that can take place at the interface between two adjacent cells, but are not able to give information on how this dissipation is split between the two cells. Our method of calculating suitable values of \(\lambda ^Z\) therefore consists of two steps. First,

$$\begin{aligned} \lambda _{\textrm{ED}}^Z = \max \left( 0, -\frac{ \left\langle {{\frac{ \textrm{d}U }{ \textrm{d}u}}}, {{\frac{ \textrm{d}u }{ \textrm{d}t}}} \right\rangle _{Z, \omega } - \left( F^*_l - F^*_r \right) }{ \left\langle {{\frac{ \textrm{d}U }{ \textrm{d}u}}}, {\upsilon } \right\rangle _{Z, \omega }} \right) \end{aligned}$$
(26)

is calculated to enforce the per cell entropy dissipativity

$$\begin{aligned} \left\langle {{\frac{ \textrm{d}U }{ \textrm{d}u}}}, { {\frac{ \textrm{d}u }{ \textrm{d}t}} + \lambda _{\textrm{ED}}^Z \upsilon } \right\rangle _{Z, \omega } \le F^*_l - F^*_r. \end{aligned}$$

In a second step a correction to enforce an entropy rate high enough

$$\begin{aligned} \lambda _{\textrm{ER}}^{\theta } = \max \left( 0, \frac{ \sigma ^{\theta } - \sum _{Z \cap \theta \ne \emptyset }\left( \left\langle {{\frac{ \textrm{d}U }{ \textrm{d}u}}}, {{\frac{ \textrm{d}u }{ \textrm{d}t}} +\lambda _{\textrm{ED}}^Z \upsilon } \right\rangle _{Z, \omega } - (F^*_{l, Z} - F^*_{r, Z})\right) }{ \sum _{\theta \cap Z \ne \emptyset }\left\langle {{\frac{ \textrm{d}U }{ \textrm{d}u}}}, {\upsilon } \right\rangle _{Z, \omega } } \right) \end{aligned}$$
(27)

is determined for all \(\theta \in \varTheta \). Both corrections are then added together

$$\begin{aligned} \lambda ^Z_{\varSigma } = \lambda ^{Z}_{\textrm{ED}} + \sum _{\theta \cap Z \ne \emptyset } \lambda ^{\theta }_{\textrm{ER}} \end{aligned}$$

for all cells \(Z \in \mathscr {Z}\). Round-off errors tend to influence this calculation for two reasons. First, the division by \(\left\langle {{\frac{ \textrm{d}U }{ \textrm{d}u}}}, { \upsilon } \right\rangle \) in equations (26) and (27) can approach a division by zero for a solution that is nearly constant in the cell, as \(\upsilon \rightarrow 0\) follows in this case. Second, we saw in lemma 1 that the entropy inequality predictor can vanish with a high order for smooth solutions, and an accurate DG scheme will also have an entropy error that tends to zero with a high order. The difference of these two values, i.e. the numerator of the fraction above, will in general not vanish that fast because round-off in the difference becomes important. Therefore \(\lambda \) will, for highly resolved smooth solutions, be too big because round-off errors propagate into the calculation. Our solution to this problem is to calculate

$$\begin{aligned} \lambda _{(\cdot )}^Z = \max \left( \frac{ab}{b^2 + c^2}, 0\right) , \text { instead of } \frac{a}{b} \end{aligned}$$

every time a \(\lambda \) is calculated by a division in the procedure above. Here, a denotes the numerator, b the denominator and c a suitable bound on the round-off error - a constant small with respect to a and b but large with respect to the machine precision. In our implementation it is selected as \(c = \sqrt{10^{-16}}\), i.e. the square root of the machine precision for a solution scaled to be of unit magnitude. The addition of c can be seen as a one-dimensional version of Tikhonov regularization [33]. Clipping the calculation of \(\lambda \) at 0 ensures that if a or b becomes negative from rounding errors \(\lambda \) will not become negative, i.e. \(\lambda \upsilon \) will not be antidissipative. In a last step,

$$\begin{aligned} \lambda ^Z = \min \left( \lambda _\textrm{max},\lambda ^Z_{\varSigma } \right) , \end{aligned}$$

the upper limit \(\lambda _{\text {max}}\) is introduced for stability reasons as we want to enforce stability of

$$\begin{aligned} {\frac{ \textrm{d}u^Z }{ \textrm{d}t}} = \upsilon = \lambda ^Z G u^Z. \end{aligned}$$
(28)

If a Runge–Kutta time integration method can be written as a convex combination of forward Euler steps, i.e. is Strong Stability Preserving (SSP) [23, 47, 48], and the time-steps satisfy \(\varDelta t \lambda \le 1\) during every Euler step, lemma 6 allows us to show that the solution is also entropy dissipative in the discrete case. If the time integration method used is just a conditionally stable Runge–Kutta method [13, 61] we are interested in limiting the operator norm \(\varDelta t \left| \left| \lambda G\right| \right| \le R\) in order to at least avoid a linear instability. The exact size depends on the stability region of the time integration method, as we would like to fit the half-circle

$$\begin{aligned} C = \{z \in \mathbb {C}| \left| \left| z\right| \right| \le R \wedge {{\,\textrm{im}\,}}z \le 0\} \end{aligned}$$

into the stability region of the method.
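The round-off-hardened correction-size calculation of this section can be sketched in a few lines. The function names are ours, the single-cell form of equation (27) is a simplification (the paper sums over all cells meeting a point \(\theta \)), and `C_REG` is the regularisation constant \(c = \sqrt{10^{-16}}\) from the text:

```python
C_REG = 1e-8  # square root of double machine precision, cf. the text

def safe_ratio(a, b, c=C_REG):
    """Tikhonov-regularised division max(a*b/(b^2 + c^2), 0).

    Replaces a/b so that b -> 0 (nearly constant cells) and sign flips
    of a or b caused by rounding cannot produce a huge or negative
    (antidissipative) correction size."""
    return max(a * b / (b * b + c * c), 0.0)

def correction_size(entropy_rate, flux_diff, effectivity, sigma, lam_max):
    """Two-step lambda as in equations (26), (27), clipped at lambda_max.

    entropy_rate : <dU/du, du/dt>_{Z,omega} for the cell
    flux_diff    : F*_l - F*_r
    effectivity  : <dU/du, upsilon>_{Z,omega} (negative for dissipation)
    sigma        : entropy dissipation required by the predictor
    """
    # (26): enforce per-cell entropy dissipativity
    lam_ed = safe_ratio(-(entropy_rate - flux_diff), effectivity)
    # (27): enforce a high enough entropy rate (single-cell stencil here)
    residual = entropy_rate + lam_ed * effectivity - flux_diff
    lam_er = safe_ratio(sigma - residual, effectivity)
    # sum both corrections and clip at the stability limit lambda_max
    return min(lam_max, lam_ed + lam_er)
```

For an entropy-producing cell (`entropy_rate - flux_diff > 0`) and a dissipative direction (`effectivity < 0`) both numerator and denominator are negative, so the regularised ratio comes out positive, as required.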

4 Numerical Tests

4.1 Final Algorithm

The last sections described our (refined) methods to estimate the maximal entropy dissipation possible and our refined dissipation directions. These are combined with the orthogonal projection approach to define a stable entropy dissipation functional and our round-off hardened \(\lambda \) calculation procedure. We will now describe the final algorithm, as all of these pieces have to be plugged together to form the final scheme. First, a startup has to be carried out:

  1. Calculate the reference element mass matrix M.

  2. Determine the reference element first derivative stiffness matrix S.

  3. Find the dissipation generator G:

    • Calculate the Laplacian stiffness matrix Q for the reference element in conjunction with the heat conductivity \(\alpha (x)\).

    • Solve \({\frac{ \textrm{d}v }{ \textrm{d}t}} = -M^{-1}Q v\) forward in time with \(v(0) = {{\,\mathrm{\textrm{I}}\,}}\) the identity matrix as initial condition.

    • Determine a value for t using bisection that is minimal (to floating point accuracy) but also satisfies \((v(t))_{kl} \ge 0\) elementwise.

    • Set the dissipation operator to \(G = (v(t) - {{\,\mathrm{\textrm{I}}\,}})/t.\)

  4. Initialize the grid.

  5. \(\textrm{L}^2\)-project the initial condition onto the ansatz polynomials in every cell.
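Startup step 3 can be sketched in pure Python. For illustration M is taken as the identity and Q as a periodic fourth-order second-difference stiffness matrix on five nodes - a matrix whose negative off-diagonal weights are exactly the situation discussed above; in the actual scheme M and Q would be the reference-element mass and Laplacian stiffness matrices. Forward Euler stands in for a more careful time integrator, and all names are ours:

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def heat_solve(Q, t, steps=400):
    """Integrate dv/dt = -Q v with v(0) = I up to time t by forward Euler."""
    n = len(Q)
    v = [[float(i == j) for j in range(n)] for i in range(n)]
    dt = t / steps
    for _ in range(steps):
        Qv = matmul(Q, v)
        v = [[v[i][j] - dt * Qv[i][j] for j in range(n)] for i in range(n)]
    return v

def is_nonnegative(v, tol=-1e-12):
    return all(entry >= tol for row in v for entry in row)

def dissipation_generator(Q, t_hi=10.0, iters=30):
    """Bisect for the minimal t with v(t) >= 0 elementwise, G = (v(t) - I)/t."""
    t_lo = 0.0
    for _ in range(iters):
        t_mid = 0.5 * (t_lo + t_hi)
        if is_nonnegative(heat_solve(Q, t_mid)):
            t_hi = t_mid  # still elementwise nonnegative: try a smaller t
        else:
            t_lo = t_mid
    v = heat_solve(Q, t_hi)
    n = len(Q)
    return [[(v[i][j] - float(i == j)) / t_hi for j in range(n)]
            for i in range(n)]

# fourth-order periodic stiffness, row stencil (30, -16, 1, 1, -16)/12
row = [30.0 / 12.0, -16.0 / 12.0, 1.0 / 12.0, 1.0 / 12.0, -16.0 / 12.0]
Q = [[row[(j - i) % 5] for j in range(5)] for i in range(5)]
G = dissipation_generator(Q)
```

The resulting G has nonnegative off-diagonal entries, a nonpositive diagonal and zero row sums, i.e. it dissipates while conserving the cell mean.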

One evaluation of the scheme in semidiscrete form for \(p > 2\) is done using the following enumerated steps:

  1. Calculate the time derivative of the coefficients using a standard DG method as in equation (3).

  2. Estimate the maximal possible entropy dissipation per cell edge:

    (a) Evaluate the entropy inequality predictor on the values at the cell edges as in equation (20).

    (b) \(\textrm{L}^2\)-project the ansatz polynomial to a polynomial of one degree lower as in Sect. 2.3.

    (c) Evaluate the entropy inequality predictor on the values at the cell edges of the projected polynomials as in equation (20).

    (d) Take the minimal value of both (the fastest dissipation) as prediction as in Sect. 2.3.

  3. Determine the correction direction and size:

    (a) Calculate a dissipation direction for every cell via the application of G to u as \(\upsilon = Gu\).

    (b) Determine the effectivity of the dissipation direction \(\upsilon \) per cell, i.e. the denominator in equation (26), or approximate it using equation (25).

    (c) Find the total application of the dissipation direction needed from equations (26) and (27).

  4. Apply the correction \({\frac{ \textrm{d}u }{ \textrm{d}t}} = {\frac{ \textrm{d}u }{ \textrm{d}t}} + \lambda \upsilon \).
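Steps 2(b) and 2(d) above become one-liners under the assumption of a modal Legendre basis: since Legendre polynomials are \(\textrm{L}^2\)-orthogonal, the projection to one polynomial degree lower simply zeroes the highest mode, and the prediction keeps the faster of the two dissipation estimates. The function names are ours:

```python
def project_degree_lower(coeffs):
    """L^2 projection of a modal Legendre polynomial onto degree p-1:
    drop the highest mode (the modes are L^2-orthogonal)."""
    return coeffs[:-1] + [0.0]

def dissipation_prediction(sigma_full, sigma_projected):
    """Step 2(d): keep the fastest (most negative) predicted dissipation."""
    return min(sigma_full, sigma_projected)
```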

In case of \(p=1, 2\) steps 2(b) through 2(d) are left out. Nearly all tests that follow need boundary conditions, while the theoretical part of this publication was restricted to periodic boundary conditions. The shock-tube tests below add a ghost cell to the left and right of the domain that is kept constant during the computation, allowing us to use only a finite number of cells. A third publication in this series dealing with two-dimensional tests and real boundary conditions is in preparation. Time integration is carried out using the SSPRK(4, 3) method for most solutions [32, 41], while the convergence analysis for \(p = 7\) below uses the Hairer-Wanner DOPRI8 method to achieve the needed convergence speed of the time integration [24]. In all figures below the ansatz functions of all cells are shown without any post-processing. The reference solvers and flux functions used are given in Table 2.

Table 2 Used schemes in the numerical tests

4.2 Comparison with Part I for Burgers’ Equation

Fig. 6
figure 6

Solutions to the test cases from [30] using the new method but the old parameters. The initial condition \(u_1(x, 0) = \sin (\pi x)\) is a sine that develops a shock at around \(t=0.3\), while \(u_2(x, 0)\) depicts the ability of the scheme to handle rarefactions

The results are printed in Fig. 6. The values

$$\begin{aligned} a_l = \min (u_l, u_r), \quad a_r = \max (u_l, u_r) \end{aligned}$$

were used as wave speed estimates. As before, the shock disturbs only the two cells connected to it. Unfortunately, the oscillations in these two cells are significantly stronger than with the old method. This is not necessarily a problem, as the resolution of the new method is higher in case of the rarefaction. Further, as we will see in the numerical tests for the Euler equations, the new method converges with a higher order for smooth problems. If one accepts the less clean cells around the shock, the new scheme is superior for scalar problems.

4.3 Buckley–Leverett Equation

The Buckley–Leverett equation is given by the flux function

$$\begin{aligned} f(u) = \frac{u^2}{u^2 + a(1-u)^2}, \quad a = 1. \end{aligned}$$

It is used to predict two-phase flow in a porous medium [39]. A solution to a Riemann problem for this scalar conservation law can involve a rarefaction and a shock wave at the same time. We use

$$\begin{aligned} U(u) = \left( u - \frac{1}{2}\right) ^4, \quad {\frac{ \textrm{d}U }{ \textrm{d}u}} = 4 \left( u - \frac{1}{2}\right) ^3, \quad F(u) = \int _0^u {\frac{ \textrm{d}f }{ \textrm{d}u}} {\frac{ \textrm{d}U }{ \textrm{d}u}} \textrm{d}u \end{aligned}$$

as entropy flux pair. Our implementation calculates the entropy flux using numerical quadrature: \(N_{quad} = 30\) Gauß-Lobatto points are used in the interval [0, u], which was found to be exact up to machine precision. The speed estimates for the non-convex flux of the Buckley–Leverett equation were calculated by splitting the domain of the flux into two zones: \({\frac{ \textrm{d}f }{ \textrm{d}u}}\) is increasing for \(u \in [0, \frac{1}{2}]\) and decreasing for \(u \in [\frac{1}{2}, 1]\). The extrema therefore have to lie at the ends of the intersections between these intervals and the interval spanned by \(u_l\) and \(u_r\). The HLL flux dissipates all entropies [26] and was therefore used in a first order FV solver to calculate reference solutions with \(N = 3\cdot 10^4\) cells.
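The two-zone wave speed estimation for the non-convex flux can be sketched as follows. The simplified derivative \({\frac{ \textrm{d}f }{ \textrm{d}u}} = 2au(1-u)/(u^2 + a(1-u)^2)^2\) and the function names are ours; for \(a = 1\) the interior maximum of the derivative lies at \(u = \frac{1}{2}\) by symmetry:

```python
def flux(u, a=1.0):
    """Buckley-Leverett flux f(u) = u^2 / (u^2 + a(1-u)^2)."""
    return u * u / (u * u + a * (1.0 - u) ** 2)

def dflux(u, a=1.0):
    """df/du, simplified to 2au(1-u) / (u^2 + a(1-u)^2)^2."""
    denom = (u * u + a * (1.0 - u) ** 2) ** 2
    return 2.0 * a * u * (1.0 - u) / denom

def wave_speed_bounds(ul, ur, a=1.0):
    """Extrema of df/du on the interval spanned by ul and ur:
    attained at the endpoints or at the interior maximum u = 1/2."""
    lo, hi = min(ul, ur), max(ul, ur)
    candidates = [lo, hi]
    if lo < 0.5 < hi:
        candidates.append(0.5)
    speeds = [dflux(u, a) for u in candidates]
    return min(speeds), max(speeds)
```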

Fig. 7
figure 7

Solution to Riemann initial data for the Buckley–Leverett equation. The domain is periodic and polynomials of degree 3 and 7 were used. The solutions are shown at \(t = 0.6\) and the reference solution was calculated using a first order FV scheme using a HLL approximate Riemann solver

The results for the Riemann problem

$$\begin{aligned} u(x, 0) = {\left\{ \begin{array}{ll} 1.0 &{} x < 1.0 \\ 0.0 &{} x > 1.0 \end{array}\right. } \end{aligned}$$

can be seen in Fig. 7. The DG solver converges to the same solution as the HLL-FV solver. Several examples of non-convex conservation laws exist where entropy dissipative solvers nevertheless do not converge to the same solution as a robust first order scheme [38]. One can conjecture that the good behavior of our DG method is due to a numerical enforcement of the entropy rate criterion.

4.4 Tests for the Euler Equations

The next tests will be carried out for the Euler equations of gas dynamics in conservation form [25]

$$\begin{aligned} u =(\rho , \rho v, E), \quad f(\rho , \rho v, E) = \begin{bmatrix} \rho v \\ \rho v^2 + p\\ v(E + p) \end{bmatrix}, \quad p = (\gamma - 1)\left( E - \frac{1}{2} \rho v^2 \right) \end{aligned}$$

in conjunction with the physical entropy [25, 55]

$$\begin{aligned} U(\rho , \rho v, E) = - \rho S, \quad F(\rho , \rho v, E) = - \rho v S, \quad S = \ln (p \rho ^{- \gamma }). \end{aligned}$$

The tests below will focus on the cases \(p = 3\) and \(p = 7\), as these are popular in applications because they amount to 4 and 8 nodes, suitable for SIMD processor instructions. Results for values in between essentially interpolate between the ones reported for \(p = 3\) and \(p = 7\), and the source code is available to carry out tests for all values \(p > 0\). Equation (19) will be used as wave speed estimate for the Euler equations.
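The Euler flux and the physical entropy pair above can be written out as plain functions of the conserved variables \((\rho , \rho v, E)\); \(\gamma = 1.4\), the usual diatomic value, is an assumption here, and the function names are ours:

```python
import math

GAMMA = 1.4  # assumed heat capacity ratio (diatomic gas)

def pressure(rho, m, E, gamma=GAMMA):
    """p = (gamma - 1)(E - rho v^2 / 2) with momentum m = rho*v."""
    return (gamma - 1.0) * (E - 0.5 * m * m / rho)

def euler_flux(rho, m, E, gamma=GAMMA):
    """f(rho, rho v, E) = (rho v, rho v^2 + p, v(E + p))."""
    v = m / rho
    p = pressure(rho, m, E, gamma)
    return (m, m * v + p, v * (E + p))

def entropy_pair(rho, m, E, gamma=GAMMA):
    """Physical entropy U = -rho S, flux F = -rho v S, S = ln(p rho^-gamma)."""
    p = pressure(rho, m, E, gamma)
    S = math.log(p * rho ** (-gamma))
    return -rho * S, -m * S
```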

4.4.1 Shock Tube Tests

First, a series of shock tube tests was done to highlight the effectivity of the entropy correction in shock calculations, as this is the primary aim of this publication. The first initial condition is the problem of Sod [48, Problem 6a], [49]

$$\begin{aligned} \rho _0(x, 0) = {\left\{ \begin{array}{ll}1, \\ 0.125, \end{array}\right. } \quad v_0(x, 0) = {\left\{ \begin{array}{ll} 0, \\ 0, \end{array}\right. } p_0(x, 0) = {\left\{ \begin{array}{ll} 1.0, &{} x<5, \\ 0.1, &{} x \ge 5.\end{array}\right. } \end{aligned}$$

This problem is one of the most well known shock tube tests in the literature. A slight variation of this problem is given in [59, Problem I, Section 4.3.3]

$$\begin{aligned} \rho _0(x, 0) = {\left\{ \begin{array}{ll}1, \\ 0.125, \end{array}\right. } \quad v_0(x, 0) = {\left\{ \begin{array}{ll} 0.75, \\ 0, \end{array}\right. } p_0(x, 0) = {\left\{ \begin{array}{ll} 1.0, &{} x<3, \\ 0.1, &{} x \ge 3.\end{array}\right. } \end{aligned}$$

This variation can show entropy violations of the classical entropy inequality as a left sonic rarefaction wave is part of the solution.

Our third shock tube is the time-evolution of the following Riemann problem [48, Problem 6b]

$$\begin{aligned} \rho _0(x, 0) = {\left\{ \begin{array}{ll}0.445, \\ 0.5, \end{array}\right. } \quad v_0(x, 0) = {\left\{ \begin{array}{ll} 0.698, \\ 0, \end{array}\right. } p_0(x, 0) = {\left\{ \begin{array}{ll} 3.528, &{} x < 5.0, \\ 0.571, &{} x \ge 5.0. \end{array}\right. } \end{aligned}$$

More severe shocks can be expected from the following three initial conditions that originally served as test cases 3, 4 and 5 in [59] and will be our initial conditions 4, 5 and 6:

$$\begin{aligned}&\rho _0(x, 0) = {\left\{ \begin{array}{ll}1, \\ 1.0, \end{array}\right. } \quad v_0(x, 0) = {\left\{ \begin{array}{ll} 0, \\ 0, \end{array}\right. } p_0(x, 0) = {\left\{ \begin{array}{ll} 1000.0, &{} x<5, \\ 0.01, &{} x \ge 5.\end{array}\right. }\\&\rho _0(x, 0) = {\left\{ \begin{array}{ll}5.99924, \\ 5.99242, \end{array}\right. } \quad v_0(x, 0) = {\left\{ \begin{array}{ll} 19.5975, \\ -6.19633, \end{array}\right. } p_0(x, 0) = {\left\{ \begin{array}{ll} 460.894, &{} x<4, \\ 46.0950, &{} x \ge 4.\end{array}\right. }\\&\rho _0(x, 0) = {\left\{ \begin{array}{ll}1.0, \\ 1.0, \end{array}\right. } \quad v_0(x, 0) = {\left\{ \begin{array}{ll} -19.59745, \\ -19.59745, \end{array}\right. } p_0(x, 0) = {\left\{ \begin{array}{ll} 1000.0, &{} x<8, \\ 0.01, &{} x \ge 8.\end{array}\right. } \end{aligned}$$
Fig. 8
figure 8

Shock tube 1 at \(t = 1.8\) with 25 cells corresponding to 100 degrees of freedom and 100 cells corresponding to 400 degrees of freedom (\(p=3\))

Fig. 9
figure 9

Shock tube 1 at \(t = 1.8\) with 13 cells corresponding to 104 degrees of freedom and 100 cells corresponding to 800 degrees of freedom (\(p=7\))

Fig. 10
figure 10

Shock tube 2 at \(t = 1.8\) with 25 cells corresponding to 100 degrees of freedom and 100 cells corresponding to 400 degrees of freedom (\(p=3\))

Fig. 11
figure 11

Shock tube 2 at \(t = 1.8\) with 13 cells corresponding to 104 degrees of freedom and 100 cells corresponding to 800 degrees of freedom (\(p=7\))

Fig. 12
figure 12

Shock tube 3 at \(t = 1.2\) with 25 cells corresponding to 100 degrees of freedom and 100 cells corresponding to 400 degrees of freedom (\(p=3\))

Fig. 13
figure 13

Shock tube 3 at \(t = 1.2\) with 13 cells corresponding to 108 degrees of freedom and 100 cells corresponding to 400 degrees of freedom (\(p=7\))

Fig. 14
figure 14

Shock tube 4 at \(t = 1.8\) with 25 cells corresponding to 100 degrees of freedom and 100 cells corresponding to 400 degrees of freedom (\(p=3\))

Fig. 15
figure 15

Shock tube 4 at \(t = 1.8\) with 13 cells corresponding to 104 degrees of freedom and 100 cells corresponding to 800 degrees of freedom (\(p=7\))

Fig. 16
figure 16

Shock tube 5 at \(t = 1.8\) with 25 cells corresponding to 100 degrees of freedom and 100 cells corresponding to 400 degrees of freedom (\(p=3\))

Fig. 17
figure 17

Shock tube 5 at \(t = 1.8\) with 13 cells corresponding to 104 degrees of freedom and 100 cells corresponding to 800 degrees of freedom (\(p=7\))

Fig. 18
figure 18

Shock tube 6 at \(t = 1.8\) with 25 cells corresponding to 100 degrees of freedom and 100 cells corresponding to 400 degrees of freedom (\(p=3\))

Fig. 19
figure 19

Shock tube 6 at \(t = 1.8\) with 13 cells corresponding to 104 degrees of freedom and 100 cells corresponding to 800 degrees of freedom (\(p=7\))

The shock tube tests were carried out for two different numbers of cells. First for \(N = N_{\text {typ}}/(p +1)\) cells, where \(N_{\text {typ}} = 100\) is the usual number of cells used in comparisons of Finite-Volume methods. This was done so that the same number of degrees of freedom has to be saved. The results in Figs. 8, 9, 12, 13 are satisfactory and highlight the effectivity of the method. All shocks are sharp and concentrated to less than one cell width, and only slight overshoots and oscillations are visible directly around the shocks. These distortions are confined to the cell directly next to the shock. Contact discontinuities are slightly smeared over one cell, but after they have been smeared to this width no additional smearing takes place. The computational complexity per timestep is still low as no recovery stencil selection has to be carried out and only \(1/(p + 1)\) times the number of two-point fluxes needs to be evaluated. Because some other publications use 100 cells also for DG methods we carried out the tests once more with \(N = 100\) cells, amounting to 400 and 800 degrees of freedom for orders \(p=3\) and \(p=7\) (Figs. 10, 11, 12, 13, 14, 15, 16, 17, 18, 19).

4.4.2 Numerical Validation of the Entropy Rate Criterion

To verify the entropy rate criterion the total entropy of the solution to the first shock tube above was compared to the solution calculated by a modified Lax-Friedrichs scheme with \(3\cdot 10^4\) cells. Similar comparisons were carried out in [29,30,31]. This is supported by our findings in lemma 1, lemma 2 and corollary 1, as a (modified) Lax-Friedrichs solution therefore has to comply with the entropy rate criterion. In these comparisons a scheme should have the same entropy dissipation rate as the Lax-Friedrichs scheme in the limit. Comparisons for orders \(p=3\) and \(p=7\) in Fig. 20 show that this seems to be the case. The DG scheme always has an entropy that lies below the entropy of the LF scheme. As the entropy inequality for vanishing viscosity solutions is also desirable it was verified on a per-cell basis as well. We just note that the small positive violations in Fig. 20 are of the same magnitude as the precision achievable during the calculation of \(\lambda \) using our procedure with double precision floats (Fig. 21).

Fig. 20
figure 20

Entropy tests for the first shock tube

4.4.3 Shu–Osher Test

To showcase a combination of shocks and smooth areas the well established shock-sine interaction problem from [48, Problem 8] was tested. The initial conditions are given by

$$\begin{aligned} \rho _0(x, 0) = {\left\{ \begin{array}{ll}3.857153 \\ 1 + \varepsilon \sin (5 x) \end{array}\right. } \quad v_0(x, 0) = {\left\{ \begin{array}{ll} 2.629 \\ 0 \end{array}\right. } p_0(x, 0) = {\left\{ \begin{array}{ll} 10.333 &{} x < 1 \\ 1 &{} x \ge 1 \end{array}\right. }. \end{aligned}$$

The parameter \(\varepsilon \) was set to the canonical value of \(\varepsilon = 0.2\).

Fig. 21
figure 21

Shu-Osher test for 50, 100 and 200 cells and order \(p = 3\) and \(p = 7\) and therefore 200, 400, 800 and 1600 degrees of freedom at \(t = 1.8\)

The results look satisfactory already when only \(N = 100\) cells are used in the calculation. Yet, we note that this already corresponds to 400 and 800 degrees of freedom for the selected orders. When \(N=200\) cells are used the solution is nearly indistinguishable from the reference solution.

4.4.4 Convergence Analysis

While the main aim of our modification was to devise a new DG scheme usable for shock-capturing calculations, the scheme also converges with high order of accuracy for smooth solutions in our experiments. As an example the solution of

$$\begin{aligned} \rho _0(x, 0) = 3.857153 + \varepsilon (x) \sin (2 x), \quad v_0(x, 0) = 2.0, \quad p_0(x, 0) = 10.33333, \end{aligned}$$

with

$$\begin{aligned} \varepsilon (x) = \textrm{e}^{-(x-3)^2} \end{aligned}$$

and periodic boundary conditions was calculated using our modified DG method. The analytical solution for this test problem is

$$\begin{aligned} \rho (x, t) = 3.857153 + \varepsilon (x-2t) \sin (2 x-4t), \quad v(x, t) = 2.0, \quad p(x, t) = 10.33333, \end{aligned}$$

with suitable periodic boundary conditions.

Fig. 22
figure 22

Convergence analysis for \(p = 3\) and \(p = 7\). The error of the baseline DG scheme is shown next to the error of the modified scheme. The error of the modified scheme is higher, as for small cell numbers the error of the additional dissipation dominates. The entropy inequality predictors vanish with a higher order than the error of the base scheme, and the difference between both schemes therefore vanishes under grid refinement

After the solution was calculated up to \(t = 5\) for \(N = \{10, 15, 20, 25, 30, 40, 50, 60, 70, 80, 90, 100\}\) cells for \(p = 3\), and with the same stepping up to 50 cells for \(p = 7\), the \(\textrm{L}^1\) and \(\textrm{L}^2\) errors were calculated. The convergence in Fig. 22 seems to take place with too high an order for the ansatz polynomials used. The reason for this could be that the accuracy of the basic scheme is significantly higher for these solutions than the accuracy of the corrected scheme, because the entropy dissipation estimate still falsely reports high amounts of entropy dissipation. When the grid is refined the entropy dissipation estimate converges with a higher speed than the basic scheme, following lemma 4, and because the error introduced to enforce the dissipation dominates, a higher convergence speed than expected is observed.
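Experimental orders like those in Fig. 22 are recovered from errors on two successive grids; a minimal helper, with purely illustrative numbers rather than data from the paper:

```python
import math

def observed_order(e_coarse, e_fine, n_coarse, n_fine):
    """Experimental order of convergence from two (error, cell count)
    pairs: log of the error ratio over log of the refinement factor."""
    return math.log(e_coarse / e_fine) / math.log(n_fine / n_coarse)
```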

4.4.5 Timestep Analysis

An important side effect of any modification to a basic scheme can be its impact on the allowed timestep size. In the first part of this publication [30] this influence was tested by measuring the maximal timestep possible before a blow-up occurs. This test was repeated here.

Fig. 23
figure 23

Maximal timestep sizes for \(p = 3\) and \(p = 7\) and the shock tube 1

The maximal timestep possible for the first shock tube for orders \(p = 3\) and \(p=7\) is shown in Fig. 23. This timestep is acceptable and, when corrected for the larger maximal wave speed of the Riemann problem used for testing, larger than the timestep reported in the previous part, highlighting the superiority of the new dissipation direction.

5 Conclusion

The method described in [30] to enforce an entropy rate criterion for DG methods was improved. By using a direct indicator for the entropy dissipation the error indicator used before could be replaced, resulting in a lower dissipation in situations like contact discontinuities. For smooth solutions this new method to quantify the amount of dissipation needed converges significantly faster to zero than the error estimate used before, and therefore allows us to recover the convergence speed of the basic DG scheme that was reduced by one degree before. Further, the direct quantification of the entropy dissipation needed allows us to consider different dissipation directions, especially combining smoothing and dissipation and therefore bridging into the field of modal filtering. The effectivity of the refined method was demonstrated for the Buckley–Leverett equation and the Euler system of gas dynamics. The method is not only high order accurate but also able to handle shocks, contact discontinuities, and rarefaction waves. The next logical steps can be the application to two-dimensional problems, the application of the designed entropy inequality predictors to other schemes like continuous Galerkin and Spectral Volume schemes, where several adjustments will be needed, and revisiting the splitting into a fully discrete scheme already explored in [30]. The presented method to estimate the entropy dissipation needed could also be used with artificial viscosity shock-capturing as for example described in [20].