1 Introduction

In the last decades, a lot of work has been devoted to the study of shallow water flows. When dispersive effects are negligible (this is the case, for example, for the modelling of hydraulic jumps at large Froude numbers, or of tsunami waves), one usually employs the classical Saint-Venant (SV) or shallow water equations. The underlying hypothesis in the derivation of the Saint-Venant equations is that the flow is potential. The horizontal vorticity (parallel to the bottom) in the shallow water approximation is related to the horizontal velocity shear: \({\varvec{\omega } }_{\vert \vert } \approx {{\mathbf {V}}}_z\), where \({{\mathbf {V}}}\) is the instantaneous (non-averaged) horizontal velocity, and the index z denotes the derivative in the vertical direction. The absence of vorticity therefore means the absence of horizontal velocity shear. The shallow water equations are hyperbolic, see e.g. [116]. When discontinuous solutions to the SV equations are studied, one imposes the conservation of mass and momentum at the shocks, while the energy equation is always used as the entropy inequality. The reason for this is the following. The fluid flow ahead of the jump front is supercritical with respect to the front and almost potential, while behind the front it is highly turbulent: large vortex structures are usually formed. The energy is not conserved because a part of it is transformed into the energy of vortices, which is not taken into account in the SV model. An ideal model of free surface shallow flows which takes into account shear effects was recently derived by Teshukov [112]. The governing equations are obtained by depth averaging of the multi-dimensional Euler equations [100, 101, 112]. The hypothesis of smallness of the horizontal vorticity (the hypothesis of weakly sheared flows) allows us to keep the second order depth averaged correlations in the governing equations but neglect the third order correlations, and thus to close the governing system in the dissipationless limit. To apply the model to the study of real flows (formation of roll waves and hydraulic jumps), the model was complemented by dissipative source terms, see [69, 80, 100, 101].

The corresponding multi-dimensional model of shear shallow water flows is a hyperbolic system of equations which is reminiscent of the Reynolds-averaged Euler equations for barotropic compressible turbulent flows. The model has three families of characteristics corresponding to the propagation of surface waves, shear waves and waves propagating with the average flow velocity. The main difficulty in studying such a system is the highly non-conservative nature of the governing equations: for six unknowns (the fluid depth, two components of the depth averaged horizontal velocity, and three independent components of the symmetric Reynolds stress tensor) one has only five conservation laws: conservation of mass, momentum, energy and mathematical “entropy”. The last one determines the evolution of the determinant of the Reynolds stress tensor. The non-conservative nature of the multi-dimensional equations represents an enormous difficulty from the mathematical and numerical point of view. The definition and computation of discontinuous solutions for non-conservative hyperbolic equations is a challenging problem, see e.g. [5, 23, 85]. A numerical method (based on a splitting procedure) was recently developed for solving this non-conservative system [69, 80]. The essential ingredient was the use of the energy conservation. It allowed, in particular, the creation of vorticity once the jump appears. The splitting procedure was as follows. First, a geometric splitting was applied, which consists in solving the governing equations first in the x and then in the y direction. Second, each one-dimensional system was also split into two subsystems, each of which contained only one ‘sound’ speed: the velocity of surface waves for the first subsystem, and the velocity of shear waves for the second subsystem. Each subsystem admitted its own energy conservation law, and its own “entropy”. However, such an operator splitting could also be a source of numerical errors. This is why it is very important to develop also different numerical methods for solving this challenging non-conservative system, like the two new unsplit schemes proposed in this paper, namely a completely new thermodynamically compatible unsplit finite volume scheme, as well as a slightly modified general-purpose high order ADER discontinuous Galerkin finite element method.

In order to put the new unsplit thermodynamically compatible finite volume scheme presented in this paper into the proper context, let us briefly review the main ideas on which it is based. Exactly sixty years ago, in 1961, Godunov published his groundbreaking paper An interesting class of quasilinear systems [70], in which he discovered the connection between symmetric hyperbolicity in the sense of Friedrichs [62] and thermodynamic compatibility (SHTC), ten years before Friedrichs and Lax [63], who independently rediscovered the same connection in 1971. In a subsequent series of papers, Godunov and Romenski carried out further research on this link between symmetric hyperbolicity and thermodynamic compatibility and generalized the seminal idea of Godunov to the more general SHTC framework of symmetric hyperbolic and thermodynamically compatible systems, which includes not only the compressible Euler equations of gasdynamics, but also the magnetohydrodynamics (MHD) equations [71] and the equations of nonlinear hyperelasticity [74, 75, 77]. The findings of Godunov and Romenski on nonlinear hyperelasticity were subsequently further employed and extended in [1, 6, 14, 17, 47, 57, 59, 67, 72, 81, 87, 88, 91, 93]. A very general class of symmetric hyperbolic and thermodynamically compatible systems was presented by Romenski in [102], which is able to describe the interaction of moving dielectric solids with electromagnetic fields and the dynamics of superfluid helium, and which also contains a hyperbolic model for heat conduction. An extension of this class of models to compressible multi-phase flows was put forward in [103, 104, 106]. The SHTC framework remains valid even in the context of special and general relativity, see [76, 105]. Recently, a connection between the class of symmetric hyperbolic and thermodynamically compatible systems and Hamiltonian mechanics was rigorously established in [92]. SHTC systems also go beyond classical continuum mechanics, see e.g. [94] for an SHTC formulation of continuum mechanics with torsion. Despite the mathematical beauty and rigor of the SHTC framework, up to now it has never been carried over to the discrete level. So far, most papers on thermodynamically compatible schemes are based on the ideas of the seminal work of Tadmor [108], in which a discrete extra conservation law for the entropy is obtained as a consequence of the discretization of all other equations (including the energy conservation, which is explicitly discretized). Instead, in the new scheme presented in this paper we do not discretize the energy equation explicitly, but rather look for a thermodynamically compatible scheme in which a discrete total energy conservation law is obtained as a direct consequence of the compatible discretization of all the other equations. For an interesting application of entropy compatible schemes to the discretization of non-conservative equations, see [3, 60]. The ideas presented there are related to the new compatible scheme introduced in the present paper, though [3, 60] deal with much simpler equation systems. Recently, convergence of entropy-stable schemes was proven in [26]. For extensions of entropy-compatible schemes to high order discontinuous Galerkin methods, see [28, 36, 66, 78, 84] and references therein. While most of the aforementioned schemes are thermodynamically compatible only at the semi-discrete level, a fully discrete entropy-stable scheme has recently been presented in [95]. We would also like to point out that a very general framework for the construction of numerical schemes satisfying additional extra conservation laws has recently been put forward by Abgrall in [2].

As already stated above, the major difference of the thermodynamically compatible scheme proposed in this paper with respect to previous thermodynamically compatible schemes is its discrete compatibility with the conservation of total energy as a consequence of all equations and not the conservation of entropy as a consequence. In other words, the thermodynamically compatible finite volume scheme presented in this paper never explicitly discretizes the energy equation, but total energy conservation is obtained as a mere consequence of a thermodynamically compatible discretization of the other equations, including a compatible discretization of the numerical viscosity.

The second unsplit scheme proposed in this paper is a fully-discrete one-step high order ADER discontinuous Galerkin method (ADER-DG). Explicit discontinuous Galerkin schemes for hyperbolic equations have been put forward by Reed and Hill in [96] introducing the use of discontinuous polynomials in a Galerkin framework to allow the jump of the discrete solution across cell boundaries.

Then, the first extensions to multidimensional and nonlinear hyperbolic systems were presented in the series of papers by Cockburn and Shu [27, 30, 31, 32, 33]. Parabolic terms were considered for the first time in [9, 10, 34, 35]. The severe time step restriction induced by the inclusion of higher order derivatives [83, 122, 123] and by nonlinear dispersive equations [51, 53, 54] has driven the development of fully implicit approaches [44], whose major disadvantage is the need to solve the resulting ill-conditioned algebraic systems. An alternative approach proposed recently is the use of hyperbolic reformulations of dispersive models, which allow for more efficient discretizations [7, 8, 18, 52].

Regarding high order methods, it is important to remark that while attaining high order in space is straightforward for DG methodologies, there are different possibilities concerning high order time discretizations. The original DG schemes of Cockburn and Shu employed high order Runge-Kutta schemes in time, leading to the family of RKDG schemes. An alternative consists in the family of fully implicit and semi-implicit space-time DG methods, see e.g. [19, 82, 98, 99, 109, 110, 111, 120, 121]. Another option, which leads to high order explicit fully-discrete one-step schemes and which is followed in this paper, combines ideas of the ADER approach of Toro and Titarev, originally developed within the finite volume framework [20, 114, 117, 118], with space-time DG methods. This methodology, based on the ideas outlined in [41, 43], makes use of an element-local space-time DG predictor, thereby avoiding the cumbersome Cauchy-Kovalevskaya procedure of classical ADER schemes and allowing also the solution of complex PDEs in multiple space dimensions. Some examples of the wide range of applicability of this approach include the compressible Euler and Navier-Stokes equations [40, 41], compressible multi-phase flows [45], and the Godunov-Peshkov-Romenski model of continuum mechanics [17, 47, 93]. Discontinuous Galerkin schemes for hyperbolic PDE systems with non-conservative products were proposed for the first time in [42, 97], based on the ideas of path conservative schemes [22, 23, 24, 85, 89], which will also be a key point for the development of the numerical schemes proposed in this paper.

The rest of this paper is organized as follows: in Sect. 2 we present the original model [69] and a novel reformulation based on a decomposition of the specific Reynolds stress tensor \({\mathbf {P}}\) as \({\mathbf {P}}= {\mathbf {Q}}{\mathbf {Q}}^T\). We furthermore introduce a viscous extension of the governing PDE system in order to define a rigorous and thermodynamically compatible vanishing viscosity limit of the model. We finally recall the Godunov formalism of thermodynamically compatible systems and prove that the proposed viscous system is thermodynamically compatible with the energy conservation law and with the entropy inequality. In Sect. 3 we present a novel thermodynamically compatible finite volume scheme, which mimics the aforementioned viscous extension of the system exactly at the semi-discrete level. In Sect. 4 a high order ADER discontinuous Galerkin method with a posteriori subcell limiter (MOOD) is presented for the new reformulation of the model proposed in this paper, including its viscous extension. Special care is taken concerning the conservation of total energy. Numerical results are shown in Sect. 5, where first a numerical convergence study is presented for third to sixth order ADER-DG schemes in space and time; subsequently, different schemes are compared with each other for three Riemann problems, discussing in particular the discretization of the non-conservative terms of the model in the context of thermodynamically compatible systems. The end of Sect. 5 contains numerical results for some challenging test problems for which experimental reference data are available, such as supercritical roll waves and the circular shock instability developing in the SWASI experiment, see [61]. The paper is rounded off with some concluding remarks and an outlook on future work in Sect. 6.

2 Governing Equations

We consider the following overdetermined hyperbolic model for turbulent shear shallow water flows in multiple space dimensions, which has been recently proposed in [69] and which was also applied and studied in [11, 25, 80]:

$$\begin{aligned} \partial _t h + \nabla \cdot (h {\mathbf {v}})= & {} 0, \end{aligned}$$
(1)
$$\begin{aligned} \partial _t (h {\mathbf {v}}) + \nabla \cdot \left( h {\mathbf {v}}\otimes {\mathbf {v}}+ \frac{1}{2} g h^2 {\mathbf {I}} + h {\mathbf {P}}\right) + g h \nabla b= & {} -C_f \, \Vert {\mathbf {v}}\Vert \, {\mathbf {v}}, \end{aligned}$$
(2)
$$\begin{aligned} \partial _t {\mathbf {P}}+ {\mathbf {v}}\cdot \nabla {\mathbf {P}}+ \nabla {\mathbf {v}}\, {\mathbf {P}}+ {\mathbf {P}}\, \nabla {\mathbf {v}}^T= & {} - 2 \frac{\alpha }{h} {\mathbf {P}}, \end{aligned}$$
(3)
$$\begin{aligned} \partial _t b= & {} 0, \end{aligned}$$
(4)

with the gravity constant g. The physical (primitive) state variables in (1)–(4) are the following: \(h=h(\mathbf {x},t)\) is the water depth, \(b=b(\mathbf {x})\) is the known bottom topography, \({\mathbf {v}}={\mathbf {v}}(\mathbf {x},t)\) is the depth-averaged flow velocity and \({\mathbf {P}}= {\mathbf {P}}(\mathbf {x},t)\) is the specific Reynolds stress tensor. For shallow water systems it is convenient to include the stationary bottom profile \(b(\mathbf {x})\) in the set of state variables. The reason is that this allows one to represent stationary free surface waves associated with bottom jumps and to obtain well-balanced numerical schemes, see e.g. [21, 22, 64, 89, 90] for more details. Via straightforward calculations it can be shown that the system (1)–(4) admits the following extra conservation law

$$\begin{aligned} \partial _t (hE) + \nabla \cdot \left( {\mathbf {v}}(hE) + \left( \frac{1}{2} g h^2 {\mathbf {I}} + h {\mathbf {P}}\right) {\mathbf {v}}\right) = - C_f \Vert {\mathbf {v}}\Vert ^3 - \alpha \, \text {tr}{\mathbf {P}}, \end{aligned}$$
(5)

with the total energy defined as \(h E = \frac{1}{2} g h^2 + h g b + \frac{1}{2} h \Vert {\mathbf {v}}\Vert ^2 + \frac{1}{2} h \, \text {tr}{\mathbf {P}}\).

The bottom friction is taken into account by a coefficient \(C_f\) and the dissipation function \(\alpha \) is given according to [69] as

$$\begin{aligned} \alpha = \max \left( 0, C_r \frac{\text {tr}{\mathbf {P}}- \varphi h^2}{(\text {tr}{\mathbf {P}})^2} \, \Vert {\mathbf {v}}\Vert ^3 \right) . \end{aligned}$$
(6)

2.1 Reformulation of the Model in Terms of a New Evolution Variable

The above model requires \(\text {tr}{\mathbf {P}}\ge 0\) for hyperbolicity. In order to guarantee this property also at the discrete level for all times, we propose the following novel reformulation of the system (1)–(5). For this, we consider first the homogeneous part of equation (3) for the symmetric tensor \({{\mathbf {P}}}\):

$$\begin{aligned} \dot{{\mathbf {P}}}+{{\mathbf {LP}}}+{{\mathbf {PL}}}^T=0. \end{aligned}$$
(7)

Here, for brevity, \(\dot{f} \) denotes the material time derivative of any quantity f: \(\dot{f}=f_t+{{\mathbf {v}}}\cdot {\varvec{\nabla }} f\), and \(\displaystyle {\mathbf {L}}={\frac{\partial {\mathbf {v}}}{\partial {\mathbf {x}}}={\varvec{\nabla }} {{\mathbf {v}}}}\). Let us replace \({{\mathbf {P}}}\) by \({{\mathbf {P}}}={{\mathbf {QQ}}}^T\). What is the equation for \({{\mathbf {Q}}}\)? One obtains from (7):

$$\begin{aligned} (\dot{{\mathbf {Q}}}+{{\mathbf {LQ}}}){{\mathbf {Q}}}^T+{{\mathbf {Q}}}(\dot{{\mathbf {Q}}}+{{\mathbf {LQ}}})^T=0. \end{aligned}$$
(8)

If

$$\begin{aligned} \dot{{\mathbf {Q}}}+{{\mathbf {LQ}}}={{\mathbf {B}}}({{\mathbf {Q}}}^{T})^{-1} \end{aligned}$$
(9)

with an antisymmetric tensor \({{\mathbf {B}}}=-{{\mathbf {B}}}^T\), the equation for \({{\mathbf {P}}}\) will be obviously satisfied. Thus, the equation for \({\mathbf {Q}}\) is defined up to an antisymmetric tensor \({{\mathbf {B}}}\) taking into account a proper rotation of the Reynolds tensor (for details, see [68]). We hypothesize that friction forces will drastically reduce the influence of this proper rotation, i.e. we take \({\mathbf {B}}=0\). Such a class of solutions is not equivalent to all solutions governed by the equation for \({\mathbf {P}}\), but is able, as we will show, to describe complex flow configurations.

What is the geometrical meaning of the decomposition \({{\mathbf {P}}}={{\mathbf {QQ}}}^T\)? Let us first recall the definition of the Gram matrix \({\mathbf {G}}\) (in the 2D case). Consider two vectors \({\mathbf {w}}_i\), \(i=1,2\). The Gram matrix is defined as

$$\begin{aligned} {{\mathbf {G}}}=\left( \begin{array}{cc} {\mathbf {w}}_1\cdot {\mathbf {w}}_1&{}{\mathbf {w}}_1\cdot {{\mathbf {w}}_2}\\ {{\mathbf {w}}_1}\cdot {{\mathbf {w}}_2}&{}{{\mathbf {w}}_2}\cdot {{\mathbf {w}}_2} \end{array} \right) . \end{aligned}$$
(10)

The ‘dot’ here is for the scalar product of vectors. It can be also written as

$$\begin{aligned} {{\mathbf {G}}}={{\mathbf {QQ}}}^T \end{aligned}$$
(11)

with

$$\begin{aligned} {{\mathbf {Q}}}=\left( \begin{array}{c} {\mathbf {w}}_1^T\\ {\mathbf {w}}_2^T \end{array}\right) , \end{aligned}$$
(12)

i.e. the row vectors \({\mathbf {w}}_i^T\) are the rows of \({\mathbf {Q}}\). Let us recall that in our case \({\mathbf {P}}\) is the correlation tensor expressed in terms of the velocity pulsations (see [112] for details) as:

$$\begin{aligned} {\mathbf {P}}=\left( \begin{array}{cc} \overline{v_1'^2}&{}\overline{v_1'v_2'}\\ \overline{v_1'v_2'}&{}\overline{v_2'^2} \end{array}\right) \end{aligned}$$
(13)

Here the averaging operation, denoted by a “bar”, is the depth averaging. The tensor \({\mathbf {P}}\) is positive definite due to the Cauchy–Schwarz inequality. Let us show that \({\mathbf {P}}\) can be represented in the form (10), i.e. there exist vectors \({\mathbf {w}}_i\) such that:

$$\begin{aligned} {\mathbf {P}}=\left( \begin{array}{cc} {\mathbf {w}}_1\cdot {\mathbf {w}}_1\; &{}{\mathbf {w}}_1\cdot {\mathbf {w}}_2\; \\ {\mathbf {w}}_1\cdot {\mathbf {w}}_2\; &{}{\mathbf {w}}_2\cdot {\mathbf {w}}_2\; \end{array}\right) \end{aligned}$$

For this we take

$$\begin{aligned} {\mathbf {w}}_1=\sqrt{\overline{v_1'^2}}\; \left( {\cos {\theta _1}}, \; {\sin {\theta _1}}\right) ^T,\quad {\mathbf {w}}_2=\sqrt{\overline{v_2'^2}}\; \left( \cos {\theta _2}, \; \sin {\theta _2}\right) ^T \end{aligned}$$

with

$$\begin{aligned} \cos (\theta _1-\theta _2)=\frac{\overline{v_1'v_2'}}{\sqrt{\overline{v_1'^2}\; \overline{v_2'^2}}}. \end{aligned}$$

The last relation is well defined due to the Cauchy–Schwarz inequality: \(\vert \overline{v_1'v_2'}\vert \le \sqrt{\overline{v_1'^2}\; \overline{v_2'^2}}\).
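The construction above can be sketched numerically as follows. This is a minimal illustration, assuming \(P_{11}>0\) and \(P_{22}>0\); the function names `gram_factor` and `gram` are hypothetical, and fixing \(\theta _2 = 0\) by default is an arbitrary choice made here, since only the difference \(\theta _1-\theta _2\) is determined by \({\mathbf {P}}\).

```python
import math


def gram_factor(P11, P12, P22, theta2=0.0):
    """Return Q = (w1, w2) (rows of Q) such that Q Q^T = P.

    Follows the angle construction in the text: |w1|^2 = P11,
    |w2|^2 = P22 and cos(theta1 - theta2) = P12 / sqrt(P11 * P22).
    Assumes P11 > 0 and P22 > 0.
    """
    a = math.sqrt(P11)
    b = math.sqrt(P22)
    # The Cauchy-Schwarz inequality keeps the ratio in [-1, 1];
    # the clamp only protects against floating-point round-off.
    cos_delta = max(-1.0, min(1.0, P12 / (a * b)))
    theta1 = theta2 + math.acos(cos_delta)
    w1 = (a * math.cos(theta1), a * math.sin(theta1))
    w2 = (b * math.cos(theta2), b * math.sin(theta2))
    return (w1, w2)


def gram(Q):
    """Gram matrix of the two rows of Q, i.e. Q Q^T, as in Eq. (10)."""
    w1, w2 = Q

    def dot(x, y):
        return x[0] * y[0] + x[1] * y[1]

    return ((dot(w1, w1), dot(w1, w2)), (dot(w1, w2), dot(w2, w2)))
```

Calling `gram(gram_factor(P11, P12, P22))` should reproduce the entries of \({\mathbf {P}}\) up to round-off; a different `theta2` gives a different \({\mathbf {Q}}\) with the same \({\mathbf {P}}\), which reflects the non-uniqueness of the decomposition.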

With \(P_{ik} = Q_{im} Q_{km}\) written in terms of the new evolution variable \({\mathbf {Q}}\) and the notation \(\partial _m = \partial / \partial x_m\) the above system can be rewritten again as an overdetermined PDE system as follows:

$$\begin{aligned} \partial _t h + \partial _m (h v_m)= & {} 0, \end{aligned}$$
(14)
$$\begin{aligned} \partial _t (h v_i) + \partial _k \left( h v_i v_k + \frac{1}{2} g h^2 \delta _{ik} + h P_{ik} \right) + g h \partial _i b= & {} -C_f \, \Vert {\mathbf {v}}\Vert v_i, \end{aligned}$$
(15)
$$\begin{aligned} \partial _t Q_{ik} + v_m \, \partial _m \, Q_{ik} + \left( \partial _m v_i \right) \, Q_{mk}= & {} -\frac{\alpha }{h} Q_{ik}, \end{aligned}$$
(16)
$$\begin{aligned} \partial _t b= & {} 0, \end{aligned}$$
(17)

with the conservative evolution variables \(h=h(\mathbf {x},t)\), \(h {\mathbf {v}}= h{\mathbf {v}}(\mathbf {x},t)\), \({\mathbf {Q}}={\mathbf {Q}}(\mathbf {x},t)\) and the stationary bottom profile \(b=b(\mathbf {x})\).

It is easy to see that (3) is a consequence of (16) by simply multiplying (16) with \({\mathbf {Q}}^T\) from the right and summing the transpose of (16) multiplied by \({\mathbf {Q}}\) from the left. It can be easily checked that also the new system (14)–(17) admits an extra energy conservation law

$$\begin{aligned} \partial _t (hE) + \partial _i \left( \! (hE) v_i \! + \! \left( \frac{1}{2} g h^2 \delta _{ik} + h Q_{im} Q_{km} \right) \! v_k \! \right) = - C_f \Vert {\mathbf {v}}\Vert ^3 \! - \! \alpha \, \text {tr}{\mathbf {P}}\!, \end{aligned}$$
(18)

which can be obtained as a consequence of (14)–(17). In terms of \({\mathbf {Q}}\) the total energy reads \(h E = \frac{1}{2} g h^2 + {h g b} + \frac{1}{2} h v_i v_i + \frac{1}{2} h \, Q_{ij} Q_{ij}\), which for flat bottom \(b=0\) is a strictly convex function in the variables h, \(h v_i\) and \(S_{ij}=hQ_{ij}\). It is also a convex function of \((h, hv_i, Q_{ij})\), if the turbulent energy is small compared to the gravitational potential energy (see “Appendix A” for details). Also note that due to \(\text {tr}{\mathbf {P}}= Q_{ij} Q_{ij} \ge 0\) the use of \({\mathbf {Q}}\) instead of \({\mathbf {P}}\) automatically guarantees a non-negative trace of \({\mathbf {P}}\) by construction, and hence also at the discrete level for all times. In this sense, system (14)–(18) is analogous to a so-called realizable turbulence model. At this point we emphasize that the thermodynamically compatible scheme proposed later in this paper will consider only the case of a flat bottom with \(b=0\).
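The step from (16) to (3) stated above can be written out explicitly. The short derivation below uses only the product rule for the material derivative and the definitions \({\mathbf {P}}={\mathbf {Q}}{\mathbf {Q}}^T\) and \({\mathbf {L}}={\varvec{\nabla }}{\mathbf {v}}\); the auxiliary symbol \({\mathbf {R}}\) is introduced here only for illustration. In matrix notation, (16) reads \({\mathbf {R}} := \dot{{\mathbf {Q}}}+{\mathbf {L}}{\mathbf {Q}} = -\frac{\alpha }{h} {\mathbf {Q}}\), so that

$$\begin{aligned} \dot{{\mathbf {P}}}+{\mathbf {L}}{\mathbf {P}}+{\mathbf {P}}{\mathbf {L}}^T= & {} \dot{{\mathbf {Q}}}{\mathbf {Q}}^T+{\mathbf {Q}}\dot{{\mathbf {Q}}}^T+{\mathbf {L}}{\mathbf {Q}}{\mathbf {Q}}^T+{\mathbf {Q}}{\mathbf {Q}}^T{\mathbf {L}}^T \nonumber \\= & {} {\mathbf {R}}\,{\mathbf {Q}}^T+{\mathbf {Q}}\,{\mathbf {R}}^T \nonumber \\= & {} -\frac{\alpha }{h}{\mathbf {Q}}{\mathbf {Q}}^T-\frac{\alpha }{h}{\mathbf {Q}}{\mathbf {Q}}^T = -2\frac{\alpha }{h}{\mathbf {P}}, \end{aligned}$$

which is Eq. (3). The factor 2 in the source term of (3) thus corresponds to the factor 1 in the source term of (16).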

Last but not least, we would like to point out the difference in the only apparently similar structure of PDE (16) and the governing PDE for the distortion field \(A_{ik}\) in nonlinear hyperelasticity [47, 93], which reads

$$\begin{aligned} \partial _t A_{ik} + v_m \partial _m \, A_{ik} + A_{im} \left( \partial _k v_m \right) = - \frac{1}{\theta (\tau )} E_{A_{ik}}. \end{aligned}$$
(19)

As one can easily see, the order of the matrix-product in the third term on the left hand side of (16) and (19) is exchanged. It is well-known that for hyperelasticity there is an additional conservation law associated with the determinant of the distortion field \(A_{ik}\) and in the following we will show that the same applies to the determinant of the field \(Q_{ik}\). The time derivative of the determinant of \({\mathbf {Q}}\) can be easily obtained via the Jacobi formula, which expresses the derivatives of the determinant of a matrix in terms of the inverse of the matrix and the derivatives of the matrix itself:

$$\begin{aligned} \partial _{t} |Q| = |Q| Q_{ki}^{-1} \, \partial _{t} Q_{ik}, \qquad \partial _{m} |Q| = |Q| Q_{ki}^{-1} \, \partial _{m} Q_{ik}, \end{aligned}$$
(20)

where \(Q_{ki}^{-1}\) is a compact notation for \((Q^{-1})_{ki}\). Applying (20) to (16) yields

$$\begin{aligned} \partial _t |Q| + |Q| Q_{ki}^{-1} v_m \, \partial _m \, Q_{ik} + |Q| Q_{ki}^{-1} \left( \partial _m v_i \right) \, Q_{mk} = -\frac{\alpha }{h} |Q| Q_{ki}^{-1} Q_{ik}, \end{aligned}$$
(21)

from which one obtains

$$\begin{aligned} \partial _t |Q| + v_m \, \partial _m |Q| + |Q| \left( \partial _m v_i \right) \, \delta _{mi} = -\frac{\alpha }{h} |Q| \delta _{kk}, \end{aligned}$$
(22)

and therefore the sought additional balance law for the determinant |Q|,

$$\begin{aligned} \partial _t |Q| + \partial _m \left( v_m |Q| \right) = -\frac{\alpha }{h} |Q| \delta _{kk}, \end{aligned}$$
(23)

which for \(\alpha =0\) has the same structure as the mass conservation equation (14). As such, we can assume that for \(h>0\) also \(|Q|>0\) holds.

Via straightforward calculations it can be shown that for smooth solutions the conservation law (23) for the determinant |Q| is equivalent to the conservation law

$$\begin{aligned} \partial _t \left( h \psi \right) + \partial _m \left( v_m h \psi \right) = -\frac{4\alpha }{h^{2}}\left( P_{11}P_{22}-P_{12}^{2} \right) , \qquad \psi = \frac{| {\mathbf {P}}|}{h^2} = \frac{| {\mathbf {Q}}\, {\mathbf {Q}}^T |}{h^2} \end{aligned}$$
(24)

already found in [69]. Assuming \(\alpha =0\) it reduces to

$$\begin{aligned} \partial _t \left( h \psi \right) + \partial _m \left( v_m h \psi \right) = 0. \end{aligned}$$
(25)
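For the case \(\alpha =0\), the equivalence can be checked in a few lines. The following is a sketch for smooth solutions, using the material derivative notation introduced before (8). Written in material form, (14) and (23) become

$$\begin{aligned} \dot{h} = -h \, \partial _m v_m, \qquad \dot{|Q|} = -|Q| \, \partial _m v_m, \end{aligned}$$

so that the ratio \(|Q|/h\) is constant along particle paths:

$$\begin{aligned} \frac{d}{dt} \left( \frac{|Q|}{h} \right) = \frac{\dot{|Q|}}{h} - \frac{|Q|}{h^2} \, \dot{h} = 0. \end{aligned}$$

Since \(| {\mathbf {Q}}\, {\mathbf {Q}}^T | = |Q|^2\), one has \(\psi = (|Q|/h)^2\) and hence \(\dot{\psi } = 0\). Combining this with (14) gives

$$\begin{aligned} \partial _t \left( h \psi \right) + \partial _m \left( v_m h \psi \right) = h \, \dot{\psi } + \psi \left( \partial _t h + \partial _m (h v_m) \right) = 0, \end{aligned}$$

which is (25).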

2.2 Eigenstructure of the Reformulation

The eigenvalues of the homogeneous part of (14)–(17) in \(x_1\) direction are

$$\begin{aligned} \lambda _{1,7} = v_1 \mp c, \quad \lambda _{2,6} = v_1 \mp \sqrt{P_{11}}, \quad \lambda _{3,4,5} = v_1, \quad \lambda _8 = 0, \end{aligned}$$
(26)

with \(c^2 = g h + 3 P_{11}\). The associated right eigenvectors read

$$\begin{aligned} {\mathbf {r}}_{1,7}= & {} \left( h K, h(v_1 \mp c) K, h ( v_2 K \mp 6 c P_{12} ), Q_{11} K, Q_{12} K, 6 Q_{11} P_{12}, 6 Q_{12} P_{12}, 0 \right) ^T \!\!, \nonumber \\ {\mathbf {r}}_{2,6}= & {} \left( 0, 0, \mp \sqrt{P_{11}} h, 0, 0, Q_{11}, Q_{12}, 0 \right) ^T \!\!, \nonumber \\ {\mathbf {r}}_3= & {} \left( - 2 h Q_{11}, -2 h v_1 Q_{11}, - 2 h v_2 Q_{11}, gh + P_{11}, 0, Q_{11}^{-1} (2 Q_{12} |Q| + \varPi _1 ), 0, 0 \right) ^T \!\!\!, \nonumber \\ {\mathbf {r}}_4= & {} \left( - 2 h Q_{12}, -2 h v_1 Q_{12}, - 2 h v_2 Q_{12}, gh + P_{11}, 0, Q_{11}^{-1} (2 Q_{12} P_{12} - \varPi _2 ), 0, 0 \right) ^T \!\!\!, \nonumber \\ {\mathbf {r}}_5= & {} \left( 0, 0, 0, 0, 0, -Q_{11}^{-1} Q_{12}, 1, 0 \right) ^T \!\!, \nonumber \\ {\mathbf {r}}_8= & {} \left( h M, 0, h ( 2 v_1 P_{12} + v_2 M ), Q_{11} M, Q_{12} M, - 2 P_{12} Q_{11}, -2 P_{12} Q_{12}, \varPi _3 \right) ^T \!\!, \end{aligned}$$
(27)

with \(K = 2 c^2 + gh\), \(M = P_{11} - v_1^2\), \(\varPi _1 = Q_{21}(P_{11} - gh) \), \(\varPi _2 = Q_{22}(P_{11} + gh)\) and \(\varPi _3 = g^{-1} M (v_1^2 - c^2 )\). All eigenvalues are real since \(h >0\) and \(P_{11} = Q_{11}^2 + Q_{12}^2 \ge 0\), and there exists a full set of eigenvectors; hence the system is hyperbolic.
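A small helper that evaluates the eigenvalues (26) from a given state may be useful, for example to estimate the maximum signal speed. The sketch below is illustrative only: the function name `eigenvalues_x1` and the default value of g are assumptions made here and do not come from the paper.

```python
import math


def eigenvalues_x1(h, v1, Q, g=9.81):
    """Eigenvalues (26) in the x1 direction, ordered lambda_1 ... lambda_8.

    h  : water depth (h > 0)
    v1 : first component of the depth-averaged velocity
    Q  : ((Q11, Q12), (Q21, Q22))
    g  : gravity constant (the default is an assumption of this sketch)
    """
    # P11 = Q11^2 + Q12^2 is non-negative by construction.
    P11 = Q[0][0] ** 2 + Q[0][1] ** 2
    c = math.sqrt(g * h + 3.0 * P11)  # surface wave speed, c^2 = g h + 3 P11
    s = math.sqrt(P11)                # shear wave speed
    return [v1 - c, v1 - s, v1, v1, v1, v1 + s, v1 + c, 0.0]
```

The maximum of the absolute values of the returned list could then serve as a signal speed estimate for a time step restriction; how the schemes of this paper actually choose the time step is not specified in this section.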

2.3 The Godunov Form of Nonlinear Systems of Hyperbolic Conservation Laws

In order to define the vanishing viscosity limit of system (14)–(17) and in order to introduce the new thermodynamically compatible finite volume schemes developed later in this paper, which are exactly compatible with the vanishing viscosity limit, it is necessary to recall the Godunov form [70] of hyperbolic PDE systems. We first consider only hyperbolic systems of conservation laws in two space dimensions of the type

$$\begin{aligned} {\mathbf {q}}_t + \partial _k {\mathbf {f}}_k = 0, \end{aligned}$$
(28)

with flux tensor \({\mathbf {F}}= ({\mathbf {f}}_1,{\mathbf {f}}_2)\), that admit the following parametrization according to Godunov [70]

$$\begin{aligned} \left( L_{{\mathbf {p}}} \right) _t + \partial _k \left( (v_k L)_{{\mathbf {p}}} \right) = 0, \end{aligned}$$
(29)

with the extra conservation law of the form

$$\begin{aligned} {\mathcal {E}}_t + \partial _k F_k = 0, \end{aligned}$$
(30)

where \(F_k\) is the total energy flux in the k-th coordinate direction. Equations (29) and (30) are in the following called the Godunov form of the conservation law (28) and constitute an overdetermined system of PDEs. The system is thermodynamically compatible if the following relations hold:

$$\begin{aligned} {\mathbf {q}}= L_{{\mathbf {p}}}, \qquad {\mathbf {p}}= {\mathcal {E}}_{{\mathbf {q}}}, \qquad {\mathbf {f}}_k = (v_k L)_{{\mathbf {p}}}, \qquad F_k = {\mathbf {p}}\cdot {\mathbf {f}}_k - v_k L. \end{aligned}$$
(31)

Here, L is the so-called generating potential and \({\mathcal {E}}\) is the total energy density, which are the Legendre transforms of each other and thus satisfy

$$\begin{aligned} L = {\mathbf {p}}\cdot {\mathbf {q}}- {\mathcal {E}}, \qquad {\mathcal {E}} = {\mathbf {p}}\cdot {\mathbf {q}}- L. \end{aligned}$$
(32)

We assume L and \({\mathcal {E}}\) to be strictly convex functions of their arguments, hence the transformation matrices between \({\mathbf {p}}\) and \({\mathbf {q}}\) variables, which are the Hessian matrices of L and \({\mathcal {E}}\), respectively, verify

$$\begin{aligned} \frac{\partial {\mathbf {p}}}{\partial {\mathbf {q}}}= & {} {\mathcal {E}}_{{\mathbf {q}}{\mathbf {q}}}> 0, \qquad \frac{\partial {\mathbf {q}}}{\partial {\mathbf {p}}} = L_{{\mathbf {p}}{\mathbf {p}}} > 0, \qquad L_{{\mathbf {p}}{\mathbf {p}}} = \left( {\mathcal {E}}_{{\mathbf {q}}{\mathbf {q}}} \right) ^{-1}, \end{aligned}$$
(33)
$$\begin{aligned} L_{{\mathbf {p}}{\mathbf {p}}}= & {} L_{{\mathbf {p}}{\mathbf {p}}}^T, \qquad {\mathcal {E}}_{{\mathbf {q}}{\mathbf {q}}} = {\mathcal {E}}_{{\mathbf {q}}{\mathbf {q}}}^T. \end{aligned}$$
(34)

It is easy to check that (30) is a consequence of (29), since scalar multiplication of (29) with \({\mathbf {p}}= {\mathcal {E}}_{{\mathbf {q}}}\) yields

$$\begin{aligned} {\mathbf {p}}\cdot (L_{{\mathbf {p}}})_t + {\mathbf {p}}\cdot \partial _k {\mathbf {f}}_k= & {} {\mathcal {E}}_t + \partial _k \left( {\mathbf {p}}\cdot {\mathbf {f}}_k \right) - \left( \partial _k {\mathbf {p}}\right) \cdot {\mathbf {f}}_k \nonumber \\= & {} {\mathcal {E}}_t + \partial _k \left( {\mathbf {p}}\cdot {\mathbf {f}}_k \right) - \partial _k {\mathbf {p}}\cdot \left( v_k L \right) _{{\mathbf {p}}} \nonumber \\= & {} {\mathcal {E}}_t + \partial _k \left( {\mathbf {p}}\cdot {\mathbf {f}}_k \right) - \partial _k \left( v_k L \right) , \nonumber \\= & {} {\mathcal {E}}_t + \partial _k F_k = 0, \end{aligned}$$
(35)

which is the sought form of the total energy conservation law (30). For details on the class of symmetric hyperbolic and thermodynamically compatible (SHTC) systems and their application, see [17, 47, 70, 71, 73,74,75, 93, 102]. The shallow water subsystem for flat bottom

$$\begin{aligned} \partial _t h + \partial _k \left( hv_k \right)= & {} 0, \end{aligned}$$
(36)
$$\begin{aligned} \partial _t (hv_i) + \partial _k \left( hv_i v_k + \frac{1}{2} g h^2 \delta _{ik} \right)= & {} 0, \end{aligned}$$
(37)
$$\begin{aligned} \partial _t {\mathcal {E}} + \partial _k \left( {\mathcal {E}} v_k + \frac{1}{2} g h^2 v_k \right)= & {} 0, \end{aligned}$$
(38)

contained in (14)–(18) falls into the class of PDE (28)–(30). The corresponding potentials are

$$\begin{aligned} {\mathcal {E}} = \frac{1}{2}g q_1^2 + \frac{1}{2}\frac{q_2^2 + q_3^2}{q_1} \end{aligned}$$
(39)

and

$$\begin{aligned} L = \frac{1}{2 g} \left( p_1 + \frac{1}{2}\left( p_2^2 + p_3^2 \right) \right) ^2, \end{aligned}$$
(40)

with the vectors \({\mathbf {q}}= (h, h v_1, hv_2)^T\) and \({\mathbf {p}}= ( g h - \frac{1}{2}( v_1^2 + v_2^2 ), v_1, v_2 )^T\). The associated Hessian matrices are

$$\begin{aligned} {\mathcal {E}}_{{\mathbf {q}}{\mathbf {q}}} = \frac{1}{h} \left( \begin{array}{ccc} g h + v_1^2 + v_2^2 &{} -v_1 &{} -v_2 \\ -v_1 &{} 1 &{} 0 \\ -v_2 &{} 0 &{} 1 \end{array} \right) \end{aligned}$$
(41)

and

$$\begin{aligned} L_{{\mathbf {p}}{\mathbf {p}}} = \frac{1}{g} \left( \begin{array}{ccc} 1 &{} v_1 &{} v_2 \\ v_1 &{} gh + v_1^2 &{} v_1 v_2 \\ v_2 &{} v_1 v_2 &{} gh + v_2 ^2 \end{array} \right) . \end{aligned}$$
(42)

It is easy to see that with (40) and the flux tensor \({\mathbf {F}}= (hv_k, hv_i v_k + \frac{1}{2}g h^2 \delta _{ik})^T\) the energy fluxes (31) in (30) are

$$\begin{aligned} F_k = {\mathbf {p}}\cdot {\mathbf {f}}_k - v_k L = {\mathcal {E}} v_k + \frac{1}{2} g h^2 v_k, \end{aligned}$$
(43)

which corresponds to the energy flux in (38).
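
The relations (39)–(43) can be checked numerically at a sample state. The following is a minimal sketch, not part of the original derivation: the state values, the value of g and all function names are illustrative assumptions. It verifies the Legendre relation \(L = {\mathbf {p}}\cdot {\mathbf {q}}- {\mathcal {E}}\), that the two Hessians (41) and (42) are inverse to each other, and the energy flux identity (43) in the \(x_1\) direction.

```python
import numpy as np

g = 9.81  # gravity constant (illustrative value)

def potentials(h, v1, v2):
    """Return q, p, E, L of the flat-bottom shallow water subsystem."""
    q = np.array([h, h * v1, h * v2])
    p = np.array([g * h - 0.5 * (v1**2 + v2**2), v1, v2])
    E = 0.5 * g * q[0]**2 + 0.5 * (q[1]**2 + q[2]**2) / q[0]   # Eq. (39)
    L = (p[0] + 0.5 * (p[1]**2 + p[2]**2))**2 / (2.0 * g)       # Eq. (40)
    return q, p, E, L

def hessians(h, v1, v2):
    """Hessians E_qq, Eq. (41), and L_pp, Eq. (42)."""
    E_qq = np.array([[g * h + v1**2 + v2**2, -v1, -v2],
                     [-v1, 1.0, 0.0],
                     [-v2, 0.0, 1.0]]) / h
    L_pp = np.array([[1.0, v1, v2],
                     [v1, g * h + v1**2, v1 * v2],
                     [v2, v1 * v2, g * h + v2**2]]) / g
    return E_qq, L_pp

# Sample state (hypothetical values chosen only for the check)
h, v1, v2 = 1.3, 0.4, -0.7
q, p, E, L = potentials(h, v1, v2)
E_qq, L_pp = hessians(h, v1, v2)

# Physical flux in x_1 direction and both sides of Eq. (43)
f1 = np.array([h * v1, h * v1 * v1 + 0.5 * g * h**2, h * v2 * v1])
F_lhs = p @ f1 - v1 * L
F_rhs = E * v1 + 0.5 * g * h**2 * v1

legendre_gap = abs(L - (p @ q - E))
inverse_gap = np.max(np.abs(E_qq @ L_pp - np.eye(3)))
flux_gap = abs(F_lhs - F_rhs)
```

All three gaps should vanish up to round-off for any state with \(h > 0\).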

2.4 Thermodynamically Compatible Vanishing Viscosity Limit

In order to define weak solutions for system (14)–(17), we define an associated thermodynamically compatible viscous system that satisfies at the same time an entropy-type inequality, as well as the total energy conservation law. In this section we assume a flat bottom with \(b=0\) for simplicity, as well as \(\alpha = C_f = 0\), while a small parabolic dissipation term with dissipation coefficient \(\varepsilon > 0\) is added to the equations. In order to guarantee exact total energy conservation, a non-negative production term \(T_{ik}\) must be added to the governing PDE for \({\mathbf {Q}}\):

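$$\begin{aligned} \partial _t h + \partial _k \left( hv_k \right)= & {} \partial _m \left( \varepsilon \, \partial _m h \right) , \end{aligned}$$
(44)
$$\begin{aligned} \partial _t (hv_i) + \partial _k \left( hv_i v_k + \frac{1}{2} g h^2 \delta _{ik} + h P_{ik} \right)= & {} \partial _m \left( \varepsilon \, \partial _m (h v_i) \right) , \end{aligned}$$
(45)
$$\begin{aligned} \partial _t Q_{ik} + v_m \, \partial _m \, Q_{ik} + \left( \partial _m v_i \right) \, Q_{mk}= & {} \partial _m \left( \varepsilon \, \partial _m Q_{ik} \right) + T_{ik}, \end{aligned}$$
(46)
$$\begin{aligned} \partial _t {\mathcal {E}} + \partial _k \left( {\mathcal {E}}_1 v_k + \frac{1}{2} g h^2 v_k + {\mathcal {E}}_2 v_k + h P_{ik} v_i \right)= & {} \partial _m \left( \varepsilon \, \partial _m {\mathcal {E}} \right) , \end{aligned}$$
(47)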

with the total energy \({\mathcal {E}} = {\mathcal {E}}_1 + {\mathcal {E}}_2\), which is decomposed into the two contributions \({\mathcal {E}}_1 = hE_1 = \frac{1}{2}g h^2 + \frac{1}{2}h v_i v_i\) and \({\mathcal {E}}_2 = h E_2 = \frac{1}{2}h Q_{ik} Q_{ik}\). Here, \({\mathcal {E}}_1\) is the total energy potential of the shallow water subsystem (36)–(37) and \({\mathcal {E}}_2\) is the total energy associated with the new object \(Q_{ik}\). In what follows, we will denote the inviscid part of the total energy flux in (47) by

(48)

with the abbreviation

$$\begin{aligned} F_G = {\mathcal {E}}_1 v_i + \frac{1}{2} g h^2 v_i \end{aligned}$$
(49)

that will be used later and which corresponds to the energy flux related to the shallow water subsystem, see also (43).

The production term, \(T_{ik}\), which is needed to achieve the consistency of (44)–(46) with the total energy conservation law (47) reads

$$\begin{aligned} T_{ik} = \varepsilon \frac{Q_{ik}}{h \, \text {tr}{\mathbf {P}} } \, \partial _m q_i \, \left( {\mathcal {E}}_{q_i q_j} \right) \, \partial _m q_j. \end{aligned}$$
(50)

The consistency with physics and experimental observations requires total energy conservation, see [68, 69, 80, 100, 101] for a more detailed discussion. In (50) the vector \({\mathbf {q}}=q_i = (h, h v_i, Q_{ik})\) indicates the vector of primary state variables and \({\mathcal {E}}_{q_i q_j}\) is the Hessian matrix of the total energy potential with respect to these state variables. One can show that the Hessian matrix is positive definite for small turbulent kinetic energy \(Q_{ij} Q_{ij}\) compared to gh, see “Appendix A” for details.

Theorem 1

(Energy conservation) The energy conservation law (47) is a consequence of equations (44)–(46).

Proof

The shallow water subsystem (36)–(37) related to \({\mathcal {E}}_1\), which are the black terms in (44)–(45), directly falls into the general class of PDE (29)–(30) found by Godunov, hence the compatibility of the shallow water subsystem with the energy conservation law with energy potential \({\mathcal {E}}_1\) is obvious. It is therefore enough to consider only the remaining terms associated with the quantity \(Q_{ik}\) (red) and the viscous terms on the right hand side (blue).

We first show compatibility of the red terms: Since \(({\mathcal {E}}_2)_h = E_2 = \frac{1}{2}Q_{ik} Q_{ik} = \frac{1}{2}\text {tr}{\mathbf {P}}\), \({\mathcal {E}}_{hv_i} = v_i\) and \({\mathcal {E}}_{Q_{ik}} = \left( {\mathcal {E}}_2 \right) _{Q_{ik}} = h \left( E_2 \right) _{Q_{ik}} = h Q_{ik}\), summation of (44)–(46) with the thermodynamic dual variables, considering only the new contributions that are not yet contained in the Godunov form, yields

$$\begin{aligned}&E_2 \left( \partial _t h \!+\! \partial _m (h v_m) \right) \!+\! v_i \partial _k ( h P_{ik}) \!+\! h Q_{ik} \left( \partial _t Q_{ik} \!+\! v_m \partial _m Q_{ik} \!+\! \left( \partial _m v_i \right) Q_{mk} \right) \nonumber \\&\quad =E_2 \partial _t h + h \partial _t \left( \frac{1}{2}Q_{ik} Q_{ik}\right) + E_2 \partial _m (h v_m) + h v_m \partial _m \left( \frac{1}{2}Q_{ik} Q_{ik} \right) \nonumber \\&\qquad + v_i \partial _k ( h Q_{im} Q_{km} ) + h Q_{ik} Q_{mk} \partial _m v_i \nonumber \\&\quad =\partial _t ( h E_2) + \partial _m \left( h v_m E_2 \right) + \partial _k \left( v_i h Q_{im} Q_{km} \right) \nonumber \\&\quad =\partial _t ( h E_2) + \partial _m \left( v_m {\mathcal {E}}_2 \right) + \partial _k \left( v_i h P_{ik} \right) . \end{aligned}$$
(51)

After simple renaming of indices this proves the thermodynamic compatibility of the red terms contained in the left hand side of (44)–(46) with the red terms on the left hand side of the energy equation (47).

We now consider the right hand side (blue terms): We define a viscous flux tensor \({\mathbf {g}}_m\) as

$$\begin{aligned} {\mathbf {g}}_m = \varepsilon \, \partial _m {\mathbf {q}}\end{aligned}$$
(52)

and a production term \({\mathbf {T}}\) that is equal to zero for all PDE apart from the non-zero production term \(T_{ik}\) in the PDE for \(Q_{ik}\), see (46) and (50). Summation of the right hand sides of (44)–(46) with the thermodynamic dual variables \({\mathbf {p}}= {\mathcal {E}}_{\mathbf {q}}\) yields

$$\begin{aligned} {\mathcal {E}}_{\mathbf {q}}\cdot \partial _m {\mathbf {g}}_m + {\mathcal {E}}_{\mathbf {q}}\cdot {\mathbf {T}}= & {} {\mathcal {E}}_{\mathbf {q}}\cdot \partial _m \varepsilon \, \partial _m {\mathbf {q}}+ {\mathcal {E}}_{\mathbf {q}}\cdot {\mathbf {T}}\nonumber \\= & {} \partial _m \left( \varepsilon \, {\mathcal {E}}_{\mathbf {q}}\cdot \partial _m {\mathbf {q}}\right) - \varepsilon \, \partial _m {\mathcal {E}}_{\mathbf {q}}\cdot \partial _m {\mathbf {q}}+ {\mathcal {E}}_{\mathbf {q}}\cdot {\mathbf {T}}\nonumber \\= & {} \partial _m \varepsilon \, \partial _m {\mathcal {E}}- \varepsilon \, \left( {\mathcal {E}}_{{\mathbf {q}}{\mathbf {q}}} \partial _m {\mathbf {q}}\right) \cdot \partial _m {\mathbf {q}}+ {\mathcal {E}}_{\mathbf {q}}\cdot {\mathbf {T}}\nonumber \\= & {} \partial _m \varepsilon \, \partial _m {\mathcal {E}}- \varepsilon \, \partial _m q_i \left( {\mathcal {E}}_{q_i q_j} \right) \partial _m q_j + {\mathcal {E}}_{\mathbf {q}}\cdot {\mathbf {T}}\nonumber \\= & {} \partial _m \varepsilon \, \partial _m {\mathcal {E}}, \end{aligned}$$
(53)

where \(- \varepsilon \, \partial _m q_i \left( {\mathcal {E}}_{q_i q_j} \right) \partial _m q_j + {\mathcal {E}}_{\mathbf {q}}\cdot {\mathbf {T}}= 0\) since \({\mathcal {E}}_{Q_{ik}} = h Q_{ik}\) and

$$\begin{aligned} {\mathcal {E}}_{\mathbf {q}}\cdot {\mathbf {T}}= h Q_{ik} T_{ik} = \varepsilon \frac{h Q_{ik} Q_{ik} }{h \text {tr}{\mathbf {P}}} \partial _m q_i \left( {\mathcal {E}}_{q_i q_j} \right) \partial _m q_j = \varepsilon \partial _m q_i \left( {\mathcal {E}}_{q_i q_j} \right) \partial _m q_j. \end{aligned}$$
(54)

The combination of the right hand sides of (44)–(46) therefore yields the right hand side of (47), which completes the proof. \(\square \)
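
The cancellation in (54) is purely algebraic and can be illustrated with a few lines of code. The sketch below is only an illustration under stated assumptions: the matrix \(Q_{ik}\) and the values of \(\varepsilon \) and h are arbitrary sample values, and the scalar `D` is a hypothetical stand-in for the quadratic form \(\partial _m q_i \, {\mathcal {E}}_{q_i q_j} \, \partial _m q_j\), which is not evaluated here.

```python
import numpy as np

eps, h = 1.0e-3, 1.2                 # sample dissipation coefficient and depth
Q = np.array([[0.9, 0.2],
              [-0.3, 0.6]])          # sample Q_ik (illustrative values)
P = Q @ Q.T                          # P_ik = Q_im Q_km
D = 0.37                             # assumed value of d_m q_i E_{q_i q_j} d_m q_j

# Production term, Eq. (50); note tr P = Q_ik Q_ik
T = eps * Q / (h * np.trace(P)) * D

# Contribution of T to the energy balance, Eq. (54): E_Q . T = h Q_ik T_ik
energy_production = h * np.sum(Q * T)

# Energy removed by the parabolic terms, last non-flux term in Eq. (53)
energy_dissipation = eps * D

balance_gap = abs(energy_production - energy_dissipation)
```

The gap is zero up to round-off for any nonzero \(Q_{ik}\), which is exactly the statement that the production term returns the dissipated energy to \({\mathcal {E}}_2\).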

Theorem 2

(Entropy-type inequality) A direct consequence of the PDE (46) without the parabolic dissipative term, i.e. of the equation

$$\begin{aligned} \partial _t Q_{ik} + v_m \, \partial _m \, Q_{ik} + \left( \partial _m v_i \right) \, Q_{mk} = T_{ik}, \end{aligned}$$
(55)

is an entropy-type inequality

$$\begin{aligned} \partial _t |Q| + \partial _m \left( v_m |Q| \right) = \varepsilon \frac{|Q| \delta _{kk}}{h \, \text {tr}{\mathbf {P}} } \, \partial _m q_i \, \left( {\mathcal {E}}_{q_i q_j} \right) \, \partial _m q_j \ge 0, \end{aligned}$$
(56)

with \(\left\{ i,j \right\} \in \left\{ 1, 2, 3\right\} \).

Proof

To see that the entropy inequality is a direct consequence of (55) we apply the Jacobi identity (20) to (55), which leads to

$$\begin{aligned} \partial _t |Q| + |Q| Q_{ki}^{-1} v_m \, \partial _m \, Q_{ik} + |Q| Q_{ki}^{-1} \left( \partial _m v_i \right) \, Q_{mk} = |Q| Q_{ki}^{-1} \, T_{ik}, \end{aligned}$$
(57)

from which one obtains

$$\begin{aligned} \partial _t |Q| + \partial _m \left( v_m |Q| \right) = |Q| Q_{ki}^{-1} \, T_{ik}. \end{aligned}$$
(58)

With

$$\begin{aligned} |Q| Q_{ki}^{-1} \, T_{ik} = \varepsilon \frac{|Q| Q_{ki}^{-1} Q_{ik}}{h \, \text {tr}{\mathbf {P}} } \, \partial _m q_i {\mathcal {E}}_{q_i q_j} \partial _m q_j = \varepsilon \frac{|Q| \delta _{kk}}{h \, \text {tr}{\mathbf {P}} } \, \partial _m q_i {\mathcal {E}}_{q_i q_j} \partial _m q_j \ge 0 \end{aligned}$$
(59)

one obtains the following entropy-type inequality associated with system (44)–(46):

$$\begin{aligned} \partial _t |Q| + \partial _m \left( v_m |Q| \right) = \varepsilon \frac{|Q| \delta _{kk}}{h \, \text {tr}{\mathbf {P}} } \, \partial _m q_i \, {\mathcal {E}}_{q_i q_j} \, \partial _m q_j \ge 0, \end{aligned}$$
(60)

where \(\left\{ i,j \right\} \in \left\{ 1, 2, 3\right\} \). \(\square \)
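
The two algebraic ingredients of this proof, namely the Jacobi formula for the time derivative of a determinant and the contraction \(Q_{ki}^{-1} Q_{ik} = \delta _{kk}\), can be checked numerically. The following sketch uses a hypothetical \(2 \times 2\) sample matrix and an arbitrary direction `dQ` standing in for \(\partial _t Q_{ik}\); it is an illustration, not part of the proof.

```python
import numpy as np

Q = np.array([[1.2, 0.3],
              [-0.1, 0.8]])          # invertible sample matrix (assumption)
dQ = np.array([[0.5, -0.2],
               [0.7, 0.1]])          # arbitrary rate of change of Q

# Jacobi formula: d|Q|/dt = |Q| Q^{-1}_{ki} dQ_{ik}/dt = |Q| tr(Q^{-1} dQ)
jacobi = np.linalg.det(Q) * np.trace(np.linalg.inv(Q) @ dQ)

# Central finite difference of the determinant along the direction dQ
tau = 1.0e-6
finite_diff = (np.linalg.det(Q + tau * dQ)
               - np.linalg.det(Q - tau * dQ)) / (2.0 * tau)

# Contraction used in Eq. (59): Q^{-1}_{ki} Q_{ik} = delta_kk (= 2 here)
contraction = np.trace(np.linalg.inv(Q) @ Q)
```

Since the contraction equals the matrix dimension, the sign of the right hand side of (60) is determined by \(|Q|\), \(\text {tr}{\mathbf {P}}\) and the quadratic form with the Hessian.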

Throughout this paper, we will consider entropy solutions of (14)–(18) that satisfy (44)–(47) in the limit \(\varepsilon \rightarrow 0\). As shown later, the thermodynamically compatible scheme proposed in Sect. 3 of this paper is provably compatible with this vanishing viscosity limit, since it mimics the above viscous system exactly at the semi-discrete level. In the section containing the numerical results, we provide numerical evidence that also the high order ADER-DG schemes proposed in Sect. 4 of this paper as well as the numerical scheme already developed in [69] are compatible with this vanishing viscosity limit.

The meaning of Theorem 2 is the following. The evolution equation for the tensor \({\mathbf {P}}\) (or for \({\mathbf {Q}}\)) is responsible for the vorticity transport, dissipation and production. While the transport and dissipation terms are clearly identified in previous works, the same is not true of the production terms. In the one-dimensional case the energy equation is equivalent to the ‘entropy’ equation and the ‘entropy’ (or vorticity) production is a consequence of the energy conservation. In the multi-dimensional case, the situation is completely different because the governing equations cannot be written in conservative form (the proof is given in [69]). A definition of weak solutions for such a non-conservative hyperbolic system that is compatible with the entropy production is therefore needed. One such definition was proposed in [69]. Theorem 2 can be seen as a compatible alternative approach to the definition of weak solutions: the ‘viscous’ terms, which play a major role in shocks, guarantee the vorticity production. Moreover, the ‘viscous’ terms are consistent with the energy conservation law (Theorem 1), which is a necessary condition for all physically reasonable mathematical models.

3 Thermodynamically Compatible Finite Volume Scheme

In order to derive our new thermodynamically compatible finite volume scheme for system (14)–(18) that mimics the structure of the viscous system (44)–(47) exactly at the semi-discrete level, we proceed in a similar way as on the continuous level. First, a compatible scheme for the shallow water subsystem (36)–(38) is derived, based on a semi-discrete version of the Godunov form of (29)–(30). This corresponds to the discretization of the black terms in (44)–(47). Then, numerical viscosity together with an appropriate entropy production term is added to the scheme, which corresponds to the discrete analogue of the blue terms in (44)–(47). Last but not least the discretization of the red terms in (44)–(47) is discussed. To keep the presentation simple, we restrict ourselves to the one-dimensional case, but the generalization to multiple space dimensions is straightforward. To avoid confusion in the notation throughout this section we will use the lower case subscripts ijklm for tensor indices and the lower case superscript r for the spatial discretization index. We emphasize again that the scheme proposed in this section is only valid for the flat bottom case with \(b=0\).

3.1 Compatible Schemes Without Dissipation Applied to the Godunov Form

A semi-discrete conservative finite volume scheme for system (28) in one space dimension based on the spatial control volume \(\varOmega ^r = [x^{r-\frac{1}{2}}, x^{r+\frac{1}{2}}]\) reads

$$\begin{aligned} \frac{d}{dt} {\mathbf {q}}^r = - \frac{{\mathbf {f}}^{r+\frac{1}{2}} - {\mathbf {f}}^{r-\frac{1}{2}}}{\varDelta x}. \end{aligned}$$
(61)

By adding and subtracting \({\mathbf {f}}^r = {\mathbf {f}}({\mathbf {q}}^r)\) we get

$$\begin{aligned} \frac{d}{dt} {\mathbf {q}}^r = - \frac{ ({\mathbf {f}}^{r+\frac{1}{2}} - {\mathbf {f}}^r) - ( {\mathbf {f}}^{r-\frac{1}{2}} - {\mathbf {f}}^r) }{\varDelta x}. \end{aligned}$$
(62)

We now try to obtain a discrete form of the total energy conservation law (30) also as a consequence of the discrete equations (62). For this purpose, we multiply (62) with \({\mathbf {p}}^r= {\mathcal {E}}_{{\mathbf {q}}}({\mathbf {q}}^r)\) from the left and get

$$\begin{aligned} {\mathbf {p}}^r \cdot \frac{d}{dt} {\mathbf {q}}^r = \frac{d}{dt} {\mathcal {E}}^r = - {\mathbf {p}}^r \cdot \frac{ ({\mathbf {f}}^{r+\frac{1}{2}} - {\mathbf {f}}^r) + ( {\mathbf {f}}^r - {\mathbf {f}}^{r-\frac{1}{2}} ) }{\varDelta x}:= - \frac{1}{\varDelta x} \left( D_{{\mathcal {E}}}^{r+\frac{1}{2},-} + D_{{\mathcal {E}}}^{r-\frac{1}{2},+} \right) ,\nonumber \\ \end{aligned}$$
(63)

where the fluctuations \(D_{{\mathcal {E}}}^{r+\frac{1}{2},-} = {\mathbf {p}}^r \cdot ( {\mathbf {f}}^{r+\frac{1}{2}} - {\mathbf {f}}^r ) \) and \(D_{{\mathcal {E}}}^{r-\frac{1}{2},+} = {\mathbf {p}}^{r} \cdot ({\mathbf {f}}^{r} - {\mathbf {f}}^{r-\frac{1}{2}} ) \) have been introduced for convenience. Obviously, \(D_{{\mathcal {E}}}^{r+\frac{1}{2},+} = {\mathbf {p}}^{r+1} \cdot ({\mathbf {f}}^{r+1} - {\mathbf {f}}^{r+\frac{1}{2}} ) \). We now compute the temporal rate of change of the sum of the total energy in cell r and \(r+1\), which yields

$$\begin{aligned} \varDelta x \frac{d}{dt} \left( {\mathcal {E}}^r + {\mathcal {E}}^{r+1} \right) = - \left( D_{{\mathcal {E}}}^{r-\frac{1}{2},+} + D_{{\mathcal {E}}}^{r+\frac{1}{2},-} + D_{{\mathcal {E}}}^{r+\frac{1}{2},+} + D_{{\mathcal {E}}}^{r+\frac{3}{2},-} \right) . \end{aligned}$$
(64)

It is clear that in order to obtain a flux conservative form of the discrete energy conservation equation we must require that the contribution of the left and the right fluctuation at the interface \(r+\frac{1}{2}\) is a flux difference, i.e.

$$\begin{aligned} D_{{\mathcal {E}}}^{r+\frac{1}{2},-} + D_{{\mathcal {E}}}^{r+\frac{1}{2},+} := F^{r+1} - F^{r}, \end{aligned}$$
(65)

where \(F^r\) must be a consistent approximation of the total energy flux F. Inserting the definitions of the fluctuations into (65) yields

$$\begin{aligned}&{\mathbf {p}}^r \cdot ( {\mathbf {f}}^{r+\frac{1}{2}} - {\mathbf {f}}^r ) + {\mathbf {p}}^{r+1} \cdot ({\mathbf {f}}^{r+1} - {\mathbf {f}}^{r+\frac{1}{2}} ) = \nonumber \\&\quad -{\mathbf {f}}^{r+\frac{1}{2}} \cdot \left( {\mathbf {p}}^{r+1} - {\mathbf {p}}^r \right) + {\mathbf {p}}^{r+1} \cdot {\mathbf {f}}^{r+1} - {\mathbf {p}}^{r} \cdot {\mathbf {f}}^{r} := F^{r+1} - F^r. \end{aligned}$$
(66)

Using the parametrization (29) and the associated relations (31) we get

$$\begin{aligned}&- (vL)_{{\mathbf {p}}}^{r+\frac{1}{2}} \cdot \left( {\mathbf {p}}^{r+1} - {\mathbf {p}}^r \right) + {\mathbf {p}}^{r+1} \cdot {\mathbf {f}}^{r+1} - {\mathbf {p}}^{r} \cdot {\mathbf {f}}^{r} := \nonumber \\&\quad {\mathbf {p}}^{r+1} \cdot {\mathbf {f}}^{r+1} - (vL)^{r+1} - {\mathbf {p}}^{r} \cdot {\mathbf {f}}^{r} + (vL)^{r}, \end{aligned}$$
(67)

with \(F^r = {\mathbf {p}}^r \cdot {\mathbf {f}}^r - (vL)^r\) and thus the sought relation that the numerical flux \({\mathbf {f}}^{r+\frac{1}{2}}\) must satisfy is

$$\begin{aligned} {\mathbf {f}}^{r+\frac{1}{2}} \cdot \left( {\mathbf {p}}^{r+1} - {\mathbf {p}}^r \right) = (vL)_{{\mathbf {p}}}^{r+\frac{1}{2}} \cdot \left( {\mathbf {p}}^{r+1} - {\mathbf {p}}^r \right) = (vL)^{r+1} - (vL)^r. \end{aligned}$$
(68)

The condition (68) above is like a Roe-type property, but only for the vector \({\mathbf {f}}^{r+\frac{1}{2}}\) instead of an entire Roe matrix. Based on the ideas of path-conservative schemes of Castro and Parés [22, 89] we thus define the numerical flux via a path-integral in phase-space, since by the fundamental theorem of calculus we have

$$\begin{aligned} (vL)^{r+1} - (vL)^r = \int \limits _{{\mathbf {p}}^r}^{{\mathbf {p}}^{r+1}} (vL)_{{\mathbf {p}}} \cdot d{\mathbf {p}}= \int \limits _{0}^{1} (vL)_{{\mathbf {p}}} \cdot \frac{\partial \varvec{\psi }}{\partial s} ds \end{aligned}$$
(69)

for any path \(\varvec{\psi }=\varvec{\psi }(s)\) connecting \({\mathbf {p}}^r\) with \({\mathbf {p}}^{r+1}\), see also the pioneering work of Tadmor [108] for a similar construction of an entropy-conservative flux with the aid of a path integral. The last equality in (69) corresponds to a concrete parametrization of the chosen integration path, using integration by substitution and a dimensionless integration parameter s in the range \(0 \le s \le 1\). In the following we choose two different parametrizations based on the simple straight-line segment path. Note that the choice of the path is arbitrary, hence we are free to choose a path that is convenient for our purposes.

  1.

    Segment path in the \({\mathbf {p}}\) variables (\({\mathbf {p}}\)-scheme). In the \({\mathbf {p}}\)-scheme, the path between \({\mathbf {p}}^r\) and \({\mathbf {p}}^{r+1}\) is directly given by the straight line segment

    $$\begin{aligned} \varvec{\psi }(s) = {\mathbf {p}}^r + s \left( {\mathbf {p}}^{r+1} - {\mathbf {p}}^r \right) , \qquad 0 \le s \le 1. \end{aligned}$$
    (70)

    We thus obtain

    $$\begin{aligned} \frac{\partial \varvec{\psi }}{\partial s} = {\mathbf {p}}^{r+1} - {\mathbf {p}}^r, \end{aligned}$$
    (71)

    and therefore relation (69) results in

    $$\begin{aligned} (vL)^{r+1} - (vL)^r= & {} \int \limits _{{\mathbf {p}}^r}^{{\mathbf {p}}^{r+1}} (vL)_{{\mathbf {p}}} \cdot d{\mathbf {p}}= \int \limits _{0}^{1} {\mathbf {f}}(\varvec{\psi }(s)) \cdot \frac{\partial \varvec{\psi }}{\partial s} ds \nonumber \\= & {} \left( \int \limits _{0}^{1} {\mathbf {f}}(\varvec{\psi }(s))^T ds \right) \cdot \left( {\mathbf {p}}^{r+1} - {\mathbf {p}}^r \right) . \end{aligned}$$
    (72)

    By comparison with (68) we find that the thermodynamically compatible numerical flux of the \({\mathbf {p}}\)-scheme is therefore given by

    $$\begin{aligned} {\mathbf {f}}^{r+\frac{1}{2}}_{\mathbf {p}}= \int \limits _{0}^{1} {\mathbf {f}}(\varvec{\psi }(s)) ds, \end{aligned}$$
    (73)

    which by construction satisfies \( \left( {\mathbf {p}}^{r+1} - {\mathbf {p}}^r \right) \cdot {\mathbf {f}}^{r+\frac{1}{2}}_{\mathbf {p}}= (vL)^{r+1} - (vL)^r\) and thus condition (68). The problem with the \({\mathbf {p}}\)-scheme is that it requires \({\mathbf {f}}\) in terms of the \({\mathbf {p}}\) variables, which in general is very cumbersome, since \({\mathbf {f}}\) is usually more easily expressed in terms of \({\mathbf {q}}\) than in terms of \({\mathbf {p}}\).

  2.

    Segment in the \({\mathbf {q}}\) variables (\({\mathbf {q}}\)-scheme). To avoid the above-mentioned problem, in the \({\mathbf {q}}\)-scheme the path between \({\mathbf {p}}^r\) and \({\mathbf {p}}^{r+1}\) is now defined in terms of a straight line segment in the \({\mathbf {q}}\) variables, which means in terms of \({\mathbf {p}}\) variables the path is in general not a segment. We set

    $$\begin{aligned} \tilde{\varvec{\psi }}(s) = {\mathbf {p}}\left( {\mathbf {q}}^r + s \left( {\mathbf {q}}^{r+1} - {\mathbf {q}}^r \right) \right) , \qquad 0 \le s \le 1. \end{aligned}$$
    (74)

    Here, we only use the notation \(\tilde{\varvec{\psi }}(s)\) to avoid confusion with the path used before in the \({\mathbf {p}}\)-scheme. We therefore have

    $$\begin{aligned} \frac{\partial \tilde{\varvec{\psi }}}{\partial s} = \frac{\partial {\mathbf {p}}}{\partial {\mathbf {q}}} \cdot \left( {\mathbf {q}}^{r+1} - {\mathbf {q}}^r \right) = {\mathcal {E}}_{{\mathbf {q}}{\mathbf {q}}} \cdot \left( {\mathbf {q}}^{r+1} - {\mathbf {q}}^r \right) , \end{aligned}$$
    (75)

    and thus condition (69) results in

    $$\begin{aligned} (vL)^{r+1} - (vL)^r= & {} \int \limits _{{\mathbf {p}}^r}^{{\mathbf {p}}^{r+1}} (vL)_{{\mathbf {p}}} \cdot d{\mathbf {p}}= \int \limits _{0}^{1} {\mathbf {f}}(\tilde{\varvec{\psi }}(s)) \cdot \frac{\partial \tilde{\varvec{\psi }}}{\partial s} ds \nonumber \\= & {} \left( \int \limits _{0}^{1} {\mathcal {E}}_{{\mathbf {q}}{\mathbf {q}}} {\mathbf {f}}(\tilde{\varvec{\psi }}(s))^T ds \right) \cdot \left( {\mathbf {q}}^{r+1} - {\mathbf {q}}^r \right) . \end{aligned}$$
    (76)

    If we now check again condition (68) we still need to transform the jump in \({\mathbf {p}}\) variables into a jump in \({\mathbf {q}}\) variables. For that purpose, we define a Roe-type matrix \({\tilde{L}}_{{\mathbf {p}}{\mathbf {p}}}^{r+\frac{1}{2}}\) that satisfies the Roe property

    $$\begin{aligned} {\tilde{L}}_{{\mathbf {p}}{\mathbf {p}}}^{r+\frac{1}{2}} \cdot \left( {\mathbf {p}}^{r+1} - {\mathbf {p}}^r \right) = {\mathbf {q}}^{r+1} - {\mathbf {q}}^r, \end{aligned}$$
    (77)

    which can be easily achieved by construction by means of a path integral. In practice we first define the inverse of the Roe matrix \( {\tilde{L}}_{{\mathbf {p}}{\mathbf {p}}}^{r+\frac{1}{2}} \) as

    $$\begin{aligned} \tilde{{\mathcal {E}}}_{{\mathbf {q}}{\mathbf {q}}}^{r+\frac{1}{2}} = \int \limits _0^1 {\mathcal {E}}_{{\mathbf {q}}{\mathbf {q}}}\left( \tilde{\varvec{\psi }}(s)\right) ds, \end{aligned}$$
    (78)

    which is again a Roe matrix, but which is easy to compute, and which can be checked to satisfy

    $$\begin{aligned} \tilde{{\mathcal {E}}}_{{\mathbf {q}}{\mathbf {q}}}^{r+\frac{1}{2}} \cdot ({\mathbf {q}}^{r+1}-{\mathbf {q}}^r ) = {\mathbf {p}}^{r+1} - {\mathbf {p}}^r \end{aligned}$$
    (79)

    by construction. We thus obtain

    $$\begin{aligned} {\tilde{L}}_{{\mathbf {p}}{\mathbf {p}}}^{r+\frac{1}{2}} = \left( \tilde{{\mathcal {E}}}_{{\mathbf {q}}{\mathbf {q}}}^{r+\frac{1}{2}} \right) ^{-1} = \left( \int \limits _0^1 {\mathcal {E}}_{{\mathbf {q}}{\mathbf {q}}}\left( \tilde{\varvec{\psi }}(s)\right) ds \right) ^{-1}, \end{aligned}$$
    (80)

    which finally yields the desired thermodynamically compatible numerical flux of the \({\mathbf {q}}\) scheme as

    $$\begin{aligned} {\mathbf {f}}^{r+\frac{1}{2}}_{\mathbf {q}}= {\tilde{L}}_{{\mathbf {p}}{\mathbf {p}}}^{r+\frac{1}{2}} \int \limits _{0}^{1} {\mathcal {E}}_{{\mathbf {q}}{\mathbf {q}}} {\mathbf {f}}(\tilde{\varvec{\psi }}(s)) ds = \left( \int \limits _0^1 {\mathcal {E}}_{{\mathbf {q}}{\mathbf {q}}}\left( \tilde{\varvec{\psi }}(s)\right) ds \right) ^{\!-1} \! \left( \int \limits _{0}^{1} {\mathcal {E}}_{{\mathbf {q}}{\mathbf {q}}} {\mathbf {f}}(\tilde{\varvec{\psi }}(s)) ds \right) . \end{aligned}$$
    (81)

    Note that if \({\mathbf {f}}\) is only easily known in terms of \({\mathbf {q}}\) variables, one can directly plug the straight segment path in terms of the \({\mathbf {q}}\) variables into the function \({\mathbf {f}}\) and into the Hessian \({\mathcal {E}}_{{\mathbf {q}}{\mathbf {q}}}\), without needing to compute \({\mathbf {p}}({\mathbf {q}})\) at all!

In practical calculations, we approximate all path integrals by numerical quadrature, which can be done up to any desired level of accuracy, see also [42, 45, 48, 49] where this strategy has already been successfully used.
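
As an illustration of the \({\mathbf {q}}\)-scheme, the following minimal sketch evaluates the flux (81) for the one-dimensional shallow water subsystem with \({\mathbf {q}}= (h, hv)^T\), using Gauss–Legendre quadrature for both path integrals. The number of quadrature points, the sample states and all function names are assumptions made for this example; it is not the implementation used for the results of this paper. The sketch then checks the Roe property (79) and the compatibility condition (68).

```python
import numpy as np

g = 9.81  # gravity constant (illustrative value)

# Gauss-Legendre nodes and weights mapped from [-1, 1] to [0, 1]
_x, _w = np.polynomial.legendre.leggauss(8)
NODES, WEIGHTS = 0.5 * (_x + 1.0), 0.5 * _w

def p_of_q(q):
    """Dual variables p = E_q for q = (h, h v)."""
    h, v = q[0], q[1] / q[0]
    return np.array([g * h - 0.5 * v**2, v])

def flux(q):
    """Physical flux f(q) of the 1D shallow water equations."""
    h, v = q[0], q[1] / q[0]
    return np.array([h * v, h * v**2 + 0.5 * g * h**2])

def hessian(q):
    """Hessian E_qq, the 1D analogue of Eq. (41)."""
    h, v = q[0], q[1] / q[0]
    return np.array([[g * h + v**2, -v], [-v, 1.0]]) / h

def vL(q):
    """The potential v L with L = g h^2 / 2."""
    h, v = q[0], q[1] / q[0]
    return v * 0.5 * g * h**2

def q_scheme_flux(qL, qR):
    """Flux (81): straight segment in q, integrals by quadrature."""
    A = np.zeros((2, 2))   # approximates the Roe matrix (78)
    b = np.zeros(2)        # approximates the integral of E_qq f
    for s, w in zip(NODES, WEIGHTS):
        q = qL + s * (qR - qL)
        H = hessian(q)
        A += w * H
        b += w * (H @ flux(q))
    return np.linalg.solve(A, b), A

qL, qR = np.array([1.0, 0.3]), np.array([1.5, -0.2])   # sample states
f_half, roe = q_scheme_flux(qL, qR)
dp = p_of_q(qR) - p_of_q(qL)
roe_gap = np.max(np.abs(roe @ (qR - qL) - dp))       # property (79)
compat_gap = abs(f_half @ dp - (vL(qR) - vL(qL)))    # condition (68)
```

Both gaps are limited only by the quadrature error, which for smooth integrands and a moderate jump is at the level of round-off with a handful of Gauss points. Note that \({\mathbf {p}}({\mathbf {q}})\) is evaluated here only for the check, not inside the flux itself.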

3.2 Compatible Scheme with Dissipation Applied to the Godunov Form

The above schemes are compatible with the parametrization (29) of the system (28) and also satisfy the extra conservation law (30). However, to obtain a dissipative scheme, we still need to add a compatible numerical dissipation. For that purpose we write a dissipative scheme for (28) of the form

$$\begin{aligned} \frac{d}{dt} {\mathbf {q}}^r + \frac{{\mathbf {f}}^{r+\frac{1}{2}} - {\mathbf {f}}^{r-\frac{1}{2}}}{\varDelta x} = \frac{{\mathbf {g}}^{r+\frac{1}{2}} - {\mathbf {g}}^{r-\frac{1}{2}}}{\varDelta x} + {\mathbf {T}}^r, \end{aligned}$$
(82)

with the compatible flux \({\mathbf {f}}^{r+\frac{1}{2}}\) as defined before and the additional dissipative numerical flux \({\mathbf {g}}^{r+\frac{1}{2}}\) given by

$$\begin{aligned} {\mathbf {g}}^{r+\frac{1}{2}} = \mu ^{r+\frac{1}{2}} \frac{{\mathbf {q}}^{r+1}-{\mathbf {q}}^r}{\varDelta x} = \mu ^{r+\frac{1}{2}} \frac{\varDelta {\mathbf {q}}^{r+\frac{1}{2}}}{\varDelta x}, \end{aligned}$$
(83)

where \(\mu ^{r+\frac{1}{2}} \ge 0\) is a scalar numerical dissipation. Henceforth we simply set

$$\begin{aligned} \mu ^{r+\frac{1}{2}} = \frac{1}{2}\left( 1 - \varphi ^{r+\frac{1}{2}} \right) \varDelta x \, s_{\max }^{r+\frac{1}{2}} \ge 0, \end{aligned}$$
(84)

with \(s_{\max }^{r+\frac{1}{2}}\) an estimate for the maximum signal speed at the interface. For \(\varphi ^{r+\frac{1}{2}}=0\) this choice corresponds to a classical first order Rusanov-type scheme, see [107, 115]. To reduce numerical dissipation in smooth regions, a TVD minbee flux limiter \(\varphi ^{r+\frac{1}{2}}\) is employed, which is defined as follows, see the second order TVD SLIC scheme described by Toro in [115],

$$\begin{aligned} \varphi ^{r+\frac{1}{2}} = \min \left( \varphi ^{r+\frac{1}{2}}_{-}, \varphi ^{r+\frac{1}{2}}_+ \right) , \quad \text {with} \quad \varphi ^{r+\frac{1}{2}}_{\pm } = \max \left( 0, \min \left( 1, \rho ^{r+\frac{1}{2}}_{\pm } \right) \right) , \end{aligned}$$
(85)

with the ratios of subsequent slopes of the total energy potential defined as

$$\begin{aligned} \rho ^{r+\frac{1}{2}}_{-} = \frac{{\mathcal {E}}^{r} - {\mathcal {E}}^{r-1}}{ {\mathcal {E}}^{r+1} - {\mathcal {E}}^{r} }, \qquad \text {and} \qquad \rho ^{r+\frac{1}{2}}_{+} = \frac{{\mathcal {E}}^{r+2} - {\mathcal {E}}^{r+1}}{ {\mathcal {E}}^{r+1} - {\mathcal {E}}^{r} }. \end{aligned}$$
(86)
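
The limiter (85)–(86) and the viscosity (84) translate directly into code. The sketch below is a possible realization under two assumptions that are not specified in the text: the cell energies are stored in an array `E` with at least one neighbour on the left and two on the right of cell r, and a vanishing central slope \({\mathcal {E}}^{r+1} - {\mathcal {E}}^{r} = 0\) is treated as fully limited (\(\varphi = 0\)). All names are illustrative.

```python
import numpy as np

def minbee_phi(E, r):
    """Limiter (85)-(86) at interface r+1/2 from energies E[r-1], ..., E[r+2]."""
    dE = E[r + 1] - E[r]
    if dE == 0.0:
        # Assumption: the degenerate case of zero central slope is not
        # covered by the text; here it falls back to phi = 0.
        return 0.0
    rho_minus = (E[r] - E[r - 1]) / dE
    rho_plus = (E[r + 2] - E[r + 1]) / dE
    phi_minus = max(0.0, min(1.0, rho_minus))
    phi_plus = max(0.0, min(1.0, rho_plus))
    return min(phi_minus, phi_plus)

def viscosity(E, r, dx, s_max):
    """Scalar numerical dissipation (84) at interface r+1/2."""
    return 0.5 * (1.0 - minbee_phi(E, r)) * dx * s_max

# Equal slopes: phi = 1, so the numerical viscosity vanishes
E_smooth = np.array([1.0, 2.0, 3.0, 4.0])
# Slope changes sign on the right: phi = 0, Rusanov-type viscosity
E_extremum = np.array([1.0, 2.0, 3.0, 2.5])

mu_smooth = viscosity(E_smooth, 1, dx=0.1, s_max=3.0)
mu_extremum = viscosity(E_extremum, 1, dx=0.1, s_max=3.0)
```

In the second case the viscosity reduces to \(\frac{1}{2}\varDelta x \, s_{\max }\), i.e. the first order Rusanov-type value.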

Note that in regions of \(\varphi ^{r+\frac{1}{2}}=1\) the scheme exhibits no numerical viscosity at all. The production term \({\mathbf {T}}^r\) will be defined later. Computing the dot product of (82) with \({\mathbf {p}}^r\) yields

$$\begin{aligned} \frac{d}{dt} {\mathcal {E}}^r + \frac{F^{r+\frac{1}{2}} - F^{r-\frac{1}{2}}}{\varDelta x} = {\mathbf {p}}^r \cdot \frac{{\mathbf {g}}^{r+\frac{1}{2}} - {\mathbf {g}}^{r-\frac{1}{2}}}{\varDelta x} + {\mathbf {p}}^r \cdot {\mathbf {T}}^r, \end{aligned}$$
(87)

where we denote the inviscid numerical flux for the total energy by \(F^{r+\frac{1}{2}}\). Since the dissipation-free scheme has already been shown to be compatible with the energy conservation law, in what follows, the explicit expression for \(F^{r+\frac{1}{2}}\) is not needed, but is given here for completeness:

$$\begin{aligned} F^{r+\frac{1}{2}} = D_{{\mathcal {E}}}^{r+\frac{1}{2},-} + F^r = {\mathbf {p}}^r \cdot ( {\mathbf {f}}^{r+\frac{1}{2}} - {\mathbf {f}}^r ) + F^r. \end{aligned}$$
(88)

We now rewrite the right hand side of (87) as

$$\begin{aligned}&{\mathbf {p}}^r \cdot \frac{{\mathbf {g}}^{r+\frac{1}{2}} - {\mathbf {g}}^{r-\frac{1}{2}}}{\varDelta x} + {\mathbf {p}}^r \cdot {\mathbf {T}}^r \nonumber \\&\quad ={\mathbf {p}}^r \cdot {\mathbf {T}}^r + \frac{1}{\varDelta x} \left( \frac{1}{2}{\mathbf {p}}^r \cdot {\mathbf {g}}^{r+\frac{1}{2}} + \frac{1}{2}{\mathbf {p}}^{r+1} \cdot {\mathbf {g}}^{r+\frac{1}{2}} + \frac{1}{2}{\mathbf {p}}^r \cdot {\mathbf {g}}^{r+\frac{1}{2}} - \frac{1}{2}{\mathbf {p}}^{r+1} \cdot {\mathbf {g}}^{r+\frac{1}{2}} \right) \nonumber \\&\qquad -\frac{1}{\varDelta x} \left( \frac{1}{2}{\mathbf {p}}^r \cdot {\mathbf {g}}^{r-\frac{1}{2}} + \frac{1}{2}{\mathbf {p}}^{r-1} \cdot {\mathbf {g}}^{r-\frac{1}{2}} + \frac{1}{2}{\mathbf {p}}^r \cdot {\mathbf {g}}^{r-\frac{1}{2}} - \frac{1}{2}{\mathbf {p}}^{r-1} \cdot {\mathbf {g}}^{r-\frac{1}{2}} \right) \nonumber \\&\quad ={\mathbf {p}}^r \cdot {\mathbf {T}}^r + \frac{1}{2}\frac{{\mathbf {p}}^{r+1} + {\mathbf {p}}^{r}}{\varDelta x} \cdot {\mathbf {g}}^{r+\frac{1}{2}} - \frac{1}{2}\frac{ {\mathbf {p}}^r + {\mathbf {p}}^{r-1}}{\varDelta x} \cdot {\mathbf {g}}^{r-\frac{1}{2}} \nonumber \\&\qquad - \frac{1}{2}\frac{{\mathbf {p}}^{r+1} - {\mathbf {p}}^{r}}{\varDelta x} \cdot {\mathbf {g}}^{r+\frac{1}{2}} - \frac{1}{2}\frac{ {\mathbf {p}}^r - {\mathbf {p}}^{r-1}}{\varDelta x} \cdot {\mathbf {g}}^{r-\frac{1}{2}} \nonumber \\&\quad ={\mathbf {p}}^r \cdot {\mathbf {T}}^r + \frac{1}{2}\frac{{\mathbf {p}}^{r+1} + {\mathbf {p}}^{r}}{\varDelta x} \cdot \mu ^{r+\frac{1}{2}} \frac{{\mathbf {q}}^{r+1}-{\mathbf {q}}^r}{\varDelta x} - \frac{1}{2}\frac{ {\mathbf {p}}^{r} + {\mathbf {p}}^{r-1}}{\varDelta x} \cdot \mu ^{r-\frac{1}{2}} \frac{{\mathbf {q}}^{r}-{\mathbf {q}}^{r-1}}{\varDelta x} \nonumber \\&\qquad - \frac{1}{2}\frac{{\mathbf {p}}^{r+1} - {\mathbf {p}}^{r}}{\varDelta x} \cdot \mu ^{r+\frac{1}{2}} \frac{{\mathbf {q}}^{r+1}-{\mathbf {q}}^r}{\varDelta x} - \frac{1}{2}\frac{{\mathbf {p}}^{r} - {\mathbf {p}}^{r-1}}{\varDelta x} \cdot \mu 
^{r-\frac{1}{2}} \frac{{\mathbf {q}}^{r}-{\mathbf {q}}^{r-1}}{\varDelta x}. \end{aligned}$$
(89)

The total energy flux including the dissipative terms thus reads as follows

$$\begin{aligned} F^{r+\frac{1}{2}}_d= & {} F^{r+\frac{1}{2}} - \frac{1}{2}( {\mathbf {p}}^{r+1} + {\mathbf {p}}^{r} ) \cdot \mu ^{r+\frac{1}{2}} \frac{\varDelta {\mathbf {q}}^{r+\frac{1}{2}}}{\varDelta x} \nonumber \\\approx & {} F^{r+\frac{1}{2}} - \mu ^{r+\frac{1}{2}} \frac{\varDelta {\mathcal {E}}^{r+\frac{1}{2}}}{\varDelta x}, \end{aligned}$$
(90)

since the expression \(\frac{1}{2}( {\mathbf {p}}^{r+1} + {\mathbf {p}}^{r} ) \cdot \varDelta {\mathbf {q}}^{r+\frac{1}{2}}\) is an approximation of the path integral

$$\begin{aligned} \int \limits _{{\mathbf {q}}^r}^{{\mathbf {q}}^{r+1}} {\mathbf {p}}\, \cdot \, d {\mathbf {q}}= \int \limits _{{\mathbf {q}}^r}^{{\mathbf {q}}^{r+1}} {\mathcal {E}}_{\mathbf {q}}^T \cdot d {\mathbf {q}}= {\mathcal {E}}^{r+1} - {\mathcal {E}}^r := \varDelta {\mathcal {E}}^{r+\frac{1}{2}} \end{aligned}$$
(91)

using the simple trapezoidal quadrature rule. Making again use of the symmetric Roe matrix \(\tilde{{\mathcal {E}}}_{{\mathbf {q}}{\mathbf {q}}}^{r+\frac{1}{2}}\), which satisfies \(\tilde{{\mathcal {E}}}^{r+\frac{1}{2}}_{{\mathbf {q}}{\mathbf {q}}} ( {\mathbf {q}}^{r+1} - {\mathbf {q}}^r ) = {\mathbf {p}}^{r+1} - {\mathbf {p}}^r \), the semi-discrete total energy conservation law takes the form

$$\begin{aligned}&\frac{d}{dt} {\mathcal {E}}^r + \frac{F_d^{r+\frac{1}{2}} - F_d^{r-\frac{1}{2}}}{\varDelta x} = {\mathbf {p}}^r \cdot {\mathbf {T}}^r \nonumber \\&\quad - \frac{1}{2}\mu ^{r+\frac{1}{2}} \frac{{\mathbf {q}}^{r+1} - {\mathbf {q}}^{r}}{\varDelta x} \cdot \tilde{{\mathcal {E}}}_{{\mathbf {q}}{\mathbf {q}}}^{r+\frac{1}{2}} \frac{{\mathbf {q}}^{r+1}-{\mathbf {q}}^r}{\varDelta x} - \frac{1}{2}\mu ^{r-\frac{1}{2}} \frac{{\mathbf {q}}^{r} - {\mathbf {q}}^{r-1}}{\varDelta x} \cdot \tilde{{\mathcal {E}}}_{{\mathbf {q}}{\mathbf {q}}}^{r-\frac{1}{2}} \frac{{\mathbf {q}}^{r}-{\mathbf {q}}^{r-1}}{\varDelta x}.\qquad \end{aligned}$$
(92)

By requiring that

$$\begin{aligned} {\mathbf {p}}^r \cdot {\mathbf {T}}^r := \frac{1}{2}\mu ^{r+\frac{1}{2}} \frac{\varDelta {\mathbf {q}}^{r+\frac{1}{2}}}{\varDelta x} \cdot \tilde{{\mathcal {E}}}_{{\mathbf {q}}{\mathbf {q}}}^{r+\frac{1}{2}} \frac{\varDelta {\mathbf {q}}^{r+\frac{1}{2}}}{\varDelta x} + \frac{1}{2}\mu ^{r-\frac{1}{2}} \frac{\varDelta {\mathbf {q}}^{r-\frac{1}{2}}}{\varDelta x} \cdot \tilde{{\mathcal {E}}}_{{\mathbf {q}}{\mathbf {q}}}^{r-\frac{1}{2}} \frac{\varDelta {\mathbf {q}}^{r-\frac{1}{2}}}{\varDelta x}, \end{aligned}$$
(93)

we finally obtain the sought conservation form of the discrete total energy equation (87):

$$\begin{aligned} \frac{d}{dt} {\mathcal {E}}^r + \frac{F_d^{r+\frac{1}{2}} - F_d^{r-\frac{1}{2}}}{\varDelta x} = 0. \end{aligned}$$
(94)

The term \({\mathbf {T}}^r\) is set identically to zero in all its components, apart from the equations that are needed for the entropy inequality, which are the nonconservative evolution equations for \(Q_{ik}\). Therefore, the term \({\mathbf {T}}^r\) will be discussed later.
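
The step from (87) to (94) can be verified numerically for three neighbouring cells. The sketch below again uses the one-dimensional shallow water subsystem as a stand-in for system (28); the cell states, the viscosities \(\mu ^{r\pm \frac{1}{2}}\), the mesh size and the quadrature order are arbitrary assumptions. It computes the production (93) and checks that the right hand side of (87) then reduces to a pure difference of the dissipative energy fluxes appearing in (90).

```python
import numpy as np

g, dx = 9.81, 0.1   # gravity and mesh size (illustrative values)
_x, _w = np.polynomial.legendre.leggauss(8)
NODES, WEIGHTS = 0.5 * (_x + 1.0), 0.5 * _w

def p_of_q(q):
    """Dual variables p = E_q for q = (h, h v)."""
    h, v = q[0], q[1] / q[0]
    return np.array([g * h - 0.5 * v**2, v])

def hessian(q):
    """Hessian E_qq of the 1D shallow water energy potential."""
    h, v = q[0], q[1] / q[0]
    return np.array([[g * h + v**2, -v], [-v, 1.0]]) / h

def roe_hessian(qL, qR):
    """Roe matrix (78) along the segment path in q, by quadrature."""
    return sum(w * hessian(qL + s * (qR - qL))
               for s, w in zip(NODES, WEIGHTS))

# Cells r-1, r, r+1 and hypothetical viscosities at r-1/2 and r+1/2
qm, q0, qp = np.array([1.0, 0.2]), np.array([1.4, -0.1]), np.array([0.9, 0.5])
mu_m, mu_p = 0.07, 0.12

dq_m, dq_p = q0 - qm, qp - q0
g_m, g_p = mu_m * dq_m / dx, mu_p * dq_p / dx        # dissipative fluxes (83)

# Required production p^r . T^r, Eq. (93)
pT = (0.5 * mu_p * (dq_p / dx) @ roe_hessian(q0, qp) @ (dq_p / dx)
      + 0.5 * mu_m * (dq_m / dx) @ roe_hessian(qm, q0) @ (dq_m / dx))

p_m, p_0, p_p = p_of_q(qm), p_of_q(q0), p_of_q(qp)
rhs_87 = p_0 @ (g_p - g_m) / dx + pT                 # right hand side of (87)
flux_form = (0.5 * (p_p + p_0) @ g_p
             - 0.5 * (p_0 + p_m) @ g_m) / dx         # flux difference, cf. (90)
residual = abs(rhs_87 - flux_form)
```

The residual vanishes up to quadrature and round-off errors, and the production is non-negative as long as the Roe matrix is positive definite and \(\mu ^{r\pm \frac{1}{2}} \ge 0\).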

To summarize, the thermodynamically compatible dissipative numerical flux of the \({\mathbf {q}}\) scheme for (28) reads

$$\begin{aligned} {\mathbf {f}}^{r+\frac{1}{2}}_{{\mathbf {q}},d}= & {} \left( \int \limits _0^1 {\mathcal {E}}_{{\mathbf {q}}{\mathbf {q}}}\left( \tilde{\varvec{\psi }}(s)\right) ds \right) ^{-1} \left( \int \limits _{0}^{1} {\mathcal {E}}_{{\mathbf {q}}{\mathbf {q}}} {\mathbf {f}}(\tilde{\varvec{\psi }}(s)) ds \right) \nonumber \\&- \frac{1}{2}s_{\max }^{r+\frac{1}{2}} \left( 1 - \varphi ^{r+\frac{1}{2}} \right) \left( {\mathbf {q}}^{r+1} - {\mathbf {q}}^r \right) . \end{aligned}$$
(95)

3.3 Thermodynamically Compatible Discretization of the Terms Related to \(Q_{ik}\)

We now present the discretization of the Reynolds stress tensor \(R_{ik} = h P_{ik}\) in the momentum equation that is thermodynamically compatible with the term \((\partial _m v_i) Q_{mk}\) in (46) and the term \(h P_{ik} v_k\) in the energy equation. To ease notation, we present the discretization only for the one-dimensional case in \(x_1\) direction. Extension to multiple space dimensions is straightforward. For the term \((\partial _m v_i) Q_{mk}\) we a priori choose the following discretization:

$$\begin{aligned} \varDelta x (\partial _1 v_i) Q_{1k} \approx Q_{1k}^{r+\frac{1}{2}} \left( v_i^{r+1} - v_i^r \right) , \quad \text { with } \quad Q_{1k}^{r+\frac{1}{2}} = \frac{1}{2}\left( Q_{1k}^{r} + Q_{1k}^{r+1} \right) . \end{aligned}$$
(96)

Multiplying the momentum equation (45) by the dual variable \({\mathcal {E}}_{h v_i} = v_i\) and the PDE (46) by the dual variable \({\mathcal {E}}_{Q_{ik}} = h Q_{ik}\), and requiring thermodynamic compatibility with the total energy equation, leads to the following condition that must be fulfilled by the yet unknown discretization of the Reynolds stress tensor \(R_{i1}^{r+\frac{1}{2}}\):

$$\begin{aligned}&v_i^r \left( R_{i1}^{r+\frac{1}{2}} - R_{i1}^r \right) + v_i^{r+1} \left( R_{i1}^{r+1} - R_{i1}^{r+\frac{1}{2}} \right) \nonumber \\&\qquad + h^r Q_{ik}^{r} \frac{1}{2}Q_{1k}^{r+\frac{1}{2}} \left( v_i^{r+1} - v_i^r \right) + h^{r+1} Q_{ik}^{r+1} \frac{1}{2}Q_{1k}^{r+\frac{1}{2}} \left( v_i^{r+1} - v_i^r \right) \nonumber \\&\quad =h^{r+1} Q_{im}^{r+1} Q_{1m}^{r+1} v_i^{r+1} - h^{r} Q_{im}^{r} Q_{1m}^{r} v_i^{r}. \end{aligned}$$
(97)

Using \(R_{ik} = h Q_{im} Q_{km}\) and collecting terms leads to

$$\begin{aligned} -R_{i1}^{r+\frac{1}{2}} \left( v_i^{r+1} - v_i^r \right) + Q_{1k}^{r+\frac{1}{2}} \frac{1}{2}\left( h^r Q_{ik}^{r} + h^{r+1} Q_{ik}^{r+1} \right) \left( v_i^{r+1} - v_i^r \right) = 0. \end{aligned}$$
(98)

Since \(Q_{1k}^{r+\frac{1}{2}} = \frac{1}{2}\left( Q_{1k}^{r} + Q_{1k}^{r+1} \right) \), we obtain the following compatible discretization for the numerical flux of the Reynolds stress tensor in \(x_1\) direction:

$$\begin{aligned} R_{i1}^{r+\frac{1}{2}} = \frac{1}{2}\left( Q_{1k}^{r} + Q_{1k}^{r+1} \right) \frac{1}{2}\left( h^r Q_{ik}^{r} + h^{r+1} Q_{ik}^{r+1} \right) , \end{aligned}$$
(99)

which needs to be added to the compatible dissipative flux (95) in the semi-discrete momentum equation, which then takes the form

$$\begin{aligned} \frac{d}{dt} (h v_i) = -\frac{1}{\varDelta x} \left( {\mathbf {f}}^{r+\frac{1}{2}}_{hv_i,d} - {\mathbf {f}}^{r-\frac{1}{2}}_{hv_i,d} \right) -\frac{1}{\varDelta x} \left( R_{i1}^{r+\frac{1}{2}} - R_{i1}^{r-\frac{1}{2}} \right) , \end{aligned}$$
(100)

where \({\mathbf {f}}^{r+\frac{1}{2}}_{hv_i,d}\) is the part of the dissipative flux in \(x_1\) direction (95) that refers to the momentum equation.
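As a hedged illustration, the following Python sketch evaluates the interface stress (99) for two neighbouring cell states and checks numerically that the compatibility condition (97) is satisfied. The function names and the sample states are hypothetical and are not taken from any implementation of the scheme; indices are zero-based, so `Q[0]` holds the row \(Q_{1k}\).

```python
def reynolds_flux(h_l, Q_l, h_r, Q_r):
    """Interface stress R_{i1}^{r+1/2} from (99), returned as a list over i."""
    n = len(Q_l)
    # Q_{1k}^{r+1/2}: arithmetic average of the first row of Q
    q1_avg = [0.5 * (Q_l[0][k] + Q_r[0][k]) for k in range(n)]
    # average of h*Q_{ik} over the two cells, contracted with q1_avg over k
    return [sum(q1_avg[k] * 0.5 * (h_l * Q_l[i][k] + h_r * Q_r[i][k])
                for k in range(n)) for i in range(n)]


def cell_stress(h, Q):
    """Cell value R_{i1} = h Q_{im} Q_{1m}."""
    n = len(Q)
    return [h * sum(Q[i][m] * Q[0][m] for m in range(n)) for i in range(n)]


def compatibility_residual(h_l, Q_l, v_l, h_r, Q_r, v_r):
    """Left-hand side minus right-hand side of condition (97)."""
    n = len(Q_l)
    R_half = reynolds_flux(h_l, Q_l, h_r, Q_r)
    R_l, R_r = cell_stress(h_l, Q_l), cell_stress(h_r, Q_r)
    q1_avg = [0.5 * (Q_l[0][k] + Q_r[0][k]) for k in range(n)]
    lhs = 0.0
    for i in range(n):
        dv = v_r[i] - v_l[i]
        lhs += v_l[i] * (R_half[i] - R_l[i]) + v_r[i] * (R_r[i] - R_half[i])
        for k in range(n):
            lhs += 0.5 * q1_avg[k] * dv * (h_l * Q_l[i][k] + h_r * Q_r[i][k])
    rhs = sum(v_r[i] * R_r[i] - v_l[i] * R_l[i] for i in range(n))
    return lhs - rhs
```

For arbitrary left and right states the residual should vanish up to round-off, which is the algebraic content of (98).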

The last term in (46) that needs to be discretized is the convective term \(v_m \partial _m Q_{ik}\), which requires compatibility with the mass conservation law (44) and the energy conservation (47). To achieve such a compatible discretization, the mass conservation equation needs to be multiplied with the remaining contribution \({\mathcal {E}}_{2,h}=E_2\) and the PDE for \(Q_{ik}\) is again multiplied with \({\mathcal {E}}_{Q_{ik}}\), and the following condition must be satisfied:

$$\begin{aligned}&E_2^r \left( (hv)_1^{r+\frac{1}{2}} - (hv)_1^r \right) + E_2^{r+1} \left( (hv)_1^{r+1} - (hv)_1^{r+\frac{1}{2}} \right) \nonumber \\&\qquad + h^r Q_{ik}^{r} \frac{1}{2}{\tilde{v}}_1^{r+\frac{1}{2}} \left( Q_{ik}^{r+1} - Q_{ik}^r \right) + h^{r+1} Q_{ik}^{r+1} \frac{1}{2}{\tilde{v}}_1^{r+\frac{1}{2}} \left( Q_{ik}^{r+1} - Q_{ik}^r \right) \nonumber \\&\quad =(hv)_1^{r+1} E_2^{r+1} - (hv)_1^{r} E_2^{r}, \end{aligned}$$
(101)

with the yet unknown average velocity \({\tilde{v}}_1^{r+\frac{1}{2}}\) at the cell interface. Note that the numerical mass flux \((hv)_1^{r+\frac{1}{2}}\) is the known compatible inviscid mass flux of the numerical flux \({\mathbf {f}}_{\mathbf {q}}^{r+\frac{1}{2}}\) of the \({\mathbf {q}}\)-scheme according to the semi-discrete Godunov formalism, see (81). Collecting terms leads to

$$\begin{aligned} (hv)_1^{r+\frac{1}{2}} \left( E_2^{r+1} - E_2^r \right) = {\tilde{v}}_1^{r+\frac{1}{2}} \left( h^{r+1} E_2^{r+1} - h^{r} E_2^{r} - \frac{1}{2}Q_{ik}^r Q_{ik}^{r+1} \left( h^{r+1}-h^r \right) \right) ,\qquad \end{aligned}$$
(102)

from which we obtain the sought expression for the average velocity at the interface as

$$\begin{aligned} {\tilde{v}}_1^{r+\frac{1}{2}} = \frac{(hv)_1^{r+\frac{1}{2}} \left( E_2^{r+1} - E_2^r \right) }{h^{r+1} E_2^{r+1} - h^{r} E_2^{r} - \frac{1}{2}Q_{ik}^r Q_{ik}^{r+1} \left( h^{r+1}-h^r \right) }. \end{aligned}$$
(103)

In case the denominator is zero, we simply set the velocity to the arithmetic average \( {\tilde{v}}_1^{r+\frac{1}{2}} = \frac{1}{2}\left( {\tilde{v}}_1^{r} + {\tilde{v}}_1^{r+1} \right) \).
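A minimal Python sketch of (103) with the fallback described above is given below. It assumes \(E_2 = \frac{1}{2} Q_{ik} Q_{ik}\), which is consistent with the dual variable \({\mathcal {E}}_{Q_{ik}} = h Q_{ik}\) used above; the tolerance `tol` and the use of the cell velocities in the fallback average are our own assumptions, and all names are hypothetical.

```python
def e2(Q):
    """E_2 = 0.5 * Q_ik Q_ik (assumed form, consistent with E_{Q_ik} = h Q_ik)."""
    return 0.5 * sum(q * q for row in Q for q in row)


def interface_velocity(hv_half, h_l, Q_l, v_l, h_r, Q_r, v_r, tol=1e-14):
    """Average interface velocity according to (103).

    hv_half is the compatible numerical mass flux at the interface;
    v_l and v_r are the x_1 velocities in the two adjacent cells.
    """
    e2_l, e2_r = e2(Q_l), e2(Q_r)
    # contraction Q_ik^r Q_ik^{r+1}
    qq = sum(a * b for ra, rb in zip(Q_l, Q_r) for a, b in zip(ra, rb))
    den = h_r * e2_r - h_l * e2_l - 0.5 * qq * (h_r - h_l)
    if abs(den) <= tol:  # tol is a hypothetical safeguard, not from the text
        return 0.5 * (v_l + v_r)
    return hv_half * (e2_r - e2_l) / den
```

The denominator equals \(\frac{1}{2}\left( h^r Q_{ik}^r + h^{r+1} Q_{ik}^{r+1}\right) \left( Q_{ik}^{r+1}-Q_{ik}^{r}\right) \), so it vanishes in particular when the two cells carry identical states.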

In order to get compatibility with the total energy conservation law also in the presence of numerical viscosity, we need to add the discrete production term to the PDEs of \(Q_{ik}\) at the right and left element interface, according to the condition (93) already derived before:

$$\begin{aligned} T_{ik}^{r+\frac{1}{2},-} = \mu ^{r+\frac{1}{2}} \, \frac{1}{2}\frac{Q_{ik}^{r}}{(h \text {tr}{{\mathbf {P}}})^r} \, \frac{\varDelta {\mathbf {q}}^{r+\frac{1}{2}}}{\varDelta x} \cdot \tilde{{\mathcal {E}}}_{{\mathbf {q}}{\mathbf {q}}}^{r+\frac{1}{2}} \, \frac{\varDelta {\mathbf {q}}^{r+\frac{1}{2}}}{\varDelta x} \end{aligned}$$
(104)

and

$$\begin{aligned} T_{ik}^{r-\frac{1}{2},+} = \mu ^{r-\frac{1}{2}} \, \frac{1}{2}\frac{Q_{ik}^{r}}{(h \text {tr}{{\mathbf {P}}})^r} \, \frac{\varDelta {\mathbf {q}}^{r-\frac{1}{2}}}{\varDelta x} \cdot \tilde{{\mathcal {E}}}_{{\mathbf {q}}{\mathbf {q}}}^{r-\frac{1}{2}} \, \frac{\varDelta {\mathbf {q}}^{r-\frac{1}{2}}}{\varDelta x}. \end{aligned}$$
(105)

The physical entropy production is always non-negative, since we assume \(\mu ^{r+\frac{1}{2}} \ge 0\) and \(\tilde{{\mathcal {E}}}_{{\mathbf {q}}{\mathbf {q}}}^{r+\frac{1}{2}} \ge 0\). It is obvious that (105) and (104) are discrete analogues of the continuous production term (50).
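The consistency of (104) and (105) with condition (93) can be checked with a few lines of Python. The sketch below assumes that the scalar \(\frac{\varDelta {\mathbf {q}}}{\varDelta x} \cdot \tilde{{\mathcal {E}}}_{{\mathbf {q}}{\mathbf {q}}} \frac{\varDelta {\mathbf {q}}}{\varDelta x}\) at one interface has already been computed and is passed in as `dissipation`; it uses \(\text {tr}{\mathbf {P}} = Q_{ik} Q_{ik}\) and the dual variable \(h Q_{ik}\). All names are hypothetical.

```python
def production_terms(mu, h, Q, dissipation):
    """Production term T_ik of (104)/(105) for one interface.

    dissipation is the scalar (dq/dx) . E_qq (dq/dx) at that interface.
    """
    tr_P = sum(q * q for row in Q for q in row)   # tr P = Q_ik Q_ik
    factor = 0.5 * mu * dissipation / (h * tr_P)
    return [[factor * q for q in row] for row in Q]


def energy_production(h, Q, T):
    """Contribution p . T of the Q_ik equations, with dual variable h Q_ik."""
    return sum(h * q * t for rq, rt in zip(Q, T) for q, t in zip(rq, rt))
```

Each interface then contributes exactly \(\frac{1}{2}\mu \) times the dissipation to \({\mathbf {p}}^r \cdot {\mathbf {T}}^r\), i.e. one of the two terms on the right hand side of (93).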

The final semi-discrete scheme for \(Q_{ik}\) in one space dimension reads:

$$\begin{aligned} \frac{d}{dt} Q_{ik}^r= & {} - \frac{1}{2}{\tilde{v}}_1^{r+\frac{1}{2}} \frac{Q_{ik}^{r+1}-Q_{ik}^r}{\varDelta x} - \frac{1}{2}\cdot \frac{Q_{1k}^{r}+Q_{1k}^{r+1}}{2} \cdot \frac{v_i^{r+1} - v_i^r}{\varDelta x} \nonumber \\&+ \frac{\mu ^{r+\frac{1}{2}}}{\varDelta x} \cdot \frac{Q_{ik}^{r+1}-Q_{ik}^r}{\varDelta x} + T_{ik}^{r-\frac{1}{2},+} + T_{ik}^{r+\frac{1}{2},-} . \end{aligned}$$
(106)

3.4 Summary of the Scheme and Stability Proof

For completeness, we now gather together all equations of the thermodynamically compatible scheme, thus obtaining

(107)
(108)
(109)

with the fluctuations

$$\begin{aligned} D_{q}^{r+\frac{1}{2},-} = f_{q}^{r+\frac{1}{2}} - f_{q}^{r}, \qquad \text {and} \qquad D_{q}^{r+\frac{1}{2},+} = f_{q}^{r+1} - f_{q}^{r+\frac{1}{2}}, \end{aligned}$$
(110)

where \(f_{q}^{r}\) denotes the physical flux evaluated in cell r and \(f_{q}^{r+\frac{1}{2}}\) is the compatible flux for depth and momentum, i.e. for \(q \in \left\{ h, h v_i \right\} \). Recall that the flux vector in the previous notation reads

$$\begin{aligned} {\mathbf {f}}_{{\mathbf {q}}}^{r+\frac{1}{2}} = \left( f_{h}^{r+\frac{1}{2}},f_{hv_{i}}^{r+\frac{1}{2}},0\right) \end{aligned}$$
(111)

and is computed according to the q-scheme of the semi-discrete Godunov formalism presented previously. We have also introduced the fluctuations

$$\begin{aligned} R_{i,1}^{r+\frac{1}{2},-}= \left( R_{i,1}^{r+\frac{1}{2}} -R_{i,1}^{r}\right) , \quad R_{i,1}^{r+\frac{1}{2},+}= \left( R_{i,1}^{r+1} -R_{i,1}^{r+\frac{1}{2}}\right) , \end{aligned}$$
(112)

with \(R_{i,1}^{r+\frac{1}{2}}\) being the compatible discretization of the Reynolds stress tensor in \(x_{1}\) direction given in (99), and

$$\begin{aligned} D_{Q_{1k}}^{r+\frac{1}{2}, \pm } = \frac{1}{2} {\tilde{v}}_{1}^{r+\frac{1}{2}}\left( Q_{1k}^{r+1}- Q_{1k}^{r}\right) + \frac{1}{2} {\tilde{Q}}_{1k}^{r+\frac{1}{2}}\left( v_{i}^{r+1}-v_{i}^{r} \right) . \end{aligned}$$
(113)

Let us also recall that the dissipative fluxes, \(g_{q}^{r\pm \frac{1}{2}}\), have been defined in (83).

Theorem 3

The semi-discrete scheme (107)–(109) admits the semi-discrete energy conservation equation

(114)

with

$$\begin{aligned} D_{{\mathcal {E}}}^{r+\frac{1}{2},-} + D_{{\mathcal {E}}}^{r+\frac{1}{2},+} = F^{r+1} - F^r. \end{aligned}$$
(115)

As a result the scheme is energy conserving and therefore marginally stable in the energy norm, i.e. the scheme satisfies

$$\begin{aligned} \int \limits _{\varOmega } \frac{d {\mathcal {E}}}{d t} dx = \sum _{r}\varDelta x \frac{d {\mathcal {E}}^{r}}{d t} = 0. \end{aligned}$$
(116)

Proof

We first demonstrate that (114) is a direct consequence of (107)–(109). To this end we proceed as in the continuous case, i.e. we start by considering the time derivatives and we sum the contributions coming from equations (107)–(109) multiplied by \({\mathcal {E}}_{h}^{r}\), \({\mathcal {E}}_{hv_{1}}^{r}\) and \({\mathcal {E}}_{Q_{ik}}^{r}\), respectively, thus obtaining

$$\begin{aligned} {\mathcal {E}}_{h}^{r} \frac{d h^{r}}{d t}+ {\mathcal {E}}_{hv_{1}}^{r} \frac{d hv_{i}^{r}}{d t}+ {\mathcal {E}}_{Q_{ik}}^{r} \frac{d Q_{ik}^{r}}{d t} ={\mathcal {E}}_{{\mathbf {q}}}^{r} \cdot \frac{d {\mathbf {q}}^{r}}{d t} =\frac{d {\mathcal {E}}^{r}}{d t}. \end{aligned}$$
(117)

For the convective terms, the Reynolds stress tensor and the PDE related to \({\mathbf {Q}}\) we define

(118)

Let us remark that the definitions of the fluctuations in (118) differ from the ones given in Sect. 3.1, where the terms related to the total energy \({\mathcal {E}}_{2}\) associated with \(Q_{ik}\) were not yet included. Finally, the dot product of \({\mathbf {p}}^{r}\) by the vector of the blue terms in (107)–(109) yields

(119)

Taking into account (104)–(105) in \({\mathbf {p}}^{r}\cdot {\mathbf {T}}^{r} = {\mathbf {p}}^{r}\cdot {\mathbf {T}}^{r-\frac{1}{2},+}+ {\mathbf {p}}^{r}\cdot {\mathbf {T}}^{r+\frac{1}{2},-}\) we get (93). Substitution of this result in (119) and making use of the developments in (89) gives

(120)

Gathering (117), (120) and (118), we get (114):

(121)

Let us now consider the discrete equation for the total energy, (114). Integrating it on a computational domain, \(\varOmega \), we get

(122)

Recalling that \(D_{{\mathcal {E}}}^{r\pm \frac{1}{2},\mp }\) represent the jumps of the energy flux at the interfaces and assuming the solution on the boundaries of \(\varOmega \) to tend to a constant value, the jumps of \({\mathbf {q}}\) are zero at the boundaries of \(\varOmega \) and then the boundary contributions of \(D_{{\mathcal {E}}}\) and also the contribution of the dissipation terms vanish. Hence, reordering the pairs of \(D_{{\mathcal {E}}}^{r\pm \frac{1}{2},\mp }\) to consider couples related to the interfaces instead of pairs corresponding to the cells yields

(123)

Note that the summation over the blue dissipative fluxes is a telescoping sum that vanishes. On the other hand, we can also prove that the contributions of the fluctuations at the interfaces reduce to a flux difference of the form

$$\begin{aligned} D_{{\mathcal {E}}}^{r+\frac{1}{2},-} + D_{{\mathcal {E}}}^{r+\frac{1}{2},+} = F^{r+1} - F^{r}. \end{aligned}$$
(124)

To this end, we simply develop the fluctuations related to the total energy

(125)

Reordering black terms and using (97) and (113) for the red ones, we obtain

(126)

Finally, taking into account (101) with (103) in the above expression and the discretization of \({\mathbf {f}}_{{\mathbf {q}}}^{r+\frac{1}{2}}\) given in (68), (81), we conclude

(127)

which is the sought total energy flux difference. Therefore, under the hypothesis that the total energy fluxes on the boundary are zero, we have

(128)

hence the scheme is marginally stable in the energy norm. \(\square \)

4 Path-Conservative ADER Discontinuous Galerkin Schemes

In this section we briefly recall ADER-DG schemes on rectangular equidistant Cartesian grids with a posteriori subcell finite volume limiter (SCL). The governing PDE system (14)–(17) can be cast into the following general form

$$\begin{aligned} \frac{\partial {\mathbf {q}}}{\partial t} + \nabla \cdot {\mathbf {F}}({\mathbf {q}},\nabla {\mathbf {q}}) + {\mathbf {B}}({\mathbf {q}}) \cdot \nabla {\mathbf {q}}= {\mathbf {S}}({\mathbf {q}},\nabla {\mathbf {q}}), \end{aligned}$$
(129)

with \(\mathbf {x}= (x_1,x_2) \in \varOmega \) the coordinate vector in the two-dimensional domain \(\varOmega \subset {\mathbb {R}}^2\), \(t \in {\mathbb {R}}_0^+\) the time, the state vector \({\mathbf {q}}\in \varOmega _{\mathbf {q}}\subset {\mathbb {R}}^m\), the state space or phase-space \(\varOmega _{\mathbf {q}}\subset {\mathbb {R}}^m\), the flux tensor \({\mathbf {F}}({\mathbf {q}},\nabla {\mathbf {q}}) = \left( {\mathbf {f}}, {\mathbf {g}} \right) \), the nonconservative product \({\mathbf {B}}({\mathbf {q}}) \cdot \nabla {\mathbf {q}}= {\mathbf {B}}_1({\mathbf {q}}) \partial _x {\mathbf {q}}+ {\mathbf {B}}_2({\mathbf {q}}) \partial _y {\mathbf {q}}\) and the source term \({\mathbf {S}}({\mathbf {q}},\nabla {\mathbf {q}})\), which may also depend on gradients of the state vector. The general structure (129) is needed if we want to discretize also directly the underlying thermodynamically compatible viscous system (44)–(47), including the production term \(T_{ik}\).

In order to solve very general nonlinear time-dependent PDE systems like (129) numerically, in this paper we employ the family of high order accurate fully-discrete path-conservative one-step ADER discontinuous Galerkin schemes supplemented with an a posteriori subcell finite volume limiter, see e.g. [17, 41, 42, 50, 56, 124]. In the next sections we provide a brief description of the method and the reader is referred to the above references for more details. Concerning more details on the framework of a posteriori limiting (MOOD), the reader is referred to [29, 38, 39].

4.1 Unlimited High Order ADER-DG Schemes

The system (129) is discretized on a domain \(\varOmega \) making use of a uniform Cartesian grid with elements \(\varOmega _{i}=\left[ x_{i}-\frac{\varDelta x}{2},x_{i}+\frac{\varDelta x}{2} \right] \times \left[ y_{i}-\frac{\varDelta y}{2},y_{i}+\frac{\varDelta y}{2}\right] \). Here, \({\mathbf {x}}_{i}=\left( x_{i},y_{i}\right) \) is the barycenter of \(\varOmega _{i}\) and \(\varDelta x\) and \(\varDelta y\) are the mesh spacings in the x and in the y direction, respectively. The numerical solution of (129) is defined in the space of piecewise polynomials of degree N and is denoted by \({\mathbf {u}}_h({\mathbf {x}},t^n)\). For each element \(\varOmega _i\) it is sought in the form

$$\begin{aligned} {\mathbf {u}}_h({\mathbf {x}},t^n) = \varphi _l ({\mathbf {x}})\; \hat{{\mathbf {u}}}^n_{l,i}, \quad {\mathbf {x}} \in \varOmega _i. \end{aligned}$$
(130)

Here, \(\varphi _l({\mathbf {x}}) = \varphi _{l_1}(\xi ) \varphi _{l_2}(\eta )\) are the basis or ansatz functions, which are tensor products of one-dimensional ansatz functions \(\varphi _{l_m}(\chi )\) on the unit reference element \(\chi \in \varOmega _{\mathrm {ref}}=\left[ 0,1\right] \). The mapping from the reference element to the physical one reads \(x = x_{i}-\frac{1}{2}\varDelta x + \xi \varDelta x\) and \(y = y_{i}-\frac{1}{2}\varDelta y + \eta \varDelta y\) with \(0 \le \xi , \eta \le 1\). The multi-index \(l=(l_1,l_2)\) refers to the one-dimensional basis functions \(\varphi _{l_{m}}\) that are employed in the tensor product. The basis functions on the reference element are defined as the Lagrange interpolation polynomials that pass through the Gauss-Legendre quadrature nodes of a Gaussian quadrature formula with \(N+1\) quadrature points. This choice automatically leads to an orthogonal nodal basis.
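The orthogonality of this nodal basis can be verified with a short, self-contained Python sketch. It computes the Gauss-Legendre nodes on \([0,1]\) by Newton iteration, builds the Lagrange polynomials through them and evaluates the one-dimensional mass matrix with a higher order quadrature rule. The helper names and the fixed number of Newton iterations are our own choices for illustration.

```python
import math


def legendre(n, x):
    """Legendre polynomial P_n(x) and its derivative, for n >= 1 and |x| < 1."""
    p0, p1 = 1.0, x
    for k in range(2, n + 1):
        p0, p1 = p1, ((2 * k - 1) * x * p1 - (k - 1) * p0) / k
    return p1, n * (x * p1 - p0) / (x * x - 1.0)


def gauss_legendre_01(n):
    """n Gauss-Legendre nodes and weights mapped to the unit interval [0, 1]."""
    rule = []
    for i in range(1, n + 1):
        x = math.cos(math.pi * (i - 0.25) / (n + 0.5))  # initial guess
        for _ in range(50):                              # Newton iteration on P_n
            p, dp = legendre(n, x)
            x -= p / dp
        p, dp = legendre(n, x)
        w = 2.0 / ((1.0 - x * x) * dp * dp)
        rule.append((0.5 * (x + 1.0), 0.5 * w))
    rule.sort()
    return [r[0] for r in rule], [r[1] for r in rule]


def lagrange(nodes, l, x):
    """Lagrange polynomial number l through the given nodes, evaluated at x."""
    val = 1.0
    for m, xm in enumerate(nodes):
        if m != l:
            val *= (x - xm) / (nodes[l] - xm)
    return val


def mass_matrix(N):
    """Mass matrix M_kl = int_0^1 phi_k phi_l dxi for polynomial degree N."""
    nodes, _ = gauss_legendre_01(N + 1)
    xq, wq = gauss_legendre_01(N + 3)   # over-integration, exact for degree 2N
    n = N + 1
    return [[sum(w * lagrange(nodes, k, x) * lagrange(nodes, l, x)
                 for x, w in zip(xq, wq)) for l in range(n)] for k in range(n)]
```

The resulting matrix is diagonal up to round-off, with the quadrature weights on the diagonal, because the product of two basis functions has degree \(2N\) and is integrated exactly by the \(N+1\) point Gauss-Legendre rule.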

Multiplication of (129) by test functions \(\varphi _k\), which according to the Galerkin approach are chosen identical to the ansatz functions, and integration over \(\varOmega _i \times [t^{n},t^{n+1}]\) leads to

$$\begin{aligned} \int \limits _{t^n}^{t^{n+1}} \int \limits _{\varOmega _i } \varphi _k \left( \partial _t {\mathbf {q}}+ \nabla \cdot {\mathbf {F}}({\mathbf {q}},\nabla {\mathbf {q}})+ \varvec{{\mathbf {B}}}({\mathbf {q}}) \cdot \nabla {\mathbf {q}}\right) \,d{\mathbf {x}}\,dt = \int \limits _{t^n}^{t^{n+1}} \int \limits _{\varOmega _i } \varphi _k \, {\mathbf {S}}\left( {\mathbf {q}},\nabla {\mathbf {q}}\right) d{\mathbf {x}}\,dt .\qquad \end{aligned}$$
(131)

Using (130) and integration by parts yields

$$\begin{aligned}&\left( \, \int \limits _{\varOmega _i} \! \varphi _k \varphi _l \, d{\mathbf {x}}\right) \left( \hat{{\mathbf {u}}}_{l,i}^{n+1} - \hat{{\mathbf {u}}}_{l,i}^{n} \, \right) + \int \limits _{t^n}^{t^{n+1}} \!\!\! \int \limits _{\partial \varOmega _i } \! \varphi _k \left( {\mathcal {G}}\left( {\mathbf {q}}_h^-, {\mathbf {q}}_h^+ \right) + {\mathcal {D}}\left( {\mathbf {q}}_h^-,{\mathbf {q}}_h^+ \right) \right) \cdot {\mathbf {n}} \, dS dt \nonumber \\&\quad - \int \limits _{t^n}^{t^{n+1}} \!\! \int \limits _{\varOmega _i } \!\! \nabla \varphi _k \cdot {\mathbf {F}}({\mathbf {q}}_h,\nabla {\mathbf {q}}_h) \,d{\mathbf {x}}\,dt + \int \limits _{t^n}^{t^{n+1}} \!\! \int \limits _{\varOmega _i^{\circ } } \varphi _k \mathbf {{\mathbf {B}}}({\mathbf {q}}_h) \cdot \nabla {\mathbf {q}}_h \,d{\mathbf {x}}\,dt = \nonumber \\&\quad \int \limits _{t^n}^{t^{n+1}} \!\! \int \limits _{\varOmega _i } \varphi _k \mathbf {{\mathbf {S}}}({\mathbf {q}}_h, \nabla {\mathbf {q}}_h) \,d{\mathbf {x}}\,dt, \end{aligned}$$
(132)

where \({\mathbf {n}}\) is the outward-pointing unit normal vector at the cell boundary \(\partial \varOmega _{i}\), and \({\mathbf {q}}_h\) is a local space-time predictor, the computation of which will be briefly explained later. Since in the DG framework the discrete solution is allowed to jump between two neighboring cells, a numerical flux is required on the boundary. For an exhaustive overview of numerical fluxes and Riemann solvers, see [115]. In this paper, we use the simple Rusanov-type flux

$$\begin{aligned} {\mathcal {G}}\left( {\mathbf {q}}_h^-, {\mathbf {q}}_h^+ \right) \cdot {\mathbf {n}} = \frac{1}{2} \left( {\mathbf {F}}({\mathbf {q}}_h^+,\nabla {\mathbf {q}}_h^+) + {\mathbf {F}}({\mathbf {q}}_h^-,\nabla {\mathbf {q}}_h^-) \right) \cdot {\mathbf {n}} - \frac{1}{2} s_{\max } \, {\mathbf {I}} \, \left( {\mathbf {q}}_h^+ - {\mathbf {q}}_h^- \right) \,, \end{aligned}$$
(133)

with \(s_{\max } = \max (|\lambda _k({\mathbf {q}}_h^-)|,|\lambda _k({\mathbf {q}}_h^+)|) + \varepsilon (2N+1)/\varDelta x\) being an estimate of the maximum signal speed at the interface, including also the viscous terms with viscosity coefficient \(\varepsilon \), see [65]. In (133) \({\mathbf {q}}_h^-\) and \({\mathbf {q}}_h^+\) denote the boundary-extrapolated values of the space-time predictor from within the element and its neighbor, respectively. The non-conservative products are discretized via a path-conservative scheme, as put forward by Castro, Parés and collaborators in [21,22,23,24, 86, 89], based on the theory established in [85]. The term \({\mathcal {D}}\left( {\mathbf {q}}_h^-,{\mathbf {q}}_h^+ \right) \) contains the jump in the non-conservative product and is computed with the aid of a path integral in phase space between the states \({\mathbf {q}}_h^-\) and \({\mathbf {q}}_h^+\). Using the simple segment path

$$\begin{aligned} \mathbf {\psi } = \mathbf {\alpha }({\mathbf {q}}_h^-, {\mathbf {q}}_h^+, s) = {\mathbf {q}}_h^- + s \left( {\mathbf {q}}_h^+ - {\mathbf {q}}_h^- \right) \,, \qquad s \in [0,1], \end{aligned}$$
(134)

the path integral reduces to

$$\begin{aligned} {\mathcal {D}}\left( {\mathbf {q}}_h^-,{\mathbf {q}}_h^+ \right) \cdot {\mathbf {n}} = \frac{1}{2} \left( \int \limits _{0}^{1} {\mathbf {B}} \left( \mathbf {\psi }({\mathbf {q}}_h^-,{\mathbf {q}}_h^+,s) \right) \cdot {\mathbf {n}} \, ds \right) \cdot \left( {\mathbf {q}}_h^+ - {\mathbf {q}}_h^-\right) . \end{aligned}$$
(135)

The integral (135) is approximated via a simple trapezoidal quadrature rule. The use of path integrals based on the straight-line segment path is the common point between the path-conservative ADER-DG scheme presented here and the thermodynamically compatible semi-discrete finite volume method presented in the previous section.
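The two interface terms (133) and (135) can be sketched in a few lines of Python for a one-dimensional setting with normal \({\mathbf {n}} = +1\). This is only an illustration under simplifying assumptions: states are plain lists, `F` and `B` are user-supplied callables returning the flux vector and the matrix of the non-conservative product, and `s_max` is assumed to be given. The function names are hypothetical.

```python
def rusanov_flux(F, q_minus, q_plus, s_max):
    """Rusanov-type flux (133) in one space dimension with normal n = +1."""
    f_m, f_p = F(q_minus), F(q_plus)
    return [0.5 * (fp + fm) - 0.5 * s_max * (qp - qm)
            for fm, fp, qm, qp in zip(f_m, f_p, q_minus, q_plus)]


def path_jump(B, q_minus, q_plus, n_intervals=1):
    """Jump term (135) along the straight-line segment path (134).

    The path integral of B is approximated by a composite trapezoidal rule
    with n_intervals sub-intervals (n_intervals=1 is the plain rule).
    """
    m = len(q_minus)
    dq = [qp - qm for qm, qp in zip(q_minus, q_plus)]
    B_int = [[0.0] * m for _ in range(m)]
    for j in range(n_intervals + 1):
        s = j / n_intervals
        w = (0.5 if j in (0, n_intervals) else 1.0) / n_intervals
        psi = [qm + s * d for qm, d in zip(q_minus, dq)]   # point on the path
        B_s = B(psi)
        for a in range(m):
            for b in range(m):
                B_int[a][b] += w * B_s[a][b]
    return [0.5 * sum(B_int[a][b] * dq[b] for b in range(m)) for a in range(m)]
```

For a matrix \({\mathbf {B}}\) that depends linearly on the state, as in the scalar example \(B(q) = q\), the trapezoidal rule integrates the segment path exactly.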

Following [17, 41, 43, 50] the predictor \({\mathbf {q}}_h({\mathbf {x}},t)\) is obtained with the aid of a weak formulation of (129) in space-time, which makes it possible to completely avoid the Cauchy-Kovalewskaya procedure that was originally employed in ADER schemes, see [20, 113, 114, 118, 119].

The predictor solution is defined with the aid of space-time ansatz functions \(\theta _{l}=\theta _{l}({\mathbf {x}},t) = \varphi _{l_0}(\tau ) \varphi _{l_1}(\xi ) \varphi _{l_2}(\eta )\), which are again tensor products of the 1D basis functions \(\varphi _{l_m}(\chi )\) and where now an additional temporal basis function is included, with \(t = t^n + \tau \varDelta t\):

$$\begin{aligned} {\mathbf {q}}_h({\varvec{x}},t) = \theta _l({\varvec{x}},t) \, {\hat{{\mathbf {q}}}}_{l,i}^n, \end{aligned}$$
(136)

Multiplication of (129) by \(\theta _{k}\) and integration over \(\varOmega _{i}\times \left[ t^{n},t^{n+1}\right] \) yields

$$\begin{aligned}&\int \limits _{t^n}^{t^{n+1}} \!\! \int \limits _{\varOmega _i } \theta _k \, \partial _t {\mathbf {q}}_h \,d{\mathbf {x}} \, dt + \int \limits _{t^n}^{t^{n+1}} \!\! \int \limits _{\varOmega _i } \theta _k \, \nabla \cdot {\mathbf {F}}({\mathbf {q}}_h, \nabla {\mathbf {q}}_h) \,d{\mathbf {x}}\,dt \nonumber \\&\quad + \int \limits _{t^n}^{t^{n+1}} \!\! \int \limits _{\varOmega _i^{\circ } } \theta _k \mathbf {{\mathbf {B}}}({\mathbf {q}}_h ) \cdot \nabla {\mathbf {q}}_h \,d{\mathbf {x}}\,dt = \int \limits _{t^n}^{t^{n+1}} \!\! \int \limits _{\varOmega _i } \theta _k {\mathbf {S}}({\mathbf {q}}_h, \nabla {\mathbf {q}}_h ) \,d{\mathbf {x}}\,dt \end{aligned}$$
(137)

and after integration by parts in time one obtains the final weak form in space-time:

$$\begin{aligned}&\int \limits _{\varOmega _i } \theta _k({\mathbf {x}},t^{n+1}) {\mathbf {q}}_h({\mathbf {x}},t^{n+1}) \,d{\mathbf {x}} - \! \int \limits _{\varOmega _i } \theta _k({\mathbf {x}},t^{n}) {\mathbf {u}}_h({\mathbf {x}},t^{n}) \,d{\mathbf {x}} - \! \! \int \limits _{t^n}^{t^{n+1}} \!\! \int \limits _{\varOmega _i } \partial _t \theta _k \, {\mathbf {q}}_h \,d{\mathbf {x}} \, dt + \nonumber \\&\quad \int \limits _{t^n}^{t^{n+1}} \!\! \int \limits _{\varOmega _i } \theta _k \, \nabla \cdot {\mathbf {F}}({\mathbf {q}}_h, \nabla {\mathbf {q}}_h) \,d{\mathbf {x}}\,dt\nonumber \\&\quad + \int \limits _{t^n}^{t^{n+1}} \!\! \int \limits _{\varOmega _i^{\circ } } \theta _k \mathbf {{\mathbf {B}}}({\mathbf {q}}_h ) \cdot \nabla {\mathbf {q}}_h \,d{\mathbf {x}}\,dt = \nonumber \\&\quad \int \limits _{t^n}^{t^{n+1}} \!\! \int \limits _{\varOmega _i } \theta _k {\mathbf {S}}({\mathbf {q}}_h, \nabla {\mathbf {q}}_h ) \,d{\mathbf {x}}\,dt. \nonumber \\ \end{aligned}$$
(138)

Equation (138) is a nonlinear element-local algebraic system in the unknowns \( {\hat{{\mathbf {q}}}}^n_{l,i}\), while the coefficients \(\hat{{\mathbf {u}}}_{l,i}^n\) are known from \({\mathbf {u}}_h({\mathbf {x}},t^{n})\) at the previous time. The solution of (138) is obtained by an iterative algorithm, whose convergence was proven in [17] for the case of hyperbolic conservation laws without non-conservative products and without source terms. Concerning the choice of a suitable initial guess for the unknown space-time coefficients \({\hat{{\mathbf {q}}}}_{l,i}^n\), the reader is referred to [55, 79]. This completes the description of the unlimited ADER-DG scheme.

4.2 A Posteriori Subcell Finite Volume Limiter

The numerical scheme presented above is high order accurate and linear in the sense of Godunov, hence it will inevitably generate spurious oscillations in the vicinity of shock waves and discontinuities according to the well-known Godunov theorem. In [12, 46, 50, 124] a new a posteriori subcell limiter was introduced for ADER-DG schemes, using the ideas of the MOOD paradigm put forward in [29, 38, 39] for finite volume schemes.

At the beginning of each time step, the unlimited scheme described in the previous section is run on the entire computational domain. This produces a so-called candidate solution, in the following denoted by \({\mathbf {u}}_h^*(\mathbf {x},t^{n+1})\). Next, the candidate solution is a posteriori checked against different numerical and physical detection criteria, such as the positivity of the water depth and of the determinant of \({\mathbf {Q}}\). Furthermore, the absence of floating point errors (NaN) is required and we also require a discrete maximum principle (DMP) to be satisfied, see [50]. If any of these numerical or physical detection criteria is violated, a high order DG cell is marked as troubled and is scheduled for the a posteriori subcell finite volume limiting.
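The detection step can be sketched as follows. This is a simplified Python illustration, not the implementation of [50]: the candidate solution in a cell is represented by a list of state vectors at sample points, the bounds `lo` and `hi` are assumed to be the per-variable minima and maxima of the solution at time \(t^n\) over the cell and its neighbours, and the relaxation parameters `delta0` and `eps` of the discrete maximum principle are hypothetical placeholders.

```python
import math


def is_troubled(cand, lo, hi, h_idx=0, det_q=None, delta0=1e-4, eps=1e-3):
    """Return True if the candidate solution of one cell violates a criterion.

    cand  : list of state vectors of the candidate solution at t^{n+1}
    lo,hi : per-variable bounds taken from the solution at t^n
    det_q : optional list of determinants of Q at the sample points
    """
    for state in cand:
        # floating point errors (NaN or infinity)
        if any(math.isnan(x) or math.isinf(x) for x in state):
            return True
        # positivity of the water depth
        if state[h_idx] <= 0.0:
            return True
    # positivity of the determinant of Q
    if det_q is not None and any(d <= 0.0 for d in det_q):
        return True
    # relaxed discrete maximum principle (assumed form of the relaxation)
    for k in range(len(lo)):
        delta = max(delta0, eps * (hi[k] - lo[k]))
        for state in cand:
            if not (lo[k] - delta <= state[k] <= hi[k] + delta):
                return True
    return False
```

A cell for which this function returns `True` would be marked as troubled and scheduled for the subcell finite volume limiter.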

The cells \(\varOmega _i\) that have been scheduled for subcell finite volume limiting are now split into \((2N+1)^d\) finite volume subcells, which are denoted by \(\varOmega _{i,s}\) with \(\varOmega _i = \bigcup _s \varOmega _{i,s}\). This subdivision of a high order DG element into many small finite volume subcells does not reduce the time step of the DG scheme because the CFL number of explicit discontinuous Galerkin schemes scales with \(1/(2N+1)\), while for the finite volume scheme used on the subgrid cells, the maximum Courant number allowed is of the order of unity. At time \(t^n\) the numerical solution in the finite volume subcells \(\varOmega _{i,s}\) is represented as usual via piecewise constant cell averages denoted by \({\bar{{\mathbf {u}}}}^n_{i,s}\) and which are obtained from the high order DG polynomials \({\mathbf {u}}_h(\mathbf {x},t^n)\) as

$$\begin{aligned} {\bar{{\mathbf {u}}}}^n_{i,s} = \frac{1}{|\varOmega _{i,s}|} \int \limits _{\varOmega _{i,s}} {\mathbf {u}}_h(\mathbf {x},t^n) \, d\mathbf {x}. \end{aligned}$$
(139)

These subcell averages are now evolved in time with the aid of a second order MUSCL-Hancock-type TVD finite volume scheme with minmod limiter, which is also a predictor-corrector method and thus looks quite similar to the ADER-DG scheme. The main difference is that now the test function is unity, hence the volume integral over the flux term disappears, and the spatial control volumes \(\varOmega _i\) are replaced by the sub-volumes \(\varOmega _{i,s}\):

$$\begin{aligned}&\left| \varOmega _{i,s} \right| \left( \bar{{\mathbf {u}}}^{n+1}_{i,s} - \bar{{\mathbf {u}}}^{n}_{i,s} \right) + \int \limits _{t^n}^{t^{n+1}} \! \! \int \limits _{\partial \varOmega _{i,s}} \left( {\mathcal {G}}\left( {\mathbf {q}}_h^-, {\mathbf {q}}_h^+ \right) + {\mathcal {D}}\left( {\mathbf {q}}_h^-, {\mathbf {q}}_h^+ \right) \right) \cdot {\mathbf {n}} \, \, dS \, dt \nonumber \\&\quad + \int \limits _{t^n}^{t^{n+1}} \! \! \int \limits _{\varOmega _{i,s}^\circ } \left( {\mathbf {B}}({\mathbf {q}}_h) \cdot \nabla {\mathbf {q}}_h \right) \, d\mathbf {x}\, dt = \int \limits _{t^n}^{t^{n+1}} \! \! \int \limits _{\varOmega _{i,s}} {\mathbf {S}}({\mathbf {q}}_h, \nabla {\mathbf {q}}_h) \, d\mathbf {x}\, dt\,, \end{aligned}$$
(140)

where the local space-time predictor \({\mathbf {q}}_h\) is now easily obtained from the Cauchy-Kovalewskaya procedure, see [115]. Once the cell averages \(\bar{{\mathbf {u}}}^{n+1}_{i,s}\) of all subcells contained within cell \(\varOmega _i\) have been computed at the new time \(t^{n+1}\) according to equation (140), the limited DG polynomial \({\mathbf {u}}'_h(\mathbf {x},t^{n+1})\) at time \(t^{n+1}\) can be simply obtained via a constrained least squares reconstruction. For this we require that

$$\begin{aligned} \frac{1}{|\varOmega _{i,s}|} \int \limits _{\varOmega _{i,s}} {\mathbf {u}}'_h(\mathbf {x},t^{n+1}) \, d\mathbf {x}= {\bar{{\mathbf {u}}}}^{n+1}_{i,s} \qquad \forall \varOmega _{i,s} \in \varOmega _i, \end{aligned}$$
(141)

and

$$\begin{aligned} \int \limits _{\varOmega _{i}} {\mathbf {u}}'_h(\mathbf {x},t^{n+1}) \, d\mathbf {x}= \sum \limits _{\varOmega _{i,s} \in \varOmega _i} |\varOmega _{i,s}| {\bar{{\mathbf {u}}}}^{n+1}_{i,s}. \end{aligned}$$
(142)

The constraint (142) means conservation of the solution within the element \(\varOmega _i\). In addition to the coefficients \(\hat{{\mathbf {u}}}_{i,l}^{n+1}\) of the limited DG polynomial, in all limited DG cells we also keep in memory the finite volume subcell averages \({\bar{{\mathbf {u}}}}^{n+1}_{i,s}\), since they serve as initial condition for the subcell finite volume limiter in the case when a cell is troubled also in the next time step, see [50]. This completes the description of the a posteriori subcell finite volume limiter. For more details, the reader is referred to [46, 50, 124].
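The projection (139) and the constrained least squares reconstruction (141)–(142) can be illustrated in one space dimension on the reference element \([0,1]\). The following Python sketch is restricted to \(N=2\), i.e. three Gauss-Legendre nodes and \(2N+1=5\) subcells, and solves the constrained problem via its KKT system with a small Gaussian elimination. These choices, and all function names, are ours and serve only as an example.

```python
import math

# Gauss-Legendre nodes and weights on [0, 1] for N = 2 (three points)
NODES = [0.5 - math.sqrt(15.0) / 10.0, 0.5, 0.5 + math.sqrt(15.0) / 10.0]
WEIGHTS = [5.0 / 18.0, 8.0 / 18.0, 5.0 / 18.0]
N_SUB = 2 * 2 + 1  # 2N+1 subcells per DG element in one dimension


def lagrange(l, x):
    """Nodal basis function number l evaluated at x."""
    val = 1.0
    for m, xm in enumerate(NODES):
        if m != l:
            val *= (x - xm) / (NODES[l] - xm)
    return val


def projection_matrix():
    """P[s][l] = average of basis function l over subcell s, cf. (139)."""
    return [[sum(w * lagrange(l, s / N_SUB + x / N_SUB)
                 for x, w in zip(NODES, WEIGHTS)) for l in range(len(NODES))]
            for s in range(N_SUB)]


def solve(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def reconstruct(avgs):
    """Least squares fit of (141) subject to the conservation constraint (142)."""
    P = projection_matrix()
    n = len(NODES)
    K = [[0.0] * (n + 1) for _ in range(n + 1)]
    rhs = [0.0] * (n + 1)
    for a in range(n):
        for b in range(n):
            K[a][b] = 2.0 * sum(P[s][a] * P[s][b] for s in range(N_SUB))
        K[a][n] = K[n][a] = WEIGHTS[a]      # integral of basis function a
        rhs[a] = 2.0 * sum(P[s][a] * avgs[s] for s in range(N_SUB))
    rhs[n] = sum(avgs) / N_SUB              # element mean of the subcell data
    return solve(K, rhs)[:n]                # drop the Lagrange multiplier
```

If the subcell averages stem from a polynomial of degree \(N\), the reconstruction returns its nodal values; for non-polynomial data, for instance a step profile, only the element mean is reproduced exactly.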

4.3 Renormalization of \({\mathbf {Q}}\)

In order to maintain a strict compatibility of the discrete energy conservation law (18) with the discrete trace of \({\mathbf {P}}\), for the inviscid case \(\varepsilon =0\) we proceed as follows: at the end of each time step, we compute in each degree of freedom of the DG scheme and in each control volume of the subcell finite volume limiter the trace of \({\mathbf {P}}\) from the total energy and subsequently rescale \({\mathbf {Q}}\) according to

$$\begin{aligned} (\text {tr}{\mathbf {P}})_l^{n+1}= & {} 2 (h E)_l^{n+1}/h_l^{n+1} - g h_l^{n+1} - \Vert {\mathbf {v}}_l^{n+1} \Vert ^2, \end{aligned}$$
(143)
$$\begin{aligned} {\tilde{{\mathbf {Q}}}}_l^{n+1}= & {} {\mathbf {Q}}_l^{n+1} \sqrt{ \frac{(\text {tr}{\mathbf {P}})_l^{n+1}}{\text {tr}({\mathbf {Q}}{\mathbf {Q}}^T)_l^{n+1} } }, \end{aligned}$$
(144)

where the subscript l denotes a generic degree of freedom, \({\mathbf {Q}}_l^{n+1}\) is the preliminary value as computed from the numerical scheme described previously and \({\tilde{{\mathbf {Q}}}}_l^{n+1}\) is the final result of \({\mathbf {Q}}\) after rescaling at the end of each time step.

We stress that the above renormalization (143)–(144) is not performed for the viscous system, i.e. when \(\varepsilon > 0\). Indeed, Theorem 1 establishes the compatibility of (44)–(46) with (47) at the continuous level, so on sufficiently fine meshes (well-resolved viscous flow) the compatibility with the energy conservation law (47) must hold automatically up to the order of accuracy of the numerical scheme.
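For a single degree of freedom, the rescaling (143)–(144) amounts to the following minimal Python sketch. The function name and argument layout are hypothetical; the sketch assumes that the trace computed from the total energy is positive and that \({\mathbf {Q}}\) is not the zero matrix.

```python
import math


def renormalize_Q(hE, h, v, Q, g=9.81):
    """Rescale Q so that tr(Q Q^T) matches tr P obtained from the total energy.

    hE : total energy density, h : water depth, v : velocity vector,
    Q  : preliminary matrix Q as a nested list.
    """
    tr_P = 2.0 * hE / h - g * h - sum(vi * vi for vi in v)   # (143)
    tr_QQT = sum(q * q for row in Q for q in row)            # tr(Q Q^T)
    scale = math.sqrt(tr_P / tr_QQT)   # (144), assumes tr_P > 0 and Q != 0
    return [[scale * q for q in row] for row in Q], tr_P
```

After the call, the sum of the squared entries of the returned matrix equals the trace of \({\mathbf {P}}\) implied by the energy, up to round-off.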

5 Numerical Tests

Throughout this section the gravity constant is set to \(g=9.81\). Moreover, we use SI units (m, s, etc.) without writing them explicitly.

5.1 Test Problem with Exact Solution

In the following we solve a test problem suggested in [69], which has an exact solution of the PDE system (1)–(5) that reads

$$\begin{aligned} h(\mathbf {x},t)= & {} \frac{h_0}{1 + \beta ^2 t^2}, \qquad {\mathbf {v}}(\mathbf {x},t) = \frac{\beta }{1 + \beta ^2 t^2} \left( \begin{array}{c} +y + \beta t x \\ -x + \beta t y \end{array} \right) , \end{aligned}$$
(145)
$$\begin{aligned} {\mathbf {P}}(\mathbf {x},t)= & {} \frac{1}{(1 + \beta ^2 t^2)^2} \left( \begin{array}{cc} \lambda + \gamma \beta ^2 t^2 &{} (\lambda -\gamma ) \beta t \\ (\lambda -\gamma ) \beta t &{} \gamma + \lambda \beta ^2 t^2 \end{array} \right) . \end{aligned}$$
(146)

In our setup we choose the above exact solution at time \(t=0\) as initial condition for h and \({\mathbf {v}}\), while \({\mathbf {Q}}\) is initialized as \({\mathbf {Q}}= \text {diag}(\sqrt{P_{11}}, \sqrt{P_{22}})\). Our numerical simulations are run until a final time of \(t=1\) in the computational domain \(\varOmega = [-1,1]^2\) with \(\beta = \lambda = \gamma =0.1\) and \(h_0=1\). A third order ADER-DG scheme (\(N=2\)) is run on a sequence of successively refined meshes of \(N_x=5,10,15,20\) elements. Since the exact solution is a global polynomial of degree one in space, the spatial discretization is exact for this test problem. Since time variations in this test problem are also quite small and since the high order DG schemes require a rather small time step for stability, the overall error that is observed on all meshes, see Fig. 1, is of the order of machine accuracy, as expected.
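The exact solution (145)–(146) is easy to evaluate, and one can check numerically that it satisfies the mass conservation equation. The Python sketch below does so with centred finite differences; the step size `eps` and the function names are our own choices, and only the mass equation is tested here.

```python
def exact_solution(x, y, t, h0=1.0, beta=0.1, lam=0.1, gamma=0.1):
    """Exact solution (145)-(146): returns h, the velocity (v1, v2) and P."""
    d = 1.0 + beta ** 2 * t ** 2
    h = h0 / d
    v = (beta / d * (y + beta * t * x), beta / d * (-x + beta * t * y))
    off = (lam - gamma) * beta * t / d ** 2
    P = [[(lam + gamma * beta ** 2 * t ** 2) / d ** 2, off],
         [off, (gamma + lam * beta ** 2 * t ** 2) / d ** 2]]
    return h, v, P


def mass_residual(x, y, t, eps=1e-5, **params):
    """Centred finite-difference estimate of dh/dt + div(h v)."""
    def fields(xx, yy, tt):
        h, v, _ = exact_solution(xx, yy, tt, **params)
        return h, h * v[0], h * v[1]
    h_t = (fields(x, y, t + eps)[0] - fields(x, y, t - eps)[0]) / (2.0 * eps)
    f_x = (fields(x + eps, y, t)[1] - fields(x - eps, y, t)[1]) / (2.0 * eps)
    g_y = (fields(x, y + eps, t)[2] - fields(x, y - eps, t)[2]) / (2.0 * eps)
    return h_t + f_x + g_y
```

At \(t=0\) the function returns \(h=h_0\), a rigid rotation \({\mathbf {v}} = \beta (y,-x)\) and \({\mathbf {P}} = \text {diag}(\lambda ,\gamma )\), which is the initial condition used in the test.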

Fig. 1

Test case with exact solution: observed errors in \(L_\infty \) norm for the velocity components u and v and for \(\text {tr}{\mathbf {P}}\) obtained on different meshes

5.2 Numerical Convergence Study

Aiming at assessing the behaviour of the ADER-DG methodology, we now consider a manufactured solution test given by

$$\begin{aligned} h({\mathbf {x}},t)= & {} h_{0}, \qquad h{\mathbf {v}}({\mathbf {x}},t) = \begin{pmatrix} \sin (x)\cos (y)\cos (t)\\ -\cos (x)\sin (y)\cos (t) \end{pmatrix},\nonumber \\ {\mathbf {Q}}({\mathbf {x}},t)= & {} q_0\begin{pmatrix} \sin (x)\cos (y)\cos (t) &{} -\sin (x)\cos (y)\cos (t) \\ -\cos (x)\sin (y)\cos (t) &{} \cos (x)\sin (y)\cos (t) \end{pmatrix} \end{aligned}$$
(147)

with corresponding total energy,

$$\begin{aligned} h E ({\mathbf {x}},t) = g\frac{h_0^2}{2} + \left( \frac{2}{h_0} + h_0 q_0^2\right) \left( \sin ^2(x)\cos ^2(y)+\cos ^2(x)\sin ^2(y)\right) \cos ^2(t). \end{aligned}$$
(148)

Moreover, the former expression for \({\mathbf {Q}}({\mathbf {x}},t)\) yields a stress tensor of the form

$$\begin{aligned} {\mathbf {P}}({\mathbf {x}},t) = 2q_0^2\begin{pmatrix} \sin ^2(x) \cos ^2(y) \cos ^2(t) &{} - \sin (x) \cos (y) \cos ^2(t) \cos (x) \sin (y) \\ - \sin (x) \cos (y) \cos ^2(t) \cos (x) \sin (y) &{} \cos ^{2}(x) \sin ^2(y) \cos ^2(t) \end{pmatrix}.\nonumber \\ \end{aligned}$$
(149)

To complete the definition of the problem we set \(h_{0}=1\), \(q_{0}=0.5\) and \(C_{f}=C_{r}=0\). Let us remark that to get the sought solution, (147)–(148), a set of analytical source terms, calculated by substitution of (147)–(148) in (14)–(17), must be added to the right hand side of the original system. The simulation is run until \(t=0.25\) using ADER-DG schemes of polynomial degrees \(N\in \left\{ 2,3,4,5\right\} \). The errors in \(L^{2}\) norm obtained for h, u and \(Q_{11}\) are reported in Table 1. Overall, the expected order of accuracy is reached, see bold numbers in Table 1.
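The relation between the manufactured \({\mathbf {Q}}\) in (147) and the stress tensor (149) can be checked numerically. The short Python sketch below forms \({\mathbf {P}}= {\mathbf {Q}}{\mathbf {Q}}^T\) entry by entry and compares it with the closed form (149). All function names are hypothetical and serve only this illustration.

```python
import math


def manufactured_Q(x, y, t, q0=0.5):
    """Object Q of the manufactured solution (147)."""
    a = math.sin(x) * math.cos(y) * math.cos(t)
    b = math.cos(x) * math.sin(y) * math.cos(t)
    return ((q0 * a, -q0 * a), (-q0 * b, q0 * b))


def stress_from_Q(Q):
    """P = Q Q^T for a 2x2 matrix stored as nested tuples."""
    return tuple(
        tuple(sum(Q[i][k] * Q[j][k] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def stress_closed_form(x, y, t, q0=0.5):
    """Closed-form stress tensor (149)."""
    a = math.sin(x) * math.cos(y)
    b = math.cos(x) * math.sin(y)
    c2 = math.cos(t) ** 2
    f = 2.0 * q0 ** 2 * c2
    return ((f * a * a, -f * a * b), (-f * a * b, f * b * b))


def max_difference(x, y, t):
    """Largest entrywise difference between Q Q^T and (149)."""
    P_num = stress_from_Q(manufactured_Q(x, y, t))
    P_ref = stress_closed_form(x, y, t)
    return max(abs(P_num[i][j] - P_ref[i][j]) for i in range(2) for j in range(2))
```

The trace of the computed tensor equals the sum of the squared entries of \({\mathbf {Q}}\) and is therefore non-negative at every point.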

Table 1 \(L^{2}\) errors and convergence rates for the manufactured test obtained using the ADER-DG method with \(N\in \left\{ 2,3,4,5 \right\} \). The simulations were run on Cartesian meshes of \(N_{x}\times N_{x}\) elements up to time \(t=0.25\)
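Convergence rates of this kind are commonly computed from the errors on two consecutive meshes as \(\log (e_{\text {coarse}}/e_{\text {fine}}) / \log (N_{x,\text {fine}}/N_{x,\text {coarse}})\). The following Python sketch illustrates this formula. The error values are synthetic, generated from an exact power law for \(N=3\), and are not the values of Table 1. The function name `observed_orders` is an illustrative choice.

```python
import math


def observed_orders(mesh_sizes, errors):
    """Observed order between consecutive meshes with Nx elements per direction."""
    return [
        math.log(errors[i - 1] / errors[i]) / math.log(mesh_sizes[i] / mesh_sizes[i - 1])
        for i in range(1, len(errors))
    ]


# Synthetic data only: errors following C * (1/Nx)^(N+1) for degree N = 3.
N = 3
mesh_sizes = [10, 20, 40]
synthetic_errors = [2.0 * (1.0 / nx) ** (N + 1) for nx in mesh_sizes]
orders = observed_orders(mesh_sizes, synthetic_errors)
```

For a scheme of polynomial degree \(N\) the expected order is \(N+1\), which the synthetic data reproduce by construction.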

5.3 Riemann Problems

The numerical discretization of nonconservative hyperbolic PDE is notoriously difficult, see e.g. [4, 23] for a more detailed discussion. This is particularly true when the nonconservative product acts across genuinely nonlinear waves, which is usually the case for the nonconservative equation (16). In [69] a special split scheme was developed that splits the original system (1)–(5) into two quasi-conservative subsystems; during the solution of each subsystem, energy conservation was rigorously enforced. In this paper, two new unsplit schemes have been proposed for the discretization of (14)–(18). In the case of the path-conservative ADER-DG scheme described in Sect. 4, the total energy conservation law is explicitly discretized and for the inviscid case \(\varepsilon =0\) the object \(Q_{ik}\) is renormalized at the end of each time step according to (143)–(144), in order to maintain discrete compatibility with total energy conservation for each degree of freedom of the DG scheme and for each subcell average in case of the subcell FV limiter. When the path-conservative ADER-DG scheme is instead applied to the viscous system with \(\varepsilon > 0\), no renormalization of \(Q_{ik}\) is carried out and the compatibility with the total energy conservation law must be guaranteed solely by the high order of accuracy of the scheme in combination with a sufficiently fine mesh. As such, the fully resolved direct numerical solution (DNS) of the viscous system (44)–(46), which also includes the production term, with small but not vanishing \(\varepsilon > 0\), constitutes the highest level of fidelity concerning the discretization of the nonconservative product: the solution can be considered smooth and therefore there are no ambiguities concerning the proper definition of the nonconservative product.
Instead, the compatible HTC scheme, which implements a semi-discrete Godunov formalism, does not directly discretize the energy equation at all, but total energy conservation is a mere consequence of all the other equations at the semi-discrete level. In this case, the discretization of the nonconservative products, of the Reynolds stress tensor and of the viscous terms (the red and blue terms in (44)–(46)) is carried out in such a manner that at the semi-discrete level total energy conservation is automatically ensured.

In the following we solve three Riemann problems on the domain \(\varOmega = [0,1] \times [0,0.5]\) with initial condition

$$\begin{aligned} {\mathbf {q}}(\mathbf {x},0) = \left\{ \begin{array}{lll} {\mathbf {q}}^L \quad &{} \text {if} &{} \quad x \le 0.5, \\ {\mathbf {q}}^R \quad &{} \text {if} &{} \quad x > 0.5. \end{array} \right. \end{aligned}$$
(150)

The left and right initial states, as well as the final simulation times \(t_{\text {end}}\), are summarized in Table 2. The values of the state variables not indicated in the table are set to zero, i.e. \(v_1 = 0\), \(Q_{12}=0\) and \(Q_{21}=0\). The simulations are run with four different numerical schemes S1–S4:

  1. (S1)

    The split scheme of Gavrilyuk et al. [69], using a very fine mesh in one space dimension, which serves as a reference solution. This scheme makes explicit use of the total energy conservation law in each subsystem used in the splitting approach.

  2. (S2)

    A high order unsplit ADER-DG scheme (132) with polynomial approximation degree \(N=3\) and \(11,200 \times 4\) elements, applied to the viscous system (44)–(46) with small but positive viscosity parameter \(\varepsilon > 0\) (vanishing viscosity limit). Since \(\varepsilon > 0\) we do not solve the energy equation (47) explicitly and apply no renormalization to \(Q_{ik}\). To obtain the discrete compatibility with the energy conservation law (47) and for the proper definition of the nonconservative products, only a sufficiently fine mesh is needed in combination with the high order DG scheme (fully resolved DNS). This approach serves to generate an additional and totally independent reference solution.

  3. (S3)

    The new unsplit thermodynamically compatible HTC scheme based on the semi-discrete Godunov formalism described in Sect. 3. This scheme is by construction exactly compatible with the viscous system (44)–(46) at the semi-discrete level and therefore the semi-discrete energy conservation law is a direct consequence of the discretization of all the other equations and thus does not need to be discretized explicitly again.

  4. (S4)

    The high order unsplit ADER-DG scheme applied to the inviscid system, setting \(\varepsilon =0\) in (44)–(47). In this case, the energy equation (47) is explicitly discretized and the object \(Q_{ik}\) is renormalized at the end of each timestep according to (143)–(144).

We emphasize that in all four schemes, discrete compatibility with the total energy conservation law is always assured in one way or another: either by directly using the total energy conservation equation within the numerical scheme (S1 and S4), or by achieving compatibility exactly at the discrete level (S3). In S2, compatibility is achieved only up to negligible discretization errors, by using a very high order scheme applied to the viscous system with \(\varepsilon >0\) on a sufficiently fine mesh (fully resolved DNS).

The computational results obtained for the three Riemann problems are shown in Figs. 2, 3 and 4, where the exact mesh resolution is also given for each scheme, together with the choice of the viscosity parameter \(\varepsilon \) in the case of the simulation of the vanishing viscosity limit. For all Riemann problems one can note an excellent agreement between the numerical solutions obtained with all four schemes (S1–S4) listed above.

This clearly highlights that discrete thermodynamic compatibility is a key feature for the correct discretization of nonconservative products that are acting across genuinely nonlinear fields, like the ones present in (16).

Table 2 Initial left and right states for the Riemann problems RP1-RP3
Fig. 2

Numerical solution of the Riemann problem RP1 obtained with different numerical schemes at time \(t=0.5\): split scheme of [69] on 250,000 elements (S1, solid black line); vanishing viscosity limit of the viscous system (44)–(47) with \(\varepsilon = 2 \cdot 10^{-6}\) using a fourth order ADER-DG scheme (\(N=3\)) on 11,200 elements (S2, dashed blue line); unsplit thermodynamically compatible q-scheme on 56,000 elements (S3, dashed red line); fourth order ADER-DG scheme (\(N=3\)) applied to the inviscid model (14)–(18) using 1,400 elements (S4, squares) (Color figure online)

Fig. 3

Numerical solution of the Riemann problem RP2 obtained with different numerical schemes at time \(t=10\): split scheme of [69] on 100,000 elements (S1, solid black line); vanishing viscosity limit of the viscous system (44)–(47) with \(\varepsilon = 1 \cdot 10^{-6}\) using a fourth order ADER-DG scheme (\(N=3\)) on 10,200 elements (S2, dashed blue line); unsplit thermodynamically compatible q-scheme on 28,000 elements (S3, dashed red line); fourth order ADER-DG scheme (\(N=3\)) applied to the inviscid model (14)–(18) using 1,000 elements (S4, squares) (Color figure online)

Fig. 4

Numerical solution of the Riemann problem RP3 obtained with different numerical schemes at time \(t=0.5\): split scheme of [69] on 250,000 elements (S1, solid black line); vanishing viscosity limit of the viscous system (44)–(47) with \(\varepsilon = 2 \cdot 10^{-6}\) using a fourth order ADER-DG scheme (\(N=3\)) on 10,200 elements (S2, dashed blue line); unsplit thermodynamically compatible q-scheme on 56,000 elements (S3, dashed red line); fourth order ADER-DG scheme (\(N=3\)) applied to the inviscid model (14)–(18) using 1,400 elements (S4, squares) (Color figure online)

5.4 One Dimensional Brock Profile

Here we repeat the numerical experiment concerning one-dimensional roll waves proposed in [69] and compare the obtained numerical results with the experimental data provided by Brock in [15, 16]. The initial condition is given according to [69] by \(h = h_0 ( 1 + a \sin (2 \pi x / L) )\) with \(a=0.05\), \(v_1 = \sqrt{ g h_0 \tan \theta / C_f}\), \(v_2=0\) and \({\mathbf {Q}}= \sqrt{\frac{1}{2}\varphi h^2} {\mathbf {I}}\). The bottom slope angle of this test problem is \(\theta = 0.05011\) with \(\partial _x b = \tan (\theta )\), the still water depth is set to \(h_0 = 0.00798\), the bottom friction coefficient is chosen as \(C_f = 0.0036\), the parameter \(C_r\) is set to \(C_r = 0.00035\) and \(\varphi = 22.76\), see [69].

The computational domain is \(\varOmega = [0, L] \times [0, 0.5]\) with \(L=1.3\) and is discretized with \(104 \times 20\) ADER-DG elements of polynomial approximation degree \(N=3\). Periodic boundary conditions are applied in the \(x_1\) and \(x_2\) directions. Simulations are run for system (14)–(16) until a final time of \(t=12.5\). For this test, the bottom slope term is simply implemented as an algebraic source term in order to be compatible with the periodic boundary conditions. The numerical results obtained with the path-conservative ADER-DG scheme and the experimental profile of Brock are depicted in the left panel of Fig. 5. Overall, we can note a very good agreement between the numerical results and the experimental reference data. In the right panel of Fig. 5 a visualization of the a posteriori subcell limiter is shown (troubled cells are highlighted in red, while unlimited cells are plotted in blue). It can be noticed that the limiter is only active at the shock wave.
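The initial condition of this roll wave test can be written down in a few lines. The Python sketch below evaluates \(h\), \({\mathbf {v}}\) and \({\mathbf {Q}}\) at a given position \(x\) with the parameters stated above. The constant and function names are illustrative assumptions and do not refer to the actual code used for the simulations.

```python
import math

G = 9.81          # gravity constant
H0 = 0.00798      # still water depth h_0
A_PERT = 0.05     # relative amplitude a of the sinusoidal perturbation
L_DOMAIN = 1.3    # domain length L
THETA = 0.05011   # bottom slope angle
C_F = 0.0036      # bottom friction coefficient
PHI = 22.76       # parameter phi


def brock_initial_state(x):
    """Return (h, (v1, v2), Q) of the roll wave initial condition at position x."""
    h = H0 * (1.0 + A_PERT * math.sin(2.0 * math.pi * x / L_DOMAIN))
    v1 = math.sqrt(G * H0 * math.tan(THETA) / C_F)
    v2 = 0.0
    q = math.sqrt(0.5 * PHI * h * h)  # Q = sqrt(phi h^2 / 2) I
    Q = ((q, 0.0), (0.0, q))
    return h, (v1, v2), Q
```

With this choice the velocity satisfies \(C_f v_1^2 = g h_0 \tan \theta \), and the trace of \({\mathbf {Q}}{\mathbf {Q}}^T\) equals \(\varphi h^2\).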

Fig. 5

Two-dimensional numerical simulation of the roll wave experiment of Brock [15, 16] at time \(t=12.5\). Left: comparison of a 1D cut through the numerical simulation with the experimental profile. Right: computational grid with troubled cells highlighted in red and unlimited cells colored in blue (Color figure online)

5.5 Numerical Simulation of the SWASI Experiment

In this last and most complex numerical test we simulate the SWASI experiment proposed by Foglizzo et al. in [61], which was numerically investigated in [80]. The flow field is turbulent and turbulence is responsible for the developing flow structures. The computational domain is \(\varOmega = [-1,+1]^2\) and is discretized with a uniform Cartesian mesh composed of \(256 \times 256\) ADER-DG elements of polynomial approximation degree \(N=3\).

The bottom topography of this test is given according to [80] by

$$\begin{aligned} b(r) = \left\{ \begin{array}{lll} \frac{A}{L_1^4} \left( \left( r - R^- - L_1 \right) ^2 - L_1^2 \right) ^2 &{} \text { if } &{} R^- \le r \le 2 L_1 + R^-, \\ \left( r - R^- - 2 L_1 \right) \tan \beta &{} \text { if } &{} r > 2 L_1 + R^-, \end{array} \right. \end{aligned}$$
(151)

with \(r = \Vert \mathbf {x}\Vert \), \(L_1 = 0.02\), \(A=0.005\), \(\beta =0.07\), \(R^- = 0.08\) and \(R_1 = \sqrt{2}\). The model parameters for this test are set to \(C_f=0.0036\), \(C_r = 1\) and \(\varphi = 2\). The reference inflow discharge is chosen as \(q_0 = 1.2 \cdot 10^{-3}\), while the reference water depth at \(R_1\) is set to \(h_0=0.003\).
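The piecewise bottom profile (151) can be evaluated as in the following Python sketch. Formula (151) does not specify the bottom for \(r < R^-\); returning zero there is an assumption made only for this illustration, chosen so that the profile remains continuous at \(r = R^-\). The function name `bottom` is hypothetical.

```python
import math


def bottom(r, A=0.005, L1=0.02, R_minus=0.08, beta=0.07):
    """Bottom elevation b(r) according to (151)."""
    if r < R_minus:
        # Assumption: (151) is not defined inside the central hole; use b = 0.
        return 0.0
    if r <= 2.0 * L1 + R_minus:
        # Smooth bump of height A centered at r = R_minus + L1.
        return A / L1 ** 4 * ((r - R_minus - L1) ** 2 - L1 ** 2) ** 2
    # Linear slope with angle beta outside the bump.
    return (r - R_minus - 2.0 * L1) * math.tan(beta)
```

The bump vanishes at both ends of its interval and reaches its maximum \(A\) at \(r = R^- + L_1\), so the two branches of (151) match continuously at \(r = 2 L_1 + R^-\).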

In contrast to [80], in this paper the initial velocity field is chosen to be the stationary equilibrium between bottom slope and bottom friction, which satisfies the following ODE in radial direction:

$$\begin{aligned} \frac{d u_r}{dr} = u \frac{ C_f |u| u^3 r^3 - u g q_0 r^2 \tan \beta \, - q_0^2 g}{ q_0 r (r u^3 + q_0 g) } \end{aligned}$$
(152)

with initial condition \(u_r(R_1) = -q_0/(h_0 R_1)\) for both regions, \(r \le R_1\) and \(r > R_1\); here \(u\) is shorthand for the radial velocity \(u_r\). This ODE is solved once at the beginning of the simulation with the aid of a classical fourth order Runge-Kutta scheme. Once the radial velocity \(u_r\) is known, the water depth follows from \(h = -q_0 / ( r u_r)\), consistent with the sign of the initial condition (\(u_r < 0\) for the inflow), and the final velocity field is given by the equilibrium solution plus a sinusoidal perturbation as follows:

$$\begin{aligned} v_1 = u_r \cos \theta \left( 1 + d \sin (16 \theta ) \right) , \quad v_2 = u_r \sin \theta \left( 1 + d \sin (16 \theta ) \right) , \end{aligned}$$
(153)

with \(d=0.005\). We furthermore set \({\mathbf {Q}}= \sqrt{\varphi h_0^2} \, {\mathbf {I}}\) as initial condition for the object \({\mathbf {Q}}\). The outflow through the central hole is generated by a sink term, setting \(h=10^{-2}\), \({\mathbf {v}}=0\) and \({\mathbf {Q}}= 10^{-5} \, {\mathbf {I}}\) for \(r<0.075\) at all times.
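The construction of the initial velocity field can be sketched in Python as follows: the ODE (152) is integrated inward from \(R_1\) with a classical fourth order Runge-Kutta scheme and the perturbation (153) is then applied. This is only a minimal illustration under stated assumptions: \(u\) in (152) is taken to be \(u_r\), the slope term \(\tan \beta \) is used for all radii visited (valid outside the bump of (151)), and the depth is computed from the magnitude of \(u_r\) so that it stays positive. All names are hypothetical.

```python
import math

G = 9.81
C_F = 0.0036
BETA = 0.07
Q0 = 1.2e-3
H0 = 0.003
R1 = math.sqrt(2.0)


def du_dr(r, u):
    """Right-hand side of the radial equilibrium ODE (152), with u = u_r."""
    num = (C_F * abs(u) * u ** 3 * r ** 3
           - u * G * Q0 * r ** 2 * math.tan(BETA)
           - Q0 ** 2 * G)
    den = Q0 * r * (r * u ** 3 + Q0 * G)
    return u * num / den


def integrate_rk4(r_end, n_steps):
    """Integrate (152) from R1 to r_end; returns a list of (r, u_r) pairs."""
    dr = (r_end - R1) / n_steps
    r, u = R1, -Q0 / (H0 * R1)
    profile = [(r, u)]
    for _ in range(n_steps):
        k1 = du_dr(r, u)
        k2 = du_dr(r + 0.5 * dr, u + 0.5 * dr * k1)
        k3 = du_dr(r + 0.5 * dr, u + 0.5 * dr * k2)
        k4 = du_dr(r + dr, u + dr * k3)
        u += dr * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        r += dr
        profile.append((r, u))
    return profile


def depth(r, u_r):
    """Water depth from the discharge relation, using |u_r| (assumption)."""
    return Q0 / (r * abs(u_r))


def perturbed_velocity(x, y, u_r, d=0.005):
    """Cartesian velocity (153): equilibrium plus sinusoidal perturbation."""
    theta = math.atan2(y, x)
    factor = 1.0 + d * math.sin(16.0 * theta)
    return u_r * math.cos(theta) * factor, u_r * math.sin(theta) * factor
```

A call such as `integrate_rk4(0.5, 400)` yields the radial velocity on a uniform set of radii between \(R_1\) and \(r=0.5\); values at arbitrary mesh points would then have to be interpolated, which is not shown here.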

The numerical simulations are carried out until a final time of \(t=57\). Figure 6 shows the computational results obtained for the water depth, the Froude number, the angular velocity and the instantaneous a posteriori subcell limiter map, in which troubled cells are highlighted in red, while unlimited cells are plotted in blue. The limiter is essentially activated along the moving shock front. The obtained numerical results agree qualitatively very well with the experimental observations made in [61] and with the numerical results previously presented in [80]. In particular, one can note the characteristic cusp in the shock front that is visible both in the experiment [61] and in the numerical results of [80]. At this point we would like to emphasize that the model formulation as well as the numerical scheme used in this paper are completely different from the model formulation and the scheme employed in [80]. While in [80] the problem was solved after rewriting the governing PDE system in polar coordinates, here we solve the problem directly in Cartesian coordinates on a Cartesian mesh. Furthermore, the model used in [80] was based on the temporal evolution equation of \({\mathbf {P}}\), while here we use the new model formulation in terms of the object \({\mathbf {Q}}\) that guarantees \(\text {tr}{\mathbf {P}}\ge 0\) by construction. Last but not least, in [80] a split finite volume scheme was used, while the present paper employs an unsplit high order ADER-DG scheme. The fact that the numerical results obtained with different models and different schemes agree well with each other and with experimental observations supports the validity of the different mathematical model formulations as well as of the chosen numerical discretizations.
As already pointed out in [80], the same simulation run with the same model parameters (\(C_f=0.0036\)) and the same numerical method on the same mesh applied to the simple shallow water equations leads only to a steady circular shock wave, without developing any shock instability and without showing the typical cusp of the SWASI experiment, see Fig. 7.

Fig. 6

Numerical simulation of the SWASI experiment with a fourth order ADER-DG scheme at time \(t=57\) s applied to the model for unsteady turbulent shallow water flows (14)–(18). Water depth (top left), Froude number (top right), angular velocity (bottom left) and limiter map with limited cells highlighted in red and unlimited cells plotted in blue (bottom right) (Color figure online)

Fig. 7

Numerical simulation of the SWASI experiment with a fourth order ADER-DG scheme at time \(t=57\) s applied to the classical shallow water equations. Water depth (left) and Froude number (right). With the classical shallow water model no shock wave instability develops

6 Conclusion

In this paper we have introduced a new reformulation of the first order hyperbolic model for unsteady turbulent shallow water flows introduced and studied in [11, 69, 80]. The main idea of the model reformulation proposed in this paper is the decomposition of the specific Reynolds stress tensor \({\mathbf {P}}\) with the aid of a new object \({\mathbf {Q}}\) so that \({\mathbf {P}}= {\mathbf {Q}}{\mathbf {Q}}^T\). This guarantees that \(\text {tr}{\mathbf {P}}\ge 0\) by construction also at the discrete level for all times, since in terms of \({\mathbf {Q}}\) the trace of the Reynolds stress tensor, i.e. the turbulent kinetic energy, can be written as \(\text {tr}{\mathbf {P}}= Q_{ij} Q_{ij} \ge 0\). Compared to the previous model used in [11, 69, 80] we also add a thermodynamically compatible viscous flux and an associated entropy production term that together guarantee the compatibility of the viscous system with the total energy conservation law and with the entropy inequality, which in the new reformulation can be simply expressed in terms of an extra conservation law for the determinant of \({\mathbf {Q}}\). Based on the Godunov form of hyperbolic conservation laws found by Godunov in his groundbreaking work An interesting class of quasilinear systems [70], we have derived a new thermodynamically compatible semi-discrete finite volume scheme that mimics the Godunov form of the inviscid conservative part of the system exactly at the semi-discrete level. The proposed scheme can therefore be called a discrete Godunov formalism, or a hyperbolic and thermodynamically compatible (HTC) finite volume scheme. Subsequently, a thermodynamically compatible viscous extension of the scheme has also been proposed, together with the thermodynamically compatible discretization of the remaining nonconservative terms and of the Reynolds stress tensor, which do not fit into the original Godunov formalism.
At this point we stress again that the proposed scheme mimics the underlying viscous system of the mathematical model exactly at the semi-discrete level and as such also falls into the class of structure-preserving schemes, since all properties of the thermodynamic structure of the governing PDE system are properly maintained by the numerical scheme. The paper also considers high order path-conservative fully-discrete one-step ADER discontinuous Galerkin schemes with a posteriori subcell limiter that can be applied to both the viscous and the inviscid form of the mathematical model. The performance and accuracy of all schemes are carefully assessed with the aid of three Riemann problems, where a direct comparison with the scheme introduced in [69] has also been shown. An excellent agreement between all different methods was observed in all cases. For the high order ADER-DG schemes a numerical convergence study was carried out with the aid of a manufactured solution, since the analytic solution used in [69] was too simple for a high order DG scheme. The new model was applied to the simulation of roll waves, comparing with the experimental data of Brock and obtaining an excellent level of agreement between the numerical and the experimental results. As a last test problem we have carried out a numerical simulation of the SWASI experiment of Foglizzo et al. [61], using the computational setup proposed in [80]. Our simulations show the same cusp in the moving shock front that was already observed in the experiments and in the numerical simulations shown in [80].

In the future we plan to extend the new family of thermodynamically compatible schemes to the equations of nonlinear hyperelasticity [14, 67, 75, 77, 87, 102] and to the unified hyperbolic model of continuum mechanics [13, 17, 47, 93, 102], as well as to hyperbolic reformulations of dispersive systems [7, 18, 37, 58]. Further work will also concern the extension of the discrete Godunov formalism presented in this paper to higher order semi-discrete discontinuous Galerkin finite element schemes, see e.g. [36].

Another open challenge is the development of thermodynamically compatible schemes like those presented in this paper that also maintain curl and divergence involution constraints exactly at the semi-discrete level, similar to the structure-preserving semi-implicit method recently proposed in [13], which, however, was not thermodynamically compatible.