1 Introduction

This article presents numerical results for a compatible finite element discretisation of the compressible Euler equations in a vertical slice geometry (i.e. two dimensional with one of the directions being the vertical). Vertical slice geometry provides an opportunity to evaluate the performance and behaviour of numerical discretisations for atmosphere dynamical cores using testcases that can be run on a laptop or workstation, forming a useful step in the development of global atmosphere dynamical cores.

The motivation for compatible finite element methods is that they provide a naturally stable discretisation for the equations linearised about a state of rest, i.e. no numerical stabilisation (such as Riemann solvers, divergence damping, artificial viscosity, etc) is required for the wave component of the solution. This stability means that they are absent of spurious pressure modes, i.e. pressure patterns that are highly oscillatory but lead to small numerical gradients. Further, when the Coriolis force is introduced, they support exactly steady geostrophic states (Cotter and Shipton 2012), and they avoid the spurious branches of inertial oscillations (Natale et al. 2016) that are present in many finite element discretisations (such as P1\(_{DG}\)-P2 and CR1-P0 on triangles). This satifies the main requirements set out in (Staniforth and Thuburn 2012), which became a wishlist for discretisations considered in the Gung Ho project designing a new atmosphere dynamical core for the Met Office. Natale et al. (2016) also showed that if an appropriate space is chosen for potential temperature as proposed by Guerra and Ullrich (2016) (guided by the Lorenz staggering for finite difference methods), there are no spurious hydrostatic modes. Compatible finite element methods are also flexible in allowing to choose finite element spaces so that there are sufficient velocity degrees of freedom per pressure degree of freedom (avoiding the spurious inertia-gravity wave modes present in triangular C-grid discretisations (Danilov 2010)). Finally, they are consistent on very general meshes (arbitrary triangulations with some minimum aspect ratio, and quadrilateral meshes coming from piecewise smooth maps applied to regular grids) and spaces can be selected of arbitrary high order consistency. In particular, the Coriolis term is consistent on cubed sphere meshes, avoiding an issue discovered with the C-grid discretisation when applied to cubed sphere or dual icosahedral meshes when formulated to satisfy the properties above (Thuburn and Cotter 2012; Thuburn et al. 2014). The numerical weather prediction community is moving towards such grids because they avoid the parallel bottlenecks associated with the poles in the latitude longitude grids in current and previous use. It was for this reason that compatible finite element methods were selected for the Gung Ho dynamical core, which is built around the lowest order spaces currently (Sergeev et al. 2023; Melvin et al. 2019).

Compatible finite element methods also lend themselves to variational and conservative formulations. These formulations are the result of extension of similar schemes constructed using the C-grid finite difference staggering (Arakawa and Lamb 1981; Gassmann 2013; Dubos et al. 2015). Taylor et al. (2020) have recently provided a formulation using spectral element methods.

McRae and Cotter (2014) provided a scheme for the rotating shallow water equations that conserves energy and enstrophy. This has been extended to bounded domains (Bauer and Cotter 2018), upwinded formulations that dissipate enstrophy but preserve energy (Wimmer et al. 2020), and vertical slice model (Wimmer et al. 2021). An alternative set of compatible finite element spaces based on splines was presented in Eldred et al. (2019), and a related approach based on mimetic spectral elements was presented in Lee and Palha (2018, 2020).

A variational discretisation (i.e., a discretisation derived from Hamilton’s principle) for the two dimensional incompressible Euler equations was derived in Natale and Cotter (2018), which was further developed for more general fluid models in Gawlik and Gay-Balmaz (2020, 2022). We are not using variational or conservative formulations in this paper, but we note that our formulation is rather close to them, which might be expected to reduce spurious energy transfers between kinetic, potential and internal energy.

The lowest order compatible finite element spaces (i.e., those spaces with RT0 for velocity and P0 used for pressure/density) are closely related to the C grid finite difference method popular amongst many operational weather models (Wood et al. 2014,for example). However, these spaces are only first order accurate, so a finite volume approach is required where one is in effect solving for cell averaged quantities, and higher order accuracy for those cell averaged quantities must be obtained by using finite volume transport schemes, which have a stencil over several cells. This was done in (Melvin et al. 2019). The advantage of using the next-to-lowest-order (NLO) spaces (RT1 is used for velocity, and P1\(_{DG}\) used for pressure/density) is that they are naturally second order accurate, so standard finite element formulations are sufficient. This leads to a very clean formulation where assembly only requires to fetch data from a single cell or from pairs of cells joined by facets. This makes it particularly amenable to automated code generation of MPI parallel codes using tools such as Firedrake (Rathgeber et al. 2016). (Shipton et al. 2018) presented a practical scheme for the rotating shallow water equations using NLO spaces. In this paper we present a practical scheme for the compressible Euler equations using NLO spaces, evaluated using a standard suite of vertical slice model testcases. A related scheme was coupled with a moisture model in Bendall et al. (2020), but was not benchmarked against the tests considered here. We use a fully implicit timestepping method, solved using a monolithic iterative solver i.e. the full coupled system of all variables is solved together without elimination. The equations are solved using GMRES, using a linesmoother which acts as an additive Schwarz method with each patch comprising the “star” of a vertical edge, i.e. all degrees of freedom associated with the interior of the vertical column surrounding that edge. The columnar approach is necessary due to the thin domain geometry occuring in atmosphere models. Numerical analysis of these schemes is difficult, but they are motivated by finding monolithic smoothers that result in a (block) Jacobi iteration for the density/pressure variable after elimination within the patch. For example, a similar method (in local rather than columnar form) was proposed in Adler et al. (2021); Laakmann et al. (2022).

There are a few reasons for including this approach here. First, it is useful to focus on the behaviour of the spatial discretisation without clouding the issue with aspects related to the splitting of advection and wave propagation, for example. In a monolithic approach, the basic code becomes very simple, and the challenges are exported to the iterative solver, which is provided here by the PETSc library. Second, the monolithic approach is itself new, and the paper provides a useful demonstration that it is useable in practice for numerical weather prediction.

The rest of the article is structured as follows. Section 2 presents the discretisation in space and time, together with the iterative solution strategy for the resulting implicit system of equations, and our strategy for obtaining hydrostatic balance reference profiles. Section 3 presents the numerical examples, and Sect. 4 provides a summary and outlook.

2 Discretisation

2.1 Governing equations

Our description of the two-dimensional nonhydrostatic dry Euler equations can remain quite brief because our approach discretises the equations in Cartesian coordinates (even on terrain following coordinates or in spherical geometry). We write the equations in \(\theta -\Pi \) (potential temperature-Exner pressure) form,

$$\begin{aligned} \frac{\partial \varvec{u}}{\partial t} + (\varvec{u}\cdot \nabla )\varvec{u} + f\hat{\varvec{k}}\times \varvec{u} + c_p\theta \nabla \Pi + g\hat{\varvec{k}}&= 0, \end{aligned}$$
(1)
$$\begin{aligned} \frac{\partial \theta }{\partial t} + \varvec{u}\cdot \nabla \theta&= 0, \end{aligned}$$
(2)
$$\begin{aligned} \frac{\partial \rho }{\partial t} + \nabla \cdot (\varvec{u}\rho )&= 0, \end{aligned}$$
(3)
$$\begin{aligned} \Pi ^{(1-\kappa )/\kappa } = \frac{R}{p_0}\rho \theta ,&\end{aligned}$$
(4)

where \(\varvec{u}\) is the velocity, \(\theta \) is the potential temperature, \(\rho \) is the density, \(\Pi \) is the Exner pressure, f is the Coriolis parameter, \(\hat{\varvec{k}}\) is the unit vector pointing upwards, \(c_p\) is the specific heat at constant pressure, R is the gas constant, \(p_0\) is the reference pressure, g is the acceleration due to gravity and \(\kappa =R/c_p\). Here, we have presented the equations as they would be read in three-dimensional form. In the vertical slice model, one restricts the spatial dependence of the fields to the \(x-z\) plane, whilst retaining three Cartesian components of the vector field, i.e. \(\varvec{u}(\varvec{x},t):{\mathbb {R}}^2\times [0,T]\rightarrow {\mathbb {R}}^3\). In a code, a simple way to implement this is to use a three dimensional mesh that is one cell wide in the y-direction, and periodic in that boundary condition. If a higher order finite element is used, it is still possible to represent variations in the y-direction. We expect y-independent solutions should be preserved by symmetry, but care should be taken to check that this is the case. In fact this is an advantageous approach, since the same code can be used for vertical slice and full 3D models, and validation in the vertical slice case adds to confidence in the full 3D model.

In this paper, we use the vector-invariant form of the advection term,

$$\begin{aligned} (\varvec{u}\cdot \nabla )\varvec{u} = (\nabla \times \varvec{u})\times \varvec{u} + \frac{1}{2}\left| \nabla \varvec{u}\right| ^2. \end{aligned}$$
(5)

Some of the test problems we will consider contain a Newtonian damping term to absorb radiating internal waves as they reach the upper boundary, requiring the addition of the following term,

$$\begin{aligned} \mu \hat{\varvec{k}}\varvec{u}\cdot \hat{\varvec{k}}, \end{aligned}$$
(6)

to the left hand side of Eq. (1) where \(\mu \) is the (spatially-dependent) damping parameter specific to the test problem (which we will specify later), and is otherwise zero.

Similarly, one of the test problems requires the addition of viscous and diffusion terms in order to observe convergence with mesh resolution, in which case the same diffusion parameter \(\nu \) is used. Then we add

$$\begin{aligned} - \nu \nabla ^2\varvec{u} \end{aligned}$$
(7)

to the left hand side of Eq. (1), and

$$\begin{aligned} - \nu \nabla ^2\theta \end{aligned}$$
(8)

to the left hand side of Eq. (2). These terms are not necessary for stability and are only included to facilitate convergence testing and comparison with other published results.

We call the computational domain \(\Omega \), which in this paper is always a rectangle with lateral periodic boundary conditions (or a deformation of a rectangle to accommodate topography), and slip boundary conditions \(\varvec{u}\cdot \varvec{n}=0\) on the bottom and sides, where \(\varvec{n}\) is the unit normal to the boundary. When diffusion and viscosity are included, this boundary condition is augmented by \(\frac{\partial \theta }{\partial n}=0\) and \(\frac{\partial \varvec{u}\times \varvec{n}}{\partial n}=0\).

2.2 Spatial discretisation

2.2.1 Meshes and finite element spaces

In this paper, we make use of two types of meshes. For problems with \(f=0\) and no out-of-plane component to the velocity, we construct structured meshes of regular rectangles. For problems with \(f\ne 0\), we construct structured meshes of regular cuboids that are one cell wide and periodic in the y-direction (to facilitate efficient solution of y-independent problems and to allow seamless transition to fully 3D problems). In either case, for problems with topography we apply a terrain following transformation to the mesh of the form \(z \mapsto z + \phi (x,y,z)\), which preserves column structure (vertical faces of cells remain vertical) but does deform rectangles into trapezia and cuboids into trapezium prisms.

Following McRae et al. (2016), we select the finite element spaces as follows. The velocity space \({\mathbb {V}}_1\) is the Raviart-Thomas space of degree 1 (RT1) on hexahedra (see below for more discussion about the vertical slice case), the density space \({\mathbb {V}}_2\) is the discontinuous trilinear space (bilinear in 2D), and temperature space \({\mathbb {V}}_{\theta }\) is the tensor product of quadratic functions in the vertical, and bilinear functions (linear in 2D) in the horizontal. It is continuous in the vertical, and discontinuous in the horizontal. This choice mirrors the structure of the vertical component of the velocity space, to facilitate the representation of hydrostatic balance (see Natale et al. (2016) for an analysis of this, and a more detailed description of these spaces). See Bercea et al. (2016) for information about how finite element methods with these spaces can be used performantly.

In 3D, the \({\mathbb {V}}_1\) space uses a biquadratic construction: the vertical component of velocity in the reference cube is quadratic in the vertical direction, and linear in the horizontal directions, with a symmetric pattern applied to the other component(s). If an \(x-z\) planar mesh is used in a vertical slice model, then the \(x-z\) components of velocity are represented in the RT1 space on quadrilaterals, which the y component is discontinous and bilinear (same as the density). The RT1 functions are mapped into physical cells using the Piola mapping (Rognes et al. 2010). This guarantees continuity of normal components of the functions across cell facets at the expense of replacing polynomials by rational polynomials when the cells are trapezia or trapezium prisms.

Since these finite element spaces all contain the complete space of linear polynomials, classical approximation theory indicates that they can approximate smooth functions on the reference element with second order error in the \(L^2\) norm. There is a technicality that the Piola transformation means that \({\mathbb {V}}_1\) on mesh elements with terrain following meshes do not contain all linear polynomials. However, Natale et al. (2016) showed that second order approximation is still obtained if the mapping from a rectangular domain is smooth (or, piecewise smooth provided that the number of pieces is fixed).

The spatial discretisation of the Eqs. (14) is then to find \((\varvec{u}(t),\rho (t),\theta (t))\in \mathring{{\mathbb {V}}}_1\times {\mathbb {V}}_2\times {\mathbb {V}}_{\theta }\) such that

$$\begin{aligned}&\int _\Omega \varvec{w}\cdot \varvec{u}_t + \mu \varvec{w}\cdot \hat{\varvec{k}}\varvec{u}\cdot \hat{\varvec{k}} + f\varvec{w} \cdot \hat{\varvec{k}}\times \varvec{u}{{\,\textrm{d}\,}}x + \varvec{w}\cdot \hat{\varvec{k}}g{{\,\textrm{d}\,}}x \nonumber \\&\quad + \underbrace{ \int _{\Omega } \nabla _h\times (\varvec{w}\times \varvec{u})\times \varvec{u} {{\,\textrm{d}\,}}x - \int _{\Omega } \nabla \cdot \varvec{w} \frac{1}{2}\left| \varvec{u}\right| ^2{{\,\textrm{d}\,}}x + \int _{\Gamma } [\![{\varvec{n}\times (\varvec{u}\times \varvec{w})}]\!] \cdot \tilde{\varvec{u}} {{\,\textrm{d}\,}}S }_{\text{ from } (\varvec{u}\cdot \nabla )\varvec{u}}&\nonumber \\&\quad \underbrace{- \int _\Omega \nabla _h\cdot (\varvec{w}\theta )c_p\Pi {{\,\textrm{d}\,}}x + \int _{\Gamma _v} [\![{\varvec{n}\cdot \varvec{w}\theta }]\!] c_p\{\Pi \}{{\,\textrm{d}\,}}S}_{\text{ from } c_p\theta \nabla \Pi } = 0, \quad \forall \varvec{w}\in \mathring{{\mathbb {V}}}_1, \end{aligned}$$
(9)
$$\begin{aligned}&\int _\Omega q\theta _t - \nabla _h \cdot (\varvec{u}q)\theta {{\,\textrm{d}\,}}x + \underbrace{\int _{\Gamma _v} [\![q\varvec{n}\cdot \varvec{u}]\!]{\tilde{\theta }} {{\,\textrm{d}\,}}S}_{\text{ horizontal } \text{ upwinding }} \nonumber \\&\quad \quad + \underbrace{\int _{\Gamma } C_0 h^2 |\varvec{u}\cdot \varvec{n}| [\![\nabla _h q]\!]\cdot [\![\nabla _h\theta ]\!]{{\,\textrm{d}\,}}S}_{\text{ edge } \text{ stabilisation }} = 0, \quad \forall q \in {\mathbb {V}}_{\theta }, \end{aligned}$$
(10)
$$\begin{aligned}&\quad \int _\Omega \phi \rho _t - \nabla _h\phi \cdot \varvec{u}\rho {{\,\textrm{d}\,}}x + \underbrace{\int _\Gamma [\![\phi \varvec{u}\cdot \varvec{n}]\!]{\tilde{\rho }}{{\,\textrm{d}\,}}S}_{\text{ upwinding }} = 0, \quad \forall \phi \in {\mathbb {V}}_2, \end{aligned}$$
(11)

where

  1. 1.

    \(\nabla _h\) indicates the “broken” gradient obtained by evaluating the gradient separately in each cell,

  2. 2.

    \(\Gamma \) is the set of interior facets (with \(\Gamma _v\) the set of vertical facets between adjacent columns),

  3. 3.

    \([\![\psi ]\!]\) is the “jump” operator applied to a quantity \(\psi \) returning \(\psi ^+-\psi ^-\) where each interior facet has sides arbitrarily labelled “+” and “-” (and noting that \(\varvec{n}\) is the unit normal to a facet with \(\varvec{n}^+\) pointing into the − side and vice versa for \(\varvec{n}^-\)),

  4. 4.

    \({\tilde{\psi }}\) indicates the upwind value of any quantity \(\psi \) on the facet i.e. the value on the side where \(\varvec{u}\cdot \varvec{n}\ge 0\) (making an arbitrary choice not affecting the result when \(\varvec{u}\cdot \varvec{n}=0\)),

  5. 5.

    \(\{\psi \}\) is the average operator returning \((\psi ^++\psi ^-)/2\) when evaluated on a facet,

  6. 6.

    \(\Pi \) is defined according to (4) but now applied to the numerical approximations \(\rho \) and \(\theta \) (so that we do not separately solve for an independent variable \(\Pi \)),

  7. 7.

    h is an estimate of the cross facet meshscale defined by \(\{V_e\}/A_f\) with \(V_e\) being the (cellwise constant) cell volume, and \(A_f\) being the (facetwise constant) facet area,

  8. 8.

    \(C_0\) is an edge stabilisation constant (chosen here to be \(2^{-7/2}\) following the suggested scaling \(p^{-7/2}\) with polynomial degree p given by Burman and Ern (2007)), and

  9. 9.

    \(\mathring{{\mathbb {V}}}_1\) is the subspace of \({\mathbb {V}}_1\) containing all functions that satisfy the \(\varvec{u}\cdot \varvec{n}=0\) boundary condition on the top and bottom of the domain.

The treatment of the velocity advection term was first demonstrated (for a incompressible Boussinesq vertical slice model) by Yamazaki et al. (2017), inspired by Natale and Cotter (2018) but different from the energy conserving form for compressible Euler equations proposed in Wimmer et al. (2021). The surface term deals with the fact that the velocity space does not have continuous tangential components in general; it also has a stabilising effect as examined in Natale and Cotter (2017). The treatment of the pressure gradient term first appeared in Natale et al. (2016), and has been used in a modified energy conserving form in Wimmer et al. (2021), and with lowest order spaces in Bendall et al. (2020) and as part of the Gung Ho dynamical core formulation in Melvin et al. (2019). The treatment of the potential temperature advection term deviates here from the “SUPG” formulation proposed in Yamazaki et al. (2017), using an edge stabilisation proposed by Burman (2005) instead, in combination with standard discontinuous Galerkin style upwinding on vertical faces. Even though the temperature space is only continuous in the vertical direction, we found that edge stabilisation was necessary on all faces to achieve stable results.

It should be noted that in the case of spatially varying topography, it is not possible to compute integrals exactly because of the \(\det (J)\) in the reciprocal appearing in the velocity basis functions (due to the Piola mapping), where J is the Jacobian of the reference element to mesh element mapping. Thus, we have to approximate the integrals using numerical quadrature, and we have to select a quadrature degree for each term so that the discretisation is stable and sufficiently accurate. Here we just take the suggested quadrature degree from the heuristic computed in UFL (Alnæs et al. 2014) which selects a (rather generous) quadrature rule for approximate integration, which is certainly sufficient for stability and second order consistency. In fact, many of the integrals are nevertheless computed exactly, due to fortuitious cancellations that are discussed in (Cotter and Thuburn 2014). Another source of inexact quadrature is the function \(\Pi \) which involves a noninteger power of \(\rho \) and \(\theta \).

When test cases require a viscous term, we use the symmetric interior penalty discretisation (to deal with discontinuities in the tangential component of the velocity) as described in (Cockburn et al. 2007), in which case the term

$$\begin{aligned}{} & {} \nu \left( \int _\Omega \nabla _h \varvec{u}:\nabla _h \varvec{v} {{\,\textrm{d}\,}}x - \int _\Gamma [\![\varvec{v}\otimes \varvec{n}]\!]:\{\nabla \varvec{u}\}{{\,\textrm{d}\,}}S - \int _\Gamma [\![\varvec{u}\otimes \varvec{n}]\!]:\{\nabla \varvec{v}\}{{\,\textrm{d}\,}}S\right. \nonumber \\{} & {} \quad \left. + \int _\Gamma \frac{\eta }{h} [\![\varvec{u}\otimes \varvec{n}]\!]:[\![\varvec{v}\otimes \varvec{n}]\!]{{\,\textrm{d}\,}}S\right) \end{aligned}$$
(12)

is added to the left hand side of Eq. (9), where \(\nu \) is the dynamic viscosity,  :  indicates the double contraction for tensors (so that \(A:B=\sum _{ij}A_{ij}B_{ij}\)), and \(\eta \) is a dimensionless penalty parameter, chosen here to have the value 10 which is experimentally determined to produce a stable discretisation for both \({\mathbb {V}}_1\) and \({\mathbb {V}}_{\theta }\). A similar formula (with the same parameters, but adapted to scalar fields) is used when test cases require a diffusion term in the potential temperature equation.

2.3 Time discretisation and iterative solver

The time discretisation used is the implicit midpoint rule, a fully implicit second order method. This is obtained by replacing time derivatives by time differences e.g. \(\theta _t \mapsto (\theta ^{n+1}- \theta ^n)/\Delta t\), and replacing all other occurrences of fields by their midpoint values e.g. \(\theta \mapsto (\theta ^{n+1}+\theta ^n)/2\). This nonlinear system is then solved for the \(n+1\) variables using the linesearch Newton method provided by PETSc (Balay et al. 2020) using Firedrake’s NonlinearVariationalSolver object. The resulting Jacobian systems are solved using GMRES applied to the full monolithic velocity-density-temperature system, preconditioned by an additive Schwarz method, which does direct solves over column patches surrounding one vertex on the base mesh (in vertical slice models this corresponds to forming a patch out of two neighbouring cells). In fact, a “star” patch is used, which neglects degrees of freedom associated with the horizontal boundaries of the patch. This patch is formed out of the “star iteration” (see Sect. 4 in (Farrell et al. 2019)) applied to a vertex in the base mesh, before extruding up the column. These direct solves couple all three fields, and the direct solve uses PETSc’s own LU factorisation algorithm using a reverse Cuthill-McGee ordering to reduce the fill in. This algorithm can be and is parallelised using domain decomposition, which was automated in our implementation using Firedrake. In all of our test cases, Newton’s method converges in 2–3 iterations, and GMRES converges in 10–35 iterations depending on the flow complexity, independent of mesh size provided that a constant Courant number is maintained whilst decreasing the mesh size.

This solver approach is necessary because we chose to use a fully implicit timestepping scheme, solved using Newton’s method. This requires us to be able to solve systems \(Jx=b\) where J is the full Jacobian obtained by linearisation about the state at each Newton iteration. In the case where J is the linearisation about a state of rest, J can be reduced to a positive definite Schur complement using the hybridisation technique provided the discretisation proposed in Betteridge et al. (2022) is used. However, this technique does not work when the linearisation is calculated for states with nonzero velocities, since these introduce additional intercell coupling.

2.4 Hydrostatic balance

The test cases that we consider here require the computation of a background density profile \(\rho _b\) that is in hydrostatic balance given the specified potential temperature \(\theta _b\). We use a nonhydrostatic model, but testcases require an initialisation at a state of hydrostatic balance, which we describe here. To avoid unphysical motion at lower resolutions we compute this hydrostatic balance numerically, i.e. we require that

$$\begin{aligned}{} & {} - \int _\Omega \nabla _h\cdot (\varvec{w}\theta _b)c_p\Pi _b {{\,\textrm{d}\,}}x + \int _\Gamma [\![\varvec{n}\cdot \varvec{w}\theta _b]\!]c_p\{\Pi _b\}{{\,\textrm{d}\,}}S \nonumber \\{} & {} \quad + \int _\Omega g\varvec{w}\cdot \hat{\varvec{k}}{{\,\textrm{d}\,}}x = 0, \quad \forall \varvec{w}\in \mathring{{\mathbb {V}}}_{1,v}, \end{aligned}$$
(13)

where \({\mathbb {V}}_{1,v}\) the vertical subspace of \({\mathbb {V}}_1\), and \(\mathring{{\mathbb {V}}}_{1,v}\) is the corresponding vertical subspace of \(\mathring{{\mathbb {V}}}_1\).Footnote 1 We note that \(\Pi _b\) is just a local nonlinear function of \(\rho _b\) and \(\theta _b\). As discussed in Natale et al. (2016), we can solve this equation for \(\rho _b\) by introducing an auxiliary variable \(\varvec{v}\) (which will turn out to vanish), and solving the coupled system

$$\begin{aligned}&\int _{\Omega }\varvec{w}\cdot \varvec{v}{{\,\textrm{d}\,}}x - \int _\Omega \nabla _h\cdot (\varvec{w}\theta _b)c_p\Pi _b {{\,\textrm{d}\,}}x \nonumber \\&\quad + \int _\Omega g\varvec{w}\cdot \hat{\varvec{k}}{{\,\textrm{d}\,}}x + \int _{\partial \Omega _0} c_p\varvec{w}\cdot \varvec{n} \theta _b\Pi _0{{\,\textrm{d}\,}}S = 0, \quad \forall \varvec{w}\in \tilde{{\mathbb {V}}}_{1,v}, \end{aligned}$$
(14)
$$\begin{aligned}&\quad \int _\Omega \nabla _h\cdot (\varvec{v}\theta _b)c_p\phi {{\,\textrm{d}\,}}x = 0, \quad \forall \phi \in {\mathbb {V}}_2, \end{aligned}$$
(15)

where \(\partial \Omega _0\) is the surface where we have chosen to set \(\Pi _b=\Pi _0\) as a boundary condition (the bottom boundary for the test cases in this paper), \(\tilde{{\mathbb {V}}}_{1,v}\) is the subspace of \({\mathbb {V}}_{1,v}\) containing functions that satisfy \(\varvec{w}\cdot \varvec{n}=0\) on the opposite boundary (the top boundary for the test cases in this paper). We note that the surface integral in Eq. (13) vanishes when \(\varvec{w}\in {\mathbb {V}}_{1,v}\) because \(\varvec{w}\cdot \varvec{n}=0\) on vertical faces, and we note that the weak boundary condition integral in (14) over \(\partial \Omega _0\) vanishes when \(\varvec{w}\in \mathring{{\mathbb {V}}}_{1,v}\subset \tilde{{\mathbb {V}}}_{1,v}\). Since \(\varvec{v}\) vanishes at the solution, we recover the hydrostatic condition (13). The system (1415) decouples into independent columns, which we solve using Newton’s method. Natale et al. (2016) showed that the linearisation around a given state of this system is well-posed, and we solve the resulting linear systems for \((\rho _b,\varvec{v})\) updates directly. We find an initial guess for \(\rho _b\) by first solving the corresponding linear system for \((\Pi _b,\varvec{v})\) where \(\Pi _b\in {\mathbb {V}}_2\) is taken as an independent variable instead of a local function of \(\rho _b\) and \(\theta _b\). We then project the formula for \(\rho _b\) in terms of \(\theta _b\) and \(\Pi _b\) into \({\mathbb {V}}_2\). If we use the result as an initial guess then Newton’s method converges in 2–3 iterations. Note that this computation is only done to compute a set of initial conditions, and is not used in the forward model which is always nonhydrostatic.

3 Computational examples

In this section we demonstrate our discretisation approach using the suite of test problems considered in Melvin et al. (2010). This suite tests the vertical slice discretiation on a range of flows that are relevant to numerical weather prediction, including acoustic and gravity waves and flows driven by both buoyancy and orography. The gravity wave and orographic wave tests are run in both the hydrostatic and nonhydrostatic regimes. To avoid the need to refer back, we provide a brief summary of the test problems here. Some constants that are consistent across all tests are provided in Table 1. In each test we have used the same timestep \(\Delta t\) values as Melvin et al. (2010), and double the values of \(\Delta x\) and \(\Delta z\), which ensures the same number of degrees of freedom (because we are using NLO spaces). A summary of resolutions, timesteps, number of iterations, and the timings is provided in Table 2.

Table 1 Some constants that take consistent values across all the computational examples
Table 2 Table showing the resolution, \(\Delta t\), number of GMRES iterations per timestep (over all the Newton iterations), the time taken and the number of cores used, for each of the testcases.

3.1 Gravity waves

This test case is the “Hello World!” of vertical slice test problems, first proposed in Skamarock and Klemp (1994). There are two versions of the test case, the nonhydrostatic flow regime version with velocity constrained to the \(x-z\) plane and consequently \(f=0\), and the hydrostatic flow regime version with 3D velocity and \(f=10^{-4}\,\textrm{s}^{-1}\). Note that we are solving the nonhydrostatic equations in both versions, and the naming just describes the asymptotic flow regime and not the equations solved. The domain is given by \(L/2\le x \le L/2\) and \(0 \le z \le H\) where \(L=3\times 10^5\,\textrm{m}\) in the nonhydrostatic case and \(L=6\times 10^6\) m in the hydrostatic case. In both cases the height is \(H=10^4\) m and there are periodic boundary conditions in the horizontal direction.

In both cases, the potential temperature is initialised to a background profile

$$\begin{aligned} \theta _b = T_{{{\,\textrm{surf}\,}}}\exp (N^2z/g),\quad T_{{{\,\textrm{surf}\,}}}=300K, \end{aligned}$$
(16)

before solving for the hydrostatically balanced \(\rho _b\). Then a perturbation is added to the potential temperature,

$$\begin{aligned} \Delta \theta = \Delta \theta _0\frac{\sin (\pi z/H)}{1 + x^2/a^2}, \end{aligned}$$
(17)

where \(\Delta \theta _0=10^{-2}K\) and \(a=5\times 10^3m\) for the nonhydrostatic flow regime and \(a=10^5m\) for the hydrostatic flow regime. In both cases, the horizontal velocity in the x-direction is initialised to \(20\,\textrm{ms}^{-1}\) and the other components to zero. In the hydrostatic case, an additional forcing term is introduced to balance the Coriolis force, adding \(f\times (0,-20,0)\) to the left hand side of Eq. (1).

Plots of the nonhydrostatic and hydrostatic flow regime solutions are shown in Figs. 1 and 2, respectively. These solutions closely match the results from Melvin et al. (2010) at similar resolutions.

Fig. 1
figure 1

Contour plot of vertical velocity for the nonhydrostatic gravity wave at time \(t=3000\) s at resolution \(\Delta x=2000\) m, \(\Delta z=2000\) m, and \(\Delta t=12\) s. Contours are drawn every \(5\times 10^{-4}\textrm{ms}^{-1}\)

Fig. 2
figure 2

Contour plot of vertical velocity for the hydrostatic gravity wave at time \(t=60{,}000\) s at resolution \(\Delta x=20{,}000\) m, \(\Delta z=1000\) m, and \(\Delta t=100\) s. Contours are drawn every \(5\times 10^{-4}\) ms\(^{-1}\)

3.2 Density current

This test is taken from the classic intercomparison project of Straka et al. (1993), simulating a dense bubble in an isentropic, hydrostatic atmosphere. The domain is \(-L/2\le x \le L/2\) where \(L=51200\,m\), and \(0 \le z \le H=6400\,m\) with periodic boundary conditions in the horizontal direction.

The background temperature profile is chosen to be isentropic, i.e. the potential temperature is constant. In this case the background potential temperature is \(\theta _b=300K\). The background density profile is then obtained by solving for hydrostatic balance for this potential temperature profile, with the boundary condition \(\Pi =1\) on the bottom boundary. We then apply a perturbation to the temperature,

$$\begin{aligned} \Delta T = \left\{ \begin{array}{cr} 0 &{}\quad \text{ if } L_r > 1, \\ -15\left( \cos (\pi L_r) + 1\right) /2 &{}\quad \text{ otherwise }, \end{array}\right. \end{aligned}$$
(18)

at constant pressure p, where

$$\begin{aligned} L_r = \sqrt{\left( \frac{x {- x_c}}{x_r}\right) ^2 + \left( \frac{z-z_c}{z_r}\right) ^2}, \end{aligned}$$
(19)

and \((x_c,x_r)=(0\,\textrm{m}, 4000\,\textrm{m})\), \((z_c,z_r)=(3000\,\textrm{m},2000\,\textrm{m})\). This corresponds to making the potential temperature perturbation \(\Delta \theta =\Delta T/\Pi \), where \(\Pi \) is the background Exner pressure profile, and then perturbing the density according to \((\rho +\Delta \rho )(\theta +\Delta \theta ) = \rho \theta \). In our implementation, \(\Delta \theta \) was computed by the following steps:

  1. 1.

    Project the formula \(\Delta T/\Pi (\rho _b,\theta _b)\) into \({\mathbb {V}}_{\theta }\), where \(\rho _b\in {\mathbb {V}}_2\) and \(\theta _b\in {\mathbb {V}}_{\theta }\) are the previously computed initial conditions for density and temperature respectively. This is the initial condition \(\theta _0\) for potential temperature.

  2. 2.

    Project the formula \(\rho _b\theta _0/\theta _b\) in to \({\mathbb {V}}_2\). This is the initial condition \(\rho _0\) for density.

These projections are \(L^2\) projections using incomplete but high order quadrature for the integrals over the formulae, similar to those used in the dynamical equations.

Without viscosity, this problem becomes ill-posed after forming a singular vorticity structure in finite time. Hence, to be able to compare between models, a viscosity of \(\nu =75\textrm{m}^2\textrm{s}^{-1}\) is included in the velocity equation and a diffusivity of \(\kappa =75\,\textrm{m}^2\,\textrm{s}^{-1}\) is included in the potential temperature equation.

For this testcase, the comparison is made after 15 min. Contour plots at various resolutions are shown in Fig. 3, and some summary statistics are provided in Table 3. There is quite a bit of variation between models for the precise solution at this point, as there is a lot of small scale structure forming, but our contour plots look broadly similar to those of Melvin et al. (2010). In particular, the density current has reached a very similar location, estimated as the maximum x coordinate over all cells where \(\Delta \theta <0\). We see that the dissipation of the minima of \(\Delta \theta \) is much weaker at the higher resolutions, and although there is still a substantial overshoot past the initial maxima of 0, this may be because the solution produces finer filaments of potential temperature at higher resolutions, which is more challenging for the advection scheme.

Fig. 3
figure 3

For the density current testcase: contours of \(\Delta \theta \) at 15 min. Contour intervals are 1K, with the first contour being at –1 K. From left to right and top to bottom, the resolutions are \((\Delta x, \Delta t)=(800\,\textrm{m},4\,\textrm{s})\), \((400\,\textrm{m}, 2\,\textrm{s})\), \((200\,\textrm{m}, 1\,\textrm{s})\) and \((100\,\textrm{m}, 0.5\,\textrm{s})\) respectively

Table 3 For the density current testcase: minimum, and maximum values of \(\Delta \theta =\theta -\theta _b\) measured at 15 min, together with the front location (estimated as the maximum x coordinate over all cells where \(\Delta \theta <0\))

3.3 Flow over a mountain

This test problem simulating small amplitude lee waves generated by flow over a (small) mountain also has two versions, the nonhydrostatic flow regime version with velocity constrained to the \(x-z\) plane and consequently \(f=0\), and the hydrostatic flow regime version with 3D velocity and \(f=10^{-4}\,\textrm{s}^{-1}\). Note that we are solving the nonhydrostatic equations in both versions, and the naming just describes the asymptotic flow regime and not the equations solved. The domain is given by \(L/2\le x \le L/2\) and \(0 \le z \le H\) where \(L=144{,}000\,\textrm{m}\) and \(H=35{,}000\,\textrm{m}\) in the nonhydrostatic case and \(L=240{,}000\,\textrm{m}\) and \(H=50{,}000\,\textrm{m}\) in the hydrostatic case. In both cases there are periodic boundary conditions in the horizontal direction.

The mountain has a “Mount Agnesi” profile, with the bottom boundary moved to \(z_s(x) = \frac{a^2}{x^2+a^2}\), giving a mountain of height 1m. In our implementation, we use a simple terrain following mesh with the rectangular domain transformed according to \((x,y,z)\mapsto (x,y,z + z_s(H-z)/H)\). In the nonhydrostatic flow regime, \(a=10{,}000\) m, and in the hydrostatic regime, \(a=1000\) m.

In the nonhydrostatic flow regime, a stratified background flow is initialised according to the description of Sect. 3.1, with \(T_{surf}=300\) K. In the hydrostatic flow regime, the stratification is isothermal, i.e. constant temperature at \(T=T_{surf}=250\) K. This does still imply a varying potential temperature, with profile

$$\begin{aligned} \theta = T_{surf}\exp \left( \frac{gz}{T_{surf}c_p}\right) . \end{aligned}$$
(20)

In both cases, the density is initialised by solving numerically for a hydrostatic profile, with boundary condition \(\Pi =1\) at \(z=0\). This requires additional calculation because with the topography, the bottom boundary is not at \(z=0\) everywhere. To address this, we calculate the boundary condition for \(\Pi \) at the top of the domain that produces the value \(\Pi =1\) at \(z=0\), using a value on the bottom of the domain away from the mountain.

The nonhydrostatic test is initialised with a horizontal velocity \(u=10\,\textrm{ms}^{-1}\) and the hydrostatic test is initialised with a horizontal velocity \(u=20\,\textrm{ms}^{-1}\). As for the gravity wave test, in the hydrostatic case, an additional forcing term is introduced to balance the Coriolis force, adding \(f\times (0,-20,0)\) to the left hand side of Eq. (1). In both cases, the initial velocity is not compatible with the boundary condition at the mountain, so this causes a pressure wave propagating at ground level, radiating waves that interfere with the stationary lee wave pattern that accumulates over time. This is often referred to as a test of robustness of the discretisation.

To prevent the lee waves (and the initial pressure waves caused by the sudden appearance of the mountain at time 0) reflecting off the top boundary, an absorbing term in the vertical velocity is added in the top layer, with profile

$$\begin{aligned} \mu (z) = \left\{ \begin{array}{rl} 0, &{} z < z_B, \\ {\bar{\mu }} \sin ^2\left( \frac{\pi }{2}\left( \frac{z-z_B}{H-z_B}\right) \right) , &{} z \ge z_B, \end{array} \right. \end{aligned}$$
(21)

where \({\bar{\mu }}\) is a constant and \(z_B\) is the height of the bottom of the absorbing layer. For the hydrostatic test, \(z_B=H-2\times 10^4\,\textrm{m}=3\times 10^4\,\textrm{m}\) and \({\bar{\mu }}\Delta t=0.3\,\textrm{s}\). For the nonhydrostatic test, \(z_B=H-10^4\,\textrm{m}=2.5\times 10^4\) m and \({\bar{\mu }}\Delta t=0.15\) s.

Contour plots of the vertical velocity are provided in Figs. 4 and 5, respectively. We observe good agreement with the solutions plotted in Melvin et al. (2010).

Fig. 4
figure 4

Contours of vertical velocity for the nonhydrostatic mountain wave test at \(t=9\times 10^3\) s. Contour intervals are every \(5\times 10^{-4}\,\textrm{ms}^{-1}\) ; horizontal lines are plotted every 2000 m

Fig. 5
figure 5

Contours of vertical velocity for the hydrostatic mountain wave test at \(t=1.5\times 10^4\) s. Contour intervals are every \(5\times 10^{-4}\,\textrm{ms}^{-1}\) ; horizontal lines are plotted every 5000 m

3.4 Schär test

Schär et al. (2002) describe a more challenging mountain wave test with a mountain range orography that varies over multiple length scales, defined as

$$\begin{aligned} z_s(x) = h_m\exp \left( -\left( \frac{x}{a}\right) ^2\right) \cos ^2\left( \frac{\pi x}{\lambda }\right) , \end{aligned}$$
(22)

where \(h_m=250\,\textrm{m}\), \(\lambda =4\times 10^3\,\textrm{m}\), and \(a=5\times 10^3\,\textrm{m}\). The domain is given by \(-L/2\le x \le L/2\) where \(L=10^5\) m and \(0\le z \le H = 3\times 10^4\) m. The initial stratification is initialised in the same manner as for the nonhydrostatic mountain wave and we also add an absorbing term as described in Eq. 21, with \(z_B=H-10^4=2\times 10^4\) m and \({\bar{\mu }}\Delta t=1.2\) s. The velocity is initially horizontal with \(u=10\,\textrm{ms}^{-1}\). Following Melvin et al. (2010), we ran the test with two values of \(\Delta t\), 8 s and 40 s. As might be expected from a solution that has almost reached a steady state for the lee wave pattern and the fact that we are not using a splitting method, the numerical solutions obtained were indistinguishable. The larger timestep requires more linear solver iterations, as displayed in Table 2. This is offset by the larger timestep, so the time to solution is similar (shorter for the larger timestep, in fact). This reflects the fact that our columnwise preconditioner produces mesh independent convergence rates when applied to the linearisation about the state of rest, but has a dependency on Courant number in general. The solutions fit within the range of results obtained by others e.g. Straka et al. (1993); Giraldo and Restelli (2008); Bendall et al. (2020), given the turbulent nature of the solution (Fig. 6).

Fig. 6
figure 6

Contours of vertical velocity for the Schär mountain wave test at \(t=1.8\times 10^4\) s, with \(\Delta t=8\) s. Contour intervals are every \(5\times 10^{-2}\,\textrm{ms}^{-1}\)

4 Summary and outlook

In this paper we presented a compatible finite element discretisation for the compressible Euler equations, and demonstrated numerical robustness using a standard suite of vertical slice tests for numerical weather prediction. In all cases the results are very similar to published results. In future work we will demonstrate the discretisation in the fully three dimensional setting on the sphere.

One novel feature of our approach is a monolithic approach to solving a fully implicit system. This approach shows promise for producing robust timestepping approaches. In experiments with the shallow water equations we have used the additive Schwarz approach as a smoother for a multigrid scheme, which has led to faster convergence of the iterative solver. We were unable to do that in this work because Firedrake does not currently support mesh hierarchies on periodic meshes which are required for this suite of tests. We will rectify this in further work.

One reason for our interest in fully implicit methods is that these are required for the time parallel algorithms that we are currently developing. It is also interesting to consider Rosenbrock methods that only require the solution of linear systems (as opposed to the nonlinear systems coming from the implicit midpoint rule) linearised about the state at the start of the timestep. In particular, we plan to experiment with a Rosenbrock version of the TR-BDF2 timestepping scheme, previously applied to numerical weather prediction in Tumolo and Bonaventura (2015).