1 Introduction

1.1 Purpose

The thermal quasi-geostrophic (TQG) equations comprise a mesoscale model of ocean dynamics in a solution regime near thermal geostrophic balance. In thermal geostrophic balance, three forces—the Coriolis, hydrostatic pressure gradient and buoyancy-gradient forces—sum to zero. Numerical simulations of the TQG equations show the onset of instability at high wavenumbers which creates small coherent structures which resemble submesoscale (1–20 km) features observed in satellite ocean colour images as seen in Fig. 1.

Fig. 1
figure 1

Comparison of a computational simulation of TQG solutions with a satellite observation of ocean colour on a section of the Lofoten Vortex, courtesy of https://ovl.oceandatalab.com/ which illustrates the configurations of submesoscale currents obtained from ESA Sentinel-3 OLCI instrument observations of chlorophyll on the surface of the Norwegian Sea in the Lofoten Basin, near the Faroe Islands

On the left panel of Fig. 1, one sees submesoscale features which are prominently displayed in computational simulations of TQG equations for sea-surface height (SSH). The right panel of Fig. 1 shows the surface of the Lofoten Basin off the coast of Norway near the Faroe Islands. In crossing the Lofoten Basin, warm saline Atlantic waters create buoyancy fronts as they meet the cold currents of the Arctic Ocean. Figure 1 displays several features of submesoscale currents surveyed in McWilliams (2019). High-resolution (4 km) computational simulations of the Lofoten Vortex have recently discovered that its time-mean circulation is primarily barotropic, Volkov et al. (2015), thereby making the flow in the Lofoten Basin a reasonable candidate for investigation using vertically averaged dynamics such as the TQG dynamical system. Both images show a plethora of multiscale features involving shear interactions of vortices, fronts, plumes, spirals, jets and Kelvin–Helmholtz roll-ups. The submesoscale features persist and interact strongly with each other in Kelvin–Helmholtz roll-up dynamics, instead of simply cascading energy to higher wavenumbers. This observation means that the instabilities which create these submesoscale features quickly regain stability without cascading them to ever smaller scales. The present work aims to understand these features of the TQG solution dynamics, both analytically and numerically.

The contributions of this paper The main analytical contribution of this paper is the construction of a unique local strong solution of the TQG model. In particular, we show that there is a unique maximal solution of the TQG equation defined on a (possibly infinite) time interval \([0, T_{\textrm{max}})\). The solution will be shown to exist in a suitable Sobolev space. More precisely, provided that the initial data resides in a chosen Sobolev space, the solution at time \(t>0\) will remain in this space as long as \(t<T_{\textrm{max}}\). Should \(t<T_{\textrm{max}}\) be finite, then the solution will blow up in the chosen Sobolev norm.

As a second analytical contribution, we show that the TQG model depends continuously on the initial condition. This dependence only holds in a slightly weaker norm than the one corresponding to the space where the solution resides. This is a useful property from a numerical perspective. It implies that initial small errors when simulating the TQG model will stay small at any subsequent time.

The third analytical contribution is to construct a regularised version of the TQG model, termed the \(\alpha \)-TQG model. This model is constructed in a similar manner as the \(\alpha \) model for the Euler and Navier–Stokes equations, see (Foias et al. 2001, 2002; Marsden and Shkoller 2003). The \(\alpha \) model approach is a dispersive regularisation, rather than being dissipative. This feature has proven remarkably successful in applications of global ocean circulation, largely because its non-dissipative regularisation preserves the onset of baroclinic instability as shown in Holm and Wingate (2005). In the present case, the \(\alpha \)-TQG equations are found to possess a unique maximal solution which is also only continuous with respect to the initial conditions in a larger space with weaker norm. In addition, we show that the \(\alpha \)-TQG solution converges to the TQG solution as \(\alpha \rightarrow 0\) in a norm that depends on two physical parameters, the vorticity and the gradient of buoyancy. We also identify the rate of convergence as a function of the \(\alpha \)-parameter on a time interval where both the TQG solution and the \(\alpha \)-TQG solution are shown to exist for any \(\alpha >0\).

The blow-up phenomenon is important, particularly if it is observed in numerical simulations. Therefore, blow-up criteria (in other words, criteria required for the solution to blow up) are important. In this paper we state three characterisations of the blow-up time. The fourth analytical contribution of this paper is to show that blow-up occurs in the \(\alpha \)-TQG model, if either:

  1. (1)

    the \(L^\infty (\mathbb {T}^2)\)-norm of the buoyancy gradient \(\nabla b\) blows up at \(T_{\max }<\infty \);

  2. (2)

    the \(L^\infty (\mathbb {T}^2)\)-norm of the velocity gradient \(\nabla \textbf{u}\) blows up at \(T_{\max }<\infty \) or that;

  3. (3)

    the \(W^{1,2}(\mathbb {T}^2)\)-norm of the buoyancy gradient \(\nabla b\) blows up at \(T_{\max }<\infty \).

The contribution of this paper from a numerical perspective is to describe our spatial and temporal discretisation methods for approximating TQG and \(\alpha \)-TQG solutions, and to analyse aspects of numerical conservation properties with respect to the theoretical conserved quantities of the \(\alpha \)-TQG system. Example simulation results are included and are used to verify numerically the theoretical convergence results. Additionally, we provide linear stability analysis results for the \(\alpha \)-TQG system. Given that we have convergence of \(\alpha \)-TQG solutions to TQG, the linear stability results can be viewed as generalisations of those shown in Holm et al. (2021) for the TQG system.

Fig. 2
figure 2

Point-in-time snapshots of a computational simulation of the TQG system in a channel domain. The bathymetry function was chosen to be \(h(x,y) = \cos (2\pi x) + \frac{1}{2} \cos (4\pi x) + \frac{1}{3}\cos (6 \pi x)\). The initial potential vorticity was chosen to be an inverted Gaussian kernel \(\omega (0, x, y) = -\exp (-\frac{1}{2}(2\pi y - \pi )^2\). The initial buoyancy was chosen to be \(b(0, x, y) = \sin (2\pi x)\). Subfigure 2a illustrates how the interaction between the gradients of buoyancy and bathymetry impacts the development of features during the early phases of the flow. Subfigure 2b illustrates the fully developed flow. Note that in the figure for potential vorticity in Subfigure 2b, the distribution of features at different scales is controlled by the chosen bathymetry function

1.2 Brief History of the TQG Model

The history of the TQG model goes back about half a century, perhaps first elucidated by O’Brien and Reid (1967) as sketched in Beron-Vera (2021a). Briefly put, the TQG model generalises the classical QG equations by introducing horizontal gradients of buoyancy which alter the geostrophic balance to include the inhomogeneous thermal effects which influence buoyancy. Indeed, O’Brien and Reid (1967) write that their work was inspired by observations that the passage of hurricanes could draw enough heat from the ocean to significantly lower the sea-surface temperature in the Gulf of Mexico. Since O’Brien and Reid (1967) introduced their two-layer model, further developments of it have been applied to a variety of ocean processes, particularly to equatorial dynamics. For more details of the theoretical model developments, see (Ripa 1993, 1995, 1999) and for developments of applications in oceanography see (Anderson and McCreary Jr 1985; Beier 1997; McCreary Jr et al. 1997; Schopf and Cane 1983), as well as other citations in Beron-Vera (2021a). In particular, Ripa refers to the TQ models as inhomogeneous-layer (IL) models and his papers explain rational derivations of theories with increasing vertical structure IL1, IL2, etc. The TQG model analysed here and derived systematically from asymptotic expansions in small dimensionless parameters of the Hamilton’s principle for the rotating, stratified Euler equations in Holm et al. (2021) is equivalent to the model IL0QG of Ripa (1996) recently analysed in Beron-Vera (2021b).

1.3 A Sketch of the Derivation of the TQG Model

We have explained that certain thermal effects in the mesoscale ocean have historically been modelled by the thermal quasi-geostrophic (TQG) equations. TQG is characterised by several dimensionless numbers arising from the dimensional parameters of planetary rotation, gravity and buoyancy. These are the familiar Rossby number, Froude number and stratification parameter. The Rossby number is the ratio of a typical horizontal velocity divided by the product of the rotation frequency and a typical horizontal length scale. The Froude number is the ratio of a typical horizontal velocity divided by the velocity of the fastest propagating gravity wave, which in turn is given by the square root of the gravity times the typical vertical length scale. The final dimensionless number is the stratification parameter, which specifies the typical size of the buoyancy stratification.

Fig. 3
figure 3

The tree of model derivations can function as a roadmap of geophysical fluid dynamics. The solid red arrows indicate the sequence of approximations that lead to the thermal quasi-geostrophic equations

The regime in which the thermal quasi-geostrophic equations are derived is characterised by the thermal geostrophic balance. This balance arises because the Rossby number, the Froude number and the stratification parameter all have a similar amplitude. Preserving this threefold balance requires simultaneously adapting the Froude number and the stratification parameter to match any change in Rossby number, for example, so that the dimensionless parameters will still have the same size. We consider the non-dissipative case, because of the large scales of mesoscale ocean dynamics. As mentioned earlier, the mesoscale dynamics has high wavenumber instabilities, which in principle can generate submesoscale effects. At smaller scales, viscous dissipation and thermal diffusivity will come into play as well. However, in what follows, dissipative effects will be neglected. The derivation of the TQG model involves a series of coordinated approximations to the rotating, stratified Euler model, as is illustrated in the diagram below.

The derivation of the right-most three columns of the tree in Fig. 3 is discussed in detail (Holm and Luesink 2021). The derivation of the thermal quasi-geostrophic equations treated here from the thermal rotating shallow water equations on the middle left of the figure is discussed in Holm et al. (2021). For the physical and mathematical details of the derivation indicated by the solid red arrows in Fig. 3, we refer to Holm and Luesink (2021); Holm et al. (2021). The following is a short summary of the derivation indicated by the solid red arrows. The starting point is the Lagrangian for the rotating, stratified Euler equations at the top of the figure. After identifying the small dimensionless parameters in the Euler Lagrangian, the Lagrangians for the successive approximate models can be derived by inserting asymptotic expansions into the Euler Lagrangian. In the ocean, the buoyancy stratification is typically small. Hence, it makes sense to apply the Boussinesq approximation in the Lagrangian for the rotating, stratified Euler equations. This approximation yields the Lagrangian for the Euler–Boussinesq equations. At this point, one has a choice of two routes. The first route begins by making the hydrostatic approximation, which leads to the Lagrangian for the primitive equations. Then, upon vertically integrating the Lagrangian for the primitive equations, one finds the Lagrangian for the thermal rotating shallow water equations. The alternative route first vertically integrates the Lagrangian for the Euler–Boussinesq equations to obtain the Lagrangian for the thermal and rotating version of the Green–Naghdi equations. By subsequently making the hydrostatic approximation in the Lagrangian for the thermal rotating Green–Naghdi equations, the alternative route arrives at the Lagrangian for the thermal rotating shallow water equations.

The Lagrangian for the thermal rotating shallow water (TRSW) equations is the starting point in Holm et al. (2021) for the derivation of the thermal quasi-geostrophic (TQG) equations. The typical values of the dimensionless numbers for mesoscale ocean problems lead to thermal geostrophic balance. This balance implies an algebraic expression for the balanced velocity field in terms of the horizontal gradients of the free surface elevation and the buoyancy. Upon expanding the TRSW Lagrangian around this balance and truncating, one obtains the Lagrangian for the thermal Lagrangian 1 (L1) model. The Lagrangian for the thermal L1 model is not hyperregular, though. Hence, the Legendre transformation for thermal L1 is not invertible. Hence, no Hamiltonian description of this model is available via the Legendre transformation. Nonetheless, an application of the Euler–Poincaré theorem (Holm et al. 1998a) to the thermal L1 Lagrangian yields the corresponding equations of motion. By identifying the leading order terms in the thermal L1 equations and truncating the asymptotic expansion, one finally obtains the TQG equations. However, the truncations of the asymptotic expansions in the thermal L1 equations also prevent the resulting TQG equations from possessing a Hamilton’s principle derivation. This feature is unlike the other models in Fig. 3, which all arise from approximated Lagrangians in their corresponding action integrals for Hamilton’s principle. However, as it turns out, the thermal quasi-geostrophic equations do possess a Hamiltonian formulation in terms of a non-standard Lie-Poisson bracket. The Lie–Poisson bracket for the quasi-geostrophic equations can be obtained via a linear change of variables from the usual semi-direct product Lie–Poisson bracket for fluids. The resulting Hamiltonian formulation for the thermal quasi-geostrophic equations is important for the application of stochastic advection by Lie transport (SALT), introduced in Holm (2015), which requires either a Lagrangian, or a Hamiltonian interpretation of the equations of motion. The analysis of the stochastic thermal quasi-geostrophic equation has been covered in the companion paper (Crisan et al. 2022).

By applying the method of Stochastic Advection by Lie Transport (SALT) to the Hamiltonian formulation of the TQG equations, one obtains a stochastic version of them which preserves the infinite family of integral conserved quantities. More details on the stochastic version can be found in the conclusion section as well as in Holm et al. (2021). A discussion of the remaining models in Fig. 3 can be found in Holm and Luesink (2021).

The TQG model on the two-dimensional flat torus \(\mathbb {T}^2\) can be formulated in two equivalent ways. The first formulation is the advective formulation, in which the equations of motion are given by

$$\begin{aligned} \frac{\partial }{\partial t}b + \textbf{u}\cdot \nabla b&= 0, \end{aligned}$$
(1.1)
$$\begin{aligned} \frac{\partial }{\partial t}\omega + \textbf{u}\cdot \nabla (\omega -b)&= -\textbf{u}_{h_1}\cdot \nabla b, \end{aligned}$$
(1.2)
$$\begin{aligned} \omega = (\Delta -1)\psi + f_1&\,, \quad \nabla \cdot \textbf{u} = 0 \,. \end{aligned}$$
(1.3)

Here b is the vertically averaged buoyancy, \(\textbf{u}\) is the thermal geostrophically balanced velocity field, \(\omega \) is the potential vorticity, \(\psi \) is the streamfunction, \(h_1\) is the spatial variation around a constant bathymetry profile, and \(f_1\) is the spatial variation around a constant background rotation rate. The velocity \(\textbf{u}\) and the streamfunction \(\psi \) are related by the equation

$$\begin{aligned} \textbf{u} = \nabla ^\perp \psi , \end{aligned}$$
(1.4)

where \(\nabla ^\perp = (-\partial _y,\partial _x)\). The vector field \(\textbf{u}_{h_1}\) is defined by

$$\begin{aligned} \textbf{u}_{h_1} := \frac{1}{2}\nabla ^\perp h_1. \end{aligned}$$
(1.5)

Alternatively, one can formulate the thermal quasi-geostrophic (TQG) equations in vorticity-streamfunction form. This formulation is given by

$$\begin{aligned} \frac{\partial }{\partial t} b + J(\psi ,b)&= 0, \end{aligned}$$
(1.6)
$$\begin{aligned} \frac{\partial }{\partial t}\omega + J(\psi ,\omega -b)&= -\frac{1}{2}J(h_1,b), \end{aligned}$$
(1.7)
$$\begin{aligned} \omega&= (\Delta -1)\psi + f_1. \end{aligned}$$
(1.8)

The operator \(J(a,b)=\nabla ^\perp a\cdot \nabla b=a_x b_y - b_x a_y\) is the Jacobian of two smooth functions a and b defined on the (xy) plane. The equivalence of the two formulations (1.1)–(1.2) and (1.6)–(1.7) follows from the Jacobian operator relation \(J(\psi ,a) = \textbf{u}\cdot \nabla a\), in which \(\psi \) is the streamfunction associated to the velocity vector field \(\textbf{u}\). The scalar functions \(f_1\) and \(h_1\) relate to the usual Coriolis parameter and bathymetry profile in the following way

$$\begin{aligned} \begin{aligned} h(\textbf{x})&= 1 + \textrm{Ro}\, h_1(\textbf{x}),\\ f(\textbf{x})&= 1 + \textrm{Ro}\, f_1(\textbf{x}), \end{aligned} \end{aligned}$$
(1.9)

where \(\textrm{Ro} = U(f_0 L)^{-1}\) is the Rossby number, expressed in terms of the typical horizontal velocity U, typical rotation frequency \(f_0\) and typical horizontal length scale L. This means that the bathymetry and Coriolis parameter become constant as the Rossby number tends to zero. Equation (1.9) are necessary to derive the thermal quasi-geostrophic equations from the thermal L1 equations, as shown in Holm et al. (2021). The expansion (1.9) contains the \(\beta \)-plane approximation provided that the boundary conditions are appropriate. Namely, on the \(\beta \)-plane, one requires that \(\beta f_0^{-1} = \mathcal {O}(\textrm{Ro})\) and \(f_1(\textbf{x})=y\). An additional relation can be helpful when using the thermal quasi-geostrophic equations as a model for mesoscale ocean dynamics. In particular, the streamfunction is related to the free surface elevation \(\zeta \) and buoyancy b via the definition \(\psi :=\zeta + \frac{1}{2}b\). This definition of the streamfunction is useful to relate to observational data. The free surface elevation is a quantity that can be measured with satellite altimetry. These measurements can then be used for data assimilation and model calibration. However, the definition of the streamfunction in terms of the velocity field is not necessary to formulate the model as a closed set of equations, since the system (1.6)–(1.8) is already a closed set of equations. In the expansion (1.9), we will henceforth drop the subscript 1 on \(h_1(\textbf{x})\) and \(f_1(\textbf{x})\) for notational convenience.

1.4 Plan for the Rest of the Paper

We now give the plan for the rest of the paper. We collect preliminary tools in Sect. 2. This includes notations and analytical properties of function spaces used throughout this paper. We also collect useful estimates that will be used at various stages and give precise definitions of the concept of solutions used in our analysis. We finally end Sect. 2 with statements of the main results. In particular, we state that both the TQG and \(\alpha \)-TQG equations admit unique strong solutions for a finite period of time and these solutions are stable in a larger space with weaker norm. Furthermore, a maximum time for these solutions exists.

Since the proof of local well-posedness is the same for the TQG and \(\alpha \)-TQG, we will avoid duplication by devoting Sect. 3 to the construction of solutions for the less regular TQG. The construction relies heavily on the standard energy method. Since we are constructing strong solutions (rather than weak ones), we differentiate the equations in space and then we test the resulting equations with the required differential of the solution to obtain the required bounds. We then end Sect. 3 by showing that the unique solution constructed has a maximum time of existence and hence, is a maximal solution.

In Sect. 4 we show that any family of maximal solutions of the \(\alpha \)-TQG models converges strongly with \(\alpha \rightarrow 0\) to the unique maximal solution of the TQG on a common existence time, provided they share the same data.

Next, since our solutions are local in nature, we establish in Sect. 5, conditions under which this solution may blow up in the sense of Beale et al. (1984). In particular, we show that in order to control the solution of \(\alpha \)-TQG, the essential supremum in space of both the buoyancy gradient and the velocity gradient should be integrable over the anticipated time interval. Once either of these gradients blow up, the solution ceases to exist. Alternatively, in order to control the solution, it suffices to control the \(H^1\)-Sobolev norm of the buoyancy gradient.

Section 6 is devoted to numerical methods and simulation results. We begin by describing the finite element method we use for the spatial derivatives, and aspects of its numerical conservation properties with respect to theoretical results. We then describe the finite difference discretisation method used for the time derivative. Next, we discuss \(\alpha \)-TQG linear thermal Rossby wave stability analysis, which can be seen as generalising the results shown in Holm et al. (2021) for the TQG system. Then in the last part of the subsection, we discuss our numerical simulation setup and its results. In particular, we numerically verify the theoretical convergence rate for \(\alpha \hbox {-TQG}\rightarrow \hbox {TQG}\) derived in Sect. 4.

2 Preliminaries and Main Results

In this section, we fix the notation, collect some preliminary material on function spaces and present the main analytical results.

Remark 2.1

Although we will be working on the two-dimensional torus, with minimal effort, the same analysis will work on the whole plane \(\mathbb {R}^2\) subject to a far-field condition. The case of a bounded domain with boundary conditions is however outside the scope of the analytical aspect of this work. See Sect. 6 for numerical works in this regard.

2.1 Notations

Our independent variables consists of spatial points \(x:=\textbf{x}=(x, y)\in \mathbb {T}^2\) on the 2-torus \(\mathbb {T}^2\) and a time variable \(t\in [0,T]\) where \(T>0\). For functions F and G, we write \(F \lesssim G\) if there exists a generic constant \(c>0\) such that \(F \le c\,G\). We also write \(F \lesssim _p G\) if the constant \(c(p)>0\) depends on a variable p. The symbol \(\vert \cdot \vert \) may be used in four different contexts. For a scalar function \(f\in \mathbb {R}\), \(\vert f\vert \) denotes the absolute value of f. For a vector \(\textbf{f}\in \mathbb {R}^2\), \(\vert \textbf{f}\vert \) denotes the Euclidean norm of \(\textbf{f}\). For a square matrix \(\mathbb {F}\in \mathbb {R}^{2\times 2}\), \(\vert \mathbb {F} \vert \) shall denote the Frobenius norm \(\sqrt{\textrm{trace}(\mathbb {F}^T\mathbb {F})}\). Finally, if \(S\subseteq \mathbb {R}^2\) is a (sub)set, then \(\vert S \vert \) is the two-dimensional Lebesgue measure of S.

For \(k\in \mathbb {N}\cup \{0\}\) and \(p\in [1,\infty ]\), we denote by \(W^{k,p}(\mathbb {T}^2)\), the Sobolev space of Lebesgue measurable functions whose weak derivatives up to order k belongs to \(L^p(\mathbb {T}^2)\). Its associated norm is

$$\begin{aligned} \Vert v \Vert _{W^{k,p}(\mathbb {T}^2)} =\sum _{\vert \beta \vert \le k} \Vert \partial ^\beta v \Vert _{L^{p}(\mathbb {T}^2)}, \end{aligned}$$
(2.1)

where \(\beta \) is a 2-tuple multi-index of non-negative integers of length \(\vert \beta \vert \le k\). The Sobolev space \(W^{k,p}(\mathbb {T}^2)\) is a Banach space. Moreover, \(W^{k,2}(\mathbb {T}^2)\) is a Hilbert space when endowed with the inner product

$$\begin{aligned} \langle u,v \rangle _{W^{k,2}(\mathbb {T}^2)} =\sum _{\vert \beta \vert \le k} \langle \partial ^\beta u\,,\, \partial ^\beta v \rangle , \end{aligned}$$
(2.2)

where \(\langle \cdot ,\,\rangle \) denotes the standard \(L^2\)-inner product. In general, for \(s\in \mathbb {R}\), we will define the Sobolev space \(H^s(\mathbb {T}^2)\) as consisting of distributions v defined on \(\mathbb {T}^2\) for which the norm

$$\begin{aligned} \Vert v\Vert _{H^s(\mathbb {T}^2)} = \bigg (\sum _{\xi \in \mathbb {Z}^2} \big (1+\vert \xi \vert ^2 \big )^s\vert \widehat{v}(\xi )\vert ^2 \bigg )^\frac{1}{2} \equiv \Vert v\Vert _{W^{s,2}(\mathbb {T}^2)} \end{aligned}$$
(2.3)

defined in frequency space is finite. Here, \(\widehat{v}(\xi )\) denotes the Fourier coefficients of v. To shorten notation, we will write \(\Vert \cdot \Vert _{s,2}\) for \(\Vert \cdot \Vert _{W^{s,2}(\mathbb {T}^2)}\) and/or \(\Vert \cdot \Vert _{H^s(\mathbb {T}^2)}\). When \(k=s=0\), we get the usual \(L^2(\mathbb {T}^2)\) space whose norm we will denote by \(\Vert \cdot \Vert _2\) for simplicity. We will also use a similar convention for norms \(\Vert \cdot \Vert _p\) of general \(L^p(\mathbb {T}^2)\) spaces for any \(p\in [1,\infty ]\) as well as for the inner product \(\langle \cdot ,\cdot \rangle _{k,2}:=\langle \cdot ,\cdot \rangle _{W^{k,2}(\mathbb {T}^2)}\) when \(k\in \mathbb {N}\). Additionally, we will denote by \(W^{k,p}_{\textrm{div}}(\mathbb {T}^2;\mathbb {R}^2)\), the space of weakly divergence-free vector-valued functions in \(W^{k,p}(\mathbb {T}^2)\). Finally, we define the following space

$$\begin{aligned} \mathcal {M}=W^{3,2}(\mathbb {T}^2) \times W^{2,2}(\mathbb {T}^2) \end{aligned}$$
(2.4)

endowed with the norm

$$\begin{aligned} \Vert (b,\omega ) \Vert _{\mathcal {M}}:= \Vert b \Vert _{3,2} +\Vert \omega \Vert _{2,2} . \end{aligned}$$
(2.5)

2.2 Preliminary Estimates

We begin this section with the following result which follows from a direct computation using the definition (2.3) of the Sobolev norms.

Lemma 2.2

Let \(k\in \mathbb {N}\cup \{0\}\) and assume that the triple \((\psi ,\textbf{u},\omega )\) satisfies

$$\begin{aligned} \textbf{u}=\nabla ^\perp \psi , \qquad \omega =(\Delta -1) \psi . \end{aligned}$$
(2.6)

If \(\omega \in W^{k,2}(\mathbb {T}^2)\), then the following estimate

$$\begin{aligned} \Vert \textbf{u}\Vert _{k+1,2}^2&\lesssim \Vert \omega \Vert _{k,2}^2 \end{aligned}$$
(2.7)

holds.

Proof

Because \(\textbf{u}=\nabla ^\perp (\Delta -1)^{-1}\omega \), it follows that

$$\begin{aligned} \Vert \textbf{u} \Vert _{k+1,2}^2&= \sum _{\xi \in \mathbb {Z}^2} \frac{(1+|\xi |^2)^{k+1}|\xi |^2}{(1+|\xi |^2)^2}|\omega |^2 \\&\lesssim \sum _{\xi \in \mathbb {Z}^2} \frac{(1+|\xi |^2)^{k+1}(1+|\xi |^2)}{(1+|\xi |^2)^2}|\omega |^2 = \Vert \omega \Vert _{k,2}^2. \end{aligned}$$

\(\square \)

Let us now recall some Moser-type calculus. See (Klainerman and Majda 1981; Kato 1990; Kato and Ponce 1988).

Lemma 2.3

(Commutator estimates) Let \(\beta \) be a 2-tuple multi-index of non-negative integers such that \(\vert \beta \vert \le k\) holds for \(k\in \{1,2\}\). Let \(p,p_2,p_3\in (1,\infty )\) and \(p_1,p_4 \in (1,\infty ]\) be such that

$$\begin{aligned} \frac{1}{p}=\frac{1}{p_1}+\frac{1}{p_2}=\frac{1}{p_3}+\frac{1}{p_4}. \end{aligned}$$

For \(u \in W^{k,p_3}(\mathbb {T}^2) \cap W^{1,p_1}(\mathbb {T}^2)\) and \(v \in W^{k-1,p_2}(\mathbb {T}^2) \cap L^{p_4} (\mathbb {T}^2)\), we have and

$$\begin{aligned} \left\| \partial ^\beta (uv) - u \partial ^\beta v \right\| _{p} \lesssim \left( \Vert \nabla u \Vert _{p_1} \Vert v \Vert _{k-1,p_2} + \Vert u \Vert _{k,p_3} \Vert v \Vert _{p_4} \right) . \end{aligned}$$
(2.8)

2.3 Main Results

Our current goal is to construct a solution for the system of Eqs. (1.1)–(1.5). To do this, we first make the following assumption on our set of data.

Assumption 2.4

Let \(\textbf{u}_h \in W^{3,2}_{\textrm{div}}(\mathbb {T}^2;\mathbb {R}^2)\) and \(f\in W^{2,2}(\mathbb {T}^2)\) and assume that \((b_0, \omega _0) \in \mathcal {M}\).

Unless otherwise stated, Assumption 2.4 now holds throughout the rest of the paper. With stronger assumptions on the regularity, similar results will also follow. We are now in a position to make precise, exactly what we mean by a solution.

Definition 2.5

(Local strong solution) Let \(T>0\) be a constant. We call the triple \((b, \omega ,T) \) a local strong solution or simply, a local solution or a solution to the system (1.1)–(1.5) if the following holds.

  • The buoyancy b satisfies \(b \in C([0,T]; W^{3,2}(\mathbb {T}^2))\) and the equation

    $$\begin{aligned} b(t)&= b_0 - \int _0^{t} \textrm{div} (b\textbf{u})\,\,\textrm{d}\tau , \end{aligned}$$

    holds for all \(t\in [0,T]\);

  • the potential vorticity \(\omega \) satisfies \(\omega \in C([0,T]; W^{2,2}(\mathbb {T}^2))\) and the equation

    $$\begin{aligned} \omega (t)&= \omega _0 - \int _0^t \Big [ \textrm{div} ((\omega -b)\textbf{u} ) + \textrm{div} (b\textbf{u}_h ) \Big ] \,\,\textrm{d}\tau \end{aligned}$$

    holds for all \(t\in [0,T]\).

  • The Biot–Savart law \(\textbf{u}=\nabla ^\perp (\Delta -1)^{-1}\omega \) holds.

Remark 2.6

We remark that the regularity of the solution \((b, \omega , T) \) and its data together with the integral equations immediately imply that b and \(\omega \) are differentiable in time. Indeed, by using the fact that \(W^{3,2}(\mathbb {T}^2)\) and \(W^{2,2}(\mathbb {T}^2)\) are Banach algebras, we immediately deduce from the integral equations for b and \(\omega \) above that

$$\begin{aligned} b \in C^1([0,T]; W^{2,2}(\mathbb {T}^2)), \qquad \omega \in C^1([0,T]; W^{1,2}(\mathbb {T}^2)). \end{aligned}$$

It also follows from the integral equations for the buoyancy and potential vorticity above that the initial conditions are \(b(0,x)=b_0(x)\) and \(\omega (0,x)=\omega _0(x)\). The corresponding differential forms (1.1)–(1.2) are clearly immediate from the integral representations.

Remark 2.7

Since we are working on the torus, and the velocity fields are defined by (1.4)–(1.5), we have in particular, \(\int _{\mathbb {T}^2}\textbf{u}_h \, \textrm{d} \textbf{x}=0\) and \(\int _{\mathbb {T}^2}\textbf{u}\, \textrm{d} \textbf{x}=0\). From the latter, we get that \(\omega \) and f have zero averages. Consequently, we will assume that all functions under consideration have zero averages.

Definition 2.8

(Maximal solution) We call \((b, \omega , T_{\max }) \) a maximal solution to the system (1.1)–(1.5) if:

  • there exists an increasing sequence of time steps \((T_n)_{n\in \mathbb {N}}\) whose limit is \(T_{\max }\in (0,\infty ]\);

  • for each \(n\in \mathbb {N}\), the triple \((b, \omega , T_n) \) is a local strong solution to the system (1.1)–(1.5) with initial condition \((b_0, \omega _0) \);

  • if \(T_{\max }<\infty \), then

    $$\begin{aligned} \limsup _{T_n\rightarrow T_{\max }} \Vert (b,\omega )(T_n) \Vert _{\mathcal {M}}^2 =\infty . \end{aligned}$$
    (2.9)

We shall call \(T_{\max }>0\) the maximal time.

Remark 2.9

Condition (2.9) means that the solution breaks down at the limit point \(T_{\max }\).

We are now in a position to state our first main result.

Theorem 2.10

(Existence of local solutions) There exists a solution \((b, \omega , T) \) of (1.1)–(1.5) under Assumption 2.4.

Once we have constructed a local solution, we can show that this solution is continuously dependent on its initial state in a more general class of function spaces. This choice of class appears to be the strongest space in which the analysis may be performed. More details will follow in the sequel, but first, we give the statement on the continuity property of strong solutions with respect to its data.

Theorem 2.11

(Continuity at low regularity) Let \(\textbf{u}_h \in W^{3,2}_{\textrm{div}}(\mathbb {T}^2;\mathbb {R}^2)\) and \(f\in W^{2,2}(\mathbb {T}^2)\). Assume that \((b^1, \omega ^1, T^1) \) and \((b^2,\omega ^2, T^2)\) are solutions of (1.1)–(1.5) with initial conditions \((b_0^1, \omega _0^1) \in \mathcal {M}\) and \((b_0^2,\omega _0^2)\in \mathcal {M}\), respectively. Then there exists a constant

$$\begin{aligned} c=c\big ( \Vert b_0^1\Vert _{3,2}, \Vert \omega _0^1\Vert _{2,2}, \Vert b_0^2\Vert _{3,2}, \Vert \omega _0^2 \Vert _{2,2}, \Vert \textbf{u}_h\Vert _{3,2}, \Vert f\Vert _{2,2} \big ) \end{aligned}$$

such that

$$\begin{aligned} \begin{aligned} \Vert (b^1 -b^2)(t) \Vert _{2,2}^2 + \Vert (\omega ^1 - \omega ^2 )(t) \Vert _{1,2}^2 \le \exp (cT)\big (\Vert b_0^1 - b_0^2 \Vert _{2,2}^2 + \Vert \omega _0^1 - \omega _0^2 \Vert _{1,2}^2 \big ) \end{aligned} \end{aligned}$$
(2.10)

holds for all \(t\in [0,T]\) where \(T=\min \{T^1,T^2\}\).

Remark 2.12

For the avoidance of doubt, we make clear that the final estimate for the differences in Theorem 2.11 above is stated in terms of a larger space \(W^{2,2}(\mathbb {T}^2) \times W^{1,2}(\mathbb {T}^2)\) with smaller norms than the space of existence \(W^{3,2}(\mathbb {T}^2) \times W^{2,2}(\mathbb {T}^2)\). In order to obtain this bound however, in particular, we require the initial conditions of one of the solutions to be bounded in the stronger space of existence, i.e. boundedness of \(\Vert b_0^1\Vert _{3,2}, \Vert \omega _0^1\Vert _{2,2}\) rather than in the weaker space for which we obtain our final stability estimate. Furthermore, we also require the potential vorticity \(\omega ^2\) (at all time of existence) of the second solution to be also bounded in the stronger space of existence \(W^{2,2}(\mathbb {T}^2)\). Explicitly, the only flexibility we have has to do with the second buoyancy \(b^2\) for which it appears that we are able to relax to live in the weaker space \(W^{2,2}(\mathbb {T}^2)\). Unfortunately, however, because of the highly coupled nature of the system of equations under study, needing \(\omega ^2 \in W^{2,2}(\mathbb {T}^2)\) for all times automatically requires that \(b^2 \in W^{3,2}(\mathbb {T}^2)\) for all times. Subsequently, this means that we require boundedness of the initial condition of the second solution in the stronger space of existence, i.e. boundedness of \(\Vert b_0^2 \Vert _{3,2}, \Vert \omega _0^2\Vert _{2,2}\) in addition to that of the first solution. The requirement of needing both pairs of initial conditions to live in a stronger space is in contrast to simpler looking models like the Euler equation where it suffices to require just having one initial condition to have stronger regularity. Unfortunately, we can not do better by our method of proof (which is to derive estimates for equations solved by the differences, i.e. the energy method) but we do not claim that other methods for deriving analogous estimates may not yield better result either.

As a consequence of Theorem 2.11, the following statement about uniqueness of the strong solution is immediate.

Corollary 2.13

(Uniqueness) Let \((b^1,\omega ^1, T^1)\) and \((b^2,\omega ^2, T^2)\) be two solutions of (1.1)–(1.5) under Assumption 2.4. Then the difference \((b^1-b^2,\omega ^1-\omega ^2)\) satisfies the equation

$$\begin{aligned} \Vert (b^1-b^2)(t) \Vert _{2,2}^2+ \Vert (\omega ^1-\omega ^2)(t)\Vert _{1,2}^2=0 \end{aligned}$$
(2.11)

for all \(t\in [0,T]\) where \(T=\min \{T^1,T^2\}\).

Finally, we can show that a maximal solution of (1.1)–(1.5), in the sense of Definition 2.8, also exists.

Theorem 2.14

(Existence of maximal solution) There exist a unique maximal solution \((b, \omega , T_{\max }) \) of (1.1)–(1.5) under Assumption 2.4.

The \(\alpha \)-TQG model: In the following, we consider a ‘regularised’ version of (1.1)–(1.5). The regularisation is known as \(\alpha \)-regularisation and is of a dispersive type, i.e. the regularisation behaves like a low-pass filter, as explained in Holm et al. (1998b). The main benefit of this type of regularisation is that energy is also preserved by the regularised model, contrary to diffusive regularisations that dissipate energy. Since our notion of a solution to (1.1)–(1.5) involves the pair \((b,\omega )\) (and not explicitly in terms of \(\textbf{u}\)), henceforth, we will use the triple \((b^\alpha ,\omega ^\alpha , T)\) to denote the corresponding solution to the following \(\alpha \)-TQG model for the avoidance of confusion. To be precise, for fixed \(\alpha >0\), we will be exploring the pair \((b^\alpha ,w^\alpha )\) that solves

$$\begin{aligned}&\frac{\partial }{\partial t} b^\alpha + {\textbf{u}}^\alpha \cdot \nabla b^\alpha =0, \end{aligned}$$
(2.12)
$$\begin{aligned}&\frac{\partial }{\partial t}\omega ^\alpha + {\textbf{u}}^\alpha \cdot \nabla ( \omega ^\alpha -b^\alpha ) = -\textbf{u}_h \cdot \nabla b^\alpha \end{aligned}$$
(2.13)

but where now,

$$\begin{aligned} {\textbf{u}}^\alpha =\nabla ^\perp \psi ^\alpha , \qquad \textbf{u}_h =\frac{1}{2} \nabla ^\perp h, \qquad \omega ^\alpha =(\Delta -1) (1-\alpha \Delta ) \psi ^\alpha +f. \end{aligned}$$
(2.14)

Note that at least formally, this implies that

$$\begin{aligned} {\textbf{u}}^\alpha = \nabla ^\perp (1-\alpha \Delta )^{-1}(\Delta -1)^{-1}(\omega ^\alpha -f). \end{aligned}$$
(2.15)

We also note that even though the velocity fields \(\textbf{u}\) in (1.5) and \({\textbf{u}}^\alpha \) in (2.15) are different, the coupled equations (1.1)–(1.2) and (2.12)–(2.13) are exactly of the same form. In the following, we claim that the exact same result, Theorem 2.14 holds for the \(\alpha \)-TQG model (2.12)–(2.14) introduced above.

Theorem 2.15

Fix \(\alpha >0\). There exists a unique maximal solution \((b^\alpha ,\omega ^\alpha , T_{\max }^\alpha )\) of (2.12)–(2.14) under Assumption 2.4.

Remark 2.16

The fact that Theorem 2.15 holds true is hardly surprising considering that we are basically treating the exact same model as (1.1)–(1.2) albeit different velocity fields. Heuristically, one observes that for very small \(\alpha \), \(\psi ^\alpha \sim \tilde{\psi } \sim \psi \) and thus, both models are the same. Nevertheless, rigorously, the key lies in the estimate (2.7) which still holds true when one replaces \(\textbf{u}\) with \({\textbf{u}}^\alpha \) and where now, the triple \(({\textbf{u}}^\alpha ,\omega ^\alpha ,f)\) satisfies

$$\begin{aligned} {\textbf{u}}^\alpha =\nabla ^\perp \psi ^\alpha , \qquad (\omega ^\alpha -f)=(\Delta -1)(1-\alpha \Delta ) \psi ^\alpha . \end{aligned}$$
(2.16)

for a given f having sufficient regularity.

Indeed, Theorem 2.15 follows from the following lemma.

Lemma 2.17

Under Assumption 2.4, there exists a unique solution \((b^\alpha ,\omega ^\alpha , T^\alpha )\) of (2.12)–(2.14) for any \(\alpha >0\). Moreover, \((b^\alpha ,\omega ^\alpha , T_{\max }^\alpha )\) is a unique maximal solution of (2.12)–(2.14).

Proof

The existence of a unique solution \((b^\alpha ,\omega ^\alpha , T^\alpha )\) of (2.12)–(2.14) for any \(\alpha >0\) follows from Proposition 3.1, Lemma 3.5 and Corollary 2.13 with \(T^\alpha = T\). We will prove all these results later for the TQG and one will observe that nothing changes for the \(\alpha \)-TQG. Note that since (1.1)–(1.5) and (2.12)–(2.14) share the same data, Assumption 2.4, the local existence time T is accordingly independent of \(\alpha >0\).

The construction of the maximal solution \((b^\alpha ,\omega ^\alpha , T_{\max }^\alpha )\) of (2.12)–(2.14) also follows the same gluing argument used in showing the existence of the maximal solution \((b,\omega , T_{\max })\) for (1.1)–(1.5) in Sect. 3.4. \(\square \)

Remark 2.18

Even though we have no proof at the moment, we expect that the solution of the \(\alpha \)-TQG model (2.12)–(2.14) is in fact smooth (perhaps even analytical), this is not expected to hold globally in time. In other words, the \(\alpha \)-TQG model is expected to be a regularisation of the TQG model, as it limits the rate at which the energy can go to high wave numbers. This is clear from the growth rates for linear stability analysis, see Sect. 6.2 for details.

3 Construction of Strong Solutions

In this section, we give a construction of a solution, in the sense of Definition 2.5, to the thermal quasi-geostrophic system of Eqs. (1.1)–(1.5). We will achieve this goal by rewriting our set of Eqs. (1.1)–(1.5) in an abstract form and show that the resulting operator acting on the pair \((b,\omega )\) satisfies the assumptions of Theorem 8.1 in the appendix. The details are as follows. We consider

$$\begin{aligned} \frac{\partial }{\partial t} \left( {\begin{array}{c}b\\ \omega \end{array}}\right) + \mathcal {A} \begin{pmatrix} b \\ \omega \end{pmatrix} =0, \qquad t\ge 0 , \qquad b \big \vert _{t=0} = b_0 , \qquad \omega \big \vert _{t=0} = \omega _0 \end{aligned}$$
(3.1)

where the operator

$$\begin{aligned} \mathcal {A} := \begin{bmatrix} \textbf{u}\cdot \nabla &{} 0 \\ (\textbf{u}_h-\textbf{u})\cdot \nabla &{}\textbf{u}\cdot \nabla \end{bmatrix} \end{aligned}$$
(3.2)

is defined such that

$$\begin{aligned} \textbf{u}=\nabla ^\perp \psi , \qquad \textbf{u}_h =\frac{1}{2} \nabla ^\perp h, \qquad \omega =(\Delta -1) \psi +f. \end{aligned}$$
(3.3)

3.1 Estimates for Convective Terms

In the following, we let \(k\in \mathbb {N}\cup \{0\}\) and recall that \(W^{k,p}(\mathbb {T}^2)\) is a Banach algebra when \(kp>2\). By using Sobolev embeddings, the Biot–Savart law and (2.7), we obtain the following estimates.

  1. (1b)

    Let \(k\in \{0,1,2\}\). If \(b\in W^{k+1,2}(\mathbb {T}^2)\), \(\omega \in W^{2,2}(\mathbb {T}^2)\) and \(\textbf{u}\in W^{3,2}_{\textrm{div}}(\mathbb {T}^2;\mathbb {R}^2)\) solves (1.3), (1.4), (1.5) for a given \(f\in W^{2,2}(\mathbb {T}^2)\) and a given \(\textbf{u}_h\in W^{3,2}_{\textrm{div}}(\mathbb {T}^2;\mathbb {R}^2)\), then

    $$\begin{aligned}&\Vert \textbf{u}\cdot \nabla b\Vert _{k,2} \lesssim \Vert \textbf{u}\Vert _{2,2}\Vert b \Vert _{k+1,2} \lesssim \big (1+\Vert \omega \Vert _{1,2} \big )\Vert b \Vert _{k+1,2} \end{aligned}$$

    and

    $$\begin{aligned}&\Big \vert \big \langle \textbf{u}\cdot \nabla b\,,\, b \big \rangle _{k,2} \Big \vert \lesssim \Vert \textbf{u}\Vert _{3,2}\Vert b \Vert _{k,2}^2 \lesssim \big (1+\Vert \omega \Vert _{2,2} \big )\Vert b \Vert _{k,2}^2, \\&\Big \vert \big \langle \textbf{u}\cdot \nabla b\,,\, \omega \big \rangle _{k,2} \Big \vert \\&\lesssim \Vert \textbf{u}\Vert _{2,2}\Vert b \Vert _{k+1,2}\Vert \omega \Vert _{k,2} \lesssim \big (1+\Vert \omega \Vert _{1,2} \big )\Vert b \Vert _{k+1,2}\Vert \omega \Vert _{k,2},\\&\Big \vert \big \langle \textbf{u}_h\cdot \nabla b\,,\, \omega \big \rangle _{k,2} \Big \vert \lesssim \Vert \textbf{u}_h\Vert _{2,2}\Vert b \Vert _{k+1,2}\Vert \omega \Vert _{k,2} \lesssim \Vert b \Vert _{k+1,2}\Vert \omega \Vert _{k,2}. \end{aligned}$$
  2. (1b)

    Let \(k\ge 3\). If \(b\in W^{k+1,2}(\mathbb {T}^2)\), \(\omega \in W^{k,2}(\mathbb {T}^2)\) and \(\textbf{u}\in W^{k+1,2}_{\textrm{div}}(\mathbb {T}^2;\mathbb {R}^2)\) solves (1.3), (1.4), (1.5) for a given \(f\in W^{k,2}(\mathbb {T}^2)\) and a given \(\textbf{u}_h\in W^{k+1,2}_{\textrm{div}}(\mathbb {T}^2;\mathbb {R}^2)\), then

    $$\begin{aligned}&\Vert \textbf{u}\cdot \nabla b\Vert _{k,2} \lesssim \Vert \textbf{u}\Vert _{2,2}\Vert b \Vert _{k+1,2} +\Vert \textbf{u}\Vert _{k,2}\Vert b \Vert _{3,2} \lesssim \big (1+\Vert \omega \Vert _{k-1,2} \big )\Vert b \Vert _{k+1,2} \end{aligned}$$

    and

    $$\begin{aligned}&\Big \vert \big \langle \textbf{u}\cdot \nabla b\,,\, b \big \rangle _{k,2} \Big \vert \lesssim \Vert \textbf{u}\Vert _{3,2}\Vert b \Vert _{k,2}^2 + \Vert \textbf{u}\Vert _{k,2}\Vert b \Vert _{k,2}\Vert b \Vert _{3,2} \lesssim \big (1+\Vert \omega \Vert _{k-1,2} \big )\Vert b \Vert _{k,2}^2, \\&\Big \vert \big \langle \textbf{u}\cdot \nabla b\,,\, \omega \big \rangle _{k,2} \Big \vert \lesssim \big (\Vert \textbf{u}\Vert _{2,2}\Vert b \Vert _{k+1,2} + \Vert \textbf{u}\Vert _{k,2}\Vert b \Vert _{3,2}\big ) \Vert \omega \Vert _{k,2} \lesssim \big (1+\Vert \omega \Vert _{k-1,2} \big )\Vert b \Vert _{k+1,2} \Vert \omega \Vert _{k,2}, \\&\Big \vert \big \langle \textbf{u}_h\cdot \nabla b\,,\, \omega \big \rangle _{k,2} \Big \vert \lesssim \big (\Vert \textbf{u}_h\Vert _{2,2}\Vert b \Vert _{k+1,2} + \Vert \textbf{u}_h\Vert _{k,2}\Vert b \Vert _{3,2}\big ) \Vert \omega \Vert _{k,2} \lesssim \Vert b \Vert _{k+1,2} \Vert \omega \Vert _{k,2}. \end{aligned}$$
  3. (2a)

    Let \(k\in \{0,1,2\}\). If \(\omega \in W^{k+1,2}(\mathbb {T}^2)\) and that \(\textbf{u}\in W^{3,2}_{\textrm{div}}(\mathbb {T}^2;\mathbb {R}^2)\) solves (1.3), (1.4), (1.5) for a given \(f\in W^{2,2}(\mathbb {T}^2)\), then

    $$\begin{aligned} \Vert \textbf{u}\cdot \nabla \omega \Vert _{k,2} \lesssim \Vert \textbf{u}\Vert _{2,2}\Vert \omega \Vert _{k+1,2} \lesssim \big (1+\Vert \omega \Vert _{1,2} \big )\Vert \omega \Vert _{k+1,2} \end{aligned}$$

    and

    $$\begin{aligned}&\Big \vert \big \langle \textbf{u}\cdot \nabla \omega \,,\, \omega \big \rangle _{k,2} \Big \vert \lesssim \Vert \textbf{u}\Vert _{3,2}\Vert \omega \Vert _{k,2}^2 \lesssim \big (1+\Vert \omega \Vert _{2,2} \big )\Vert \omega \Vert _{k,2}^2. \end{aligned}$$
  4. (2b)

    Let \(k\ge 3\). If \(\omega \in W^{k+1,2}(\mathbb {T}^2)\) and \(\textbf{u}\in W^{k+1,2}_{\textrm{div}}(\mathbb {T}^2;\mathbb {R}^2)\) solves (1.3), (1.4), (1.5) for a given \(f\in W^{k,2}(\mathbb {T}^2)\), then

    $$\begin{aligned}&\Vert \textbf{u}\cdot \nabla \omega \Vert _{k,2} \lesssim \Vert \textbf{u}\Vert _{2,2}\Vert \omega \Vert _{k+1,2} +\Vert \textbf{u}\Vert _{k,2}\Vert \omega \Vert _{3,2} \lesssim \big (1+\Vert \omega \Vert _{k-1,2} \big )\Vert \omega \Vert _{k+1,2} \end{aligned}$$

    and

    $$\begin{aligned}&\Big \vert \big \langle \textbf{u}\cdot \nabla \omega \,,\, \omega \big \rangle _{k,2} \Big \vert \lesssim \Vert \textbf{u}\Vert _{3,2}\Vert \omega \Vert _{k,2}^2 + \Vert \textbf{u}\Vert _{k,2}\Vert \omega \Vert _{k,2}\Vert \omega \Vert _{3,2} \lesssim \big (1+\Vert \omega \Vert _{k-1,2} \big )\\ {}&\quad \Vert \omega \Vert _{k,2}^2. \end{aligned}$$

With these estimates in hand, we can now proceed to prove the existence of a local strong solution of (3.1).

3.2 Proof of Local Existence

Our proof of a local strong solution to (1.1)–(1.5) will follow from the following proposition. Its proof will follow the argument presented in Kato and Lai (1984) (see Appendix, Sect. 8) for the construction of a local solution to the Euler equation.

Proposition 3.1

Take Assumption 2.4. There exists a solution \((b, \omega , T) \) of (3.1) such that:

  1. (1)

    the time \(T>0\) satisfy the bound

    $$\begin{aligned} T<\frac{1}{2c(1+\Vert (b_0,\omega _0)\Vert ^2_{\mathcal {M}})}, \quad c=c(\Vert \textbf{u}_h\Vert _{3,2},\Vert f \Vert _{2,2}); \end{aligned}$$
    (3.4)
  2. (2)

    the pair \((b,\omega )\) is of class

    $$\begin{aligned} b \in C_w([0,T];W^{3,2}(\mathbb {T}^2)), \qquad \omega \in C_w([0,T];W^{2,2}(\mathbb {T}^2)) ; \end{aligned}$$
    (3.5)
  3. (3)

    the pair \((b,\omega )\) satisfies the bound

    $$\begin{aligned} \Vert (b, \omega )(t) \Vert _\mathcal {M}^2 \le \frac{\Vert (b_0,\omega _0)\Vert ^2_{\mathcal {M}}+2c(1+\Vert (b_0,\omega _0) \Vert ^2_{\mathcal {M}})\,t}{1-2c(1+\Vert (b_0,\omega _0)\Vert ^2_{\mathcal {M}})\,t},\quad \quad t\in (0,T). \end{aligned}$$
    (3.6)

Proof

Let us consider the following triplet \(\{V,H,X\}\) given as follows. Take \(X:=W^{1,2}(\mathbb {T}^2)\times L^2(\mathbb {T}^2)\) and let \(H:=W^{(3,2),2}(\mathbb {T}^2)\times W^{(2,1),2}(\mathbb {T}^2)\) be a Hilbert space endowed with the inner product

$$\begin{aligned} \begin{aligned} \big \langle (b,\omega ), (b',\omega ') \big \rangle _H = \langle b\,,\,b' \rangle _{3,2} + \langle b\,,\,b' \rangle _{2,2} + \langle \omega \,,\,\omega ' \rangle _{2,2} + \langle \omega \,,\,\omega ' \rangle _{1,2}. \end{aligned} \end{aligned}$$
(3.7)

We note that for \((b,\omega ) \in \mathcal {M}\), we have that

$$\begin{aligned} \begin{aligned} \Vert (b,\omega )\Vert _{ \mathcal {M}} \le \Vert (b,\omega )\Vert _H \le 2 \Vert (b,\omega )\Vert _{ \mathcal {M}} \end{aligned} \end{aligned}$$
(3.8)

and thus, H is equivalent to \( \mathcal {M}\). Also, by virtue of the estimates shown in Sect. 3.1, we can conclude that the operator \(\mathcal {A}\) is weakly continuous from H into X.

Next, we let V be the domain of an unbounded selfadjoint operator \(S\ge 0\) in X with \(\textrm{domain}(S)\subset H\) and \(\langle S u,v\rangle =\langle u,v \rangle _H\) for \(u\in \textrm{domain}(S)\). More precisely, \( S= \sum _{\vert \beta \vert \le 3}(-\partial )^\beta \partial ^\beta \) subject to periodic boundary conditions. With this definition of V in hand, we can again conclude from the estimates in Sect. 3.1 that \(\mathcal {A}\) maps V into H. Our goal now is to show that for any \((b,\omega )\in V\), there exists an increasing function \(\rho (r)\ge 0\) of \(r\ge 0\) such that

$$\begin{aligned} \begin{aligned} \Big \vert \big \langle \mathcal {A}(b,\omega ), (b,\omega ) \big \rangle _H \Big \vert \le \rho \big (\Vert (b,\omega )\Vert ^2_H \big ). \end{aligned} \end{aligned}$$
(3.9)

To achieve this goal, we further refine the following estimates from Sect. 3.1. In particular, we recall that if \(b\in W^{k+1,2}(\mathbb {T}^2)\) and \(\omega \in W^{2,2}(\mathbb {T}^2)\) and that \(\textbf{u}\in W^{3,2}_{\textrm{div}}(\mathbb {T}^2;\mathbb {R}^2)\) solves (1.3), (1.4), (1.5) for a given \(f\in W^{2,2}(\mathbb {T}^2)\) and a given \(\textbf{u}_h\in W^{3,2}_{\textrm{div}}(\mathbb {T}^2;\mathbb {R}^2)\), then for \(k\in \{0,1, 2\}\),

$$\begin{aligned} \begin{aligned}&\Big \vert \big \langle \textbf{u}\cdot \nabla b\,,\, b \big \rangle _{k,2} \Big \vert \lesssim \big (1+\Vert \omega \Vert _{2,2} \big )\Vert b \Vert _{k+1,2}^2 \lesssim \big (1+ \Vert (b,\omega )\Vert ^2_H \big )\Vert b \Vert _{3,2}^2 ,\\&\Big \vert \big \langle \textbf{u}\cdot \nabla \omega \,,\,\omega \big \rangle _{k,2} \Big \vert \lesssim \big (1+\Vert \omega \Vert _{2,2} \big )\Vert \omega \Vert _{2,2}^2 \lesssim \big (1+ \Vert (b,\omega )\Vert ^2_H \big )\Vert \omega \Vert _{2,2}^2 ,\\&\Big \vert \big \langle \textbf{u}\cdot \nabla b\,,\,\omega \big \rangle _{k,2} \Big \vert \lesssim \big (1+\Vert \omega \Vert _{2,2} \big )\Vert b \Vert _{k+1,2}\Vert \omega \Vert _{2,2} \lesssim \big (1+ \Vert (b,\omega )\Vert ^2_H \big )\big (\Vert b \Vert _{3,2}^2+ \Vert \omega \Vert _{2,2}^2 \big ) , \\&\Big \vert \big \langle \textbf{u}_h\cdot \nabla b\,,\,\omega \big \rangle _{k,2} \Big \vert \lesssim \Vert b \Vert _{k+1,2}\Vert \omega \Vert _{2,2} \lesssim \big (\Vert b \Vert _{3,2}^2+ \Vert \omega \Vert _{2,2}^2 \big ) \lesssim \big (1+ \Vert (b,\omega )\Vert ^2_H \big ). \end{aligned} \end{aligned}$$
(3.10)

In the above estimates, we have used inequalities such as \(x\lesssim 1+x^2\) for \(x\ge 0\) and Young’s inequalities. Furthermore, if \(b\in W^{3,2}(\mathbb {T}^2)\), \(\omega \in W^{2,2}(\mathbb {T}^2)\) and \(\textbf{u}\in W^{3,2}_{\textrm{div}}(\mathbb {T}^2;\mathbb {R}^2)\), we also have that

$$\begin{aligned} \begin{aligned}&\Big \vert \big \langle \textbf{u}\cdot \nabla b\,,\, b \big \rangle _{3,2} \Big \vert \lesssim \big (1+\Vert \omega \Vert _{2,2} \big )\Vert b \Vert _{3,2}^2 \lesssim \big (1+ \Vert (b,\omega )\Vert ^2_H \big )\Vert b \Vert _{3,2}^2. \end{aligned} \end{aligned}$$
(3.11)

We can therefore conclude that for \((b,\omega )\in H\) (and in particular, for any \((b,\omega )\in V\)),

$$\begin{aligned} \begin{aligned} \Big \vert \big \langle \mathcal {A}(b,\omega ), (b,\omega ) \big \rangle _H \Big \vert \lesssim \big (1+ \Vert (b,\omega )\Vert ^2_H \big )^2 \end{aligned} \end{aligned}$$
(3.12)

with a constant depending only on \(\Vert \textbf{u}_h\Vert _{3,2}\) and \(\Vert f \Vert _{2,2}\). Since \(\rho (r)=c(1+r)^2\ge 0\) is an increasing function of \(r\ge 0\), we can conclude from the result in the Appendix, Sect. 8, that for a given \((b_0,\omega _0) \in H \equiv \mathcal {M}\), there exists a solution \((b, \omega , T) \) of (3.1) of class

$$\begin{aligned} b \in C_w([0,T];W^{3,2}(\mathbb {T}^2)), \qquad \omega \in C_w([0,T];W^{2,2}(\mathbb {T}^2)) \end{aligned}$$
(3.13)

satisfying the bound (recall (3.8))

$$\begin{aligned} \Vert (b, \omega )(t) \Vert _\mathcal {M}^2 \le \Vert (b, \omega )(t) \Vert _H^2 \le r(t),\quad \quad t\in (0,T) \end{aligned}$$
(3.14)

where r is an increasing function on (0, T). The time \(T>0\) in (3.13)–(3.14) depends only on \(\Vert \textbf{u}_h\Vert _{3,2}\) and \(\Vert f \Vert _{2,2}\) by way of the function \(\rho (\cdot )\) as well as on \(\Vert (b_0,\omega _0)\Vert _{\mathcal {M}}\). To be precise, for \(\rho (r)=c(1+r)^2\ge 0\) where \(c>0\) depends only on \(\Vert \textbf{u}_h\Vert _{3,2}\) and \(\Vert f \Vert _{2,2}\), we obtain \(T>0\) by solving the equation

$$\begin{aligned} \frac{\,\textrm{d}}{\,\textrm{d}t}r = 2 \rho (r),\qquad \qquad r(0)= \Vert (b_0,\omega _0)\Vert ^2_{\mathcal {M}} \end{aligned}$$
(3.15)

which yields

$$\begin{aligned} r(t) = \frac{\Vert (b_0,\omega _0)\Vert ^2_{\mathcal {M}}+2c(1+\Vert (b_0,\omega _0) \Vert ^2_{\mathcal {M}})\,t}{1-2c(1+\Vert (b_0,\omega _0)\Vert ^2_{\mathcal {M}})\,t}. \end{aligned}$$
(3.16)

We therefore require

$$\begin{aligned} T<\frac{1}{2c(1+\Vert (b_0,\omega _0)\Vert ^2_{\mathcal {M}})} \end{aligned}$$
(3.17)

for a solution of (3.15) to exist. This finishes the proof. \(\square \)

Remark 3.2

Note that r(t) converges to \(\Vert (b_0,\omega _0)\Vert ^2_{\mathcal {M}}\) as \(t\rightarrow 0^+\). Therefore, given the bound (3.14) and (3.16), we can conclude that

$$\begin{aligned} \limsup _{t\rightarrow 0^+} \Vert (b,\omega )(t) \Vert ^2_{\mathcal {M}}\le \Vert (b_0,\omega _0)\Vert ^2_{\mathcal {M}}. \end{aligned}$$

On the other hand, because the pair \((b,\omega )\) is of class (3.13), we can also conclude that

$$\begin{aligned} \liminf _{t\rightarrow 0^+} \Vert (b,\omega )(t) \Vert ^2_{\mathcal {M}}\ge \Vert (b_0,\omega _0)\Vert ^2_{\mathcal {M}} \end{aligned}$$

and thus, we obtain strong right continuity at time \(t=0\). Since our system (1.1)–(1.5) is time reversible (see Remark 3.3 below), we also obtain strong left continuity at \(t=0\) so that we are able to conclude that (3.13) is actually strongly continuous at \(t=0\).

Remark 3.3

By time-reversible, we mean that if b(tx) and \(\omega (t,x)\) (with \(\omega (t,x)\) related to \(\textbf{u}(t,x)\) by (1.5)) solves (1.1)–(1.5) with data \(b_0(x)\), \(\omega _0(x)\), \(\textbf{u}_h(x)\) and f(x), then \(-b(-t,x)\) and \(-\omega (-t,x)\) (with \(-\omega (-t,x)\) related to \(-\textbf{u}(-t,x)\) by (1.5)) solves (1.1)–(1.5) with data \(-b_0(x)\), \(-\omega _0(x)\), \(-\textbf{u}_h(x)\) and \(-f(x)\), respectively.

Remark 3.4

We remark that the function \(\rho \) and, as such, the function r solving (3.15) are not unique. In fact, one can construct countably many (if not infinitely many) of these functions by varying the estimates for the convective terms on the left-hand sides of (3.10)–(3.11). For example, a different choice of Hölder conjugates for estimating the aforementioned terms will lead to a different \(\rho \) and r. The importance of this remark lies in the fact that \(\rho \) and r determines the longevity of the solution \((b,\omega , T)\) to (1.1)–(1.5). One can therefore tune \(T>0\) by modifying \(\rho \) and r accordingly. For example, we may also obtain the following

$$\begin{aligned} \rho _1(r)=c(r^2+r), \qquad \quad \rho _2(r)=c(r^\frac{3}{2}+r) \end{aligned}$$
(3.18)

for which the corresponding solutions to (3.15) are

$$\begin{aligned}&r_1(t)=\frac{r(0)e^{2ct}}{r(0)+1-r(0)e^{2ct}}, \end{aligned}$$
(3.19)
$$\begin{aligned}&r_2(t) =\frac{s^2+s\pm 2s\sqrt{s}}{s^2-2s+1}, \quad \text {where} \quad s=s(t)=\frac{r(0)}{(\sqrt{r(0)}+1)^2}e^{2ct}, \end{aligned}$$
(3.20)

respectively. Again, the constants depends only on \(\Vert \textbf{u}_h\Vert _{3,2}\) and \(\Vert f \Vert _{2,2}\). The second solution \(r_2\) is unsuitable since for one, it is twofold. The first solution \(r_1\), however, means that we require a time

$$\begin{aligned} T_1 <\frac{1}{2c}\ln \bigg (1+\frac{1}{\Vert (b_0,\omega _0)\Vert ^2_{\mathcal {M}}}\bigg ) \end{aligned}$$

for a solution of (3.15) to exist. Therefore, if we optimise the constant \(c>0\) so that they are the same throughout and we compare \(r_1\) with r given in (3.16), we are able to conclude that r yields a longer time of existence of a solution \((b,\omega , T)\) to (1.1)–(1.5) as compared to \(r_1\) since

$$\begin{aligned} \frac{1}{2c}\ln \bigg (1+\frac{1}{\Vert (b_0,\omega _0)\Vert ^2_{\mathcal {M}}}\bigg ) \le \frac{1}{2c(1+\Vert (b_0,\omega _0)\Vert ^2_{\mathcal {M}})}. \end{aligned}$$

In the next section, we show an estimate for the difference of two solutions constructed above after which we are able to strengthen the weak continuity (3.13) to a strong one for all times of existence.

3.3 Difference Estimate

In the following, we let \((b^1,\omega ^1, T)\) and \((b^2,\omega ^2, T)\) be two solutions of (1.1)–(1.5) sharing the same data \(\textbf{u}_h \in W^{3,2}_{\textrm{div}}(\mathbb {T}^2;\mathbb {R}^2)\), \(f\in W^{2,2}(\mathbb {T}^2)\) and \((b_0, \omega _0) \). So in particular,

$$\begin{aligned} \sup _{t\in [0,T)}\Vert b^i(t,\cdot ) \Vert _{3,2}<\infty , \qquad \sup _{t\in [0,T)}\Vert \omega ^i(t,\cdot ) \Vert _{2,2}<\infty \end{aligned}$$
(3.21)

hold for each \(i=1,2\). We now set \(b^{12}:=b^1-b^2\), \(\omega ^{12}:=\omega ^1-\omega ^2\) and \(\textbf{u}^{12}:=\textbf{u}^1-\textbf{u}^2\) so that \((b^{12}, \textbf{u}^{12}, \omega ^{12})\) satisfies

$$\begin{aligned}&\frac{\partial }{\partial t} b^{12} + \textbf{u}^{12} \cdot \nabla b^1 +\textbf{u}^2\cdot \nabla b^{12} =0, \end{aligned}$$
(3.22)
$$\begin{aligned}&\frac{\partial }{\partial t}\omega ^{12} + \textbf{u}^{12}\cdot \nabla ( \omega ^1-b^1) +\textbf{u}^2\cdot \nabla (\omega ^{12} - b^{12}) = -\textbf{u}_h \cdot \nabla b^{12}, \end{aligned}$$
(3.23)

where

$$\begin{aligned} \textbf{u}^{12} =\nabla ^\perp \psi ^{12}, \qquad \textbf{u}_h =\frac{1}{2} \nabla ^\perp h, \qquad \omega ^{12}=(\Delta -1) \psi ^{12}. \end{aligned}$$
(3.24)

We now show the following stability and uniqueness results in the space \(W^{2,2}(\mathbb {T}^2) \times W^{1,2}(\mathbb {T}^2)\) which is larger than the space \(W^{3,2}(\mathbb {T}^2) \times W^{2,2}(\mathbb {T}^2)\) of existence of the individual solutions. Indeed, it suffices to show uniqueness in the even larger space \(W^{1,2}(\mathbb {T}^2) \times L^2(\mathbb {T}^2)\) but we avoid doing so since we wish to concurrently show the proof of Theorem 2.11 as well.

Proof of Theorem 2.11 and Corollary 2.13

If we apply \(\partial ^\beta \) to (3.22) with \(\vert \beta \vert \le 2\), we obtain

$$\begin{aligned} \frac{\partial }{\partial t} \partial ^\beta b^{12} +\textbf{u}^2\cdot \nabla \partial ^\beta b^{12} = S_1 + S_2 -\textbf{u}^{12} \cdot \nabla \partial ^\beta b^1, \end{aligned}$$
(3.25)

where

$$\begin{aligned} S_1&:= \textbf{u}^{12} \cdot \partial ^\beta \nabla b^1 - \partial ^\beta (\textbf{u}^{12} \cdot \nabla b^1 ),\\ S_2&:= \textbf{u}^2\cdot \partial ^\beta \nabla b^{12} -\partial ^\beta (\textbf{u}^2 \cdot \nabla b^{12} ) \end{aligned}$$

are such that

$$\begin{aligned} \Vert S_1 \Vert _2&\lesssim \Vert \nabla \textbf{u}^{12} \Vert _{4}\Vert b^1 \Vert _{2,4} + \Vert \nabla b^1 \Vert _{\infty } \Vert \textbf{u}^{12} \Vert _{2,2} \lesssim \Vert \textbf{u}^{12} \Vert _{2,2}\Vert b^1 \Vert _{3,2}, \end{aligned}$$
(3.26)
$$\begin{aligned} \Vert S_2 \Vert _2&\lesssim \Vert \nabla \textbf{u}^2 \Vert _{\infty }\Vert b^{12} \Vert _{2,2} + \Vert \nabla b^{12} \Vert _{4} \Vert \textbf{u}^2 \Vert _{2,4} \lesssim \Vert \textbf{u}^2 \Vert _{3,2}\Vert b^{12} \Vert _{2,2}. \end{aligned}$$
(3.27)

Here, we have used the commutator estimate (2.3) and the continuous embeddings \(W^{2,2}(\mathbb {T}^2) \hookrightarrow W^{1,4}(\mathbb {T}^2)\), \(W^{3,2}(\mathbb {T}^2) \hookrightarrow W^{2,4}(\mathbb {T}^2)\) and \(W^{3,2}(\mathbb {T}^2) \hookrightarrow W^{1,\infty }(\mathbb {T}^2)\). Additionally, the following estimate for the \(L^2\) inner product of the last term (3.25) with \(\partial ^\beta b^{12}\) holds

$$\begin{aligned} \begin{aligned} \Big \vert \big \langle \textbf{u}^{12} \cdot \nabla \partial ^\beta b^1 \, ,\, \partial ^\beta b^{12} \big \rangle \Big \vert \lesssim \Vert b^1 \Vert _{3,2} \Vert \textbf{u}^{12} \Vert _\infty \Vert b^{12} \Vert _{2,2}. \end{aligned} \end{aligned}$$
(3.28)

If we test (3.25) with \(\partial ^\beta b^{12}\) and sum over \(\vert \beta \vert \le 2\), we obtain from the above estimates together with (2.7) and (3.21),

$$\begin{aligned} \begin{aligned} \frac{\,\textrm{d}}{\,\textrm{d}t} \Vert b^{12} \Vert _{2,2}^2&\lesssim \Vert \omega ^{12} \Vert _{1,2}^2 + \Vert b^{12} \Vert _{2,2}^2. \end{aligned} \end{aligned}$$
(3.29)

Next, we apply \(\partial ^\beta \) to (3.23) with \(\vert \beta \vert \le 1\) to get

$$\begin{aligned} \begin{aligned} \frac{\partial }{\partial t}\partial ^\beta \omega ^{12} +\textbf{u}^2\cdot \nabla \partial ^\beta \omega ^{12} =&- \textbf{u}^{12}\cdot \nabla \partial ^\beta ( \omega ^1 - b^1) +(\textbf{u}^2-\textbf{u}_h)\cdot \nabla \partial ^\beta b^{12}\\&+S_3+ \ldots +S_7 \end{aligned} \end{aligned}$$
(3.30)

where

$$\begin{aligned} S_3&:= \textbf{u}^2 \cdot \partial ^\beta \nabla \omega ^{12} - \partial ^\beta (\textbf{u}^2 \cdot \nabla \omega ^{12} ), \\ S_4&:= \textbf{u}^{12}\cdot \partial ^\beta \nabla \omega ^1 -\partial ^\beta (\textbf{u}^{12} \cdot \nabla \omega ^1 ), \\ S_5&:= -\textbf{u}^{12}\cdot \partial ^\beta \nabla b^1 + \partial ^\beta (\textbf{u}^{12} \cdot \nabla b^1 ), \\ S_6&:= -\textbf{u}^2\cdot \partial ^\beta \nabla b^{12} + \partial ^\beta (\textbf{u}^2 \cdot \nabla b^{12} ), \\ S_7&:= \textbf{u}_h \cdot \partial ^\beta \nabla b^{12} - \partial ^\beta (\textbf{u}_h \cdot \nabla b^{12} ) \end{aligned}$$

are such that

$$\begin{aligned} \Vert S_3 \Vert _2&\lesssim \Vert \textbf{u}^2 \Vert _{1,\infty }\Vert \omega ^{12} \Vert _{1,2} \lesssim \Vert \omega ^{12} \Vert _{1,2}, \end{aligned}$$
(3.31)
$$\begin{aligned} \Vert S_4 \Vert _2&\lesssim \Vert \textbf{u}^{12} \Vert _{1,4}\Vert \omega ^1 \Vert _{1,4} \lesssim \Vert \textbf{u}^{12} \Vert _{2,2} \lesssim \Vert \omega ^{12} \Vert _{1,2}, \end{aligned}$$
(3.32)
$$\begin{aligned} \Vert S_5 \Vert _2&\lesssim \Vert \textbf{u}^{12} \Vert _{1,4}\Vert b^1 \Vert _{1,4}\lesssim \Vert \textbf{u}^{12} \Vert _{2,2} \lesssim \Vert \omega ^{12} \Vert _{1,2}, \end{aligned}$$
(3.33)
$$\begin{aligned} \Vert S_6 \Vert _2&\lesssim \Vert \textbf{u}^2 \Vert _{1,\infty }\Vert b^{12} \Vert _{1,2} \lesssim \Vert b^{12} \Vert _{1,2} \lesssim \Vert b^{12} \Vert _{2,2}, \end{aligned}$$
(3.34)
$$\begin{aligned} \Vert S_7 \Vert _2&\lesssim \Vert \textbf{u}_h \Vert _{1,\infty }\Vert b^{12} \Vert _{1,2} \lesssim \Vert b^{12} \Vert _{1,2} \lesssim \Vert b^{12} \Vert _{2,2}. \end{aligned}$$
(3.35)

Also, since \(\textrm{div}(\textbf{u}^2)=0\)

$$\begin{aligned} \begin{aligned}&\Big \vert \big \langle \textbf{u}^2\cdot \nabla \partial ^\beta \omega ^{12} \, ,\, \partial ^\beta \omega ^{12} \big \rangle \Big \vert =0, \\&\Big \vert \big \langle \textbf{u}^{12}\cdot \nabla \partial ^\beta ( \omega ^1 - b^1) \, ,\, \partial ^\beta \omega ^{12} \big \rangle \Big \vert \lesssim \Vert \omega ^1- b^1 \Vert _{2,2} \Vert \textbf{u}^{12} \Vert _\infty \Vert \omega ^{12} \Vert _{1,2}, \\&\Big \vert \big \langle (\textbf{u}^2-\textbf{u}_h)\cdot \nabla \partial ^\beta b^{12} \, ,\, \partial ^\beta \omega ^{12} \big \rangle \Big \vert \lesssim \Vert \textbf{u}^2-\textbf{u}_h \Vert _\infty \Vert b^{12} \Vert _{2,2} \Vert \omega ^{12} \Vert _{1,2}. \end{aligned} \end{aligned}$$
(3.36)

Therefore,

$$\begin{aligned} \begin{aligned} \frac{\,\textrm{d}}{\,\textrm{d}t} \Vert \omega ^{12} \Vert _{1,2}^2&\lesssim \ \Vert \omega ^{12} \Vert _{1,2}^2 + \Vert b^{12} \Vert _{2,2}^2. \end{aligned} \end{aligned}$$
(3.37)

From (3.29) and (3.37), we can conclude from Grönwall’s lemma that

$$\begin{aligned} \begin{aligned} \Vert b^{12}(t, \cdot ) \Vert _{2,2}^2 + \Vert \omega ^{12}(t, \cdot ) \Vert _{1,2}^2 \le \exp (ct)\big (\Vert b^{12}_0\Vert _{2,2}^2 + \Vert \omega ^{12}_0 \Vert _{1,2}^2 \big ) \end{aligned} \end{aligned}$$
(3.38)

for all \(t\ge 0\) where the constant \(c>0\) depends on \( (\Vert b^1\Vert _{3,2}, \Vert \omega ^1\Vert _{2,2}, \Vert \omega ^2\Vert _{2,2}, \Vert \textbf{u}_h\Vert _{3,2}, \Vert f\Vert _{2,2})\). This finishes the proof of Theorem 2.11. To obtain uniqueness, i.e. Corollary 2.13, we use the fact that \(b^{12}_0=0\) and \(\omega ^{12}_0=0\). \(\square \)

Having shown uniqueness, we can conclude that the weakly continuous functions (3.5) are indeed strongly continuous by using time-reversibility of (1.1)–(1.2).

Lemma 3.5

Take the assumptions of Proposition 3.1 to be true. Then

$$\begin{aligned} b \in C([0,T];W^{3,2}(\mathbb {T}^2)), \qquad \omega \in C([0,T];W^{2,2}(\mathbb {T}^2)). \end{aligned}$$

Proof

Firstly, we recall that the solution we constructed in Proposition 3.1 is actually strongly continuous at time \(t=0\), recall Remark 3.2. Now consider a solution \((b,\omega , T_0)\) of (1.1)–(1.2) where \(T_0 \in [0,T]\) is fixed but arbitrary. Then by (3.6), this solution will satisfy the bound

$$\begin{aligned} \Vert (b, \omega )(T_0) \Vert ^2_{\mathcal {M}} \le \frac{\Vert (b_0,\omega _0) \Vert ^2_{\mathcal {M}} + 2c(1+\Vert (b_0,\omega _0)\Vert ^2_{\mathcal {M}})T_0 }{1-2c(1+\Vert (b_0,\omega _0)\Vert ^2_{\mathcal {M}})T_0 } \end{aligned}$$
(3.39)

\(c=c(\Vert \textbf{u}_h\Vert _{3,2}, \Vert f \Vert _{2,2})\). We can now construct a new solution \((b^1,\omega ^1, T_0+T_1)\) by taking \((b, \omega )(T_0)\) as an initial condition. A repetition of the argument leading to (3.39) means that this new pair \((b^1,\omega ^1)\) solving (1.1)–(1.2) will satisfy the inequality

$$\begin{aligned} \Vert (b^1, \omega ^1)(t) \Vert ^2_{\mathcal {M}} \le \frac{\Vert (b,\omega )(T_0)\Vert ^2_{\mathcal {M}} + 2c(1+\Vert (b,\omega )(T_0)\Vert ^2_{\mathcal {M}})t }{1-2c(1+\Vert (b,\omega )(T_0)\Vert ^2_{\mathcal {M}})t } \end{aligned}$$
(3.40)

on \([T_0, T_0+T_1]\) where

$$\begin{aligned} T_1<\frac{1}{2c(1+\Vert (b,\omega )(T_0)\Vert ^2_{\mathcal {M}})}. \end{aligned}$$

Furthermore, \((b^1,\omega ^1)\) must coincide with \((b,\omega )\) on \([T_0, T_0+T_1] \cap [0,T]\) by uniqueness and the fact that they agree at \(T_0\in [0,T]\). Subsequently, we obtain from (3.40), strong right-continuity of \(\Vert (b^1, \omega ^1) \Vert _{\mathcal {M}}\) at time \(t=T_0\) just as was done for \(t=0\) in Remark 3.2. Again, by uniqueness, this implies that \(\Vert (b, \omega ) \Vert _{\mathcal {M}}\) is also strongly right-continuous at time \(t=T_0\). Since \(T_0\) was chosen arbitrarily, this means that \((b,\omega )\) is strongly right-continuous on [0, T]. Since our system is reversible, we can repeat the above argument for the backward equation from which we obtain strong left-continuity on [0, T]. We have thus shown that weakly continuous solution (3.5) is indeed strongly continuous. \(\square \)

3.4 Maximal Solution

We now end the section with a proof of the existence of a maximal solution to the TQG model.

Proof of Theorem 2.14

By Proposition 3.1, we found a time

$$\begin{aligned} T<\frac{1}{2c(1+\Vert (b_0,\omega _0)\Vert ^2_{\mathcal {M}})} \end{aligned}$$
(3.41)

with \(c=c(\Vert \textbf{u}_h \Vert _{3,2}, \Vert f \Vert _{2,2})\) such that \((b, \omega , T)\) is a unique solution of (1.1)–(1.5). At time \(T>0\) given above, we now choose \((b_T,\omega _T)\in \mathcal {M}\), \(\textbf{u}_h \in W^{3,2}_{\textrm{div}}(\mathbb {T}^2;\mathbb {R}^2)\) and \(f\in W^{2,2}(\mathbb {T}^2)\) as our new data where

$$\begin{aligned} b_T:=b(T,\cdot ), \qquad \omega _T:=\omega (T,\cdot ). \end{aligned}$$

By repeating the argument, we can also find

$$\begin{aligned} \tilde{T}<\frac{1}{2c(1+\Vert (b_T,\omega _T)\Vert ^2_{\mathcal {M}})} \end{aligned}$$
(3.42)

with \(c=c(\Vert \textbf{u}_h \Vert _{3,2}, \Vert f \Vert _{2,2})\), such that with the new initial conditions \((b_T, \omega _T) \), \((b, \omega )\) uniquely solves (1.1)–(1.5), in the sense of Definition 2.5, on \([T,T+ \tilde{T}]\). By a gluing argument, we obtain a solution \((b,\omega , T_1)\) of (1.1)–(1.5) with \(T_1=T+\tilde{T}\). By this iterative procedure:

  • we obtain an increasing family \((T_n)){n\in \mathbb {N}}\) of time steps whose limit is \(T_{\max }\in (0, \infty ]\);

  • for each \(n\in \mathbb {N}\), \((b,\omega , T_n)\) is a solution of (1.1)–(1.5);

  • if \(T_{\max }<\infty \), then

    $$\begin{aligned} \limsup _{T_n\rightarrow T_{\max }} \Vert (b,\omega )(T_n) \Vert _{\mathcal {M}}^2 =\infty \end{aligned}$$

since otherwise, we can repeat the procedure above to obtain \(\tilde{\tilde{T}}>0\) satisfying

$$\begin{aligned} \tilde{\tilde{T}}<\frac{1}{2c(1+\Vert (b,\omega )(T_{\max })\Vert ^2_{\mathcal {M}})} \end{aligned}$$

with \(c=c(\Vert \textbf{u}_h \Vert _{3,2}, \Vert f \Vert _{2,2})\), such that \((b,\omega , T_{\max }^{\max })\) is a solution of (1.1)–(1.5) with \(T_{\max }^{\max }=T_{\max }+\tilde{\tilde{T}}\). This will contradict the fact that \(T_{\max }\) is the maximal time. \(\square \)

4 Convergence of \(\alpha \)-TQG to TQG

In this section, for a given \(\textbf{u}_h \in W^{3,2}_{\textrm{div}}(\mathbb {T}^2;\mathbb {R}^2)\) and \(f\in W^{2,2}(\mathbb {T}^2)\) serving as data for both the TQG (1.1)–(1.5) and the \(\alpha \)-TQG (2.12)–(2.14), we aim to show that any family \((b^\alpha ,\omega ^\alpha , T_{\max }^\alpha )_{\alpha >0}\) of unique maximal solutions to the \(\alpha \)-TQG model (2.12)–(2.14) converges strongly to the unique maximal solution \((b,\omega , T_{\max })\) of the TQG model (1.1)–(1.5) as \(\alpha \rightarrow 0\) provided that both system share the same initial conditions and are defined on a common time interval [0, T]. Obviously, the time horizon T should be smaller than \(T_{\max }^\alpha \) for any choice of \(\alpha \) and smaller than \(T_{\max }\). To justify the existence of this common time horizon observe that Proposition 3.1 justifies the existence of a local solution on an interval [0, T] for the TQG equation provided (3.4) holds, i.e. that

$$\begin{aligned} T<\frac{1}{2c(1+\Vert (b_0,\omega _0)\Vert ^2_{\mathcal {M}})}, \quad c=c(\Vert \textbf{u}_h\Vert _{3,2},\Vert f \Vert _{2,2}); \end{aligned}$$

For example, we can choose \(T=1/(4c(1+\Vert (b_0,\omega _0)\Vert ^2_{\mathcal {M}}))\). It follows that \(T_{\max }>1/(4c(1+\Vert (b_0,\omega _0)\Vert ^2_{\mathcal {M}}))\). The same argument applies to the \(\alpha \)-TQG equation (see Lemma 2.17), that is, we also have \(T_{\max }^\alpha >1/(4c(1+\Vert (b_0,\omega _0)\Vert ^2_{\mathcal {M}}))\) for any value \(\alpha \). The common time horizon T can therefore be chosen to be \(1/(4c(1+\Vert (b_0,\omega _0)\Vert ^2_{\mathcal {M}}))\). Our main result is the following.

Proposition 4.1

Under Assumption 2.4, let \((b^\alpha , \omega ^\alpha , T_{\max }^\alpha )_{\alpha >0}\) be a family of unique maximal solutions of (2.12)–(2.14) and let \((b, \omega , T_{\max })\) be the unique maximal solution of (1.1)–(1.5). Then

$$\begin{aligned} \begin{aligned}&\sup _{t\in [0,T]}\big (\Vert ({b}^\alpha -b)(t) \Vert _{1,2}^2 + \Vert ({\omega }^\alpha -\omega )(t) \Vert _{2}^2 \big ) \lesssim \alpha ^2 T\exp (cT)\big [1+\exp (cT) \big ] \end{aligned} \end{aligned}$$
(4.1)

for any \(T<\min \{T_{\max },T_{\max }^\alpha \}\) and

$$\begin{aligned} \sup _{t\in [0,T]}\Vert ({b}^\alpha - b)(t) \Vert _{1,2} \rightarrow 0 , \quad \qquad \sup _{t\in [0,T]}\Vert ({\omega }^\alpha - \omega )(t) \Vert _{2}\rightarrow 0 \end{aligned}$$
(4.2)

as \(\alpha \rightarrow 0\).

We will now devote the entirety of this section to the proof of Proposition 4.1. In order to achieve this goal, we first introduce a unique maximal solution \((\overline{b}^\alpha ,\overline{\omega }^\alpha , \overline{T}^\alpha _{\max })\) with initial conditions \(b_0 \in W^{3,2}(\mathbb {T}^2)\) and \(\omega _0\in W^{2,2}(\mathbb {T}^2)\) of the following “intermediate \(\alpha \)-TQG" equation given by

$$\begin{aligned} \frac{\partial }{\partial t} \overline{b}^\alpha + \overline{\textbf{u}}^\alpha \cdot \nabla \overline{b}^\alpha =0, \end{aligned}$$
(4.3)
$$\begin{aligned} \frac{\partial }{\partial t}\overline{\omega }^\alpha + \overline{\textbf{u}}^\alpha \cdot \nabla ( \overline{\omega }^\alpha -\overline{b}^\alpha ) = -\textbf{u}_h \cdot \nabla \overline{b}^\alpha , \end{aligned}$$
(4.4)

where like (2.15),

$$\begin{aligned} \overline{\textbf{u}}^\alpha =\nabla ^\perp (1-\alpha \Delta )^{-1}(\Delta -1)^{-1}(\omega -f), \quad \textbf{u}_h =\frac{1}{2} \nabla ^\perp h, \end{aligned}$$
(4.5)

and where \(\omega \) (and not \(\omega ^\alpha \)) satisfies the original potential vorticity equation (1.3), (1.4), where \(\omega \) satisfies the elliptic equation in TQG. This means that

$$\begin{aligned} \overline{\textbf{u}}^\alpha&=\nabla ^\perp (1-\alpha \Delta )^{-1}(\Delta -1)^{-1}(\omega -f) =\nabla ^\perp (1-\alpha \Delta )^{-1}\psi \end{aligned}$$

with

$$\begin{aligned} \textbf{u}=\nabla ^\perp \psi \end{aligned}$$

Then we obtain for any \(k\ge 0\),

$$\begin{aligned} \Vert \overline{\textbf{u}}^\alpha \Vert _{k,2}^2&= \sum _{\xi \in \mathbb {Z}^2} \frac{(1+|\xi |^2)^{k}|\xi |^2}{(1+\alpha |\xi |^2)^2}|\hat{\psi }|^2 = \sum _{\xi \in \mathbb {Z}^2} \frac{(1+|\xi |^2)^{k}}{(1+\alpha |\xi |^2)^2}|\hat{\textbf{u}}|^2 \\&\le \sum _{\xi \in \mathbb {Z}^2} (1+|\xi |^2)^{k} |\hat{\textbf{u}}|^2 = \Vert \textbf{u} \Vert _{k,2}^2 \end{aligned}$$

So for solution \(\textbf{u}\) in hand, we obtain a solution \(\overline{\textbf{u}}^\alpha \). The existence of a unique maximal solution to the above intermediate system, denoted by \(\textbf{i} \alpha \)-TQG for short, follows from the proof of existence for the original TQG and it particular, for a uniform-in-\(\alpha \) set of data \((\textbf{u}_h,f,b_0,\omega _0)\), we have the following uniform-in-\(\alpha \) estimates

$$\begin{aligned}&\sup _{\alpha>0}\sup _{t\in [0,\overline{T}^\alpha _{\max } )}\Vert \overline{b}^\alpha \Vert _{3,2}\lesssim 1,\nonumber \\&\sup _{\alpha >0}\sup _{t\in [0,\overline{T}^\alpha _{\max })}\Vert \overline{\omega }^\alpha \Vert _{2,2}\lesssim 1 \end{aligned}$$
(4.6)

for the maximal solution of (4.3)–(4.5). Indeed, exactly as in Theorem 2.15, we also have the following result.

Lemma 4.2

Fix \(\alpha >0\). There exist a unique maximal solution \((\overline{b}^\alpha ,\overline{\omega }^\alpha , \overline{T}^\alpha _{\max })\) of (4.3)–(4.5) under Assumption 2.4. In particular, uniformly in \(\alpha >0\), the inequality

$$\begin{aligned} \Vert (\overline{b}^\alpha , \overline{\omega }^\alpha )(t) \Vert _{\mathcal {M}} \lesssim 1 \end{aligned}$$
(4.7)

holds for \(t<\overline{T}^\alpha _{\max }\).

Remark 4.3

Since \(1\le (1+\alpha |\xi |^2)^2\) holds independently of \(\alpha >0\), it follows that

$$\begin{aligned} \Vert \overline{\textbf{u}}^\alpha \Vert _{k+1,2}^2&= \sum _{\xi \in \mathbb {Z}^2} \frac{(1+|\xi |^2)^{k+1}|\xi |^2}{(1+|\xi |^2)^2(1+\alpha |\xi |^2)^2}|\hat{\omega }-\hat{f}|^2\\&\le \sum _{\xi \in \mathbb {Z}^2} \frac{(1+|\xi |^2)^{k+1}|\xi |^2}{(1+|\xi |^2)^2}|\hat{\omega }-\hat{f}|^2\\&\le \sum _{\xi \in \mathbb {Z}^2} (1+|\xi |^2)^k|\hat{\omega }-\hat{f}|^2 = \Vert \omega -f \Vert _{k,2}^2 \end{aligned}$$

Therefore, Lemma 2.2 also holds for the \(\textbf{i}\alpha \)-TQG model uniformly in \(\alpha \). Thus, the same analysis for the ‘normal’ TQG holds verbatim for the alpha TQG since the only place we find alpha is within the elliptic equation linking \(\overline{\textbf{u}}^\alpha \) with \(\omega \).

As a next step, we show that any family \((\overline{b}^\alpha ,\overline{\omega }^\alpha , \overline{T}^\alpha _{\max })_{\alpha >0}\) of maximal solutions to the intermediate \(\textbf{i}\alpha \)-TQG model (4.3)–(4.5) (rather than of the \(\alpha \)-TQG model (2.12)–(2.14)) converges strongly to the unique maximal solution of the TQG model (1.1)–(1.5) as \(\alpha \rightarrow 0\) on the time interval [0, T] where \(T<\min \{\overline{T}^\alpha _{\max }, T_{\max } \}\). In order to achieve this goal, we replicate the uniqueness argument in Sect. 3.3 by setting \(b^{12}:= \overline{b}^\alpha -b\), \(\omega ^{12}:= \overline{\omega }^\alpha -\omega \) and \(\textbf{u}^{12}:= \overline{\textbf{u}}^\alpha -\textbf{u}\) so that \((b^{12}, \textbf{u}^{12}, \omega ^{12})\) satisfies

$$\begin{aligned}&\frac{\partial }{\partial t} b^{12} + \textbf{u}^{12} \cdot \nabla \overline{b}^\alpha + \textbf{u}\cdot \nabla b^{12} =0,\end{aligned}$$
(4.8)
$$\begin{aligned}&\frac{\partial }{\partial t}\omega ^{12} + \textbf{u}^{12}\cdot \nabla ( \overline{\omega }^\alpha - \overline{b}^\alpha ) +\textbf{u}\cdot \nabla (\omega ^{12} - b^{12}) = -\textbf{u}_h \cdot \nabla b^{12}. \end{aligned}$$
(4.9)

Similar to the uniqueness argument, if we apply \(\partial ^\beta \) to the equation for \(b^{12}\) above where now \(\vert \beta \vert \le 1\), we obtain

$$\begin{aligned} \frac{\partial }{\partial t} \partial ^\beta b^{12} + \textbf{u}\cdot \nabla \partial ^\beta b^{12} = S_1 + S_2 -\textbf{u}^{12} \cdot \nabla \partial ^\beta \overline{b}^\alpha , \end{aligned}$$
(4.10)

where

$$\begin{aligned} S_1&:= \textbf{u}^{12} \cdot \partial ^\beta \nabla \overline{b}^\alpha - \partial ^\beta (\textbf{u}^{12} \cdot \nabla \overline{b}^\alpha ) =-\partial \textbf{u}^{12}\cdot \nabla \overline{b}^\alpha , \\ S_2&:=\textbf{u}\cdot \partial ^\beta \nabla b^{12} - \partial ^\beta (\textbf{u}\cdot \nabla b^{12} )=-\partial \textbf{u}\cdot \nabla b^{12} \end{aligned}$$

are such that

$$\begin{aligned} \Vert S_1 \Vert _2 \lesssim \Vert \textbf{u}^{12} \Vert _{1,2}\Vert \overline{b}^\alpha \Vert _{3,2}, \qquad \qquad \Vert S_2 \Vert _2 \lesssim \Vert \textbf{u}\Vert _{3,2}\Vert b^{12} \Vert _{1,2}. \end{aligned}$$
(4.11)

Also,

$$\begin{aligned} \begin{aligned} \Big \vert \big \langle \textbf{u}^{12} \cdot \nabla \partial ^\beta \overline{b}^\alpha \, ,\, \partial ^\beta b^{12} \big \rangle \Big \vert \lesssim \Vert \textbf{u}^{12} \Vert _{1,2} \Vert \overline{b}^\alpha \Vert _{3,2} \Vert b^{12} \Vert _{1,2}. \end{aligned} \end{aligned}$$
(4.12)

Collecting the information above yields the estimate

$$\begin{aligned} \begin{aligned} \frac{\,\textrm{d}}{\,\textrm{d}t} \Vert b^{12} \Vert _{1,2}^2&\lesssim \big ( \Vert \textbf{u}^{12} \Vert _{1,2}\Vert \overline{b}^\alpha \Vert _{3,2}, + \Vert \textbf{u}\Vert _{3,2}\Vert b^{12} \Vert _{1,2} \big )\Vert b^{12} \Vert _{1,2}. \end{aligned} \end{aligned}$$
(4.13)

For the equation for \(\omega ^{12}\), we aim to derive a square-integrable estimate in space. For this, we first note that

$$\begin{aligned}&\Big \vert \big \langle \textbf{u}\cdot \nabla \omega ^{12} \, ,\, \omega ^{12} \big \rangle \Big \vert =0. \end{aligned}$$
(4.14)

Next, we have that

$$\begin{aligned} \Big \vert \big \langle (\textbf{u}-\textbf{u}_h)\cdot \nabla b^{12} \, ,\, \omega ^{12} \big \rangle \Big \vert \lesssim \Vert \textbf{u}-\textbf{u}_h \Vert _\infty \Vert b^{12} \Vert _{1,2} \Vert \omega ^{12} \Vert _{2}. \end{aligned}$$
(4.15)

Finally, we also have that

$$\begin{aligned} \begin{aligned} \Big \vert \big \langle \textbf{u}^{12}\cdot \nabla (\overline{\omega }^\alpha - \overline{b}^\alpha ) \, ,\, \omega ^{12} \big \rangle \Big \vert&\lesssim \Vert \textbf{u}^{12} \Vert _{4} \Vert \nabla (\overline{\omega }^\alpha - \overline{b}^\alpha ) \Vert _{4} \Vert \omega ^{12} \Vert _{2}\\&\lesssim \Vert \textbf{u}^{12} \Vert _{1,2} \Vert \overline{\omega }^\alpha - \overline{b}^\alpha \Vert _{2,2} \Vert \omega ^{12} \Vert _{2}. \end{aligned} \end{aligned}$$
(4.16)

Therefore,

$$\begin{aligned} \begin{aligned} \frac{\,\textrm{d}}{\,\textrm{d}t} \Vert \omega ^{12} \Vert _{2}^2&\lesssim \big (\Vert \textbf{u}-\textbf{u}_h \Vert _\infty \Vert b^{12} \Vert _{1,2} + \Vert \textbf{u}^{12} \Vert _{1,2} \Vert \overline{\omega }^\alpha - \overline{b}^\alpha \Vert _{2,2} \big ) \Vert \omega ^{12} \Vert _{2}. \end{aligned} \end{aligned}$$
(4.17)

Summing up with the estimate for \(b^{12}\) then yields

$$\begin{aligned} \begin{aligned} \frac{\,\textrm{d}}{\,\textrm{d}t}\big ( \Vert b^{12} \Vert _{1,2}^2 + \Vert \omega ^{12} \Vert _{2}^2 \big )&\lesssim \big ( \Vert \textbf{u}^{12} \Vert _{1,2}\Vert \overline{b}^\alpha \Vert _{3,2}, + \Vert \textbf{u}\Vert _{3,2}\Vert b^{12} \Vert _{1,2} \big )\Vert b^{12} \Vert _{1,2}\\&+ \big (\Vert \textbf{u}-\textbf{u}_h \Vert _\infty \Vert b^{12} \Vert _{1,2} + \Vert \textbf{u}^{12} \Vert _{1,2} \Vert \overline{\omega }^\alpha - \overline{b}^\alpha \Vert _{2,2} \big ) \Vert \omega ^{12} \Vert _{2}. \end{aligned} \end{aligned}$$
(4.18)

If we use (2.7) and (4.7), we obtain

$$\begin{aligned} \begin{aligned} \frac{\,\textrm{d}}{\,\textrm{d}t}\big ( \Vert b^{12} \Vert _{1,2}^2 + \Vert \omega ^{12} \Vert _{2}^2 \big )&\lesssim \big ( \Vert \textbf{u}^{12} \Vert _{1,2} + \Vert b^{12} \Vert _{1,2} \big )\Vert b^{12} \Vert _{1,2}\\&\quad + \big ( \Vert b^{12} \Vert _{1,2} + \Vert \textbf{u}^{12} \Vert _{1,2} \big ) \Vert \omega ^{12} \Vert _{2}\\&\lesssim \Vert \textbf{u}^{12} \Vert _{1,2}^2 + \Vert b^{12} \Vert _{1,2}^2 + \Vert \omega ^{12} \Vert _{2}^2. \end{aligned} \end{aligned}$$
(4.19)

Now recall that the velocity fields for the \(\alpha \)-TQG, the \(\textbf{i}\alpha \)-TQG, and the TQG are given by

$$\begin{aligned} {\textbf{u}}^\alpha&=\nabla ^\perp (1-\alpha \Delta )^{-1}(\Delta -1)^{-1}(\omega ^\alpha -f), \end{aligned}$$
(4.20)
$$\begin{aligned} \overline{\textbf{u}}^\alpha&=\nabla ^\perp (1-\alpha \Delta )^{-1}(\Delta -1)^{-1}(\omega -f), \end{aligned}$$
(4.21)
$$\begin{aligned} \textbf{u}&=\nabla ^\perp (\Delta -1)^{-1}(\omega -f), \end{aligned}$$
(4.22)

respectively. As a result, in particular, the difference

$$\begin{aligned} \overline{\textbf{u}}^\alpha -\textbf{u}=\alpha \Delta \nabla ^\perp (1-\alpha \Delta )^{-1}(\Delta -1)^{-1}(\omega -f) \end{aligned}$$

enjoys two extra order of regularity. Furthermore, since each of these individual velocity fields satisfies the bound (2.7), it immediately follows that

$$\begin{aligned} \begin{aligned} \Vert \overline{\textbf{u}}^\alpha -\textbf{u}\Vert _{1,2}^2 \lesssim \alpha ^2 \Vert \textbf{u}\Vert _{3,2}^2 \lesssim \alpha ^2 \Vert \omega - f \Vert _{2,2}^2 \lesssim \alpha ^2\big ( \Vert \omega \Vert _{2,2}^2 + \Vert f \Vert _{2,2}^2 \big ) \lesssim \alpha ^2. \end{aligned} \end{aligned}$$
(4.23)

This means that for the comparison of the TQG and \(\textbf{i}\alpha \)-TQG, we can conclude from (4.19) that

$$\begin{aligned} \begin{aligned} \frac{\,\textrm{d}}{\,\textrm{d}t}\big ( \Vert \overline{b}^\alpha -b \Vert _{1,2}^2&+ \Vert \overline{\omega }^\alpha -\omega \Vert _{2}^2 \big ) \lesssim \alpha ^2 + \Vert \overline{b}^\alpha -b \Vert _{1,2}^2 + \Vert \overline{\omega }^\alpha -\omega \Vert _{2}^2 \end{aligned} \end{aligned}$$
(4.24)

and by Grönwall’s lemma, (note that \(b^{12}_0=0\) and \(\omega ^{12}_0=0\))

$$\begin{aligned} \begin{aligned} \Vert (\overline{b}^\alpha -b)(t) \Vert _{1,2}^2&+ \Vert (\overline{\omega }^\alpha -\omega )(t) \Vert _{2}^2 \lesssim \alpha ^2 T\exp (cT), \qquad t\in [0,T]. \end{aligned} \end{aligned}$$
(4.25)

The above gives the decay rate of the difference of the solution to the \(\textbf{i}\alpha \)-TQG and TQG equations. Our next goal is to obtain a decay rate for the difference between the \(\textbf{i}\alpha \)-TQG and the \(\alpha \)-TQG equations. Recall that they share the same initial data. For this, we use the estimate

$$\begin{aligned} \Vert \overline{\textbf{u}}^\alpha - {\textbf{u}}^\alpha \Vert _{1,2}^2 \lesssim \Vert \omega - \omega ^\alpha \Vert _{2}^2 \lesssim \Vert \overline{\omega }^\alpha - \omega \Vert _{2}^2 + \Vert \overline{\omega }^\alpha - \omega ^\alpha \Vert _{2}^2 \end{aligned}$$

so that by setting \(b^{12}:= \overline{b}^\alpha -b^\alpha \), \(\omega ^{12}:= \overline{\omega }^\alpha -\omega ^\alpha \) and \(\textbf{u}^{12}:= \overline{\textbf{u}}^\alpha -{\textbf{u}}^\alpha \), we obtain from (4.19),

$$\begin{aligned} \begin{aligned} \frac{\,\textrm{d}}{\,\textrm{d}t}\big ( \Vert b^{12} \Vert _{1,2}^2 + \Vert \omega ^{12} \Vert _{2}^2 \big )&\lesssim \Vert \textbf{u}^{12} \Vert _{1,2}^2 + \Vert b^{12} \Vert _{1,2}^2 + \Vert \omega ^{12} \Vert _{2}^2\\&\lesssim \Vert \overline{\omega }^\alpha -\omega \Vert _{2}^2 + \Vert b^{12} \Vert _{1,2}^2 + \Vert \omega ^{12} \Vert _{2}^2. \end{aligned} \end{aligned}$$
(4.26)

Since both systems share the same initial conditions, by Grönwall’s lemma and (4.25), it follows that

$$\begin{aligned} \begin{aligned} \Vert (\overline{b}^\alpha -b^\alpha )(t) \Vert _{1,2}^2 + \Vert (\overline{\omega }^\alpha -\omega ^\alpha )(t) \Vert _{2}^2&\le \exp (cT)\int _0^T c\,\Vert (\overline{\omega }^\alpha -\omega )(t) \Vert _2^2 \,\textrm{d}t\\&\lesssim \alpha ^2 T\big [\exp (cT) \big ]^2. \end{aligned} \end{aligned}$$
(4.27)

By the triangle inequality, it follows from (4.25) and (4.27) that

$$\begin{aligned} \begin{aligned} \Vert ({b}^\alpha -b)(t) \Vert _{1,2}^2&+ \Vert ({\omega }^\alpha -\omega )(t) \Vert _{2}^2 \lesssim \alpha ^2 T\exp (cT)\big [1+\exp (cT) \big ] \end{aligned} \end{aligned}$$
(4.28)

so that

$$\begin{aligned} \sup _{t\in [0,T]}\Vert ({b}^\alpha - b)(t) \Vert _{1,2} \rightarrow 0 , \quad \qquad \sup _{t\in [0,T]}\Vert ({\omega }^\alpha - \omega )(t) \Vert _{2}\rightarrow 0 \end{aligned}$$

as \(\alpha \rightarrow 0\). This ends the proof.

Remark 4.4

We conjecture that Proposition 4.1 may be extended to the stronger space \(W^{2,2}(\mathbb {T}^2) \times W^{1,2}(\mathbb {T}^2)\) in which we showed the stability and uniqueness of the maximal solution. The cost, however, may be the loss of the corresponding decay rate (4.1). The ideas in the proof of Proposition 6 of Terence Tao’s note: https://terrytao.wordpress.com/2018/10/09/254a-notes-3-local-well-posedness-for-the-euler-equations/#more-10770 may suffice.

5 Conditions for Blow-up of the \(\alpha \)-TQG Model

In the following, we give Beale et al. (1984) (BKM)-type conditions under which we expect the strong solution of the \(\alpha \)-TQG to blow-up.

Theorem 5.1

Fix \(\alpha >0\). Under Assumption 2.4, let \((b^\alpha , \omega ^\alpha , T_{\max }^\alpha ) \) be a maximal solution of (2.12)–(2.14). If \(T_{\max }^\alpha <\infty \), then

$$\begin{aligned} \int _0^{T_{\max }^\alpha } \Vert \nabla b^\alpha \Vert _{\infty }\, \textrm{d}t= \infty . \end{aligned}$$

Proof

First of all, fix \(\alpha >0\) and let \(T_{\max }^\alpha >0\) be the maximal time so that

$$\begin{aligned} \limsup _{T_n\rightarrow T_{\max }^\alpha } \Vert (b^\alpha ,\omega ^\alpha )(T_n) \Vert _{\mathcal {M}}^2 =\infty . \end{aligned}$$
(5.1)

We now suppose that

$$\begin{aligned} \int _0^{T_{\max }^\alpha } \Vert \nabla b^\alpha \Vert _{\infty } \, \textrm{d}t= K<\infty \end{aligned}$$
(5.2)

and show that

$$\begin{aligned} \Vert (b^\alpha ,\omega ^\alpha )(t) \Vert _{\mathcal {M}}^2 \lesssim 1, \qquad t<T_{\max }^\alpha \end{aligned}$$
(5.3)

holds yielding a contradiction to (5.1).

Before we show the estimate (5.3), we first require preliminary estimates for \(\Vert \nabla \textbf{u}^\alpha \Vert _\infty \) and \(\Vert \omega ^\alpha \Vert _2\). To obtain these estimates, we note that the Fourier multiplier \(m^\alpha (\xi )\) defined below satisfies the bound

$$\begin{aligned} m^\alpha (\xi ):=\frac{(1+\vert \xi \vert ^2)\vert \xi \vert ^2}{(1+\alpha \vert \xi \vert ^2)^2} \lesssim _\alpha 1 \end{aligned}$$
(5.4)

for all fixed \(\alpha >0\) and in particular, the bound may be taken uniformly of all \(\alpha \ge 1\). Due to (5.4), the continuous embedding \(W^{3,2}(\mathbb {T}^2)\hookrightarrow W^{1,\infty }(\mathbb {T}^2)\) and Plancherel’s identity, we can conclude that

$$\begin{aligned} \Vert \nabla \textbf{u}^\alpha \Vert _\infty \lesssim \Vert \textbf{u}^\alpha \Vert _{3,2}\lesssim _\alpha \Vert \omega ^\alpha - f \Vert _2 \lesssim \Vert \omega ^\alpha - f \Vert _{2,2}. \end{aligned}$$
(5.5)

On the other hand, if we test (2.13) with \(\omega ^\alpha \) and use (2.7) for \(w=(1-\alpha \Delta )^{-1}(\omega ^\alpha -f)\) and \(k=0\), we obtain

$$\begin{aligned} \begin{aligned} \frac{\,\textrm{d}}{\,\textrm{d}t} \Vert \omega ^\alpha \Vert _{2}^2&\lesssim \Big ( \Vert \textbf{u}^\alpha \Vert _{2}\Vert \omega ^\alpha \Vert _{2} + \Vert \textbf{u}_h \Vert _{2} \Vert \omega ^\alpha \Vert _{2} \Big ) \Vert \nabla b^\alpha \Vert _{\infty } \lesssim \Big ( 1 + \Vert \omega ^\alpha \Vert _{2}^2 \Big ) \Vert \nabla b^\alpha \Vert _{\infty } \end{aligned} \end{aligned}$$
(5.6)

for a constant depending only on \(\Vert \textbf{u}_h\Vert _2\) and \(\Vert f \Vert _2\). It therefore follows from (5.6) and (5.2) that

$$\begin{aligned} \Vert \omega ^\alpha (t) \Vert _2^2\lesssim \big (1+ \Vert \omega _0\Vert _2^2 \big )\exp (cK), \qquad t<T_{\max }^\alpha \end{aligned}$$
(5.7)

which when combined with the first estimate in (5.5) yields

$$\begin{aligned} \Vert \nabla \textbf{u}^\alpha (t) \Vert _\infty \lesssim _\alpha 1+\big (1+ \Vert \omega _0\Vert _2^2 \big )\exp (cK) \lesssim _\alpha \big (1+ \Vert \omega _0\Vert _2^2 \big )\exp (cK), \qquad t<T_{\max }^\alpha . \end{aligned}$$
(5.8)

Recall that \(f\in W^{2,2}(\mathbb {T}^2)\) by assumption.

With these preliminary estimates in hand, we can proceed to derive estimates for \((b^\alpha ,\omega ^\alpha )\) in the space \(\mathcal {M}\) of existence of a solution. Since the space of smooth functions is dense in \(\mathcal {M}\), it suffices to show our result for a smooth solution pair \((b^\alpha ,\omega ^\alpha )\).

To achieve our goal, we apply \(\partial ^\beta \) to (2.12) for \(\vert \beta \vert \le 3\) to obtain

$$\begin{aligned} \frac{\partial }{\partial t}\partial ^\beta b^\alpha + \textbf{u}^\alpha \cdot \nabla \partial ^\beta b^\alpha = R_1 \end{aligned}$$
(5.9)

where

$$\begin{aligned} R_1:= \textbf{u}^\alpha \cdot \partial ^\beta \nabla b^\alpha - \partial ^\beta (\textbf{u}^\alpha \cdot \nabla b^\alpha ). \end{aligned}$$

Now since \(\textrm{div}\textbf{u}^\alpha =0\), if we multiply (5.9) by \(\partial ^\beta b^\alpha \) and sum over the multiindex \(\beta \) so that \(\vert \beta \vert \le 3\), we obtain

$$\begin{aligned} \begin{aligned} \frac{\,\textrm{d}}{\,\textrm{d}t} \Vert b^\alpha \Vert _{3,2}^2&\lesssim \Big ( \Vert \nabla \textbf{u}^\alpha \Vert _{\infty }\Vert b^\alpha \Vert _{3,2} + \Vert \nabla b^\alpha \Vert _{\infty } \Vert \textbf{u}^\alpha \Vert _{3,2} \Big ) \Vert b^\alpha \Vert _{3,2}\\&\lesssim (\Vert \nabla \textbf{u}^\alpha \Vert _{\infty } + \Vert \nabla b^\alpha \Vert _{\infty })(1+\Vert (b^\alpha , \omega ^\alpha ) \Vert _{\mathcal {M}}^2) \end{aligned} \end{aligned}$$
(5.10)

where we have used (2.7) for \(w=(1-\alpha \Delta )^{-1}(\omega ^\alpha -f)\) and \(k=2\).

Next, we find a bound for \(\Vert \omega ^\alpha \Vert ^2_{2,2}\). For this, we apply \(\partial ^\beta \) to (2.13) for \(\vert \beta \vert \le 2\) and we obtain

$$\begin{aligned} \frac{\partial }{\partial t}\partial ^\beta \omega ^\alpha + \textbf{u}^\alpha \cdot \nabla \partial ^\beta ( \omega ^\alpha -b^\alpha ) +\textbf{u}_h \cdot \nabla \partial ^\beta b^\alpha = R_2 + R_3+R_4 \end{aligned}$$
(5.11)

where

$$\begin{aligned} R_2&:= \textbf{u}^\alpha \cdot \partial ^\beta \nabla \omega ^\alpha - \partial ^\beta (\textbf{u}^\alpha \cdot \nabla \omega ^\alpha ), \\ R_3&:= -\textbf{u}^\alpha \cdot \partial ^\beta \nabla b^\alpha + \partial ^\beta (\textbf{u}^\alpha \cdot \nabla b^\alpha ), \\ R_4&:= \textbf{u}_h\cdot \partial ^\beta \nabla b^\alpha -\partial ^\beta (\textbf{u}_h \cdot \nabla b^\alpha ) \end{aligned}$$

are such that by using Lemma 2.3, Sobolev embeddings, the second inequality in (5.5) (to obtain the term \(1+\Vert \omega ^\alpha \Vert _2\) in \(R_2\) below) and (2.7), the estimates

$$\begin{aligned} \big \vert \big \langle R_2 \, ,\,\partial ^\beta \omega ^\alpha \big \rangle \big \vert&\lesssim \Vert R_2 \Vert _2 \Vert \omega ^\alpha \Vert _{2,2} \nonumber \\&\lesssim \Big (\Vert \nabla \textbf{u}^\alpha \Vert _{\infty }\Vert \omega ^\alpha \Vert _{2,2} +\Vert \omega ^\alpha \Vert _{4} \Vert \textbf{u}^\alpha \Vert _{2,4} \Big )\Vert \omega ^\alpha \Vert _{2,2} \nonumber \\&\lesssim \Big (\Vert \nabla \textbf{u}^\alpha \Vert _{\infty }\Vert \omega ^\alpha \Vert _{2,2} +\Vert \omega ^\alpha \Vert _{2,2} \Vert \textbf{u}^\alpha \Vert _{3,2} \Big )\Vert \omega ^\alpha \Vert _{2,2} \nonumber \\&\lesssim \Big (\Vert \nabla \textbf{u}^\alpha \Vert _{\infty }\Vert \omega ^\alpha \Vert _{2,2} +\Vert \omega ^\alpha \Vert _{2,2} (1+\Vert \omega ^\alpha \Vert _{2}) \Big )\Vert \omega ^\alpha \Vert _{2,2} , \end{aligned}$$
(5.12)
$$\begin{aligned} \big \vert \big \langle R_3 \, ,\,\partial ^\beta \omega ^\alpha \big \rangle \big \vert&\lesssim \Vert R_3 \Vert _2 \Vert \omega ^\alpha \Vert _{2,2} \nonumber \\&\lesssim \Big (\Vert \nabla \textbf{u}^\alpha \Vert _{\infty } \Vert b^\alpha \Vert _{3,2} + \Vert \nabla b^\alpha \Vert _{\infty } \Vert \textbf{u}^\alpha \Vert _{2,2}) \Vert \omega ^\alpha \Vert _{2,2} \nonumber \\&\lesssim \Big (\Vert \nabla \textbf{u}^\alpha \Vert _{\infty } \Vert b^\alpha \Vert _{3,2} + \Vert \nabla b^\alpha \Vert _{\infty } (1+\Vert \omega ^\alpha \Vert _{2,2}) \Big )\Vert \omega ^\alpha \Vert _{2,2}, \end{aligned}$$
(5.13)
$$\begin{aligned} \big \vert \big \langle R_4 \, ,\,\partial ^\beta \omega ^\alpha \big \rangle \big \vert&\lesssim \Vert R_4 \Vert _2 \Vert \omega ^\alpha \Vert _{2,2}\nonumber \\&\lesssim \Big (\Vert \nabla \textbf{u}_h\Vert _\infty \Vert b^\alpha \Vert _{3,2} + \Vert \nabla b^\alpha \Vert _{\infty } \Vert \textbf{u}_h\Vert _{2,2}\Big )\Vert \omega ^\alpha \Vert _{2,2} \nonumber \\&\lesssim \Big (\Vert b^\alpha \Vert _{3,2} + \Vert \nabla b^\alpha \Vert _{\infty }\Big )\Vert \omega ^\alpha \Vert _{2,2} \end{aligned}$$
(5.14)

hold true. Recall that \(\textbf{u}_h \in W^{3,2}(\mathbb {T}^2)\) by assumption. Next, by using \(\textrm{div}\textbf{u}^\alpha =0\), we have the following identity

$$\begin{aligned} \big \langle (\textbf{u}^\alpha \cdot \nabla \partial ^\beta \omega ^\alpha )\, ,\,\partial ^\beta \omega ^\alpha \big \rangle =\frac{1}{2}\int _{\mathbb {T}^2}\textrm{div}(\textbf{u}^\alpha \vert \partial ^\beta \omega ^\alpha \vert ^2)\,\textrm{d}x=0. \end{aligned}$$
(5.15)

Additionally, the following estimates for \(L^2\) inner products hold true

$$\begin{aligned} \Big \vert \big \langle ( \textbf{u}^\alpha \cdot \nabla \partial ^\beta b^\alpha ) \, ,\,\partial ^\beta \omega ^\alpha \big \rangle \Big \vert&\lesssim \Vert \textbf{u}^\alpha \Vert _\infty \Vert b^\alpha \Vert _{3,2}^2 + \Vert \textbf{u}^\alpha \Vert _\infty \Vert \omega ^\alpha \Vert _{2,2}^2, \end{aligned}$$
(5.16)
$$\begin{aligned} \Big \vert \big \langle ( \textbf{u}_h\cdot \nabla \partial ^\beta b^\alpha ) \, ,\,\partial ^\beta \omega ^\alpha \big \rangle \Big \vert&\lesssim \Vert b^\alpha \Vert _{3,2}^2 + \Vert \omega ^\alpha \Vert _{2,2}^2. \end{aligned}$$
(5.17)

since \(\textbf{u}_h \in W^{3,2}(\mathbb {T}^2)\). If we now collect the estimates above (keeping in mind that \(f\in W^{2,2}(\mathbb {T}^2)\) and \(\textbf{u}_h \in W^{3,2}(\mathbb {T}^2)\)), we obtain by multiplying (5.11) by \(\partial ^\beta \omega ^\alpha \) with \(\vert \beta \vert \le 2\) and then summing over \(\vert \beta \vert \le 2\), the following

$$\begin{aligned} \begin{aligned} \frac{\,\textrm{d}}{\,\textrm{d}t} \Vert \omega ^\alpha \Vert _{2,2}^2&\lesssim \big ( 1+\Vert \textbf{u}^\alpha \Vert _{1,\infty } + \Vert \nabla b^\alpha \Vert _{\infty } + \Vert \omega ^\alpha \Vert _2 \big )\big (1+\Vert (b^\alpha , \omega ^\alpha ) \Vert _{\mathcal {M}}^2\big ). \end{aligned} \end{aligned}$$
(5.18)

Summing up (5.10) and (5.18) and using (5.7)–(5.5) yields

$$\begin{aligned} \begin{aligned} \frac{\,\textrm{d}}{\,\textrm{d}t} (1+\Vert (b^\alpha , \omega ^\alpha ) \Vert _{\mathcal {M}}^2)&\lesssim _\alpha \Big (\big (1+ \Vert \omega _0\Vert _2^2 \big )\exp (cK) + \Vert \nabla b^\alpha \Vert _{\infty }\Big )(1+\Vert (b^\alpha , \omega ^\alpha ) \Vert _{\mathcal {M}}^2) \end{aligned} \end{aligned}$$
(5.19)

so that by Grönwall’s lemma and (5.2), we obtain

$$\begin{aligned} \begin{aligned} \Vert (b^\alpha ,\omega ^\alpha )(t) \Vert _{\mathcal {M}}^2&\le \, \big [ 1 {+}\Vert (b_0, \omega _0) \Vert _{\mathcal {M}}^2\big ] \exp \Big (c(\alpha )T_{\max }\big (1{+} \Vert \omega _0\Vert _2^2 \big )\exp (cK) {+} c(\alpha )K\Big )\\&\, \lesssim _{T_{\max }^\alpha ,K,f,\textbf{u}_h,\omega _0,b_0,\alpha }1 \end{aligned} \end{aligned}$$
(5.20)

for all \(t<{T_{\max }^\alpha }\) contradicting (5.1). \(\square \)

Remark 5.2

(Global existence for constant-in-space buoyancy) We note that for the \(\alpha \)-TQG model, when the buoyancy is constant in space so that \(\nabla b^\alpha =0\), then the equation for the buoyancy decouples from that of the potential vorticity. The potential vorticity equation then reduces to the two-dimensional Euler equation albeit a different Biot–Savart law. Nevertheless, by inspecting the proof of Theorem 5.1 above, one observes that the same global-in-time result applies. In particular, the residual estimate (5.12) still holds true and we obtain from this, a refine version of (5.7)–(5.8) where \(\exp (c K)=1\). Recall from (5.2) that \(K=0\) when \(\nabla b^\alpha =0\).

Remark 5.3

(Global existence for super diffusive buoyancy) Fix \(\alpha >0\). Note that by adding any super diffusive term \((-\Delta )^{\beta /2}b^\alpha \), \(\beta >2\) to the right-hand side of (2.12) so that

$$\begin{aligned} \int _0^t \Vert \nabla b^\alpha (s) \Vert _\infty \,\textrm{d}s \lesssim \int _0^t\Vert (-\Delta )^{\beta /2} b^\alpha (s) \Vert _2^2 \,\textrm{d}s \end{aligned}$$

holds for all \(t<T_{\max }^\alpha \), then we obtain a global solution since in this case, we obtain the energy estimate

$$\begin{aligned} \Vert b^\alpha (t) \Vert _2^2 + \int _0^t\Vert (-\Delta )^{\beta /2} b^\alpha (s) \Vert _2^2 \,\textrm{d}s \le \Vert b_0\Vert _2 \le \Vert b_0\Vert _{3,2} \end{aligned}$$

for all \(t<T_{\max }^\alpha \).

Next, we also give an alternative to the blow-up condition in Theorem 5.1 above.

Theorem 5.4

Fix \(\alpha >0\). Under Assumption 2.4, let \((b^\alpha , \omega ^\alpha , T_{\max }^\alpha ) \) be a maximal solution of (2.12)–(2.14). If \(T_{\max }^\alpha <\infty \), then

$$\begin{aligned} \int _0^{T_{\max }^\alpha } \Vert \nabla \textbf{u}^\alpha \Vert _{\infty }\, \textrm{d}t= \infty . \end{aligned}$$

Proof

In analogy with the proof of Theorem 5.1, we fix \(\alpha >0\), let \(T_{\max }^\alpha >0\) be the maximal time so that

$$\begin{aligned} \limsup _{T_n\rightarrow T_{\max }^\alpha } \Vert (b^\alpha ,\omega ^\alpha )(T_n) \Vert _{\mathcal {M}}^2 =\infty . \end{aligned}$$
(5.21)

We now suppose that

$$\begin{aligned} \int _0^{T_{\max }^\alpha } \Vert \nabla \textbf{u}^\alpha \Vert _{\infty } \, \textrm{d}t= K<\infty \end{aligned}$$
(5.22)

and show that

$$\begin{aligned} \Vert (b^\alpha ,\omega ^\alpha )(t) \Vert _{\mathcal {M}}^2 \lesssim 1, \qquad t<T_{\max }^\alpha \end{aligned}$$
(5.23)

holds, yielding a contradiction to (5.21).

To show this, we first fix \(\alpha >0\). If we differentiate (2.12) in space, multiply the resulting equation by \(p\vert \nabla b^\alpha \vert ^{p-2}\nabla b^\alpha \) where \(p>1\) is fixed and finite in this instant, and then integrate over \(\mathbb {T}^2\), we obtain

$$\begin{aligned} \begin{aligned} \frac{\,\textrm{d}}{\, \textrm{d}t}\Vert \nabla b^\alpha \Vert _p^p \le p \Vert \nabla \textbf{u}^\alpha \Vert _\infty \Vert \nabla b^\alpha \Vert _p^p. \end{aligned} \end{aligned}$$
(5.24)

Rather than use Grönwall’s lemma at this point, we use the analogous separation of variable technique to obtain

$$\begin{aligned} \frac{1}{p}\ln \Vert \nabla b^\alpha (t)\Vert _p^p \le \frac{1}{p}\ln \Vert \nabla b_0\Vert _p^p + \int _0^t \Vert \nabla \textbf{u}^\alpha (s)\Vert _\infty \,\textrm{d}s \end{aligned}$$

so that

$$\begin{aligned} \Vert \nabla b^\alpha (t)\Vert _p \le \Vert \nabla b_0\Vert _p \exp \bigg (\int _0^t \Vert \nabla \textbf{u}^\alpha (s)\Vert _\infty \,\textrm{d}s \bigg ) \end{aligned}$$

holds for all \(t< T_{\max }^\alpha \) uniformly in \(p>1\). Since \(b_0 \in W^{3,2}(\mathbb {T}^2)\), by Sobolev embedding and (5.22), we obtain

$$\begin{aligned} \Vert \nabla b^\alpha (t)\Vert _\infty \lesssim \Vert b_0\Vert _{3,2} \exp (K ) \end{aligned}$$
(5.25)

for all \(t< T_{\max }^\alpha \) so that

$$\begin{aligned} \int _0^{T_{\max }^\alpha }\Vert \nabla b^\alpha (t)\Vert _\infty \, \textrm{d}t\le c\,T_{\max }^\alpha \Vert b_0\Vert _{3,2} \exp (K ). \end{aligned}$$

We can therefore deduce from the estimate (5.6) that

$$\begin{aligned} \Vert \omega ^\alpha (t) \Vert _2^2\lesssim \big (1+ \Vert \omega _0\Vert _2^2 \big )\exp \big ( cT_{\max }^\alpha \Vert b_0\Vert _{3,2} \exp (K) \big ), \qquad t<T_{\max }^\alpha . \end{aligned}$$
(5.26)

Summing up (5.10) and (5.18) and using (5.25)–(5.26) yields

$$\begin{aligned} \begin{aligned} \frac{\,\textrm{d}}{\,\textrm{d}t} (1+\Vert (b^\alpha , \omega ^\alpha ) \Vert _{\mathcal {M}}^2)&\lesssim \big (1+\Vert \textbf{u}^\alpha \Vert _{1,\infty }+\mathfrak {A}\big ) (1+\Vert (b^\alpha , \omega ^\alpha ) \Vert _{\mathcal {M}}^2) \end{aligned} \end{aligned}$$
(5.27)

where

$$\begin{aligned} \mathfrak {A} := \Vert b_0\Vert _{3,2} \exp (K ) + \big (1+ \Vert \omega _0\Vert _2^2 \big )\exp \big (cT_{\max }^\alpha \Vert b_0\Vert _{3,2} \exp (K) \big ). \end{aligned}$$

We can therefore conclude from Grönwall’s lemma and (5.22) that

$$\begin{aligned} \begin{aligned} \Vert (b^\alpha ,\omega ^\alpha )(t) \Vert _{\mathcal {M}}^2&\le \, \big [ 1 +\Vert (b_0, \omega _0) \Vert _{\mathcal {M}}^2\big ] \exp \Big (cT_{\max }^\alpha +cK + cT_{\max }^\alpha \mathfrak {A}\Big )\\&\lesssim _{T_{\max }^\alpha ,K,f,\textbf{u}_h,\omega _0,b_0}1 \end{aligned} \end{aligned}$$
(5.28)

for all \(t<{T_{\max }^\alpha }\) contradicting (5.21). \(\square \)

In order to give the final blow-up criterion, let us now recall the following endpoint Sobolev inequality by Brezis and Gallouet (1979).

Lemma 5.5

If \(f\in W^{2,2}(\mathbb {T}^2)\) then

$$\begin{aligned} \Vert f \Vert _\infty \lesssim (1+ \Vert f \Vert _{1,2})\sqrt{\ln (\textrm{e}+ \Vert f \Vert _{2,2})}. \end{aligned}$$

Remark 5.6

Note that the original statement in Brezis and Gallouet (1979) restricted the size of \(\Vert f \Vert _{1,2}\) to being at most one. The current form for any size of \(\Vert f \Vert _{1,2}\) follows immediately as demonstrated in, for example, (Wang and Zhang 2011; Dinvay 2020).

With Lemma 5.5 in hand, we can now show the final blow-up condition.

Theorem 5.7

Fix \(\alpha >0\). Let \((b^\alpha , \omega ^\alpha ,T_{\max }^\alpha ) \) be a maximal solution of (2.12)–(2.14). If \(T_{\max }^\alpha <\infty \), then

$$\begin{aligned} \int _0^{T_{\max }^\alpha } \Vert \nabla b^\alpha \Vert _{1,2}\, \textrm{d}t= \infty . \end{aligned}$$

Proof

Fix \(\alpha >0\) and let \(T_{\max }^\alpha >0\) be the maximal time so that

$$\begin{aligned} \limsup _{T_n\rightarrow T_{\max }^\alpha } \Vert (b^\alpha ,\omega ^\alpha )(T_n) \Vert _{\mathcal {M}}^2 =\infty . \end{aligned}$$
(5.29)

We now suppose that

$$\begin{aligned} \int _0^{T_{\max }^\alpha } \Vert \nabla b^\alpha \Vert _{1,2} \, \textrm{d}t= K<\infty \end{aligned}$$
(5.30)

and show that

$$\begin{aligned} \Vert (b^\alpha ,\omega ^\alpha )(t) \Vert _{\mathcal {M}}^2 \lesssim 1, \qquad t<T_{\max }^\alpha \end{aligned}$$
(5.31)

holds and thereby yields a contradiction to (5.29).

To show this, we first fix \(\alpha >0\) and define

$$\begin{aligned} g(t) := \textrm{e}+ \Vert (b^\alpha ,\omega ^\alpha )(t) \Vert _{\mathcal {M}}^2, \qquad t>0. \end{aligned}$$

With this definition in hand, it follows from estimates (5.10) and (5.18) that

$$\begin{aligned} \frac{\,\textrm{d}}{\, \textrm{d}t} g \lesssim \big ( 1+\Vert \textbf{u}^\alpha \Vert _{1,\infty } + \Vert \omega ^\alpha \Vert _2+ \Vert \nabla b^\alpha \Vert _{\infty } \big ) g. \end{aligned}$$
(5.32)

By using Lemma 5.5 and the monotonic property of logarithms however, we can deduce that

$$\begin{aligned} \Vert \nabla b^\alpha \Vert _\infty \lesssim \big (1+ \Vert \nabla b^\alpha \Vert _{1,2}\big ) \ln \, g. \end{aligned}$$
(5.33)

Next, we observe that the second estimate in (5.5) yields

$$\begin{aligned} 1+\Vert \textbf{u}^\alpha \Vert _{1,\infty }+\Vert \omega ^\alpha \Vert _2 \lesssim _\alpha 1+ \Vert \omega ^\alpha \Vert _2 \end{aligned}$$
(5.34)

for \(f \in L^2(\mathbb {T}^2)\). However, if we test (2.13) with \(\omega ^\alpha \), we also obtain

$$\begin{aligned} \begin{aligned} \frac{\,\textrm{d}}{\,\textrm{d}t} \Vert \omega ^\alpha \Vert _{2}^2&\lesssim \Big ( \Vert \textbf{u}^\alpha \Vert _{\infty }\Vert \omega ^\alpha \Vert _{2} + \Vert \textbf{u}_h \Vert _{\infty } \Vert \omega ^\alpha \Vert _{2} \Big ) \Vert \nabla b^\alpha \Vert _{2} \lesssim \Big ( 1 + \Vert \omega ^\alpha \Vert _{2}^2 \Big ) \Vert \nabla b^\alpha \Vert _{1,2} \end{aligned} \end{aligned}$$
(5.35)

for a constant depending only on \(\Vert \textbf{u}_h\Vert _\infty \) and \(\Vert f \Vert _2\). It therefore follows from (5.30) that

$$\begin{aligned} \Vert \omega ^\alpha (t) \Vert _2^2\lesssim \big (1+ \Vert \omega _0\Vert _2^2 \big )\exp (cK), \qquad t<T_{\max }^\alpha . \end{aligned}$$
(5.36)

Using the fact that \(\ln \, g\ge 1\), we can conclude from (5.34) and (5.36) that

$$\begin{aligned} 1+\Vert \textbf{u}^\alpha \Vert _{1,\infty }+\Vert \omega ^\alpha \Vert _2 \lesssim _\alpha \big (1+ \Vert \omega _0\Vert _2^2 \big )\exp (cK) \ln \, g. \end{aligned}$$
(5.37)

If we now combine this estimate with (5.33), we can conclude from (5.32) that

$$\begin{aligned} \frac{\,\textrm{d}}{\, \textrm{d}t} g \lesssim \big (1+\Vert \omega _0\Vert _{2}+ \Vert \nabla b^\alpha \Vert _{1,2} \big )\exp (cK)\, g \ln \, g \end{aligned}$$

so that

$$\begin{aligned} g(t)\le g(0)^{\exp \int _0^tc(1+\Vert \omega _0 \Vert _{2} + \Vert \nabla b^\alpha (s) \Vert _{1,2}) \exp (cK)\,\textrm{d}s} \end{aligned}$$

holds for \(t<T_{\max }^\alpha \). In particular, given (5.30), we have shown that

$$\begin{aligned} \begin{aligned} \Vert (b^\alpha ,\omega ^\alpha )(t) \Vert _{\mathcal {M}}^2&\le \, \big [ \textrm{e} +\Vert (b^\alpha _0, \omega ^\alpha _0) \Vert _{\mathcal {M}}^2\big ]^{ \exp (c(T_{\max }^\alpha +K) \exp (cK))}\\&\, \lesssim _{T_{\max }^\alpha ,K,f,\textbf{u}_h,\omega _0,b_0}1 \end{aligned} \end{aligned}$$
(5.38)

for all \(t<{T_{\max }^\alpha }\) contradicting (5.29). \(\square \)

6 Numerical Implementation

For numerical implementation, we choose to work in weaker spaces than what the well-posedness theorem dictates. Additionally, we allow for boundary conditions in the numerical setup. We recognise these choices create gaps between the theory and the implementation.

6.1 Discretisation Methods for \(\alpha \)–TQG and TQG

In this subsection, we describe the finite element (FEM) spatial discretisation and finite difference Runge–Kutta time stepping discretisation methods that are utilised for the \(\alpha \)-TQG system. By setting \(\alpha \) to zero, we obtain our numerical setup for the TQG system.

Consider a bounded domain \({{\mathcal {D}}}\). Let \(\partial {{\mathcal {D}}}\) denote the boundary. We impose Dirichlet boundary conditions

$$\begin{aligned} \psi ^\alpha = 0, \ \Delta \psi ^\alpha = 0, \qquad \text {on } \partial {{\mathcal {D}}}. \end{aligned}$$
(6.1)

For \({{\mathcal {D}}}= {\mathbb {T}}^2\), our numerical setup does not change, in which case the boundary flux terms in the discretised equations are set to zero.

6.1.1 The Stream Function Equation

Let \(H^{1}\left( {{\mathcal {D}}}\right) \) denote the Sobolev \(W^{1,2}\left( {{\mathcal {D}}}\right) \) space and let \(\left\| .\right\| _{\partial {{\mathcal {D}}}}\) denote the \(L^{2}\left( \partial {{\mathcal {D}}}\right) \) norm. Define the space

$$\begin{aligned} W^{1}\left( {{\mathcal {D}}}\right) :=\left\{ \nu \in H^{1}\left( {{\mathcal {D}}}\right) \left| \left\| \nu \right\| _{\partial {{\mathcal {D}}}}=0\right. \right\} . \end{aligned}$$
(6.2)

We express (2.14) as two inhomogeneous Helmholtz equations

$$\begin{aligned} \omega ^\alpha - f&= (\Delta -1 ) \tilde{\psi } \end{aligned}$$
(6.3)
$$\begin{aligned} -\tilde{\psi }&= (\alpha \Delta - 1)\psi ^\alpha . \end{aligned}$$
(6.4)

We take \(\tilde{\psi },\psi ^\alpha \in W^1 ({{\mathcal {D}}})\). Note that as \(\alpha \) tends to zero, the Biot–Savart law for \((\omega ^\alpha ,\psi ^\alpha )\) and the Biot–Savart law \((\omega ^\alpha ,\tilde{\psi })\) coincide. This removes the need for the additional boundary condition in (6.1). Since the two equations are of the same form, let us first consider (6.3). Using an arbitrary test function \(\phi \in W^1({{\mathcal {D}}})\), we obtain the following weak form of (6.3),

$$\begin{aligned} \langle \nabla \tilde{\psi },\nabla \phi \rangle _{{\mathcal {D}}}+\langle \tilde{\psi },\phi \rangle _{{\mathcal {D}}}&= -\langle \omega ^\alpha - f, \phi \rangle _{{\mathcal {D}}}. \end{aligned}$$
(6.5)

Define the functionals

$$\begin{aligned} L_\alpha (v, \phi )&:= \alpha \langle \nabla v, \nabla \phi \rangle _{{\mathcal {D}}}+\langle v, \phi \rangle _{{\mathcal {D}}} \end{aligned}$$
(6.6)
$$\begin{aligned} F_{\cdot }(\phi )&:= -\langle \cdot , \phi \rangle _{{\mathcal {D}}} \end{aligned}$$
(6.7)

for \(v, \phi \in W^1({{\mathcal {D}}})\), then (6.5) can be written as

$$\begin{aligned} L_1(\tilde{\psi }, \phi ) = F_{\omega ^\alpha - f}(\phi ). \end{aligned}$$
(6.8)

And similarly for (6.4), we have

$$\begin{aligned} L_\alpha (\psi ^\alpha , \phi ) = F_{-\tilde{\psi }}(\phi ). \end{aligned}$$
(6.9)

We discretise Eqs. (6.8) and (6.9) using a continuous Galerkin (CG) discretisation scheme.

Let \(\delta \) be the discretisation parameter, and let \({{\mathcal {D}}}_\delta \) denote a space filling triangulation of the domain, that consists of geometry-conforming non-overlapping elements. Define the approximation space

$$\begin{aligned} W_\delta ^{k}({{\mathcal {D}}}):=\left\{ \phi _\delta \in W^{1}\left( {{\mathcal {D}}}\right) \ : \ \phi _\delta \in C\left( {{\mathcal {D}}}\right) ,\left. \phi _\delta \right| _{K}\in \Pi ^{k}\left( K\right) \text { each } K \in {{\mathcal {D}}}_\delta \right\} . \end{aligned}$$
(6.10)

in which \(C({{\mathcal {D}}})\) is the space of continuous functions on \({{\mathcal {D}}}\), and \(\Pi ^{k}\left( K\right) \) denotes the space of polynomials of degree at most k on element \(K\in {{\mathcal {D}}}_\delta \).

For (6.8), given \(f_\delta \in W_\delta ^k({{\mathcal {D}}})\) and \(\omega _\delta \in V_\delta ^k({{\mathcal {D}}})\) (see (6.13) for the definition of \(V_\delta ^k({{\mathcal {D}}})\)), our numerical approximation is the solution \(\tilde{\psi }_\delta \in W_\delta ^k({{\mathcal {D}}})\) that satisfies

$$\begin{aligned} L_1(\tilde{\psi }_\delta , \phi _\delta ) = F_{\omega _\delta - f_\delta }(\phi _\delta ) \end{aligned}$$
(6.11)

for all test functions \(\phi _\delta \in W_\delta ^k({{\mathcal {D}}})\). Then, using \(\tilde{\psi }_\delta \), our numerical approximation of \(\psi ^\alpha \) is the solution \(\psi _\delta ^\alpha \in W_\delta ^k({{\mathcal {D}}})\) that satisfies

$$\begin{aligned} L_\alpha (\psi _\delta ^\alpha , \phi _\delta ) = F_{-\tilde{\psi }_\delta } (\phi _\delta ) \end{aligned}$$
(6.12)

for all test functions \(\phi _\delta \in W_\delta ^k({{\mathcal {D}}})\).

For a detailed exposition of the numerical algorithms that solves the discretised problems (6.11) and (6.12), we point the reader to Gibson et al. (2019); Brenner et al. (2008).

6.1.2 Hyperbolic Equations

We choose to discretise the hyperbolic buoyancy (2.12) and potential vorticity (2.13) equations using a discontinuous Galerkin (DG) scheme. For a detailed exposition of DG methods, we refer the interested reader to Hesthaven and Warburton (2007).

Define the DG approximation space, denoted by \(V_\delta ^k({{\mathcal {D}}})\), to be the element-wise polynomial space,

$$\begin{aligned} V_\delta ^k({{\mathcal {D}}}) = \left\{ \left. v_\delta \in L^2({{\mathcal {D}}}) \right| \forall K \in {{\mathcal {D}}}_\delta ,\ \exists \phi _\delta \in \Pi ^k (K):\ \left. v_\delta \right| _{K}=\left. \phi _\delta \right| _K\right\} . \end{aligned}$$
(6.13)

We look to approximate \(b^\alpha \) and \(\omega ^\alpha \) in the space \(V_\delta ^k({{\mathcal {D}}})\). Essentially, this means our approximations of \(b^\alpha \) and \(\omega ^\alpha \) are each an direct sum over the elements in \({{\mathcal {D}}}_\delta \). Additional constraints on the numerical fluxes across shared element boundaries are needed to ensure conservation properties and stability. Further, note that \(W^k_\delta ({{\mathcal {D}}}) \subset V^k_\delta ({{\mathcal {D}}})\). This inclusion is also needed for ensuring numerical conservation laws, see Sect. 6.1.3.

For the buoyancy Eq. (2.12), we obtain the following variational formulation

$$\begin{aligned} \langle \partial _t b^\alpha , \nu _\delta \rangle _{K}&= \langle b^\alpha \textbf{u}^\alpha ,\nabla \nu _\delta \rangle _K - \langle b^\alpha \textbf{u}^\alpha \cdot {\hat{\textbf{n}}},\nu _\delta \rangle _{\partial K},\quad K\in {{\mathcal {D}}}_\delta \end{aligned}$$
(6.14)

where \(\nu _\delta \in V^k_\delta ({{\mathcal {D}}})\) is any test function, \(\partial K\) denotes the boundary of K, and \({\hat{\textbf{n}}}\) denotes the unit normal vector to \(\partial K\). Let \(b^\alpha _\delta \) be the approximation of \(b^\alpha \) in \(V_\delta ^k\), and let \(\textbf{u}^\alpha _\delta = \nabla ^\perp \psi ^\alpha _\delta \) for \(\psi ^\alpha _\delta \in W^k_\delta \). Our discretised buoyancy equation over each element is given by

$$\begin{aligned} \langle \partial _t b^\alpha _\delta , \nu _\delta \rangle _{K} = \langle b^\alpha _\delta \textbf{u}^\alpha _\delta ,\nabla \nu _\delta \rangle _K - \langle b^\alpha _\delta \textbf{u}^\alpha _\delta \cdot {\hat{\textbf{n}}},\nu _\delta \rangle _{\partial K}, \quad K \in {{\mathcal {D}}}_\delta . \end{aligned}$$
(6.15)

Similarly, let \(\omega ^\alpha _\delta \in V_\delta ^k({{\mathcal {D}}})\) be the approximation of \(\omega ^\alpha \), and let \(h_\delta \in W_\delta ^k\). We obtain the following discretised variational formulation that corresponds to (2.13),

$$\begin{aligned} \langle \partial _t \omega ^\alpha _\delta , \nu _\delta \rangle _{K}&= \langle (\omega ^\alpha _\delta - b^\alpha _\delta )\textbf{u}^\alpha _\delta ,\nabla \nu _\delta \rangle _{K} -\langle (\omega ^\alpha _\delta - b^\alpha _\delta ) \textbf{u}^\alpha _\delta \cdot {\hat{\textbf{n}}},\nu _\delta \rangle _{\partial K} \nonumber \\&\qquad - \frac{1}{2}\langle \nabla \cdot (b^\alpha _\delta \nabla ^\perp h_\delta ), \nu _\delta \rangle _K, \qquad K\in {{\mathcal {D}}}_\delta , \end{aligned}$$
(6.16)

for test function \(\nu _\delta \in V_\delta ^k({{\mathcal {D}}})\).

At this point, we only have the discretised problem on single elements. To obtain the global approximation, we sum over all the elements in \({{\mathcal {D}}}_\delta \). In doing so, the \(\partial K\) terms in (6.15) and (6.16) must be treated carefully. Let \(\partial K_\text {ext}\) denote the part of cell boundary that is contained in \(\partial {{\mathcal {D}}}\). Let \(\partial K_\text {int}\) denote the part of the cell boundary that is contained in the interior of the domain \({{\mathcal {D}}}\backslash \partial {{\mathcal {D}}}\). On \(\partial K_\text {ext}\) we simply impose the PDE boundary conditions. However, on \(\partial K_\text {int}\) we need to consider the contribution from each of the neighbouring elements. By choice, the approximant \(\psi ^\alpha _\delta \in W_\delta ^k\) is continuous on \(\partial K\). And since

$$\begin{aligned} \textbf{u}^\alpha _\delta \cdot {\hat{\textbf{n}}}= \nabla ^\perp \psi ^\alpha _\delta \cdot {\hat{\textbf{n}}}= -\nabla \psi ^\alpha _\delta \cdot {\hat{\tau }} = -\frac{\,\textrm{d}\psi ^\alpha _\delta }{\,\textrm{d}{\hat{\tau }}}, \end{aligned}$$
(6.17)

where \({\hat{\tau }}\) denotes the unit tangential vector to \(\partial K\), \(\textbf{u}^\alpha _\delta \cdot {\hat{\textbf{n}}}\) is also continuous. This means \(\textbf{u}^\alpha _\delta \cdot {\hat{\textbf{n}}}\) in (6.15) (also in (6.16)) is single valued. However, due to the lack of global continuity constraint in the definition of \(V_k^\alpha ({{\mathcal {D}}})\), \(b^\alpha _\delta \) and \(\omega ^\alpha _\delta \) are multi-valued on \(\partial K_\text {int}\). Thus, as our approximation of \(\omega ^\alpha \) and \(b^\alpha \) over the whole domain is the sum over \(K \in {{\mathcal {D}}}_\delta \), we have to constrain the flux on the set \(\left( \bigcup _{K\in {{\mathcal {D}}}_\delta } \partial K \right) {\setminus } \partial {{\mathcal {D}}}\). This is done using appropriately chosen numerical flux fields in the boundary terms of (6.15) and (6.16).

Let \(\nu ^{-}:=\lim _{\epsilon \uparrow 0} \nu (\textbf{x}+\epsilon {\hat{\textbf{n}}})\) and \(\nu ^{+}:=\lim _{\epsilon \downarrow 0}\nu (\textbf{x}+\epsilon {\hat{\textbf{n}}})\), for \(\textbf{x}\in \partial K\), be the inside and outside (with respect to a fixed element K) values, respectively, of a function \(\nu \) on the boundary. Let \(\hat{f}\) be a numerical flux function that satisfies the following properties:

  1. (i)

    consistency

    $$\begin{aligned} {\hat{f}} (\nu , \nu , \textbf{u}_\delta \cdot {\hat{\textbf{n}}}) = \nu \textbf{u}_\delta \cdot {\hat{\textbf{n}}}\end{aligned}$$
    (6.18)
  2. (ii)

    conservative

    $$\begin{aligned} {\hat{f}} (\nu ^+, \nu ^-, \textbf{u}_\delta \cdot {\hat{\textbf{n}}}) = - {\hat{f}} (\nu ^-, \nu ^+, -\textbf{u}_\delta \cdot {\hat{\textbf{n}}}) \end{aligned}$$
    (6.19)
  3. (iii)

    \(L^2\) stable in the entropy norm with respect to the buoyancy equation, see (Bernsen et al., 2006, Section 6).

With such an \({\hat{f}}\), we replace \(b^\alpha _\delta \textbf{u}^\alpha _\delta \cdot {\hat{\textbf{n}}}\) by the numerical flux \(\hat{f}(b_\delta ^{\alpha , +},b_\delta ^{\alpha , -},\textbf{u}^\alpha _\delta \cdot {\hat{\textbf{n}}})\) in (6.15). Similarly, in (6.16), we replace \((\omega ^\alpha _\delta - b^\alpha _\delta )\textbf{u}_\delta ^\alpha \cdot {\hat{\textbf{n}}}\) by \(\hat{f}((\omega ^\alpha _\delta - b^\alpha _\delta )^{+}, (\omega ^\alpha _\delta - b^\alpha _\delta )^{-}, \textbf{u}_\delta ^\alpha \cdot {\hat{\textbf{n}}})\).

Remark 6.1

For a general nonlinear conservation law, one has to solve what is called the Riemann problem for the numerical flux, see (Hesthaven and Warburton 2007) for details. In our setup, we use the following local Lax–Friedrichs flux, which is an approximate Riemann solver,

$$\begin{aligned} \hat{f}(\nu ^+, \nu ^-, \textbf{u}_\delta \cdot {\hat{\textbf{n}}}) = \textbf{u}_\delta \cdot {\hat{\textbf{n}}}\{\{ \nu \}\} - \frac{|\textbf{u}_\delta \cdot {\hat{\textbf{n}}}|}{2}\llbracket \nu \rrbracket \end{aligned}$$
(6.20)

where

$$\begin{aligned} \{\{\nu \}\}:=\frac{1}{2}(\nu ^{-}+\nu ^{+}),\qquad \llbracket \nu \rrbracket :={\hat{\textbf{n}}}^{-}\nu ^{-}+{\hat{\textbf{n}}}^{+}\nu ^{+}. \end{aligned}$$
(6.21)

Finally, our goal is to find \(b^\alpha _\delta , \omega ^\alpha _\delta \in V_\delta ^k({{\mathcal {D}}})\) such that for all \(\nu _\delta \in V_\delta ^k({{\mathcal {D}}})\) we have

$$\begin{aligned}&\sum _{K\in {{\mathcal {D}}}_\delta } \langle \partial _t b^\alpha _\delta , \nu _\delta \rangle _{K} \nonumber \\ {}&\quad = \sum _{K\in {{\mathcal {D}}}_\delta } \big \{ \langle b^\alpha _\delta \nabla ^{\perp } \psi ^\alpha _\delta , \nabla \nu _\delta \rangle _{K} - \langle \hat{f}_{b^\alpha }(b_\delta ^{\alpha , +},b_\delta ^{\alpha , -},\nabla ^{\perp }\psi ^\alpha _\delta \cdot {\hat{\textbf{n}}}), \nu _\delta ^{-}\rangle _{\partial K} \big \}, \end{aligned}$$
(6.22)
$$\begin{aligned} \sum _{K\in {{\mathcal {D}}}_\delta } \langle \partial _t \omega ^\alpha _\delta , \nu _\delta \rangle _{K}&= \sum _{K\in {{\mathcal {D}}}_\delta } \big \{ \langle (\omega ^\alpha _\delta - b^\alpha _\delta )\nabla ^{\perp }\psi ^\alpha _\delta ,\nabla \nu _\delta \rangle _{K} - \frac{1}{2}\langle \nabla \cdot (b^\alpha _\delta \nabla ^\perp h_\delta ), \nu _\delta \rangle _K\nonumber \\&\qquad - \langle \hat{f}_{\omega ^\alpha }((\omega ^\alpha _\delta - b^\alpha _\delta )^{+}, (\omega ^\alpha _\delta - b^\alpha _\delta )^{-}, \nabla ^\perp \psi _\delta ^\alpha \cdot {\hat{\textbf{n}}}), \nu _\delta ^ - \rangle _{\partial K} \big \} \end{aligned}$$
(6.23)

with \(\psi _\delta ^\alpha \in W_\delta ^k({{\mathcal {D}}})\) being the numerical approximation to the stream function.

Remark 6.2

In (6.22) and (6.23) we do not explicitly distinguish \(\partial K_\text {ext}\) and \(\partial K_\text {int}\) because for the boundary conditions (6.1), the \(\partial K_\text {ext}\) terms vanish.

6.1.3 Numerical Conservation

Conservation properties of the TQG system was first shown in Holm et al. (2021). More specifically, the TQG system conserves energy, and an infinite family of quantities called Casimirs. Proposition 6.3 below describes the conservation properties of the \(\alpha \)–TQG system. We note that the form of the conserved energy and Casimirs are the same as that of the TQG system. Although the result is stated for the system with boundary conditions, it is easy to show that the same result holds for \({\mathbb {T}}^2\).

Proposition 6.3

(\(\alpha \)-TQG conserved quantities) On a bounded domain \({{\mathcal {D}}}\) with boundary \(\partial {{\mathcal {D}}}\), consider the \(\alpha \)-TQG system (2.12)–(2.14) with boundary conditions

$$\begin{aligned}&{\hat{\textbf{n}}}\cdot \textbf{u}^\alpha = 0, \ {\hat{\textbf{n}}}\times \nabla b^\alpha = 0,\ \frac{\,\textrm{d}}{\,\textrm{d}t} \int _{\partial {{\mathcal {D}}}} \nabla \psi ^\alpha \cdot {\hat{\textbf{n}}}= 0, \nonumber \\ {}&\quad \ \frac{\,\textrm{d}}{\,\textrm{d}t} \int _{\partial {{\mathcal {D}}}} \nabla \Delta \psi ^\alpha \cdot {\hat{\textbf{n}}}= 0, \ {\hat{\textbf{n}}}\times \nabla \Delta \psi ^\alpha = 0 \end{aligned}$$
(6.24)

We have

$$\begin{aligned} E^\alpha (t) := - \frac{1}{2} \int _{{\mathcal {D}}}\{(\omega ^\alpha - f)\psi ^\alpha + {h}b^\alpha \} \,\textrm{d}\textbf{x}, \end{aligned}$$
(6.25)

and

$$\begin{aligned} C^\alpha _{\Psi , \Phi }(t) := \int _{{\mathcal {D}}}\{ \Psi (b^\alpha ) + \omega ^\alpha \Phi (b^\alpha ) \} \,\textrm{d}\textbf{x},\qquad \forall \Psi , \Phi \in C^\infty \end{aligned}$$
(6.26)

are conserved, i.e. \(\,\textrm{d}_t E^\alpha (t) = 0\) and \(\,\textrm{d}_t C^\alpha _{\Psi , \Phi }(t) = 0\).

Proof

For the energy \(E^\alpha (t)\), we obtain

$$\begin{aligned} \frac{\,\textrm{d}}{\,\textrm{d}t} E^\alpha (t)&= -\frac{1}{2} \frac{\,\textrm{d}}{\,\textrm{d}t} \int _{{\mathcal {D}}}\{\psi ^\alpha (\Delta - 1)(1-\alpha \Delta )\psi ^\alpha + {h}b^\alpha \} \,\textrm{d}\textbf{x}\\&=\int _{{\mathcal {D}}}\{-\psi ^\alpha \ \partial _t \omega ^\alpha - \frac{1}{2} h \partial _t b^\alpha \} \,\textrm{d}\textbf{x}. \end{aligned}$$

Substituting \(\partial _t \omega ^\alpha \) and \(\partial _t b^\alpha \) using (2.12) and (2.13), the result then follows from direct calculations.

Similarly for the Casimirs, we obtain the result by directly evaluating \(\,\textrm{d}_t C^\alpha _{\Psi , \Phi }(t)\). \(\square \)

Remark 6.4

In (6.24), except for the integral Neumann boundary conditions, we effectively have Dirichlet boundary conditions for \(\psi ^\alpha \), \(b^\alpha \) and \(\Delta \psi ^\alpha \). The integral Neumann boundary conditions can be viewed as imposing the Kelvin theorem on the boundary—consider

$$\begin{aligned} \frac{\,\textrm{d}}{\,\textrm{d}t} \int _{\partial {{\mathcal {D}}}} \nabla (\psi ^\alpha - \alpha \Delta \psi ^\alpha ) \cdot {\hat{\textbf{n}}}= 0, \end{aligned}$$
(6.27)

and apply the divergence theorem.

In our FEM discretisation, we impose only Dirichlet boundary conditions. If the integral Neumann boundary conditions are not imposed, then for Proposition 6.3 to hold, we necessarily require \(\psi ^\alpha \) and \(\Delta \psi ^\alpha \) on \(\partial {{\mathcal {D}}}\) to be zero.

We now analyse the conservation properties of our spatially discretised \(\alpha \)-TQG system. Let \(\langle \cdot , \cdot \rangle _{H^1({{\mathcal {D}}})}\) be the \(H^1({{\mathcal {D}}})\) inner product. Define the numerical total energy

$$\begin{aligned} E^\alpha _\delta (t) = \frac{1}{2} \langle \psi ^\alpha _\delta , \tilde{\psi }_\delta \rangle _{H^1({{\mathcal {D}}})} - \frac{1}{2} \langle {h}_\delta , b^\alpha _\delta \rangle _{{\mathcal {D}}}, \end{aligned}$$
(6.28)

in which \(\psi _\delta ^\alpha , \tilde{\psi }_\delta \in W_\delta ^\alpha ({{\mathcal {D}}})\) and \(b_\delta ^\alpha \in V_\delta ^k({{\mathcal {D}}})\) are the numerical approximations of \(\psi ^\alpha \), \(\tilde{\psi }\) and \(b^\alpha \), respectively. Define the numerical Casimir functional

$$\begin{aligned} C_{\delta }^\alpha (t; \Psi , \Phi ):= \langle \Psi (b_\delta ^\alpha ),1 \rangle _{{\mathcal {D}}}+ \langle \omega _\delta ^\alpha \Phi (b_\delta ^\alpha ) \rangle _{{\mathcal {D}}}, \quad \Psi , \Phi \in C^\infty . \end{aligned}$$
(6.29)

Lemma 6.5

With boundary conditions (6.1)

$$\begin{aligned} \frac{\,\textrm{d}}{\,\textrm{d}t} E^\alpha _\delta (t) = 0 \end{aligned}$$
(6.30)

for \(\alpha = 0\).

In other words, the semi-discrete discretisation conserves energy in the TQG case.

Proof

We consider \(\,\textrm{d}/\,\textrm{d}_t\) evaluated at an arbitrary fixed value \(t_0\). First note that when \(\alpha =0\), we have \(\psi ^\alpha _\delta = \tilde{\psi }_\delta \). Then, from (6.6) we have

$$\begin{aligned} 2E_\delta ^\alpha (t)&= L_1 (\psi _\delta ^\alpha , {\psi }^\alpha _\delta ) - \langle {h}_\delta , b_\delta ^\alpha \rangle _\Omega \end{aligned}$$
(6.31)

For the first term in (6.31), following (6.11) we have

$$\begin{aligned} \left. L_1(\partial _t \psi _\delta , \phi _\delta )\right| _{t=t_0} = - \sum _{K\in \mathcal {{\mathcal {D}}}_\delta } \left. \langle \partial _t \omega ^\alpha _\delta , \phi _\delta \rangle _K \right| _{t=t_0}, \quad \forall \phi _\delta \in W_\delta ^k({{\mathcal {D}}}). \end{aligned}$$
(6.32)

Thus, substituting in the discretised equation (6.23) for \(\omega ^\alpha \), we obtain

$$\begin{aligned} - L_1(\left. \partial _t \psi _\delta , \phi _\delta )\right| _{t=t_0}&= \sum _{K\in {{\mathcal {D}}}_\delta } \left[ \big \{ \langle (\omega ^\alpha _\delta - b^\alpha _\delta )\nabla ^{\perp }\psi ^\alpha _\delta ,\nabla \phi _\delta \rangle _{K} - \frac{1}{2}\langle \nabla \cdot (b^\alpha _\delta \nabla ^\perp h_\delta ), \phi _\delta \rangle _K \right. \nonumber \\&\qquad - \left. \langle \hat{f}_{\omega ^\alpha }((\omega ^\alpha _\delta - b^\alpha _\delta )^{+}, (\omega ^\alpha _\delta - b^\alpha _\delta )^{-}, \nabla ^\perp \psi _\delta ^\alpha \cdot {\hat{\textbf{n}}}), \phi _\delta ^ - \rangle _{\partial K} \big \} \right] _{t=t_0}, \end{aligned}$$
(6.33)

in which we can choose \(\phi _\delta =[\psi ^\alpha _\delta ]_{t=t_0}\). This choice is consistent with the discretisation space \(V_\delta ^k({{\mathcal {D}}})\) for \(\omega ^\alpha \), since \(W_\delta ^k({{\mathcal {D}}}) \subset V_\delta ^k({{\mathcal {D}}})\).

Similarly, for the second term in (6.31), using (6.22) we obtain

$$\begin{aligned}{} & {} \left. \frac{\,\textrm{d}}{\,\textrm{d}t}\right| _{t=t_0} \langle b^\alpha _\delta , \nu _\delta \rangle _{{\mathcal {D}}}\nonumber \\{} & {} = \sum _{K\in {{\mathcal {D}}}_\delta } \left[ \big \{ \langle b^\alpha _\delta \nabla ^{\perp } \psi ^\alpha _\delta , \nabla \nu _\delta \rangle _{K} - \langle \hat{f}_{b^\alpha }(b_\delta ^{\alpha , +},b_\delta ^{\alpha , -},\nabla ^{\perp }\psi ^\alpha _\delta \cdot {\hat{\textbf{n}}}), \nu _\delta ^{-}\rangle _{\partial K} \big \} \right] _{t=t_0}, \end{aligned}$$
(6.34)

where we can choose \(\nu _\delta = {h}_\delta \).

Putting together (6.31), (6.33) and (6.34), and substituting in \({h}_\delta \) and \([\psi ^\alpha _\delta ]_{t=t_0}\) for the arbitrary choices, we obtain

$$\begin{aligned} 2 \frac{\,\textrm{d}}{\,\textrm{d}t}\Big |_{t=t_0} E^\alpha _\delta (t)&= - \sum _{K\in {{\mathcal {D}}}_\delta } \left[ 2 \Big \{ \langle (\omega ^\alpha _\delta - b^\alpha _\delta ) \underbrace{ \nabla ^{\perp } \psi ^\alpha _\delta ,\nabla \psi ^\alpha _\delta }_{=0} \rangle _{K} - \frac{1}{2}\langle \nabla \cdot (b^\alpha _\delta \nabla ^\perp h_\delta ), \psi ^\alpha _\delta \rangle _K \right. \nonumber \\&\qquad - \left. \langle \hat{f}_{\omega ^\alpha }((\omega ^\alpha _\delta - b^\alpha _\delta )^{+}, (\omega ^\alpha _\delta - b^\alpha _\delta )^{-}, \nabla ^\perp \psi _\delta ^\alpha \cdot {\hat{\textbf{n}}}), {\psi ^\alpha _\delta }^- \rangle _{\partial K} \Big \} \right. \nonumber \\&\qquad + \left. \Big \{ \underbrace{ \langle b^\alpha _\delta \nabla ^{\perp } \psi ^\alpha _\delta , \nabla {h}_\delta \rangle _{K} }_{=- \langle b^\alpha _\delta \nabla ^{\perp } {h}_\delta , \nabla \psi ^\alpha _\delta \rangle _{K} } - \langle \hat{f}_{b^\alpha }(b_\delta ^{\alpha , +},b_\delta ^{\alpha , -},\nabla ^{\perp }\psi ^\alpha _\delta \cdot {\hat{\textbf{n}}}), {h}_\delta ^{-}\rangle _{\partial K} \Big \} \right] _{t=t_0} \end{aligned}$$
(6.35)
$$\begin{aligned} \frac{\,\textrm{d}}{\,\textrm{d}t} E^\alpha _\delta (t)&= \sum _{K\in \mathcal {{\mathcal {D}}}_\delta } \Big [ - \langle {\hat{f}}_{\omega ^\alpha }((\omega ^\alpha _\delta -b^\alpha _\delta )^+,(\omega _\delta ^\alpha - b^\alpha _\delta )^-, \nabla ^\perp \psi ^\alpha _\delta \cdot {\hat{\textbf{n}}}),{\psi ^\alpha _\delta }^-\rangle _{\partial K} \nonumber \\&\qquad - \frac{1}{2} \langle \hat{f}_{b^\alpha }({b^\alpha _\delta }^{+},{b^\alpha _\delta }^{-},\nabla ^{\perp }\psi ^\alpha _\delta \cdot {\hat{\textbf{n}}}), {h}_\delta ^{-}\rangle _{\partial K} \Big ]_{t=t_0} \end{aligned}$$
(6.36)

Since \(\psi _\delta \) and \({h}_\delta \) are continuous across element boundaries, we have \(\psi _\delta ^+ = \psi _\delta ^-\) and \({h}_\delta ^+ = {h}_\delta ^-\). Thus, if the numerical fluxes \({\hat{f}}_{\omega ^\alpha }\) and \({\hat{f}}_{b^\alpha }\) satisfy the conservative property (6.19), the sum of flux terms is zero. \(\square \)

Remark 6.6

In order for our semi-discrete discretisation to conserve numerical energy when \(\alpha > 0\), more regularity is required for the approximation space for \(\psi ^\alpha _\delta \). Additionally, the current scheme coupled with the time stepping algorithm does not conserve Casimirs. For future work, we look to explore Casimir-conserving schemes for the model.

6.1.4 Time Stepping

To discretise the time derivative, we use the strong stability preserving Runge–Kutta of order 3 (SSPRK3) scheme, see (Hesthaven and Warburton 2007). Writing the finite element spatial discretisation formally as

$$\begin{aligned} \partial _{t}b^\alpha _\delta =\textrm{f}_\delta (b^\alpha _\delta ) \end{aligned}$$
(6.37)

where \(\textrm{f}_\delta \) is the discretisation operator that follows from (6.22), and

$$\begin{aligned} \partial _t \omega = \textrm{g}_\delta (\omega , b) \end{aligned}$$
(6.38)

where \(\textrm{g}_\delta \) is the discretisation operator that follows from (6.23). Let \(b^{\alpha , n}_\delta \) and \(\omega ^{\alpha ,n}_\delta \) denote the approximation of \(b^\alpha _\delta \) and \(\omega ^\alpha _\delta \) at time step \(t_n\). The SSPRK3 time discretisation is as follows

$$\begin{aligned} b^{(1)}&= b^{\alpha ,n}_\delta + \Delta t \ \textrm{f}_\delta (b^{\alpha ,n}_\delta ) \end{aligned}$$
(6.39a)
$$\begin{aligned} \omega ^{(1)}&=\omega ^{\alpha ,n}_\delta +\Delta t \ \textrm{g}_\delta \left( \omega ^{\alpha ,n}_\delta , b^{\alpha ,n}_\delta \right) \end{aligned}$$
(6.39b)
$$\begin{aligned} b^{\left( 2\right) }&=\frac{3}{4}b^{\alpha ,n}_\delta +\frac{1}{4}\left( b^{\left( 1\right) }+\Delta t \ \textrm{f}_\delta \left( b^{\left( 1\right) }\right) \right) \end{aligned}$$
(6.39c)
$$\begin{aligned} \omega ^{\left( 2\right) }&= \frac{3}{4}\omega ^{\alpha ,n}_\delta +\frac{1}{4}\left( \omega ^{\left( 1\right) }+\Delta t \ \textrm{g}_\delta \left( \omega ^{\left( 1\right) }, b^{(1)}\right) \right) \end{aligned}$$
(6.39d)
$$\begin{aligned} b^{\alpha ,n+1}_\delta&=\frac{1}{3}b^{\alpha ,n}_\delta +\frac{2}{3}\left( b^{\left( 2\right) }+\Delta t \ \textrm{f}_\delta \left( b^{\left( 2\right) }\right) \right) \end{aligned}$$
(6.39e)
$$\begin{aligned} \omega ^{\alpha ,n+1}_\delta&=\frac{1}{3}\omega ^{\alpha ,n}_\delta +\frac{2}{3}\left( \omega ^{\left( 2\right) }+\Delta t \ \textrm{g}_\delta \left( \omega ^{\left( 2\right) }, b^{(2)}\right) \right) \end{aligned}$$
(6.39f)

where \(\Delta t = t_{n+1}-t_{n}\) each n.

6.2 \(\alpha \)-TQG Linear Thermal Rossby Wave Stability Analysis

A dispersion relation for the TQG system was derived in Holm et al. (2021). There, the authors showed that the linear thermal Rossby waves of the TQG system possess high wavenumber instabilities. More specifically, the Doppler-shifted phase speed of these waves becomes complex at sufficiently high wavenumbers. However, the growth rate of the instability decreases to zero as \(|\textbf{k}|^{-1}\), in the limit that \(|\textbf{k}|\rightarrow \infty \). Consequently, the TQG dynamics is linearly well-posed. That is, the TQG solution depends continuously on initial conditions. In this subsection, using the same equilibrium state as in Holm et al. (2021), we derive a dispersion relation for the thermal Rossby wave solutions of the linearised \(\alpha \)-TQG system. Again the Doppler-shifted phase speed of these waves becomes complex at sufficiently high wavenumbers. However, in this case, the growth rate of the instability decreases to zero as \(|\textbf{k}|^{-2}\), in the limit that \(|\textbf{k}|\rightarrow \infty \). In the limit that \(\alpha \) tends to zero, one recovers the TQG dispersion relation derived in Holm et al. (2021).

For the reader’s convenience, we repeat the equations solved by the pair of variables \((b^\alpha ,\omega ^\alpha )\) in the \(\alpha \)-TQG system in (2.12)–(2.13), as formulated now in vorticity-streamfunction form with fluid velocity given by \({\textbf{u}}^\alpha =\nabla ^\perp \psi ^\alpha \). This formulation is given by

$$\begin{aligned} \frac{\partial }{\partial t} b^\alpha + J(\psi ^\alpha ,b^\alpha )&= 0, \end{aligned}$$
(6.40)
$$\begin{aligned} \frac{\partial }{\partial t}\omega ^\alpha + J(\psi ^\alpha ,\omega ^\alpha -b^\alpha )&= -\frac{1}{2}J(h,b^\alpha ), \end{aligned}$$
(6.41)
$$\begin{aligned} \omega ^\alpha&= (\Delta -1) (1-\alpha \Delta )\psi ^\alpha + f. \end{aligned}$$
(6.42)

Here, as before, \(J(a,b)=\nabla ^\perp a\cdot \nabla b=a_x b_y - b_x a_y\) denotes the Jacobian of any smooth functions a and b defined on the (xy) plane.

Equilibrium states of the \(\alpha \)-TQG system in the equations (6.40)–(6.42) satisfy \(J(\psi ^\alpha _e, \omega ^\alpha _e) = J(\psi ^\alpha _e, b^\alpha _e)=J(h, b^\alpha _e)=0.\) The equilibrium TQG state considered in Holm et al. (2021) was given via the specification of the following gradient fields

$$\begin{aligned} \begin{array}{c} \nabla \psi ^\alpha _{e}=-U{\hat{\textbf{y}}},\ \nabla \omega ^\alpha _{e}=(U-\beta ){\hat{\textbf{y}}},\ \nabla f=-\beta {\hat{\textbf{y}}}\\ \\ \nabla b^\alpha _{e}=-B{\hat{\textbf{y}}},\ \nabla {h}=-H{\hat{\textbf{y}}}\end{array} \end{aligned}$$

where we have taken \(f=1-\beta y\) and the equilibrium parameters \(U, \beta , B, H \in {{\mathbb {R}}}\) are all constants. Linearising the \(\alpha \)-TQG system around the steady state produces the following evolution equations for the perturbations \(\omega '\), \(b'\) and \({\psi }'\)

$$\begin{aligned} \omega _{t}'+U\omega '_{x}+(U+B-\beta ){\psi }_{x}'&=(U-H/2)b'_{x} \end{aligned}$$
(6.43a)
$$\begin{aligned} b'_{t}+Ub_{x}'-B{\psi }_{x}'&=0 \end{aligned}$$
(6.43b)
$$\begin{aligned} \Big ((1-\alpha \Delta )(1-\Delta )\Big ){\psi }'&=-\,\omega '. \end{aligned}$$
(6.43c)

Since these equations are linear with constant coefficients, one has the plane wave solutions

$$\begin{aligned} \omega '=e^{i(\textbf{k}\cdot \textbf{x}-\nu t)}\hat{\omega },\quad b'=e^{i(\textbf{k}\cdot \textbf{x}-\nu t)}\hat{b},\quad {\psi }'=e^{i(\textbf{k}\cdot \textbf{x}-\nu t)}\hat{{\psi }}. \end{aligned}$$

Further, from (6.43c) we obtain

$$\begin{aligned} -\,\hat{\omega }=(|\textbf{k}|^{2}+1)(\alpha |\textbf{k}|^{2}+1)\hat{{\psi }}. \end{aligned}$$

Substituting these solutions into the linearised equations, we have

$$\begin{aligned} \begin{aligned} (\nu -kU)\hat{\omega }-k(U+B-\beta )\hat{{\psi }}&=-k(U-H/2)\hat{b}\\ \left( \nu -kU\right) \hat{b}&=-kB\hat{{\psi }} \end{aligned} \end{aligned}$$

where k is the first component of \(\textbf{k}\). From the above we obtain a quadratic formula for the Doppler-shifted phase speed \(C = C(\alpha ):=\left( \nu (\alpha )-kU\right) /k\)

$$\begin{aligned} C^{2}(|\textbf{k}|^{2}+1)(\alpha |\textbf{k}|^{2}+1)+CX+Y=0. \end{aligned}$$

where \(X:=U+B-\beta \) and \(Y:=(U-H/2)B\). Thus, the dispersion relation for the thermal Rossby wave for TQG possesses two branches, corresponding to two different phase velocities,

$$\begin{aligned} C(\alpha )=\frac{-X\pm \sqrt{X^{2}-4Y(|\textbf{k}|^{2}+1)(\alpha |\textbf{k}|^{2}+1)}}{2 (|\textbf{k}|^{2}+1)(\alpha |\textbf{k}|^{2}+1)}. \end{aligned}$$
(6.44)

Upon setting \(\alpha =0\) in (6.44), we recover the Doppler-shifted phase speed of thermal Rossby waves for the TQG system in Holm et al. (2021). When \(\alpha =0\), \(B=0\), and \(U=0\) in (6.44), the remaining dispersion relation differs slightly from the dispersion relation for QG Rossby waves in non-dimensional variables. This is because the Casimirs in the Hamiltonian formulation of the TQG system differ from the those of standard QG.

As discussed in Holm et al. (2021) for the TQG system, when \(Y>0\), the Doppler shifted phase velocity in (6.44) for the linearised wave motion becomes complex at high wavenumber \(|\textbf{k}| \gg 1\); namely for \((|\textbf{k}|^{2}+1)(\alpha |\textbf{k}|^{2}+1)\ge X^{2}/(4Y)\), for both branches of the dispersion relation. However, the growth rate of the instability found from the imaginary part of \(C(\alpha )\) in (6.44) decays to zero as \(O( |\textbf{k}|^{-2})\), for \(|\textbf{k}|\gg 1\). Indeed, the linear stability analysis leading to \(C(\alpha )\) predicts that the maximum growth rate occurs at a finite wavenumber \(|\textbf{k}|_{\max }\) which depends on the value of \(\alpha \), beyond which the growth rate of the linearised TQG wave amplitude falls rapidly to zero. One would expect that simulated solutions of \(\alpha \)-TQG would be most active at the length-scale corresponding to \(|\textbf{k}|_{\max }\), at which the linearised thermal Rossby waves are the most unstable. If this maximum activity length-scale is near the grid truncation size, then numerical truncation errors could cause additional numerical stability issues! We have experienced such problems during our testing of the numerical algorithm for the TQG system. Even for equilibrium solution “sanity" check tests, we have found that unless the time step was taken to be incredibly small, high wavenumber truncation errors have caused our numerical solutions to eventually blow up.

Fig. 4
figure 4

An example of the growth-rate of linearised TQG waves as determined from the imaginary part of \(C(\alpha )\) in Eq. (6.44), plotted for different values of \(\alpha \). We observe that increasing \(\alpha \) shifts the wavenumber at the maximum growth rate to lower values

The magnitude of \(\alpha \) controls the unstable growth rate of linearised TQG waves at asymptotically high wavenumbers. That is, the presence of \(\alpha \) regularises the wave activity at high wavenumbers, see Fig. 4. Thus, from the perspective of numerics, the \(\alpha \) regularisation is available to control numerical problems that may arise from the inherent model instability at high wavenumbers, without the need for additional dissipation terms in the equation.

6.3 Numerical Example

The numerical set-up we consider for this paper is as follows. The spatial domain \({{\mathcal {D}}}\) is an unit square with doubly periodic boundaries. We chose to discretise \({{\mathcal {D}}}\) using a grid that consists of \(256\times 256\) cells, i.e. \(\mathop {\textrm{cardinality}}\limits ({{\mathcal {D}}}_\delta ) = 256\times 256\). This was the maximum resolution we could computationally afford, to obtain results over a reasonable amount of time.

We computed \(\alpha \)-TQG solutions for the following values of \(\alpha \)\(\frac{1}{16^{2}},\frac{1}{32^{2}},\frac{1}{64^{2}},\frac{1}{128^{2}},\frac{1}{180^{2}},\frac{1}{220^{2}},\frac{1}{256^{2}}\) and 0. Note when \(\alpha =0\) we get the TQG system. For all cases, the following initial conditions were used,

$$\begin{aligned} \omega (0,x,y)&=\sin (8\pi x)\sin (8\pi y)+0.4\cos (6\pi x)\cos (6\pi y) \end{aligned}$$
(6.45)
$$\begin{aligned}&\quad +0.3\cos (10\pi x)\cos (4\pi y)+0.02\sin (2\pi y)+0.02\sin (2\pi x) \nonumber \\ b(0,x,y)&=\sin (2\pi y)-1, \end{aligned}$$
(6.46)

see Fig. 5 for illustrations, as well as the following bathymetry and rotation fields

$$\begin{aligned} h(x,y)&=\cos (2\pi x)+0.5\cos (4\pi x)+\frac{1}{2}\cos (6\pi x) \end{aligned}$$
(6.47)
$$\begin{aligned} f(x,y)&=0.4\cos (4\pi x)\cos (4\pi y). \end{aligned}$$
(6.48)

For time stepping, we used \(\Delta t = 0.0005\) in all cases to facilitate comparisons. This choice satisfies the CFL condition for the \(\alpha =0\) case.

We computed each solution for 5000 time steps.Footnote 1 Figures 67 and 8 show \(\alpha \)-TQG solution snapshots of buoyancy, potential vorticity and velocity magnitude, respectively, at the 1280’th and 2600’th time steps, for \(\alpha \) values \(0, \frac{1}{128^2}, \frac{1}{64^2}\) and \(\frac{1}{16^2}\). The interpretation of the regularisation parameter \(\alpha \) is that its square root value corresponds to the fraction of the domain’s length scale that get regularised. At the 1280’th time step, the flows are in early spin-up phase. At the 2600’th time step, although the flows are still in spin-up phase, much more flow features have developed. The subfigures illustrate via comparisons, how \(\alpha \) controls the development of small scale features and instabilities. Increasing \(\alpha \) leads to more regularisation at larger scales.

In view of Proposition 4.1, we investigated numerically the convergence of \(\alpha \)-TQG to TQG using our numerical setup. Consider the relative error between TQG buoyancy and \(\alpha \)-TQG buoyancy,

$$\begin{aligned} e_b(t, \alpha )&:= \frac{\Vert b(t) - b^\alpha (t)\Vert _{H^1}}{\Vert b(t)\Vert _{H^1}} \end{aligned}$$
(6.49)

and the relative error between TQG potential vorticity and \(\alpha \)-TQG potential vorticity,

$$\begin{aligned} e_\omega (t, \alpha )&:= \frac{\Vert \omega (t) - \omega ^\alpha (t)\Vert _{L^2}}{\Vert \omega (t)\Vert _{L^2}}. \end{aligned}$$
(6.50)

The norms were chosen in view of Proposition 4.1. Figure 9a and b shows plots of \(e_b(t, \alpha )\) and \(e_\omega (t, \alpha )\) as functions of time only, for fixed \(\alpha \) values \(\frac{1}{16^2}, \frac{1}{32^2}, \frac{1}{64^2}, \frac{1}{128^2}\) and \(\frac{1}{220^2}\), from the initial time up to and including the 2800’th time step. We observe that starting from 0, the relative errors \(e_b\) and \(e_\omega \) initially increase over time but plateau at around and beyond the 2000’th time step. Up to the 1400’th time step, the plotted relative errors remain less than 1.0 and are arranged in the ascending order of \(\alpha \), i.e. smaller \(\alpha \)’s give smaller relative errors.

We note that, if we compare and contrast the relative error results with the solution snapshots of buoyancy and potential vorticity, particularly at the 2600’th time step (shown in Figs. 6b and 7b), we observe “discrepancies”—for lack of a better term—between the results. For example, in Fig. 6b, if we compare the \(\alpha =\frac{1}{128^2}\) snapshot with the \(\alpha =\frac{1}{16^2}\) one, the former show small scale features that are much closer to those that exist in the reference TQG \(\alpha =0\) solution. However, according to the relative error measurements, the two regularised solutions are more or less equivalent in their differences to the reference TQG solution at time step 2600.

If we fix the value of the time parameter in \(e_b(t,\alpha )\) and \(e_\omega (t, \alpha )\), and vary \(\alpha \), we can estimate the convergence rate of \(\alpha \)-TQG to TQG for our numerical setup. Proposition 4.1 predicts a convergence rate of 1 in \(\alpha \), using the \(H^1\) norm on the buoyancy differences and the \(L^2\) norm on the potential vorticity differences. Note that, in (4.1), the left-hand side evaluates a supremum over a given compact time interval. We see in Fig. 9a and b that the relative errors are monotonic up to around the 1500’th time step mark. So for a given time point T between the initial time and the 1500’th time step, we assume we can evaluate \(e_b\) and \(e_\omega \) at T to estimate the supremum over [0, T].

Figure 10a and b shows plots—in log-log scale—of \(e_b\) and \(e_\omega \) as functions of \(\alpha \) only at the 600’th, 800’th, 1000’th and 1200’th time steps, for all the chosen \(\alpha \) values. These numbers of time steps correspond to \(T=0.3\), \(T=0.4\), \(T=0.5\) and \(T=0.6\), respectively. We also plotted in Fig. 10a and b linear functions of \(\alpha \) to provide reference order 1 slopes. Comparing the results to the reference, we see that the theoretical convergence rate of 1 is attained for \(T=0.3\) (600’th time step), \(T=0.4\) (800’th time step) and \(T=0.5\) (1000’th time step). However for \(T=0.6\) (1200’th time step), the convergence rates are at best order 1/2.

Fig. 5
figure 5

Initial conditions—buoyancy (6.46) on the left, and potential vorticity (6.45) on the right. The different colours of the PV plot correspond to positive and negative values of PV, which can be interpreted as clockwise and anticlockwise eddies, respectively

Fig. 6
figure 6

Comparisons of solution snapshots of the \(\alpha \)-TQG buoyancy field that correspond to four different values of \(\alpha \), at two different points in time. Potential vorticity and velocity magnitude snapshots of the same solutions are shown in Figs. 7 and  8, respectively. Shown in each subfigure are the results corresponding to \(\alpha = 0\) (top left), \(\alpha = \frac{1}{128^2}\) (top right), \(\alpha = \frac{1}{64^2}\) (bottom right) and \(\alpha = \frac{1}{16^2}\) (bottom left). When \(\alpha =0\), the solution is of the TQG system. Subfigure (A) shows the solutions at the 1280’th time step, or equivalently when \(t=0.64\). Subfigure (B) shows the solutions at the 2600’th time step, or equivalently when \(t=1.30\). The flows in both subfigures are at different stages of spin-up, with the flow at \(t=1.30\) showing more developed features. The \(\alpha \) parameter is interpreted as the fraction of the domain’s length-scale squared value at which regularisation is applied. Due to regularisation, the flows do develop differently. Nevertheless, we observe that as \(\alpha \) gets smaller, the \(\alpha \)-TQG flow features converge to that of the TQG flow

Fig. 7
figure 7

Comparisons of solution snapshots of the \(\alpha \)-TQG potential vorticity field that correspond to four different values of \(\alpha \), at two different points in time. See the caption of Fig. 6 for explanations of the arrangements of these plots. Buoyancy and velocity magnitude snapshots of the same solutions are shown in Figs.  6 and  8, respectively. As in Fig. 5, the different colours of the PV field correspond to positive and negative values of PV, which can be interpreted as clockwise and anticlockwise eddies, respectively. In subfigure (B), we observe more clearly the regularisation effects of \(\alpha \). As \(\alpha \) increases, the flows develop in different ways—features at smaller scales no longer develop, and larger features evolve differently as a result. Nevertheless, we observe that as \(\alpha \) gets smaller, the \(\alpha \)-TQG flow features converge to that of the TQG flow

Fig. 8
figure 8

Comparisons of solution snapshots of the \(\alpha \)-TQG velocity magnitudes that correspond to four different values of \(\alpha \), at two different points in time. See the caption of Fig.  6 for explanations of the arrangements of these plots. Buoyancy and potential vorticity snapshots of the same solutions are shown in Figs. 6 and  7, respectively. Using the same scale for colouring, we observe in both subfigures the strength of the colours weaken as \(\alpha \) increases. This indicates smoothing of large velocity magnitudes. Additionally, in subfigure (B), we observe how features at smaller scales get smoothed out. In particular, in the \(\alpha =\frac{1}{16^2}\) plot, we see that essentially only large scale vortices remain. Hence, considering the velocity field as a part of the system’s nonlinear advection operator, these figures help us to better visualise how the flow features developed in Figs. 6 and  7

Fig. 9
figure 9

Subfigures (A) and (B) show, respectively, plots of the relative error functions \(e_b(t, \alpha )\) and \(e_\omega (t, \alpha )\) as functions of time only, for the fixed \(\alpha \) values \(\frac{1}{16^2}\), \(\frac{1}{32^2}\), \(\frac{1}{64^2}\), \(\frac{1}{128^2}\) and \(\frac{1}{220^2}\). See Eqs. (6.49) and (6.50) for the definitions of \(e_b\) and \(e_\omega \), respectively. The plots are shown from \(t=0\) up to and including \(t=1.4\), which is equivalent to 2800 time steps using \(\Delta t = 0.0005\). We observe that, starting from 0, all plots of \(e_b\) and \(e_\omega \) increase initially, and plateau at around the 2000’th time step. Further, up to the 1400’th time step the plotted relative errors remain less than 1.0 and are arranged in the ascending order of \(\alpha \). Relating these relative error results at the 2600’th time step to the solution snapshots at the same time point (shown in Figs.  6b and 7b), we see that although the snapshots show convergence of flow features, the relative error values suggest all the \(\alpha \)-TQG solutions are more or less equally far away from the TQG flow

Fig. 10
figure 10

Subfigures (A) and (B) show, respectively, plots in log-log scale of the relative error functions \(e_b(t, \alpha )\), and \(e_\omega (t, \alpha )\) as functions of \(\alpha \) only, at the fixed time values \(t=0.3\) (equivalently at the 600’th time step), \(t=0.4\) (equivalently at the 800’th time step), \(t=0.5\) (equivalently at the 1000’th time step) and \(t=0.6\) (equivalently at the 1200’th time step). In view of Proposition 4.1, we assumed that, based on the initial monotonicity of the relative errors (see Fig. 9), the supremum over [0, T] in (4.1) can be estimated by evaluating \(e_b\) and \(e_\omega \) at T. Proposition 4.1 predicts that up to a certain time, the rate of convergence of \(e_b\) and \(e_\omega \) to 0 should be no less than order 1 in \(\alpha \). In both subfigures we have plotted, as the reference for comparison, linear functions of \(\alpha \). Thus, comparing slopes to the reference, we see that numerically we get order 1 convergence for \(t=0.3\), \(t=0.4\) and \(t=0.5\). However, for \(t=0.6\), the rate of convergence is no more than 1/2

7 Conclusion and Outlook

In conclusion, we have formulated the thermal quasi-geostrophic (TQG) model equations in the regime of approximations relevant to GFD and we have explored their analytical and numerical properties. In respect to the analytical properties, we have shown that the TQG model and its \(\alpha \)-regularised version, the \(\alpha \)-TQG model, admit unique local strong solutions that are stable in a larger space with weaker norm, that both models have a maximum time of existence, solutions of the latter model converge to solutions of the former model on any time interval in which a solution to the former lives, and we have also determined conditions under which the solution of the \(\alpha \)-TQG will blow up.

With respect to numerics, we described our discretisation methods for approximating TQG and \(\alpha \)-TQG solutions and showed that the FEM semi-discrete scheme conserves numerical energy in the TQG case. Using example simulation results, we numerically verified that \(\alpha \)-TQG solutions converge to that of the TQG system and attains the order 1 in \(\alpha \) convergence rate predicted by Proposition 4.1. Additionally, we derived a dispersion relation for the linearised \(\alpha \)-TQG system and showed how \(\alpha \)-regularisation could be used to control the development of high wavenumber instabilities. Given that we have convergence of \(\alpha \)-TQG solutions to TQG, the linear stability results can be viewed as generalisations of those shown in Holm et al. (2021) for the TQG system.

We now end this section with some open problems.

  • Can one construct a global-in-time strong solution (or a global weak solution) of either the TQG model equations (1.1)–(1.5), or the \(\alpha \)-TQG model Eqs. (2.12)–(2.14)?

  • Is it possible to establish uniform estimates in \(\alpha \) for all three blow-up criteria for \(\alpha \)-TQG?

  • Can one give a Beale–Kato–Majda condition for the blow-up of a strong solution to the TQG in terms of a single unknown? That is, is there a BKM condition in terms of either b or \(\omega \) (or \(\textbf{u}\)) that does not require a combination of both variables?

  • As discussed in Sect. 1.3, an application of the Stochastic Advection by Lie Transport (SALT) approach to the Hamiltonian formulation of the TQG equations results in a stochastic Hamiltonian formulation of TQG equations in (1.6)–(1.8). This stochastic version of TQG preserves an infinite family of integral conserved quantities, as is shown in Holm et al. (2021). The SALT version of TQG represents an outstanding challenge for uncertainty quantification and data assimilation which will surely spur us on to further investigations of these problems. The analysis of the stochastic thermal quasi-geostrophic equation has been covered in the companion paper (Crisan et al. 2022).